
[tools-yak@collab] Re: A ReStructuredText Primer

To: tools-yak@xxxxxxxxxxxxxxxxxxx
From: Eric Armstrong <eric.armstrong@xxxxxxx>
Date: Wed, 14 May 2003 17:50:07 -0700
Message-id: <3EC2E43F.50602@sun.com>
Bill and Garold: Thanks for your thoughts and valuable pointers.    (01)

When I started designing even a simple system, though, I found
that the parsing problem can easily turn out to be quite severe.    (02)

For example:
  * A bullet
  Some text
  - Another bullet    (03)

There are several ways to interpret that sequence:
          * A bullet. Some text. (continuing the bullet)
                  - Another bullet (in a sublist)
OR
          * A bullet<br>
             Some text. (continuing the bullet)
                  - Another bullet (in a sublist)
OR
           * A bullet
              <p>
              Some text. (A continuation-paragraph)
                   - Another bullet (in a sublist)
OR
            * A bullet
               </ul>
             <p>Some text (starting a new thought)
                   - Another bullet (under the new thought)    (04)

The first two could be distinguished by the capitalization of the
2nd line, to differentiate between
     * A bullet some text
and
      * A bullet<br>
         Some text    (05)

but the others can be problematic, because "indentation" makes
the difference. If we assume an extra line break after the first bullet,
making:
  * A bullet    (06)

    Some text
      - Another bullet    (07)

Then we can be sure we are talking about one of the last 3 cases. But
if the difference between them is a matter of indentation, then a
single-character difference in placement of text can make the difference
in how the text is interpreted.    (08)
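
To make the off-by-one problem concrete, here is a minimal sketch in Python. It assumes a hypothetical parser that derives nesting purely from leading spaces, using a stack of open indentation columns. The function name and the rule are illustrative assumptions, not how any particular tool is known to work.

```python
def nesting_depths(lines):
    """Assign a nesting depth to each non-blank line from its leading spaces.

    Hypothetical rule: a line indented further than the current level opens
    a new level; a line indented less closes levels until it fits.
    """
    stack = []   # indentation columns of the currently open levels
    depths = []
    for line in lines:
        if not line.strip():
            continue  # blank lines carry no indentation information
        indent = len(line) - len(line.lstrip(" "))
        # Close levels that are indented further than this line.
        while stack and indent < stack[-1]:
            stack.pop()
        # Open a new level when this line is indented further than the last.
        if not stack or indent > stack[-1]:
            stack.append(indent)
        depths.append((len(stack) - 1, line.strip()))
    return depths


aligned = ["* A bullet", "", "  Some text", "  - Another bullet"]
one_off = ["* A bullet", "", "  Some text", "   - Another bullet"]

print(nesting_depths(aligned))
print(nesting_depths(one_off))
```

In the first input the second bullet lines up with "Some text" and lands at the same depth. In the second input one extra space pushes it a level deeper, which is exactly the kind of slip I make all the time.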

The problem is that I, as a user, am frequently off by at *least* one
character, if not 2 or 3, when I deeply indent lists. In that case, how
can I know in advance what my formatting will produce??    (09)

Basically, I don't think I can. And since I tend towards long, deeply
nested lists, this consideration could kill the whole "smart ascii" concept
for me. (I like that name, though. It sucks a lot less than "rst".)    (010)

Now, header nesting can be done pretty well down to 3 levels using
underlining, where === -> H2, ---- -> H3, and .... -> H4. Titles are
a problem, but there are other ways to handle that -- like in the subject
area of an email message.    (011)
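
The underline-to-level mapping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `UNDERLINE_LEVELS`, the function name, and the minimum underline length of 3 are all made up for the example, and a line of "..." used as an ellipsis would be misread as an underline.

```python
# Hypothetical mapping from underline character to HTML heading level.
UNDERLINE_LEVELS = {"=": 2, "-": 3, ".": 4}


def find_headings(text):
    """Return (level, title) pairs for lines followed by a known underline."""
    lines = text.splitlines()
    headings = []
    # Look at each line together with the line that follows it.
    for title, underline in zip(lines, lines[1:]):
        stripped = underline.strip()
        if not title.strip() or not stripped:
            continue
        char = stripped[0]
        # The underline must be one repeated character, at least 3 long
        # (the length threshold is an assumption for this sketch).
        if (char in UNDERLINE_LEVELS
                and set(stripped) == {char}
                and len(stripped) >= 3):
            headings.append((UNDERLINE_LEVELS[char], title.strip()))
    return headings


sample = "Parsing\n=======\n\nLists\n-----\nSome text\n\nTabs\n....\n"
print(find_headings(sample))
```

Nothing here depends on indentation, which is why headers are so much less fragile than lists.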

So I like underlining, because it allows for multiple heading levels --
unlike, say, simple capitalization. There could be other ways to do the same
thing, though. Something silly would be [_2] for a level 2 header. It's yucky,
but it demonstrates the point that other alternatives are possible.    (012)

But for lists, simple indentation really doesn't look like a good way to
control things, especially when they're nested. (Blockquotes are marginally
manageable, because I can pretty well distinguish between start of line
and anywhere else, using the Home key. But a blockquote inside a blockquote
would once again be very hard to write consistently.)    (013)

So I reason myself down to the conclusion that, for any kind of basic
ascii markup to be useful, SOME clue as to the intended nesting is a
necessity. And it has to be markup, because indentation won't suffice.
(I need to know what I'm going to get, or a plain ascii writeup won't cut
it.)    (014)

That's as far as I've gotten, though. I can't think of a markup device that
would suffice -- unless it were tabs. Those would give me consistent
indentation that I could verify by inspection.    (015)

Then maybe any line that begins with spaces (after leading tabs are removed)
would be considered as a blockquote at that level in the output. It might
work -- but it would be a lot better with an editor that showed me tabs!    (016)
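
As a rough sketch of that tab idea in Python (the function name and the returned tuple are assumptions for illustration only): count the leading tabs to get the nesting level, then treat leading spaces in whatever remains as the blockquote signal.

```python
def classify(line):
    """Split a line into (depth, is_blockquote, text).

    Hypothetical rule: each leading tab is one nesting level; if what
    remains after the tabs starts with a space, the line is a blockquote
    at that level.
    """
    body = line.lstrip("\t")
    depth = len(line) - len(body)      # number of leading tabs
    is_blockquote = body.startswith(" ")
    return depth, is_blockquote, body.strip()


for raw in ["top level", "\t- nested item", "\t\t- deeper item", "\t  quoted text"]:
    print(classify(raw))
```

The depth no longer depends on how many spaces I happened to type, only on a count of tabs that I can check.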

Finally, in the ideal world, I'd be able to include HTML markup for tables
and images and such. Not much point in reinventing that part of the syntax,
really. The idea is just to make the simple stuff easier. (Lists seem to
lie at the boundary. They're used too often, and the html syntax is just a
bit too clunky to want to use it, especially where nesting is present, but
at the same time getting the nesting right is tricky without the syntax!)    (017)

Eric Armstrong wrote:    (018)

> Pretty good set of structuring tools for formatting
> messages. The one part I don't really like is *x*
> for italics, and **x** for bold.
>
> Mozilla recognizes the first as bold.
> It ignores the second.
> And it will _even recognize this_ and underline it.
>
> It should be possible to italicize. They've thought of
> everything else. But I'm not sure how Mozilla does it.
>
> <http://docutils.sourceforge.net/docs/rst/quickstart.html>
>
> ------------------------------------------------------------------------
>
>
>   A ReStructuredText Primer
>
> Author:       Richard Jones
> Version:      1.10
>
> Contents
>
>     * Structure <#structure>
>     * Text styles <#text-styles>
>     * Lists <#lists>
>     * Preformatting (code samples) <#preformatting-code-samples>
>     * Sections <#sections>
>     * Images <#images>
>     * What Next? <#what-next>
>
> The text below contains links that look like "(quickref 
> <quickref.html>)". These are relative links that point to the Quick 
> reStructuredText <quickref.html> user reference. If these links don't 
> work, please refer to the master quick reference 
> <http://docutils.sourceforge.net/docs/rst/quickref.html> document.
>
>
>   Structure <#id15>
>
> From the outset, let me say that "Structured Text" is probably a bit 
> of a misnomer. It's more like "Relaxed Text" that uses certain 
> consistent patterns. These patterns are interpreted by a HTML 
> converter to produce "Very Structured Text" that can be used by a web 
> browser.
>
> The most basic pattern recognised is a *paragraph* (quickref 
> <quickref.html#paragraphs>). That's a chunk of text that is separated 
> by blank lines (one is enough). Paragraphs must have the same 
> indentation -- that is, line up at their left edge. Paragraphs that 
> start indented will result in indented quote paragraphs. For example:
>
>This is a paragraph.  It's quite
>short.
>
>   This paragraph will result in an indented block of
>   text, typically used for quoting other text.
>
>This is another one.
>  
>
> Results in:
>
>     This is a paragraph. It's quite short.
>
>         This paragraph will result in an indented block of text,
>         typically used for quoting other text.
>
>     This is another one.
>
>
>   Text styles <#id16>
>
> (quickref <quickref.html#inline-markup>)
>
> Inside paragraphs and other bodies of text, you may additionally mark 
> text for /italics/ with "*italics*" or *bold* with "**bold**".
>
> If you want something to appear as a fixed-space literal, use 
> "``double back-quotes``". Note that no further fiddling is done inside 
> the double back-quotes -- so asterisks "*" etc. are left alone.
>
> If you find that you want to use one of the "special" characters in 
> text, it will generally be OK -- reStructuredText is pretty smart. For 
> example, this * asterisk is handled just fine. If you actually want 
> text *surrounded by asterisks* to *not* be italicised, then you need 
> to indicate that the asterisk is not special. You do this by placing a 
> backslash just before it, like so "\*" (quickref 
> <quickref.html#escaping>), or by enclosing it in double back-quotes 
> (inline literals), like this:
>
>``\*``
>  
>
>
>   Lists <#id17>
>
> Lists of items come in three main flavours: *enumerated*, *bulleted* 
> and *definitions*. In all list cases, you may have as many paragraphs, 
> sublists, etc. as you want, as long as the left-hand side of the 
> paragraph or whatever aligns with the first line of text in the list item.
>
> Lists must always start a new paragraph -- that is, they must appear 
> after a blank line.
>
> *enumerated* lists (numbers, letters or roman numerals; quickref 
> <quickref.html#enumerated-lists>)
>
>     Start a line off with a number or letter followed by a period ".",
>     right bracket ")" or surrounded by brackets "( )" -- whatever
>     you're comfortable with. All of the following forms are recognised:
>
>1. numbers
>
>A. upper-case letters
>   and it goes over many lines
>
>   with two paragraphs and all!
>
>a. lower-case letters
>
>   3. with a sub-list starting at a different number
>   4. make sure the numbers are in the correct sequence though!
>
>I. upper-case roman numerals
>
>i. lower-case roman numerals
>
>(1) numbers again
>
>1) and again
>      
>
>     Results in (note: the different enumerated list styles are not
>     always supported by every web browser, so you may not get the full
>     effect here):
>
>        1. numbers
>
>       1.
>
>           upper-case letters and it goes over many lines
>
>           with two paragraphs and all!
>
>       1.
>
>           lower-case letters
>
>              3. with a sub-list starting at a different number
>              4. make sure the numbers are in the correct sequence though!
>
>        1. upper-case roman numerals
>
>        1. lower-case roman numerals
>
>        1. numbers again
>
>        1. and again
>
> *bulleted* lists (quickref <quickref.html#bullet-lists>)
>
>     Just like enumerated lists, start the line off with a bullet point
>     character - either "-", "+" or "*":
>
>* a bullet point using "*"
>
>  - a sub-list using "-"
>
>    + yet another sub-list
>
>  - another item
>      
>
>     Results in:
>
>         * a bullet point using "*"
>               o a sub-list using "-"
>                     + yet another sub-list
>               o another item
>
> *definition* lists (quickref <quickref.html#definition-lists>)
>
>     Unlike the other two, the definition lists consist of a term, and
>     the definition of that term. The format of a definition list is:
>
>what
>  Definition lists associate a term with a definition.
>
>*how*
>  The term is a one-line phrase, and the definition is one or more
>  paragraphs or body elements, indented relative to the term.
>  Blank lines are not allowed between term and definition.
>      
>
>     Results in:
>
>     what
>         Definition lists associate a term with a definition.
>     /how/
>         The term is a one-line phrase, and the definition is one or
>         more paragraphs or body elements, indented relative to the
>         term. Blank lines are not allowed between term and definition.
>
>
>   Preformatting (code samples) <#id18>
>
> (quickref <quickref.html#literal-blocks>)
>
> To just include a chunk of preformatted, never-to-be-fiddled-with 
> text, finish the prior paragraph with "::". The preformatted block is 
> finished when the text falls back to the same indentation level as a 
> paragraph prior to the preformatted block. For example:
>
>An example::
>
>    Whitespace, newlines, blank lines, and all kinds of markup
>      (like *this* or \this) is preserved by literal blocks.
>  Lookie here, I've dropped an indentation level
>  (but not far enough)
>
>no more example
>  
>
> Results in:
>
>     An example:
>
>  Whitespace, newlines, blank lines, and all kinds of markup
>    (like *this* or \this) is preserved by literal blocks.
>Lookie here, I've dropped an indentation level
>(but not far enough)
>    
>
>     no more example
>
> Note that if a paragraph consists only of "::", then it's removed from 
> the output:
>
>::
>
>    This is preformatted text, and the
>    last "::" paragraph is removed
>  
>
> Results in:
>
>This is preformatted text, and the
>last "::" paragraph is removed
>  
>
>
>   Sections <#id19>
>
> (quickref <quickref.html#section-structure>)
>
> To break longer text up into sections, you use *section headers*. 
> These are a single line of text (one or more words) with adornment: an 
> underline alone, or an underline and an overline together, in dashes 
> "-----", equals "======", tildes "~~~~~~" or any of the 
> non-alphanumeric characters = - ` : ' " ~ ^ _ * + # < > that you feel 
> comfortable with. An underline-only adornment is distinct from an 
> overline-and-underline adornment using the same character. The 
> underline/overline must be at least as long as the title text. Be 
> consistent, since all sections marked with the same adornment style 
> are deemed to be at the same level:
>
>Chapter 1 Title
>===============
>
>Section 1.1 Title
>-----------------
>
>Subsection 1.1.1 Title
>~~~~~~~~~~~~~~~~~~~~~~
>
>Section 1.2 Title
>-----------------
>
>Chapter 2 Title
>===============
>  
>
> This results in the following structure, illustrated by simplified 
> pseudo-XML:
>
><section>
>    <title>
>        Chapter 1 Title
>    <section>
>        <title>
>            Section 1.1 Title
>        <section>
>            <title>
>                Subsection 1.1.1 Title
>    <section>
>        <title>
>            Section 1.2 Title
><section>
>    <title>
>        Chapter 2 Title
>  
>
> (Pseudo-XML uses indentation for nesting and has no end-tags. It's not 
> possible to show actual processed output, as in the other examples, 
> because sections cannot exist inside block quotes. For a concrete 
> example, compare the section structure of this document's source text 
> and processed output.)
>
> Note that section headers are available as link targets, just using 
> their name. To link to the Lists <#lists> heading, I write "Lists_". 
> If the heading has a space in it like text styles <#text-styles>, we 
> need to quote the heading "`text styles`_".
>
> To indicate the document title, use a unique adornment style at the 
> beginning of the document. To indicate the document subtitle, use 
> another unique adornment style immediately after the document title. 
> For example:
>
>================
> Document Title
>================
>----------
> Subtitle
>----------
>
>Section Title
>=============
>
>...
>  
>
> Note that "Document Title" and "Section Title" both use equals signs, 
> but are distinct and unrelated styles. The text of 
> overline-and-underlined titles (but not underlined-only) may be inset 
> for aesthetics.
>
>
>   Images <#id20>
>
> (quickref <quickref.html#directives>)
>
> To include an image in your document, you use the image directive 
> <../../spec/rst/directives.html>. For example:
>
>.. image:: images/biohazard.png
>  
>
> results in:
>
> The images/biohazard.png part indicates the filename of the image you 
> wish to appear in the document. There's no restriction placed on the 
> image (format, size etc). If the image is to appear in HTML and you 
> wish to supply additional information, you may:
>
>.. image:: images/biohazard.png
>   :height: 100
>   :width: 200
>   :scale: 50
>   :alt: alternate text
>  
>
> See the full image directive documentation 
> <../../spec/rst/directives.html#images> for more info.
>
>
>   What Next? <#id21>
>
> This primer introduces the most common features of reStructuredText, 
> but there are a lot more to explore. The Quick reStructuredText 
> <quickref.html> user reference is a good place to go next. For 
> complete details, the reStructuredText Markup Specification 
> <../../spec/rst/reStructuredText.html> is the place to go ^1 <#id14>.
>
> Users who have questions or need assistance with Docutils or 
> reStructuredText should post a message 
> <mailto:docutils-users@lists.sourceforge.net> to the Docutils-Users 
> mailing list 
> <http://lists.sourceforge.net/lists/listinfo/docutils-users>. The 
> Docutils project web site <http://docutils.sourceforge.net/> has more 
> information.
>
> [1] <#id13>   If that relative link doesn't work, try the master 
> document: http://docutils.sourceforge.net/spec/rst/reStructuredText.html.
>
> ------------------------------------------------------------------------
> View document source <quickstart.txt>. Generated on: 2003-03-22 06:28 
> UTC. Generated by Docutils <http://docutils.sourceforge.net/> from 
> reStructuredText <http://docutils.sourceforge.net/rst.html> source.    (019)



-- 
This message is archived at:    (020)

http://collab.blueoxen.net/forums/cgi-bin/mesg.cgi?a=tools-yak&i=3EC2E43F.50602@sun.com    (021)