Re: [Docutils-develop] Transforms...?

On Fri, 05 Jul 2002 23:32:47 -0400 David Goodger
<go...@us...> wrote:

> Adam Chodorowski wrote:
> > I (will) have several somewhat larger reST documents on a website and
> > I want to construct an index page with short descriptions and a link
> > to the full document. So I thought I'd simply extract the abstract /
> > introduction section of those documents for the short description on
> > the index page to avoid duplication and ease maintenance.
> > 
> > The idea is to churn through the documents twice: the first pass
> > creates the HTML documents in full, while the second pass applies
> > this filter/transform to get the abstract and adds it to the index
> > page. 
> 
> I assume that there's more to the index file than just the abstracts
> (such as introductory material, and perhaps repeated wrappers around
> abstracts).  

It is actually a little more complex than that, since I also want to have news
items on the same page, for which I intended to use the bibliographic fields
repeatedly (once for each news item, since the author of each item can vary
and the date definitely will).

> I can think of several ways to do what you describe:
> 
> 1. First process each document individually, writing out each result
>    file as usual.  Then process the index file, which contains special
>    references (directives), one per document, which cause the
>    documents to be fully parsed a second time each and extract the
>    "abstract" topics and insert them into the body of the index
>    document.
> 
>    This would require some kind of "extraction" directives for the
>    index document.
> 
> 2. Store the abstracts as separate files, which are inserted into both
>    the individual documents and into the index file with "include"
>    directives (not yet implemented).
> 
>    This would require a new "include" directive (which is already on
>    the To Do list).
> 
> 3. As in (1), except process all of the individual documents in a
>    single process, storing the extracted abstracts in a list (so the
>    documents don't have to be processed a second time), and parse and
>    assemble the index file last.
> 
>    This would require a new specialized front-end, along with at least
>    a placeholder directive to locate the insertion-points for
>    abstracts.
> 
> 4. Write a full-blown templating system for Docutils.  I think
>    Python's ht2html.py is a good model: simple but effective.  Either
>    add programmability through a pre-processor like YAPTU or with
>    directives like "repeat".  Very vague ideas at this point.
> 
> Which were you thinking of?

Something along the lines of (1), although I did not intend to write the index
page in reST with special directives but rather write a script that calls the
docutils tools to generate the full documents and extract the abstracts into
files, which would then be concatenated with some extra HTML inserted
before/after and between them.
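
To make the concatenation step concrete, here is a minimal sketch of what
such a script might look like. It assumes the abstracts have already been
extracted into HTML fragment files; the function names, the
"*.abstract.html" naming pattern and the header/footer/separator strings
are all hypothetical, and nothing here calls docutils itself.

```python
from pathlib import Path


def read_fragments(directory, pattern="*.abstract.html"):
    """Read every fragment file in `directory`, in sorted filename order.

    The "*.abstract.html" pattern is an assumed naming convention.
    """
    paths = sorted(Path(directory).glob(pattern))
    return [path.read_text(encoding="utf-8") for path in paths]


def build_index(fragments, header, footer, separator):
    """Join the fragments with `separator` and wrap them in header/footer."""
    return header + separator.join(fragments) + footer
```

The index page would then be something like
build_index(read_fragments("abstracts"), header, footer, "\n<hr>\n"),
written out to a file by the same script.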

I do not like option (2) at all, since I would rather not split the document
up. One reason is that it would make it less readable as a plain text file
(unless one wrote a "plaintext" writer for docutils, but that would really be
a bit odd IMHO). 

All the other options you listed basically work fine for me. (4) is perhaps
the most tempting as a future system, but it would probably require some
substantial amount of work. For my current need it simply seems to be easier
to write some scripts and add a few filtering tools to docutils...

> > Another thing I would like to do is to either add templating support
> > to the HTML writer, or write a "fragment" HTML writer (which would
> > only write out the body of the document, so you can wrap it in your
> > own header/footer for layout (navigation bar etc)). But that's a
> > different topic. :)
> 
> The HTML writer already exposes the components, so you can just grab
> the document body (everything inside but not including <body> &
> </body>).  Use ``docutils.io.StringIO`` for the "destination_class"
> parameter of ``docutils.core.Publisher.__init__`` to avoid writing a
> file.  (Hmm, idea: NullIO class.)

Care to explain a little more? Perhaps I should take a closer look at the
relevant sources. Hmmm...
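
Just to check that I understand the "everything inside but not including
<body> & </body>" part: given the writer's output as a string, I imagine
something like the sketch below. It uses only the standard library and
assumes the output contains exactly one well-formed body element; the
function name is made up, and this is not how docutils does it internally.

```python
import re

# Non-greedy match of whatever sits between <body ...> and </body>.
_BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def extract_body(html):
    """Return the contents of the body element, without the tags themselves."""
    match = _BODY_RE.search(html)
    if match is None:
        raise ValueError("no <body> element found")
    return match.group(1).strip()
```

The returned fragment could then be wrapped in my own header/footer.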

> However, the idea of custom headers & footers is what inspired the
> "decoration" element (which contains "header" & "footer" elements).
> It hasn't been fully developed yet.

Isn't that supposed to be a generic part of docutils for all kinds of writers?
I am not so interested in that, since the "decorations" that I want for my
online HTML version differ greatly from the decorations I wish to have in
the PDF (for example) version. Perhaps I've misunderstood it though.

---
Adam Chodorowski <ad...@ch...>

Witness if you will Microsoft Outlook and Outlook Express, the two most
efficient virus propagation utilities ever devised by human intellectual
failure.
     -- Thomas C Greene / The Register