[lug] Typesetting Programs

rm at fabula.de rm at fabula.de
Fri Dec 13 04:01:58 MST 2002


On Thu, Dec 12, 2002 at 08:17:40PM -0500, Michael Hirsch wrote:
> 
> Thinking about all this a little more, and I think that rather
> than going directly to latex you might want to consider going to an
> intermediate form like xml.  Then you can write xslt macros to convert
> the xml to latex, html, pdf, info, etc.  The point being that xml is
> designed to be easily processed into something else, but latex is only
> well designed to be processed into dvi.  Given that most of the word
> processors are going into an xml format, you could probably even convert
> to staroffice or abiword format pretty easily.

Sorry to jump in so late ...

XML _is_ a good choice as an intermediate storage format, so if one needs
to process the data with various tools, converting to XML would be a good
choice. BUT: XML itself is only a markup language specification; the more
important question is what DTD/schema to use (i.e. what set of markup
tags/attributes to allow). Since DocBook was mentioned in this thread more
than once, let me stress one thing here: DocBook is _not_ a general
book/documentation markup language. It was designed specifically for the
documentation of software programs (and even here, it's only useful for
programs written in ALGOL-like languages; writing Haskell books with
DocBook isn't fun). Unless you need to do exactly this, you end up
spending your time fighting the DTD :-/
If the project is sufficiently large it might be worth the time to develop
your own DTD (but if you do so: save yourself a lot of time and effort
and read 'Structuring XML Documents' by David Megginson before you start.
It's worth the time and money).
Which leaves us with the formatting part. Sadly there's no really
convincing solution available for Linux these days. There's OpenJade
(the tool used by the DocBook toolchain) which uses DSSSL as a stylesheet
language, but the LaTeX output (which is used for producing the PostScript
and PDF versions as well!) is horrible -- have a look at the TeX code
emitted to see what I mean. Jade uses LaTeX as a pure renderer (i.e. "put
these three words here" etc.) and doesn't take advantage of TeX's
excellent typography. The same goes for ReportLab and similar solutions.
(BTW, if a lot of this sounds frustrated it's because I've fought a long
battle with many of the above solutions.) Another approach would be
to write your own transformer in one of the en-vogue scripting languages
like Perl/Python/CL/Scheme etc.
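
To make the "write your own transformer" idea concrete, here is a minimal
Python sketch. The element names (doc, title, para, em, code) are purely
hypothetical and not taken from any real DTD, and the text is assumed to
contain no TeX special characters. The point is only that elements map to
semantic LaTeX markup and the actual typesetting is left to TeX.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; a real DTD would define its own vocabulary.
# Each entry is a (prefix, suffix) pair wrapped around the element's content.
BLOCK = {"title": ("\\section{", "}"), "para": ("", "")}
INLINE = {"em": ("\\emph{", "}"), "code": ("\\texttt{", "}")}


def inline(elem):
    """Render mixed content: the element's text plus its inline children."""
    parts = [elem.text or ""]
    for child in elem:
        prefix, suffix = INLINE.get(child.tag, ("", ""))
        parts.append(prefix + inline(child) + suffix)
        parts.append(child.tail or "")
    return "".join(parts)


def to_latex(xml_source):
    """Turn each child of the root into one LaTeX paragraph-level unit.

    Assumption: the text contains no TeX special characters.
    """
    root = ET.fromstring(xml_source)
    blocks = []
    for block in root:
        prefix, suffix = BLOCK.get(block.tag, ("", ""))
        # Collapse runs of whitespace; TeX does its own line breaking.
        text = " ".join(inline(block).split())
        blocks.append(prefix + text + suffix)
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = "<doc><title>Intro</title><para>Some <em>marked</em> text.</para></doc>"
    print(to_latex(sample))
```

Unknown elements fall through with their text intact, which keeps the
sketch forgiving while the tag set is still in flux.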
Someone mentioned the possibility of an XSLT transformation from
XML into TeX: this is a tempting idea, but one must not forget
that the XSLT model only talks about transforming one XML format into
another XML format. It's possible to (ab)use the 'text' output option,
but you end up fighting escaping problems (hint: you can't just write
out text since it might contain nasty TeX control characters like '%',
'&' or '_').
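
Whatever tool does the transformation, every text node has to be escaped
before it reaches TeX. As a rough Python sketch (the names TEX_ESCAPES and
tex_escape are made up for illustration, and the replacements assume plain
LaTeX text mode, not math or verbatim):

```python
# Characters with special meaning in LaTeX running text, mapped to
# commonly used replacements. Assumes text mode, not math or verbatim.
TEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "#": r"\#",
    "$": r"\$",
    "%": r"\%",
    "&": r"\&",
    "_": r"\_",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}

# str.translate works character by character in a single pass, so the
# backslashes and braces introduced by one replacement are never
# escaped a second time.
_TABLE = str.maketrans(TEX_ESCAPES)


def tex_escape(text):
    """Return text with LaTeX special characters made literal."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(tex_escape("50% of a_b & {x}"))
```

The single-pass table matters: a chain of naive string replacements would
re-escape its own output, depending on the order of the calls.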
Another point to consider: If your data will end up in huge tables
think twice about using LaTeX. Typesetting tables isn't the strongest
virtue of TeX. 
If this is for a larger commercial project I'd _strongly_ suggest using
a tool like FrameMaker (that can use XML as the source of data). You
get excellent formatting, professional tables etc. The tool was really
designed for such tasks.
Yes, I know, it doesn't run under Linux (they _had_ a beta version out
for one year but decided not to sell it :-/).
I'm currently running Frame under Basilisk Mac emulation without any
problems so far.

 just my 2c

    Ralf Mattes
> --Michael
> 




