[lug] Typesetting Programs

David Morris lists at morris-clan.net
Thu Dec 12 14:26:49 MST 2002


On Thu, Dec 12, 2002 at 02:04:55PM -0700, Tkil wrote:
> >>>>> "David" == David Morris <lists at morris-clan.net> writes:
> 
> David> The core processing code for creating the documentation is all
> David> very similar, just the method that data is gathered is
> David> different....all I really need is to find some set of tags I
> David> can insert into a text stream that will then get fed into a
> David> formatting engine.
> 
> I think you're handwaving a bit hard here, and that it will come back
> to haunt you.  Then again, I'm a noted pessimist, so... :)

I'm being a bit general here on purpose...I have to be to
some extent, and in truth it doesn't matter in terms of the
basic question (which is, really, bigger than my current
pressing need).

> When you use a phrase like "As an example, creating reports from the
> content of a database [...] and creating a documentation file from
> it.", I really worry about what you're going to come up with.  Sure,
> you can get shiny output that's equivalent to "DESCRIBE my_table", but
> that's all it is -- and while some people like that sort of thing, I
> find it annoying.

The purpose of that specific task is as old as databases:
create a report on database content.  The difference, and
this is a problem with Linux-based database tools in
general, is that a simple report will not do.  I need to do
some computing on the results before they are reported, and
then a nice document must be created.  The end result is in
some cases a deliverable to a very high-profile customer,
and will be used by techs for years to come.  This means a
carefully crafted document, not just any simple output
format.  The traditional way of doing this (at least at my
company) is to do things the hard way and either manually
generate/format such documents, or use WYSIWYG tools that
offer a fraction of the functionality (or maintainability)
of something like LaTeX.
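To make the idea concrete, here is a minimal sketch of the
kind of thing I mean (the table name, columns, and the
derived "total" column are all made up for illustration):
pull rows from a database, compute a value that isn't stored
anywhere, and emit LaTeX table rows that a larger document
template can include.

```python
import sqlite3

def latex_escape(text):
    # Minimal escaping for a few LaTeX special characters in cell values.
    for char in ("&", "%", "_", "#"):
        text = text.replace(char, "\\" + char)
    return text

def report_rows(conn):
    # Pull raw rows, compute a derived column, and emit LaTeX table rows.
    cur = conn.execute("SELECT name, qty, unit_cost FROM parts ORDER BY name")
    lines = []
    for name, qty, unit_cost in cur:
        total = qty * unit_cost  # computed field, not stored in the database
        lines.append(f"{latex_escape(name)} & {qty} & {total:.2f} \\\\")
    return lines

# Hypothetical sample data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, qty INTEGER, unit_cost REAL)")
conn.executemany("INSERT INTO parts VALUES (?, ?, ?)",
                 [("bolt_m4", 100, 0.05), ("panel", 2, 19.90)])

rows = report_rows(conn)
print("\n".join(rows))
```

The point is only the shape of the pipeline: query, compute,
then hand clean markup to the formatting engine instead of
hand-formatting a WYSIWYG document.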

> Basically, magically deriving semantic markup ("this is a header",
> "this is a link", "this table describes street addresses and is
> capable of handling international postcodes") is *hard*.

I agree that in the generic sense this is not just hard,
but *VERY* hard.  On the other hand, this is not a generic
situation:  all the data is already sorted, linked together,
described, validated, etc.; it just needs documenting.

Now, don't confuse auto-generation of a document with
auto-documenting code.  The latter task is *impossible*.
Until my computer is not only smarter than I am, but psychic
and precognitive as well, the latter just cannot be done.
The best you can do is parse out a standard and give a
listing of what is already there, just in a different
format.

The issue, though, is gathering information that already
exists.  The documents that will be auto-generated will
contain no information that is not in the source...on the
other hand, the source is not in a human-readable format in
some cases, or requires heavy processing to get it into a
different form for some specific purpose.
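That "set of tags inserted into a text stream" could be as
simple as a table mapping record kinds to markup.  A sketch,
with made-up record kinds and LaTeX commands standing in for
whatever the real data sources produce:

```python
# Hypothetical mapping from record kinds to LaTeX markup.
MARKUP = {
    "title": r"\section{%s}",
    "field": r"\textbf{%s}",
    "note":  r"%s",
}

def tag_stream(records):
    # records: iterable of (kind, text) pairs, already sorted,
    # linked, and validated upstream; this step only adds markup.
    for kind, text in records:
        yield MARKUP[kind] % text

records = [("title", "Interfaces"),
           ("field", "eth0"),
           ("note", "primary uplink")]
tagged = list(tag_stream(records))
```

Swapping the MARKUP table is all it would take to target a
different formatting engine, which is why the gathering code
and the formatting code stay decoupled.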

> (This is speaking from a position of some experience.  A few years
> back, I was working on a project that would go through PDF files and
> try to reconstruct a table of contents, based on font size and style,
> position on page, etc.  It worked, but not without a lot of struggle.)

This is one thing that I will most definitely NOT
do....items such as a table of contents should be generated
by the base package I am using, NOT by some kludge that I
come up with....and it looks like LaTeX will do this nicely.

--David



