[lug] parsing tool for linux

rm at fabula.de
Mon Apr 8 05:57:57 MDT 2002


On Sun, Apr 07, 2002 at 10:33:05PM -0600, Tom Tromey wrote:
> 
> There's a couple projects out there to change gcc to generate XML
> cross-reference information.  I don't recall what they're called; look
> on sourceforge or google.  I know that they've been mentioned on the
> gcc list sometime in the last couple months.  It sounds like this is
> the kind of thing you're looking for.

There was one on freshmeat on Saturday, at http://introspector.sourceforge.net/.
BTW, AFAIK there are ready-to-use Lex files floating around on the net; send me
mail if you are interested (I'd have to dig rather deep in old harddisk piles :-)
There's a C tokenizer in the guile source distribution (CVS at :pserver:anoncvs at subversions.gnu.org:/cvs,
module guile-core) in the directory 'libguile'. The file is called c-tokenize.lex and
can be compiled independently of guile like this:

-- Makefile.tokenizer ----

all:
	flex -t c-tokenize.lex > c-tokenize.c
	gcc -g -O2 -o c-tokenize c-tokenize.c 

The program acts as a filter, i.e. it expects the C code on stdin and prints the
tokens, one per line. The output looks like this:

| (comment . "/* The following is because we cannot portably get our hands on size_t
|  * (without autoconf's help, which isn't available because we want
|  * flex-generated scanners to compile on their own).
|  */")
| eol
| (id . "typedef")
| (id . "unsigned")
| (id . "int")
| (id . "yy_size_t")
| semicolon
| eol
| (id . "struct")
| (id . "yy_buffer_state")
| eol
| brace_open
| eol
| (id . "FILE")
| star
| (id . "yy_input_file")
| semicolon
| 
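
If you want to post-process that output in a script, here is a minimal sketch
in Python of how the token lines could be read back in. It is not part of
guile; the name parse_tokens is made up for illustration. It assumes the
leading "| " above is only quoting in this mail and not part of the real
output, that every token is either a bare word or a (kind . "text") pair, and
that quotes inside the text are backslash-escaped (I have not checked that
last point against the lex file).

```python
import re

# One token is either a dotted pair like (id . "typedef") or a bare word
# like semicolon. DOTALL lets a backslash escape a newline inside the text;
# plain newlines in multi-line comments are already covered by [^"\\].
TOKEN_RE = re.compile(r'\(\s*(\w+)\s*\.\s*"((?:[^"\\]|\\.)*)"\s*\)|(\w+)',
                      re.DOTALL)


def parse_tokens(text):
    """Return a list of (kind, value) tuples; value is None for bare tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        if match.group(3) is not None:
            # Bare token such as eol, star, brace_open.
            tokens.append((match.group(3), None))
        else:
            # Dotted pair: kind plus the quoted text (escapes left untouched).
            tokens.append((match.group(1), match.group(2)))
    return tokens
```

Feeding it the whole output at once, e.g. parse_tokens(sys.stdin.read()), gives
you a flat list that you can filter for the "id" entries or hand to a parser.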

This is of course only a token stream. For your application I assume you want the
abstract syntax tree. So either you write a parser with Yacc (I know I have the
Yacc files for C somewhere ...) or you hack gcc.
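
To show the gap between a token stream and a tree, here is a rough sketch in
Python that only groups tokens by brace nesting. It is nowhere near a C parser
(that is what the Yacc grammar would be for), and it makes assumptions: tokens
are (kind, value) tuples, the function name nest is made up, and the closing
brace token is called brace_close, which I am guessing from brace_open in the
output above.

```python
def nest(tokens, open_kind="brace_open", close_kind="brace_close"):
    """Group a flat list of (kind, value) tokens into nested lists per brace block."""
    root = []
    stack = [root]  # stack[-1] is the block currently being filled
    for kind, value in tokens:
        if kind == open_kind:
            block = []
            stack[-1].append(block)  # the new block is a child of the current one
            stack.append(block)
        elif kind == close_kind:
            if len(stack) == 1:
                raise ValueError("closing brace without matching opening brace")
            stack.pop()
        else:
            stack[-1].append((kind, value))
    if len(stack) != 1:
        raise ValueError("opening brace without matching closing brace")
    return root
```

Each brace block becomes a nested list, so a struct body ends up as one child
of the surrounding declaration. Anything smarter (types, declarators,
expressions) needs a real grammar.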

  Ralf Mattes
