[lug] Html to plain text

Atkinson, Chip CAtkinson at Circadence.com
Wed Jan 17 12:33:16 MST 2001


Look in the Perl Cookbook.  I know that there is an example there because I
saw it just yesterday.  I don't have the book with me right now though.
This recipe will strip the tags out across lines too.
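
For anyone without the book at hand, here is a rough sketch of the same idea. To be clear, this is not the Cookbook recipe (that one is Perl); it is a Python stand-in built on the standard library's html.parser, and the names TextExtractor and html_to_text are made up for illustration. Because it uses a real parser instead of a per-line regex, tags that span several lines are handled, and script/style bodies are dropped. It does not try to preserve <pre> layout or render tables.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> bodies."""

    SKIP = {"script", "style"}
    # Tags treated as line breaks in the output (an arbitrary, partial list).
    BLOCK = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &amp; etc. arrive decoded
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            # Line breaks in the HTML source are just whitespace.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

To run it unattended over a file, something like print(html_to_text(open("page.html").read())) in a small script should do, with "page.html" standing in for whatever file needs converting.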

Chip

> -----Original Message-----
> From: Ken Weinert [mailto:kenw at ihs.com]
> Sent: Wednesday, January 17, 2001 12:22 PM
> To: lug at lug.boulder.co.us
> Subject: Re: [lug] Html to plain text
> 
> 
> Take a look at this page, I think it will give you exactly what you
> want: http://home.netscape.com/newsref/std/x-remote.html
> 
> 
> * Carlos Hernández López (chernanl at banxico.org.mx) [010117 19:04]:
> > Yes, technically, html files ARE plain text. But what I want to do is
> > remove all the html tags and get a human readable plain text file. I
> > need exactly what netscape does with the sequence that Wayde has
> > described.
> > 
> > The thing is that I need to do it automatically, not by hand.
> > 
> > With Lynx I can get a plain text file but it is not so easy to read.
> > 
> > Any ideas?
> > 
> > "J. Wayde Allen" wrote:
> > 
> > > On Wed, 17 Jan 2001, Carlos Hernández López wrote:
> > >
> > > > Does anybody know an easy way to convert html files to plain
> > > > text files?
> > >
> > > Well ... one way using netscape is to use the click sequence:
> > >
> > >    file -> save as -> Format For Saved Document: Text
> > >
> > > - Wayde
> > >   (wallen at lug.boulder.co.us)
> > >
> > > _______________________________________________
> > > Web Page:  http://lug.boulder.co.us
> > > Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> > 
> 
> -- 
> Ken Weinert   kenw at ihs.com 303-858-6956 (V) 303-705-4258 (F)
> GnuPG KeyID: 9274F1CE           GnuPG available at http://www.gnupg.org/
> GnuPG Key Fingerprint: 1D87 3720 BB77 4489 A928  79D6 F8EC DD76 9274 F1CE
> Black holes are God's physical manifestation of a floating point exception.
