[lug] Scripting help, lynx
Jeffrey S. Haemer
jeffrey.haemer at gmail.com
Tue May 3 08:06:25 MDT 2011
Paul,
Another way to skin this cat:
for f in **/*.html; do
    lynx -nolist -dump "$f" > "${f%.html}.txt"
done
You have to have globstar set for ** to recurse, like this: shopt -s globstar
(it needs bash 4.0 or later). Me, I have it in my .bashrc.
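
If you want something reusable, here is a minimal sketch of the loop above wrapped in functions. The names txt_name and convert_tree are mine (hypothetical), not anything standard. It assumes bash 4.0 or later and lynx on your PATH. It swaps only a trailing .html suffix for .txt, so a directory that happens to be called "html" is left alone, and it quotes the paths so names with spaces survive.

```shell
#!/usr/bin/env bash
# Sketch only: txt_name and convert_tree are hypothetical helper names.

# Map path/to/file1.html -> path/to/file1.txt.
# Only the trailing .html suffix is replaced.
txt_name() {
    printf '%s\n' "${1%.html}.txt"
}

# Convert every .html file under $1 (default: current directory).
# The body runs in a subshell, so the shopt settings do not leak
# into the calling shell.
convert_tree() (
    root=${1:-.}
    # globstar makes ** recurse; nullglob makes the loop a no-op
    # when nothing matches.
    shopt -s globstar nullglob
    for f in "$root"/**/*.html; do
        [ -f "$f" ] || continue          # skip directories named *.html
        lynx -nolist -dump "$f" > "$(txt_name "$f")"
    done
)
```

Calling convert_tree with no argument starts from the current directory; convert_tree some/dir starts from that directory instead.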
On Tue, May 3, 2011 at 6:58 AM, Paul Nowosielski
<paulnowosielski at yahoo.com> wrote:
> Dear All,
>
> I'm trying to convert all the html files
> into text using lynx. The files are in many directories
> with meaningful names.
>
> Can anyone assist me in creating a script
> that will go through each directory recursively
> and convert the files to text and preserve the base name.
>
> ex: file1.html file1.txt file2.html file2.txt (or something close to this)
>
> I have this so far, which correctly traverses the directories
> and spits out the text. But I can't work out how
> to direct it to a txt file with the same name as the html file.
>
> find ./ -name '*.html' | xargs -I '{}' lynx -nolist -dump '{}'
>
> Any thoughts?
>
> Thank you,
>
> Paul
--
Jeffrey Haemer <jeffrey.haemer at gmail.com>
720-837-8908 [cell], http://seejeffrun.blogspot.com [blog],
http://www.youtube.com/user/goyishekop [vlog]
*και οστις σε αγγαρευσει μιλιον εν υπαγε μετ αυτου δυο*