[lug] Grep question
Matt Thompson
thompsma at colorado.edu
Fri Jul 23 13:08:19 MDT 2004
On Fri, 2004-07-23 at 11:31, David Morris wrote:
> On Fri, Jul 23, 2004 at 03:46:23AM -0600, Daniel Webb wrote:
>
> > Come to think of it, what I really want is to be able to grep inside any
> > kind of file or archive that can possibly be converted to text (for
> > example, something.pdf.gz could be gunzipped, then pdftotext used to
> > convert to text). Does THAT exist?
>
> No that does not exist, but it would be relatively simple to
> create a script which does that for known file types. Just
> a list of if-then-else statements to handle various file
> types and convert them to text before doing the grep
> (piping everything through stdout/stdin when possible to
> avoid unneeded temporary files). The 'file' program can be
> of particular use here in determining what type each input
> file really is.
If someone here is bored enough to try this, I think a newer lesspipe.sh
would get you at least halfway there. Its core is a long if/elif
chain, and you can already do a "less tar.gz:contained_file" to view
a file inside a tarball.
Oddly, it handles .ps.gz but not .pdf.gz. I'm guessing that's a bug in
how pdftotext, gzip, and the "-" argument interact, but I'm not bored
enough to track it down.
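
For what it's worth, the "chain of type checks, then grep" idea could be sketched roughly as below. This is only a minimal illustration in Python, not what lesspipe.sh actually does: the names to_text, pdf_to_text, and grep_any are made up for this example, the type detection uses leading magic bytes rather than the 'file' program, and it assumes pdftotext is installed whenever a PDF is encountered. The PDF is written to a temporary file instead of being piped in, since reading a PDF from stdin is not assumed to work.

```python
import gzip
import os
import re
import shutil
import subprocess
import tempfile

# Leading bytes used to recognise each type (a stand-in for 'file').
GZIP_MAGIC = b"\x1f\x8b"
PDF_MAGIC = b"%PDF"


def pdf_to_text(data):
    """Convert PDF bytes to text with the external pdftotext tool."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # Assumption: use a real temporary file as input rather than stdin.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(data)
        tmp_path = tmp.name
    try:
        # "pdftotext FILE -" writes the extracted text to stdout.
        result = subprocess.run(
            ["pdftotext", tmp_path, "-"],
            check=True,
            capture_output=True,
        )
        return result.stdout.decode("utf-8", errors="replace")
    finally:
        os.unlink(tmp_path)


def to_text(data):
    """Peel off known wrappers until something text-like is left."""
    if data.startswith(GZIP_MAGIC):
        # Recurse so that something.pdf.gz is gunzipped, then converted.
        return to_text(gzip.decompress(data))
    if data.startswith(PDF_MAGIC):
        return pdf_to_text(data)
    # Fallback: treat anything unrecognised as plain text.
    return data.decode("utf-8", errors="replace")


def grep_any(pattern, path):
    """Return the lines of the converted file that match pattern."""
    with open(path, "rb") as handle:
        text = to_text(handle.read())
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]
```

Calling grep_any("needle", "something.pdf.gz") would gunzip the file, run the result through pdftotext, and return the matching lines. Supporting another file type would mean adding one more branch to to_text.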
--
Learning just means you were wrong and they were right. - Aram
Matt Thompson -- http://ucsub.colorado.edu/~thompsma/
440 UCB, Boulder, CO 80309-0440
JILA A510, 303-492-4662