[lug] vi wildcards becomes mod_perl/apache/asp

Tkil tkil at scrye.com
Fri May 25 17:13:05 MDT 2001


>>>>> "John" == John Starkey <jstarkey at advancecreations.com> writes:

John> I had to delete several other tags. I was just using the above
John> as an example. I'm assuming your code /? would delete the
John> closing tags also. (will check it out, thanks).

yes; /? means "zero or one occurrence of a slash".  since we also
allow for zero characters after the tag name, the regex

   </?span[^>]*>

will match <span class="foo"> and </span>.

if you are removing multiple tags, you can do them all at once (unless
they interfere in weird ways):

   perl -i.bak -lpwe 's:</?(span|div|br)[^>]*>::g' file1 file2 ...

i've actually done this sort of thing to remove excess <font> tags and
whatnot.

you can do other transformations if you like, too; you're not limited
just to a single s/// operation.  i could have done that as:

   perl -i.bak -lpwe 's:</?span[^>]*>::g; s:</?div[^>]*>::g; s:</?br[^>]*>::g' \
      file1 file2 ...

some wysiwyg html editors tend to leave zero-content turds around,
like <b></b> ... it's easy enough to filter those out too.  more
complex is handling cases where the pair of elements, or even a single
tag, is split over two lines.  i often use this style for long URLs:

   <a href="http://www.whereever.com/some/really/long/url/here.html"
     >http://www.whereever.com/some/really/long/url/here.html</a>

the bizarre line breaking is because the presence or absence of
whitespace *is* significant in the content of some elements,
especially for purposes of underlining links or splitting lines.
whitespace inside the tags themselves (between attributes or before
the closing >), on the other hand, is guaranteed to not matter.

John> Cool. Thanks. LWP was a problem. 01mailrc.txt was what it was
John> trying to get from Tokyo. Looking at it, it's a bunch of email
John> addies and aliases. I think I'll do it by hand :}

hm.  if you install LWP by hand (which can be lots of fun; the
Bundle::LWP might help there, but i don't know if i've ever done that
one by hand), set your proxy, and configure CPAN to use an HTTP
mirror, you might have some luck.
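as a rough sketch of the proxy and mirror steps: the proxy address below is a placeholder, and the mirror URL is just one example of an HTTP mirror. the "o conf" lines are typed at the interactive CPAN shell prompt, not at the unix shell.

```shell
# placeholder proxy address; substitute your own
export http_proxy=http://proxy.example.com:3128/

# then, inside the CPAN shell (started with: perl -MCPAN -e shell):
#   cpan> o conf urllist push http://www.cpan.org/
#   cpan> o conf commit
```

"o conf urllist" on its own prints the current mirror list, which is handy for checking what CPAN will try first.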

t.


