[lug] vi wildcards

Tkil tkil at scrye.com
Fri May 25 10:23:21 MDT 2001


>>>>> "John" == John Starkey <jstarkey at advancecreations.com> writes:

John> I also worked out a way to chop the unwanteds outta all 72 files
John> based on the info in this thread. I've been slowly force-feeding
John> myself regexp and perl. You guys pushed that along quite a bit.

if you didn't already find it, take a look at the '-i' flag for perl
(man perlrun), which edits files in place; with a suffix like '.bak'
it keeps a backup of each original.  e.g.:

   perl -i.bak -lpwe 's:</?span[^>]*>::gi' *.html
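
for anyone curious what '-i' and '-p' are doing there, here is a rough
equivalent written out as a script. it's only a sketch: the sub name
strip_span_tags is made up for illustration, and it leaves out the
line-ending handling that '-l' adds (as far as i can tell the only
visible difference is that '-l' appends a newline to a last line that
lacks one).

```perl
use strict;
use warnings;

# hypothetical helper: strip <span ...> and </span> tags from each file
# in place, keeping the original as FILE.bak (which is what -i.bak does).
sub strip_span_tags {
    my @files = @_;
    return unless @files;      # with an empty @ARGV, <> would read STDIN

    # $^I is the variable behind -i; localizing it together with @ARGV
    # keeps the in-place editing confined to this sub.
    local ($^I, @ARGV) = ('.bak', @files);

    # this loop is roughly what -p wraps around the -e code.
    while (<>) {
        s:</?span[^>]*>::gi;
        print;                 # goes to the rewritten file, not the terminal
    }
    return;
}

strip_span_tags(@ARGV);
```

saved as, say, strip_span.pl (again, a made-up name), it would be run
as "perl strip_span.pl *.html".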

t.

p.s. in keeping with my title of "sick little monkey", here's one of
     the regexes i wrote as a partial replacement for the HTML::Parser
     module (when i was in a situation where i couldn't install any
     CPAN modules...):

        if ($t =~ m{\G
              (                   # (whole thing is in $1)
               <([^/]\w*)         # start of tag ($2)
               \s*
               ((?:               # attribute list ($3)
                 (?:[^>\s=]+      #   the attribute itself
                  (?:\s*=\s*      #   maybe followed by an equals sign and
                   (?:\"[^\"]*\"| #     double-quoted, or
                    \'[^\']*\'|   #     single-quoted, or
                    [^>\s]+)      #     plain value
                  )?              #   or maybe not.
                  \s*             #   and a bit of whitespace
                 ))*)             # we can have 0 or more attributes
               \s*>)}gcx)

     which matches opening tags, and then i might have to parse the
     attribute list, which ended up in $3:

          my $attr_list = $3;

          while ($attr_list =~
                 m{([^\s=]+)            #   the attribute itself ($1)
                   (\s*=\s*             #   maybe followed by ($2)
                    (?:
                     \"([^\"]*)\"|      #     double-quoted ($3), or
                     \'([^\']*)\'|      #     single-quoted ($4), or
                     ([^>\s]+)))?       #     plain value  ($5)
                   \s*}gcx)
          {
            # take the lower-case attribute name.
            my $key = lc($1);

            # assign a value, if we have one...
            my $val = $2 && (defined $3 ? $3 :
                             defined $4 ? $4 :
                             defined $5 ? $5 :
                             undef);

            # and store it for later.
            $attrs{$key} = $val;
          }
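
     to see the two pieces working together, here is a small
     self-contained sketch. the sub name parse_start_tag and its
     return shape (tag name plus a hash reference) are assumptions
     for illustration, not part of the original code. the outer
     "whole thing" capture is dropped, so the tag name lands in $1
     and the attribute list in $2. the attribute-name pattern here is
     [^\s=]+ with the value group optional, so bare attributes such
     as "nowrap" come through with an undef value. like the regexes
     above, it makes no attempt to handle comments, doctypes, or
     broken markup.

```perl
use strict;
use warnings;

# hypothetical helper: try to match one opening tag at the start of the
# given text. returns (lower-cased tag name, hashref of attributes), or
# an empty list when the text does not start with an opening tag.
sub parse_start_tag {
    my ($t) = @_;

    return unless $t =~ m{\G
          <([^/]\w*)                 # tag name (capture 1)
          \s*
          ((?:                       # attribute list (capture 2)
            [^>\s=]+                 #   attribute name
            (?:\s*=\s*               #   maybe followed by a value:
              (?:"[^"]*"|'[^']*'|[^>\s]+)
            )?
            \s*
          )*)
          \s*>}gcx;

    my ($tag, $attr_list) = ($1, $2);
    my %attrs;

    while ($attr_list =~
           m{([^\s=]+)               # attribute name (capture 1)
             (\s*=\s*                # optional value part (capture 2)
               (?:"([^"]*)"|         #   double-quoted (capture 3), or
                  '([^']*)'|         #   single-quoted (capture 4), or
                  ([^>\s]+)))?       #   plain value (capture 5)
             \s*}gcx)
    {
        # pick whichever value capture matched; undef for bare attributes.
        my $val = defined $3 ? $3 :
                  defined $4 ? $4 :
                  defined $5 ? $5 :
                  undef;
        $attrs{lc $1} = $val;
    }

    return (lc $tag, \%attrs);
}

# usage: print the tag name and each attribute found.
my ($tag, $attrs) = parse_start_tag(q{<A HREF="index.html" nowrap>});
if (defined $tag) {
    print "tag: $tag\n";
    for my $key (sort keys %$attrs) {
        my $shown = defined $attrs->{$key} ? $attrs->{$key} : '(no value)';
        print "  $key = $shown\n";
    }
}
```

     returning an empty list on failure lets the caller write
     "if (my ($tag, $attrs) = parse_start_tag($text)) { ... }".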



