[lug] Regex Help
Anthony Foiani
tkil at scrye.com
Sun Jul 10 22:15:20 MDT 2011
"Jeffrey S. Haemer" <jeffrey.haemer at gmail.com> writes:
> (1) Perl permits "bookend" delimiters for regexes.  Instead of
>
> m/regex/ or m!regex!
>
> I often use
>
> m(regex) or m[regex] or m{regex}
>
> For me, these are more readable, and they work around almost all
> quoting-and-backslashing nonsense.  If a regex is trivial, print if
> /regex/; if it's not, bookends.
Hm... the problem is that all of those bookends you suggest are also
valid regex chars; while perl is probably smart enough to keep track
of nested pairs, it could lead to surprises.
I'm not dissing the bookends, really; I do use them sometimes,
especially because they allow me to do "parallel" substitutions:
    while ( $line =~ s(foo bar baz)
                      (oof rab zab) ) { ... }
You can't do that with the non-paired variants. (Well, /x would allow
for some of these tricks, but introduces its own pain in the matching
regex to deal with desired whitespace.)
> (2) If I'm debugging a regex, I never do it in the program.  I do it
> on the command line with this idiom:
>
> $ perl -ne 'print if s(regex)(XXXXX)'
>
> This sits and waits for me to type at it, then
> * only prints lines that match the regex
> * shows me exactly what it matched by substituting 'XXXXX' for what
> it found.
Agreed; I use the command line for this stuff a lot. For that matter,
I learned a huge amount of perl by hanging out on EFNet #perl, doing
one-liners as quickly as I could. Learned to abuse the -M switch a
lot, and came up with horrors like this:
http://www.foo.be/docs/tpj/issues/vol4_3/tpj0403-0013.html
Getting back to regexes, you might also want to play around with the
various regex debugging tools available in Perl. Try this in a
terminal window:
    echo "foo bar baz" | \
      perl -Mre=debugcolor -lnwe 'print $1 while /(ba.?)/g;'
> (3) Test-driven development (TDD) was made for regexes.
>
> I make a file with lines that should and shouldn't match, then
> feed it to the one-liner in #2. Vi lets me cut-and-paste a
> zillion variants quickly; I bet your favorite text editor will,
> too.
You can also use the __DATA__ feature to automate this a bit more, if
you are so inclined (see 'perldata' man page):
    #!/usr/bin/perl

    use warnings;
    use strict;

    # use re qw( debugcolor );

    # regex under test
    my $test_re = qr/ba.?/;

    my $errors = 0;
    my $case = 0;

    # each DATA line is "<expected match count> <test string>"
    while ( my $line = <DATA> )
    {
        ++$case;
        $line =~ s!\s+\z!!;    # strip trailing whitespace and newline
        my ( $expected_count, $test_str ) = split ' ', $line, 2;
        print "case $case: expect $expected_count matches " .
              "of '$test_re' in '$test_str'\n";

        # count every non-overlapping match
        my $count = 0;
        ++$count while $test_str =~ /$test_re/g;
        if ( $count != $expected_count )
        {
            ++$errors;
            warn "error $errors: got $count matches";
        }
    }

    # non-zero exit status if any case failed
    exit ( $errors > 0 ? 1 : 0 );

    __DATA__
    2 foo bar baz
    0 blah gibber fee
Anyway. Happy hacking!
t.