[lug] Regex Help

Jeffrey S. Haemer jeffrey.haemer at gmail.com
Sun Jul 10 07:18:38 MDT 2011


Slightly off-topic, here's how I made my regex life much nicer.

This handful of related tricks has transformed a day's slow frustration into
a few minutes of typing, more times than I'm willing to admit.

Maybe one or two of these'll help someone else, too.

(1) Perl permits "bookend" delimiters for regexes.  Instead of

m/regex/ or m!regex!


I often use

m(regex) or m[regex] or m{regex}


For me, these are more readable, and they work around almost all
quoting-and-backslashing nonsense.  If a regex is trivial, I write
print if /regex/ ; if it's not, I use bookends.

(If the program I'm writing is in JavaScript, or something else that
isn't Perl, I still do this. See trick #5, all the way at the bottom.)

(2) If I'm debugging a regex, I never do it in the program.  I do it on the
command line with this idiom:

$ perl -ne 'print if s(regex)(XXXXX)'


This sits and waits for me to type lines at it. For each line, it

   - prints the line only if it matches the regex
   - shows me exactly what it matched by substituting 'XXXXX' for the
     matched text.

(3) Test-driven development (TDD) was *made* for regexes.

I make a file with lines that should and shouldn't match, then feed it to
the one-liner in #2. Vi lets me cut-and-paste a zillion variants quickly; I
bet your favorite text editor will, too.

$ cat testfile

http://localhost/whatever # yes
foo bar http://yahoo.com/whatever # yes
http://localhost/something/whatever # NO
http://127.0.0.1/whatever # yes
127.0.0.1/whatever # yes
localhost.localdomain/whatever # yes
localhost/localdomain/whatever # NO
http://127.0.0.1/localhost/whatever # NO
http://127.0.0.1/something/whatever  # NO
http://localhost/something/else/whatever # yes
http://127.0.0.1/whatever/else # yes
...

My real work is still done by typing input at #2, but every time I think I
have the regex right, I recall the command, feed it the test file and look
for a "NO".

$ perl -ne 'print if s(regex)(XXXXX)' < testfile | grep NO
http://127.0.0.1/localhost/XXXXX # NO


Oops. Back to development.

Once that passes, and I'm feeling really cocky, I recall the command, and
tweak it in two places to reverse the test:

$ perl -ne 'print unless s(regex)(XXXXX)' < testfile | grep -v NO
http://yahoo.com/whatever # yes


(Rats.  Back to tweaking.)
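
The two checks can also be run back to back. What follows is only a sketch under stated assumptions: the regex (\bcat\b) and the four test lines are invented stand-ins, not the ones from the real job, and the temporary file plays the role of testfile.

```shell
# Hypothetical test file: lines tagged "# yes" should match, "# NO" should not.
testfile=$(mktemp)
cat > "$testfile" <<'EOF'
the cat sat # yes
concatenate # NO
cat # yes
dog # NO
EOF

# Check 1: lines tagged NO that matched anyway (false positives).
# "|| true" keeps an empty grep result from looking like a failure.
false_pos=$(perl -ne 'print if s(\bcat\b)(XXXXX)' < "$testfile" | grep NO || true)

# Check 2: lines tagged yes that failed to match (false negatives).
false_neg=$(perl -ne 'print unless s(\bcat\b)(XXXXX)' < "$testfile" | grep -v NO || true)

rm -f "$testfile"

# The regex passes only if both checks come back empty.
if [ -z "$false_pos" ] && [ -z "$false_neg" ]; then
    result=pass
else
    result=fail
fi
echo "$result"
```

Anything left in false_pos or false_neg is a line to go stare at.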

(4) When -- at last! -- I have the right regex, fc pulls the command up in
$EDITOR, and I copy the sucker right into the program I really needed it for
in the first place.

(5) If my original program's not in Perl, I still do it this way.  I may
have to throw in a bunch of PITA quoting-and-backslashing at the end, but at
least I know I have exactly the right regex.

-- 
Jeffrey Haemer <jeffrey.haemer at gmail.com>
720-837-8908 [cell], http://seejeffrun.blogspot.com [blog],
http://www.youtube.com/user/goyishekop [vlog]
*פרייהייט? דאס איז יאַנג דינען וואָרט.*