[lug] Perl question: how to print embedded metacharacters

Tkil tkil at scrye.com
Sat Nov 28 21:02:09 MST 2009


>>>>> "Chip" == Chip Atkinson <chip at pupman.com> writes:

Chip> I'm working on a perl script that reads strings from a
Chip> configuration file and prints them out.  What I'd like to be
Chip> able to do is have any embedded metacharacters interpreted but
Chip> I'd like to avoid doing the substitutions myself.

This thread went a little off the rails.  The short and naive answer
is: "use 'eval' in perl".  That is:

  my $str = 'foo\nbar';   # single quotes preserve backslashes
                          #   except \' and \\

  print $str;             # should print: foo\nbar

  print eval "\"$str\"";  # should print: foo then bar on a new line.

The problem with eval is that it is a full-on perl interpreter (hence
the need for double quotes around the value), and that means that a
string from the config file could invoke things like 'unlink' and
'system' (and 'socket'...).

(Perl 'eval' is also substantially slower than the simple hash lookup
I describe below, but the security implications alone are enough to
keep me from recommending it.)
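
To make the risk concrete, here is a small sketch of how a config value
can run code through that eval.  The side_effect sub and the counter are
made up for illustration; they stand in for something destructive like
'unlink' or 'system'.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for something destructive (unlink, system, ...).
sub side_effect { $calls++; return '' }

# A "configuration value" that smuggles a function call into the string.
# Single quotes keep it inert until it reaches eval.
my $str = 'foo@{[ side_effect() ]}bar';

# The same eval as above: the string is interpolated as perl source.
my $out = eval "\"$str\"";

print "result: $out\n";
print "side_effect ran $calls time(s)\n";
```

The value only had to contain an interpolated expression; a stray double
quote in the value would let it run arbitrary statements as well.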

You're really much better off just doing the substitution yourself.
It's small and fast, at least for simple one-character escapes:

  # see "quotes and quote-like operators" in perlop
  my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );

(Yes, I originally went here:

  my %esc = map { "\\".$_ => eval '"\\'.$_.'"' } qw( t n r f b a e );

but that seems like overkill.)

Anyway.  Now we want a regex that will match only those escapes:

  # quotemeta, so each key matches a literal backslash plus a letter
  my $esc_re = join '|', map { quotemeta($_) } sort keys %esc;
  $esc_re = qr/$esc_re/; # alas for qr//e...

Now you can do the substitution with a single s///:

  $string =~ s/($esc_re)/$esc{$1}/g;
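
Putting those pieces together, a minimal self-contained sketch might look
like this.  The sub name unescape_simple and the sample line are made up
for illustration.  The keys go through quotemeta so that the regex
matches a literal backslash followed by a letter, rather than an actual
tab or newline character.

```perl
use strict;
use warnings;

# Keys are the two-character sequences as they appear in the file;
# values are the characters they stand for.
my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );

# Build one alternation that matches any of the keys literally.
my $esc_re = join '|', map { quotemeta($_) } sort keys %esc;
$esc_re = qr/$esc_re/;

# Hypothetical helper name, not from the original thread.
sub unescape_simple {
    my ($string) = @_;
    $string =~ s/($esc_re)/$esc{$1}/g;
    return $string;
}

my $line = 'name\tvalue\n';    # as it would be read from a config file
print unescape_simple($line);
```

Anything that is not a key in %esc (say, '\q') is left exactly as it was.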

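If you want/need to handle all the possible escapes, then it gets a
bit uglier.  You can't simply handle each type one at a time; if one
escape evaluates to a backslash, a later pass can mistake it for the
start of another escape.  (For example, '\\n' in the file should come
out as a backslash followed by 'n', but converting '\\' first and '\n'
second turns it into a newline.)  So you end up with either a single
hairy expression, or a set of expressions that are linked statefully
(e.g., using \G and /c).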
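
For the "single hairy expression" route, one possible sketch is below.
It is not a complete implementation of perl's double-quote rules: it
assumes you only care about the one-letter escapes, an escaped backslash,
\xHH and octal \NNN, and it leaves anything else (including \x{...}, \c
and \N{...}) untouched.  The names %simple and unescape are made up for
illustration.

```perl
use strict;
use warnings;

# One-character escapes, keyed by the character after the backslash.
my %simple = (
    't' => "\t", 'n' => "\n", 'r' => "\r", 'f' => "\f",
    'b' => "\b", 'a' => "\a", 'e' => "\e", '\\' => '\\',
);

# Hypothetical helper: convert every escape in a single left-to-right
# pass, so the output of one escape is never rescanned as input.
# Unknown escapes fall through to the last branch and are put back as is.
sub unescape {
    my ($string) = @_;
    $string =~ s{
        \\                          # a literal backslash, then one of:
        (?: x ([0-9A-Fa-f]{1,2})    #   group 1: one or two hex digits
          |   ([0-7]{1,3})          #   group 2: one to three octal digits
          |   (.)                   #   group 3: any other single character
        )
    }{
        defined($1)         ? chr(hex($1))
      : defined($2)         ? chr(oct($2))
      : exists $simple{$3}  ? $simple{$3}
      :                       "\\$3"
    }gsex;
    return $string;
}

print unescape('one\ttwo\x21\n');
```

Because everything happens in one pass, an escaped backslash followed by
'n' stays a backslash and an 'n' instead of collapsing into a newline.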

Happy hacking,
t.


