[lug] Perl question: how to print embedded metacharacters

Chip Atkinson chip at pupman.com
Sun Nov 29 09:42:36 MST 2009


Thanks for the suggestion.  While I'm not too worried about security since
it's a semi-personal utility, it's always good to consider it just the same.

The most important thing is speed since the files are rather huge.  I
think I'll implement the regex/hash approach tomorrow.

Thanks for the ideas!

Chip

On Sat, 28 Nov 2009, Tkil wrote:

> >>>>> "Chip" == Chip Atkinson <chip at pupman.com> writes:
> 
> Chip> I'm working on a perl script that reads strings from a
> Chip> configuration file and prints them out.  What I'd like to be
> Chip> able to do is have any embedded metacharacters interpreted but
> Chip> I'd like to avoid doing the substitutions myself.
> 
> This thread went a little off the rails.  The short and naive answer
> is: "use 'eval' in perl".  That is:
> 
>   my $str = 'foo\nbar';   # single quotes preserve backslashes
>                           #   except \' and \\
> 
>   print $str;             # should print: foo\nbar
> 
>   print eval "\"$str\"";  # should print: foo then bar on a new line.
> 
> The problem with eval is that it is a full-on perl interpreter (hence
> the need for double quotes around the value).  That means the string
> from your config file could invoke things like 'unlink' and 'system'
> (and 'socket'...)
> 
> (Perl 'eval' is also substantially slower than the simple hash lookup
> I describe below, but the security implications alone are enough to
> keep me from recommending it.)
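
To make the risk concrete, here is a harmless sketch. The config value is made up for illustration: it closes the double quote early and calls a function, and string eval runs it. A real attacker would use something nastier than uc().

```perl
use strict;
use warnings;

# Hypothetical hostile config value: it ends the quoted string,
# calls a function, then reopens the quote.
my $str = '" . uc("code ran") . "';

# The string eval sees:  "" . uc("code ran") . ""
my $result = eval "\"$str\"";

print "$result\n";
```

The uc() call stands in for anything the interpreter can do, which is why the value has to be trusted completely before it goes anywhere near eval.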
> 
> You're really much better off just doing the substitution yourself.
> It's small and fast, at least for simple one-character escapes:
> 
>   # see "quotes and quote-like operators" in perlop
>   my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );
> 
> (Yes, I originally went here:
> 
>   my %esc = map { "\\".$_ => eval '"\\'.$_.'"' } qw( t n r f b a e );
> 
> but that seems like overkill.)
> 
> Anyway.  Now we want a regex that will match only those escapes:
> 
>   # quotemeta so each key matches a literal backslash plus letter
>   my $esc_re = join '|', map { quotemeta($_) } sort keys %esc;
>   $esc_re = qr/$esc_re/; # alas for qr//e...
> 
> Now you can do the substitution with a single s///:
> 
>   $string =~ s/($esc_re)/$esc{$1}/g;
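
Put together, the hash-plus-regex approach might look like the sketch below. The sub name expand_escapes and the sample input are my own placeholders, not anything from the thread. Note the quotemeta on the keys: without it the pattern \t would match a real tab rather than a backslash followed by "t".

```perl
use strict;
use warnings;

# Two-character escapes as they appear in the config file,
# mapped to the characters they stand for.
my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );

# Build one alternation; quotemeta makes each key match literally.
my $esc_re = join '|', map { quotemeta($_) } sort keys %esc;
$esc_re = qr/$esc_re/;

# Hypothetical helper: expand the known escapes in a single pass.
sub expand_escapes {
    my ($string) = @_;
    $string =~ s/($esc_re)/$esc{$1}/g;
    return $string;
}

# Prints "foo", then "bar", a tab, and "baz" on the next line.
print expand_escapes('foo\nbar\tbaz'), "\n";
```

Since the regex is compiled once with qr//, the per-line cost is a single s///g, which should suit large files.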
> 
> If you want/need to handle all the possible escapes, then it gets a
> bit uglier.  You can't simply handle each type one at a time; if one
> escape evaluates to a backslash, it could completely confuse the
> output.  So you end up with either a single hairy expression, or a set
> of expressions that are linked statefully (e.g., using \G and /c).
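
One possible shape for the "single expression" route, as a sketch only: consume a backslash together with the character after it in one left-to-right pass, so an escaped backslash can never be re-read as the start of another escape. The table below covers just a few one-character escapes plus the backslash itself, and leaving unknown escapes untouched is my assumption, not something the thread settled. Octal, hex and \c escapes are not handled.

```perl
use strict;
use warnings;

# Keyed by the character that follows the backslash.
my %esc = (
    't'  => "\t",
    'n'  => "\n",
    'r'  => "\r",
    '\\' => '\\',    # an escaped backslash yields one backslash
);

# Hypothetical helper: one pass, each backslash pair is consumed once.
sub expand_all {
    my ($string) = @_;
    # Unknown escapes are left as-is (an assumption for this sketch).
    $string =~ s/\\(.)/exists $esc{$1} ? $esc{$1} : "\\$1"/ges;
    return $string;
}

# Input is: a, backslash, backslash, n, b.
# The result keeps a literal backslash followed by "n", not a newline.
print expand_all('a\\\\nb'), "\n";
```

Doing the same thing as two separate passes (first \\ and then \n) would turn that input into "a", a newline, "b", which is the confusion described above.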
> 
> Happy hacking,
> t.