[lug] Perl question: how to print embedded metacharacters
Chip Atkinson
chip at pupman.com
Sun Nov 29 09:42:36 MST 2009
Thanks for the suggestion. While I'm not too worried about security since
it's a semi-personal utility, it's always good to keep in mind just the same.
The most important thing is speed since the files are rather huge. I
think I'll implement the regex/hash approach tomorrow.
Thanks for the ideas!
Chip
On Sat, 28 Nov 2009, Tkil wrote:
> >>>>> "Chip" == Chip Atkinson <chip at pupman.com> writes:
>
> Chip> I'm working on a perl script that reads strings from a
> Chip> configuration file and prints them out. What I'd like to be
> Chip> able to do is have any embedded metacharacters interpreted but
> Chip> I'd like to avoid doing the substitutions myself.
>
> This thread went a little off the rails. The short and naive answer
> is: "use 'eval' in perl". That is:
>
>     my $str = 'foo\nbar';    # single quotes preserve backslashes
>                              # except \' and \\
>
>     print $str;              # should print: foo\nbar
>
>     print eval "\"$str\"";   # should print: foo then bar on a new line.
>
> The problem with eval is that it is a full-on perl interpreter (hence
> the need for double quotes around the value); and that means that the
> string could invoke things like 'unlink' and 'system' (and
> 'socket'...)
>
> (Perl 'eval' is also substantially slower than the simple hash lookup
> I describe below, but the security implications alone are enough to
> keep me from recommending it.)
>
> You're really much better off just doing the substitution yourself.
> It's small and fast, at least for simple one-character escapes:
>
>     # see "quotes and quote-like operators" in perlop
>     my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );
>
> (Yes, I originally went here:
>
>     my %esc = map { ( "\\$_" => eval qq{"\\$_"} ) } qw( t n r f b a e );
>
> but that seems like overkill.)
>
> Anyway. Now we want a regex that will match only those escapes:
>
>     # quotemeta, so that '\t' matches a literal backslash-t, not a tab
>     my $esc_re = join '|', map { quotemeta } sort keys %esc;
>     $esc_re = qr/$esc_re/;   # alas for qr//e...
>
> Now you can do the substitution with a single s///:
>
>     $string =~ s/($esc_re)/$esc{$1}/g;
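
For reference, here is how the pieces above might fit together as one
script. This is only a minimal sketch: the helper name expand_escapes
and the sample string are made up for illustration, and it covers just
the three escapes in the hash. quotemeta is applied to the keys so the
regex matches a literal backslash followed by a letter rather than the
control character itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each two-character escape (backslash + letter) to the character it names.
my %esc = ( '\t' => "\t", '\n' => "\n", '\r' => "\r" );

# Build one alternation from the keys. quotemeta keeps the backslash
# literal; without it the pattern \n would match a real newline.
my $alt    = join '|', map { quotemeta } sort keys %esc;
my $esc_re = qr/($alt)/;

# expand_escapes is a hypothetical helper name, not something from the thread.
sub expand_escapes {
    my ($string) = @_;
    $string =~ s/$esc_re/$esc{$1}/g;   # $1 is the escape that matched
    return $string;
}

# Sample input as it might appear in a config file (single-quoted, so
# the backslashes are still literal at this point).
print expand_escapes('foo\nbar\tbaz'), "\n";
```

Since the regex is compiled once and each match is a single hash
lookup, this should stay cheap even on large files.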
>
> If you want/need to handle all the possible escapes, then it gets a
> bit uglier. You can't simply handle each type one at a time: if one
> escape evaluates to a backslash, a later pass can mistake it for the
> start of a new escape. For example, '\\n' should become a backslash
> followed by 'n', but expanding '\\' first and '\n' second yields a
> newline instead. So you end up with either a single hairy expression,
> or a set of expressions that are linked statefully (e.g., using \G
> and /c).
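
A rough sketch of what the "single hairy expression" route could look
like, for what it's worth. The set of escapes handled here (the simple
letters, an escaped backslash or double quote, \xHH hex and \NNN octal)
is my own assumption and not the full list from perlop, and the names
%simple and unescape are made up. The point is that one left-to-right
s///ge pass consumes each backslash together with whatever follows it,
so a backslash produced by one escape is never rescanned.

```perl
use strict;
use warnings;

# Single-character escapes; an assumed subset of what perlop documents.
my %simple = (
    'n' => "\n", 't' => "\t", 'r' => "\r", 'f' => "\f",
    'b' => "\b", 'a' => "\a", 'e' => "\e",
    '\\' => '\\', '"' => '"',
);

# unescape is a hypothetical helper name.
# Group 1: one or two hex digits after "x".
# Group 2: one to three octal digits.
# Group 3: any other single character after the backslash.
sub unescape {
    my ($s) = @_;
    $s =~ s{
        \\ (?: x([0-9A-Fa-f]{1,2}) | ([0-7]{1,3}) | (.) )
    }{
        defined($1)          ? chr(hex($1))
      : defined($2)          ? chr(oct($2))
      : exists $simple{$3}   ? $simple{$3}
      :                        "\\$3"      # unknown escape: leave as-is
    }gexs;
    return $s;
}

print unescape('col1\tcol2\x21'), "\n";
```

Unknown escapes are passed through untouched here; whether that or
dying loudly is better probably depends on how much you trust the
config file.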
>
> Happy hacking,
> t.