[lug] unescaping url encoded document

Walter Pienciak wpiencia at thunderdome.ieee.org
Fri Nov 6 14:00:20 MST 2009


On Fri, Nov 06, 2009 at 01:51:33PM -0700, Zan Lynx wrote:
> On 11/6/09 1:21 PM, Kenneth D Weinert wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > This is sort of amusing. I got a scam email telling me that the IRS was
> > going to give me a refund of $773.00 and all I had to do was fill in the
> > form and send it off.
> >
> > I clicked on it just to see where they were really sending it and did a
> > "View Source" in my browser.  Here are the first 4 lines (4th line
> > truncated):
> >
> > <Script Language='Javascript'>
> > <!-- HTML Encryption provided by IRS -->
> > <!--
> > document.write(unescape('%3C%21%44%4F%43%54%59%50%45%20%48%54%4D%4C%20%50
> >
> >
> > It displays fine, but I'm just curious what the submit button does and
> > wondered if anyone had an easy shortcut to translate the URL Encoding
> > into plain text outside of a browser.
> >
> > An interesting variation, at least one I hadn't seen before.
> 
> Sometimes it is a simple expansion. Other times it expands into more 
> Javascript, and the only easy way to find the output is to actually run it.
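
For the simple-expansion case, here is a rough Python sketch that pulls the
string literals out of unescape('...') calls in a saved copy of the page and
decodes them, repeating while the decoded text contains further unescape()
calls. The names expand_unescape_calls and max_rounds are my own invention,
and the sample string is the truncated fragment from the original message with
the quote and parentheses closed by hand. It assumes the payload is a static
literal; if the script builds the string at run time, this will not find it,
and you are back to running the code.

```python
import re
from urllib.parse import unquote

# Matches unescape('...') or unescape("...") and captures the string literal.
UNESCAPE_CALL = re.compile(r"""unescape\(\s*(['"])(.*?)\1\s*\)""", re.DOTALL)


def expand_unescape_calls(source, max_rounds=10):
    """Return every decoded payload found in source, outermost layer first.

    Hypothetical helper: it only understands literal string arguments and
    plain %XX escapes, decoded one byte per character (latin-1).
    """
    layers = []
    pending = [source]
    for _ in range(max_rounds):
        found = [
            unquote(match.group(2), encoding="latin-1")
            for text in pending
            for match in UNESCAPE_CALL.finditer(text)
        ]
        if not found:
            break
        layers.extend(found)
        pending = found  # look for another layer inside what we just decoded
    return layers


# Hand-closed version of the truncated line quoted above (an assumption).
page = "document.write(unescape('%3C%21%44%4F%43%54%59%50%45%20%48%54%4D%4C%20%50'))"
for layer in expand_unescape_calls(page):
    print(layer)  # <!DOCTYPE HTML P
```

Nothing here executes the page's Javascript, so it is safe to point at a scam
page saved to disk.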

FWIW, this example seems to be plain URL escaping, as specified in
RFC 2396, section 2.4.1:

"An escaped octet is encoded as a character triplet, consisting
of the percent character "%" followed by the two hexadecimal
digits representing the octet code. For example, "%20" is the
escaped encoding for the US-ASCII space character."
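
The triplet rule quoted above is small enough to write out by hand. A minimal
Python sketch, assuming every escape is a single-byte %XX triplet as in the
fragment from the original message (decode_triplets is just a name I made up):

```python
import re
from urllib.parse import unquote

# "%" followed by exactly two hexadecimal digits.
TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_triplets(text):
    """Replace each %XX triplet with the character whose code is 0xXX."""
    return TRIPLET.sub(lambda m: chr(int(m.group(1), 16)), text)


# The (truncated) fragment from the original message.
sample = "%3C%21%44%4F%43%54%59%50%45%20%48%54%4D%4C%20%50"
print(decode_triplets(sample))        # <!DOCTYPE HTML P
print(decode_triplets("%20") == " ")  # True
```

In practice the standard library already does this: for input like the above,
unquote(sample, encoding="latin-1") from urllib.parse gives the same result.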

Walter


