[lug] Re: Using curl to get at pages that require a cookie

Brennen Bearnes bbearnes at gmail.com
Thu Nov 2 13:31:38 MST 2006


Bill Thoen <bthoen at gisnet.com> wrote:

> I'm trying to use curl to download a series of HTTP pages that require a
> password at a top-level page and then I guess they use cookies to allow
> access to the lower pages. When I log in via my browser and enter the
> password, I can get to all the pages.

This might not be at all helpful, but for a while I was doing a lot of screen
scraping stuff for a client (corporate press releases, specifically) and
would encounter all sorts of weird authentication, JavaScript-based links,
etc., where it'd be trivial to get to a resource in-browser, and just
tremendously annoying to do the same with scriptable tools.  Eventually
an acquaintance pointed me at a Firefox extension called DownThemAll.
It's clearly geared towards slurping collections of media files (porn and
piracy, I assume), but it's flexible and generalizes quite well to a lot of
other areas.  I've basically quit using curl and wget for this sort of one-off
task.

http://www.downthemall.net/
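
If you'd rather keep it scripted, the usual pattern is to hang on to
whatever cookie the login page hands back and send it along with every
later request. Here's a rough sketch using only Python's standard library.
I don't know how your site's login actually works, so the form-based POST,
the "password" field name, and the example.com URLs are all guesses you'd
need to replace.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login(opener, login_url, fields):
    """POST the login form. Any Set-Cookie in the reply lands in the jar."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(login_url, data=data) as resp:
        return resp.status


def fetch(opener, url):
    """GET a page, sending back whatever cookies the jar holds for that host."""
    with opener.open(url) as resp:
        return resp.read()


# Hypothetical usage; the URLs and the field name are placeholders:
# opener, jar = make_session()
# login(opener, "http://example.com/login", {"password": "secret"})
# page = fetch(opener, "http://example.com/members/page1.html")
```

This only covers the case where the password goes in through an ordinary
HTML form. If the site uses HTTP basic auth, or builds its links with
JavaScript, it won't be enough on its own.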

-- Brennen Bearnes
http://p1k3.com/


