[lug] Stupid WGET question
George Sexton
gsexton at mhsoftware.com
Thu Feb 17 09:30:53 MST 2005
-E just tacks .html onto the end of the file name, after any query string.
-A doesn't work either.
It seems that wget has a default set of file types it will download, and -A
can only narrow that set, not extend it.
I tried adding the file extension to the mime.types entry for
text/html, but that didn't work either.
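For reference, -A takes a comma-separated list of suffixes (or shell-style glob patterns), not brace-expansion arguments. A sketch of the kind of invocation being discussed, using the demo URL from the thread; the exact flags are an assumption about what George wants, not a confirmed fix:

```shell
# Recurse one level from the page, stay under the starting directory (-np),
# and keep only .html and .ics files. Note: when -A is given, wget still
# downloads HTML pages to parse them for links, then deletes any that do
# not match the accept list -- so 'html' must stay in the list for the
# starting page to survive.
wget -r -l1 -np -A 'html,ics' http://www.mhsoftware.com/caldemo/iCal.html
```

Whether the .ics links are followed at all depends on them appearing as plain hrefs in the parsed HTML; links generated by JavaScript or hidden behind query strings will not be picked up.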
George Sexton
MH Software, Inc.
http://www.mhsoftware.com/
Voice: 303 438 9585
> -----Original Message-----
> From: lug-bounces at lug.boulder.co.us
> [mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Matt Thompson
> Sent: Thursday, February 17, 2005 8:23 AM
> To: Boulder LUG
> Subject: Re: [lug] Stupid WGET question
>
> On Wed, 2005-02-16 at 17:11 -0700, George Sexton wrote:
> > Anyone know how to have wget retrieve non-HTML files when
> it traverses an
> > HTML page?
> >
> > For example, I have an HTML page that has links to iCal
> files on it. I want
> > WGET to retrieve the .HTML file, and all .ICS files
> referenced from that
> > page.
> >
> > Here's the URL:
> >
> > http://www.mhsoftware.com/caldemo/iCal.html
>
> Hmm...there might be a way to do this, but you'll probably have to
> fiddle around to get what you need. For example, if you need to get
> the .css files for the .html page, you'd have to grab those too.
>
> The way I might do it is:
>
> wget --mirror -A 'ics,html' http://...
>
> This should go through the entire tree and grab every .ics and .html
> file preserving directory structure. I think. Try it and
> see, I guess.
> You can use -A and -R to accept/reject file extensions if you need
> more/less.
>
> HTH,
> Matt
>
> --
> Learning just means you were wrong and they were right. - Aram
> Matt Thompson -- http://ucsub.colorado.edu/~thompsma/
> 440 UCB, Boulder, CO 80309-0440
> JILA A510, 303-492-4662
>
> _______________________________________________
> Web Page: http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: lug.boulder.co.us port=6667 channel=#colug
>
>