[lug] Limiting GoogleBot's Hoovering
Michael J. Hammel
mjhammel at graphics-muse.org
Mon Sep 25 09:29:33 MDT 2006
On Mon, 2006-09-25 at 06:26 -0600, Bill Thoen wrote:
> Is there any way to stop Googlebots from hoovering my public FTP site
> multiple times in a session? Last night, one of Google's robots downloaded
> everything on my site 6 times. I don't mind PEOPLE downloading what they
> WANT, but it sort of bugs me when a machine just takes everything and then
> does it several times at a go. I'd like to know what's popular and what's
> not because it helps me figure out what to post, but this sort of mindless
> downloading is not what I want.
Take a look at this. I'm pretty sure Google adheres to it, though I
can't say how closely. In a nutshell: create a robots.txt file in the
root directory that says what a spider or other bot can and cannot
peruse.
http://www.robotstxt.org/
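As a rough sketch (the paths here are just placeholders, not anything
specific to your site), a robots.txt that shuts Googlebot out entirely
would look something like this:

  User-agent: Googlebot
  Disallow: /

Or, to let well-behaved bots index most things but keep them out of a
particular directory:

  User-agent: *
  Disallow: /pub/archives/

How faithfully a given crawler honors this is up to the crawler, so
treat it as a polite request rather than a hard block.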
--
Michael J. Hammel Senior Software Engineer
mjhammel at graphics-muse.org http://graphics-muse.org
------------------------------------------------------------------------------
Remember that the best relationship is one in which your love for each
other exceeds your need for each other. -- Credited to the Dalai Lama.