[lug] Limiting GoogleBot's Hoovering

Michael J. Hammel mjhammel at graphics-muse.org
Mon Sep 25 09:29:33 MDT 2006


On Mon, 2006-09-25 at 06:26 -0600, Bill Thoen wrote:
> Is there any way to stop Googlebots from hoovering my public FTP site
> multiple times in a session? Last night, one of Google's robots downloaded
> everything on my site 6 times. I don't mind PEOPLE downloading what they
> WANT, but it sort of bugs me when a machine just takes everything and then
> does it several times at a go. I'd like to know what's popular and what's
> not because it helps me figure out what to post, but this sort of mindless
> downloading is not what I want.

Take a look at the link below.  I'm pretty sure Google adheres to the
Robots Exclusion Standard, though I can't say how strictly.  In a
nutshell: create a robots.txt file in the root directory of your site
that tells a spider or other bot what it can and cannot crawl.

http://www.robotstxt.org/
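
As a rough sketch (the /pub/archive path and the idea of blocking only
Googlebot are just examples; adjust the paths and bot names for your
own site), a robots.txt at the site root might look like this:

    # /robots.txt -- example only; /pub/archive/ is a made-up path
    # Keep Googlebot out of the archive directory
    User-agent: Googlebot
    Disallow: /pub/archive/

    # All other bots may crawl everything
    User-agent: *
    Disallow:

An empty "Disallow:" line means nothing is off limits for that bot,
while "Disallow: /" would block it from the entire site.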
-- 
Michael J. Hammel                                    Senior Software Engineer
mjhammel at graphics-muse.org                           http://graphics-muse.org
------------------------------------------------------------------------------
Remember that the best relationship is one in which your love for each
other exceeds your need for each other.  --  Credited to the Dalai Lama.