Nov 8, 2006

Google IP Falls into Bot Traps

mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; sv1; .net clr 1.1.4322)

This IP is owned by Google and is used by Google Web Accelerator.

The problem is that Google is not following the robots.txt file, so it's falling into bot traps.

Or, if it's not Google Web Accelerator falling into the traps, then people are using the IP as a proxy.

The question is what to do about this.


IncrediBILL said...

The web accelerator isn't a bot; it's browsers using the accelerator, and browsers don't look at robots.txt.

That's the problem: the accelerator pre-loads several pages based on the links it thinks are at the top of the page and SNAP! goes a bot trap.

Firefox does it as well with pre-fetch enabled.
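
One way to keep prefetchers out of the traps, sketched here as an assumption rather than a tested fix: Firefox marks its link-prefetch requests with an "X-moz: prefetch" request header, and as far as I know Google's webmaster notes for Web Accelerator describe the same header on its prefetch requests. A trap handler could check for that header and refuse the request instead of banning the IP. The function names and the 403/"ban" return values below are hypothetical, and a client that strips the header would still get caught.

```python
def is_prefetch(headers):
    """Return True if the request headers mark the request as a prefetch."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value.strip().lower()
                  for name, value in headers.items()}
    return normalised.get("x-moz") == "prefetch"


def handle_trap_hit(headers):
    """Decide what to do with a request for the (hypothetical) trap URL."""
    if is_prefetch(headers):
        # Refuse the prefetch, but leave the visitor's IP alone.
        return 403
    # No prefetch marker: treat the hit as a real bot and ban it.
    return "ban"
```

Returning 403 only turns away the pre-load; the visitor can still open the link by hand if they actually click it.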

tm said...

The problem is that anything automated and not under user control is a bot and should follow the bot rules.

Doesn't matter if it's in a browser or not. It's still automated.
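
For reference, "following the bot rules" amounts to a check like the one below before each automated fetch. This is a minimal sketch using Python's standard urllib.robotparser; the robots.txt contents, the /trap/ path, and the may_fetch helper are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that hides a bot trap under /trap/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def may_fetch(path, user_agent="*"):
    """Return True if robots.txt allows user_agent to request path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)
```

An automated client that ran this check would skip /trap/page.html and never spring the trap, while ordinary pages such as /index.html stay fetchable.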