May 10, 2008

This company says "we employ a small army of PhDs," but they know nothing about building bots. The blog they run won't even take comments without giving an error page.

bad-behavior 403 Required header 'Accept' missing
Agent: Mozilla/5.0 (compatible; zermelo; + [] keeps showing up in my logs. It looks like this is a web hosting division of Amazon, so we may be able to ban it without banning Amazon.
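
Another way to ban it, instead of going by where it is hosted, is to match on the agent string itself. This is only a rough sketch, not something tested against this bot. It assumes Apache with mod_rewrite loaded, and it assumes the bot keeps sending "zermelo" in its User-Agent header, which it could change at any time.

```apache
# Sketch only: refuse any request whose User-Agent contains "zermelo".
# Assumes mod_rewrite is loaded and the bot keeps this token in its agent string.
RewriteEngine on
# [NC] makes the match case-insensitive
RewriteCond %{HTTP_USER_AGENT} zermelo [NC]
# "-" means no substitution; [F] answers with 403 Forbidden
RewriteRule ^ - [F]
```

This leaves every other visitor from Amazon alone, but it only works for as long as the bot identifies itself honestly.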

1 comment:

Anonymous said...

From my Apache HTTPD.CONF:
# stop amazonaws addresses scraping /jargon
RewriteEngine on
RewriteCond %{REMOTE_HOST} ^.*\.compute-1\.amazonaws\.com$ [NC]
RewriteCond %{REQUEST_URI} ^.*/jargon/.*\.html$ [NC]
RewriteRule !^/jargon/index\.html$ /jargon/index.html [L,R=permanent]

(excuse the wrap)