Mozilla/4.0 (compatible; MSIE 4.0; Windows NT; ....../1.0 )
63.80.56.36
Another fake browser detected.
OrgName: UUNET Technologies, Inc.
OrgID: UU
Address: 22001 Loudoun County Parkway
City: Ashburn
StateProv: VA
PostalCode: 20147
Country: US
Nov 30, 2006
Nov 29, 2006
Nov 28, 2006
anothrrobot (http://www.anothr.com) RSS ABUSE
anothrrobot (http://www.anothr.com)
60.191.17.90
The above IP is banned as a single-stage open SMTP relay or HTTP proxy. See here.
anothrrobot (http://www.anothr.com)
218.72.35.200 200.35.72.218.broad.hz.zj.dynamic.163data.com.cn
The above IP is also banned as a spammer. See here.
Located in Shanghai, China.
This RSS robot is said to read your RSS feeds and then push them to the end user. But it keeps reloading the RSS feed over and over and over.
Example of abuse: I set my RSS feed Time To Live (TTL) to no more than 1 load per day, but this bot is loading the feed every minute and ignoring the TTL.
So it's banned.
After banning it I am seeing hits from another dynamic China IP address. I don't think this is a real feed service.
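One way to spot this kind of TTL abuse in your own logs is to compare the gap between a client's feed requests with the TTL you publish (the RSS ttl element is expressed in minutes). The following is only a minimal Python sketch; the function name and the (ip, timestamp) input format are assumptions for illustration, not something taken from my actual setup.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_ttl_violators(hits, ttl_minutes):
    """Return {client_ip: early_fetch_count} for clients that ignore the TTL.

    hits is an iterable of (client_ip, datetime) pairs, in any order.
    """
    ttl = timedelta(minutes=ttl_minutes)
    by_client = defaultdict(list)
    for ip, when in hits:
        by_client[ip].append(when)

    violators = {}
    for ip, times in by_client.items():
        times.sort()
        # Count consecutive fetches that are closer together than the TTL.
        early = sum(1 for a, b in zip(times, times[1:]) if b - a < ttl)
        if early:
            violators[ip] = early
    return violators


if __name__ == "__main__":
    start = datetime(2006, 11, 28, 12, 0)
    # A client fetching every minute against a one-day (1440 minute) TTL.
    hits = [("60.191.17.90", start + timedelta(minutes=i)) for i in range(5)]
    print(find_ttl_violators(hits, ttl_minutes=1440))
```

A client that shows up here many times in a row is a candidate for a ban; a single early fetch could just be a reader restarting their software.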
Domain name: anothr.com
Registrant Contact:
Zheng
Zheng XY cnblog@gmail.com
13501863736 fax: 13501863736
15L,Huamin Building, No.728,Yanan Xi Rd.
Shanghai ShangHai 200051
CN
Administrative Contact:
Zheng XY cnblog@gmail.com
13501863736 fax: 13501863736
15L,Huamin Building, No.728,Yanan Xi Rd.
Shanghai ShangHai 200051
CN
Technical Contact:
Product Team diy@corp.myrice.com
64677272 fax: 64727880
Room 306,MingYuan Tower,1199 Fu Xing Road (M)
Shanghai Shanghai 200031
CN
Billing Contact:
Product Team diy@corp.myrice.com
64677272 fax: 64727880
Room 306,MingYuan Tower,1199 Fu Xing Road (M)
Shanghai Shanghai 200031
CN
DNS:
ns.myricedns.com
ns5.cnmsn.net
Created: 2006-03-16
Expires: 2008-03-16
outfoxbot/0.5
outfoxbot/0.5 (for internet experiments; http://; outfoxbot@gmail.com)
All Hits From 60.191.80.48
This bot runs on an IP banned for sending out China spam.
It is an unknown bot. Likely an email harvester.
Nov 26, 2006
keymachine.de abuse probes
mozilla/4.0 (compatible; msie 6.0; windows nt 5.2; win64; amd64)
87.118.103.185 ns2.km20935-07.keymachine.de
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
84.19.188.68 ns.km21901-01.keymachine.de
I show this user agent being used only by keymachine.de, not by real users.
mozilla/4.0 (compatible; msie 6.0; windows nt 5.0; avant browser [avantbrowser.com]; hotbar 4.4.5.0)
62.141.52.139 ns.km23144-19.keymachine.de
I show this user agent being used only by keymachine.de, not by real users.
mozilla/5.0 (windows; u; windows nt 5.0; en-us; rv:1.7.5) gecko/20050207 firefox/1.0.1
62.141.52.139 ns.km23144-19.keymachine.de
87.118.106.4 ns.km23108-04.keymachine.de
keymachine.de is at it again. It goes straight to my contact form, then back to the homepage, then back to the contact form.
A few minutes later it shows up on one of my PHP-Nuke sites trying to load a module that I do not run. After 14 tries it gave up and started on the homepage. After 6 tries it gave up on the homepage.
keymachine.de should be banned from all sites.
Listed in rfc-ignorant.org
Nov 25, 2006
converacrawler/0.9d 7-9745.san2.attens.net bot
converacrawler/0.9d (+http://www.authoritativeweb.com/crawl)
63.241.61.7 7-9745.san2.attens.net
This bot came in and refused to take no for an answer; it tried to load every page I had. So it's clear that it didn't spider my site to get those links. It got them from Google or somewhere else.
converacrawler was originally banned as an email harvester, but it now looks like a real search site at www.govmine.com.
They claim you can add this to robots.txt:
User-agent: ConveraCrawler
Disallow: /
Nov 24, 2006
www.exalead.com Violates robots.txt
The www.exalead.com website has a robot that comes in, ignores your robots.txt file, takes a snapshot of your website, and then posts it as a thumbnail on its site.
It doesn't matter if you block all images from bots like this:
User-agent: *
Disallow: /images/
Exalead.com refuses to abide by the commands in robots.txt.
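For reference, here is what a rule-following robot is supposed to conclude from those two lines. This is a small Python sketch using the standard library's robots.txt parser; the example.com URLs and the `allowed` helper are made up for illustration, and "NG/2.0" is only the user agent string seen in my logs.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /images/
"""


def allowed(user_agent, url, rules=RULES):
    """Return True if a compliant robot may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("NG/2.0", "http://example.com/images/logo.png"))
    print(allowed("NG/2.0", "http://example.com/index.html"))
```

Any image request from a robot after it has read these rules means the robot is not honoring them.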
abuse from thenewpush.com
Mozilla/4.0 compatible
64.92.199.43 host-64-92-199-43.thenewpush.com
Mozilla/4.0
64.92.199.42 host-64-92-199-42.thenewpush.com
64.92.199.60 host-64-92-199-60.thenewpush.com
Ran into this probe today from several IPs on thenewpush.com. Looks like they were testing out user agents on different IPs.
Nov 23, 2006
66.199.236.106 duns.dunnaonline.com spammer
mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)
66.199.236.106 duns.dunnaonline.com
The site says: "dunnaonline.com is the leading provider for data for direct marketing campaigns"
Hmm, why would a direct marketer be sending out a bot that fakes Google and attempts to post spam into our scripts?
It looks like this is a spammer site and they are getting into blog spam?
The website won't load, but a cache is still stored in Google.
Update: this bot keeps trying scripts that don't exist.
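A fake like this is easy to catch because the real Googlebot's IPs reverse-resolve to hosts under googlebot.com or google.com, which is the check Google itself recommends (followed by a forward lookup to confirm). The sketch below is Python and only covers the hostname comparison on a name you have already resolved; the function names are illustrative, and the forward-confirmation step is left out.

```python
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def claims_googlebot(user_agent):
    """True if the user agent string says it is Googlebot."""
    return "googlebot" in user_agent.lower()


def is_fake_googlebot(user_agent, reverse_dns_name):
    """True if the client claims to be Googlebot but its host is not Google's.

    reverse_dns_name is the PTR name already looked up for the client IP
    (empty or None if there was none).
    """
    if not claims_googlebot(user_agent):
        return False
    host = (reverse_dns_name or "").lower().rstrip(".")
    return not host.endswith(GOOGLE_SUFFIXES)


if __name__ == "__main__":
    ua = "mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)"
    print(is_fake_googlebot(ua, "duns.dunnaonline.com"))
```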
PHPNuke Attacker Bots
I have been seeing a lot of bots hitting my PHP-Nuke sites. It's not clear why they are trying to load the following files, since they are not used in the current version and have never been located on my server.
The files they are attempting to post to are:
logon.php
profile.php
posting.php
I have set up an autoban on these files to track the attacks. Here will be the results of what I find.
It's now clear what this is. This is an attack on the phpBB forum software that PHP-Nuke uses. The problem is that this version is modified and the attack won't work on PHP-Nuke. But that doesn't stop the robots' attacks.
IPs of phpBB hackers
66.199.236.106 duns.dunnaonline.com <- worst abuser
213.186.116.169 utel10.in.ua
84.252.152.169 poltawa.com
75.126.18.154 server1.domishko.ru
81.177.24.80
81.177.4.43
66.230.154.154
66.230.161.122
222.33.248.126
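The autoban idea is simple: none of those three files exist on the server, so any request for one of them marks the sender as hostile. Below is a rough Python sketch of that logic. My real autoban is not written this way; the class and method names here are hypothetical.

```python
TRAP_FILES = {"logon.php", "profile.php", "posting.php"}


class AutoBan:
    """Ban any IP that requests a file that was never on the server."""

    def __init__(self, trap_files=TRAP_FILES):
        self.trap_files = set(trap_files)
        self.banned = {}  # ip -> list of trap requests, kept as evidence

    def check(self, ip, method, path):
        """Return True if the request may be served, False if the IP is banned."""
        # Strip the query string, then keep only the file name part of the path.
        filename = path.split("?", 1)[0].rsplit("/", 1)[-1].lower()
        if filename in self.trap_files:
            self.banned.setdefault(ip, []).append(f"{method} {path}")
        return ip not in self.banned


if __name__ == "__main__":
    ban = AutoBan()
    print(ban.check("66.199.236.106", "POST", "/modules/Forums/posting.php?mode=reply"))
    print(ban.banned)
```

Keeping the offending requests alongside each banned IP makes it easy to publish a list like the one above.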
Nov 22, 2006
Exalead image theft
Exalead snapshots your site and lists it in its search system. It also tries to hotlink all your images in a page view window. Sites like mine that use hotlink protection will display an image theft notice when they do this.
I thought I had stopped this snapshot bot without stopping its crawler, but they have again changed the user agent for it.
See here and here for more.
block useragents.
NG/2.0,Image crawler
NG/4.0,image crawler
The robot does not comply with simple, basic robots.txt commands to not load images.
User-agent: *
Disallow: /images/
Nov 21, 2006
nodomaintransfer abuse nodomaintransfer27.com is back
66.135.34.11 nodomaintransfer18.com
66.139.75.163 nodomaintransfer19.com
66.139.76.245 nodomaintransfer21.com
66.139.77.214 nodomaintransfer22.com
66.135.33.49 nodomaintransfer25.com
64.34.166.88 nodomaintransfer27.com
It will show up as a domain nodomaintransfer??.com, with the ?? being replaced with a number. This is a guestbook spammer.
It is now suspected that they are registering throwaway domains so that when they get caught they can just switch to a new one. I have seen the ones above. If you have seen other combos, please post them.
On another note, it's odd that we also see Singapore peepsurf running a proxy on nodomaintransfer21.com. It is now suspected to be connected.
domain ban
nodomaintransfer,Guestbook Spammer
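Because the number keeps changing, it makes more sense to match the whole family of names than to ban each one. Here is a small Python sketch of such a match. The regular expression is my own guess at the pattern from the hostnames listed above, and the function name is made up.

```python
import re

# "nodomaintransfer" followed by one or more digits, then ".com",
# either as the whole hostname or as its registered domain.
THROWAWAY = re.compile(r"(?:^|\.)nodomaintransfer\d+\.com\.?$", re.IGNORECASE)


def is_throwaway_host(hostname):
    """True if the reverse DNS name belongs to the nodomaintransfer family."""
    return bool(THROWAWAY.search(hostname.strip()))


if __name__ == "__main__":
    for host in ("nodomaintransfer27.com", "example.com"):
        print(host, is_throwaway_host(host))
```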
sumitbot_hansrajbot RufusBot Submit Bot spammer
sumitbot_hansrajbot (sumitbot_hansrajbot; http://64.124.122.252/feedback.html)
64.124.122.228.gw.xigs.net 64.124.122.228
IP has been flagged as a spammer. Also see SPAMBAG on 64.124.122.228
RufusBot
The bot's page says: "Why are we crawling?
We crawl the web towards the goal of developing a new kind of index/search tool that will bring substantial and previously unavailable exposure to websites. We're in "stealth mode" for the next few months for business reasons, but watch this page for more details on our product."
Yeah, same old story. But if it's true, why don't you have a real domain name, and why are you running on an IP flagged as a source of spam? Get a real hosting account with a real domain and someone might believe you.
"We identify ourselves with the name RufusBot in our crawls.
The code below can be used to disallow access to all parts of your site just for our bot."
User-Agent: RufusBot
Disallow: /
Sorry, that statement is false. It identifies itself as The Submit Bot in crawls. Submitting what? Spam?
It's not clear what gw.xigs.net is. Is it an ISP or a hosting company?
blogbot/1.0 Locus.CS.UCLA.EDU 131.179.64.248
blogbot/1.0 (ucla cs dept contact:kcsia@cs.ucla.edu)
All Hits From Locus.CS.UCLA.EDU 131.179.64.248
It is unknown what this bot is for, so it's banned.
webbot.org www.webbot.ru webbot/0.1
mozilla/5.0 (compatible; webbot/0.1; http://www.webbot.ru/bot.html)
88.151.114.38 crawler38.us.webbot.org
88.151.114.36 crawler36.us.webbot.org
This bot fell right into the bot traps and then kept trying to spider all my sites.
It is a .ru robot.
Banned due to abuse. Not following robots.txt.
Nov 17, 2006
72.20.99.48 c08.ba.accelovation.com www.accelobot.com scraper
400 Required header 'Accept' missing
Mozilla/5.0 (compatible; heritrix/1.8.0 +http://www.accelobot.com)
72.20.99.48 c08.ba.accelovation.com
Their site says: "My mission is helping companies mine the online world. I seek innovators like you, who provide insights into unmet needs, trends, and market activity. Using Accelovation Market Discovery™ software (MDS), I help automate market research, allowing companies to more effectively and economically identify and take advantage of new opportunities for innovation and growth."
This bot was caught hammering my site and getting blocked on all PHP pages by BB.
I recommend adding this robot to your robots.txt file.
Their site also lists these "Case Studies":
"Major consumer packaged goods companies use Accelovation to identify new innovations that will become their next billion dollar businesses.
Multiple Fortune 500 chemical companies use Accelovation to discover new markets for existing capabilities, while keeping tabs on the competition.
A Fortune 100 telecommunications company identifies patent infringers to win multi-million dollar awards via automated Accelovation searches."
Really? Stealing my content so some big company can make money off of it is theft.
Helping big companies find ideas that they can take from us and patent is theft.
And worse yet, once they take your ideas and patent them, they come back and sue you for patent theft.
BANNED.
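The "Required header 'Accept' missing" rejection shown above is the kind of check that catches a crawler like this: real browsers send an Accept header and a User-Agent, and many bots forget one or both. The Python sketch below shows the general idea only. It is not Bad Behavior's actual code; the two messages are copied from the log lines in these posts, and the function name and the order of the checks are assumptions.

```python
def check_headers(headers):
    """Return (status, reason) if the request should be rejected, else None.

    headers maps HTTP header names to values, as received from the client.
    """
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    if not h.get("user-agent"):
        return 403, "A User-Agent is required but none was provided"
    if "accept" not in h:
        return 400, "Required header 'Accept' missing"
    return None


if __name__ == "__main__":
    bot = {"User-Agent": "Mozilla/5.0 (compatible; heritrix/1.8.0 +http://www.accelobot.com)"}
    print(check_headers(bot))
```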
Nov 10, 2006
Running getmyarticles.com remote scripts
elseif(intval(get_cfg_var('allow_url_fopen')) && function_exists('file')) {
if($content = @file("http://getmyarticles.com/engine.php?".$QueryString))
echo @join('', $content);
}
elseif(function_exists('curl_init')) {
$ch = curl_init ("http://getmyarticles.com/engine.php?".$QueryString);
curl_setopt ($ch, CURLOPT_HEADER, 0);
curl_exec ($ch);
Take care. The site getmyarticles.com will not answer my questions about the security problems.
Beware of the PHP script provided by getmyarticles.com that they want you to put on your server. It allows them to take total control of your server. Instead of pulling content and displaying it on your server, it loads the script from the remote server and then runs it.
This is a huge security violation. They can spam from your server or run bots or do anything they want. They will control your server.
Until they release a real script that just prints the content to the screen so it cannot be executed, or answer emails about why they won't change it, do not use that service.
More testing shows that the remote content can apparently be loaded and then scanned for any PHP code before it's displayed, but you will have to write your own script to do this. If anyone else wants to help test some safe scripts using this service, let me know. We need to make sure we know all the exploits we need to scan for.
Scanning for the PHP opening tag (<?) should prevent any PHP code from running. Any more ideas?
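To make the scanning idea concrete, here is a rough Python sketch of a filter that treats the fetched article purely as text and refuses to display it if it contains anything that looks like a PHP opening tag. The list of patterns (the <? tag, the old ASP-style <% tag, and the script language="php" form) is my assumption about what to look for and is probably not complete; the function names are made up. Note that it will also reject harmless content that starts with an XML declaration.

```python
import re

# Patterns that could start a PHP block: "<?" (covers "<?php" and "<?="),
# the old ASP-style "<%", and <script language="php">.
PHP_OPEN_TAG = re.compile(
    r"<\?|<%|<script[^>]*language\s*=\s*[\"']?php",
    re.IGNORECASE,
)


def contains_php(text):
    """True if the text contains something that looks like a PHP opening tag."""
    return bool(PHP_OPEN_TAG.search(text))


def safe_to_display(text):
    """Return the text unchanged if it looks clean, otherwise None."""
    return None if contains_php(text) else text


if __name__ == "__main__":
    print(safe_to_display("<p>Ten tips for better articles</p>"))
    print(safe_to_display("<p>Hi</p><?php system('id'); ?>"))
```

The important part is that the fetched content is only ever printed, never passed to anything that would execute it.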
Nov 9, 2006
Website Contact Form: How the Robots Attack
If you have a website you likely have a contact form so you do not have to list your email address.
The rise of blogs has also created roving spambots that post to comment forms. They are attempting to find blogs and guestbooks, but they are also posting to our website contact forms.
Here is an example of a robot that came from
70.87.63.146 92.3f.5746.static.theplanet.com
The robot read the form from my HTML page and copied all the form fields, including the hidden ones. It then submitted all the proper fields, leaving the unused ones blank. It added data to the name and city.
The city field contained 'k o s t a n a y' (spaces added). The name contained a random name. It is believed that this was a test message designed to post to everything and then come back a month later and scan Google to find out which sites end up displaying the test phrase, which in this case is the city.
Once it finds out which sites it got into, it will then come back and post its spam message.
The strange thing about this robot is that it has a bug. It doesn't understand your reset or clear button, so it tries to submit that field also, like this:
reset=Reset form
So if you find your reset field being posted to your form, you should reject the entry.
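A real browser never sends the reset button along with the form data, so its presence is a reliable tell. A minimal Python sketch of the check follows; the field name "reset" comes from my form, and the function and constant names are made up for illustration.

```python
# Names of button fields that a real browser would never submit.
BUTTON_FIELDS = {"reset"}


def is_bot_submission(form):
    """form maps submitted field names to values; True means reject it."""
    return any(name.lower() in BUTTON_FIELDS for name in form)


if __name__ == "__main__":
    post = {"name": "random name", "city": "k o s t a n a y", "reset": "Reset form"}
    print(is_bot_submission(post))
```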
Posting a key field or password field won't help by itself, because the bot will read the field and repost it. However, after detecting this bot I changed my key and found that it's still trying to post under the old key, so it reads your key once and then doesn't do any updates.
In order to protect your forms from this bot, I recommend using PHP to create your form page and then posting the current date as a hidden field along with a rotating key. Then test for these when the data is submitted. This type of bot may pass the first test but none of the ones after that. In fact it may not even pass the first test if it doesn't post on the same day it scans.
For my forms that are on HTML pages I have changed my PHP submission form. It now displays a page asking the user to press submit again to verify the post. This inserts another date and key code in the input that the robots cannot duplicate. Not only do they not know what the key will be ahead of time, but they would have to submit the data twice with different keys to get in, something they are not programmed to do.
The verify button takes the place of the CAPTCHA and works just as well so far.
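Here is one way the date plus rotating key could work, sketched in Python rather than the PHP I actually use. The key is derived from the date and a secret kept on the server, so it changes every day without my having to edit anything. The field names, the secret, and the one-day allowance are all assumptions for illustration.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET = b"change-me"  # hypothetical server-side secret, never sent to the client


def make_key(day, secret=SECRET):
    """Derive the key for a given day, so it rotates daily on its own."""
    return hmac.new(secret, day.isoformat().encode(), hashlib.sha256).hexdigest()


def hidden_fields(today):
    """Values to place in hidden inputs when the form page is generated."""
    return {"form_date": today.isoformat(), "form_key": make_key(today)}


def is_valid(form, today, max_age_days=1):
    """Accept only a recent date whose key matches that date."""
    try:
        issued = date.fromisoformat(form.get("form_date", ""))
    except (ValueError, TypeError):
        return False
    if not timedelta(0) <= today - issued <= timedelta(days=max_age_days):
        return False  # form was scraped too long ago, or dated in the future
    submitted = str(form.get("form_key", "")).encode()
    return hmac.compare_digest(submitted, make_key(issued).encode())


if __name__ == "__main__":
    scraped = hidden_fields(date(2006, 11, 9))
    print(is_valid(scraped, date(2006, 11, 9)))
    print(is_valid(scraped, date(2006, 12, 9)))
```

A bot that scrapes the form once and reposts the same hidden values a month later fails the date test, and changing the date without knowing the secret fails the key test. The two-step verify page would simply issue a second pair of these fields.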
Nov 8, 2006
Google IP 72.14.194.33 Falls into bot traps
mozilla/4.0 (compatible; msie 6.0; windows nt 5.1; sv1; .net clr 1.1.4322)
72.14.194.33
This IP is owned by Google and is used by Google Web Accelerator.
The problem is that Google is not following the robots.txt file, so it's falling into bot traps.
Or, if it's not Google Web Accelerator falling into the traps, then people are using the IP as a proxy.
The question is what to do about this.
Nov 7, 2006
83.206.210.131 billythekid.ivelem.net
403 A User-Agent is required but none was provided
83.206.210.131 billythekid.ivelem.net
This one also has no user agent. Not sure what it's doing.
render-dream.com
bad-behavior 403 A User-Agent is required but none was provided
211.218.151.198
85.214.32.180 render-dream.com
These 2 came in at the same time and are the same bot from 2 IPs.
It's not clear what render-dream.com is. The website gives an error and I can find no record in Google.
Nov 6, 2006
38.113.234.180 crawl1.cosmixcorp.com
cosmixcorp.com is a health search system.
cfetch/1.0
38.113.234.180 crawl1.cosmixcorp.com
voyager/1.0
38.113.234.180 crawl1.cosmixcorp.com
They have started changing user agents lately. The site says it's using the Voyager user agent:
What is your crawler's HTTP user-agent string?
voyager/1.0
That's really strange, since it keeps using cfetch/1.0 most of the time.
I had been banning them by IP but will try the robots.txt file again.
Add this to robots.txt:
User-agent: voyager
Disallow: /
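One catch with that rule: robots.txt rules are matched by user agent name, so a rule for voyager says nothing to a robot that announces itself as cfetch. The Python sketch below checks this with the standard library's robots.txt parser. The example.com URL and the `may_fetch` helper are placeholders; how Cosmix's own crawler interprets the file is unknown to me.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse(["User-agent: voyager", "Disallow: /"])


def may_fetch(user_agent, url="http://example.com/page.html"):
    """True if these rules allow the given user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("voyager/1.0", "cfetch/1.0"):
        print(agent, may_fetch(agent))
```

If the robot keeps coming in as cfetch, a second block for that name (or going back to the IP ban) may be needed.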
tm.net.my
60.48.201.70 tm.net.my
This is an ISP, Telekom Malaysia. It creates a lot of guestbook spam and we had banned it. It will be unbanned as an experiment to see what we get.
Jakarta Commons
See last post. This one came in using Jakarta Commons and was blocked by BB, so they started changing IPs, I guess thinking I was blocking them by IP?
Here is a list. The first one had a longer user agent:
Jakarta Commons-HttpClient/3.0.1 UP.Link/6.2.3.21.0
12.25.203.39 babylon.openwave.com
Jakarta Commons-HttpClient/3.0.1
203.144.144.164 proxy.asianet.co.th
203.187.16.218 u16-218.u203-187.giga.net.tw
80.50.82.90
203.144.144.164 proxy.asianet.co.th
203.187.16.218 u16-218.u203-187.giga.net.tw
80.248.8.43
85.46.232.188 host188-232-static.46-85-b.business.telecomitalia.it
213.91.192.5 5_192.btc-net.bg
203.144.144.164 proxy.asianet.co.th
209.203.227.139 exchange.soundcontainer.com
217.40.239.201 host217-40-239-201.in-addr.btopenworld.com
203.187.16.218 u16-218.u203-187.giga.net.tw
203.187.16.218 u16-218.u203-187.giga.net.tw
80.237.140.233 proxy77.net
202.158.165.82
211.218.151.198
222.243.204.210
218.98.221.108
80.76.55.21
Still coming. IP list updated.
It kind of looks like either he has accounts on all of these, or the computers are compromised, or they are some type of proxy...
400 Header 'Connection' contains invalid values
400 Header 'Connection' contains invalid values
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
84.73.184.24 84-73-184-24.dclient.hispeed.ch
This bot or hacker (I don't know which) hit my site and was stopped by BB.
So it starts changing IPs but still uses the same user agent and headers. What's strange is all the IPs it's using.
Here is a list:
84.5.129.72
68.56.7.53 c-68-56-7-53.hsd1.fl.comcast.net
12.227.189.4 12-227-189-4.client.mchsi.com
72.56.124.102 CPE0011e6ee1575-CM0011e6ee1574.cpe.net.cable.rogers.com
84.156.107.136 p549C6B88.dip.t-dialin.net
217.249.174.68 pD9F9AE44.dip.t-dialin.net
62.178.171.33 chello062178171033.5.12.vie.surfer.at
83.184.189.61 d83-184-189-61.cust.tele2.it
Also see next post about similar action using useragent 'Jakarta Commons'
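Since the IP keeps changing but the user agent and the header error stay the same, the visitor can be tracked by that combination instead of by address. Below is a rough Python sketch of grouping blocked hits that way. The (ip, user agent, block reason) input format, the function name, and the threshold of three IPs are my assumptions for illustration.

```python
from collections import defaultdict


def shared_fingerprints(hits, min_ips=3):
    """Group blocked hits by (user agent, block reason) and report the ones
    seen from at least min_ips different addresses.

    hits is an iterable of (ip, user_agent, block_reason) tuples.
    """
    ips = defaultdict(set)
    for ip, user_agent, reason in hits:
        ips[(user_agent, reason)].add(ip)
    return {fp: sorted(addrs) for fp, addrs in ips.items() if len(addrs) >= min_ips}


if __name__ == "__main__":
    ua = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
    reason = "Header 'Connection' contains invalid values"
    hits = [
        ("84.73.184.24", ua, reason),
        ("84.5.129.72", ua, reason),
        ("68.56.7.53", ua, reason),
    ]
    print(shared_fingerprints(hits))
```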
Nov 3, 2006
84.244.8.86 pejantantangguh-a.biz
84.244.8.86 pejantantangguh-a.biz
SE - Sweden
What kind of domain is this?
Its website is blank and its robot visits with no useragent.
tmhaos04.imsbiz.com
mozilla/4.0 (compatible; msie 5.01; windows nt)
210.87.251.107 tmhaos04.imsbiz.com
This one is a spammer. The IP is on the spam blocklist.
What gave him away is the "windows nt" user agent. This is invalid.
s7.buzzlogic.com
Posted: October 31 2006 Post subject: suspicious link in my stats
--------------------------------------------------------------------------------
A check on this site, buzzlogic, shows that it's a snoop bot for corporations to check on who is talking about them. Another snoop bot.
--------------------------------------------------------------------------------
Does anybody have a clue what this is? I've had is show up three times now. Obviously, the entry link and exit link have nothing to do with my site.
Here is the stat info
2
October 31st 2006 16:42:51
7 seconds
Konqueror 3.5
Linux
Resolution: 1600x1200
Returning Visits: 0
Location: Florida, Miami, United States
Host name: s7.buzzlogic.com (64.34.246.44)
Referring URL: No referring link