Sep 6, 2006

Panscient Data Services

38.99.203.110 Panscient_Data_Services.demarc.cogentco.com

Originally the bot was detected scanning the site with a fake user agent. This was reported to cogentco.com, who sent back a canned reply saying that this was a nice bot that followed the robots file.

My original request for info on who ran the bot and why it was faking a browser user agent was ignored.

I replied to abuse and asked whether cogentco.com owned this bot and why it was using a fake user agent if it was a nice bot. My questions were ignored, and all I got back was the same canned reply.

cogentco.com knows about this bot, allows it to operate, hides the identity of its owner and ignores complaints about it.

This bot was built by www.panscient.com; it is unclear whether they own it.

"At Panscient Technologies we design, build and operate custom internet search engines that unlock the hidden structure of web data.
Using state of the art AI technology, Panscient Technologies' software analyzes web sites for their information content and compiles the data into a searchable index."


Yeah, right. State-of-the-art scraping.

At this time it is unclear who else uses this bot, because it is stealthy.

Add this to your domain ban list:
Panscient_Data_Services.demarc.cogentco.com,Abuse

or add the IP to the ban list on your server: 38.99.203.110
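
If the ban has to be enforced in application code rather than in the server config, the same two checks can be sketched in Python. This is only a minimal sketch: the function name `is_banned` is made up for illustration, and it assumes your application already knows the client IP and, optionally, its reverse-DNS name.

```python
import ipaddress

# Values taken from the post above.
BANNED_IPS = {ipaddress.ip_address("38.99.203.110")}
BANNED_HOSTS = {"panscient_data_services.demarc.cogentco.com"}


def is_banned(remote_ip, remote_host=None):
    """Return True if the client IP or its reverse-DNS name is on a ban list."""
    if ipaddress.ip_address(remote_ip) in BANNED_IPS:
        return True
    if remote_host is None:
        return False
    # Host names are case-insensitive and may carry a trailing dot.
    return remote_host.lower().rstrip(".") in BANNED_HOSTS
```

The host name is compared in lower case because logs show it in mixed case as well as all lower case.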

10 comments:

JP. MS. said...

Hi. I have the same on my statistics... interesting. 90 pages in one day with 93 hits all coming from panscient_data_services.demarc.cogentco.com

I will figure out how to ban them.

Anonymous said...

This is still going on. I've blocked their entire netrange for my personal sites/hosts (they alone were DOUBLING my daily usage!!!!!). And as a server admin at a large university, I'm about to do it here as well. They can suck eggs, not my bandwidth.

mike said...

I had the same problem on my site. If you're running Apache with mod_rewrite enabled, create an .htaccess file in your webroot and add these lines:
RewriteEngine On
RewriteCond %{REMOTE_ADDR} "^38\.99\.203\.110$"
RewriteRule .* - [F]

While you're at it, you might want to get rid of Cyveillance too. Put these conditions directly above the RewriteRule and add [OR] to the first condition as well, so that every condition except the last one ends in [OR]:
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR]
RewriteCond %{REMOTE_ADDR} "^63\.146\.13\.(6[4-9]|[7-8][0-9]|9[0-5])$" [OR]
RewriteCond %{REMOTE_ADDR} "^65\.213\.108\.1(2[8-9]|[3-4][0-9]|5[0-9])$" [OR]
RewriteCond %{REMOTE_ADDR} "^65\.222\.176\.(9[6-9]|1[0-1][0-9]|12[0-7])$" [OR]
RewriteCond %{REMOTE_ADDR} "^65\.118\.41\.(19[2-9]|2[0-1][0-9]|22[0-3])$" [OR]
RewriteCond %{REMOTE_ADDR} "^65\.222\.185\.7([2-9])$"

I'd rather have real visitors instead of a bot that is an unknown quantity, and Cyveillance looking for copyrighted music...

The Strange House Astrology
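
Hand-written octet patterns like the ones in the comment above are easy to get subtly wrong, so it can be worth checking one against the address range it is meant to cover before deploying it. The sketch below is one way to do that in Python. The helper name `check_pattern` is made up for illustration, the /27 range is inferred from the pattern rather than stated in the comment, and it assumes Python's `re` treats these simple patterns the same way Apache's regex engine does.

```python
import ipaddress
import re


def check_pattern(pattern, cidr):
    """Compare a REMOTE_ADDR regex with the range it is meant to block.

    Returns (missed, extra): addresses inside `cidr` that the pattern fails
    to match, and addresses in the rest of the enclosing /24 that it matches
    by mistake. Only meant for ranges of /24 or smaller.
    """
    regex = re.compile(pattern)
    wanted = ipaddress.ip_network(cidr)
    enclosing = wanted.supernet(new_prefix=24)
    missed = [str(a) for a in wanted if not regex.search(str(a))]
    extra = [str(a) for a in enclosing
             if a not in wanted and regex.search(str(a))]
    return missed, extra


# The single Panscient address and the first Cyveillance pattern from the comment.
single = check_pattern(r"^38\.99\.203\.110$", "38.99.203.110/32")
block = check_pattern(r"^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$",
                      "63.148.99.224/27")
```

Two empty lists mean the pattern covers exactly the intended range within its /24. Anything in `missed` is an address that would still get through, and anything in `extra` is an innocent address that would be blocked.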

Darrin said...

Useful info, thanks. You just saved me several hours of crafting a scathing email to Cogentco and Panscient that no doubt would have just been ignored. The thing has been trolling just the user profiles on my resume posting board for a couple of days. It seems to ignore whitepapers, message board threads, articles, and job postings -- so I'm thinking it definitely has an agenda of some sort.

Anonymous said...

This bot is sending spam to our phpBB forum. Now it's banned.

Anonymous said...

Hey all, a friend just brought your post to my attention. I am the CEO of Panscient. If our bot is causing you trouble please let me know. You can also send an email to crawler@panscient.com. It will be read by a human.

We crawl for executive management profiles and other corporate information.

Our bot behaviour is explained on our FAQ: http://panscient.com/faq.htm.

The bot is throttled to request no more than one page per second per server, and programmed to obey robots.txt and robot meta tags, so if you are seeing any other behaviour it's a bug and we'd like to hear about it so we can fix it.

Thanks,

Jonathan Baxter
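
For anyone who wants to test the robots.txt claim in the comment above, Python's standard `urllib.robotparser` shows how a compliant crawler would read a given robots.txt. This is a sketch under assumptions: the robots.txt content is hypothetical, and the user-agent token `panscient.com` is a guess that should be checked against the FAQ linked in the comment.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "panscient.com" token is an assumption,
# not something confirmed in this thread.
ROBOTS_TXT = """\
User-agent: panscient.com
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a crawler that obeys robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


panscient_ok = allowed("panscient.com", "http://example.com/profiles/1")
other_ok = allowed("SomeOtherBot", "http://example.com/profiles/1")
```

Note that a robots.txt rule only helps if the crawler identifies itself with its own token. A crawler that sends a browser user agent, as described in the post, would have to be blocked by IP instead.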

Anonymous said...

I typed Panscient_Data_Services.demarc.cogentco.com, the address that accessed my site, into Google, and it's giving me the web statistics for a lot of other sites: 2700 different results. Is it taking all data like this and just displaying it online for everyone to view?

Anonymous said...

I was interested to find this little gem in a mysql_slow_queries log, which coincides with my database being deleted...

# Tue Sep 25 12:58:00 2007
# Query_time: 2 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
use xxxxxxxx_gl140;
DELETE FROM gl_sessions WHERE remote_ip = '38.99.203.110' AND uid = 1

Anonymous said...

The address listed for this company is nothing more than a "virtual office" or "mail drop" at Alliance Business Centers, 620 Herndon Parkway Suite 200, Herndon VA 20170.

Why would a company claiming to be a normal business be using a mail drop in Herndon, Virginia? Good question. I know that there are a lot of U.S. Government contractors and sub-contractors in the Herndon VA area. Coincidence? I think not.

^38. seems to have a lot of problems and isn't it so nice that one company, Cogentco, based in Washington DC, seems to own all of ^38.

I say 403 to ^38. and I think it is a storefront.

Reichard said...

This "bot" is hitting me and using at least 2 MB of bandwidth per day, second only to Googlebot.

The bot is coming from: 38.98.120.70