
Cuill.com Twiceler 'bot'



I was suspended twice today, and I still am (waiting for someone to process the ticket).

After a long search for why MY code was broken, it turned out that I had been hammered every second of every hour of every day since the first of the month. It wasn't my code at all.

It wasn't my fault, but the bot's traffic got my account suspended.

I highly recommend that everyone look at this page: http://cuill.com/twiceler/robot.html

Then add the IP addresses shown there to your cPanel 'IP deny' feature, and add a robots.txt that includes:

User-agent: Twiceler
Disallow: /
Crawl-delay: 120
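If you want to check that those three lines say what you think they say before uploading them, here is a minimal sketch using Python's standard `urllib.robotparser`. The `parse_rules` helper, the `/index.php` path, and the `SomeOtherBot` name are just illustrations, not anything from the cuill.com page. Note that this only tests how a well-behaved parser reads the rules; it says nothing about whether Twiceler itself honors them.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested in this post, written as one record
# (no blank lines between the directives).
ROBOTS_TXT = """\
User-agent: Twiceler
Disallow: /
Crawl-delay: 120
"""


def parse_rules(text):
    """Hypothetical helper: parse robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = parse_rules(ROBOTS_TXT)

# Twiceler should be refused everywhere and asked to wait between requests.
print("Twiceler may fetch /index.php:", rules.can_fetch("Twiceler", "/index.php"))
print("Twiceler crawl delay:", rules.crawl_delay("Twiceler"))

# A crawler that is not named in the file is unaffected by this record.
print("SomeOtherBot may fetch /index.php:", rules.can_fetch("SomeOtherBot", "/index.php"))
```

Keep the three directives together in the real file as well: a robots.txt record is conventionally written with no blank lines between its lines.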

 

At first they were not even obeying robots.txt, but that has supposedly been resolved.

cuill.com seems to be a bad player in the web crawler world. I invite you to google their name and see for yourself. Either way, protect yourself before you suffer the fate I did today.


This bot has been nothing but bad news for the past 2 years, and from my recent tests it still does not obey robots.txt. I have found this bot hammering 3 different servers over the last 3 days, for hours at a time on individual sites. The funny thing is that the site on each server was very small (fewer than 20 pages each), yet the bot crawled it for hours.

They have been claiming to be an experimental bot for a great new search engine for some time now, but have mostly caused site owners grief. I also read that at least one of the developers is an ex-Google employee. In my opinion, if Twiceler is representative of his work at Google, then it is no wonder he is an EX employee.

As of now, all known Twiceler IPs have been banned across our server farm. This ban will remain in place until I see proof that this bot is legitimate and does something other than bring down our customer sites.

  • 3 weeks later...

I don't know if you guys unblocked it, but it has been crawling my website for a couple of hours.

I did a search on the IP, found out it was this bot, and searched on Google... Funny seeing my hosting company in the first 2 pages of search results.

I was getting hammered by IP 38.99.44.104.

I don't care too much either way, since I don't use all my bandwidth anyway. But I may block it.

  • 3 weeks later...

I'm getting hammered by this bot. Has this ban been lifted?


38.99.44.102 - - [25/Apr/2008:01:17:36 -0400] "GET /index.php?module=Event%20Calendar&func=view&tplview=&viewtype=day&Date=19951107&pc_username=&pc_category=&pc_topic= HTTP/1.0" 200 55099 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"

38.99.13.121 - - [25/Apr/2008:01:17:40 -0400] "GET /index.php?module=Event%20Calendar&func=view&tplview=&viewtype=day&Date=20171204&pc_username=&pc_category=&pc_topic=&print= HTTP/1.0" 200 60395 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"

38.99.13.123 - - [25/Apr/2008:01:17:42 -0400] "GET /index.php?module=Event%20Calendar&func=view&tplview=&viewtype=day&Date=20110831&pc_username=&pc_category=&pc_topic=&print= HTTP/1.0" 200 55318 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"

38.99.44.103 - - [25/Apr/2008:01:17:53 -0400] "GET /index.php?module=Event%20Calendar&func=view&tplview=&viewtype=day&Date=19700603&pc_username=&pc_category=&pc_topic=&print=1 HTTP/1.0" 200 3307 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"
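For anyone who wants to see how hard they are being hit, here is a rough sketch that counts Twiceler requests per IP from an access log. It assumes the log is in the Apache "combined" format, which the lines above appear to follow. The `hits_per_ip` function is my own name for it, the two Twiceler sample lines are shortened versions of the ones posted above (the request path is trimmed to `/index.php`), and the third line is a made-up non-bot request for contrast.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, as the lines in this thread suggest.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_per_ip(lines, marker="Twiceler"):
    """Count requests per client IP whose user agent contains `marker`."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and marker in match.group("agent"):
            counts[match.group("ip")] += 1
    return counts


UA = "Mozilla/5.0 (Twiceler-0.9 http://www.cuill.com/twiceler/robot.html)"
SAMPLE = [
    # Shortened from the log lines posted above (query string trimmed).
    f'38.99.44.102 - - [25/Apr/2008:01:17:36 -0400] "GET /index.php HTTP/1.0" 200 55099 "-" "{UA}"',
    f'38.99.13.121 - - [25/Apr/2008:01:17:40 -0400] "GET /index.php HTTP/1.0" 200 60395 "-" "{UA}"',
    # Hypothetical non-bot request, included only to show it is ignored.
    '192.0.2.1 - - [25/Apr/2008:01:17:41 -0400] "GET / HTTP/1.0" 200 1024 "-" "ExampleBrowser/1.0"',
]

for ip, count in hits_per_ip(SAMPLE).most_common():
    print(ip, count)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `hits_per_ip(open("access_log"))`, where the file name depends on your host. The IPs it reports are the ones to compare against the list on the cuill.com page before adding them to 'IP deny'.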
