Best way to stop crawler bots
March 26, 2015 / by Marco / Categories : Technology
I recently discovered that one of my VPS servers was running at a constant CPU load of 1-2. The VPS provider notified me that it was using more than its fair share of CPU resources and temporarily suspended the VPS to prevent it from impacting neighbouring customers, which was fair enough.
Upon investigation, there were several reasons why the CPU load was high, one of them being that the site was being crawled by many different bots: Google, Bing, Yahoo, Ahrefs, Yandex, Twitter and the list goes on. To reduce the load, I decided to investigate the best way to prevent all bots except Google from crawling my websites (please note that I’m hosting multiple websites on the same VPS).
After some research and testing, the best way I found to stop all bots except Google was to include the following in the robots.txt file:
User-Agent: *
Disallow: /

User-Agent: Googlebot
Allow: /

User-Agent: Googlebot-Mobile
Allow: /

User-Agent: Googlebot-Image
Allow: /

User-Agent: Mediapartners-Google
Allow: /

User-Agent: Adsbot-Google
Allow: /
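A quick way to sanity-check rules like these before deploying them is Python's standard urllib.robotparser module. The sketch below is only an illustration: the example.com URL, the is_allowed helper and the sample bot names are assumptions, not part of the original setup. Real crawlers may interpret edge cases differently from this parser, and robots.txt only affects bots that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /

User-Agent: Googlebot
Allow: /

User-Agent: Googlebot-Mobile
Allow: /

User-Agent: Googlebot-Image
Allow: /

User-Agent: Mediapartners-Google
Allow: /

User-Agent: Adsbot-Google
Allow: /
"""


def is_allowed(user_agent, url="https://example.com/some-page"):
    """Return True if the rules above let `user_agent` fetch `url`.

    The URL is a placeholder; any path behaves the same here because
    every rule applies to "/".
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sample user-agent names chosen for illustration.
    for agent in ("Googlebot", "Mediapartners-Google", "bingbot", "AhrefsBot"):
        verdict = "allowed" if is_allowed(agent) else "blocked"
        print(f"{agent}: {verdict}")
```

Running the script prints one line per bot, which makes it easy to confirm that only the Google crawlers get through before uploading the file.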
I know some people may ask “Why only allow Google?” The answer is that Google is the search engine that refers by far the most visitors to my websites. I don’t see the point of letting the other crawlers use up CPU and other resources, which could slow down the websites, and I’d rather keep them nice and clean with minimal unnecessary traffic. Also, I’ve noticed that some of the other crawlers are used for competitor analysis, which I don’t really use.
Is this the best way to do this? If you have any other tips, please let me know.
