Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Googlebot Crawl Rate causing site slowdown
-
I am hearing from my IT department that Googlebot is causing as massive slowdown/crash our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles on the website each day. We provide daily stock research and marke analysis, so its all high quality relevant content. Here are the crawl stats from WMT:
I have not worked with a lot of high volume high traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from GoogleBot.
Do these crawl stats seem in line with sites? Would slowing down crawl rates have a big effect on rankings?
Thanks
-
Similar to Michael, my IT team is saying Googlebot is causing performance issues - specifically during peak hours.
It was suggested that we consider using apache re-write rules to serve Googlebot a 503 during our peak hours to limit the impact. I found the stackoverflow thread (link below) in which John Muller seems to suggest this approach, but has anyone tried this?
-
Blocking googlebot is a quick and easy way to disappear from the Index. Not an option if you want Google to rank your site.
For smaller sites or ones with limited technologies, I sometimes recommend using a crawl-delay directive in robots.txt
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=48620
But I agree with both Shane and Zachary, this doesn't seem like the long term answer to your problems. Your crawl stats don't seem out of line for a site of your size, and perhaps a better hardware configuration could help things out.
With 70 new articles each day, I'd want Google crawling my site as much as they pleased.
-
whatever Google's default is in GWT - It sets it for you.
You can change it, but it is not reccomended unless for a specific reason (such as Michael Lewis's specific scenario) even though, I am not completely sold that Gbot is what is causing the "dealbreaking" overhead.
-
what is the ideal setting on the crawler. i have been wondering about this for some time.
-
Hi,
Your admins saying that, is like someone saying "we need to shut the site down, we are getting to much traffic!" Common sys-admin response (fix it somewhere else)
4GB a day downloaded, is alot of Bot traffic, but it appears you are a "real time" site, that is probably actually helped and maybe even reliant on your high crawl rate....
I would upgrade hardware - or even look into some kind of off site cloud redundancy for failover (Hybrid)
I highly doubt that 4GB a day, is a "dealbreaker",but of course that is just based off the one image, and your admins probably have resource monitors - Maybe Varnish is an answer for static content to help lighten load???? Or CDN for file hosting to lighten bandwidth load?
Shane
-
We are hosting the site on our own hardware at a big colo. I know that we are upgrading servers but they will not be online until the end of July.
Thanks!
-
I wouldn't slow the crawl rate. A high crawl rate is good so that Google can keep their index of your website current.
The better solution is to reconsider your hardware and networking setup. Do you know how you are being hosted? From my own experience with a website of that size, a load balancer on two decent dedicated servers should handle the load without problems. Google crawling your pages shouldn't create noticeable overhead on the right setup.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does anyone know the linking of hashtags on Wix sites does it negatively or postively impact SEO. It is coming up as an error in site crawls 'Pages with 404 errors' Anyone got any experience please?
Does anyone know the linking of hashtags on Wix sites does it negatively or positively impact SEO. It is coming up as an error in site crawls 'Pages with 404 errors' Anyone got any experience please? For example at the bottom of this blog post https://www.poppyandperle.com/post/face-painting-a-global-language the hashtags are linked, but they don't go to a page, they go to search results of all other blogs using that hashtag. Seems a bit of a strange approach to me.
Technical SEO | | Mediaholix0 -
Why does Bing bot crawl so aggressively?
We observer that the Bing bot is crawling our site very aggressively. We set Bing's crawl control so that it should not crawl us during heavy traffic hours, but that did not change a thing. Does anyone have the problem and even better a solution?
Technical SEO | | Roverandom1 -
Tools/Software that can crawl all image URLs in a site
Excluding Screaming Frog, what other tools/software to use in order to crawl all image URLs in a site? Because in Screaming Frog, they don't crawl image URLs which are not under the site domain. Example of an image URL outside the client site: http://cdn.shopify.com/images/this-is-just-a-sample.png If the client is: http://www.example.com, Screaming Frog only crawls images under it like, http://www.example.com/images/this-is-just-a-sample.png
Technical SEO | | jayoliverwright0 -
Does my "spam" site affect my other sites on the same IP?
I have a link directory called Liberty Resource Directory. It's the main site on my dedicated IP, all my other sites are Addon domains on top of it. While exploring the new MOZ spam ranking I saw that LRD (Liberty Resource Directory) has a spam score of 9/17 and that Google penalizes 71% of sites with a similar score. Fair enough, thin content, bunch of follow links (there's over 2,000 links by now), no problem. That site isn't for Google, it's for me. Question, does that site (and linking to my own sites on it) negatively affect my other sites on the same IP? If so, by how much? Does a simple noindex fix that potential issues? Bonus: How does one go about going through hundreds of pages with thousands of links, built with raw, plain text HTML to change things to nofollow? =/
Technical SEO | | eglove0 -
Staging site and "live" site have both been indexed by Google
While creating a site we forgot to password protect the staging site while it was being built. Now that the site has been moved to the new domain, it has come to my attention that both the staging site (site.staging.com) and the "live" site (site.com) are both being indexed. What is the best way to solve this problem? I was thinking about adding a 301 redirect from the staging site to the live site via HTACCESS. Any recommendations?
Technical SEO | | melen0 -
Mobile site ranking instead of/as well as desktop site in desktop SERPS
I have just noticed that the mobile version of my site is sometimes ranking in the desktop serps either instead of as well as the desktop site. It is not something that I have noticed in the past as it doesn't happen with the keywords that I track, which are highly competitive. It is happening for results that include our brand name, e.g '[brand name][search term]'. The mobile site is served with mobile optimised content from another URL. e.g wwww.domain.com/productpage redirects to m.domain.com/productpage for mobile. Sometimes I am only seen the mobile URL in the desktop SERPS, other times I am seeing both the desktop and mobile URL for the same product. My understanding is that the mobile URL should not be ranking at all in desktop SERPS, could we be being penalised for either bad redirects or duplicate content? Any ideas as to how I could further diagnose and solve the problem if you do believe that it could be harming rankings?
Technical SEO | | pugh0 -
International Site Links In Footer
We have several international sites and we have them linked in the footer of our main .com site . Should we add "nofollow" to these links? Our concern is that Google could see these sites as a network?
Technical SEO | | EwanFisher0 -
Delete old site but redirect domain to a new domain and site
I just have a quick query and I have a feeling about what the answer is so just wanted to see what you guys thought... Basically I am working on a client site. This client has a few other websites that are divisions of their company. However these divisions/websites are no longer used. They are wanting to delete the websites but redirect the domains to their name main website. They believe this will pass on SEO benefits as these old division sites are old and have a good PR and history. I'm unsure for DEFINITE, which way is correct?
Technical SEO | | Weerdboil0