Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies.
Oh no, Googlebot cannot access my robots.txt file
I just received an error message from Google Webmaster Tools. I wonder if it has something to do with the Yoast plugin. Could somebody help me with troubleshooting this? Here's the original message:

Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.

Recommended action

If the site error rate is 100%:
- Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot.
- If your robots.txt is a static page, verify that your web service has proper permissions to access the file.
- If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure.

If the site error rate is less than 100%:
- Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors.
- The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website.

After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
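To illustrate the first recommended check, here is a minimal Python sketch that fetches the robots.txt once as a regular browser and once with a Googlebot-style User-Agent, so you can see whether the server treats the two differently. The use of the requests library and the User-Agent strings are assumptions for illustration, not part of the original message.

```python
import requests

URL = "http://www.soobumimphotography.com/robots.txt"

# Two User-Agent strings: a typical browser and a Googlebot-style one.
# If the browser request succeeds but the Googlebot-style request fails,
# the server (or a firewall/.htaccess rule) is likely blocking Googlebot.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in USER_AGENTS.items():
    try:
        resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
        print(f"{name}: HTTP {resp.status_code}, {len(resp.content)} bytes")
    except requests.RequestException as exc:
        # Connection errors or timeouts here mirror what Googlebot reports
        # as "unreachable" in Webmaster Tools.
        print(f"{name}: request failed ({exc})")
```

If both requests return 200 from your own machine but Googlebot still reports errors, the block is more likely intermittent or network-level (host overload, firewall rate limiting) than a permanent configuration problem.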
I can open the text file, but GoDaddy told me the robots.txt file is not on my server (root level). They also told me that my site is not being crawled because the robots.txt file is not there. Basically, all of this might have resulted from a plugin I was using (Term Optimizer). Based on what GoDaddy told me, my .htaccess file was broken because of that and had to be recreated. So now the .htaccess file is good. Now I have to figure out why my site is not accessible to Googlebot. Let me know, Keith, if this is a quick fix or needs some time to troubleshoot. You can send me a message to discuss fees if necessary. Thanks again.
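If the file really is missing at the root level, a minimal, permissive robots.txt can be created there by hand or with a short script. The sketch below assumes a typical shared-hosting document root path and an example sitemap URL; both are assumptions and need adjusting for the actual account.

```python
from pathlib import Path

# Assumed document root on typical shared hosting; adjust for your account.
DOC_ROOT = Path("/home/youraccount/public_html")

# A minimal, permissive robots.txt: allow everything and point to the sitemap.
# The sitemap URL is only an example.
ROBOTS_TXT = """User-agent: *
Disallow:

Sitemap: http://www.soobumimphotography.com/sitemap.xml
"""

robots_path = DOC_ROOT / "robots.txt"
robots_path.write_text(ROBOTS_TXT)
print(f"Wrote {len(ROBOTS_TXT)} bytes to {robots_path}")
```

Note that WordPress plugins often serve a virtual robots.txt even when no physical file exists on disk, which may explain why the file is reachable in a browser while the host does not see it on the server.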
Hi, you have a robots.txt file here: http://www.soobumimphotography.com/robots.txt. Can you rephrase this so it makes sense? "I called Godaddy and told me if I used any plug ins etc. Godaddy fixed .htaccss file and my site was up and runningjust fine." Yes, Google XML Sitemaps will add the location of your sitemap to the robots.txt file, but there is nothing wrong with your robots.txt file.
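For reference, a quick way to confirm what the live robots.txt actually contains (including any Sitemap line a plugin has added) is to fetch it and print the directives. A minimal sketch, again assuming the requests library:

```python
import requests

resp = requests.get("http://www.soobumimphotography.com/robots.txt", timeout=10)
print("Status:", resp.status_code)

for line in resp.text.splitlines():
    # Print only the directives search engines actually act on.
    if line.lower().startswith(("user-agent", "disallow", "allow", "sitemap")):
        print(line)
```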
I just called GoDaddy and they told me that I don't have a robots.txt file. Can anyone help with this issue? So here's what happened: I purchased Joost de Valk's Term Optimizer to consolidate tags etc. As soon as I installed and opened it, my site crashed. I called GoDaddy and they asked me if I had used any plugins etc. GoDaddy fixed the .htaccess file and my site was up and running just fine. Doesn't a plugin like Google XML Sitemaps automatically generate a robots.txt file?
 Yes, my site was down. 
I had a .htaccess issue over the past 24 hours with a plugin, and GoDaddy fixed it for me. I think this caused the problem. I just fetched again and am still getting an unreachable page. I wonder if I have a bad .htaccess file.
Was your site down during this period? I would recommend setting up pingdom.com (free site monitoring), which will email you if your site goes down - I suspect this is a hosting-related issue. FYI, I can access your robots.txt fine from here.
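If you prefer to roll your own check while waiting on the host, here is a minimal sketch of the same idea: poll the site and log when it stops responding. The URL and the interval are just example values.

```python
import time
import requests

URL = "http://www.soobumimphotography.com/"
CHECK_EVERY = 300  # seconds between checks; an example value

while True:
    try:
        status = requests.get(URL, timeout=15).status_code
        ok = status == 200
    except requests.RequestException:
        ok, status = False, None
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    print(f"{stamp}  {'UP' if ok else 'DOWN'}  (status={status})")
    # A real monitor would send an email or alert here instead of just printing.
    time.sleep(CHECK_EVERY)
```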
 Hi Bistoss, You should log into Google Webmaster Tools to check the day the problem occurred. It is not uncommon for host to have problems that temporarily cause access problems. In some rare cases Google itself could be having problems. For example, in July we had 1 day with a 11% failure rate, it was the host. Since then no problems. If your problems are persistent, then you may have an issue like this: http://blog.jitbit.com/2012/08/fixing-googlebot-cant-access-your-site.html old Analytic code. Other things to look at is any recent changes, specifically anything that had to do with .htaccess Be sure to use the FETCH AS GOOGLE bot after any changes to verify that Google can now crawl your site. Hope this helps 
I also use the Robots Meta Configuration plugin.
Related Questions
		Google indexing despite robots.txt block
Hi, this subdomain has about 4'000 URLs indexed in Google, although it's blocked via robots.txt: https://www.google.com/search?safe=off&q=site%3Awww1.swisscom.ch&oq=site%3Awww1.swisscom.ch This has been the case for almost a year now, and it does not look like Google tends to respect the blocking in http://www1.swisscom.ch/robots.txt. Any clues why this is or what I could do to resolve it? Thanks! Technical SEO | zeepartner0
		Block Domain in robots.txt
Hi. We had some URLs that were indexed in Google from a www1 subdomain. We have now disabled the URLs (returning a 404 - for other reasons we cannot do a redirect from www1 to www) and blocked them via robots.txt. But the number of indexed pages keeps increasing (for 2 weeks now). Unfortunately, I cannot install Webmaster Tools for this subdomain to tell Google to back off... Any ideas why this could be and whether it's normal? I can send you more domain info by personal message if you want to have a look at it. Technical SEO | zeepartner0
		What can I do if my reconsideration request is rejected?
Last week I received an unnatural link warning from Google. Sad times. I followed the guidelines and reviewed all my inbound links for the last 3 months. All 5000 of them! Along with several genuine ones from trusted sites like BBC, Guardian and Telegraph there was a load of spam. About 2800 of them were junk. As we don't employ any SEO agency and don't buy links (we don't even buy AdWords!) I know that all of this spam is generated by spam bots and site scrapers copying our content. As the bad links have not been created by us and there are 2800 of them, I cannot hope to get them removed. There are no 'contact us' pages on these Russian spam directories and Indian scraper sites. And as for the 'adult bookmarking website' who have linked to us over 1000 times, well I couldn't even contact that site in company time if I wanted to! As a result I did my manual review all day, made a list of 2800 bad links and disavowed them. I followed this up with a reconsideration request to tell Google what I'd done, but a week later this was rejected: "We've reviewed your site and we still see links to your site that violate our quality guidelines." As these links are beyond my control and I've tried to disavow them, is there anything more to be done? Cheers, Steve Technical SEO | SteveBrumpton0
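As context for the disavow step described above: Google's disavow file is a plain text list where each line is either a URL or a "domain:" entry, and lines starting with "#" are comments. A minimal sketch that builds such a file from a list of spammy domains; the domain names are placeholders, not the actual spam sources.

```python
# Build a disavow file in the format accepted by Google's disavow links tool:
# one URL per line, or "domain:example.com" to disavow an entire domain.
# The domains below are placeholders, not the actual spam sources.
spam_domains = [
    "spam-directory.example",
    "scraper-site.example",
    "adult-bookmarks.example",
]

lines = ["# Disavow file generated after manual link review"]
lines += [f"domain:{d}" for d in spam_domains]

with open("disavow.txt", "w", encoding="utf-8") as fh:
    fh.write("\n".join(lines) + "\n")

print(f"Wrote {len(spam_domains)} domain entries to disavow.txt")
```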
		Can too many pages hurt crawling and ranking?
Hi, I work for local yellow pages in Belgium. Over the last months we introduced a successful technique to boost SEO traffic: we have created over 150k new pages, all targeting specific keywords and all containing unique content, a site architecture to enable Google to find these pages through crawling, XML sitemaps, ... All signs (traffic, indexation of XML sitemaps, rankings, ...) are positive. So far so good. We are able to quickly build more unique pages, and I wonder how Google will react to this type of "large scale operation": can it hurt crawling and ranking if Google notices big volumes of (unique) content? Please advise. Technical SEO | TruvoDirectories0
		Googlebot does not obey robots.txt disallow
Hi Mozzers! We are trying to get Googlebot to steer away from our internal search results pages by adding a parameter "nocrawl=1" to facet/filter links and then using robots.txt to disallow all URLs containing that parameter. We implemented this in late August and since then, the GWMT message "Googlebot found an extremely high number of URLs on your site" stopped coming. But today we received yet another. The weird thing is that Google gives many of our now robots.txt-disallowed URLs as examples of URLs that may cause us problems. What could be the reason? Best regards, Martin Technical SEO | TalkInThePark0
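A side note on the technique described above: a disallow rule for a URL parameter relies on Google's wildcard extensions, which Python's built-in urllib.robotparser does not fully implement, so the sketch below uses a simple glob match as an approximation to sanity-test example URLs against a rule like "Disallow: /*nocrawl=1". The rule and the test URLs are assumptions for illustration.

```python
from fnmatch import fnmatch

# Approximation of the robots.txt rule "Disallow: /*nocrawl=1",
# where "*" matches any sequence of characters (Google's extension).
DISALLOW_PATTERN = "/*nocrawl=1*"

test_paths = [
    "/search?color=red&nocrawl=1",
    "/search?color=red",
    "/category/shoes?nocrawl=1&page=2",
]

for path in test_paths:
    blocked = fnmatch(path, DISALLOW_PATTERN)
    print(f"{path}  ->  {'disallowed' if blocked else 'allowed'}")
```

Also worth remembering: robots.txt stops crawling, not indexing, so URLs Google already knows about (e.g. from internal links) can still be reported or shown as indexed even after the disallow takes effect.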
		Googlebot Crawl Rate causing site slowdown
I am hearing from my IT department that Googlebot is causing a massive slowdown/crash of our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles to the website each day. We provide daily stock research and market analysis, so it's all high-quality, relevant content. Here are the crawl stats from WMT: http://imgur.com/dyIbf I have not worked with a lot of high-volume, high-traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from Googlebot. Do these crawl stats seem in line with similar sites? Would slowing down crawl rates have a big effect on rankings? Thanks Technical SEO | SuperMikeLewis0
		Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com, and somehow Google has indexed DEV.website.com and NEW.website.com, and I'd like these to be removed from Google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" - is that possible? I have only seen examples of DISALLOW with a "/" at the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites. Technical SEO | ErnieB0
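One common pattern for the situation described above is to serve robots.txt dynamically and decide what to return from the Host header, so dev/new subdomains always get a blanket disallow while production gets the real rules (robots.txt is per-host, so a rule on www cannot block another subdomain). The asker is on ColdFusion; the Python sketch below only illustrates the idea, and everything except the hostnames taken from the question is an assumption.

```python
# Decide which robots.txt body to serve based on the requested hostname.
# www.website.com, dev.website.com and new.website.com come from the question;
# the rest is illustrative.

PRODUCTION_HOSTS = {"www.website.com", "website.com"}

PRODUCTION_ROBOTS = """User-agent: *
Disallow:

Sitemap: http://www.website.com/sitemap.xml
"""

BLOCK_ALL_ROBOTS = """User-agent: *
Disallow: /
"""

def robots_for_host(host: str) -> str:
    """Return the robots.txt body appropriate for the requesting hostname."""
    if host.lower().split(":")[0] in PRODUCTION_HOSTS:
        return PRODUCTION_ROBOTS
    # dev.website.com, new.website.com, or any other non-production host
    return BLOCK_ALL_ROBOTS

if __name__ == "__main__":
    for host in ("www.website.com", "dev.website.com", "new.website.com"):
        print(f"--- {host} ---")
        print(robots_for_host(host))
```

Note that a blanket Disallow stops crawling but does not by itself remove URLs that are already indexed; verifying the subdomains in Webmaster Tools and requesting removal (option a in the question) is usually still needed.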
		Does Google index XML files?
Do Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an XML filetype and an RSS feed. Technical SEO | nicole.healthline0