Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies. More details here.
Google Search Console says 'sitemap is blocked by robots.txt'?
Google Search Console is telling me "Sitemap contains URLs which are blocked by robots.txt." I don't understand why my sitemap is being blocked. My robots.txt looks like this:

```
User-Agent: *
Disallow:
```

It's a WordPress site, with Yoast SEO installed. Is anyone else having this issue with Google Search Console? Does anyone know how I can fix this issue?
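For what it's worth, you can reproduce what a crawler sees from those rules with Python's built-in `urllib.robotparser` (the example.com URLs below are placeholders, not the asker's real site). An empty `Disallow` blocks nothing, so if Search Console still reports blocked URLs, Google is likely reading a stale or different copy of the file:

```python
from urllib.robotparser import RobotFileParser

# The rules from the question: a wildcard user-agent with an empty
# Disallow, which permits crawling of the entire site.
rules = """User-Agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# An empty Disallow blocks nothing, so both checks should pass.
print(parser.can_fetch("Googlebot", "https://example.com/post-sitemap.xml"))  # True
print(parser.can_fetch("Googlebot", "https://example.com/some-post/"))        # True
```

If either call returned False, the robots.txt Google fetched would differ from the one shown above.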
Nice, happy to hear that. Do you work with Greg Reindel? He is a good friend; I looked at your IP, which is why I ask. Tom
I agree with David. Hey, is your dev Greg Reindel? If so, you can call me for help; PM me here for my info. Thomas Zickell
Hey guys, I ended up disabling the sitemap option in Yoast SEO, then installed the 'Google XML Sitemaps' plug-in. I re-submitted the sitemap to Google last night, and it came back with no issues. I'm glad to finally have this sorted out. Thanks for all the help!
 Hi Christian, The current robots.txt shouldn't be blocking those URLs. Did you or someone else recently change the robots.txt file? If so, give Google a few days to re-crawl your site. Also, can you check what happens when you do a fetch and render on one of the blocked posts in Search Console? Do you have issues there? Cheers, David 
I think you need to make an HTTPS robots.txt file if you are running HTTPS. See https://a-moz.groupbuyseo.org/blog/xml-sitemaps

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://domain.com/index-sitemap.xml
```

(That is an HTTPS sitemap.) Can you send the sitemap URL, or run it through DeepCrawl? Hope this helps. Did you make a new robots.txt file?
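As a quick local sanity check, the suggested rules can be run through Python's standard `urllib.robotparser` (domain.com is a placeholder). One caveat worth noting: the stdlib parser applies rules first-match in file order, unlike Googlebot's longest-match logic, so it treats /wp-admin/admin-ajax.php as blocked even though Google honours the Allow line:

```python
from urllib.robotparser import RobotFileParser

rules = """User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ordinary content stays crawlable; only /wp-admin/ is fenced off.
print(parser.can_fetch("Googlebot", "https://domain.com/a-blog-post/"))          # True
print(parser.can_fetch("Googlebot", "https://domain.com/wp-admin/options.php"))  # False
```

So the sitemap and normal posts remain crawlable under these rules; only the admin area is blocked.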
Thanks for the response. Do you think this is a robots.txt issue, or could it be caused by the Yoast SEO plugin? Do you know if that plug-in works together with Yoast SEO, or will it cause issues?
Thank you for the response. I just scanned the site using Screaming Frog. Under Internal > Directives there were zero 'noindex' links. I also checked for 404 errors, 500 server errors, and anything blocked by robots.txt. Google Search Console is still showing me that there are URLs in my sitemap being blocked (I added a screenshot of this). When I click through, it tells me that the post sitemap has over 300 warnings. I have just deleted the Yoast SEO plugin, and I am now re-installing it. Hopefully this fixes the issue.
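A crawler report like Screaming Frog's Directives tab can also be spot-checked by hand: a noindex can live in a `<meta name="robots">` tag (or an X-Robots-Tag HTTP header, not covered here). A minimal sketch using only the standard library's HTML parser; the sample page below is made up:

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collect the content of any <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.directives.append(attrs.get("content", "").lower())

def is_noindexed(html):
    """True if any robots meta tag on the page contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in d for d in finder.directives)

# A hypothetical page that an SEO plugin has marked noindex.
sample = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(is_noindexed(sample))  # True
```

Running this over each URL from the sitemap would confirm (or rule out) the noindex theory independently of any one crawler.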
No, you do not need to change or add a plug-in. What is happening is that Webmaster Tools is telling you that you have a noindex, nofollow, or robots X-Tag somewhere on the URLs inside your sitemap. Run your site through Moz, Screaming Frog SEO Spider, or DeepCrawl and look for noindexed URLs. Webmaster Tools/Search Console is telling you that you have noindexed URLs inside your XML sitemap, not that your robots.txt is blocking them. This would be set in the Yoast plugin. One way to correct it is to look for noindex URLs and filter them inside Yoast so they are not presented to the crawlers. If you would like, you can turn off the sitemap in Yoast and turn it back on. If that does not work, I recommend completely removing the plug-in and reinstalling it:
- https://kb.yoast.com/kb/how-can-i-uninstall-my-plugin/
- https://kinsta.com/blog/uninstall-wordpress-plugin/

Can you send a screenshot of what you're seeing? When you see it in Google Webmaster Tools, do you mean the XML sitemap itself is noindexed? All XML sitemaps are noindexed. Please add this to your robots.txt:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://www.website.com/sitemap_index.xml
```

I hope this is of help, Tom
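To see exactly which URLs the warnings refer to, one option is to pull the `<loc>` entries out of the sitemap with the standard library's XML parser and then audit that list in Screaming Frog or DeepCrawl. A sketch against an inline sample; a real run would fetch your sitemap URL instead:

```python
import xml.etree.ElementTree as ET

# The sitemaps.org schema namespace used by urlset sitemaps.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return every <loc> listed in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall("sm:url/sm:loc", NS)]

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/post-1/</loc></url>
  <url><loc>https://example.com/post-2/</loc></url>
</urlset>"""

print(sitemap_urls(sample))  # ['https://example.com/post-1/', 'https://example.com/post-2/']
```

With the list in hand, each URL can be checked for noindex directives one by one.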
Hi, Use this plugin: https://wordpress.org/plugins/wp-robots-txt/ It will remove the previous robots.txt and set a simple WordPress robots.txt. Wait a day and the problem should be solved. Also watch this video on the same topic: https://www.youtube.com/watch?v=DZiyN07bbBM Thanks
Related Questions
- How to remove Parameters from Google Search Console?
  Hi All, The following is a parameter configuration in Search Console: Parameter: fl. Does this parameter change page content seen by the user? Yes, it changes, reorders, or narrows page content. How does this parameter affect page content? Narrows. Which URLs with this parameter should Googlebot crawl? Let Googlebot decide (default). Query: it is actually a filter parameter. I have already set a canonical on the filter pages. Now I am tracking filter pages via the data layer and Tag Manager, so in Google Analytics I am not able to see filter URLs because of this parameter. So I want to delete this parameter. Can anyone please help me? Thanks! (Technical SEO | adamjack)
- Why isn't my homepage #1 when searching my brand name?
  Hi! We recently (a month ago) launched a new website. We have great content that updates every day, we're active on social platforms, and we did all that's possible, at the moment, when it comes to on-site optimization (a web developer will join our team this month and help us fix all the rest). When I search for our brand name, all our social profiles come up first, followed by a few inner pages from our different news sections, but our homepage is somewhere on the 2nd search page... What may be the reason for that? Is it just a matter of time, or is there a problem with our homepage I'm unable to find? Thanks! (Technical SEO | Orly-PP)
- Will an XML sitemap override a robots.txt?
  I have a client with a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an XML sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedence over the XML sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this, or any experience with a website that has had a clear Disallow: / for months, yet somehow has pages in the index? (Technical SEO | KCBackofen)
- Best Practices for adding Dynamic URLs to an XML Sitemap
  Hi Guys, I'm working on an ecommerce website where all the product pages use dynamic URLs (we also have a few static pages, but there is no issue with them). The products are updated on the site every couple of hours (because we sell out or the special offer expires), and as a result I keep seeing heaps of 404 errors in Google Webmaster Tools and am trying to avoid this (if possible). I have already created an XML sitemap for the static pages and am now looking at incorporating the dynamic product pages, but am not sure what the best approach is. The URL structure for the products is as follows: http://www.xyz.com/products/product1-is-really-cool, http://www.xyz.com/products/product2-is-even-cooler, http://www.xyz.com/products/product3-is-the-coolest. Here are 2 approaches I was considering: 1. Include the dynamic product URLs within the same sitemap as the static URLs, using just http://www.xyz.com/products/ - this way spiders have access to the folder the products are in and I don't have to create an automated sitemap for all products. OR 2. Create a separate automated sitemap that updates whenever a product is updated, with the change frequency set to hourly - this way spiders always have as close to an up-to-date sitemap as possible when they crawl. I look forward to hearing your thoughts, opinions, suggestions and/or previous experiences with this. Thanks heaps, LW (Technical SEO | seekjobs)
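Approach 2 above is simple to automate: a urlset sitemap is just XML, so it can be regenerated from the current product list on each update. A minimal sketch with the standard library (the URLs are the hypothetical ones from the question):

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls, changefreq="hourly"):
    """Build a urlset sitemap document, as a string, for the given URLs."""
    urlset = ET.Element("urlset",
                        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        # changefreq is only a hint to crawlers, not a guarantee.
        ET.SubElement(node, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")

xml_out = build_sitemap(["http://www.xyz.com/products/product1-is-really-cool"])
print(xml_out)
```

Running this on every product update, and writing the result to the sitemap path submitted in Webmaster Tools, keeps the sitemap in step with the catalogue.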
- Should I block robots from URLs containing query strings?
  I'm about to block off all URLs that have a query string using robots.txt. They're mostly URLs with Coremetrics tags and other referrer info. I figured that search engines don't need to see these, as they're always better off with the original URL. Might there be any downside to this that I need to consider? Appreciate your help / experiences on this one. Thanks, Jenni (Technical SEO | ShearingsGroup)
- How is a dash or "-" handled by Google search?
  I am targeting the keyword AK-47 and its variants in search (AK47, AK-47, AK 47). How should I handle on-page SEO? Right now I have AK47 and AK-47 incorporated. So my question really is: do I need to account for the space, or is Google handling a dash as a space? At a quick glance at the top 10, it seems the dash is handled as a space, but I just wanted to get a confirmation from people much smarter than I at SEOmoz. Thanks, Jason (Technical SEO | idiHost)
- OK to block /js/ folder using robots.txt?
  I know Matt Cutts suggests we allow bots to crawl CSS and JavaScript folders (http://www.youtube.com/watch?v=PNEipHjsEPU), but what if you have lots and lots of JS and you don't want to waste precious crawl resources? Also, as we update and improve the JavaScript on our site, we iterate the version number ?v=1.1... 1.2... 1.3... etc., and the legacy versions show up in Google Webmaster Tools as 404s. For example: http://www.discoverafrica.com/js/global_functions.js?v=1.1, http://www.discoverafrica.com/js/jquery.cookie.js?v=1.1, http://www.discoverafrica.com/js/global.js?v=1.2, http://www.discoverafrica.com/js/jquery.validate.min.js?v=1.1, http://www.discoverafrica.com/js/json2.js?v=1.1. Wouldn't it just be easier to prevent Googlebot from crawling the js folder altogether? Isn't that what robots.txt was made for? Just to be clear - we are NOT doing any sneaky redirects or other dodgy JavaScript hacks. We're just trying to power our content and UX elegantly with JavaScript. What do you guys say: obey Matt, or run the JavaScript gauntlet? (Technical SEO | AndreVanKets)
- Is blocking RSS Feeds with robots.txt necessary?
  Is it necessary to block an RSS feed with robots.txt? It seems they are automatically not indexed (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html), and Google says here that it's important not to block RSS feeds (http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html). I'm just checking! (Technical SEO | nicole.healthline)