Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be viewable - we have locked both new posts and new replies.
Should I use meta noindex and robots.txt disallow?
- Hi, we have an alternate "list view" version of every one of our search results pages. The list view has its own URL, indicated by a URL parameter. I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the number of pages that need crawling. When they were first launched, I had the noindex meta tag placed on all list view pages, but I'm concerned that they are still being crawled. Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or will Googlebot/Bingbot also stop crawling those pages over time? I assume that noindex still means "crawl"... Thanks
- Hi, Thanks, I will do some testing to confirm that this behaves how I would like it to.
- If all the pages are 100% not indexed, then I would block them in robots.txt. Google's John Mueller confirmed to me that Googlebot will continue to crawl every link to check whether a nofollow or noindex has changed status. So we blocked our pages with robots.txt and saw a great increase in index/crawl rates on the pages we want Google to pay attention to. It also reduces wasted server resources. However, if any of the pages are indexed and you block them in robots.txt, Googlebot will never be able to crawl the link to determine that it should be noindex. This means the page could stay indexed permanently. I hope that answers all your questions?
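The check described above (make sure nothing is still indexed before blocking) could be sketched roughly like this in Python. This is only an illustration under assumptions: the parameter name `view=list`, the two rules, and the sample URLs are all hypothetical, and the matcher only handles the `*` and `$` wildcards that Google documents for robots.txt paths. It ignores `Allow` rules and rule precedence.

```python
import re

def robots_pattern_to_regex(pattern: str):
    """Translate a robots.txt path pattern with '*' and '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is treated literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path_and_query: str, disallow_patterns) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(robots_pattern_to_regex(p).match(path_and_query)
               for p in disallow_patterns)

def still_indexed_and_blocked(indexed_paths, disallow_patterns):
    """Indexed URLs the rules would block, i.e. pages whose noindex tag
    could no longer be read once the rules are live."""
    return [p for p in indexed_paths if is_disallowed(p, disallow_patterns)]

# Hypothetical rules: block the list view parameter wherever it appears
# in the query string, without also catching e.g. "preview=list".
RULES = ["/*?view=list", "/*&view=list"]

# Hypothetical sample of URLs that still show up in the index.
INDEXED = ["/search?q=shoes", "/search?q=shoes&view=list"]

print(still_indexed_and_blocked(INDEXED, RULES))
```

If the list comes back empty, the disallow should be safe to add. If not, those URLs probably need to drop out of the index first, while the noindex tag can still be crawled.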
- When you say "nofollow will tell the crawlers to not crawl the page", I believe you mean that it tells the crawlers not to crawl the links on the page. The page itself is still "crawled", is it not? But yes, you are right to say that once the robots.txt disallow is in place, the meta tag will not be seen and will thus be moot (at which point I may as well take it off). It would be nice to be able to say "don't crawl this and don't put it in the index"... but is there a way?
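To make the noindex/nofollow distinction concrete, here is a rough Python sketch of how a page's robots meta tag could be read. The sample markup and the helper names are made up for illustration. The point is that a crawler has to fetch the page before it can see either directive, which is why a robots.txt disallow makes the tag invisible.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated and case insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def robots_directives(html: str):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Hypothetical list view page: keep it out of the index, but links on it
# may still be followed.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
found = robots_directives(page)

print("noindex" in found)   # True: asks engines not to index this page
print("nofollow" in found)  # False: links on the page may still be followed
```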
- noindex only tells the search crawlers not to include the page in the index, but still allows them to crawl the page. nofollow will tell the crawlers to not crawl the page. robots.txt will accomplish this as well, but I think using both would be overkill.