Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be possible to view), we have locked both new posts and new replies.
Could you use a robots.txt file to disallow a duplicate content page from being crawled?
- A website has duplicate content pages to make it easier for users to find the information from a couple of spots in the site navigation. The site owner would like to keep it this way without hurting SEO. I've thought of using the robots.txt file to disallow search engines from crawling one of the pages. Would you think this is a workable/acceptable solution?
- Yeah, sorry for the confusion. I put the tag on all the pages (original and duplicate). I sent you a PM with another good article on the rel canonical tag.
- Peter, thanks for the clarification.
- Generally agree, although I'd just add that robots.txt also isn't so great at removing content that's already been indexed (it's better at prevention). So it's not just less than ideal; it sometimes doesn't even work in these cases. Rel-canonical is generally a good bet, and it should go on the duplicate (you can actually put it on both, although it's not necessary).
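To illustrate the "put it on both" option, here is a minimal Python sketch that builds the same canonical tag for the original page and its duplicate. The URLs and the `canonical_tag` helper are made up for illustration; they are not from the thread or from any particular CMS.

```python
from html import escape

def canonical_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element pointing at canonical_url."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'

# Hypothetical URLs: one preferred page and one duplicate reachable from
# a second spot in the site navigation.
ORIGINAL = "https://www.example.com/services/widgets"
DUPLICATE = "https://www.example.com/products/widgets"

# Both pages carry the same tag, and both point at the original.
head_tags = {page: canonical_tag(ORIGINAL) for page in (ORIGINAL, DUPLICATE)}

for page, tag in head_tags.items():
    print(page, "->", tag)
```

The tag on the original is self-referencing and, as noted above, optional; the one on the duplicate is what tells search engines which version to prefer.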
- Next time I'll read the reference links better. Thank you!
- Per Google Webmaster Tools: If Google knows that these pages have the same content, we may index only one version for our search results. Our algorithms select the page we think best answers the user's query. Now, however, users can specify a canonical page to search engines by adding a `<link>` element with the attribute rel="canonical" to the `<head>` section of the non-canonical version of the page. Adding this link and attribute lets site owners identify sets of identical content and suggest to Google: "Of all these pages with identical content, this page is the most useful. Please prioritize it in search results."
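One way to check that a duplicate page actually carries the element described in that quote is to parse its HTML and look for a canonical link inside the head. The sketch below uses Python's standard `html.parser`; the `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are hypothetical, and a real audit would first fetch the page's HTML.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr = dict(attrs)
            # rel may hold several space-separated values; compare case-insensitively.
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def find_canonical(html_text):
    """Return the canonical URL declared in the page head, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical

# Hypothetical markup for the duplicate page.
SAMPLE = (
    "<html><head><title>Widgets</title>"
    '<link rel="canonical" href="https://www.example.com/services/widgets">'
    "</head><body><p>Widget info</p></body></html>"
)

print(find_canonical(SAMPLE))
```

The parser deliberately ignores canonical links that appear outside the head, since the quoted guidance places the element in the head section.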
- Thanks Kyle. Anthony had a similar view on using the rel canonical tag. I'm just curious whether to add it to the original page, the duplicate page, or both? Thanks, Greg
- Anthony, thanks for your response. See Kyle's reply: he also felt using the rel canonical tag was the best thing to do. However, he seemed to think you'd put it on the original page, the one you want to rank for, and you're suggesting putting it on the duplicate page. Should it be added to both while specifying which page is the 'original'? Thanks! Greg
- I'm not sure I understand why the site owner thinks the duplicate content is necessary. If I were in your situation, I would be trying to convince the client to remove the duplicate content from their site rather than trying to find a way around it. If the information is difficult to find, this may be due to a problem with the site architecture. If the site does not flow well enough for visitors to find the information they need, then perhaps a site redesign is necessary.
- Well, the answer would be yes and no. A robots.txt file would stop the bots from crawling the page, but links from other pages on the site to that blocked page could still get its URL indexed. As posted in Google Webmaster Tools: "You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything in your site, you don't need a robots.txt file (not even an empty one). While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results." I think the best way to avoid any conflict is applying the rel="canonical" tag to each duplicate page that you don't want indexed. You can find more info on rel canonical here. Hope this helps out some.
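To make the crawl side of this concrete, here is a small sketch using Python's standard `urllib.robotparser` to evaluate a Disallow rule. The rule, the paths, and the `may_crawl` helper are hypothetical. The check only models whether a well-behaved crawler may fetch a URL; it says nothing about whether that URL can still show up in the index through links from other pages, which is the caveat in the quote above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking the duplicate page for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/widgets
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(url, user_agent="*"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)

# The duplicate is blocked from crawling; the original is not.
print(may_crawl("https://www.example.com/products/widgets"))
print(may_crawl("https://www.example.com/services/widgets"))
```

Note that a crawler blocked this way never fetches the duplicate page, so it also never sees a rel="canonical" tag placed on it.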
- The best way would be to use the rel canonical tag. On the page you would like to rank for, put the rel canonical tag in. This lets Google know that this is the original page. Check out this link posted by Rand about the rel canonical tag: [http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps](http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps)