Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content (many posts will still be possible to view), we have locked both new posts and new replies.
Mod Rewrite / .htaccess avoid duplicate content
- I have been searching and testing for hours but cannot find a solution. I am able to get a URL to display without the file extension, i.e. domain.com/file instead of domain.com/file.php. The problem is that both versions of the URL work, which creates a duplicate content issue. How can I force the URL with the file extension not to resolve and return a 404 error? Or just redirect it to the non-extension URL? If it helps, here is my code:

 Options +FollowSymLinks
 RewriteEngine On
 RewriteCond %{REQUEST_FILENAME} !-f
 RewriteCond %{REQUEST_FILENAME} !-d
 RewriteCond %{REQUEST_FILENAME}.php -f
 RewriteRule ^(.+)$ $1.php [L,QSA]
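For the 404 option raised in the question, a minimal sketch could look like the block below. It is not taken from any reply in this thread. It assumes Apache with mod_rewrite, an .htaccess file in the document root, and that these lines sit above the existing internal rewrite to `.php`.

```apache
RewriteEngine On

# Assumption: only direct client requests for *.php should fail.
# THE_REQUEST holds the raw request line (e.g. "GET /file.php HTTP/1.1") and is
# not changed by internal rewrites, so the /file -> file.php rewrite keeps working.
RewriteCond %{THE_REQUEST} \s/[^?\s]*\.php[?\s] [NC]
# "-" means no substitution; R=404 stops rewriting and returns 404 Not Found.
RewriteRule ^ - [R=404,L]
```

A 301 redirect to the extensionless URL is usually the friendlier choice when the `.php` URLs already have links or are indexed, since a 404 discards that history.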
- Hi Erik, No problem, glad I could help. To answer your question: no, it doesn't matter which you use, because the end result will be rewritten to remove the file extension and add a forward slash at the end. For consistency, though, I would suggest linking without the .php inside your content. If nothing else, it would save you the pain of having to remove .php from your content if you moved to a content management system in the future. If you've got any other questions, let me know and I'll be happy to help. Ben
- I didn't say thanks before, so thank you. One question I did not think of: should the internal linking of the site point to the file name with the extension or without it? I think it should be without the extension, but I just want to double-check.
- Hi Ben. I tried this code on another hosting account and it worked. The first account was a VPS account from GoDaddy. The second was a shared account from the same hosting company. I'm not sure why it works on one and not on the other. I did see the mod_rewrite option enabled.
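One common reason the same .htaccess behaves differently on two Apache hosts is that `AllowOverride` is set to `None` for the site's directory, in which case .htaccess files are ignored even though mod_rewrite is loaded. Whether that was the cause here cannot be confirmed from the thread. The sketch below assumes root access to the VPS and uses a placeholder document root.

```apache
# Goes in the main server or virtual host configuration, not in .htaccess.
# "/var/www/html" is a placeholder; substitute the site's real document root.
<Directory "/var/www/html">
    # FileInfo permits mod_rewrite directives in .htaccess;
    # Options permits the "Options +FollowSymLinks" line.
    AllowOverride FileInfo Options
</Directory>
```

Apache needs a reload or restart after this change before the .htaccess rules take effect.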
- Just tried this on my development server and it worked fine:

 RewriteBase /
 RewriteEngine on

 # remove .php; use THE_REQUEST to prevent infinite loops
 RewriteCond %{HTTP_HOST} ^test\.local
 RewriteCond %{THE_REQUEST} ^GET\ (.*)\.php\ HTTP
 RewriteRule (.*)\.php$ $1 [R=301]

 # remove index
 RewriteRule (.*)index$ $1 [R=301]

 # remove slash if not directory
 RewriteCond %{REQUEST_FILENAME} !-d
 RewriteCond %{REQUEST_URI} /$
 RewriteRule (.*)/ $1 [R=301]

 # add .php to access file, but don't redirect
 RewriteCond %{REQUEST_FILENAME}.php -f
 RewriteCond %{REQUEST_URI} !/$
 RewriteRule (.*) $1.php [L]

 The dev URL is test.local, so you would want to change this to www.yourdomain.co.uk. I had a page called about.php: whether I entered http://test.local/about.php or http://test.local/about, it would show http://test.local/about in the address bar.
- Hi Ben. Thanks for your help, but this does not work for some reason. I'm testing it on an old site I have that is HTML, so I just replaced php with html, but both URLs still resolve.
- Good answer, Ben. My main site runs on my own CMS, which I built 10 years ago. After I added a lot of things to the .htaccess file and it became too large, I moved the handling inside the control program, which only looks up filed URLs when they are broken. This processing is fast, and if there is any degradation, it only affects the broken URLs. Speaking of broken URLs, I was getting a few 400 return codes, and it seems the web server handles those itself, so you have no chance to handle them in .htaccess. The way to handle that is with a 400 handler, which on cPanel sites just needs a 400.shtml file that you can customize. You get a 400 response if you request a URL with a % symbol on the end. Some other site did that, thanks very much, and then Google decided it would be a great thing to index.
- Try using this instead:
<code>RewriteBase /</code>
<code># remove .php; use THE_REQUEST to prevent infinite loops
RewriteCond %{HTTP_HOST} ^www\.domain\.com
RewriteCond %{THE_REQUEST} ^GET\ (.*)\.php\ HTTP
RewriteRule (.*)\.php$ $1 [R=301]

# remove index
RewriteRule (.*)index$ $1 [R=301]

# remove slash if not directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} /$
RewriteRule (.*)/ $1 [R=301]

# add .php to access file, but don't redirect
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1.php [L]</code>
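The same two-step idea (redirect direct `.php` requests externally, then map the clean URL back to the file internally) can be written more compactly. The block below is a hedged sketch, not the code posted in this thread. It assumes the .htaccess file lives in the document root and that pages are plain `.php` files, and it leaves out the `index` and trailing-slash handling from the answer above.

```apache
RewriteEngine On

# 1) Externally redirect direct requests for /page.php to /page.
#    THE_REQUEST is the raw request line, so the internal rewrite in step 2
#    cannot retrigger this rule. %1 is the path captured by the condition;
#    the original query string is carried over automatically.
RewriteCond %{THE_REQUEST} \s/+([^?\s]+)\.php[?\s] [NC]
RewriteRule ^ /%1 [R=301,L]

# 2) Internally serve page.php for /page when that file exists.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^(.+)$ $1.php [L]
```

Unlike the `^GET\ ` pattern above, the first condition matches any request method, so forms that POST to a `.php` URL would need their action updated to the extensionless path. Testing with a temporary `R=302` before switching to `R=301` avoids browsers caching a wrong redirect.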