Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to resolve Duplicate Page Content issue for root domain & index.html?
-
SEOMoz returns a Duplicate Page Content error for a website's index page, with both domain.com and domain.com/index.html isted seperately. We had a rewrite in the htacess file, but for some reason this has not had an impact and we have since removed it. What's the best way (in an HTML website) to ensure all index.html links are automatically redirected to the root domain and these aren't seen as two separate pages?
-
great code Josh...but , after i saved it on .htaccess , a "?" appeared on the link..
http://www.domain.com/?/example/file.html
Is this ok ? pls advice/
Thank you,
-
You touched on a good point here "We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps."
Too often I see sites where they get the home page right but miss the re-write on the directories.
-
Here's the .htaccess rewrite command that you can use for the index.html redirect -
Options +FollowSymlinks RewriteEngine on
Index Rewrite RewriteRule ^index.(htm|html|php) http://www.amarasoftware.com/ [R=301,L] RewriteRule ^(.*)/index.(htm|html|php) http://www.amarasoftware.com/$1/ [R=301,L]
We set up our site to utilize a index redirect for all of our sub directories as well, so with this method you simply name your sub directories to match the url path that you desire. Each sub directory has it's own index which you redirect with a variation of the above code. By doing this you can have nice clean url paths like http://www.semclix.com/design/ecommerce/ - and mitigate the duplicate content issue. We hope that this helps.
-
I'd check it with some other software too... i.e. Raven Tools free trial or something, that will tell you if there's canonicalization problems... of course I'm not advocating Raven Tools over SEOmoz tools (I'm a member here and not there for good reasons), I just think best to try a few different tests before deciding if it's a problem. There might just be an issue with the SEOmoz campaign tool for the moment, which I'm sure they'll fix as soon as they realise.
Hey, aren't you the tutor I had in my SEC usability course?
-
Unfortunately I can't speak for how SEOmoz handles rewrites like this if it's already crawled the page.
The rewrite rule you're using looks like it's only rewriting the www portion of the URL, not index.html. So alone it wouldn't do anything to solve dupe content issues. (someone please correct me if I'm misreading the rewrite rule)
Here's a link to what I used to write a redirect for index.html on another site.
http://www.webmasterworld.com/forum92/6375.htm
I think it is a fairly safe assumption to make that SEOmoz is smart enough to realize if you're got a redirect in there (providing that its working). I'd still recommend taking a look to see if Google has cached or indexed an index.html version, though.
Edit: my personal, highly technical, acid-test for an index.html redirect is just going there and manually entering the url with index.html on the end, rather than waiting for a recrawl to see if you're heading in the right direction.
-
RewriteEngine on RewriteCond %{HTTP_HOST} ^([a-z.]+)?amarasoftware.com$ [NC] RewriteCond %{HTTP_HOST} !^www. [NC] RewriteRule .? http://www.%1amarasoftware.com%{REQUEST_URI} [R=301,L] Is what I use. In Seomoz this leads to www.amarasoftware.com and index.html so 2 different URL's, both with different incoming links, and a different authority, which has an impact on my ranking if correct. in SEomoz this a returns a duplicate title and meta tags errors. If SEOmoz finds 2 pages instead of one I may assume that Google agrees with this.
-
As you did, I'd normally handle this with a 301 from index.html to the root domain. When you say that it's "not had an impact" do you mean that the SEOmoz dashboard continues to show an error after it re-crawls, or that the search engines are not picking up the redirect?
SEOmoz dashboard does a great job, but I'd check to see how the search engines are actually indexing yourdomain.com/index.html vs. yourdomain.com also. If the search engines are indexing it as you want them to, then I'd be inclined to ignore the dashboard error.
I apologize if this is a stupid question, but I assume you manually checked that the redirect worked?
-
You wish to canonicalize the pages. That is the SEO word which describes exactly what you are trying to achieve.
Above are 5 URLs which can possibly lead to the exact same page. If you add the following HTML in the code then the pages will be canonicalized.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
SEM Rush & Duplicate content
Hi SEMRush is flagging these pages as having duplicate content, but we have rel = next etc implemented: https://www.key.co.uk/en/key/brand/bott https://www.key.co.uk/en/key/brand/bott?page=2 Or is it being flagged as they're just really similar pages?
Intermediate & Advanced SEO | | BeckyKey0 -
What are best page titles for sub-domain pages?
Hi Moz communtity, Let's say a website has multiple sub-domains with hundreds and thousands of pages. Generally we will be mentioning "primary keyword & "brand name" on every page of website. Can we do same on all pages of sub-domains to increase the authority of website for this primary keyword in Google? Or it gonna end up as negative impact if Google consider as duplicate content being mentioned same keyword and brand name on every page even on website and all pages of sub domains? Thanks
Intermediate & Advanced SEO | | vtmoz0 -
Moving html site to wordpress and 301 redirect from index.htm to index.php or just www.example.com
I found page duplicate content when using Moz crawl tool, see below. http://www.example.com
Intermediate & Advanced SEO | | gozmoz
Page Authority 40
Linking Root Domains 31
External Link Count 138
Internal Link Count 18
Status Code 200
1 duplicate http://www.example.com/index.htm
Page Authority 19
Linking Root Domains 1
External Link Count 0
Internal Link Count 15
Status Code 200
1 duplicate I have recently transfered my old html site to wordpress.
To keep the urls the same I am using a plugin which appends .htm at the end of each page. My old site home page was index.htm. I have created index.htm in wordpress as well but now there is a conflict of duplicate content. I am using latest post as my home page which is index.php Question 1.
Should I also use redirect 301 im htaccess file to transfer index.htm page authority (19) to www.example.com If yes, do I use
Redirect 301 /index.htm http://www.example.com/index.php
or
Redirect 301 /index.htm http://www.example.com Question 2
Should I change my "Home" menu link to http://www.example.com instead of http://www.example.com/index.htm that would fix the duplicate content, as indx.htm does not exist anymore. Is there a better option? Thanks0 -
6 .htaccess Rewrites: Remove index.html, Remove .html, Force non-www, Force Trailing Slash
i've to give some information about my website Environment 1. i have static webpage in the root. 2. Wordpress installed in sub-dictionary www.domain.com/blog/ 3. I have two .htaccess , one in the root and one in the wordpress
Intermediate & Advanced SEO | | NeatIT
folder. i want to www to non on all URLs Remove index.html from url Remove all .html extension / Re-direct 301 to url
without .html extension Add trailing slash to the static webpages / Re-direct 301 from non-trailing slash Force trailing slash to the Wordpress Webpages / Re-direct 301 from non-trailing slash Some examples domain.tld/index.html >> domain.tld/ domain.tld/file.html >> domain.tld/file/ domain.tld/file.html/ >> domain.tld/file/ domain.tld/wordpress/post-name >> domain.tld/wordpress/post-name/ My code in ROOT htaccess is <ifmodule mod_rewrite.c="">Options +FollowSymLinks -MultiViews RewriteEngine On
RewriteBase / #removing trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L] #www to non
RewriteCond %{HTTP_HOST} ^www.(([a-z0-9_]+.)?domain.com)$ [NC]
RewriteRule .? http://%1%{REQUEST_URI} [R=301,L] #html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [NC,L] #index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
RewriteRule ^index.html$ http://domain.com/ [R=301,L]
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L]</ifmodule> The above code do 1. redirect www to non-www
2. Remove trailing slash at the end (if exists)
3. Remove index.html
4. Remove all .html
5. Redirect 301 to filename but doesn't add trailing slash at the end0 -
Are pages with a canonical tag indexed?
Hello here, here are my questions for you related to the canonical tag: 1. If I put online a new webpage with a canonical tag pointing to a different page, will this new page be indexed by Google and will I be able to find it in the index? 2. If instead I apply the canonical tag to a page already in the index, will this page be removed from the index? Thank you in advance for any insights! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
Duplicate Content From Indexing of non- File Extension Page
Google somehow has indexed a page of mine without the .html extension. so they indexed www.samplepage.com/page, so I am showing duplicate content because Google also see's www.samplepage.com/page.html How can I force google or bing or whoever to only index and see the page including the .html extension? I know people are saying not to use the file extension on pages, but I want to, so please anybody...HELP!!!
Intermediate & Advanced SEO | | WebbyNabler0 -
How important is the number of indexed pages?
I'm considering making a change to using AJAX filtered navigation on my e-commerce site. If I do this, the user experience will be significantly improved but the number of pages that Google finds on my site will go down significantly (in the 10,000's). It feels to me like our filtered navigation has grown out of control and we spend too much time worrying about the url structure of it - in some ways it's paralyzing us. I'd like to be able to focus on pages that matter (explicit Category and Sub-Category) pages and then just let ajax take care of filtering products below these levels. For customer usability this is smart. From the perspective of manageable code and long term design this also seems very smart -we can't continue to worry so much about filtered navigation. My concern is that losing so many indexed pages will have a large negative effect (however, we will reduce duplicate content and be able provide much better category and sub-category pages). We probably should have thought about this a year ago before Google indexed everything :-). Does anybody have any experience with this or insight on what to do? Thanks, -Jason
Intermediate & Advanced SEO | | cre80 -
301 - should I redirect entire domain or page for page?
Hi, We recently enabled a 301 on our domain from our old website to our new website. On the advice of fellow mozzer's we copied the old site exactly to the new domain, then did the 301 so that the sites are identical. Question is, should we be doing the 301 as a whole domain redirect, i.e. www.oldsite.com is now > www.newsite.com, or individually setting each page, i.e. www.oldsite.com/page1 is now www.newsite.com/page1 etc for each page in our site? Remembering that both old and new sites (for now) are identical copies. Also we set the 301 about 5 days ago and have verified its working but haven't seen a single change in rank either from the old site or new - is this because Google hasn't likely re-indexed yet? Thanks, Anthony
Intermediate & Advanced SEO | | Grenadi0