Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
WordPress - How to stop both http:// and https:// pages being indexed?
-
Just published a static page 2 days ago on WordPress site but noticed that Google has indexed both http:// and https:// url's. Usually I only get http:// indexed though.
Could anyone please explain why this may have happened and how I can fix? Thanks!
-
Just one adjustment to this - although I think David's right that the canonical tag can be a good solution. Although Google can index https: fine, the issue is whether you're creating duplicates. If you have duplicates, then it's possible that the https: version could be the one you want as canonical. In this case, it doesn't sound like it, but I just wanted to point that out.
Of course, long-term, you should sort out why these are being created. A desktop crawler like Xenu or Screaming Frog may be the best bet, but I'd hit the WordPress forums, too. Odds are it's a common issue. Typically, it happens when some deeper page (like a shopping cart) on a site is secure, and then the links are all relative ("/about.php", for example). Then, those links get crawled as both secure and non-secure.
Unfortunately, I'm not a WordPress expert, so I can only speak in generalities.
-
Thanks David, I feel like going out to buy some Swedish Fish for some reason now.
-
I actually just did a wealth of research on this topic a few days ago. Without going into the nitty gritty details, if the https is site-wide Google recommends a Rel="canonical" attribute (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394) pointing to the non-secure http version. Google claims it can index https fine, but Matt Cutts said he would "lean towards pointing the canonical to the http version." Also, on the Rel="canonical" page Google says:
If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the element:
Add this link to the section of https://www.example.com/product.php?item=swedish-fish.
Make sure the canonical is on every page of your site.
Not sure why this may have happened, but it is creating duplicate content, which is why the canonical is necessary.
Hope that helps!
Thanks
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sudden Indexation of "Index of /wp-content/uploads/"
Hi all, I have suddenly noticed a massive jump in indexed pages. After performing a "site:" search, it was revealed that the sudden jump was due to the indexation of many pages beginning with the serp title "Index of /wp-content/uploads/" for many uploaded pieces of content & plugins. This has appeared approximately one month after switching to https. I have also noticed a decline in Bing rankings. Does anyone know what is causing/how to fix this? To be clear, these pages are **not **normal /wp-content/uploads/ but rather "index of" pages, being included in Google. Thank you.
Technical SEO | | Tom3_150 -
Robots.txt on http vs. https
We recently changed our domain from http to https. When a user enters any URL on http, there is an global 301 redirect to the same page on https. I cannot find instructions about what to do with robots.txt. Now that https is the canonical version, should I block the http-Version with robots.txt? Strangely, I cannot find a single ressource about this...
Technical SEO | | zeepartner0 -
Which Sitemap to keep - Http or https (or both)
Hi, Just finished upgrading my site to the ssl version (like so many other webmasters now that it may be a ranking factor). FIxed all links, CDN links are now secure, etc and 301 Redirected all pages from http to https. Changed property in Google Analytics from http to https and added https version in Webmaster Tools. So far, so good. Now the question is should I add the https version of the sitemap in the new HTTPS site in webmasters or retain the existing http one? Ideally switching over completely to https version by adding a new sitemap would make more sense as the http version of the sitemap would anyways now be re-directed to HTTPS. But the last thing i can is to get penalized for duplicate content. Could you please suggest as I am still a rookie in this department. If I should add the https sitemap version in the new site, should i delete the old http one or no harm retaining it.
Technical SEO | | ashishb010 -
How to stop google from indexing specific sections of a page?
I'm currently trying to find a way to stop googlebot from indexing specific areas of a page, long ago Yahoo search created this tag class=”robots-nocontent” and I'm trying to see if there is a similar manner for google or if they have adopted the same tag? Any help would be much appreciated.
Technical SEO | | Iamfaramon0 -
Do I use /es/, /mx/ or /es-mx/ for my Spanish site for Mexico only
I currently have the Spanish version of my site under myurl.com/es/ When I was at Pubcon in Vegas last year a panel reviewed my site and said the Spanish version should be in /mx/ rather than /es/ since es is for Spain only and my site is for Mexico only. Today while trying to find information on the web I found /es-mx/ as a possibility. I am changing my site and was planning to change to /mx/ but want confirmation on the correct way to do this. Does anyone have a link to Google documentation that will tell me for sure what to use here? The documentation I read led me to the /es/ but I cannot find that now.
Technical SEO | | RoxBrock0 -
Should i index or noindex a contact page
Im wondering if i should noindex the contact page im doing SEO for a website just wondering if by noindexing the contact page would it help SEO or hurt SEO for that website
Technical SEO | | aronwp0 -
How to Remove /feed URLs from Google's Index
Hey everyone, I have an issue with RSS /feed URLs being indexed by Google for some of our Wordpress sites. Have a look at this Google query, and click to show omitted search results. You'll see we have 500+ /feed URLs indexed by Google, for our many category pages/etc. Here is one of the example URLs: http://www.howdesign.com/design-creativity/fonts-typography/letterforms/attachment/gilhelveticatrade/feed/. Based on this content/code of the XML page, it looks like Wordpress is generating these: <generator>http://wordpress.org/?v=3.5.2</generator> Any idea how to get them out of Google's index without 301 redirecting them? We need the Wordpress-generated RSS feeds to work for various uses. My first two thoughts are trying to work with our Development team to see if we can get a "noindex" meta robots tag on the pages, by they are dynamically-generated pages...so I'm not sure if that will be possible. Or, perhaps we can add a "feed" paramater to GWT "URL Parameters" section...but I don't want to limit Google from crawling these again...I figure I need Google to crawl them and see some code that says to get the pages out of their index...and THEN not crawl the pages anymore. I don't think the "Remove URL" feature in GWT will work, since that tool only removes URLs from the search results, not the actual Google index. FWIW, this site is using the Yoast plugin. We set every page type to "noindex" except for the homepage, Posts, Pages and Categories. We have other sites on Yoast that do not have any /feed URLs indexed by Google at all. Side note, the /robots.txt file was previously blocking crawling of the /feed URLs on this site, which is why you'll see that note in the Google SERPs when you click on the query link given in the first paragraph.
Technical SEO | | M_D_Golden_Peak0 -
Wordpress categories causing too many links/duplicate content?
I've just added categories to my wordpress site and some of the posts show in several of the categories. Will this cause me duplicate content problems as I want the category pages to be indexed? Also as I add more categories I'm creating more links on the page. They can't be seen to the user as I have a plugin that creates drop down categories. When I go to 'view source' though all the links are there so google will see lots of links. How can I fix the too many links problem? And should I worry about duplicate content issue?
Technical SEO | | SamCUK1