Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google Indexing Feedburner Links???
-
I just noticed that for lots of the articles on my website, there are two results in Google's index. For instance:
http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html
and
Now my Feedburner feed is set to "noindex" and it's always been that way. The canonical tag on the webpage is set to:
rel='canonical' href='http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html' />
The robots tag is set to:
name="robots" content="index,follow,noodp" />
I found out that there are scrapper sites that are linking to my content using the Feedburner link.
So should the robots tag be set to "noindex" when the requested URL is different from the canonical URL? If so, is there an easy way to do this in Wordpress?
-
Hi Stephane
I also do not see duplicate pages indexed in Google. Not this page: http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html or any others searching the few pages of results on a site: search.
Perhaps it was a temporary thing were some got indexed. I've seen that happen on occasion, and then they get removed.
If you're still seeing otherwise definitely feel free to chime back in, and provide screenshots or an example query.
Thanks!
-Dan
-
I was trying to see how fast Google was indexing new content on my site (quite fast, about 15 seconds).
So I just searched "site:mywebsite.com" and two results came up for some pages.
-
Why do you believe that the second page is indexed? What was your search query?
-
Why does google shows duplicate results for the same page then?
-
Long as you have the canonical tag set up, you should be fine. If they link to the url with the feedburner variables attached, it is the equivalent of linking straight to your actual post page.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Is Indexing my 301 Redirects to Other sites
Long story but now i have a few links from my site 301 redirecting to youtube videos or eCommerce stores. They carry a considerable amount of traffic that i benefit from so i can't take them down, and that traffic is people from other websites, so basically i have backlinks from places that i don't own, to my redirect urls (Ex. http://example.com/redirect) My problem is that google is indexing them and doesn't let them go, i have tried blocking that url from robots.txt but google is still indexing it uncrawled, i have also tried allowing google to crawl it and adding noindex from robots.txt, i have tried removing it from GWT but it pops back again after a few days. Any ideas? Thanks!
Intermediate & Advanced SEO | | cuarto7150 -
Can Google Bot View Links on a Wix Page?
Hi, The way Wix is configured you can't see any of the on-page links within the source code. Does anyone know if Google Bots still count the links on this page? Here is the page in question: https://www.ncresourcecenter.org/business-directory If you do think Google counts these links, can you please send me URL fetcher to prove that the links are crawlable? Thank you SO much for your help.
Intermediate & Advanced SEO | | Fiyyazp0 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
Google Not Indexing XML Sitemap Images
Hi Mozzers, We are having an issue with our XML sitemap images not being indexed. The site has over 39,000 pages and 17,500 images submitted in GWT. If you take a look at the attached screenshot, 'GWT Images - Not Indexed', you can see that the majority of the pages are being indexed - but none of the images are. The first thing you should know about the images is that they are hosted on a content delivery network (CDN), rather than on the site itself. However, Google advice suggests hosting on a CDN is fine - see second screenshot, 'Google CDN Advice'. That advice says to either (i) ensure the hosting site is verified in GWT or (ii) submit in robots.txt. As we can't verify the hosting site in GWT, we had opted to submit via robots.txt. There are 3 sitemap indexes: 1) http://www.greenplantswap.co.uk/sitemap_index.xml, 2) http://www.greenplantswap.co.uk/sitemap/plant_genera/listings.xml and 3) http://www.greenplantswap.co.uk/sitemap/plant_genera/plants.xml. Each sitemap index is split up into often hundreds or thousands of smaller XML sitemaps. This is necessary due to the size of the site and how we have decided to pull URLs in. Essentially, if we did it another way, it may have involved some of the sitemaps being massive and thus taking upwards of a minute to load. To give you an idea of what is being submitted to Google in one of the sitemaps, please see view-source:http://www.greenplantswap.co.uk/sitemap/plant_genera/4/listings.xml?page=1. Originally, the images were SSL, so we decided to reverted to non-SSL URLs as that was an easy change. But over a week later, that seems to have had no impact. The image URLs are ugly... but should this prevent them from being indexed? The strange thing is that a very small number of images have been indexed - see http://goo.gl/P8GMn. I don't know if this is an anomaly or whether it suggests no issue with how the images have been set up - thus, there may be another issue. Sorry for the long message but I would be extremely grateful for any insight into this. I have tried to offer as much information as I can, however please do let me know if this is not enough. Thank you for taking the time to read and help. Regards, Mark Oz6HzKO rYD3ICZ
Intermediate & Advanced SEO | | edlondon0 -
Do links to PDF's on my site pass "link juice"?
Hi, I have recently started a project on one of my sites, working with a branch of the U.S. government, where I will be hosting and publishing some of their PDF documents for free for people to use. The great SEO side of this is that they link to my site. The thing is, they are linking directly to the PDF files themselves, not the page with the link to the PDF files. So my question is, does that give me any SEO benefit? While the PDF is hosted on my site, there are no links in it that would allow a spider to start from the PDF and crawl the rest of my site. So do I get any benefit from these great links? If not, does anybody have any suggestions on how I could get credit for them. Keep in mind that editing the PDF's are not allowed by the government. Thanks.
Intermediate & Advanced SEO | | rayvensoft0 -
Link Research Tools - Detox Links
Hi, I was doing a little research on my link profile and came across a tool called "LinkRessearchTools.com". I bought a subscription and tried them out. Doing the report they advised a low risk but identified 78 Very High Risk to Deadly (are they venomous?) links, around 5% of total and advised removing them. They also advised of many suspicious and low risk links but these seem to be because they have no knowledge of them so default to a negative it seems. So before I do anything rash and start removing my Deadly links, I was wondering if anyone had a). used them and recommend them b). recommend detoxing removing the deadly links c). would there be any cases in which so called Deadly links being removed cause more problems than solve. Such as maintaining a normal looking profile as everyone would be likely to have bad links etc... (although my thinking may be out on that one...). What do you think? Adam
Intermediate & Advanced SEO | | NaescentAdam0 -
Yoast SEO Plugin: To Index or Not to index Categories?
Taking a poll out there......In most cases would you want to index or NOT index your category pages using the Yoast SEO plugin?
Intermediate & Advanced SEO | | webestate0 -
How do you de-index and prevent indexation of a whole domain?
I have parts of an online portal displaying in SERPs which it definitely shouldn't be. It's due to thoughtless developers but I need to have the whole portal's domain de-indexed and prevented from future indexing. I'm not too tech savvy but how is this achieved? No index? Robots? thanks
Intermediate & Advanced SEO | | Martin_S0