Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Can too many "noindex" pages compared to "index" pages be a problem?
- Hello, I have a question for you: our website, virtualsheetmusic.com, includes thousands of product pages and, because of past Panda penalties, we have noindexed most of them in the hope of some sort of recovery (not yet seen, though!). So we currently have about 4,000 "index" pages compared to about 80,000 "noindex" pages. Now we plan to add 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will also be marked "noindex, follow". At the end of the integration process, we will end up with something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have any negative effect on our current natural search engine profile, or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
- Julian, we sell digital sheet music, and the additional 100,000 are products from the Alfred music publishing company. Of course they will not be "high quality pages", but they are product pages, each one offering a piece of music. We are an e-commerce website; how can we avoid having product pages?! But of course, as Wesley said above, we can improve the quality of each product page's content by giving more/custom information for each product, increasing user reviews, etc. Other suggestions?
- Thank you Wesley, yes, I think you are right. Our business is suffering far too much without the traffic coming from the "noindex" pages, and after many months we still don't see a recovery. I think the best approach would probably be to keep the pages in the index and differentiate them as much as we can. Thank you!
- Panda is probably the worst penalty to have. Very few sites ever recover, even though site owners have spent a lot of time, effort and money trying to solve it (e.g. http://searchengineland.com/google-panda-two-years-later-losers-still-losing-one-real-recovery-149491). In this video, at about 12:43, Matt Cutts is clear: if you think it's low quality, 404 it; in other words, delete it. May I ask why you want to keep these 180,000 pages live? And why are you planning to add another 100,000 pages? Surely they can't be high quality pages?
- Fabrizio, as far as I know Google Panda is now part of the standard Google algorithm and it won't be a periodic event anymore. Penguin still is, though. If your product pages are duplicate content according to Google, try to see if you can do something about that instead of noindexing them. Is there no way you can update the products so they display a more prominent description? I understand that doing it manually isn't possible because there are far too many products for that to be an option. I did notice that a lot of your product pages have this standard text: "This item includes: PDF (digital sheet music to print), Scorch files (for online playing, transposition and printing), Videos, MIDI and Mp3 audio files (including Mp3 music accompaniment files)* Genre: classical Skill Level: medium". Since this is basically the only text on a lot of pages, I think it's a big part of the problem. Maybe you can change this text so it looks different for every product? Try tools like http://www.plagspotter.com/ to find the duplicate content and see which solution is best for your specific problem. I hope I helped, and if you need more help, let me know.
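To get a rough sense of how much two product pages overlap when they share the standard text quoted above, one option is to compare their word shingles. This is only a minimal sketch, not how Google or PlagSpotter measure duplication; the two page texts, the `shingles` and `jaccard` helpers, and the 3-word shingle size are all assumptions made for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical product texts; only the shared wording comes from the thread.
BOILERPLATE = (
    "This item includes: PDF (digital sheet music to print), Scorch files "
    "(for online playing, transposition and printing), Videos, MIDI and Mp3 "
    "audio files. Genre: classical Skill Level: medium"
)
page_a = "Ave Maria for violin and piano. " + BOILERPLATE
page_b = "Canon in D for flute and piano. " + BOILERPLATE

score = jaccard(page_a, page_b)
print(f"similarity: {score:.2f}")
```

A score close to 1.0 means the two pages are mostly the same words in the same order; adding unique descriptions or reviews to each page should pull the score down.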
- I understand what you mean and I agree with you in general, but as for our own website specifically, I have no idea who put that link on that page, which is, by the way, a "nofollow" link. We never built links; all our incoming links are either natural or links from our own affiliates. I don't see much of "that stuff" in our backlink profile... am I wrong? Anyhow, yes, we are aware the situation is quite complex. Thank you again.
- I actually looked at the competitors ranking #3 and #4 for the phrase "download sheet music", since you're ranking 5th. Either way, it's not a matter of too much or too little. It's how much of the link profile is authentic vs. how much is made up of stuff like this: http://www.dionneco.com/2011/02/love-is-a-parallax/ That's what I meant by fake links. I think what you may be missing is how complex the situation really is. There's a lot more to be considered than a number in Open Site Explorer, which is actually only a portion of what's really out there. You may also want to look at changes you can make on-site. I'm a firm believer that proper HTML, accessibility, UX and all that really matter.
- Thank you Takeshi, I think you got the problem right. The "crawling" side of the issue is something I was thinking about too! We are actually working on every aspect of our website to improve its content because we have suffered a lot from Panda in the past two years, so here is the strategy we began to follow in March:
  1. Noindexing most of our thin or almost-duplicate content to get it removed from the index.
  2. Improving our best content and differentiating it as much as we can with compelling content (this takes a long time!).
  3. Consolidating similar pages with the use of canonical tags.
  To tackle the "slower crawling" problem you have highlighted here, do you think it would be better for us to stop search engines from crawling those pages altogether via robots.txt once they have been removed from the index? Would that solve the crawl issue? I could do that at least with the 100,000 new product pages we plan to add! Thank you!
- Wesley, that's because we were penalized by Panda several times in the past... so we are trying the "clean-up" strategy in the hope of being "de-penalized" by Panda at the next related algorithm update. It looks like we had too many "thin" or "almost duplicate" pages... that's why we removed so many pages from the index! But if we don't see improvements in the coming 1-2 months, I guess we'll put the product pages back in the index because our business is suffering a great deal!
- Colin, what do you mean by "fake links" exactly? Our link profile actually looks in better shape than our main competitors':
  virtualsheetmusic.com (our site): links: 614,013; root domains: 2,233
  sheetmusicplus.com (competitor): links: 5,322,596; root domains: 6,149 (worse than our profile!)
  musicnotes.com (competitor): links: 6,527,429; root domains: 2,914 (much worse than our profile!)
  Am I missing anything?
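One way to read the figures above is as links per root domain. The post doesn't say which measure the comparison rests on, so treat that choice as an assumption; the sketch below only redoes the arithmetic on the quoted numbers and says nothing about link quality.

```python
# Link and root-domain counts exactly as quoted in the post above.
profiles = {
    "virtualsheetmusic.com": (614_013, 2_233),
    "sheetmusicplus.com": (5_322_596, 6_149),
    "musicnotes.com": (6_527_429, 2_914),
}


def links_per_domain(links: int, root_domains: int) -> float:
    """Average number of links contributed by each linking root domain."""
    return links / root_domains


ratios = {site: links_per_domain(*counts) for site, counts in profiles.items()}

# List the sites from the lowest ratio to the highest.
for site, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{site}: {ratio:,.0f} links per root domain")
```

A high ratio usually points to many sitewide or repeated links from the same domains, but it cannot tell authentic links from spam, which is the distinction raised elsewhere in this thread.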
- The discrepancy between noindexed and indexed pages is not in itself a problem. However, having all those pages will present a challenge to Google in terms of crawling. Even though the pages won't be indexed, Google will need to spend some of your limited crawl budget crawling all of them. Also, to recover from Panda it's necessary not only to noindex duplicate content, but also to improve your indexed content. That means things like consolidating similar pages into one page, writing unique content for your pages, and getting unique user-generated content such as reviews.
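Because the noindex directive lives in each page's HTML, a crawler has to fetch the page before it can see the directive, which is why noindexed pages still use crawl budget. Here is a minimal sketch of reading that directive with Python's standard library only. The sample pages and URLs are hypothetical, and the sketch ignores the `X-Robots-Tag` HTTP header, which can carry the same directives.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def is_indexable(html: str) -> bool:
    """True unless the page carries a robots meta 'noindex' (or 'none')."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


# Hypothetical pages for illustration only.
product_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
category_page = '<html><head><meta name="robots" content="index, follow"></head></html>'

pages = {"/product/1": product_page, "/category/violin": category_page}
noindexed = [url for url, html in pages.items() if not is_indexable(html)]
print(noindexed)
```

Running a check like this over a crawl export is one way to keep the "index" and "noindex" counts discussed in this thread up to date.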
- Why would you want to noindex your product pages? They seem like the kind of pages you want to get found on. There shouldn't be a problem with the number of indexed vs. noindexed pages, except that you won't get found on the noindexed ones. Product pages tend to be the kind of pages that you REALLY want to get found on. I think you should rethink your strategy for recovering from the penalties. Try to find out exactly where the penalties came from and fix the errors in that area of your website.
- Can't say I've been in that situation, but search engines seem to interpret that tag as an on/off situation, and I think you probably know that your problems aren't related to, or able to be solved by, robots meta tags. You need fewer fake links. OSE finds well over half a million links from 3K root domains to your site. Look at your competitors: a few thousand links from a handful of domains. It's a shame, because it seems like the internet wanted to make you the authority naturally. You've got a handful of really solid links coming in. If you could shed the spam somehow, you'd be doing a lot better. So yeah, stating the obvious, I know. Best of luck to you, and I hope the site recovers!
Related Questions
- No Index thousands of thin content pages?
  Hello all! I'm working on a site that features a service marketed to community leaders that allows the citizens of that community to log 311-type issues such as potholes, broken streetlights, etc. The "marketing" front of the site is 10-12 pages of content to be optimized for community leaders searching for the service; however, as you can imagine, there are thousands and thousands of pages of one- or two-line complaints such as, "There is a pothole on Main St. and 3rd." These complaint pages are not about the service and, I'm thinking, not helpful to my end goal of gaining awareness of the service through search among community leaders. Community leaders are searching for "311 request service", not "potholes on main street". Should all of these "complaint" pages be NOINDEX'd? What if there are a number of quality links pointing to the complaint pages? Do I have to worry about losing Domain Authority if I do NOINDEX them? Thanks for any input. Ken Intermediate & Advanced SEO | | KenSchaefer0
- Google Indexing Of Pages As HTTPS vs HTTP
  We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL security to the site. However, on a lot of our site pages we use an iframe from a third-party vendor for real estate listings; that iframe was not SSL friendly, and the vendor does not have a solution yet. So those iframes weren't displaying the content. As a result, we had to shift gears and go back to plain http instead of the new https that we were hoping for. However, Google seems to have indexed a lot of our pages as https, and visitors get a security error. The new site was launched about a week ago, and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file so it no longer pushes to https. My question is: will Google "reindex" the site once it recognizes the new htaccess commands in the next couple of weeks? Intermediate & Advanced SEO | | vikasnwu1
- E-Commerce Site Collection Pages Not Being Indexed
  Hello everyone, this is not really my strong suit, but I’m going to do my best to explain the full scope of the issue and really hope someone has some insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is that when we do a site: search of our Collection pages (site:Domain.com/Collections/), they don’t seem to be indexed. Also, not sure if it’s relevant, but we recently did an overhaul of our design. Because we haven’t been able to identify the issue, here’s everything we know/have done so far:
  - Moz Crawl Check: the Collection pages came up.
  - Checked Organic Landing Page Analytics (source/medium: Google), and the pages are getting traffic.
  - Submitted the pages to Google Search Console.
  - The URLs are listed in the sitemap.xml, but when we tried to submit the Collections sitemap.xml to Google Search Console, 99 were submitted but nothing came back as being indexed (like our other pages and products).
  - We tested the URL in GSC’s robots.txt tester and it came up as “allowed”, but just in case, below is the language used in our robots.txt:
  User-agent: *
  Disallow: /admin
  Disallow: /cart
  Disallow: /orders
  Disallow: /checkout
  Disallow: /9545580/checkouts
  Disallow: /carts
  Disallow: /account
  Disallow: /collections/+
  Disallow: /collections/%2B
  Disallow: /collections/%2b
  Disallow: /blogs/+
  Disallow: /blogs/%2B
  Disallow: /blogs/%2b
  Disallow: /design_theme_id
  Disallow: /preview_theme_id
  Disallow: /preview_script_id
  Disallow: /apple-app-site-association
  Sitemap: https://domain.com/sitemap.xml
  A Google cache: search currently shows a collections/all page we have up that lists all of our products. Please let us know if there are any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best, Intermediate & Advanced SEO | | Ben-R0
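The rules quoted above can also be checked offline. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules are copied from the post and `domain.com` is the post's own placeholder, but `/collections/shoes` is a hypothetical path added for illustration. Python's matcher is a simple prefix check, so it may not agree with Googlebot in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the post above; domain.com is the post's placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# /collections/shoes is a hypothetical collection URL, not from the post.
results = {}
for path in ("/collections/all", "/collections/shoes", "/cart", "/checkout"):
    results[path] = parser.can_fetch("*", "https://domain.com" + path)
    print(f"{path}: {'allowed' if results[path] else 'blocked'}")
```

If the collection paths come back as allowed here too, robots.txt is probably not what is keeping those pages out of the index.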
- Using hreflang="en" instead of hreflang="en-gb"
  Hello, I have a question regarding international SEO and the hreflang attribute. We are currently a B2B business in the UK. Our major market is England, with some international sales as exceptions. We want to improve our ranking in other English-speaking countries and regions such as Ireland and the Channel Islands. My research has found regional Google search engines for Ireland (google.ie), Jersey (google.je) and Guernsey (google.gg). All of these regions have English as one of their main languages, so here is my question: because I use hreflang="en-gb" as my site language, am I regionally excluding these countries and islands? If I used hreflang="en", would it include these English-speaking regions and possibly improve the ranking on these regional search engines? Thank you, Intermediate & Advanced SEO | | SilverStar11
- Date of page first indexed or age of a page?
  Hi, does anyone know any way or tool to find out when a page was first indexed/cached by Google? I remember that a while back, around 2009, I had a Firefox plugin which could check this and gave you an exact date. Maybe this has changed since. I don't remember the plugin. Or are there any recommendations for finding the age of a page (not a domain) on a website? This is for competitor research, not my own website. Cheers, Paul Intermediate & Advanced SEO | | MBASydney0
- How long does it take for a page to show up in Google results after removing noindex from the page?
  Hi folks, a client of mine created a new page and used meta robots noindex to keep the page hidden until they were ready to launch it. The problem is that Google somehow "crawled" the page and now, after removing the meta robots noindex, the page does not show up in the results. We've tried to crawl it using Fetch as Googlebot and then submit it using the button that appears. We've included the page in sitemap.xml and also used the old Google submit new page URL https://www.google.com/webmasters/tools/submit-url Does anyone know how long it will take for Google to show the page AFTER removing meta robots noindex from it? Are there any reliable references for this? I did not find any Google video/post about it. I know that it will appear in a few days, but I'd like to have a good reference for the future. Thanks. Intermediate & Advanced SEO | | fabioricotta-840380
- Is it a bad idea to have a "press" page and link to press mentions of our company?
  We've recently been getting quite a bit of press. Would it be wise to create a "press" page and link to mentions of us, or would this devalue the links on the press pages, as Google may think they are reciprocal? Intermediate & Advanced SEO | | JenniferDacosta0
- All In One SEO PACK Configuration - Index or Noindex?
  I'm finding conflicting information about the right way to configure the All in One SEO Pack WordPress plugin. Do I index or noindex for the items below?
  - Use noindex for Categories - yes or no?
  - Use noindex for Archives - yes or no?
  - Use noindex for Tag Archives - yes or no?
  Intermediate & Advanced SEO | | webestate0