Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be possible to view), we have locked both new posts and new replies.
Investigating a huge spike in indexed pages
-
I've noticed an enormous spike in pages indexed through WMT in the last week. Now, I know WMT can be a bit (OK, a lot) off base in its reporting, but this was pretty hard to explain. See, we're in the middle of a huge campaign against dupe content and we've put a number of measures in place to fight it. For example, we:
  - Implemented a strong canonicalization effort
  - Programmatically NOINDEX'd content we know to be duplicate
  - Are currently fixing true duplicate content issues by rewriting titles, descriptions, etc.

So I was pretty surprised to see the blow-up. Any ideas as to what else might cause such a counterintuitive trend? Has anyone else seen Google do something that suddenly gloms onto a bunch of phantom pages?
-
I haven't contacted the forum yet, but that's my next step.
Pages indexed: 91k
Blocked by robots.txt: 8.4 million
I don't even know how you could create 8.4 million indexable pages from our content.
-
Have you contacted the Google Webmaster Help forums? That seems to be a glitch on Google's side. How many pages are crawled by Mozbot? If the number Mozbot shows is different, then you should either sit and wait until Google removes those indexed pages or start a conversation on the forums so someone at Google can give you a hint of what is going on.
-
Any help out there? Since the original question was posted, I've seen some improvement, but even with aggressive canonicalization and noindexing, I'm still seeing a boatload of indexed pages. I am still seeing pages indexed that I've explicitly asked to be omitted via robots.txt (/search.aspx and */filter). I'm guessing it's just going to take a while to deindex what's there. Still, 91k indexed pages is quite a lot when you consider we only have about 3-4k pages and some articles. Is anyone aware of any significant releases by Google?
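
One thing that may be worth ruling out here is whether the robots.txt patterns actually match the URLs in question. The sketch below is a minimal, hedged illustration in Python of how `*` and a trailing `$` are interpreted under RFC 9309; it is not Google's matcher, it ignores `Allow` rules and rule precedence, and the sample paths are hypothetical (only `/search.aspx` and `*/filter` come from the post above).

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Handles the two special characters defined in RFC 9309:
    '*' (any run of characters) and a trailing '$' (end of URL).
    Every other character is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(
        robots_pattern_to_regex(p).match(path) is not None
        for p in disallow_patterns
    )


# Patterns quoted in the post; the sample paths are made up for illustration.
DISALLOW = ["/search.aspx", "*/filter"]
SAMPLE_PATHS = [
    "/search.aspx?q=widgets",
    "/products/filter/red",
    "/products/widget-a",
]

for path in SAMPLE_PATHS:
    status = "blocked" if is_disallowed(path, DISALLOW) else "crawlable"
    print(f"{path}: {status}")
```

Keep in mind that a Disallow rule stops crawling rather than indexing: a URL that is blocked in robots.txt cannot be fetched, so any noindex tag or canonical on that page will not be read, and the URL can stay in the index if it was already known.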
-
 Quite recent. We were actually seeing a nice downward trend in the huge number of pages indexed and then the number tripled. Crazy is an understatement. I would have thought the number of pages would fall given the number of pages that now use canonicals. 
-
How long has it been since you applied all the rules to avoid duplicate content? If it was just recently, then Google should be "rebuilding" its index of your site, and stats may be a little crazy while that is happening. If it was over 2 months ago and you are only seeing the increase now, then I'd suggest you review the rules you created to see whether your own website is creating all those new pages. Hope that helps.
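
As a rough way to review those rules page by page, the following Python sketch extracts the robots meta directive and the canonical URL from a page's HTML using only the standard library. The class name, function name, and sample markup are hypothetical; in practice you would feed it the HTML of the URLs that show up as unexpectedly indexed.

```python
from html.parser import HTMLParser


class IndexSignalParser(HTMLParser):
    """Collect the robots meta directive and canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.robots = (attr.get("content") or "").lower()
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def index_signals(html: str) -> dict:
    """Return whether the page carries noindex and which canonical it declares."""
    parser = IndexSignalParser()
    parser.feed(html)
    return {
        "noindex": parser.robots is not None and "noindex" in parser.robots,
        "canonical": parser.canonical,
    }


# Hypothetical markup standing in for a fetched page.
SAMPLE = """<html><head>
<meta name="robots" content="NOINDEX, follow">
<link rel="canonical" href="https://www.example.com/products/widget-a">
</head><body></body></html>"""

print(index_signals(SAMPLE))
```

Running this over a sample of the indexed URLs would show which ones are missing the noindex or canonical you expected, which is one way to tell a site-side rule gap from a Google-side reporting lag.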