Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
De-indexing product "quick view" pages
-
Hi there,
The e-commerce website I am working on seems to index all of the "quick view" pages (which normally occur as iframes on the category page) as their own unique pages, creating thousands of duplicate pages / overly-dynamic URLs. Each indexed "quick view" page has the following URL structure:
www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142&KeepThis=true&TB_iframe=true&height=475&width=700
where the only thing that changes is the product ID and category number.
Would using "disallow" in Robots.txt be the best way to de-indexing all of these URLs? If so, could someone help me identify how to best structure this disallow statement? Would it be:
Disallow: /catalog/includes/inc_productquickview.jsp?prodID=*
Thanks for your help.
-
Just to add, if you block URLs in robots.txt they wont actually get deindexed. They will be for all intents and purposes be blocked (wont cause duplicate content issues etc) but they will drop into the omitted results:
_In order to show you the most relevant results, we have omitted some entries very similar to the 13 already displayed._If you like, you can repeat the search with the omitted results included. And will look like this in the SERPS (see attachment).If you want them removed from the SERPs you will need to use the robots NOINDEX meta tag, or use GWMT as William advised.
The disallow entry you posted will block these pages, as long as they all start with that way. Although you don't actually need the trailing wild card as that gets ignored, you can just leave it open. Google robots.txt specs
-
Thanks William. I think I will stick with the Robots file in this case. I am nervous about using that parameter feature in case ?prodID is used in any other URL that should be indexed.
-
You can use that in your robots.txt, which should work on crawls.
Or
you can also go into WMT and setup your parameters, in this case would be ?prodID.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My product category pages are not being indexed on google can someone help?
My website has been indexed on google and all of its pages can be found on google except for the product category pages - which are where we want our traffic heading to, so this is a big problem for us. Our website is www.skirtinguk.com And an example of a page that isn't being indexed is https://www.skirtinguk.com/product-category/mdf-skirting-board/
Intermediate & Advanced SEO | | chelseaskirtinguk0 -
React.js Single Page Application Not Indexing
We recently launched our website that uses React.js and we haven't been able to get any of the pages indexed. Our previous site (which had a .ca domain) ranked #1 in the 4 cities we had pages and we redirected it to the .com domain a little over a month ago. We have recently started using prerender.io but still haven't seen any success. Has anyone dealt with a similar issue before?
Intermediate & Advanced SEO | | m_van0 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
Is their value in linking to PPC landing pages and using rel="canonical"
I have ppc landing pages that are similar to my seo page. The pages are shorter with less text with a focus on converting visitors further along in the purchase cycle. My questions are: 1. Is there a benefit for having the orphan ppc pages indexed or should I no index them? 2. If indexing does provide benefits, should I create links from my site to the ppc pages or should I just submit them in a sitemap? 3. If indexed, should I use rel="canonical" and point the ppc versions to the appropriate organic page? Thanks,
Intermediate & Advanced SEO | | BrandExpSteve0 -
Dynamic pages - ecommerce product pages
Hi guys, Before I dive into my question, let me give you some background.. I manage an ecommerce site and we're got thousands of product pages. The pages contain dynamic blocks and information in these blocks are fed by another system. So in a nutshell, our product team enters the data in a software and boom, the information is generated in these page blocks. But that's not all, these pages then redirect to a duplicate version with a custom URL. This is cached and this is what the end user sees. This was done to speed up load, rather than the system generate a dynamic page on the fly, the cache page is loaded and the user sees it super fast. Another benefit happened as well, after going live with the cached pages, they started getting indexed and ranking in Google. The problem is that, the redirect to the duplicate cached page isn't a permanent one, it's a meta refresh, a 302 that happens in a second. So yeah, I've got 302s kicking about. The development team can set up 301 but then there won't be any caching, pages will just load dynamically. Google records pages that are cached but does it cache a dynamic page though? Without a cached page, I'm wondering if I would drop in traffic. The view source might just show a list of dynamic blocks, no content! How would you tackle this? I've already setup canonical tags on the cached pages but removing cache.. Thanks
Intermediate & Advanced SEO | | Bio-RadAbs0 -
Best practice for removing indexed internal search pages from Google?
Hi Mozzers I know that it’s best practice to block Google from indexing internal search pages, but what’s best practice when “the damage is done”? I have a project where a substantial part of our visitors and income lands on an internal search page, because Google has indexed them (about 3 %). I would like to block Google from indexing the search pages via the meta noindex,follow tag because: Google Guidelines: “Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.” http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769 Bad user experience The search pages are (probably) stealing rankings from our real landing pages Webmaster Notification: “Googlebot found an extremely high number of URLs on your site” with links to our internal search results I want to use the meta tag to keep the link juice flowing. Do you recommend using the robots.txt instead? If yes, why? Should we just go dark on the internal search pages, or how shall we proceed with blocking them? I’m looking forward to your answer! Edit: Google have currently indexed several million of our internal search pages.
Intermediate & Advanced SEO | | HrThomsen0 -
How to properly link to products from category pages?
Hi All, We have an e-commerce website and the category pages are built so that there is a product image and below it there is the title. Both the image and the title are in a href (each on its own). I encountered the following unfinished discussion here at MOZ:
Intermediate & Advanced SEO | | BeytzNet
http://www.seomoz.org/q/how-to-optimize-achor-text-links-on-ecommerce-category-page#post-93758 The discussion states that its improper. The question is - if it is wrong then why? (maybe because Google will give its weight to the image anchor instead of the text anchor since it is higher in the page). The other question is how to resolve the matter?
Should I add nofollow to the image href? Thanks0 -
Best way to de-index content from Google and not Bing?
We have a large quantity of URLs that we would like to de-index from Google (we are affected b Panda), but not Bing. What is the best way to go about doing this?
Intermediate & Advanced SEO | | nicole.healthline0