Best practice for removing indexed internal search pages from Google?

HrThomsen

Hi Mozzers

I know that it’s best practice to block Google from indexing internal search pages, but what’s best practice when “the damage is done”?

I have a project where a substantial share of our visitors and income (about 3%) lands on internal search pages, because Google has indexed them.

I would like to block Google from indexing the search pages via the meta noindex,follow tag because:

Google Guidelines: “Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines.” http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769
Bad user experience
The search pages are (probably) stealing rankings from our real landing pages
Webmaster Notification: “Googlebot found an extremely high number of URLs on your site” with links to our internal search results

I want to use the meta tag to keep the link juice flowing. Do you recommend using the robots.txt instead? If yes, why?

Should we just go dark on the internal search pages, or how shall we proceed with blocking them?

I’m looking forward to your answer!

Edit: Google has currently indexed several million of our internal search pages.

italominano

Hello,

Sorry for the late answer. I have the same problem and I think I found the solution. This works for me:

1. Add the robots meta tag with noindex,follow to the internal search pages and wait for Google to remove them from the index.

Be careful if you do **BOTH** (adding the robots meta tag and a Disallow in robots.txt), because of this:

Please note that if you do both: block the search engines in robots.txt and via the meta tags, then the robots.txt command is the primary driver, as they may not crawl the page to see the meta tags, so the URL may still appear in the search results listed URL-only. Source: http://tools.seobook.com/robots-txt/
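To make the conflict described above concrete, here is a minimal Python sketch that checks whether a crawler could even see the noindex tag. It assumes a hypothetical site at www.example.com with search pages under /search; the function and class names are made up for illustration, and the standard-library robots.txt parser only approximates how Googlebot interprets rules.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def noindex_is_visible(robots_txt, url, html, user_agent="Googlebot"):
    """Return True only if the crawler may fetch the URL AND the page says noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # The page is never fetched, so the meta tag is never read.
        return False
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical example data (not taken from the original question).
SEARCH_URL = "https://www.example.com/search?q=iphone"
PAGE_HTML = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
BLOCKING_ROBOTS = "User-agent: *\nDisallow: /search\n"
OPEN_ROBOTS = "User-agent: *\nDisallow: /private/\n"
```

With `BLOCKING_ROBOTS` the function returns `False` because the Disallow rule stops the fetch before the tag can be read; with `OPEN_ROBOTS` it returns `True`. That is the reason to apply the meta tag first and hold off on any robots.txt rule for the same URLs.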

I hope this information can help you.

MagicDude4Eva

I would honestly exclude all your internal search pages from the Google index via a robots.txt (noindex) exclusion. This will at least redistribute crawl time to other areas of your site.

Just having noindex,follow in the meta tag (without the robots.txt exclusion) will let Googlebot crawl the page and then eventually remove it from the index.

I would also change your search page canonical to the search term (i.e. /search/iphone) and then have noindex,follow in the meta tag.
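As a rough illustration of that suggestion, the sketch below builds the two head tags for a search page. The base URL, the function name, and the choice to lowercase and trim the search term are all assumptions made for the example; only the /search/<term> pattern and the noindex,follow value come from the advice above.

```python
from html import escape
from urllib.parse import quote


def search_page_head_tags(term, base_url="https://www.example.com"):
    """Build the canonical and robots tags for one internal search page.

    Assumption: the canonical form of a search is the trimmed, lowercased
    term under /search/, so query-string variants collapse to one URL.
    """
    slug = quote(term.strip().lower(), safe="")
    canonical = f"{base_url}/search/{slug}"
    return [
        f'<link rel="canonical" href="{escape(canonical)}">',
        '<meta name="robots" content="noindex,follow">',
    ]


if __name__ == "__main__":
    for tag in search_page_head_tags(" iPhone "):
        print(tag)
```

The tags would go in the `<head>` of every search results template. Remember that the pages must stay crawlable for the noindex to take effect.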

AdamThompson

It sounds like the meta noindex,follow tag is what you want.

robots.txt will block Googlebot from crawling your search pages, but Google can still keep the search pages in its index.
