How to prevent Google from crawling our product filter?

footsteps

Hi All,

We have a crawler problem on one of our sites, www.sneakerskoopjeonline.nl.

On this site, visitors can specify criteria to filter the available products. These filters are passed as HTTP GET parameters, so the number of possible filter URLs is virtually limitless.

To prevent duplicate content and an insane number of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we have not found a way to tell crawlers (Google in particular) to ignore these URLs.
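For illustration, here is a minimal sketch of that kind of tagging logic. It assumes the directives are sent as an `X-Robots-Tag` response header (they could equally be a robots meta tag), and the `FILTER_PARAMS` set and `robots_header` function are hypothetical names, not our actual software:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical list of filter parameter names (assumption for this sketch).
FILTER_PARAMS = {"product_brand", "product_date", "product_date2"}

def robots_header(url):
    """Return an X-Robots-Tag value for filter result pages, else None."""
    # keep_blank_values=True so that "?product_brand=" still counts as a filter.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if any(name in query for name in FILTER_PARAMS):
        return "noindex, nofollow, noarchive"
    return None

print(robots_header("http://www.sneakerskoopjeonline.nl/herensneakers?product_brand="))
print(robots_header("http://www.sneakerskoopjeonline.nl/herensneakers"))
```

The first call returns the directive string because the URL carries a filter parameter; the second returns `None`, so the category page itself stays indexable.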

We’ve already changed the on-page filter from HTML to JavaScript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the JavaScript and crawls the generated URLs anyway.

What can we do to prevent Google from crawling all the filter options?

Thanks in advance for the help.

Kind regards,

Gerwin

footsteps

We have added the following to our robots.txt. Now let's wait and see the results:

User-agent: *
Disallow: /admin/
Disallow: /?
Allow: /?product_date=&product_date2=*
Disallow: /?product_date=&product_date2=&

To check that the robots.txt works, I found a handy website:

http://phpweby.com/services/robots

footsteps

The URL looks like this:

http://www.sneakerskoopjeonline.nl/herensneakers?product_brand=

So just adding:

User-agent: *
Disallow: /*?product_brand

should do the trick?
Most important is that herensneakers itself is still crawled, indexed and followed.
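
A rough way to check this locally is a small matcher that follows Google-style wildcard rules (`*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie). This is only a sketch under those assumptions, not Google's actual parser, and the `is_allowed` helper and the `color` parameter in the last call are made up for the example:

```python
import re
from urllib.parse import urlsplit

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and turn each '*' into '.*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, url):
    """rules: list of (directive, pattern), directive 'allow' or 'disallow'."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    best_len, allowed = -1, True  # no matching rule means allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(target):
            is_allow = directive == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), is_allow
    return allowed

RULES = [("disallow", "/*?product_brand")]
BASE = "http://www.sneakerskoopjeonline.nl/herensneakers"

print(is_allowed(RULES, BASE))                                # category page
print(is_allowed(RULES, BASE + "?product_brand="))            # filter URL
print(is_allowed(RULES, BASE + "?color=red&product_brand=x")) # filter not first
```

Under these matching rules the category page stays crawlable and the `?product_brand=` URL is blocked, but the third URL is still allowed: the pattern only matches when `product_brand` directly follows the `?`, so a filter that appears after another parameter would need an extra rule such as `Disallow: /*&product_brand`.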

alexhoug

I would use your robots.txt file to prevent Google from crawling those specific query strings and pages. In Google Webmaster Tools you can see all the information Google has on your site and any issues it has found, and you can also specify robots.txt information there. That would be the best route, as Google obeys what is in the robots.txt file. If you want more information about robots.txt, go here.
