Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Moz Q&A is closed.

After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

Using 2 wildcards in the robots.txt file

seo123456 last edited by

I have a URL string that I don't want indexed. It includes the characters _Q1 in the middle of the string.

So in the robots.txt file, can I use two wildcards in the pattern to exclude every URL that contains it? Something like /*_Q1*. Will that pick up and block every URL with those characters in the string?

Also, this is not directly off the root but in a secondary directory, so .com/.../_Q1. Do I have to write the rule as /*/*_Q1* because it sits in the second folder, or will /*_Q1* alone pick up everything no matter which folder it is in?

Thanks.
lonniea last edited by

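I'm not 100% positive, but using a wildcard this way does make sense.

User-agent: *
# Matches any URL whose path contains _Q1, in any folder.
# A trailing $ would restrict the match to URLs that end in _Q1.
Disallow: /*_Q1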
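
To see how the two special characters behave, here is a minimal Python sketch of the wildcard matching that Google documents for robots.txt: * stands for any sequence of characters, $ anchors the end of the URL, and every rule is matched from the start of the path. The helper names and the sample paths are made up for illustration, and other crawlers may interpret wildcards differently.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper)."""
    # A trailing "$" means the pattern must match up to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for every "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """Return True if a Disallow rule with this pattern would cover the path."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths: _Q1 in the middle, in a subfolder and at the root.
    paths = ["/shop/item_Q1_red.html", "/item_Q1_red.html", "/shop/item_Q1"]
    for rule in ["/_Q1", "/*_Q1", "/*_Q1$", "/*/*_Q1*"]:
        for path in paths:
            print(f"{rule!r:12} {path!r:28} blocked={is_blocked(rule, path)}")
```

Under these assumptions, /*_Q1 covers _Q1 at any folder depth, so the extra /*/ prefix is only needed if root-level URLs should stay crawlable. A trailing * adds nothing, because rules already match as prefixes.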