Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What does Disallow: /french-wines/?* actually do - robots.txt
- 
					
					
					
					
 Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?* Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark? Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL? I think this has been done to block URLs containing query strings. Thanks, Luke 
- 
					
					
					
					
 Glad to help, Luke! 
- 
					
					
					
					
 Thanks Logan for your help with this - much appreciated. Really helpful! 
- 
					
					
					
					
 Disallow: /?* is the same thing as Disallow:/?, since the asterisk is a wildcard, both of those disallows prevent any URL that begins with /? from being crawled. And yes, it is incredibly easy to disallow the wrong thing! The robots.txt tester in Search Console (under the Crawl menu) is very helpful for figuring out what a disallow will catch and what it will let by. I highly recommend testing any new disallows there before releasing them into the wild. 
- 
					
					
					
					
 Thanks again Logan. What would Disallow: /?* do because that is what the site I am looking at has implemented. Perhaps it works both ways around? I imagine it's easy to disallow the wrong thing or possibly not disallow the right thing. Ugh. 
- 
					
					
					
					
 Disallow: /*? This disallow literally says to crawlers 'if a URL starts with a slash (all URLs) and has a parameter, don't crawl it'. The * is a wildcard that says anything between / and ? is applicable to the disallow. It's very easy to disallow the wrong this especially in regards to parameters, for this reason I always do these 2 things rather than using robots.txt: - Set the purpose of each parameter in Search Console - Go to Crawl > URL Parameters to configure for your site
- Self-referring canonicals - most people disallow URLs with parameters in robots.txt to prevent indexing, but this only prevents crawling. A self-referring canonical pointing to the root level of that URL will prevent indexing or URLs with parameters.
 Hope that's helpful! 
- 
					
					
					
					
 Thanks Logan - I was just reading: Disallow: /*? # block any URL that includes a ? (and thus a query string) - do you know why the ? comes before the * in this case? 
- 
					
					
					
					
 Hi Luke, You are correct that this was done to block URLs with parameters. However, since there's no wildcard (the asterisk) before the folder name, the URL would have to start with /french-wines/. This disallow is really only preventing crawling on the single URL www.yoursite.com/french-wines/ with any parameters appended. 
Browse Questions
Explore more categories
- 
		
		Moz ToolsChat with the community about the Moz tools. 
- 
		
		SEO TacticsDiscuss the SEO process with fellow marketers 
- 
		
		CommunityDiscuss industry events, jobs, and news! 
- 
		
		Digital MarketingChat about tactics outside of SEO 
- 
		
		Research & TrendsDive into research and trends in the search industry. 
- 
		
		SupportConnect on product support and feature requests. 
Related Questions
- 
		
		
		
		
		
		Looking for opinions on structuring meta title tags/page title/menu title/H1
 Hi everyone I am hoping a few of you can share your opinions. I have been having conversations (okay, healthy debates) about how to write/structure meta title tag and how to compliment them with the H1, page title, menu name. To help explain the thought processes I will use a pretend keyword. How about "screwdriver". Case: (I made this up) we are redesigning a website for a construction tools manufacturing company (pretend name: ABC Tools) targeting OEMs who are interested in purchasing large quantities of tools. The product categories (to become main menu items) are Screwdrivers, Nails, Drills, and Hammers. (bear with me .... this is just an example I am making up on the fly) K. Circling back to screwdrivers - let's say we have one landing page (a primary category page and in the main menu) listing products and great details about screwdrivers. Focus keywords are screwdriver manufacturer, screwdriver supplier, construction screwdrivers Below are questions being debated. If you are willing ... how would you address these questions? And, can you explain WHY? QUESTION ONE: How would you structure the meta title tag (feel free to write one of your own) Screwdriver Manufacturer - Construction Screwdriver | ABC Tools ABC Tools - US-based Screwdriver Manufacturer Supplier Near You High-Quality Screwdrivers for Construction with ABC Tools QUESTION TWO: how would you write the H1 on the page? Would it match the meta tag? OR, would you write something different using the primary keyword? QUESTION THREE Remembering this is not a blog post ... it is a primary landing page linked to the main navigation. What would the menu title be? (remember the product categories above are how the main menu items are bucketed) Screwdrivers Screwdriver Manufacturer Typically in WordPress, the H1 and the menu title is auto-populated using the page title (not the title tag)... So, if we use Screwdrivers as the page title but we want the H1 to match the meta title tag, would we manually change the H1? Or, have the page title and title tag match, but manually change the menu item? Intermediate & Advanced SEO | | Brenda.Haines1
- 
		
		
		
		
		
		If my website do not have a robot.txt file, does it hurt my website ranking?
 After a site audit, I find out that my website don't have a robot.txt. Does it hurt my website rankings? One more thing, when I type mywebsite.com/robot.txt, it automatically redirect to the homepage. Please help! Intermediate & Advanced SEO | | binhlai0
- 
		
		
		
		
		
		Should I use noindex or robots to remove pages from the Google index?
 I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed. From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else. I can add the noindex tag to the review pages but they wont be crawled. https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have? Intermediate & Advanced SEO | | Tylerj0
- 
		
		
		
		
		
		What can we do to optimize / be mobile-friendly for PDFs?
 I'm getting a "Your page is not mobile-friendly." notice in the SERPs for all of our PDFs. I check the pdf on the phone and it appears just fine. rFtLq Intermediate & Advanced SEO | | johnnybgunn0
- 
		
		
		
		
		
		Should brand/company be included in meta title?
 Is there any point/benefit/requirement in using brand/company name in the meta title, I realise search engines like Google prefer brand focused pages, However it is unlikely that someone would be including the company in our search terms. Any thoughts? Intermediate & Advanced SEO | | seoman100
- 
		
		
		
		
		
		Robots.txt, does it need preceding directory structure?
 Do you need the entire preceding path in robots.txt for it to match? e.g: I know if i add Disallow: /fish to robots.txt it will block /fish Intermediate & Advanced SEO | | Milian
 /fish.html
 /fish/salmon.html
 /fishheads
 /fishheads/yummy.html
 /fish.php?id=anything But would it block?: en/fish
 en/fish.html
 en/fish/salmon.html
 en/fishheads
 en/fishheads/yummy.html
 **en/fish.php?id=anything (taken from Robots.txt Specifications)** I'm hoping it actually wont match, that way writing this particular robots.txt will be much easier! As basically I'm wanting to block many URL that have BTS- in such as: http://www.example.com/BTS-something
 http://www.example.com/BTS-somethingelse
 http://www.example.com/BTS-thingybob But have other pages that I do not want blocked, in subfolders that also have BTS- in, such as: http://www.example.com/somesubfolder/BTS-thingy
 http://www.example.com/anothersubfolder/BTS-otherthingy Thanks for listening0
- 
		
		
		
		
		
		Soft 404's from pages blocked by robots.txt -- cause for concern?
 We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this? Intermediate & Advanced SEO | | nicole.healthline4
- 
		
		
		
		
		
		Could you use a robots.txt file to disalow a duplicate content page from being crawled?
 A website has duplicate content pages to make it easier for users to find the information from a couple spots in the site navigation. Site owner would like to keep it this way without hurting SEO. I've thought of using the robots.txt file to disallow search engines from crawling one of the pages. Would you think this is a workable/acceptable solution? Intermediate & Advanced SEO | | gregelwell0
 
			
		 
			
		 
					
				 
					
				 
					
				 
					
				 
					
				 
					
				 
					
				