Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Blocking URL's with specific parameters from Googlebot
- 
					
					
					
					
 Hi, I've discovered that Googlebot's are voting on products listed on our website and as a result are creating negative ratings by placing votes from 1 to 5 for every product. The voting function is handled using Javascript, as shown below, and the script prevents multiple votes so most products end up with a vote of 1, which translates to "poor". How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages. DON'T want to block: http://www.mysite.com/product.php?productid=1234 WANT to block: http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2 Javacript button code: onclick="javascript: document.voteform.submit();" Thanks in advance for any advice given. Regards, 
 Asim
- 
					
					
					
					
 Good to hear, I am glad you perservered 
- 
					
					
					
					
 Tried them all now and all come back with "Success"... May be I'll post in the WMT Forum and see if anyone can shed light on this problem. Thanks for your help Alan, it's much appreciated. 
- 
					
					
					
					
 Yes correct, did you try the other formats? 
- 
					
					
					
					
 Tried "Fetch as Googlebot" in Diagnostics and it came back as "Success" so I guess the robots.txt directive is not working. I'm assuming it should have reported a failure message when attempting to fetch a URL containing "?mode=vote". 
- 
					
					
					
					
 Wrong place, go to diagnostics, then look for fetch as googlebot 
- 
					
					
					
					
 I added "Disallow: /mode=vote" to the robots.txt file and also manually entered it on Crawler Access page, then clicked "Test" and no errors were reported. The WMT page states that robots.txt was last downloaded 16 hours ago so I'll wait until it picks the file up again and then check for any errors. Hopefully that will do trick  
- 
					
					
					
					
 Try this in robots.txt, I did not think that Google allows wild cards but i just read that they do. Disallow: /*mode=vote*orDisallow: /*mode=voteorDisallow: /*modeThen try in Google WMT to read with googlebot to see if it works.The first in the list seems right to me, but I have seen others do it the other ways.
- 
					
					
					
					
 Thanks for the reply. The site was developed using PHP, mySQL and Javascript. I was hoping there was a way to do it without getting programmers involved... 
- 
					
					
					
					
 dont think you are going to do it in robots.txt, rather do a 301 from mode=vote to non mode vote. If you dont know how to put this into practise, tell me what your site is built with, if it is ASP.NET, i will show you how to impliment, if not someone else should be able to help. 
Browse Questions
Explore more categories
- 
		
		Moz ToolsChat with the community about the Moz tools. 
- 
		
		SEO TacticsDiscuss the SEO process with fellow marketers 
- 
		
		CommunityDiscuss industry events, jobs, and news! 
- 
		
		Digital MarketingChat about tactics outside of SEO 
- 
		
		Research & TrendsDive into research and trends in the search industry. 
- 
		
		SupportConnect on product support and feature requests. 
Related Questions
- 
		
		
		
		
		
		Google has deindexed a page it thinks is set to 'noindex', but is in fact still set to 'index'
 A page on our WordPress powered website has had an error message thrown up in GSC to say it is included in the sitemap but set to 'noindex'. The page has also been removed from Google's search results. Page is https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/ Looking at the page code, plus using Screaming Frog and Ahrefs crawlers, the page is very clearly still set to 'index'. The SEO plugin we use has not been changed to 'noindex' the page. I have asked for it to be reindexed via GSC but I'm concerned why Google thinks this page was asked to be noindexed. Can anyone help with this one? Has anyone seen this before, been hit with this recently, got any advice...? Technical SEO | | d.bird0
- 
		
		
		
		
		
		Robot.txt : How to block a specific file type in several subdirectories ?
 Hello everyone ! I need help setting up a robot.txt. I'm trying to block all pdf files in particular directories so I'm using this command. In the example below the line is blocking all .gif in the entire site. Block files of a specific file type (for example, .gif) | Disallow: /*.gif$ 2 questions : Can I use this command to specify one particular directory in which I want to block pdf files ? Will this line be recognized by googlebots ? Disallow: /fileadmin/xxxxxxx/xxx/xxxxxxx/*.pdf$ Then I realized that I would have to write as many lines as many directories there are in which I want to block pdf files. Let's say I want to block pdf files in all these 3 directories /fileadmin/directory1 /fileadmin/directory1/sub1 /fileadmin/directory1/sub1/pdf Is there a pattern-matching rule I could use to blocks access to pdf files in all subdirectories instead of writing 3x the above line for each subdirectory ? For exemple : Disallow: /fileadmin/directory1*/ Many thanks in advance for any insight you may have. Technical SEO | | LabeliumUSA0
- 
		
		
		
		
		
		SERP result (URL) doesn't change after a 301
 A couple of months ago there was a result in Google for our branded search term which wasn't the 'official' URL, actually the result shown in the SERP was www.mycompany-ip.nl. We've applied a 301 redirect of this URL to the 'official' URL which is a subdomain: department.mycompany.nl. From Google the redirect is obviously working, but up until now, I don't see Google replacing the incorrect URL by the correct URL. I am wondering what to do to make the result correct. André Technical SEO | | ConclusionDigital0
- 
		
		
		
		
		
		Are Collapsible DIV's SEO-Friendly?
 When I have a long article about a single topic with sub-topics I can make it user friendlier when I limit the text and hide text just showing the next headlines, by using expandable-collapsible div's. My doubt is if Google is really able to read onclick textlinks (with javaScript) or if it could be "seen" as hidden text? I think I read in the SEOmoz Users Guide, that all javaScript "manipulated" contend will not be crawled. So from SEOmoz's Point of View I should better make use of old school named anchors and a side-navigation to jump to the sub-topics? (I had a similar question in my post before, but I did not use the perfect terms to describe what I really wanted. Also my text is not too long (<1000 Words) that I should use pagination with rel="next" and rel="prev" attributes.) THANKS for every answer 🙂 Technical SEO | | inlinear0
- 
		
		
		
		
		
		Temporarily suspend Googlebot without blocking users
 We'll soon be launching a redesign, on a new platform, migrating millions of pages to new URLs. How can I tell Google (and other crawlers) to temporarily (a day or two) ignore my site? We're hoping to buy ourselves a small bit of time to verify redirects and live functionality before allowing Google to crawl and index the new architecture. GWT's recommendation is to 503 all pages - including robots.txt, but that also makes the site invisible to real site visitors, resulting in significant business loss. Bad answer. I've heard some recommendations to disallow all user agents in robots.txt. Any answer that puts the millions of pages we already have indexed at risk is also a bad answer. Thanks Technical SEO | | lzhao0
- 
		
		
		
		
		
		Google's "cache:" operator is returning a 404 error.
 I'm doing the "cache:" operator on one of my sites and Google is returning a 404 error. I've swapped out the domain with another and it works fine. Has anyone seen this before? I'm wondering if G is crawling the site now? Thx! Technical SEO | | AZWebWorks0
- 
		
		
		
		
		
		Blank pages in Google's webcache
 Hello all, Is anybody experiencing blanck page's in Google's 'Cached' view? I'm seeing just the page background and none of the content for a couple of my pages but when I click 'View Text Only' all of teh content is there. Strange! I'd love to hear if anyone else is experiencing the same. Perhaps this is something to do with the roll out of Google's updates last week?! Thanks, Technical SEO | | A_Q
 Elias0
- 
		
		
		
		
		
		Does Google pass link juice a page receives if the URL parameter specifies content and has the Crawl setting in Webmaster Tools set to NO?
 The page in question receives a lot of quality traffic but is only relevant to a small percent of my users. I want to keep the link juice received from this page but I do not want it to appear in the SERPs. Technical SEO | | surveygizmo0
 
			
		 
			
		 
					
				 
					
				 
					
				 
					
				 
					
				 
					
				