Multiple robots.txt files on server
- Hi! I previously hired a developer to build my site and noticed afterwards that he did not know much about SEO. This led me to start learning it myself and applying changes step by step. One of the things I am currently doing is adding a sitemap reference to the robots.txt file (it was not there before). But just now, when I went to upload the file via FTP, I found multiple robots.txt files of different sizes on the server, and I don't know what to do with them. Can I remove them? I have downloaded and opened them, and they seem to be two distinct text files plus two duplicates. Names:
  - robots.txt (duplicate of the original)
  - robots.txt-Original (the original)
  - robots.txt-NEW (different content)
  - robots.txt-Working (duplicate of the different content)
  Would really appreciate help and expert suggestions. Thanks!
- So what's the best policy if a site uses an e-commerce platform like Magento, which has its own robots file, but also has a WordPress blog installed in another folder (e.g. /blog) with a plugin like Yoast that generates a robots file for the WordPress installation? Then you have two robots files. Is this detrimental, or no big deal?
- Thanks very much for the help!
- Keep a backup and remove them. Search engines are only going to look at the file that is called exactly robots.txt; variations of the file name will be ignored. Do make sure the entries in the main one are correct, though: you don't want Google crawling admin pages or other confidential areas of the site.
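For reference, a quick way to sanity-check the one file you keep is Python's built-in urllib.robotparser. This is only a sketch: the /admin/ rule, the example.com URLs, and the sitemap location below are hypothetical stand-ins, not the actual paths from this site.

```python
# Sketch: verify a candidate robots.txt before uploading it.
# All rules and URLs here are hypothetical examples.
import urllib.robotparser

candidate = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(candidate.splitlines())

# A confidential area should be blocked for every crawler...
assert not parser.can_fetch("Googlebot", "https://www.example.com/admin/login")
# ...while ordinary pages remain crawlable.
assert parser.can_fetch("Googlebot", "https://www.example.com/products/widget")

# Python 3.8+: confirm the Sitemap line is being picked up.
print(parser.site_maps())  # ['https://www.example.com/sitemap.xml']
```

If both assertions pass, the file hides the private area, leaves the rest of the site open, and carries the sitemap reference the original poster wanted to add.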
- Hi, thanks for the answer and help! Well, I only have one domain with a webpage and no active subdomains (no blog subdomain or similar), so how does that apply to my situation? Can I just remove them all and upload the one I want, maybe?
- That's a good question, EMS. The robots.txt protocol can get kind of confusing when you think about it too long, and it sounds like you've thought about this a bit. However, in this case, it might help to look at robots.txt from the perspective of the spider.

  When a spider finds a URL, it takes the whole domain name (everything between 'http://' and the next '/'), then sticks '/robots.txt' on the end of it and looks for that file. If that file exists, the spider should read it to see where it is allowed to crawl.

  In your case, Googlebot, or any other spider, should try to access three URLs: domainA.com/robots.txt, domainB.domainA.com/robots.txt, and domainB.com/robots.txt. The rules in each are treated as separate, so disallowing robots from domainA.com/ should result in domainA.com/ being removed from search results while domainB.domainA.com/ remains unaffected, which does not sound like something you want.
  The problem you might have with the setup you have described is this: in order to keep domainB.domainA.com out of the results, you would need domainB.domainA.com/robots.txt to exclude robots while domainB.com/robots.txt welcomes them. This means you would need a way to make domainB.domainA.com/ and domainB.com/ serve different content, and judging from what you've described, you have not set up your server to do so yet.

  Of course, it is always possible that I have assumed too much about your situation, so it is a good idea to use Google's robots.txt analysis tool (see http://www.google.com/support/webmasters/bin/topic.py?topic=8475) to see if your robots.txt files already produce the results you want.
  If using robots.txt files doesn't solve the problem, and assuming you want to continue hosting all of your content on domainA.com, one strategy you really should look into is setting up a 301 redirect from the pages on domainB.domainA.com/ to domainB.com/. If you need more advice on how to do this with your server software, your hosting company's tech support would definitely be the best place to start, but this group is here to help if more issues arise. Hope that helps!
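As a follow-up to the 301 suggestion: once the redirect is in place, a short sketch like the one below can confirm the subdomain answers with a permanent (301) redirect rather than a temporary one. The hostnames are still the placeholder names from this thread, so substitute your own before running it.

```python
# Sketch: check that the subdomain issues a permanent redirect to
# the standalone domain. Hostnames are placeholders from the thread.
import http.client

conn = http.client.HTTPConnection("domainB.domainA.com")
conn.request("GET", "/")
response = conn.getresponse()

print(response.status)                 # expect 301
print(response.getheader("Location"))  # expect http://domainB.com/
conn.close()
```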
Related Questions
- Robots.txt Syntax for Dynamic URLs
  I want to Disallow certain dynamic pages in robots.txt and am unsure of the proper syntax. The pages I want to disallow all include the string "?Page=". Which is the proper syntax?
  Disallow: ?Page=
  Disallow: ?Page=*
  Disallow: ?Page=
  Or something else? (Technical SEO | btreloar)
- Good robots.txt for Magento
  Dear Community, I am trying to improve the SEO ratings for my website www.rijwielcashencarry.nl (Magento). My next step will be implementing a robots.txt to exclude some pages from crawling. Does anybody have a good Magento robots.txt for me? And what exactly do I need to copy? Thanks everybody! Greetings, Bob (Technical SEO | rijwielcashencarry040)
- Is it important to include image files in your sitemap?
  I run an ecommerce business that has over 4000 product pages which, as you can imagine, branch off into thousands of image files. Is it necessary to include those in my sitemap for faster indexing? Thanks for your help! -Reed (Technical SEO | IceIcebaby)
- Google insists robots.txt is blocking... but it isn't.
  I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple of directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt. Google (via Webmaster Tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster Tools and in the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch? (Technical SEO | ahockley)
- How to create unique content for businesses with multiple locations?
  I have a client that owns one franchise location of a franchise company with multiple locations. They have one large site where each location has its own page, which I feel is the best route. The problem is that each location page has basically duplicate content, resulting in around 80 pages of duplicate content. I'm looking for advice on how to create unique content for each location page. What types of information can we write about to make each page unique? You can only twist sentences and content around so much before it all sounds cookie-cutter and therefore offers little value. (Technical SEO | RonMedlin)
- No indexing URLs including query strings with robots.txt
  Dear all, how can I block URLs/pages with query strings like page.html?dir=asc&order=name with robots.txt? Thanks! (Technical SEO | HMK-NL)
- Hosting sitemap on another server
  I was looking into XML sitemap generators, and one that seems to be recommended quite a bit on the forums is xml-sitemaps.com. They have a few versions though. I'll need more than 500 pages indexed, so it is just a case of whether I go for their paid version and install it on our server, or go for their pro-sitemaps.com offering. For pro-sitemaps.com they say: "We host your sitemap files on our server and ping search engines automatically." My question is: will this be less effective from an SEO perspective than installing it on our own server, because it is no longer on our root domain? (Technical SEO | design_man)
- Does posting an article on multiple sites hurt SEO?
  A client of mine creates thought leadership articles and pitches multiple sites to host the article on their site to reach different audiences. The sites that pick it up are places such as AdAge and MarketingProfs, and we do get link juice from these sources most of the time. Does having the same article on these sites as well as your own hurt your SEO efforts in any way? Could it be recognized as duplicate content? I know the links are great; just wondering if there are any other side effects, especially when there are no links provided! Thank you! (Technical SEO | Scratch_MM)