Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Do I need a separate robots.txt file for my shop subdomain?
-
Hello Mozzers!
Apologies if this question has been asked before, but I couldn't find an answer so here goes...
Currently I have one robots.txt file hosted at https://www.mysitename.org.uk/robots.txt
We host our shop on a separate subdomain https://shop.mysitename.org.uk
Do I need a separate robots.txt file for my subdomain? (Some Google searches are telling me yes and some no and I've become awfully confused!
-
Thank you. I want to disallow specific URLs on the subdomain and add the shop sitemap in the robots.txt file. So I'll go ahead and create another!
-
You go be fine without one. You only need one if you want to manage that subdmain: add specific xml sitemaps links in robots.txt, cut access to specific folders for that subdomain.
if you don't need any of that - just move forward without one.
-
Currently we just have: User-agent: *
I'm in the process of optimising.
-
It depends what currently is in your robots.txt. Usually it would be useful to have another one for your subdomain.
-
Yes, I would have a seperate robots.txt files.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Subdomain 403 error
Hi Everyone, A crawler from our SEO tool detects a 403 error from a link from our main domain to a a couple of subdomains. However, these subdomains are perfect accessibly. What could be the problem? Is this error caused by the server, the crawlbot or something else? I would love to hear your thoughts.
Technical SEO | | WeAreDigital_BE
Jens0 -
Blocking Affiliate Links via robots.txt
Hi, I work with a client who has a large affiliate network pointing to their domain which is a large part of their inbound marketing strategy. All of these links point to a subdomain of affiliates.example.com, which then redirects the links through a 301 redirect to the relevant target page for the link. These links have been showing up in Webmaster Tools as top linking domains and also in the latest downloaded links reports. To follow guidelines and ensure that these links aren't counted by Google for either positive or negative impact on the site, we have added a block on the robots.txt of the affiliates.example.com subdomain, blocking search engines from crawling the full subddomain. The robots.txt file is the following code: User-agent: * Disallow: / We have authenticated the subdomain with Google Webmaster Tools and made certain that Google can reach and read the robots.txt file. We know they are being blocked from reading the affiliates subdomain. However, we added this affiliates subdomain block a few weeks ago to the robots.txt, but links are still showing up in the latest downloads report as first being discovered after we added the block. It's been a few weeks already, and we want to make sure that the block was implemented properly and that these links aren't being used to negatively impact the site. Any suggestions or clarification would be helpful - if the subdomain is being blocked for the search engines, why are the search engines following the links and reporting them in the www.example.com subdomain GWMT account as latest links. And if the block is implemented properly, will the total number of links pointing to our site as reported in the links to your site section be reduced, or does this not have an impact on that figure?From a development standpoint, it's a much easier fix for us to adjust the robots.txt file than to change the affiliate linking connection from a 301 to a 302, which is why we decided to go with this option.Any help you can offer will be greatly appreciated.Thanks,Mark
Technical SEO | | Mark_Ginsberg0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Best Way To Clean Up Unruly SubDomain?
Hi, I have several subdomains that present no real SEO value, but are being indexed. They don't earn any backlinks either. What's the best way of cleaning them up? I was thinking the following: 1. Verify them all in Webmaster Tools. 2. Remove all URLs from the index via the Removal Tool in WMT 3. Add site-wide no-index, follow directive. Also, to remove the URLs in WMT, you usually have to block the URLs via /robots.txt. If I'd like to keep Google crawling through the subdomains and remove their URLs, is there a way to do so?
Technical SEO | | RocketZando0 -
Googlebot does not obey robots.txt disallow
Hi Mozzers! We are trying to get Googlebot to steer away from our internal search results pages by adding a parameter "nocrawl=1" to facet/filter links and then robots.txt disallow all URLs containing that parameter. We implemented this late august and since that, the GWMT message "Googlebot found an extremely high number of URLs on your site", stopped coming. But today we received yet another. The weird thing is that Google gives many of our nowadays robots.txt disallowed URLs as examples of URLs that may cause us problems. What could be the reason? Best regards, Martin
Technical SEO | | TalkInThePark0 -
Oh no googlebot can not access my robots.txt file
I just receive a n error message from google webmaster Wonder it was something to do with Yoast plugin. Could somebody help me with troubleshooting this? Here's original message Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. Recommended action If the site error rate is 100%: Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot. If your robots.txt is a static page, verify that your web service has proper permissions to access the file. If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure. If the site error rate is less than 100%: Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors. The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website. After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
Technical SEO | | BistosAmerica0 -
Keep the blog separate or incorporate into main domain?
So my organization currently has both a main site and on a separate domain and separate host a wordpress blog. (our own domain not a wordpress.com) the content posted on this blog is local, community driven, and related to our business but it is not used in anyway as a "sales" tool. It's more for interaction purposes with members and employees. This blog has a lot of content and is updated with new posts very often. (generate traffic from a pretty wide variety of searches some related some not) My plan has been to 301 the old domain and move the wordpress blog over to our root domain in a subdirectory such as oursite.com/blog. Does anyone have tips for moving a blog over like this? I'm concerned about any link juice it has dropping off and since it does provide some links to our root site currently (since it's basically a separate site). Basically i'm wondering if it'll be worth the effort or if i should just keep it separate and focus on other content gen strategies.
Technical SEO | | SCFederal0 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0