Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Is there any value in having a blank robots.txt file?
-
I've read an audit where the writer recommended creating and uploading a blank robots.txt file, there was no current file in place. Is there any merit in having a blank robots.txt file?
What is the minimum you would include in a basic robots.txt file?
-
I know this is four years old, but there's value in having a blank robots.txt as some tools (including the latest version of the Moz crawler) will baulk at sites without a robots.txt file.
-
Thanks for both of your replies. As per my question it was around whether there is any value having a blank robots.txt file. Philipp's answer was right on the money.
-
i mentioned same only, The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site."
n has added - More and more people use robots,txt to disallow access to some administration or private folders of the site
-
No use in having a blank robots.txt. Minimum requirement if you want to have your site crawled is this:
User-agent: * Allow: /
Note that Gagans example above will block the entire site.
-
Hi, This is what i got
" Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called_The Robots Exclusion Protocol_. It works likes this: a robot wants to vists a Web site URL, say http://www.example.com/welcome.html. Before it does so, it firsts checks for http://www.example.com/robots.txt, and finds:
User-agent: * Disallow: /
The "<tt>User-agent: *</tt>" means this section applies to all robots. The "<tt>Disallow: /</tt>" tells the robot that it should not visit any pages on the site."
More and more people use robots,txt to disallow access to some administration or private folders of the site . If you dont want to hide anything then may be you can leave it blank
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I have two robots.txt pages for www and non-www version. Will that be a problem?
There are two robots.txt pages. One for www version and another for non-www version though I have moved to the non-www version.
Technical SEO | | ramb0 -
Crawl solutions for landing pages that don't contain a robots.txt file?
My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader1 -
Recommended log file analysis software for OS X?
Due to some questions over direct traffic and Googlebot behavior, I want to do some log file analysis. The catch is this is a Mac shop, so all our systems are on OS X. I have Windows 8 running in an emulator, but for the sake of simplicity I'd rather run all my software in OS X. This post by Tim Resnik recommended Web Log Explorer, but it's for Windows only. I did discover Sawmill, which claims to run on any platform. Any other suggestions? Bear in mind our site is load balanced over three servers, so please take that into consideration.
Technical SEO | | ufmedia0 -
I accidentally blocked Google with Robots.txt. What next?
Last week I uploaded my site and forgot to remove the robots.txt file with this text: User-agent: * Disallow: / I dropped from page 11 on my main keywords to past page 50. I caught it 2-3 days later and have now fixed it. I re-imported my site map with Webmaster Tools and I also did a Fetch as Google through Webmaster Tools. I tweeted out my URL to hopefully get Google to crawl it faster too. Webmaster Tools no longer says that the site is experiencing outages, but when I look at my blocked URLs it still says 249 are blocked. That's actually gone up since I made the fix. In the Google search results, it still no longer has my page title and the description still says "A description for this result is not available because of this site's robots.txt – learn more." How will this affect me long-term? When will I recover my rankings? Is there anything else I can do? Thanks for your input! www.decalsforthewall.com
Technical SEO | | Webmaster1230 -
Links from the same server has value or not
Hi Guys, Sometime ago one of the SEO experts said to me if I get links from the same IP address, Google doesn't count them as with much value. For an example, I am a web devleoper and I host all my clients websites on one server and link them back to me. Im wondering whether those links have any value when it comes to seo or should I consider getting different hosting providers? Regards Uds
Technical SEO | | Uds0 -
Duplicate content problem from an index.php file
Hi One of my sites is flagging a duplicate content problem which is affecting the search rankings. The duplicate problem is caused by http://www.mydomain.com/index.php which has a page rank of 26 How can I sort the duplicate content problem, as the main page should just be http://www.mydomain.com which has a page rank of 42 and is the stronger page with stronger links etc Many Thanks
Technical SEO | | ocelot0 -
Internal search : rel=canonical vs noindex vs robots.txt
Hi everyone, I have a website with a lot of internal search results pages indexed. I'm not asking if they should be indexed or not, I know they should not according to Google's guidelines. And they make a bunch of duplicated pages so I want to solve this problem. The thing is, if I noindex them, the site is gonna lose a non-negligible chunk of traffic : nearly 13% according to google analytics !!! I thought of blocking them in robots.txt. This solution would not keep them out of the index. But the pages appearing in GG SERPS would then look empty (no title, no description), thus their CTR would plummet and I would lose a bit of traffic too... The last idea I had was to use a rel=canonical tag pointing to the original search page (that is empty, without results), but it would probably have the same effect as noindexing them, wouldn't it ? (never tried so I'm not sure of this) Of course I did some research on the subject, but each of my finding recommanded one of the 3 methods only ! One even recommanded noindex+robots.txt block which is stupid because the noindex would then be useless... Is there somebody who can tell me which option is the best to keep this traffic ? Thanks a million
Technical SEO | | JohannCR0 -
Invisible robots.txt?
So here's a weird one... Client comes to me for some simple changes, turns out there are some major issues with the site, one of which is that none of the correct content pages are showing up in Google, just ancillary (outdated) ones. Looks like an issue because even the main homepage isn't showing up with a "site:domain.com" So, I add to Webmaster Tools and, after an hour or so, I get the red bar of doom, "robots.txt is blocking important pages." I check it out in Webmasters and, sure enough, it's a "User agent: * Disallow /" ACK! But wait... there's no robots.txt to be found on the server. I can go to domain.com/robots.txt and see it but nothing via FTP. I upload a new one and, thankfully, that is now showing but I've never seen that before. Question is: can a robots.txt file be stored in a way that can't be seen? Thanks!
Technical SEO | | joshcanhelp0