Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Dilemma about "images" folder in robots.txt
-
Hi, Hope you're doing well.
I am sure, you guys must be aware that Google has updated their webmaster technical guidelines saying that users should allow access to their css files and java-scripts file if it's possible. Used to be that Google would render the web pages only text based. Now it claims that it can read the css and java-scripts. According to their own terms, not allowing access to the css files can result in sub-optimal rankings. "Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings."http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.htmlWe have allowed access to our CSS files. and Google bot, is seeing our webapges more like a normal user would do. (tested it in GWT)Anyhow, this is my dilemma. I am sure lot of other users might be facing the same situation. Like any other e commerce companies/websites.. we have lot of images. Used to be that our css files were inside our images folder, so I have allowed access to that. Here's the robots.txt --> http://www.modbargains.com/robots.txtRight now we are blocking images folder, as it is very huge, very heavy, and some of the images are very high res. The reason we are blocking that is because we feel that Google bot might spend almost all of its time trying to crawl that "images" folder only, that it might not have enough time to crawl other important pages. Not to mention, a very heavy server load on Google's and ours. we do have good high quality original pictures. We feel that we are losing potential rankings since we are blocking images. I was thinking to allow ONLY google-image bot, access to it. But I still feel that google might spend lot of time doing that. **I was wondering if Google makes a decision saying, hey let me spend 10 minutes for google image bot, and let me spend 20 minutes for google-mobile bot etc.. or something like that.. , or does it have separate "time spending" allocations for all of it's bot types. I want to unblock the images folder, for now only the google image bot, but at the same time, I fear that it might drastically hamper indexing of our important pages, as I mentioned before, because of having tons & tons of images, and Google spending enough time already just to crawl that folder.**Any advice? recommendations? suggestions? technical guidance? Plan of action? Pretty sure I answered my own question, but I need a confirmation from an Expert, if I am right, saying that allow only Google image access to my images folder. Sincerely,Shaleen Shah
-
Yup my images send me traffic from Google images on most of my sites and attractive images attract hotlinks as well. At the moment people are hosting their images on a different domain (cdn) and are still being credited with the images but I haven't tried to do that myself ie I don't know if they've set some "ownership" somewhere and somehow.
-
I recommend allowing Google to crawl those images. Google optimizes its crawl rate and once it has done a complete crawl it will understand how often to crawl certain areas of your site. My main concern would be that you are losing potential rankings and indexing from those images - if they are unique and high quality you definitely want them to index the images, understand the file names, and appropriately index them.
I wouldn't be concerned about Google bot eating up your server resources. If it does become a problem, then you can go back and adjust the bot access through the robots.txt, as you've done already. However, I would let them in first and only react if it becomes a problem.
I have tens of thousands of product images accessed by the google bot and it is no concern to my ecommerce company and the server resources. I'm not saying that it can't be a potential problem, but the benefit outweighs the risk of it being one - I choose a reactive stance in this situation.
Closely monitor your Google Webmaster Tools account, watch the crawl rate and statistics, and if it becomes an issue then decide on which image folders should or shouldn't be indexed.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does anyone know how to fix this structured data error on search console? Invalid value in field "itemtype"
I'm getting the same structured data error on search console form most of my websites, Invalid value in field "itemtype" I take off all the structured data but still having this problem, according to Search console is a syntax problem but I can't find what is causing this. Any guess, suggestion or solution for this?
Intermediate & Advanced SEO | | Alexanders0 -
Using hreflang="en" instead of hreflang="en-gb"
Hello, I have a question in regard to international SEO and the hreflang meta tag. We are currently a B2B business in the UK. Our major market is England with some exceptions of sales internationally. We are wanting to increase our ranking into other english speaking countries and regions such as Ireland and the Channel Islands. My research has found regional google search engines for Ireland (google.ie), Jersey (google.je) and Guernsey (google.gg). Now, all the regions have English as one their main language and here is my questions. Because I use hreflang=“en-gb” as my site language, am I regional excluding these countries and islands? If I used hreflang=“en” would it include these english speaking regions and possible increase the ranking on these the regional search engines? Thank you,
Intermediate & Advanced SEO | | SilverStar11 -
What do you add to your robots.txt on your ecommerce sites?
We're looking at expanding our robots.txt, we currently don't have the ability to noindex/nofollow. We're thinking about adding the following: Checkout Basket Then possibly: Price Theme Sortby other misc filters. What do you include?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Wildcarding Robots.txt for Particular Word in URL
Hey All, So I know that this isn't a standard robots.txt, I'm aware of how to block or wildcard certain folders but I'm wondering whether it's possible to block all URL's with a certain word in it? We have a client that was hacked a year ago and now they want us to help remove some of the pages that were being autogenerated with the word "viagra" in it. I saw this article and tried implementing it https://builtvisible.com/wildcards-in-robots-txt/ and it seems that I've been able to remove some of the URL's (although I can't confirm yet until I do a full pull of the SERPs on the domain). However, when I test certain URL's inside of WMT it still says that they are allowed which makes me think that it's not working fully or working at all. In this case these are the lines I've added to the robots.txt Disallow: /*&viagra Disallow: /*&Viagra I know I have the solution of individually requesting URL's to be removed from the index but I want to see if anybody has every had success with wildcarding URL's with a certain word in their robots.txt? The individual URL route could be very tedious. Thanks! Jon
Intermediate & Advanced SEO | | EvansHunt0 -
When i search for my domain name - google asks "did you mean" - why?
Hi all, I just noticed something quite odd - if i do a search for my domain name (see: http://goo.gl/LBc1lz) google shows my domain as first result, but it also asks "did i mean" and names another website with very similar name. the other site has far lower PA/DA according to Moz, any ideas why google is doing this? and more inportantly how i could stop it? please advise James
Intermediate & Advanced SEO | | isntworkdull0 -
Should I NOFOLLOW my "Add To Cart" buttons?
Hello and Merry Christmass Should I NOFOLLOW my "Add To Cart" buttons? My e-commerce site has hundreds of products. Content wise, there is no real value to the reader on that page (besides for some testimonials and "why here" sentences). So it is not a page you'd want / expect to find in the SERPs. Also, with hundreds of links pointing to this page it would be "stronger" than other important pages which doesn't make sense. Last but not least, if I have limited time that the bots are on my site, why keep sending them to a non important page. This is why I am leaning to nofollowing the "add to cart" buttons and looking for reinforcements. Thanks
Intermediate & Advanced SEO | | BeytzNet0 -
Do I need to use rel="canonical" on pages with no external links?
I know having rel="canonical" for each page on my website is not a bad practice... but how necessary is it for pages that don't have any external links pointing to them? I have my own opinions on this, to be fair - but I'd love to get a consensus before I start trying to customize which URLs have/don't have it included. Thank you.
Intermediate & Advanced SEO | | Netrepid0 -
"Jump to" Links in Google, how do you get them?
I have just seen yoast.com results in Google and noticed that nearly all the indexed pages show a "Jump to" link So instead of showing the full URL under the title tag, it shows these type of links yoast.com › SEO
Intermediate & Advanced SEO | | JohnPeters
yoast.com › Social Media
yoast.com › Analytics With the SEO, Social Media and Analytics all being clickable. How has he achieved this? And is it something to try and incorporate in my sites?0