Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Why Can't Googlebot Fetch Its Own Map on Our Site?
-
I created a custom map using google maps creator and I embedded it on our site. However, when I ran the fetch and render through Search Console, it said it was blocked by our robots.txt file. I read in the Search Console Help section that: 'For resources blocked by robots.txt files that you don't own, reach out to the resource site owners and ask them to unblock those resources to Googlebot."
I did not setup our robtos.txt file. However, I can't imagine it would be setup to block google from crawling a map. i will look into that, but before I go messing with it (since I'm not familiar with it) does google automatically block their maps from their own googlebot? Has anyone encountered this before?
Here is what the robot.txt file says in Search Console:
User-agent: * Allow: /maps/api/js? Allow: /maps/api/js/DirectionsService.Route Allow: /maps/api/js/DistanceMatrixService.GetDistanceMatrix Allow: /maps/api/js/ElevationService.GetElevationForLine Allow: /maps/api/js/GeocodeService.Search Allow: /maps/api/js/KmlOverlayService.GetFeature Allow: /maps/api/js/KmlOverlayService.GetOverlays Allow: /maps/api/js/LayersService.GetFeature Disallow: / Any assistance would be greatly appreciated. Thanks, Ruben -
Awesome, Croy. That was extremely helpful.
Thanks,
Ruben
-
Hello Ruben!
Not too long ago, I saw a hangout with John Mueller who got the same question and gives a logical answer. Here's the URL at timestamp 46:06 https://www.youtube.com/watch?v=i_xnKznRNCc&feature=player_embedded#t=2766
I'd transcript if I had the time, but it will take you a couple of minutes. It essentially says that even though it is embedded, it won't access that block of content because maps.googleapis.com/robots.txt blocks it. If I understood wrong, please let me know!
I had the same problem where part of my content was coming from the pins INSIDE the map, but since it was blocked, it couldn't reach the content itself. Had to do a workaround
Hope it helps!
EDIT: Just jumping back in to say that you should look back at your site. Seems like Google can correctly crawl its own map and the content served through it. Our site has no more problems related to this. November the 15th I still had issues, but today I've tested multiple times and it looked okay.
Croy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Category URL Pagination where URLs don't change between pages
Hello, I am working on an e-commerce site where there are categories with multiple pages. In order to avoid pagination issues I was thinking of using rel=next and rel=prev and cannonical tags. I noticed a site where the URL doesn't change between pages, so whether you're on page 1,2, or 3 of the same category, the URL doesn't change. Would this be a cleaner way of dealing with pagination?
Technical SEO | | whiteonlySEO0 -
Staging site and "live" site have both been indexed by Google
While creating a site we forgot to password protect the staging site while it was being built. Now that the site has been moved to the new domain, it has come to my attention that both the staging site (site.staging.com) and the "live" site (site.com) are both being indexed. What is the best way to solve this problem? I was thinking about adding a 301 redirect from the staging site to the live site via HTACCESS. Any recommendations?
Technical SEO | | melen0 -
Best way to handle pages with iframes that I don't want indexed? Noindex in the header?
I am doing a bit of SEO work for a friend, and the situation is the following: The site is a place to discuss articles on the web. When clicking on a link that has been posted, it sends the user to a URL on the main site that is URL.com/article/view. This page has a large iframe that contains the article itself, and a small bar at the top containing the article with various links to get back to the original site. I'd like to make sure that the comment pages (URL.com/article) are indexed instead of all of the URL.com/article/view pages, which won't really do much for SEO. However, all of these pages are indexed. What would be the best approach to make sure the iframe pages aren't indexed? My intuition is to just have a "noindex" in the header of those pages, and just make sure that the conversation pages themselves are properly linked throughout the site, so that they get indexed properly. Does this seem right? Thanks for the help...
Technical SEO | | jim_shook0 -
Correct linking to the /index of a site and subfolders: what's the best practice? link to: domain.com/ or domain.com/index.html ?
Dear all, starting with my .htaccess file: RewriteEngine On
Technical SEO | | inlinear
RewriteCond %{HTTP_HOST} ^www.inlinear.com$ [NC]
RewriteRule ^(.*)$ http://inlinear.com/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^./index.html
RewriteRule ^(.)index.html$ http://inlinear.com/ [R=301,L] 1. I redirect all URL-requests with www. to the non www-version...
2. all requests with "index.html" will be redirected to "domain.com/" My questions are: A) When linking from a page to my frontpage (home) the best practice is?: "http://domain.com/" the best and NOT: "http://domain.com/index.php" B) When linking to the index of a subfolder "http://domain.com/products/index.php" I should link also to: "http://domain.com/products/" and not put also the index.php..., right? C) When I define the canonical ULR, should I also define it just: "http://domain.com/products/" or in this case I should link to the definite file: "http://domain.com/products**/index.php**" Is A) B) the best practice? and C) ? Thanks for all replies! 🙂
Holger0 -
The word 'shop' in a page title
I'm reworking most of the page titles on our site and I'm considering the use of the word 'Shop' before a product category. ex. Shop 'keyword' | Brand Name As opposed to just using the keyword sans 'Shop.' Some of the keywords are very generic, especially for a top level category page. Question: Is the word 'Shop' damaging my SEO efforts in any way?
Technical SEO | | rhoadesjohn0 -
Oh no googlebot can not access my robots.txt file
I just receive a n error message from google webmaster Wonder it was something to do with Yoast plugin. Could somebody help me with troubleshooting this? Here's original message Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. Recommended action If the site error rate is 100%: Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot. If your robots.txt is a static page, verify that your web service has proper permissions to access the file. If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure. If the site error rate is less than 100%: Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors. The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website. After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
Technical SEO | | BistosAmerica0 -
Is there a great tool for URL mapping old to new web site?
We are implementing new design and removing some pages and adding new content. Task is to correctly map and redirect old pages that no longer exist.
Technical SEO | | KnutDSvendsen0 -
Google.ca is showing our US site instead of our Canada Site
When our Canadian users who search on google.ca for our brand (e.g. Travelocity, Travelocity hotels, etc.), the first few results our from our US site (travelocity.com) rather than our Canadian site (travelocity.ca). In Google Webmaster Tools, we've adjusted the geotargeting settings to focus on the appropriate locale, but the wrong country TLD is still coming up at the top via google.ca. What's the best way to ensure our Canadian site comes up instead of the US site on google.ca? Thanks, Tory Smith
Technical SEO | | travelocitysearch
Travelocity0