Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
520 Error from crawl report with Cloudflare
-
I am getting a lot of 520 Server Errors in crawl reports. I see this is related to Cloudflare. Since 520 is a Cloudflare code, maybe the Moz team can change the label from "unknown" to "Cloudflare 520". Perhaps the Moz team can also update the "how to fix" section in the reporting with suggestions on how to avoid seeing these in the report, or on how to tell if there is a real issue that needs to be addressed. At this point I don't know.
There must be a solution that Moz can provide, like a setting in Cloudflare that will permit Rogerbot, if Cloudflare is blocking it because it does not like its behavior or something.
It could be that Rogerbot is crawling my site on a bad day or at a time when we were deploying a massive site change. If I know when my site will be down can I pause Rogerbot?
I found this https://developers.cloudflare.com/support/troubleshooting/general-troubleshooting/troubleshooting-crawl-errors/
-
A 520 error is a Cloudflare-specific HTTP status code. It indicates that the origin server returned an empty, unknown, or unexpected response to Cloudflare. Related connection problems can show up under 520 or under a neighbouring 52x code, and the usual suspects include:
Server downtime: The origin server might be down or undergoing maintenance.
Firewall restrictions: The origin server might have a firewall that is blocking requests from Cloudflare.
DNS issues: There might be a DNS misconfiguration that is preventing Cloudflare from resolving the origin server's IP address.
SSL issues: There might be an issue with the SSL certificate on the origin server.
To troubleshoot the issue, you can try the following:
Check if the origin server is up and running.
Check if the origin server has a firewall that is blocking requests from Cloudflare.
Check if the DNS is configured correctly.
Check if the SSL certificate is valid and configured correctly.
If none of these steps resolve the issue, you can reach out to Cloudflare support for further assistance.
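To make the checks above more concrete, here is a minimal Python sketch that requests a URL with a crawler-style User-Agent and explains any Cloudflare 52x code it gets back. The function names, the simplified "rogerbot" User-Agent string, and the example URL in the comment are assumptions for illustration. They are not Moz's or Cloudflare's tooling, and Rogerbot's real User-Agent string is longer.

```python
import urllib.error
import urllib.request

# Cloudflare-specific 52x codes with a short summary of what each one means.
CLOUDFLARE_52X = {
    520: "origin returned an empty, unknown, or unexpected response",
    521: "origin web server refused the connection",
    522: "connection to the origin timed out",
    523: "origin is unreachable",
    524: "origin accepted the connection but took too long to respond",
    525: "SSL handshake with the origin failed",
    526: "origin SSL certificate is invalid",
}


def explain_status(status):
    """Turn a status code (or None for no response at all) into a readable hint."""
    if status is None:
        return "no HTTP response (DNS failure, refused connection, or timeout)"
    if status in CLOUDFLARE_52X:
        return f"{status}: {CLOUDFLARE_52X[status]}"
    return f"{status}: not a Cloudflare 52x code"


def fetch_status(url, user_agent="rogerbot", timeout=10.0):
    """Fetch url and return the HTTP status code, or None if no response arrived.

    The default user_agent is a simplified placeholder, not Rogerbot's real string.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses (including 520) arrive here with a status code.
        return error.code
    except OSError:
        # Covers URLError, socket timeouts, and refused connections.
        return None


# Example usage (hypothetical URL, requires network access):
# print(explain_status(fetch_status("https://www.example.com/")))
```

Running it a few times with a browser User-Agent and then with the crawler-style one may show whether only bot-like requests are affected, which would point at a firewall rule rather than a server outage.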
-
@awilliams_kingston To answer your question, there is no option to pause Rogerbot manually. However, Rogerbot only crawls a website when a Site Crawl campaign is active and scheduled to run. If you want to pause Rogerbot, you can stop the active campaign or schedule the next crawl to start at a later time.
To schedule a Site Crawl, go to your Moz Pro account, click on "Site Crawl" in the left-hand navigation menu, and select "Add Campaign" to set up a new campaign or select an existing one. From there, you can customize your crawl settings, including the crawl frequency and start time.
If you have a scheduled maintenance window and want to prevent Rogerbot from crawling your site during that time, you can adjust the crawl frequency to avoid overlapping with your maintenance schedule. You can also use a robots.txt file to block the crawler from accessing specific pages or sections of your site.
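As a rough illustration of the robots.txt approach, the Python sketch below embeds a sample robots.txt and uses the standard library's parser to confirm which paths the rules would block for the rogerbot user agent. The paths /staging/ and /search are hypothetical examples, and the helper name is an assumption. Note that this is a static block, not a scheduled pause: it stays in effect until the file is changed.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: block rogerbot from two hypothetical sections,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /staging/
Disallow: /search

User-agent: *
Disallow:
"""


def rogerbot_allowed(path, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules let rogerbot fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("rogerbot", path)


for candidate in ("/staging/index.html", "/blog/post"):
    verdict = "allowed" if rogerbot_allowed(candidate) else "blocked"
    print(f"{candidate}: {verdict}")
```

Checking the rules this way before deploying them is a cheap guard against accidentally blocking more of the site than intended.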
-
@awilliams_kingston The 520 server error you're seeing in your Moz crawl reports is related to Cloudflare. It's a generic error, which means it could be caused by a variety of issues, including server overload or misconfigured settings.
To address this, you could check your Cloudflare firewall settings and see if there are any rules that are blocking the Moz Rogerbot crawler. If there are, try adding an exception for the Rogerbot user agent to allow it to crawl your site without being blocked.
If you know your site will be down for maintenance or undergoing significant changes, you could pause the Moz crawler during that time to prevent it from generating false 520 errors in your reports.
Finally, you could check out the troubleshooting guide in the Cloudflare documentation for more information on identifying and addressing crawl errors. Remember to work with both Moz and Cloudflare support teams to find a solution that works for your specific setup.
-
@Kateparish Thank you.
How do you pause Rogerbot? I can't find anything on that in my admin panel, but maybe that is because there is no crawl happening at the moment and my next crawl is scheduled to happen in a few days. Also, is there a way to schedule a pause if a crawl is happening? For example, if I know I have site maintenance on a certain day of the week at a specific time, can I have Rogerbot take a break?
-
A 520 error typically indicates a connection error between Cloudflare and the origin server. This error occurs when the server returns an empty or invalid response to Cloudflare, or when the server takes too long to respond.
To troubleshoot a 520 error from a crawl report with Cloudflare, you can take the following steps:
Check the server logs: The first step in troubleshooting a 520 error is to check the server logs for any error messages. Look for any errors related to the server's network or connectivity, such as DNS resolution issues, network timeouts, or firewall restrictions.
Check Cloudflare logs: Cloudflare logs can provide additional insights into the cause of the error. Check the Cloudflare logs for any error messages or connection issues between Cloudflare and the origin server.
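Temporarily disable Cloudflare: Temporarily disabling Cloudflare can help you determine whether the error comes from the Cloudflare layer or from the origin server itself. If the error disappears when Cloudflare is disabled, the issue likely lies in how Cloudflare and the origin interact (for example, the origin blocking Cloudflare's IP addresses) rather than in the origin being down.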
Contact Cloudflare support: If you are unable to resolve the issue on your own, you can contact Cloudflare support for assistance. Provide them with the server logs and Cloudflare logs, as well as any other relevant information, to help them diagnose the issue.
By following these steps, you should be able to identify and resolve the 520 error from the crawl report with Cloudflare.
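For the server-log step, a small script can tally what the origin actually returned to the crawler. The sketch below assumes the access log uses the common "combined" format and that Rogerbot's requests contain "rogerbot" in the User-Agent field. The sample lines, IP addresses, and version string are made up for illustration. Because a 520 is generated at Cloudflare's edge, it will probably not appear in origin logs. The clues to look for are origin-side errors, or crawler requests that are missing entirely around the time of the reported 520s.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'"(?:[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def rogerbot_status_counts(lines):
    """Count response status codes for requests whose User-Agent mentions rogerbot."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and "rogerbot" in match.group("agent").lower():
            counts[int(match.group("status"))] += 1
    return counts


# Hypothetical sample lines, not taken from a real server.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2023:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "rogerbot/1.2 (hypothetical sample)"',
    '203.0.113.5 - - [10/Mar/2023:10:00:01 +0000] "GET /about HTTP/1.1" 500 0 "-" "rogerbot/1.2 (hypothetical sample)"',
    '198.51.100.7 - - [10/Mar/2023:10:00:02 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for status, count in sorted(rogerbot_status_counts(SAMPLE_LOG).items()):
    print(f"{status}: {count} request(s)")
```

To use it on a real log, pass an open file object instead of SAMPLE_LOG, since the function only needs an iterable of lines.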