Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Does Google index XML files?
-
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
-
Yes, Google indexes XML files. You can try a search using filetype:xml
I am not an expert on RSS files but I believe the XML versions use the <rss>tag. If I were to take a guess, I would say Google can easily examine the file (they read pdf's for example) and determine if it is an RSS feed.</rss>
-
Google and other search engines know about XML files. The reason I don't necessarily say indexed is because you don't see them really popping up on a SERP page. Robots know the difference between a straight xml file and an rss feed because of the technical markup and structure of the actual file.
For example a very common use for Google and xml files is sitemaps, which you can give them the sitemap feed in Webmaster tools so they will know about the file. If you did a search for www.rootdomain.com + sitemap file you would never see an .xml file actually popup. You would need to visit the url directly.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URLs dropping from index (Crawled, currently not indexed)
I've noticed that some of our URLs have recently dropped completely out of Google's index. When carrying out a URL inspection in GSC, it comes up with 'Crawled, currently not indexed'. Strangely, I've also noticed that under referring page it says 'None detected', which is definitely not the case. I wonder if it could be something to do with the following? https://www.seroundtable.com/google-ranking-index-drop-30192.html - It seems to be a bug affecting quite a few people. Here are a few examples of the URLs that have gone missing: https://www.ihasco.co.uk/courses/detail/sexual-harassment-awareness-training https://www.ihasco.co.uk/courses/detail/conflict-resolution-training https://www.ihasco.co.uk/courses/detail/prevent-duty-training Any help here would be massively appreciated!
Technical SEO | | iHasco0 -
Pages are Indexed but not Cached by Google. Why?
Hello, We have magento 2 extensions website mageants.com since 1 years google every 15 days cached my all pages but suddenly last 15 days my websites pages not cached by google showing me 404 error so go search console check error but din't find any error so I have cached manually fetch and render but still most of pages have same 404 error example page : - https://www.mageants.com/free-gift-for-magento-2.html error :- http://webcache.googleusercontent.com/search?q=cache%3Ahttps%3A%2F%2Fwww.mageants.com%2Ffree-gift-for-magento-2.html&rlz=1C1CHBD_enIN803IN804&oq=cache%3Ahttps%3A%2F%2Fwww.mageants.com%2Ffree-gift-for-magento-2.html&aqs=chrome..69i57j69i58.1569j0j4&sourceid=chrome&ie=UTF-8 so have any one solutions for this issues
Technical SEO | | vikrantrathore0 -
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers... Google Search Console: 237 indexed pages Google search using site command: 468 results MOZ site crawl: 1013 unique URLs Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500) Can anyone shed any light on why they differ so much? And where lies the truth?
Technical SEO | | muzzmoz1 -
Site indexed by Google, but (almost) never gets impressions
Hi there, I have a question that I wasn't able to give it a reasonable answer yet, so I'm going to trust on all of you. Basically a site has all its pages indexed by Google (I verified with site:sitename.com) and it also has great and unique content. All on-page grades are A with absolutely no negative factors at all. However its pages do not get impressions almost at all. Of course I didn't expect it to be on page 1 since it has been launched on Dec, 1st, but it looks like Google is ignoring (or giving it bad scores) for some reason. Only things that can contribute to that could be: domain privacy on the domain, redirect from the www to the subdomain we use (we did this because it will be a multi-language site, so we'll assign to each country a subdomain), recency (it has been put online on Dec 1st and the domain is just a couple of months old). Or maybe because we blocked crawlers for a few days before the launch? Exactly a few days before Dec 1st. What do you think? What could be the reason for that? Thanks guys!
Technical SEO | | ruggero0 -
Vanity URLs are being indexed in Google
We are currently using vanity URLs to track offline marketing, the vanity URL is structured as www.clientdomain.com/publication, this URL then is 302 redirected to the actual URL on the website not a custom landing page. The resulting redirected URL looks like: www.clientdomain.com/xyzpage?utm_source=print&utm_medium=print&utm_campaign=printcampaign. We have started to notice that some of the vanity URLs are being indexed in Google search. To prevent this from happening should we be using a 301 redirect instead of a 302 and will the Google index ignore the utm parameters in the URL that is being 301 redirect to? If not, any suggestions on how to handle? Thanks,
Technical SEO | | seogirl221 -
How To Cleanup the Google Index After a Website Has Been HACKED
We have a client whose website was hacked, and some troll created thousands of viagra pages, which were all indexed by Google. See the screenshot for an example. The site has been cleaned up completely, but I wanted to know if anyone can weigh in on how we can cleanup the Google index. Are there extra steps we should take? So far we have gone into webmaster tools and submitted a new site map. ^802D799E5372F02797BE19290D8987F3E248DCA6656F8D9BF6^pimgpsh_fullsize_distr.png
Technical SEO | | yoursearchteam0 -
Why are Google search results different if you are log'd into Google or not?
I get different results when I'm log'd into my Google account associated with my website than if I'm not. The same country is occurring. So how can I rely on the google results I'm seeing? For instance my site is page 1 with the improvements I made based on SEOMOZ if I'm log'd in. Yet I'm not on the first 25 pages if I'm not logged in.
Technical SEO | | Romana0 -
Ror.xml vs sitemap.xml
Hey Mozzers, So I've been reading somethings lately and some are saying that the top search engines do not use ror.xml sitemap but focus just on the sitemap.xml. Is that true? Do you use ror? if so, for what purpose, products, "special articles", other uses? Can sitemap be sufficient for all of those? Thank you, Vadim
Technical SEO | | vijayvasu0