Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Best server-side sitemap generators
-
I've been looking into sitemap generators recently and have a good understanding of what creating a sitemap for a small website of fewer than 500 URLs involves. I have successfully generated a sitemap for a very small site, but I’m trying to work out the best way of crawling a large site with millions of URLs.
I’ve decided that the best way to handle such a large number of URLs is to use a server-side sitemap generator, but this is an area that doesn’t seem to be covered in detail on SEO blogs / forums. Could anyone recommend a good server-side sitemap generator? What do you think of the automated offerings from Google and Bing? I’ve found a list of server-side sitemap generators from Google, but I can’t see any way to choose between them. I realise that a lot will depend on the technologies we use server-side, but I'm afraid that I don't know what they are at this time.
-
Unless they have fixed it in recent months, xml-sitemaps does not generate correct video sitemaps.
-
Yeah, they offer free and paid hosted versions too. But I found the server-side version much simpler to set up and control.
-
Excellent advice Federico. My first reaction was, "but that's not a server-side sitemap generator". I just looked at their website though and it turns out that it is! Looks like I need to read things more carefully!
I'll look into that as an option, but if anyone else has any server-side sitemap generators that they'd recommend then I'd be really interested to hear about them.
-
I have been using xml-sitemaps (paid version) for all my sites for over 5 years and it works like a charm, scraping and indexing what needs to be indexed and scraped, and it consumes very few resources. 100% recommended (they also have nice plugins for extra sitemaps: video, news, images, etc.).
Hope that helps!
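For anyone wondering what a server-side generator has to do for a site with millions of URLs: the sitemaps.org protocol limits each sitemap file to 50,000 URLs (and 50 MB uncompressed), so large sites split their URLs across many files and list those files in a sitemap index. Below is a minimal Python sketch of that splitting step, not how xml-sitemaps or any other product works. The function names, the `sitemap-N.xml` file naming and the `base_url` parameter are assumptions made for illustration, and the sketch does not check the file-size limit.

```python
from itertools import islice
from pathlib import Path
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def chunked(iterable, size):
    """Yield lists of up to `size` items so the full URL set never sits in memory."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def write_sitemaps(urls, out_dir, base_url, max_urls=MAX_URLS_PER_FILE):
    """Write sitemap-N.xml files plus sitemapindex.xml; return the sitemap file names.

    `base_url` (hypothetical parameter) is the public URL under which the
    generated files will be served.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    names = []

    for number, chunk in enumerate(chunked(urls, max_urls), start=1):
        name = f"sitemap-{number}.xml"
        lines = [XML_HEADER, f'<urlset xmlns="{SITEMAP_NS}">']
        # escape() turns &, < and > into XML entities, as the protocol requires
        lines += [f"  <url><loc>{escape(url)}</loc></url>" for url in chunk]
        lines.append("</urlset>")
        (out_dir / name).write_text("\n".join(lines) + "\n", encoding="utf-8")
        names.append(name)

    # The index file lists every sitemap file by its public URL
    prefix = base_url.rstrip("/") + "/"
    index = [XML_HEADER, f'<sitemapindex xmlns="{SITEMAP_NS}">']
    index += [f"  <sitemap><loc>{escape(prefix + name)}</loc></sitemap>" for name in names]
    index.append("</sitemapindex>")
    (out_dir / "sitemapindex.xml").write_text("\n".join(index) + "\n", encoding="utf-8")
    return names
```

In practice `urls` would be a generator streaming rows from the site's database or CMS rather than a crawl, for example `write_sitemaps(my_url_source(), "public/sitemaps", "https://www.example.com/sitemaps")`, where `my_url_source` is a placeholder for whatever yields the site's URLs. Only the index file then needs to be submitted to the search engines.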