Host sitemaps on S3?
Hey guys,
I run a dynamic web service and will start building static sitemaps for it pretty soon. Because my app runs on a multitude of servers, it isn't easy to distribute frequently updated static files across all of them.
My idea is to host the files on AWS S3 and point the Sitemap directive in my robots.txt there. I'll use a sitemap index, so every other sitemap will be hosted on S3 as well.
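A minimal sketch of that setup, for illustration only: it builds a sitemap index whose entries point at S3 and produces the matching robots.txt line. The bucket URL, file names, and function names are hypothetical, and uploading the files to S3 is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Hypothetical bucket URL; replace with wherever the files are actually served from.
S3_BASE = "https://example-bucket.s3.amazonaws.com/sitemaps"


def build_sitemap_index(filenames, base_url=S3_BASE):
    """Return sitemap index XML listing each child sitemap by its S3 URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in filenames:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/{name}"
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_sitemap_line(index_name="sitemap-index.xml", base_url=S3_BASE):
    """Return the robots.txt line that points crawlers at the index on S3."""
    return f"Sitemap: {base_url}/{index_name}"


if __name__ == "__main__":
    print(build_sitemap_index(["sitemap-1.xml", "sitemap-2.xml"]))
    print(robots_sitemap_line())
```

The robots.txt file itself would still be served by the app on its own hostname; only the sitemap files live in the bucket.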
I could dynamically mirror the content from the files in S3 through my app, but that would be a little more resource intensive than just serving the static files from a common place.
Any ideas? Thanks!
-
My general take on this sort of scenario is first to eliminate all the redundant hostnames with round-robin DNS, then add extra server power with software-based load balancing in the interim (with a solution like InterWorx), and break out the database servers. If you do that, you should have a nice little server cluster that's crazy efficient and scalable. You can add a CDN to the mix if you like as well. With all of that, SEO should work the same way as on a single server.
Sitemaps can then be generated dynamically really easily (in under 25 lines of code, most of the time).
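As a rough illustration of how short such a generator can be, here is a sketch in Python. It assumes the app can supply a list of (URL, last-modified date) pairs; the function name and the example URLs are made up, and wiring it into a web framework route is left out.

```python
import xml.etree.ElementTree as ET


def build_sitemap(pages):
    """Build sitemap XML from (loc, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL for us.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages; in a real app these would come from the database.
    print(build_sitemap([
        ("https://example.com/", "2024-01-01"),
        ("https://example.com/search?q=a&page=2", None),
    ]))
```

Served from a single route behind the load balancer, every server would return the same sitemap without any static files to sync.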
If you just want a way to mirror static files, you'll want to look at rsync.
And finally, as for S3, my personal opinion is to stay away. I'm an SEO, but I also spent 7 years building a hosting company. Those solutions sound great in their marketing, but are scientifically less reliable than standard hosting, and you can verify that via public uptime tracking sites like HyperSpin.