Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Sitemap.xml strategy for site with thousands of pages
-
I have a client that has a HUGE website with thousands of product pages. We don't currently have a sitemap.xml because it would take so much power to map the sitemap. I have thought about creating a sitemap for the key pages on the website - but didn't want to hurt the SEO on the thousands of product pages. If you have a sitemap.xml that only has some of the pages on your site - will it negatively impact the other pages, that Google has indexed - but are not listed on the sitemap.xml.
-
@jerrico1 Only including some pages in the sitemap won't hurt your SEO performance at all. I've done this on a number of sites for exactly the same reasons you are facing.
The XML sitemap simply gives Google one more way to find your pages. Ideally, you could use it to give Google a way to find all of your pages but you want to at least use it for the pages you want to be sure Google finds. However, there is no penalty if the page isn't in the sitemap.
That said - you may want to check if you need the XML sitemap at all as a point of discovery. If you have lots of links (internal or external) to the pages on your website, then odds are good that Google is already finding those pages. The XML sitemap wouldn't hurt to have but if there already links to these pages, you likely don't have a big problem to solve here.
The best way to check this is within your log file - pull a unique list of all the URLs that Google has crawled over the last few weeks. You may not be able to open up your log files (sometimes you can't easily on large sites and you aren't using an enterprise log analyzer). If that is the case, then you could check to see how many of your pages are Google organic landing pages in your analytics tool--if the page is getting traffic from Google, then Google clearly found the page.
Hope that helps!
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Japanese URL-structured sitemap (pages) not being indexed by Bing Webmaster Tools
Hello everyone, I am facing an issue with the sitemap submission feature in Bing Webmaster Tools for a Japanese language subdirectory domain project. Just to outline the key points: The website is based on a subdirectory URL ( example.com/ja/ ) The Japanese URLs (when pages are published in WordPress) are not being encoded. They are entered in pure Kanji. Google Webmaster Tools, for instance, has no issues reading and indexing the page's URLs in its sitemap submission area (all pages are being indexed). When it comes to Bing Webmaster Tools it's a different story, though. Basically, after the sitemap has been submitted ( example.com/ja/sitemap.xml ), it does report an error that it failed to download this part of the sitemap: "page-sitemap.xml" (basically the sitemap featuring all the sites pages). That means that no URLs have been submitted to Bing either. My apprehension is that Bing Webmaster Tools does not understand the Japanese URLs (or the Kanji for that matter). Therefore, I generally wonder what the correct way is to go on about this. When viewing the sitemap ( example.com/ja/page-sitemap.xml ) in a web browser, though, the Japanese URL's characters are already displayed as encoded. I am not sure if submitting the Kanji style URLs separately is a solution. In Bing Webmaster Tools this can only be done on the root domain level ( example.com ). However, surely there must be a way to make Bing's sitemap submission understand Japanese style sitemaps? Many thanks everyone for any advice!
Technical SEO | | Hermski0 -
Page titles in browser not matching WP page title
I have an issue with a few page titles not matching the title I have In WordPress. I have 2 pages, blog & creative gallery, that show the homepage title, which is causing duplicate title errors. This has been going on for 5 weeks, so its not an a crawl issue. Any ideas what could cause this? To clarify, I have the page title set in WP, and I checked "Disable PSP title format on this page/post:"...but this page is still showing the homepage title. Is there an additional title setting for a page in WP?
Technical SEO | | Branden_S0 -
Removing Redirected URLs from XML Sitemap
If I'm updating a URL and 301 redirecting the old URL to the new URL, Google recommends I remove the old URL from our XML sitemap and add the new URL. That makes sense. However, can anyone speak to how Google transfers the ranking value (link value) from the old URL to the new URL? My suspicion is this happens outside the sitemap. If Google already has the old URL indexed, the next time it crawls that URL, Googlebot discovers the 301 redirect and that starts the process of URL value transfer. I guess my question revolves around whether removing the old URL (or the timing of the removal) from the sitemap can impact Googlebot's transfer of the old URL value to the new URL.
Technical SEO | | RyanOD0 -
NoIndex/NoFollow pages showing up when doing a Google search using "Site:" parameter
We recently launched a beta version of our new website in a subdomain of our existing site. The existing site is www.fonts.com with the beta living at new.fonts.com. We do not want Google to crawl the new site until it's out of beta so we have added the following on all pages: However, one of our team members noticed that google is displaying results from new.fonts.com when doing an "site:new.fonts.com" search (see attached screenshot). Is it possible that Google is indexing the content despite the noindex, nofollow tags? We have double checked the syntax and it seems correct except the trailing "/". I know Google still crawls noindexed pages, however, the fact that they're showing up in search results using the site search syntax is unsettling. Any thoughts would be appreciated! DyWRP.png
Technical SEO | | ChrisRoberts-MTI0 -
Do we need to manually submit a sitemap every time, or can we host it on our site as /sitemap and Google will see & crawl it?
I realized we don't have a sitemap in place, so we're going to get one built. Once we do, I'll submit it manually to Google via Webmaster tools. However, we have a very dynamic site with content constantly being added. Will I need to keep manually re-submitting the sitemap to Google? Or could we have the continually updating sitemap live on our site at /sitemap and the crawlers will just pick it up from there? I noticed this is what SEOmoz does at http://www.seomoz.org/sitemap.
Technical SEO | | askotzko0 -
Same Video on Multiple Pages and Sites... Duplicate Issues?
We're rolling out quite a bit of pro video and hosting on a 3-party platform/player (likely BrightCove) that also allows us to have the URL reside on our domain. Here is a scenario for a particular video asset: A. It's on a product page that the video is relevant for. B. We have an entry on our blog with the video C. We have a separate section of our site "Video Library" that provides a centralized view of all videos. It's there too. D. We eventually give the video to other sites (bloggers, industry educational sites etc) for outreach and link-building. A through C on our domain are all for user experience as every page is very relevant, but are there any duplicate video issues here? We would likely only have the transcript on the product page (though we're open to suggestions). Any related feedback would be appreciated. We want to make this scalable and done properly from the beginning (will be rolling out 1000+ videos in 2010)
Technical SEO | | SEOPA0 -
Should XML sitemaps include *all* pages or just the deeper ones?
Hi guys, Ok this is a bit of a sitemap 101 question but I cant find a definitive answer: When we're running out XML sitemaps for google to chew on (we're talking ecommerce and directory sites with many pages inside sub-categories here) is there any point in mentioning the homepage or even the second level pages? We know google is crawling and indexing those and we're thinking we should trim the fat and just send a map of the bottom level pages. What do you think?
Technical SEO | | timwills0 -
What are the pros and cons of moving one site onto a subdomain of another site?
Two sites. One has weaker sales. What would the benefits and problems for SEO of moving the weak site from its own domain to a subdomain of the stronger site?
Technical SEO | | GriffinHansen0