Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Japanese URL-structured sitemap (pages) not being indexed by Bing Webmaster Tools
Hello everyone,
I am facing an issue with the sitemap submission feature in Bing Webmaster Tools for a Japanese language subdirectory domain project. Just to outline the key points:
- The website is based on a subdirectory URL ( example.com/ja/ )
- The Japanese URLs (when pages are published in WordPress) are not being encoded. They are entered in pure Kanji.
- Google Webmaster Tools, for instance, has no issues reading and indexing the pages' URLs in its sitemap submission area (all pages are being indexed).
When it comes to Bing Webmaster Tools it's a different story, though. After the sitemap has been submitted ( example.com/ja/sitemap.xml ), Bing reports an error saying that it failed to download one part of the sitemap: "page-sitemap.xml" (the sitemap listing all of the site's pages). That means that no URLs have been submitted to Bing either.
My suspicion is that Bing Webmaster Tools does not understand the Japanese URLs (or the Kanji, for that matter). Therefore, I wonder what the correct way to go about this is.
When viewing the sitemap ( example.com/ja/page-sitemap.xml ) in a web browser, though, the characters in the Japanese URLs are already displayed in encoded form.
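For illustration, here is a minimal sketch of what that encoded form looks like, assuming the sitemap uses standard UTF-8 percent-encoding (the thread does not confirm exactly how WordPress or the sitemap plugin produces it). The page slug and the `encode_url` helper are hypothetical and only serve the example.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_url(url: str) -> str:
    """Percent-encode the path of a URL as UTF-8, leaving the other parts as-is.

    Hypothetical helper for illustration; it assumes any non-ASCII
    characters appear only in the path.
    """
    parts = urlsplit(url)
    # safe="/%" keeps path separators and leaves already-encoded
    # sequences alone, so encoding the result again changes nothing.
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


# Hypothetical Kanji slug under the /ja/ subdirectory.
raw = "https://example.com/ja/会社概要/"
encoded = encode_url(raw)

print(encoded)                          # ASCII-only form, as it would appear in a sitemap
print(unquote(urlsplit(encoded).path))  # decodes back to the Kanji path
```

Both forms identify the same page; the encoded one is simply the ASCII-safe spelling that an XML sitemap would normally carry.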
I am not sure if submitting the Kanji-style URLs separately is a solution. In Bing Webmaster Tools this can only be done at the root domain level ( example.com ). However, surely there must be a way to make Bing's sitemap submission understand sitemaps with Japanese URLs?
Many thanks everyone for any advice!
-
-
Hello there,
Thanks for your suggestions and sorry for the late response. In fact, I also left an inquiry with the Bing Webmaster Tools mail support (I did not even realise they offered this service), and they answered within one day.
They confirmed that the site runs without any errors and that the sitemap has now been submitted successfully. Upon checking I can confirm this (the sitemap's URLs have finally been submitted). Therefore, all is in order now.
I still do not understand why prior to this the JA sitemap URLs were not being submitted (for weeks), even though I tried to make Bing Webmaster Tools re-crawl it by re-submitting the sitemap.
In any case, I guess this is one of those episodes where the problem simply fixed itself. Kudos to their support though...
Thanks everyone
-
Hey there–a few thoughts/questions:
- have you correctly implemented hreflang tags (tags that declare the alternate language & country versions in the <head> section of every page of your site)?
- why did you choose to create a separate sitemap that lives under the /ja page path? you could, instead, add alternate URLs pointing to the JP version of your content in your existing sitemap
- I doubt this is why you're seeing issues, but is there a particular reason you chose JA as the page path as opposed to the ISO country code for Japan, JP?
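The first two points can be sketched roughly as follows. This is only an illustration under stated assumptions: the page URLs, the `ALTERNATES` mapping, and both helper functions are hypothetical, and the sitemap part uses the `xhtml:link` annotation format that Google documents for sitemaps (it needs `xmlns:xhtml="http://www.w3.org/1999/xhtml"` declared on the `<urlset>` element). Nothing in this thread confirms how Bing treats those sitemap annotations.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical language versions of a single page; URLs are placeholders.
ALTERNATES = {
    "en": "https://example.com/about/",
    "ja": "https://example.com/ja/about/",
}


def head_links(alternates):
    """Return the <link> tags to place in the <head> of every language version."""
    return [
        f'<link rel="alternate" hreflang={quoteattr(lang)} href={quoteattr(url)} />'
        for lang, url in alternates.items()
    ]


def sitemap_url_entry(loc, alternates):
    """Return one sitemap <url> entry that lists every alternate, itself included."""
    lines = ["<url>", f"  <loc>{escape(loc)}</loc>"]
    for lang, url in alternates.items():
        lines.append(
            f'  <xhtml:link rel="alternate" hreflang={quoteattr(lang)} '
            f"href={quoteattr(url)} />"
        )
    lines.append("</url>")
    return "\n".join(lines)


print("\n".join(head_links(ALTERNATES)))
print(sitemap_url_entry(ALTERNATES["ja"], ALTERNATES))
```

Each language version would carry the same set of tags, so the English and Japanese pages reference each other and themselves.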
To specifically answer your Q about Kanji, I have not found anything that states Bing does not support Kanji. After some preliminary searching, it also looks like Bing does present URLs with Kanji characters in its results. As a result, I don't think Kanji is the reason you're having trouble getting your JP sitemap read by Bing.