Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Removing UpperCase URLs from Indexing
-
This search - site:www.qjamba.com/online-savings/automotix
gives me this result from Google:
Automotix online coupons and shopping - Qjamba
https://www.qjamba.com/online-savings/automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products.
and Google tells me there is another one, which is 'very similar'. When I click to see it I get:
Automotix online coupons and shopping - Qjamba
https://www.qjamba.com/online-savings/Automotix
Online Coupons and Shopping Savings for Automotix. Coupon codes for online discounts on Vehicles & Parts products.
This is because I recently changed my program to redirect all URLs with uppercase in them to lowercase, as it appears that all lowercase is strongly recommended.
I assume that having 2 indexed URLs for the same content dilutes link juice. Can I safely remove all of my uppercase indexed pages from Google without it affecting the indexing of the lowercase URLs? And if so, what is the best way? There are thousands.
-
Hi AMHC,
It makes sense that, with hardly any backlinks built up, Google won't find my uppercase URLs since all the page links have been changed. However, I am writing out all of the URLs that are redirected into an email, and from that I can tell that Google is finding them. I guess they may have a list of URLs from prior indexing that they crawl independent of what their crawler comes up with.
I'll keep looking to see what they have indexed, and if it turns out they just aren't crawling certain pages, I will put them in a sitemap to be crawled. It's a good idea for taking care of the problem quickly, so if it progresses too slowly I'll do that.
Thanks very much for your answers!
-
Google needs to crawl the bad pages that you 301d. If there are no live links to those pages, then Google can't find them to 301. In short, if you created new lower case URLs, you just increased your duplicate content problem.
To solve this problem, build an HTML sitemap with all of the bad URLs. Have Google fetch and submit the page and all of the pages it links to. Google will crawl all of your old pages and apply the 301s.
-
Thanks AMHC. In my case, I just don't have many backlinks, so I don't have the urgency that you faced with getting Google to see all the redirects. But I'm still not understanding. It sounds like you believe that once Google sees the redirect it removes the old uppercase URL from its index. It doesn't look to me like that is what happened in my case, because Google is currently indexing BOTH. That means it has crawled my new lowercase URLs, and I know it isn't crawling any uppercase ones anymore (it can't, since all are redirected). So that's why I wonder if I have to remove those uppercase URLs. Does that make sense, or am I just not understanding it still?
EDIT: I just discovered I wasn't doing a 301 redirect, so it wasn't considered a permanent move. Switching to a 301, if I understand it right, will remove the uppercase URLs from Google's index permanently.
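To illustrate the difference noted in the EDIT, here is a minimal sketch of a redirect that sets the permanent status explicitly. It is written in Python as a bare WSGI app purely for illustration; the function name `lowercase_redirect_app` and the placeholder page body are assumptions, not the poster's actual code. In PHP, a `Location` header on its own is sent with a 302 status unless a 3xx code is given, which may be what happened here.

```python
def lowercase_redirect_app(environ, start_response):
    """Hypothetical WSGI app: answer any path containing uppercase with a 301."""
    path = environ.get("PATH_INFO", "/")
    if path != path.lower():
        target = path.lower()
        query = environ.get("QUERY_STRING", "")
        if query:
            # Keep the query string unchanged (assumption: it may be case-sensitive).
            target += "?" + query
        # The status line is what marks the move as permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    # Placeholder response for URLs that are already lowercase.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"lowercase page"]
```

Requesting the old URL and checking that the status line reads 301 rather than 302 is a quick way to confirm the change took effect.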
-
Canonicals still drain link juice. Canonicals aren't like a 301: the link juice stays on the canonicalized page. All a canonical does is tell Google, in the case of duplicate content, which page is primary. Canonicals handle the duplicate content issue; they do not handle the link juice issue. If I have 2 pages, /product-name/ and /product-name=?khdfpohfo/, that are duplicates, I can use a canonical to tell Google to ignore the page with the variable string and rank the page without it. If the page with the variable string has links, the link juice stays on that page.
The HTML sitemap is there to tell Google about the 301s. The sitemap would look like this:
After you do the 301 redirect, as well as set up parameters in the .htaccess file (I think - not the developer on this), everything should redirect to the lower case URL. The problem is that if you do a 301 redirect for your entire site, Google may not figure it out too quickly. When it crawls your home page downward, it's only going to see the new URLs, and can't crawl the old 301 URLs because there aren't any internal links pointing at them. The only way Google will see the 301 is via an external backlink. The way we solved this was to create an HTML sitemap of all of the old upper case URLs. We then had Google fetch and index/crawl the sitemap. As it crawls the sitemap, where all of the URLs are 301 redirects, it will likewise point all of the Link Juice at the new URLs.
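One way such an HTML sitemap could be produced is sketched below in Python. This is only an illustration under assumptions: the function name `build_html_sitemap` and the page title are made up, and the list of old uppercase URLs would have to come from your own records (for example, a log of redirected requests).

```python
from html import escape


def build_html_sitemap(old_urls, title="Old URLs"):
    """Return a simple HTML page that links to every old URL in old_urls."""
    items = "\n".join(
        f'  <li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in old_urls
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n<ul>\n{items}\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Example input taken from the original question.
    print(build_html_sitemap(["https://www.qjamba.com/online-savings/Automotix"]))
```

The page only needs to contain plain links; each one should answer with a 301 to its lowercase version when crawled.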
-
I gotcha. Yeah, different thing going on here. These URLs can be really difficult! I have uppercase and lowercase, https and http, URLs that have different content (not just formatting) for mobile than for desktop and vice versa, mobile URLs that don't even exist for desktop, and desktop URLs that don't exist for mobile, all under the same domain. 1000s of internal pages. In the desire to create a good website for users I've created an SEO monster, because I didn't realize the many consequences with regard to search indexes.
If you know a true expert in these areas, I need him/her. 4 years on this site, it's finally live (2 months), and now I'm discovering all of these things have to be fixed, but I can't afford thousands of dollars. I'll do the work, I just need the knowledge!
-
I see where you are coming from, and I do not have a good answer then. When I did a lowercase redirect, I started by creating the new lowercase pages and then setting the canonical to them. After a few months I removed the uppercase versions and redirected them to the new lowercase ones.
-
Hutch, thanks.
The site is dynamic with thousands of pages that are now being redirected to lower case, so I'm not seeing how using canonical would work because the upper case urls aren't on the site anymore. I guess I think of canonical as being useful when you have ongoing content on the site that duplicates one or more other pages on the same site. In my case none of the upper case urls exist anymore so they don't have 'ongoing' content. I'm still new to this so if it sounds like I have it wrong, please correct me.
-
Another quick fix would be to use a canonical tag on all of your pages pointing to the full lowercase versions.
So for the URLs example.com/UPPER, example.com/Upper, and example.com/upper you would place a canonical link tag for example.com/upper into the head of each one so Google knows that these are just variations of the same page, and it will point search to the desired page, example.com/upper
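As a rough sketch of the tag being described, the Python helper below builds a canonical link element that points at the lowercase version of a URL. The function name `canonical_link_tag` is hypothetical, and dropping the query string and fragment from the canonical URL is an assumption that may not suit every site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a <link rel="canonical"> tag pointing at the lowercase URL."""
    parts = urlsplit(url)
    # Assumption: the canonical form has no query string or fragment.
    canonical = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), "", "")
    )
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    for variant in ("https://example.com/UPPER", "https://example.com/Upper"):
        print(canonical_link_tag(variant))
```

Every case variant of the page would emit the same tag in its head.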
-
AMHC, thank you for your response. I'm in the middle of quite a mess, as this is one of several issues, so really appreciate your help. I must confess to not following everything you wrote exactly:
In your situation, I think I understand the redirect. It is the same reason I am doing a redirect: so that anyone coming to this site via a URL with uppercase in it will end up on the lowercase page, and Google will then index the page as a lowercase page. BTW, for me that has been easy as I am doing it via PHP: if the URL doesn't equal strtolower of the URL, then I redirect to the strtolower version.
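The comparison described above can be sketched as follows. This is a Python illustration rather than the PHP actually in use, the name `lowercase_target` is hypothetical, and, unlike a plain strtolower on the whole URL, it leaves the query string untouched on the assumption that query values may be case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_target(url):
    """Return the lowercase form of url if it differs from url, else None."""
    parts = urlsplit(url)
    # Lowercase only the host and path; keep query and fragment as given.
    lowered = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(),
         parts.query, parts.fragment)
    )
    return lowered if lowered != url else None


if __name__ == "__main__":
    # A non-None result means "redirect here"; None means "serve the page".
    print(lowercase_target("https://www.qjamba.com/online-savings/Automotix"))
    print(lowercase_target("https://www.qjamba.com/online-savings/automotix"))
```

Returning None for URLs that are already lowercase avoids redirect loops.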
I think I get what you are saying about the sitemap: it speeds up Google crawling the site and seeing, from your redirects, that all those uppercase URLs should be lowercase. In my case, I don't have the concern about Google discovering them as you did because my site is only a couple of months old. And I have never given Google a sitemap, so many of my pages aren't crawled yet. (I am trying to clean up my entire URL structure before I submit a sitemap to them. However, they have already crawled perhaps 20% of the site, so I'm now trying to examine what Google has crawled and how it has been indexed to figure out what needs to be done.)
What I'm not understanding is this: it seems to me that what you described should succeed going forward in getting both Google and your users to the right ending page, but I don't see how it removes the prior uppercase URLs from Google's index. What is it that tells Google your prior uppercase URLs should no longer be in their index? Is it the fact that they aren't in the sitemap you provide now? Or do they literally have to be removed using some kind of removal or disavow tool? I discovered this (as you see in the OP) because Google appears never to have removed the uppercase ones even though they are indexing the lowercase ones now.
Ted
-
We had the same issue. Boy, was it an education. I had no idea that URLs were case sensitive for Google, and neither did my SEO buddies. I bet if you asked 100 SEOs if URLs were case sensitive for Google, 95 would answer "No". We discovered the problem in GWT and GA when they had different statistics for the mixed case and all lower case versions of the URL. We believed that we had both a duplicate content issue as well as a link juice splitting issue, with backlinks being pointed at both URLs.
We solved the problem by doing a 301 redirect, but as we are an ecommerce site with thousands of products, it was a messy process. We had to redirect pretty much every page on the site since the mixed case categories contaminated subcategories and products.
The 301 went pretty smoothly, and we saw a minor bump up in some of our rankings. I would strongly suggest that you create an HTML sitemap for every uppercase URL that you are going to 301. Here were our thoughts (we could be wrong on this): if we just 301 a page and don't tell Google, then Google won't know about it unless it tries to crawl the page. We felt like we needed to show Google ASAP that all of the pages are being redirected. Create an HTML sitemap with all of your uppercase URLs. After you do the 301, have Google fetch and index the sitemap page and all of the pages that it links to. Leave the map up for a few days, and then you can take it down. This will expedite moving the link juice to the correct pages, as Google will index the 301 for every page in the sitemap.