Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
URL Rewriting Best Practices
-
Hey Moz!
I’m getting ready to implement URL rewrites on my website to improve site structure/URL readability. More specifically I want to:
- Improve our website structure by removing redundant directories.
- Replace underscores with dashes and remove file extensions for our URLs.
Please see my example below:
Old structure: http://www.widgets.com/widgets/commercial-widgets/small_blue_widget.htm
New structure: https://www.widgets.com/commercial-widgets/small-blue-widget
I've read several URL rewriting guides online, all of which seem to provide similar but overall different methods to do this. I'm looking for what's considered best practices to implement these rewrites. From what I understand, the most common method is to implement rewrites in our .htaccess file using mod_rewrite (which will find the old URLs and rewrite them according to the rewrites I implement).
One question I can't seem to find a definitive answer to is when I implement the rewrite to remove file extensions/replace underscores with dashes in our URLs, do the webpage file names need to be edited to the new format? From what I understand the webpage file names must remain the same for the rewrites in the .htaccess to work. However, our internal links (including canonical links) must be changed to the new URL format. Can anyone shed light on this?
Also, I'm aware that implementing URL rewriting improperly could negatively affect our SERP rankings. If I redirect our old website directory structure to our new structure using this rewrite, are my bases covered in regards to having the proper 301 redirects in place to not affect our rankings negatively?
Please offer any advice/reliable guides to handle this properly.
Thanks in advance!
-
Thanks for clearing that up and all of the help!
-
I'm saying rename files first and do rewrite for removing extensions.
You will have to do rewrite for replacing underscores with hyphens anyway, just for redirect purposes.
So, rename files from underscores to hyphens; do rewrite rule for underscore to hyphens to insure old pages are being redirected; do another rewrite for removing file extensions. In som time (2-3-4 months) when old file names (with underscores) are out of google index, delete first rewrite.
-
Hey Dmitrii,
I was planning on using two rewrites.
One rewrite for replacing the underscores with hyphens.
And another rewrite for removing the file extensions.
Just so I fully understand, you recommend implementing the rewrite for replacing the underscores with hyphens in our .htaccess file. Then once the new URLs are indexed, change the webpage file names themselves by replacing the underscores with hyphens, make the newly named files live and remove this rewrite from our .htaccess. Is my understanding correct?
Again...thanks for all of your help!
-
Well, I thought that's what you were going to do and use rewrite just for deleting file extensions. Honestly, I'd leave file extensions and rename files to hyphens. This way there is no server processing involved.
-
Another question just popped into my head...
Once our new website directory structure and URL format has been rewritten, redirected and indexed by search engines, would it make sense to edit the actual webpage file names (replacing the underscores w/ hyphens) and then remove the URL rewrite that replaces the underscores with the hyphens? Or is this not recommended?
-
Thanks for the help Dmitrii!
Both the rewrite I posted above and yours for removing file extensions failed to work. However, it seems this one does the trick (taken from the Apache help forums).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+).htm [NC,OR]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+).php [NC]
RewriteRule ^ %1 [R,L] -
Yes, I believe so, that's the only rewrite you'd need not to mess up rankings.
I don't know if one of codes is better than another. All I know that my piece of code is working and i haven't used the one you wrote. It seems ok to me, but just test it. If it works, I don't think there is any difference.
-
Hey Dmitrii,
This rewrite that I posted above...
RewriteRule ^old/(.*)$ /new/$1 [L,R=301]
...isn't intended to remove the file extensions. I'm using it to redirect the old directory structure to our new directory structure.
I was asking if using this rewrite when changing my directory structure will be all I need in regards to having all the necessary redirects in place to not negatively affect our SEO/SERP rankings. Any idea?
Also, would you recommend the rewrite you provided above over the one below when removing file extensions?
RewriteBase /
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.*)$ $1.htmlLet me know if I'm being clear enough
Thanks!
-
the rule you wrote wont work.
What it will do is redirect this: _domain.com/old/small_blue_widget.htm _to this: domain.com/new/small_blue_widget.htm
To remove the extension would be:
<code>RewriteRule ^([^\.]+)$ $1.htm [NC,L]</code>
-
Thanks for the response Dmitrii!
Thanks for for confirming that I don't need to update the webpage file names.
Do you know if redirecting the old directories to the new ones (using the the rewrite below) is all I need to do regarding redirects? In other words, when redirecting directories using the rewrite below is there any need to redirect the old URL format (small_blue_widget.htm) to the new (small-blue-widget)? My understanding is no, all I need to do is redirect the directories; but please share your knowledge.Thanks in advance!
<code>RewriteRule ^old/(.*)$ /new/$1 [L,R=301]</code>
-
Hi there.
Well, as for best practices - you got it covered - remove/substitute underscores, remove redundant directories, make urls readable and understandable by users, implement redirects for pages, which are being renamed.
As for removing extensions from files - i'm not sure it has any effect on SEO or user experience at all. But no, you don't have to create new format pages. Basically what mod_rewrite does is when somebody requests a page, server says "I gonna server you this file with this name, because you sent me this specific request". Just be aware that there is no way to access both original url and rewritten url at the same time, since it would create duplicate issues.
As for rankings affect - as long as all redirects are done properly and urls are targeting the keywords on the page - you should be fine.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Help with facet URLs in Magento
Hi Guys, Wondering if I can get some technical help here... We have our site britishbraces.co.uk , built in Magento. As per eCommerce sites, we have paginated pages throughout. These have rel=next/prev implemented but not correctly ( as it is not in is it in ) - this fix is in process. Our canonicals are currently incorrect as far as I believe, as even when content is filtered, the canonical takes you back to the first page URL. For example, http://www.britishbraces.co.uk/braces/x-style.html?ajaxcatalog=true&brand=380&max=51.19&min=31.19 Canonical to... http://www.britishbraces.co.uk/braces/x-style.html Which I understand to be incorrect. As I want the coloured filtered pages to be indexed ( due to search volume for colour related queries ), but I don't want the price filtered pages to be indexed - I am unsure how to implement the solution? As I understand, because rel=next/prev implemented ( with no View All page ), the rel=canonical is not necessary as Google understands page 1 is the first page in the series. Therefore, once a user has filtered by colour, there should then be a canonical pointing to the coloured filter URL? ( e.g. /product/black ) But when a user filters by price, there should be noindex on those URLs ? Or can this be blocked in robots.txt prior? My head is a little confused here and I know we have an issue because our amount of indexed pages is increasing day by day but to no solution of the facet urls. Can anybody help - apologies in advance if I have confused the matter. Thanks
Intermediate & Advanced SEO | | HappyJackJr0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
What are the best practices for geo-targeting by sub-folders?
My domain is currently targeting the US, but I'm building out sub-folders that will need to geo-target France, England, and Spain. Each country will have it's own sub-folder, and professionally translated (domain.com/france). Other than the hreflang tags, what are other best practices I can implement? Can Google Webmaster tools geo-target by subfolder? Any suggestions would be appreciated. Thanks Justin
Intermediate & Advanced SEO | | Rhythm_Agency0 -
Should I include URLs that are 301'd or only include 200 status URLs in my sitemap.xml?
I'm not sure if I should be including old URLs (content) that are being redirected (301) to new URLs (content) in my sitemap.xml. Does anyone know if it is best to include or leave out 301ed URLs in a xml sitemap?
Intermediate & Advanced SEO | | Jonathan.Smith0 -
Attack of the dummy urls -- what to do?
It occurs to me that a malicious program could set up thousands of links to dummy pages on a website: www.mysite.com/dynamicpage/dummy123 www.mysite.com/dynamicpage/dummy456 etc.. How is this normally handled? Does a developer have to look at all the parameters to see if they are valid and if not, automatically create a 301 redirect or 404 not found? This requires a table lookup of acceptable url parameters for all new visitors. I was thinking that bad url names would be rare so it would be ok to just stop the program with a message, until I realized someone could intentionally set up links to non existent pages on a site.
Intermediate & Advanced SEO | | friendoffood1 -
How to deal with URLs and tabbed content
Hi All, We're currently redesigning a website for a new home developer and we're trying to figure out the best way to deal with tabbed content in the URL structure. The design of the site at the moment will have a page for a development and within that you can select your house type, then when on the house type page there will be tabs displayed for the user to see things like the plot map, availability and pricing, specifications, etc. The way our development team are looking at handling this is for the URL to use a hashtag or a query string at the end of it so we can still land users on these specific tabs for PPC for example. My question is really, has anyone had any experience with this? Any recommendations on how to best display the urls for SEO? Thanks
Intermediate & Advanced SEO | | J_Sinclair0 -
Weird 404 URL Problem - domain name being placed at end of urls
Hey there. For some reason when doing crawl tests I'm finding pages with the domain name being tacked on the end and causing 404 errors.
Intermediate & Advanced SEO | | Jay328
For example: http://domainname.com/page-name/http://domainname.com This is happening to all pages, posts and even category type 1. Site is in Wordpress
2. Using Yoast SEO plugin Any suggestions? Thanks!0 -
E-commerce site, one product multiple categories best practice
Hi there, We have an e-commerce shopping site with over 8000 products and over 100 categories. Some sub categories belong to multiple categories - for example, A Christmas trees can be under "Gardening > Plants > Trees" and under "Gifts > Holidays > Christmas > Trees" The product itself (example: Scandinavian Xmas Tree) can naturally belong to both these categories as well. Naturally these two (or more) categories have different breadcrumbs, different navigation bars, etc. From an SEO point of view, to avoid duplicate content issues, I see the following options: Use the same URL and change the content of the page (breadcrumbs and menus) based on the referral path. Kind of cloaking. Use the same URL and display only one "main" version of breadcrumbs and menus. Possibly add the other "not main" categories as links to the category / product page. Use a different URL based on where we came from and do nothing (will create essentially the same content on different urls except breadcrumbs and menus - there's a possibiliy to change the category text and page title as well) Use a different URL based on where we came from with different menus and breadcrumbs and use rel=canonical that points to the "main" category / product pages This is a very interesting issue and I would love to hear what you guys think as we are finalizing plans for a new website and would like to get the most out of it. Thank you all!
Intermediate & Advanced SEO | | arikbar0