Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Site structure: Any issues with 404'd parent folders?
-
Is there any issue with a 404'd parent folder in a URL? There are no links to the parent folder, and a parent folder page never existed. For example, say I have the following pages w/ content:
/famous-dogs/lassie/
/famous-dogs/snoopy/
/famous-dogs/scooby-doo/
But I never created (and maybe never plan to create) a general **/famous-dogs/** page. Sitemaps.xml does not link to it, nor does any page on my site.
Are there any concerns with doing this? Am I missing out on any sort of value that might pass to a parent folder?
-
Yeah - there is various speculation about how signals or authority traverse folder structures (see, for example, this Whiteboard Friday), but I haven't seen anything suggesting any lost value is permanent. All of this may be an argument for adding /famous-dogs/ at some point, but I wouldn't personally stress about it not being there at launch.
-
Yeah. I'd just leave it as a 404 in that case
-
In my scenario, considering I might add a parent "famous dogs" page at some point, it'd probably be best to leave robots.txt alone, right?
-
Thanks for the response. This is what I expected.
I swear I read somewhere that Google may pass some form of value from a child to a parent. i.e. "/famous-dogs/lassie/" could pass some value to "/famous-dogs/", absent any links. Can't find the source, but I suppose I'm a bit worried that I'd permanently lose out on some value if the parent does not exist initially. Considering I may add a "famous dogs" parent page at some point.
-
PS - if you're worried about the crawling, you could always block it in robots.txt if you really wanted (but unless it's a huge site I wouldn't bother). Note - if you do go this route, do it carefully so as not to block all contents of the folder at the same time!
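To illustrate the "do it carefully" point: a plain `Disallow: /famous-dogs/` is a prefix rule, so it would also block /famous-dogs/lassie/ and the other child pages. RFC 9309 and Google's robots.txt documentation describe a trailing `$` that anchors a rule to the end of the URL, so `Disallow: /famous-dogs/$` should cover only the bare folder for crawlers that honor it. The sketch below is a simplified matcher written for illustration (the function name `rule_matches` is made up here, and this is not any crawler's actual implementation); it just shows the difference between the two rules.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Simplified sketch of the matching described in RFC 9309: patterns
    match from the start of the path, '*' matches any run of characters,
    and a trailing '$' anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


paths = ["/famous-dogs/", "/famous-dogs/lassie/", "/famous-dogs/snoopy/"]

# Plain prefix rule: matches the folder and everything under it.
broad = [p for p in paths if rule_matches("/famous-dogs/", p)]

# End-anchored rule: matches only the bare folder URL.
narrow = [p for p in paths if rule_matches("/famous-dogs/$", p)]

print("Disallow: /famous-dogs/  blocks", broad)
print("Disallow: /famous-dogs/$ blocks", narrow)
```

Not every crawler is guaranteed to support `$`, so it's worth checking the result in a robots.txt testing tool before relying on it.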
-
The short answer is that there should be no harm going with your proposed approach.
Longer version: I believe there are cases where Google has tried to crawl a directory like "/famous-dogs/" in your example purely because it appears as a sub-folder in the paths of other pages even though there are not any direct links to it. But even if it does crawl it, if you don't have or intend to have a page there, a 404 is a perfectly valid response.
In general, while there could be a case that it's worth creating a "/famous-dogs/" page if there is search demand you can fulfil, until or unless you do, there is no harm in it returning a 404 response.
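If you want to see which folder URLs a crawler could infer from your existing paths, here is a minimal sketch. It assumes you already have a list of your page URLs (for example, from your sitemap); the function name `implied_parent_folders` is just an illustration, and it says nothing about whether Google will actually request those URLs.

```python
from urllib.parse import urlsplit


def implied_parent_folders(urls):
    """Return folder paths that appear inside the given URLs' paths
    but are not themselves in the list."""
    known = {urlsplit(u).path for u in urls}
    implied = set()
    for path in known:
        segments = [s for s in path.split("/") if s]
        # Walk every ancestor folder, skipping the page itself and the root.
        for depth in range(1, len(segments)):
            implied.add("/" + "/".join(segments[:depth]) + "/")
    return sorted(implied - known)


# The example pages from the question.
pages = [
    "/famous-dogs/lassie/",
    "/famous-dogs/snoopy/",
    "/famous-dogs/scooby-doo/",
]
missing = implied_parent_folders(pages)
print(missing)
```

Any path this turns up is a candidate for either a real page later on or a deliberate 404, which is fine in the meantime.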
-
Seems odd that indexers would care if a parent directory page exists or not. Is there any proof that Google will attempt to crawl parent folder pages that aren't in sitemaps.xml and aren't linked to anywhere else?
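One way to look for evidence on your own site is to check the server access logs for requests to the parent folder. The sketch below assumes logs in the common "combined" format and uses made-up sample lines; the name `googlebot_hits` is hypothetical. It only matches on the user-agent string, which can be spoofed, so treat any hits as a hint rather than proof.

```python
import re

# Pulls the request path, status code and user agent out of a
# combined-format access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines, path):
    """Return the status codes served to requests for `path` whose
    user agent mentions Googlebot."""
    hits = []
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("path") == path and "Googlebot" in m.group("agent"):
            hits.append(int(m.group("status")))
    return hits


# Hypothetical sample lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Jan/2020:13:55:36 +0000] "GET /famous-dogs/ HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Jan/2020:13:56:01 +0000] "GET /famous-dogs/lassie/ '
    'HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]
statuses = googlebot_hits(sample, "/famous-dogs/")
print(statuses)
```

If that list comes back non-empty for a folder with no links and no sitemap entry, the crawler presumably inferred the URL from the child paths.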
Perhaps I'm slowly building out my site. Depending on the material/approach, it might make sense to release a page talking about a sub-category (lassie) before releasing content about a parent category (famous dogs). Or maybe "famous dogs" is such low search volume that it doesn't make sense to spend time creating a parent "famous dogs" page.
If I'm understanding correctly, with the above you're effectively telling me to:
1. Build a parent category page. If I don't plan on investing much time/effort into the parent page, noindex it.
2. Reorganize my site folder structure.
Neither seems like a great option.