• BBgmoro

        See all notifications

        Skip to content
        Moz logo Menu open Menu close
        • Products
          • Moz Pro
          • Moz Pro Home
          • Moz Local
          • Moz Local Home
          • STAT
          • Moz API
          • Moz API Home
          • Compare SEO Products
          • Moz Data
        • Free SEO Tools
          • Domain Analysis
          • Keyword Explorer
          • Link Explorer
          • Competitive Research
          • MozBar
          • More Free SEO Tools
        • Learn SEO
          • Beginner's Guide to SEO
          • SEO Learning Center
          • Moz Academy
          • MozCon
          • Webinars, Whitepapers, & Guides
        • Blog
        • Why Moz
          • Digital Marketers
          • Agency Solutions
          • Enterprise Solutions
          • Small Business Solutions
          • The Moz Story
          • New Releases
        • Log in
        • Log out
        • Products
          • Moz Pro

            Your all-in-one suite of SEO essentials.

          • Moz Local

            Raise your local SEO visibility with complete local SEO management.

          • STAT

            SERP tracking and analytics for enterprise SEO experts.

          • Moz API

            Power your SEO with our index of over 44 trillion links.

          • Compare SEO Products

            See which Moz SEO solution best meets your business needs.

          • Moz Data

            Power your SEO strategy & AI models with custom data solutions.

          Turn SEO data into actionable content briefs

          Turn SEO data into actionable content briefs

          Learn more
        • Free SEO Tools
          • Domain Analysis

            Get top competitive SEO metrics like DA, top pages and more.

          • Keyword Explorer

            Find traffic-driving keywords with our 1.25 billion+ keyword index.

          • Link Explorer

            Explore over 40 trillion links for powerful backlink data.

          • Competitive Research

            Uncover valuable insights on your organic search competitors.

          • MozBar

            See top SEO metrics for free as you browse the web.

          • More Free SEO Tools

            Explore all the free SEO tools Moz has to offer.

          Let your business shine with Listings AI

          Let your business shine with Listings AI

          Get found
        • Learn SEO
          • Beginner's Guide to SEO

            The #1 most popular introduction to SEO, trusted by millions.

          • SEO Learning Center

            Broaden your knowledge with SEO resources for all skill levels.

          • On-Demand Webinars

            Learn modern SEO best practices from industry experts.

          • How-To Guides

            Step-by-step guides to search success from the authority on SEO.

          • Moz Academy

            Upskill and get certified with on-demand courses & certifications.

          • MozCon

            Save on Early Bird tickets and join us in London or New York City

          Access 20 years of data with flexible pricing
          Moz API

          Access 20 years of data with flexible pricing

          Find your plan
        • Blog
        • Why Moz
          • Digital Marketers

            Simplify SEO tasks to save time and grow your traffic.

          • Small Business Solutions

            Uncover insights to make smarter marketing decisions in less time.

          • Agency Solutions

            Earn & keep valuable clients with unparalleled data & insights.

          • Enterprise Solutions

            Gain a competitive edge in the ever-changing world of search.

          • The Moz Story

            Moz was the first & remains the most trusted SEO company.

          • New Releases

            Get the scoop on the latest and greatest from Moz.

          Surface actionable competitive intel
          New Feature

          Surface actionable competitive intel

          Learn More
        • Log in
          • Moz Pro
          • Moz Local
          • Moz Local Dashboard
          • Moz API
          • Moz API Dashboard
          • Moz Academy
        • Avatar
          • Moz Home
          • Notifications
          • Account & Billing
          • Manage Users
          • Community Profile
          • My Q&A
          • My Videos
          • Log Out

        The Moz Q&A Forum

        • Forum
        • Questions
        • My Q&A
        • Users
        • Ask the Community

        Welcome to the Q&A Forum

        Browse the forum for helpful insights and fresh discussions about all things SEO.

        1. Home
        2. SEO Tactics
        3. Intermediate & Advanced SEO
        4. How is Google crawling and indexing this directory listing?

        Moz Q&A is closed.

        After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

        How is Google crawling and indexing this directory listing?

        Intermediate & Advanced SEO
        4
        7
        243885
        Loading More Posts
        • Watching

          Notify me of new replies.
          Show question in unread.

        • Not Watching

          Do not notify me of new replies.
          Show question in unread if category is not ignored.

        • Ignoring

          Do not notify me of new replies.
          Do not show question in unread.

        • Oldest to Newest
        • Newest to Oldest
        • Most Votes
        Reply
        • Reply as question
        Locked
        This topic has been deleted. Only users with question management privileges can see it.
        • danatanseo
          danatanseo last edited by

          We have three Directory Listing pages that are being indexed by Google:

          http://www.ccisolutions.com/StoreFront/jsp/

          http://www.ccisolutions.com/StoreFront/jsp/html/

          http://www.ccisolutions.com/StoreFront/jsp/pdf/

          How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why.

          If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site?

          Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content.

          For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page:

          http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML

          This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff

          As you can see, this results in duplicate content problems.

          Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result?

          For example:

          Disallow: /StoreFront/jsp/

          Disallow: /StoreFront/jsp/html/

          Disallow: /StoreFront/jsp/pdf/

          Can we do this without risking blocking Googlebot from content we do want crawled and indexed?

          Many thanks in advance for any and all help on this one!

          1 Reply Last reply Reply Quote 0
          • danatanseo
            danatanseo last edited by

            Thanks so much to you all. This has gotten us closer to an answer. We are consulting with the folks who developed the Web store to make sure that these solutions won't break other things if implemented, particularly something mentioned to me by our IT Director called "Sim links" - I'll keep you posted!

            1 Reply Last reply Reply Quote 1
            • StreamlineMetrics
              StreamlineMetrics @danatanseo last edited by

              I am referring to Web users. If a user or search engine tried to view those directory listing pages, they will get a Forbidden message, which is what you want to happen. The content in those directories will still be accessible by the pages on the site since the files still exist in those directories, but the pages listing the files in those directories won't be accessible in the browser to users/search engines. In other words, turning off the Directory indexes will not affect any of the content on the site.

              1 Reply Last reply Reply Quote 1
              • john4math
                john4math @StreamlineMetrics last edited by

                He's got the right idea, you shouldn't be serving these pages (unless you have a specific reason to).  The problem is these index pages are returning with a status code of 200 OK, so Google assumes it's fine to index them.  These pages should either come back with a 404 or a 403 (forbidden), and users then wouldn't be able to browse your site with these directory pages.

                Disallowing in robots.txt may not immediately remove these from search results, you may get that lovely description underneath the results that says, "A description for this result is not available because of this site's robots.txt".

                1 Reply Last reply Reply Quote 1
                • danatanseo
                  danatanseo last edited by

                  Thanks much to you both for jumping in. (thumbs up!)

                  Streamline, I understand your suggestion regarding .htaccess, however, as I mentioned, the content in these directories is being used to populate content on our pages. In your response you mentioned that users/search engines wouldn't be able to access them. When you say "users," are you referring to Web visitors, and not site admins?

                  StreamlineMetrics 1 Reply Last reply Reply Quote 0
                  • StreamlineMetrics
                    StreamlineMetrics last edited by

                    There's numerous ways Google could have found those pages and added them to the index, but there's really no way to determine exactly what caused it in the first place. All it takes is for one visit by Google for a page to be crawled and indexed.

                    If you don't want these pages indexed, then blocking those directories/pages in robots.txt would not be the solution because you would prevent Google from accessing those pages at all going forward. But the problem is that these pages are already in Google's index and by simply using the robots.txt file, you are just telling Google not to visit those pages from now on and thus your pages will remain in the index. A better solution would be to add the no-index, no-cache tags to those pages so the next time Google accesses those pages, they will know to remove those pages from the index.

                    And now that I've read through your post again, I am now realizing you are talking about file directories rather than normal webpages. What I've wrote above mainly still applies, but I think the quick and easy fix would be to turn off Directory Indexes all together (unless you need them for some reason?). All you have to do is add the following code to your .htaccess file -

                    Options -Indexes

                    This will turn off these directory listings so users/search engines can't access them and they should eventually fall out of the Google index.

                    john4math 1 Reply Last reply Reply Quote 2
                    • FedeEinhorn
                      FedeEinhorn last edited by

                      You can use robots to disallow google from even crawling those pages, while the meta noindex still allows the crawling but prevents the indexing of those pages.

                      If you have any sensitive data that you don't want Google to read, then go ahead and use the robots directives you wrote above. However, if you just want them deindexed I'll suggest to go with the meta noindex, as it will allow other pages (linked) to be indexed but leave that particular page out.

                      1 Reply Last reply Reply Quote 1
                      • 1 / 1
                      • First post
                        Last post

                      Browse Questions

                      Explore more categories

                      • Moz Tools

                        Chat with the community about the Moz tools.

                      • SEO Tactics

                        Discuss the SEO process with fellow marketers

                      • Community

                        Discuss industry events, jobs, and news!

                      • Digital Marketing

                        Chat about tactics outside of SEO

                      • Research & Trends

                        Dive into research and trends in the search industry.

                      • Support

                        Connect on product support and feature requests.

                      • See all categories

                      Related Questions

                      • seoelevated

                        Old brand name being suffixed on Google SERP listings

                        At the end of some of our listings in Google search results pages, our old brand name is being suffixed even though it is not in our title tags. For context, we re-branded several months ago, and at that time also migrated to a new domain name. Our title tags have our current brand name suffixed, like "Shop Example Category | Example©". In the Google search results, but not in Bing nor Yahoo, about half of our pages have titles whcih instead look like this: "Shop Example Category | Example© - oldBrandName". The "dash" and the old brand name are not in our title tags, but they are being appended, even when our title tags are fairly long. For example, even with titles at 54 characters (421 pixels), the suffix is being appended. BUT, not with our longer title tags. We are actually OK with the brand name being appended if our title tags are on the shorter side, but would prefer that our current brand name be appended instead of the older one. I realize we could increase the length of all our title tags, and perhaps we may go that route. But, does anyone know where Google would be getting the old brand name to append onto the URLs? We've checked and it is not in our page source (the old brand name is used in our page source in some areas of text and some url paths, but not in any kind of meta tag). Per Google's guidance (https://www.searchenginejournal.com/google-do-not-put-organization-schema-markup-on-every-page/289981/) we only have schema for the "Organization" on our home page, and not on every page. So, assuming this advice is correct to not add schema to every page, how can we inform Google of our current brand name so that it stops appending our old brand name on pages?

                        Intermediate & Advanced SEO | | seoelevated
                        0
                      • LindsayE

                        Can you index a Google doc?

                        We have updated and added completely new content to our state pages. Our old state content is sitting in a our Google drive. Can I make these public to get them indexed and provide a link back to our state pages? In theory it sounds like a great link building strategy... TIA!

                        Intermediate & Advanced SEO | | LindsayE
                        1
                      • RosemaryB

                        Is possible to submit a XML sitemap to Google without using Google Search Console?

                        We have a client that will not grant us access to their Google Search Console (don't ask us why). Is there anyway possible to submit a XML sitemap to Google without using GSC? Thanks

                        Intermediate & Advanced SEO | | RosemaryB
                        0
                      • kchandler

                        Proper 301 in Place but Old Site Still Indexed In Google

                        So i have stumbled across an interesting issue with a new SEO client. They just recently launched a new website and implemented a proper 301 redirect strategy at the page level for the new website domain. What is interesting is that the new website is now indexed in Google BUT the old website domain is also still indexed in Google? I even checked the Google Cached date and it shows the new website with a cache date of today. The redirect strategy has been in place for about 30 days. Any thoughts or suggestions on how to get the old domain un-indexed in Google and get all authority passed to the new website?

                        Intermediate & Advanced SEO | | kchandler
                        0
                      • iam-sold

                        Best way to remove full demo (staging server) website from Google index

                        I've recently taken over an in-house role at a property auction company, they have a main site on the top-level domain (TLD) and 400+ agency sub domains! company.com agency1.company.com agency2.company.com... I recently found that the web development team have a demo domain per site, which is found on a subdomain of the original domain - mirroring the site. The problem is that they have all been found and indexed by Google: demo.company.com demo.agency1.company.com demo.agency2.company.com... Obviously this is a problem as it is duplicate content and so on, so my question is... what is the best way to remove the demo domain / sub domains from Google's index? We are taking action to add a noindex tag into the header (of all pages) on the individual domains but this isn't going to get it removed any time soon! Or is it? I was also going to add a robots.txt file into the root of each domain, just as a precaution! Within this file I had intended to disallow all. The final course of action (which I'm holding off in the hope someone comes up with a better solution) is to add each demo domain / sub domain into Google Webmaster and remove the URLs individually. Or would it be better to go down the canonical route?

                        Intermediate & Advanced SEO | | iam-sold
                        0
                      • desmond.liang

                        Our login pages are being indexed by Google - How do you remove them?

                        Each of our login pages show up under different subdomains of our website. Currently these are accessible by Google which is a huge competitive advantage for our competitors looking for our client list. We've done a few things to try to rectify the problem: -  No index/archive to each login page Robot.txt to all subdomains to block search engines gone into webmaster tools and added the subdomain of one of our bigger clients then requested to remove it from Google (This would be great to do for every subdomain but we have a LOT of clients and it would require tons of backend work to make this happen.) Other than the last option, is there something we can do that will remove subdomains from being viewed from search engines? We know the robots.txt are working since the message on search results say: "A description for this result is not available because of this site's robots.txt – learn more." But we'd like the whole link to disappear.. Any suggestions?

                        Intermediate & Advanced SEO | | desmond.liang
                        1
                      • CommercePundit

                        How to stop Google crawling after 301 redirect?

                        I have removed all pages from my old website and set 301 redirect to new website. But, I have verified old website with Google webmaster tools' HTML verification file which enable me to track all data and existence of pages in Google search for my old website. I was assumed that, Google will stop crawling and DE-indexed all pages after 301 redirect. Because, I have set 301 redirect before 3 months. Now, I'm able to see Google bot activity on my website with help of Google webmaster tools. You can find out attachment to know more about it. How can it possible & How Google can crawl removed pages? You can see following image to know more about it. First & Second

                        Intermediate & Advanced SEO | | CommercePundit
                        0
                      • footsteps

                        How to prevent Google from crawling our product filter?

                        Hi All, We have a crawler problem on one of our sites www.sneakerskoopjeonline.nl. On this site, visitors can specify criteria to filter available products. These filters are passed as http/get arguments. The number of possible filter urls is virtually limitless. In order to prevent duplicate content, or an insane amount of pages in the search indices, our software automatically adds noindex, nofollow and noarchive directives to these filter result pages. However, we’re unable to explain to crawlers (Google in particular) to ignore these urls. We’ve already changed the on page filter html to javascript, hoping this would cause the crawler to ignore it. However, it seems that Googlebot executes the javascript and crawls the generated urls anyway. What can we do to prevent Google from crawling all the filter options? Thanks in advance for the help. Kind regards, Gerwin

                        Intermediate & Advanced SEO | | footsteps
                        0

                      Get started with Moz Pro!

                      Unlock the power of advanced SEO tools and data-driven insights.

                      Start my free trial
                      Products
                      • Moz Pro
                      • Moz Local
                      • Moz API
                      • Moz Data
                      • STAT
                      • Product Updates
                      Moz Solutions
                      • SMB Solutions
                      • Agency Solutions
                      • Enterprise Solutions
                      • Digital Marketers
                      Free SEO Tools
                      • Domain Authority Checker
                      • Link Explorer
                      • Keyword Explorer
                      • Competitive Research
                      • Brand Authority Checker
                      • Local Citation Checker
                      • MozBar Extension
                      • MozCast
                      Resources
                      • Blog
                      • SEO Learning Center
                      • Help Hub
                      • Beginner's Guide to SEO
                      • How-to Guides
                      • Moz Academy
                      • API Docs
                      About Moz
                      • About
                      • Team
                      • Careers
                      • Contact
                      Why Moz
                      • Case Studies
                      • Testimonials
                      Get Involved
                      • Become an Affiliate
                      • MozCon
                      • Webinars
                      • Practical Marketer Series
                      • MozPod
                      Connect with us

                      Contact the Help team

                      Join our newsletter
                      Moz logo
                      © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
                      • Accessibility
                      • Terms of Use
                      • Privacy

                      Looks like your connection to Moz was lost, please wait while we try to reconnect.