Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google is indexing my directories
-
I'm sure this has been asked before, but I was looking at all of Google's results for my site and I found dozens of results for directories such as:
Index of /scouting/blog/wp-includes/js/swfupload/plugins
Obviously I don't want those indexed. How do I prevent Google from indexing those?
Also, it only seems to be doing it with Wordpress, not any of the directories on my main site. (We have a wordpress blog, which is only a portion of the site)
-
Thanks for expanding on that!
a) There are no pages in these folders; they're just directories full of Wordpress config stuff
b) I already have Yoast and have tag pages noindexed. All the directories I'm referring to are similar to the path I showed above.
c) If I request removal of specific folders, will it also remove all the files within the folder?\
Thanks!
-
To elaborate a little on what he said
a) yes block the folders with content you don't want indexed in robots.txt - but that is not enough you need to add the noindex to the pages as well for an extra measure.
b) get the YOAST seo plugin for wordpress installed on your site and configure it to follow,noindex these page. look at other sections you don't want indexed too like tag pages. This plugin is easy to use and will help suggest default settings.
c) yes you can go into WMT to request removal of specific folders and urls and Google will quickly get them out of the index for you instead of waiting for them to age away. Your robots.txt should be in place before doing this to make sure it's permanent.
-
Here are your options:
- Robots.txt - http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93708 - easy option.
- Noindex tags - http://www.mattcutts.com/blog/google-noindex-behavior/
- Removal request - https://support.google.com/webmasters/bin/answer.py?hl=en&answer=59819
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Not Indexing Pages (Wordpress)
Hello, recently I started noticing that google is not indexing our new pages or our new blog posts. We are simply getting a "Discovered - Currently Not Indexed" message on all new pages. When I click "Request Indexing" is takes a few days, but eventually it does get indexed and is on Google. This is very strange, as our website has been around since the late 90's and the quality of the new content is neither duplicate nor "low quality". We started noticing this happening around February. We also do not have many pages - maybe 500 maximum? I have looked at all the obvious answers (allowing for indexing, etc.), but just can't seem to pinpoint a reason why. Has anyone had this happen recently? It is getting very annoying having to manually go in and request indexing for every page and makes me think there may be some underlying issues with the website that should be fixed.
Technical SEO | | Hasanovic1 -
Google not Indexing images on CDN.
My URL is: https://bit.ly/2hWAApQ We have set up a CDN on our own domain: https://bit.ly/2KspW3C We have a main xml sitemap: https://bit.ly/2rd2jEb and https://bit.ly/2JMu7GB is one the sub sitemaps with images listed within. The image sitemap uses the CDN URLs. We verified the CDN subdomain in GWT. The robots.txt does not restrict any of the photos: https://bit.ly/2FAWJjk. Yet, GWT still reports none of our images on the CDN are indexed. I ve followed all the steps and still none of the images are being indexed. My problem seems similar to this ticket https://bit.ly/2FzUnBl but however different because we don't have a separate image sitemap but instead have listed image urls within the sitemaps itself. Can anyone help please? I will promptly respond to any queries. Thanks
Technical SEO | | TNZ
Deepinder0 -
What are best options for website built with navigation drop-down menus in JavaScript, to get those menus indexed by Google?
This concerns f5.com, a large website with navigation menus that drop down when hovered over. The sub nav items (example: “DDoS Protection”) are not cached by Google and therefore do not distribute internal links properly to help those sub-pages rank well. Best option naturally is to change the nav menus from JS to CSS but barring that, is there another option? Will Schema SiteNavigationElement work as an alternate?
Technical SEO | | CarlLarson0 -
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers... Google Search Console: 237 indexed pages Google search using site command: 468 results MOZ site crawl: 1013 unique URLs Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500) Can anyone shed any light on why they differ so much? And where lies the truth?
Technical SEO | | muzzmoz1 -
How to stop google from indexing specific sections of a page?
I'm currently trying to find a way to stop googlebot from indexing specific areas of a page, long ago Yahoo search created this tag class=”robots-nocontent” and I'm trying to see if there is a similar manner for google or if they have adopted the same tag? Any help would be much appreciated.
Technical SEO | | Iamfaramon0 -
Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO? Anyone know? Thanks! ~Brett
Technical SEO | | BBuck0 -
CDN Being Crawled and Indexed by Google
I'm doing a SEO site audit, and I've discovered that the site uses a Content Delivery Network (CDN) that's being crawled and indexed by Google. There are two sub-domains from the CDN that are being crawled and indexed. A small number of organic search visitors have come through these two sub domains. So the CDN based content is out-ranking the root domain, in a small number of cases. It's a huge duplicate content issue (tens of thousands of URLs being crawled) - what's the best way to prevent the crawling and indexing of a CDN like this? Exclude via robots.txt? Additionally, the use of relative canonical tags (instead of absolute) appear to be contributing to this problem as well. As I understand it, these canonical tags are telling the SEs that each sub domain is the "home" of the content/URL. Thanks! Scott
Technical SEO | | Scott-Thomas0 -
Why google index my IP URL
hi guys, a question please. if site:112.65.247.14 , you can see google index our website IP address, this could duplicate with our darwinmarketing.com content pages. i am not quite sure why google index my IP pages while index domain pages, i understand this could because of backlink, internal link and etc, but i don't see obvious issues there, also i have submit request to google team to remove ip address index, but seems no luck. Please do you have any other suggestion on this? i was trying to do change of address setting in Google Webmaster Tools, but didn't allow as it said "Restricted to root level domains only", any ideas? Thank you! boson
Technical SEO | | DarwinChinaSEO0