Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How preproduction website is getting indexed in Google.
-
Hi team,
Can anybody please help me to find how my preproduction website and urls are getting indexed in Google.
-
As Eric hinted, the best method to prevent any pages being indexed would be to use htaccess password protection dialog on your development site. It's fairly easy to implement. You can find instructions to do so here: http://www.htaccesstools.com/articles/password-protection/
-
Hi Anoop! Have everyone's answers helped? Do you still have any questions?
-
Anoop, when a 'development' or 'preproduction' website or subdomain is getting indexed, that means that you haven't stopped the search engines from crawling it. The search engines, especially Google, are very aggressive at crawling, and they will crawl just about any URL that they find. It seems as though all you have to do is visit that page and it's going to get crawled.
Best way to stop Google from crawling (then indexing) a website is to stop it from getting crawled using the robots.txt file. Keep in mind, though, that even if you tell them to stay out of it using the robots.txt file they will still index those URLs.
The only way to stop Google from crawling would be to password protect the website or make it available only on a private server, or available via VPN only.
-
In addition to noindexing the pages using the meta tag, if you have WMT / Search Console set up, you can request Google remove those URLs from their index for the time being. I've found that this may take up to a couple of hours from the removal request to the time of actual removal.
As to how they were found, there's a good chance that Google crawled a link to a preproduction webpage and went from there.
-
Hi
To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the section of your page:
To prevent only Google web crawlers from indexing a page:
You should be aware that some search engine web crawlers might interpret the
noindex
directive differently. As a result, it is possible that your page might still appear in results from other search engines.here is complete guide: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?csw=1
-
Hi,
Have you noindexed & nofollowed the site and pages? I would also suggest you block all crawlers by disallowing access in the robots.txt file.
Do you know if this has all been done?
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My WP website got attack by malware & now my website site:www.example.ca shows about 43000 indexed page in google.
Hi All My wordpress website got attack by malware last week. It affected my index page in google badly. my typical site:example.ca shows about 130 indexed pages on google. Now it shows about 43000 indexed pages. I had my server company tech support scan my site and clean the malware yesterday. But it still shows the same number of indexed page on google.
Technical SEO | | ChophelDoes anybody had ever experience such situation and how did you fixed it. Looking for help. Thanks FILE HIT LIST:
{YARA}Spam_PHP_WPVCD_ContentInjection : /home/example/public_html/wp-includes/wp-tmp.php
{YARA}Backdoor_PHP_WPVCD_Deployer : /home/example/public_html/wp-includes/wp-vcd.php
{YARA}Backdoor_PHP_WPVCD_Deployer : /home/example/public_html/wp-content/themes/oceanwp.zip
{YARA}webshell_webshell_cnseay02_1 : /home/example2/public_html/content.php
{YARA}eval_post : /home/example2/public_html/wp-includes/63292236.php
{YARA}webshell_webshell_cnseay02_1 : /home/example3/public_html/content.php
{YARA}eval_post : /home/example4/public_html/wp-admin/28855846.php
{HEX}php.generic.malware.442 : /home/example5/public_html/wp-22.php
{HEX}php.generic.cav7.421 : /home/example5/public_html/SEUN.php
{HEX}php.generic.malware.442 : /home/example5/public_html/Webhook.php0 -
How long does Google takes to re-index title tags?
Hi, We have carried out changes in our website title tags. However, when I search for these pages on Google, I still see the old title tags in the search results. Is there any way to speed this process up? Thanks
Technical SEO | | Kilgray0 -
Site indexed by Google, but (almost) never gets impressions
Hi there, I have a question that I wasn't able to give it a reasonable answer yet, so I'm going to trust on all of you. Basically a site has all its pages indexed by Google (I verified with site:sitename.com) and it also has great and unique content. All on-page grades are A with absolutely no negative factors at all. However its pages do not get impressions almost at all. Of course I didn't expect it to be on page 1 since it has been launched on Dec, 1st, but it looks like Google is ignoring (or giving it bad scores) for some reason. Only things that can contribute to that could be: domain privacy on the domain, redirect from the www to the subdomain we use (we did this because it will be a multi-language site, so we'll assign to each country a subdomain), recency (it has been put online on Dec 1st and the domain is just a couple of months old). Or maybe because we blocked crawlers for a few days before the launch? Exactly a few days before Dec 1st. What do you think? What could be the reason for that? Thanks guys!
Technical SEO | | ruggero0 -
How to fix Google index after fixing site infected with malware.
Hi All Upgraded a Joomla site for a customer a couple of months ago that was infected with malware (it wasn't flagged as infected by google). Site is fine now but still noticing search queries for "cheap adobe" etc with links to http://domain.com/index.php?vc=201&Cheap_Adobe_Acrobat_xi in web master tools (about 50 in total). These url's redirect back to home page and seem to be remaining in the index (I think Joomla is doing this automatically) Firstly, what sort of effect would these be having on on their rankings? Would they be seen by google as duplicate content for the homepage (moz doesn't report them as such as there are no internal links). Secondly what's my best plan of attack to fix them. Should I setup 404's for them and then submit them to google? Will resubmitting the site to the index fix things? Would appreciate any advice or suggestions on the ramifications of this and how I should fix it. Regards, Ian
Technical SEO | | iragless0 -
Why is my blog disappearing from Google index?
My Google blogger blog is about 10 months old. In that time i have worked really hard with adding unique content, building relationships with other bloggers in the same niche, and done some inbound marketing. 2 weeks ago I updated the template to something cleaner, with a little more "wordpress" feel to it. This means i've messed about with the code a lot in these weeks, adding social buttons etc. The problem is that from some point late last week thurs/fri my pages started disappearing from Googles index. I have checked webmaster tools and have no manual actions. My link profile is pretty clean as its a new site, and i have manually checked every piece of content published for plagiarism etc. So what is going on? Did i break my blog? Or is something else amiss? Impressions are down 96% comparing Nov 1-5th to previous 5 days. site is here: http://bit.ly/174beVm Thanks for any help in advance.
Technical SEO | | Silkstream0 -
How to Stop Google from Indexing Old Pages
We moved from a .php site to a java site on April 10th. It's almost 2 months later and Google continues to crawl old pages that no longer exist (225,430 Not Found Errors to be exact). These pages no longer exist on the site and there are no internal or external links pointing to these pages. Google has crawled the site since the go live, but continues to try and crawl these pages. What are my next steps?
Technical SEO | | rhoadesjohn0 -
How to stop my webmail pages not to be indexed on Google ??
when i did a search in google for Site:mywebsite.com , for a list of pages indexed. Surprisingly the following come up " Webmail - Login " Although this is associated with the domain , this is a completely different server , this the rackspace email server browser interface I am sure that there is nothing on the website that links or points to this.
Technical SEO | | UIPL
So why is Google indexing it ? & how do I get it out of there. I tried in webmaster tool but I could not , as it seems like a sub-domain. Any ideas ? Thanks Naresh Sadasivan0 -
Google is indexing my directories
I'm sure this has been asked before, but I was looking at all of Google's results for my site and I found dozens of results for directories such as: Index of /scouting/blog/wp-includes/js/swfupload/plugins Obviously I don't want those indexed. How do I prevent Google from indexing those? Also, it only seems to be doing it with Wordpress, not any of the directories on my main site. (We have a wordpress blog, which is only a portion of the site)
Technical SEO | | UnderRugSwept0