Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Subdomain Removal in Robots.txt with Conditional Logic??
-
Is there a way to add conditional logic to the robots.txt file so that, when we push from DEV to PRODUCTION, we don't have to remember to either leave robots.txt out of the push or edit it once it goes live?
My specific situation is this:
I have www.website.com, dev.website.com and new.website.com. Somehow Google has indexed dev.website.com and new.website.com, and I'd like them removed from Google's index because they are causing duplicate content.
Should I:
a) Add two new GWT (Google Webmaster Tools) entries for dev.website.com and new.website.com and verify ownership? If I do this, won't the files pushed to LIVE still contain the verification meta code for the DEV version, even though they are now live? (Hope that makes sense.)
b) Write a robots.txt file that specifies "Disallow: dev.website.com/"? Is that possible? I have only seen examples of Disallow with a leading "/".
Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
-
Here's how I dealt with a similar situation in the past.
I put a robots.txt on each of the dev subdomains and on the live domain. The dev subdomains' robots.txt excluded the entire subdomain, and the subdomains were verified in GWT and removed from the index as needed.
I made the live subdomain's robots.txt read-only so it didn't get overwritten. I should have made the dev subdomains' robots.txt read-only as well, since they were sometimes refreshed with the live content: a UGC database would occasionally get copied to a dev subdomain, the live robots.txt would get copied over with it, and the dev subdomain would get indexed.
I set up a code monitor that checks the contents of every robots.txt daily and emails me if anything has changed.
It's not perfect, but I was at least able to catch changes soon after they happened, and I prevented a few.
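For anyone wanting to try the daily check, here is a minimal sketch in Python. It is not the monitor I actually ran; the function names are made up for illustration, and storing yesterday's fingerprints and sending the email (for example with smtplib) are left out.

```python
import hashlib
from urllib.request import urlopen


def fetch_robots(host, timeout=10):
    """Download the robots.txt body for one host (hypothetical helper)."""
    with urlopen(f"http://{host}/robots.txt", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fingerprint(robots_body):
    """Return a SHA-256 hex digest of a robots.txt body."""
    return hashlib.sha256(robots_body.encode("utf-8")).hexdigest()


def find_changes(previous, current):
    """Return the hosts whose fingerprint is new or differs from the stored one.

    Both arguments map a host name to the digest from fingerprint().
    """
    return sorted(
        host for host, digest in current.items() if previous.get(host) != digest
    )
```

A scheduled task could call fetch_robots for www, dev and new each day, compare the result against the saved digests with find_changes, and send an alert whenever the returned list is not empty.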
-
You can't put logic in robots.txt, and subdomains are seen as different sites, so you need a separate robots.txt file for each subdomain, with each dev subdomain blocked in its own file.
You'll also need to add the Google verification code to each subdomain and verify it. Then, in GWMT (Google Webmaster Tools), you can request to have the subdomain removed from Google's index; that's the fastest way.
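To illustrate the per-subdomain files, here is a rough sketch that checks three candidate robots.txt bodies with Python's standard urllib.robotparser. The file contents and the is_allowed helper are assumptions for illustration, and Python's parser is only a stand-in for how a crawler might read the file, not Google's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for dev.website.com and new.website.com: block everything.
DEV_ROBOTS = """User-agent: *
Disallow: /
"""

# Assumed robots.txt for www.website.com: block nothing.
LIVE_ROBOTS = """User-agent: *
Disallow:
"""

# The host-style rule from the question; Disallow values are matched against paths.
HOST_STYLE_ROBOTS = """User-agent: *
Disallow: dev.website.com/
"""


def is_allowed(robots_body, url, agent="Googlebot"):
    """Return True if this robots.txt body lets the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, url)
```

With this parser, is_allowed(DEV_ROBOTS, "http://dev.website.com/page.cfm") is False, while the same URL checked against HOST_STYLE_ROBOTS is still allowed, because a rule that starts with a host name never matches a URL path. That is why each subdomain needs its own file containing "Disallow: /".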