Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Best way to block a search engine from crawling a link?
-
If we have one page on our site that is only linked to by one other page, what is the best way to block crawler access to that page?
I know we could set the link to "nofollow" and that would prevent the crawler from passing any authority, and we can set the page to "noindex" to prevent it from appearing in search results, but what is the best way to prevent the crawler from accessing that one link?
-
Hi there,
I'm assuming you are trying to do PageRank sculpting (or something related), which has become harder in recent years. I'll base my answer on this assumption, so feel free to correct me if this isn't the case.
There are several methods to make a link uncrawlable:
- AJAX - Load the link through an external AJAX call so that it is not in the initial HTML. Googlebot historically did not read such calls, but it now renders JavaScript, so the link is not guaranteed to stay hidden.
- JavaScript - Obfuscate the link with JavaScript that masks it. There are any number of solutions here, including using tags with a title of your URL, which upon clicking, go to that URL. Simple and effective.
- Redirects - I haven't tested this last idea, and it may not work. You might be able to redirect to another page on your website, which is set to not be indexed, and then redirect from there to the intended page through a query string. In theory it should work, but it is obviously not as good as the previous methods I described.
Let me know if you have questions. I'd be glad to help further.
Cheers!
-
Noindex/nofollow should be good enough, but if you want to be sure the page doesn't get indexed, you can also include <meta name="robots" content="NOINDEX, NOFOLLOW"> in the head section of the page to be blocked. You can also exclude the page in your robots.txt file.
You can find a simple robots.txt generator in Google Webmaster Tools if you need to block particular pages or directories. The robots.txt file should be in the root directory of your site and look something like this:
User-agent: *
Disallow: /file-you-want-to-hide.html
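If you want to sanity-check a rule like this before uploading it, here is a minimal sketch using Python's standard urllib.robotparser module. The domain www.example.com and the helper name is_crawl_allowed are placeholders of mine, not anything from the original setup, and this only tells you how a parser that follows the standard rules reads the file, not how any particular search engine will behave.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /file-you-want-to-hide.html
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# www.example.com is a placeholder domain.
hidden = is_crawl_allowed(ROBOTS_TXT, "https://www.example.com/file-you-want-to-hide.html")
other = is_crawl_allowed(ROBOTS_TXT, "https://www.example.com/other-page.html")
print(hidden, other)
```

The first value should come back as blocked and the second as allowed, which confirms the Disallow line only covers the one file.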
You can also request removal of specific URLs in Webmaster Tools if they have already been indexed.
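To confirm the robots meta tag mentioned above is actually present in a page's HTML, something like the following rough sketch could help. It uses only Python's built-in html.parser. The class name RobotsMetaParser, the function robots_directives, and the sample PAGE string are all made up for illustration, and in practice you would feed it the HTML your server really returns.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # Directives are comma-separated and case-insensitive.
        for part in attr_map.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the lowercased robots directives found in the HTML (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Sample page for illustration; replace with the HTML of the page to be blocked.
PAGE = '<html><head><meta name="robots" content="NOINDEX, NOFOLLOW"></head><body></body></html>'
found = robots_directives(PAGE)
print("noindex" in found, "nofollow" in found)
```

Keep in mind that a crawler has to be able to fetch the page in order to see this tag at all.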