Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to find all crawlable links on a particular page?
- 
					
					
					
					
 Hi! This might sound like a newbie question, but I'm trying to find all crawlable links (the ones Googlebot sees) on a particular page of my website. I'm trying to use Screaming Frog, but that gives me all the links on that particular page AND on all subsequent pages in the given sub-directory. What I want is ONLY the crawlable links pointing away from a particular page. What is the best way to go about this? Thanks in advance. 
- 
					
					
					
					
 Thanks for sharing this information Thomas. Appreciate your time and help here. Regards. 
- 
					
					
					
					
 I understand. Yes, what you are referring to is a parameter, namely how far a page is from the home page. Here's some information on a tool I'm using right now: http://www.internetmarketingninjas.com/seo-tools/google-sitemap-generator/
 Here is an HTML file of the results; you can see how far each page is from home on the left-hand side. I suggest you run the tool yourself so you can see the full results.
 Using the IMN Google Site Map Generator
 Links are critically important to webpages, not only for connecting to other, related pages to help end users find the information they want, but also for optimizing the pages for SEO. The Find Broken Links, Redirects & Google Sitemap Generator Free Tool allows webmasters and search engine optimizers to check the status of both external links and internal links on an entire website. The resulting report generated by the Google sitemap generator tool will give webmasters and SEOs insight into the link structure of a website, and identify link redirects and errors, all of which help in planning a link optimization strategy. We always offer the downloadable results and the sitemap generator free for everyone.
 Get started
 To start with the free sitemap generator, type (or paste) the full home page URL of the website you want scanned. Select the number of pages you want to scan (up to 500, up to 1,000, or up to 10,000). Note that the job starts immediately and runs in real time. For larger sites containing numerous pages, the process can take up to 30 minutes to crawl and gather data on 1,000 pages (and longer still for very large sites). You can set the Google sitemap generator tool to send you an email once the crawl is completed and the data report is prepared. The online sitemap generator offers several options and also acts as an XML sitemap generator or an HTML sitemap generator. Note that the results table data of the online sitemap generator is interactive. Most of the data items are linked, either to the URLs referenced or to details about the data. For most cells that contain non-URL data, pause the mouse over the cell to see the full results.
 Results Bar
 When the tool starts, a results bar appears at the top of the page showing the following information:
- Status of the tool (Crawling or Done)
- Number of Internal URLs crawled
- Number of External links found
- Number of Internal HTTP Redirects found
- Number of External HTTP Redirects found
- Number of Internal HTTP error codes found
- Number of External HTTP error codes found
 For those who need sitemaps provided by either an HTML sitemap generator or an XML sitemap generator, there are corresponding options offered here. Also shown are the following:
- Download XML Sitemap button
- Download tool results in Excel format
- Download tool results in HTML format
 Lastly, if you love the free sitemap generator tool, you can tell the world by clicking any of the following social media buttons:
- Facebook Like
- Google+
 
 Email notification
 Next, you can submit your email address to have a copy of the report emailed to you if you choose not to wait for it to finish crawling. We offer this feature as well as the sitemap generator free to all users.
 Tool results data
 When results are ready, the HTML sitemap generator will organize the data into six tables:
- Internal links
- External links
- Internal errors (a subset of Internal Links)
- Internal redirects (another subset of Internal Links)
- External errors (a subset of External Links)
- External redirects (another subset of External Links)
 The table data is typically linked to either page URLs or to details about the data. Click on column headers to sort the results. 
 1. Internal Links table
 The Internal links table created by the XML sitemap generator includes the following data fields:
- URLs crawled on the site
- Link to The On Page Optimization Analysis Free SEO Tool for that URL
- URL’s level from the domain root
- URL’s returned HTTP status code
- Number of internal links the URL has within the site (click to see the list of URLs)
- Link text used for the URL
- Number of internal links on the page (click to see the list of URLs)
- Number of external links on the page (click to see the list of URLs)
- Size of the page in kilobytes (click to see page load speed test results for this URL from Google)
- Link to the Check Image Sizes, Alt Text, Header Checks and More Free SEO Tool for that URL
- The title tag text from the URL’s page
- The description tag text from the URL’s page
- The keywords tag text from the URL’s page
- Contents, if used, of the anchor tag’s “rel=” attribute
 
 2. External Links table
 The External links table includes the following data fields:
- URL’s returned HTTP status code
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- External URL used in the link
- Link text used for the URL
- Internal page URL on which the link was first found
 
 3. Internal HTTP code errors table
 The Internal errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:
- URL’s returned HTTP status code
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Internal page URL on which the link was first found
 The Internal errors table is a subset of the Internal links table showing just those pages returning HTTP status code errors. 
 4. Internal HTTP redirects table
 The Internal redirects table combines all of the pages returning HTTP redirects in one list so you can easily review them. You should not have to rely on redirects internally. Instead, you can fix the source code containing the redirected link. This table contains the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
 The Internal redirects table is a subset of the Internal links table showing just those pages returning 301 and 302 HTTP status code redirects. 
 5. External HTTP code errors table
 The External errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- Internal URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
 The External errors table is a subset of the External links table showing just those pages returning HTTP status code errors. 
 6. External HTTP redirects table
 The External redirects table combines all of the pages returning HTTP redirects in one list so you can easily review them. As the redirect to the targeted page does not affect your page, fixing these URLs is a lower priority. This table contains the following data fields:
- URL’s returned HTTP status code (click it to go to the HTTP Response Code Checker tool)
- Number of times that URL is linked to from within the site (click to see the list of affected URLs)
- External URL used in the link
- Link text used for the URL
- Redirect’s target URL
- Internal page URL on which the link was first found
 The External redirects table is a subset of the External links table showing just those pages returning 301 and 302 HTTP status code redirects. 
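 The six-table split described above can be sketched in a few lines of Python. This is only an illustration of the grouping rules stated in the text (errors are 4xx and 5xx codes, redirects are 301 and 302), not the tool's actual implementation. The function names and the `(url, is_internal, status)` input shape are assumptions made for the example.

```python
def classify(status):
    """Map an HTTP status code to 'redirect', 'error', or 'ok'."""
    if status in (301, 302):
        return "redirect"
    if 400 <= status <= 599:
        return "error"
    return "ok"


def build_tables(links):
    """Group links into six tables.

    links: iterable of (url, is_internal, status) tuples.
    This input shape is hypothetical, chosen for the example.
    """
    tables = {
        "internal_links": [], "external_links": [],
        "internal_errors": [], "internal_redirects": [],
        "external_errors": [], "external_redirects": [],
    }
    for url, is_internal, status in links:
        scope = "internal" if is_internal else "external"
        # Every link goes into its main table.
        tables[f"{scope}_links"].append(url)
        # Errors and redirects are additionally copied into their subset table.
        kind = classify(status)
        if kind != "ok":
            tables[f"{scope}_{kind}s"].append(url)
    return tables
```

 Because the error and redirect tables are built as subsets, every URL in them also appears in the matching Internal links or External links table.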
- 
					
					
					
					
 Hi Thomas! When I say 1 click, I mean all links that can be reached directly from www.wishpicker.com. For example, wishpicker.com/gifts-for can be reached directly from wishpicker.com, but wishpicker.com/gifts-for/boyfriend cannot. I would first need to go to wishpicker.com/gifts-for, and then go to wishpicker.com/gifts-for/boyfriend. So wishpicker.com/gifts-for is 1 click away, and wishpicker.com/gifts-for/boyfriend is 2 clicks away from wishpicker.com. I am looking to crawl only the links that are 1 click away. Thanks for your help here. Really appreciate it. 
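 For illustration, "1 click away" can be sketched as "every link found in the HTML of that one page, without following any of them". The Python below is a minimal sketch using only the standard library. The names `LinkCollector` and `one_click_links` are made up for this example, and two choices in it are assumptions: it treats www and non-www hosts as the same site, and it skips rel="nofollow" links. It only reads the raw HTML, so links added by JavaScript would not appear.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href and rel of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, rel) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            self.links.append((href, attrs.get("rel") or ""))


def _host(url):
    # Assumption: "www.example.com" and "example.com" are the same site.
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def one_click_links(html, page_url):
    """Return (internal, external) sets of URLs linked directly from page_url."""
    collector = LinkCollector()
    collector.feed(html)
    internal, external = set(), set()
    for href, rel in collector.links:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, tel:, etc.
        if "nofollow" in rel.lower().split():
            continue  # assumption: nofollow links are not counted as crawlable
        if _host(absolute) == _host(page_url):
            internal.add(absolute)
        else:
            external.add(absolute)
    return internal, external
```

 To run it against a live page, the HTML could be fetched first, for example with `urllib.request.urlopen(page_url).read().decode("utf-8", "replace")`, and then passed in as `html`. Since none of the collected links are followed, pages that are 2 or more clicks away never show up in the result.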
- 
					
					
					
					
 When you say one click away, are you talking about a parameter? I will run this through Screaming Frog and a couple of other tools and see if I can get your answer. 
- 
					
					
					
					
 Hi Thomas, thanks for your response. Here is my website: www.wishpicker.com. What I am looking for is all the links that are only 1 click away from the page www.wishpicker.com (both internal and external). Performing a crawl with Screaming Frog gives me all links (1, 2, 3, 4, and more clicks away). I'm not sure how to limit the crawl to show links that are only 1 click away and exclude links that are 2 or more clicks away from this page. Look forward to your response. Thanks! 
- 
					
					
					
					
 Hi, Screaming Frog does in fact show you the links that would be considered external links. Here is a great guide: http://www.seerinteractive.com/blog/screaming-frog-guide If you look at the external part of Screaming Frog you'll find what you're looking for. However, you can also do this using either the campaign tool or the browser plug-in. I would suggest reading the Seer Interactive guide and sticking with Screaming Frog; it is an outstanding tool. Could you post a screenshot of what you are looking at, or of what you mean by it only showing you the internal link count? I know what you mean by that; I just want to see which screen you are looking at so I can get you the answer you're looking for. If that is not the route you wish to go, here are some other tools that I hope will help. They will allow you to scan up to 1000 pages of your website for free and will tell you the information you're looking for: http://www.internetmarketingninjas.com/tools If you cannot find what you're looking for in there, you might want to try http://www.quicksprout.com/2013/02/04/how-to-perform-a-seo-audit-free-5000-template-included/ distilled.net/U might be the best way to find out these types of things; however, it is a complete search engine optimization training course. Sincerely, Thomas 