Moz Q&A is closed.
After more than 13 years and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be viewable), we have locked both new posts and new replies. More details here.
How can I get a photo album indexed by Google?
We have a lot of photos on our website and unfortunately most of them don't seem to be indexed by Google.

We run a party website. One of the things we do is take pictures at events and put them on the site. An event page with a photo album can have anywhere between 100 and 750 photos. For each photo there is a thumbnail on the page. The thumbnails are lazy loaded by showing a placeholder and loading the picture right before it comes on screen. There is no pagination or infinite scrolling. The thumbnails don't have alt text.

Each thumbnail links to a picture page. This page only shows the base HTML structure (menu, etc.), the image and a close button. The image has a src attribute with the full-size image, a srcset with several sizes for responsive design, and an alt text. There is no real textual content on an image page. (Note that when a user clicks on the thumbnail, the large image is loaded using JavaScript and we mimic the page change. I think it doesn't matter, but I am unsure.)

I'd like the full-size images to be indexed by Google and found with Google image search. The thumbnails should not be indexed (or should be ignored). Unfortunately most pictures aren't found, or their thumbnail is shown instead. Moz is telling me that all the picture pages are duplicate content (19,521 issues), as they are all the same with the exception of the image. The page title isn't identical, but it is similar for all images of an album.

Example: on the "A day at the park" event page, we have 136 pictures. A site search on "a day at the park" foto only reveals two photos of the album.
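For reference, here is a hypothetical sketch of the lazy-loading and "mimic the page change" behaviour described above. The class names, data attributes and URLs are all invented, since the site's actual code isn't shown; the point is that with this kind of pattern, the album page's initial HTML contains placeholders and thumbnail URLs, not the full-size files:

```typescript
// Hypothetical sketch of the pattern described above; the site's real code is not
// shown, so class names, attributes and URLs here are invented for illustration.
// Assumed markup, one per thumbnail:
//   <a class="photo-link" href="/photos/123">
//     <img class="thumb" src="placeholder.gif" data-src="/thumbs/123.jpg" alt="">
//   </a>

// Lazy-load thumbnails just before they scroll into view.
const observer = new IntersectionObserver(
  (entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      const img = entry.target as HTMLImageElement;
      img.src = img.dataset.src ?? img.src; // swap the placeholder for the real thumbnail
      observer.unobserve(img);
    }
  },
  { rootMargin: "200px" } // start loading shortly before the viewport reaches the image
);
document.querySelectorAll<HTMLImageElement>("img.thumb").forEach((img) => observer.observe(img));

// On click, load the full-size image with JavaScript and mimic the page change,
// so the browser never actually navigates to the picture page.
document.querySelectorAll<HTMLAnchorElement>("a.photo-link").forEach((link) => {
  link.addEventListener("click", (event) => {
    event.preventDefault();
    history.pushState({}, "", link.href); // the URL changes, but the album page stays loaded
    // ...render the full-size <img src srcset alt> into an overlay here...
  });
});
```

If the real implementation is roughly like this, the full-size src/srcset only ever appears on the individual picture pages, so Google has to crawl those pages (or be handed the image URLs in a sitemap, as suggested in the replies below) to find the large files.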
Yep, Google should be crawling all of your site naturally. Remember, though, that bots have a finite amount of time to crawl your site; it might be that there isn't enough time for all your images. Even more, if you want to be strict about only the high-res images being indexed, that's the main reason to use sitemaps. And yes, again: use different sitemaps, one for images and one for pages.
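To make the "one sitemap for images, one for pages" idea concrete, here is a minimal Node/TypeScript sketch that writes an image sitemap using Google's image sitemap extension namespace. The domain, URLs and file name are invented for illustration; in practice the photo list would come from the site's database:

```typescript
import { writeFileSync } from "node:fs";

// Invented example data; in practice this would be queried from the site's database.
interface Photo {
  pageUrl: string;  // the picture page the image lives on
  imageUrl: string; // the full-size image file that should appear in image search
}

const photos: Photo[] = [
  {
    pageUrl: "https://www.example.com/events/a-day-at-the-park/photo-1",
    imageUrl: "https://www.example.com/images/full/a-day-at-the-park-1.jpg",
  },
  // ...one entry per photo...
];

const escapeXml = (s: string) =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

const urlEntries = photos
  .map((p) =>
    [
      "  <url>",
      `    <loc>${escapeXml(p.pageUrl)}</loc>`,
      "    <image:image>",
      `      <image:loc>${escapeXml(p.imageUrl)}</image:loc>`,
      "    </image:image>",
      "  </url>",
    ].join("\n")
  )
  .join("\n");

const xml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
${urlEntries}
</urlset>
`;

writeFileSync("sitemap_images.xml", xml);
```

Each `<image:loc>` tells Google that the full-size file belongs to the picture page in `<loc>`, which is a useful signal when the page itself has almost no text.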
The website has 228,687 pictures (and thus just as many picture pages). In total it has about 600,000 pages. That's way too much for a simple static sitemap. I am considering generating sitemaps programmatically and using multiple sitemap files. Then I could put a noindex on the picture pages and list the photos as part of the event page, as shown in this guide. However, I feel that I really shouldn't have to, as Google should be able to crawl my site naturally.
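On the scale concern: sitemap files are limited to 50,000 URLs (and 50 MB uncompressed) each, but a sitemap index can reference many such files, so roughly 600,000 pages fits in a dozen or so generated files. A rough sketch of the chunking, with an invented domain and invented file names:

```typescript
import { writeFileSync } from "node:fs";

const MAX_URLS_PER_SITEMAP = 50_000; // protocol limit per sitemap file

// Invented placeholder; in reality this would be the ~228,687 picture-page URLs
// (or the event pages with their image entries) pulled from the database.
// URLs are assumed to contain no characters that need XML escaping.
const pictureUrls: string[] = [];

// Split the URL list into chunks that respect the 50,000-URL limit.
const chunks: string[][] = [];
for (let i = 0; i < pictureUrls.length; i += MAX_URLS_PER_SITEMAP) {
  chunks.push(pictureUrls.slice(i, i + MAX_URLS_PER_SITEMAP));
}

// Write one sitemap file per chunk.
chunks.forEach((chunk, i) => {
  const body = chunk.map((u) => `  <url><loc>${u}</loc></url>`).join("\n");
  writeFileSync(
    `sitemap_pictures_${i + 1}.xml`,
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
      `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${body}\n</urlset>\n`
  );
});

// Write a sitemap index that points at every chunk; this single file is what
// gets referenced from robots.txt or submitted in Search Console.
const indexBody = chunks
  .map((_, i) => `  <sitemap><loc>https://www.example.com/sitemap_pictures_${i + 1}.xml</loc></sitemap>`)
  .join("\n");
writeFileSync(
  "sitemap_index.xml",
  `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${indexBody}\n</sitemapindex>\n`
);
```

Whether the picture pages then get a noindex and the image entries hang off the event pages instead (as in the guide mentioned above) is a separate choice; the chunking and the index file work the same either way.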
Hi there, have you tried telling Google which images to index by using a sitemap_images.xml? There you can specify exactly which images should be indexed. I'm not seeing any sitemap at https://www.fiestainfo.com/sitemap.xml. Best of luck.
GR.
Related Questions
- Can I still monitor noindex, nofollow pages with Google Analytics?
  I have a private/login site where all pages are noindex, nofollow. Can I still monitor external site links with Google Analytics? (Technical SEO | jasmine.silver)
- Can you force Google to use meta description?
  Is it possible to force Google to use only the meta description put in place for a page and not gather additional text from the page? (Technical SEO | A_Q)
- Does Google index internal anchors as separate pages?
  Hi, back in September I added a function that sets an anchor on each subheading (h2 to h6) and creates a table of contents that links to each of those anchors. These anchors did show up in the SERPs as jump-to links. Fine. Back then I also changed the canonicals to a slightly different structure, and meanwhile there was a massive increase in the number of indexed pages (way over the top), which has since been fixed by removing (410) a complete section of the site. However, there are still ~34,000 pages indexed versus what really are more like 4,000-plus (all properly canonicalised). Naturally I am wondering what Google thinks it is indexing; the number is just way off and quite inexplicable. So I was wondering: does Google save jump-to links as unique pages? Also, does anybody know a method of actually getting all the pages in the Google index? (Not the actually existing pages via Screaming Frog etc., but the actual pages in the index; all methods I found sadly do not work.) Finally: does somebody have any other explanation for the incongruity between indexed and actual pages? Thanks for your replies! Nico (Technical SEO | netzkern_AG)
- How do I stop my webmail pages from being indexed by Google?
  When I did a search in Google for site:mywebsite.com to get a list of indexed pages, surprisingly "Webmail - Login" came up. Although it is associated with the domain, it is a completely different server: the Rackspace email server's browser interface. I am sure there is nothing on the website that links or points to it. So why is Google indexing it, and how do I get it out of there? I tried in Webmaster Tools but could not, as it seems to be a sub-domain. Any ideas? Thanks, Naresh Sadasivan (Technical SEO | UIPL)
- CDN Being Crawled and Indexed by Google
  I'm doing an SEO site audit, and I've discovered that the site uses a Content Delivery Network (CDN) that's being crawled and indexed by Google. Two sub-domains from the CDN are being crawled and indexed, and a small number of organic search visitors have come through these two sub-domains, so in a small number of cases the CDN-based content is out-ranking the root domain. It's a huge duplicate content issue (tens of thousands of URLs being crawled). What's the best way to prevent the crawling and indexing of a CDN like this? Exclude it via robots.txt? Additionally, the use of relative canonical tags (instead of absolute ones) appears to be contributing to this problem as well; as I understand it, these canonical tags are telling the search engines that each sub-domain is the "home" of the content/URL. Thanks! Scott (Technical SEO | Scott-Thomas)
- How do I get out of a Google bomb?
  Hi all, I have a website named bijouxroom.com. I was on the 7th page for the search term takı in Google, and on the 2nd page for online takı. Now I see that in one day my results have dropped to the 13th and 10th pages in Google respectively. I built too many anchor-text links for takı and online takı. What should I do to get my positions back? Thanks in advance. Regards, (Technical SEO | ozererim)
- What is the best method to block a sub-domain, e.g. staging.domain.com/, from getting indexed?
  Now that Google considers subdomains as part of the TLD, I'm a little leery of testing a robots.txt on staging.domain.com with something like:
    User-agent: *
    Disallow: /
  for fear it might get www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages, but that would require a lot more work. (Technical SEO | fthead9)
- Dynamically generated .PDF files, instead of normal pages, indexed by and ranking in Google
  Hi, I've come across a tough problem. I am working on an online-store website that has a feature for viewing product details in .PDF format (by the way, the website is built on the Joomla CMS). When I search my site's name in Google, the SERP simply displays my .PDF files in the first couple of positions (shown in the normal [PDF] result format), and I cannot find the normal pages on the first SERP unless I search for the full site domain in Google. I really don't want this! Would you please tell me how to figure out and solve the problem? I can actually remove the component (Virtuemart) that is in charge of generating the .PDF files. Now I am trying to redirect all the .PDF pages ranking in Google to a 404 page and remove the functionality; I plan to regenerate my site's sitemap and submit it to Google. Will that work for me? I'd really appreciate it if you could help solve this problem. Thanks very much. Sincerely, SEOmoz Pro Member (Technical SEO | fugu)