Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Paginated Pages Which Shouldnt' Exist..
-
Hi
I have paginated pages on a crawl which shouldn't be paginated:
https://www.key.co.uk/en/key/chairs
My crawl shows:
<colgroup><col width="377"></colgroup>
| https://www.key.co.uk/en/key/chairs?page=2 |
| https://www.key.co.uk/en/key/chairs?page=3 |
| https://www.key.co.uk/en/key/chairs?page=4 |
| https://www.key.co.uk/en/key/chairs?page=5 |
| https://www.key.co.uk/en/key/chairs?page=6 |
| https://www.key.co.uk/en/key/chairs?page=7 |
| https://www.key.co.uk/en/key/chairs?page=8 |
| https://www.key.co.uk/en/key/chairs?page=9 |
| https://www.key.co.uk/en/key/chairs?page=10 |
| https://www.key.co.uk/en/key/chairs?page=11 |
| https://www.key.co.uk/en/key/chairs?page=12 |
| https://www.key.co.uk/en/key/chairs?page=13 |
| https://www.key.co.uk/en/key/chairs?page=14 |
| https://www.key.co.uk/en/key/chairs?page=15 |
| https://www.key.co.uk/en/key/chairs?page=16 |
| https://www.key.co.uk/en/key/chairs?page=17 |Where is this coming from?
Thank you
-
You will also have to get those URLs out of the index once you fix the rel next/prev issue. In order to do that effectively, they should return a 404 or 410 status code in the HTTP header so Google knows that they no longer exist (even though they never really did in the first place). Otherwise, it's what is known as a "soft 404" in which the page doesn't really exist, but returns a 200 (OK) status code, which is confusing to Google if you don't want them indexed.
-
Hi Becky
I can see chairs:
https://www.key.co.uk/en/key/chairs
But the paginated versions above are not in there. (can you see them?)
All you need to do is remove this directive for pages without a page 2: rel="next" href="https://www.key.co.uk/en/key/chairs?page=2" > as there is no page 2 for chairs.
Regards
Nigel
-
Hi Nigel
Thanks for jumping in. I'm confused as I have found the pages on my screaming frog crawl?
This page https://www.key.co.uk/en/key/chairs shouldn't have any pagination as there are no additional pages, but there is rel=next in the source code...
Now I'm a bit confused!
Becky
-
Yes I've just gone through every top level page too & pagination is awful, so I'm compiling a list and a case to push it.
It's pretty bad across the site, so I'll push for this to be updated. I find new issues with it all the time..
Thanks for your help!
-
Yes exactly. Even though the pages don't exist to the user, they still technically exist. If I were you, I'd take a very deep look at pagination on your site. If this is happening at scale, then fixing it could be a major improvement to your site. I took a look and it seems to be happening on all your top-level category pages like Chairs, Office Furniture, Shelving & Racking, etc.
These paginated pages are essentially a bunch of duplicate pages of your main category pages, each with a self-referencing canonical (which is the proper way to set up pagination). So Google could be extremely confused about which one to rank. In most cases, Google will rank page 1 because the use of rel="next"/rel="prev" is essentially telling Google that page 1 is the canonical version. However, you're still opening yourself up to the possibility of Google crawling all of these duplicate pages which is a huge waste on your crawl budget.
Hope that helps!
-
Hi
Thank you both.
We do have issues with our pagination which I've raised with developers, but it's taking forever to sort out. I'll flag this as well.
So even though the content on the paginated pages for Chairs doesn't exist we still need to remove the tags on these - https://www.key.co.uk/en/key/chairs?page=10
-
If you view your source code, you'll notice you are actually using rel="next" and rel="prev" on the main category page (https://www.key.co.uk/en/key/chairs). This is why you (and most likely Googlebot as well) are crawling these paginated pages. Even though you don't have links to the paginated pages on the main category page, they still exist and you're giving crawlers the directive (rel next / rel prev) to crawl them.
If you remove rel="next" on the category home page, that should help but you should really remove rel="next" and rel="prev" on the paginated pages as well. Unless you do that, Google will still find them and crawl them because they're aware these pages exist and they're likely indexed.
Here's a great resource on understanding pagination as well as the correct use of rel="next" and rel="prev" from Maile Ohye at Google: https://www.youtube.com/watch?v=njn8uXTWiGg
Hope this helps!
Cheers!
-Tyler -
Nice website by the way. It looks very professional. And your 49 DA is very impressive.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Would You Redirect a Page if the Parent Page was Redirected?
Hi everyone! Let's use this as an example URL: https://www.example.com/marvel/avengers/hulk/ We have done a 301 redirect for the "Avengers" page to another page on the site. Sibling pages of the "Hulk" page live off "marvel" now (ex: /marvel/thor/ and /marvel/iron-man/). Is there any benefit in doing a 301 for the "Hulk" page to live at /marvel/hulk/ like it's sibling pages? Is there any harm long-term in leaving the "Hulk" page under a permanently redirected page? Thank you! Matt
Intermediate & Advanced SEO | | amag0 -
Moved company 'Help Center' from Zendesk to Intercom, got lots of 404 errors. What now?
Howdy folks, excited to be part of the Moz community after lurking for years! I'm a few weeks into my new job (Digital Marketing at Rewind) and about 10 days ago the product team moved our Help Center from Zendesk to Intercom. Apparently the import went smoothly, but it's caused one problem I'm not really sure how to go about solving: https://help.rewind.io/hc/en-us/articles/*** is where all our articles used to sit https://help.rewind.io/*** is where all our articles now are So, for example, the following article has now moved as such: https://help.rewind.io/hc/en-us/articles/115001902152-Can-I-fast-forward-my-store-after-a-rewind- https://help.rewind.io/general-faqs-and-billing/frequently-asked-questions/can-i-fast-forward-my-store-after-a-rewind This has created a bunch of broken URLs in places like our Shopify/BigCommerce app listings, in our email drips, and in external resources etc. I've played whackamole cleaning many of these up, but these old URLs are still indexed by Google – we're up to 475 Crawl Errors in Search Console over the past week, all of which are 404s. I reached out to Intercom about this to see if they had something in place to help, but they just said my "best option is tracking down old links and setting up 301 redirects for those particular addressed". Browsing the Zendesk forms turned up some relevant-ish results, with the leading recommendation being to configure javascript redirects in the Zendesk document head (thread 1, thread 2, thread 3) of individual articles. I'm comfortable setting up 301 redirects on our website, but I'm in a bit over my head in trying to determine how I could do this with content that's hosted externally and sitting on a subdomain. I have access to our Zendesk admin, so I can go in and edit stuff there, but don't have experience with javascript redirects and have read that they might not be great for such a large scale redirection. Hopefully this is enough context for someone to provide guidance on how you think I should go about fixing things (or if there's even anything for me to do) but please let me know if there's more info I can provide. Thanks!
Intermediate & Advanced SEO | | henrycabrown1 -
Substantial difference between Number of Indexed Pages and Sitemap Pages
Hey there, I am doing a website audit at the moment. I've notices substantial differences in the number of pages indexed (search console), the number of pages in the sitemap and the number I am getting when I crawl the page with screamingfrog (see below). Would those discrepancies concern you? The website and its rankings seems fine otherwise. Total indexed: 2,360 (Search Consule)
Intermediate & Advanced SEO | | Online-Marketing-Guy
About 2,920 results (Google search "site:example.com")
Sitemap: 1,229 URLs
Screemingfrog Spider: 1,352 URLs Cheers,
Jochen0 -
How do I get rel='canonical' to eliminate the trailing slash on my home page??
I have been searching high and low. Please help if you can, and thank you if you spend the time reading this. I think this issue may be affecting most pages. SUMMARY: I want to eliminate the trailing slash that is appended to my website. SPECIFIC ISSUE: I want www.threewaystoharems.com to showing up to users and search engines without the trailing slash but try as I might it shows up like www.threewaystoharems.com/ which is the canonical link. WHY? and I'm concerned my back-links to the link without the trailing slash will not be recognized but most people are going to backlink me without a trailing slash. I don't want to loose linkjuice from the people and the search engines not being in consensus about what my page address is. THINGS I"VE TRIED: (1) I've gone in my wordpress settings under permalinks and tried to specify no trailing slash. I can do this here but not for the home page. (2) I've tried using the SEO by yoast to set the canonical page. This would work if I had a static front page, but my front page is of blog posts and so there is no advanced page settings to set the canonical tag. (3) I'd like to just find the source code of the home page, but because it is CSS, I don't know where to find the reference. I have gone into the css files of my wordpress theme looking in header and index and everywhere else looking for a specification of what the canonical page is. I am not able to find it. I'm thinking it is actually specified in the .htaccess file. (4) Went into cpanel file manager looking for files that contain Canonical. I only found a file called canonical.php . the only thing that seemed like it was worth changing was changing line 139 from $redirect_url = home_url('/'); to $redirect_url = home_url(''); nothing happened. I'm thinking it is actually specified in the .htaccess file. (5) I have gone through the .htaccess file and put thes 4 lines at the top (didn't redirect or create the proper canonical link) and then at the bottom of the file (also didn't redirect or create the proper canonical link) : RewriteEngine on
Intermediate & Advanced SEO | | Dillman
RewriteCond %{HTTP_HOST} ^([a-z.]+)?threewaystoharems.com$ [NC]
RewriteCond %{HTTP_HOST} !^www. [NC]
RewriteRule .? http://www.%1threewaystoharems.com%{REQUEST_URI} [R=301,L] Please help friends.0 -
How long takes to a page show up in Google results after removing noindex from a page?
Hi folks, A client of mine created a new page and used meta robots noindex to not show the page while they are not ready to launch it. The problem is that somehow Google "crawled" the page and now, after removing the meta robots noindex, the page does not show up in the results. We've tried to crawl it using Fetch as Googlebot, and then submit it using the button that appears. We've included the page in sitemap.xml and also used the old Google submit new page URL https://www.google.com/webmasters/tools/submit-url Does anyone know how long will it take for Google to show the page AFTER removing meta robots noindex from the page? Any reliable references of the statement? I did not find any Google video/post about this. I know that in some days it will appear but I'd like to have a good reference for the future. Thanks.
Intermediate & Advanced SEO | | fabioricotta-840380 -
Can too many "noindex" pages compared to "index" pages be a problem?
Hello, I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages. Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow". At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo but no real information about the image,definitely not enough for the page to be deemed worthy of being indexed. The industry however is one that really leans on images and having the images in Google Image search is important to us. The url structure is like such: domain.com/community/photos/~username~/picture111111.aspx I wish to block the whole folder from Googlebot to prevent these low quality pages from being added to Google's main SERP results. This would be something like this: User-agent: googlebot Disallow: /community/photos/ Can I disallow Googlebot specifically rather than just using User-agent: * which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible? Thanks! Leona
Intermediate & Advanced SEO | | HD_Leona0 -
Does Google crawl the pages which are generated via the site's search box queries?
For example, if I search for an 'x' item in a site's search box and if the site displays a list of results based on the query, would that page be crawled? I am asking this question because this would be a URL that is non existent on the site and hence am confused as to whether Google bots would be able to find it.
Intermediate & Advanced SEO | | pulseseo0