        The Moz Q&A Forum

        Moz Q&A is closed.

        After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.

        Can PDF be seen as duplicate content? If so, how to prevent it?

        Intermediate & Advanced SEO
        • Gestisoft-Qc
          Gestisoft-Qc Subscriber last edited by

          I see no reason why PDFs couldn't be considered duplicate content, but I haven't seen any threads about it.

          We publish loads of product documentation provided by manufacturers, as well as white papers and case studies. These give our customers and prospects a better idea of our solutions and help them along their buying process.

          However, I'm not sure whether it would be better to make them non-indexable to prevent duplicate content issues. Clearly, we would prefer a solution where we benefit from the keywords in the documents.

          Does anyone have insight on how to deal with PDFs provided by third parties?

          Thanks in advance.

          • ilonka65
            ilonka65 last edited by

            It looks like Google is no longer crawling tabbed content, so if your PDFs are tabbed within pages, it might not be an issue: https://www.seroundtable.com/google-hidden-tab-content-seo-19489.html

            • ASriv
              ASriv Subscriber last edited by

              Sure, I understand - thanks EGOL

              • EGOL
                EGOL @ASriv last edited by

                I would like to give that to you but it is on a site that I don't share in forums.  Sorry.

                • ASriv
                  ASriv Subscriber last edited by

                  Thanks EGOL

                  That would be ideal.

                  For a site with multiple authors, where it is impractical to get a developer involved every time a web page or blog post and its PDF are created, is there a single line of code that could accomplish this in .htaccess?

                  If so, would you be able to show me an example please?

                  • EGOL
                    EGOL last edited by

                    I assigned rel=canonical to my PDFs using .htaccess.

                    Then, if anyone links to the PDFs, the link value gets passed to the web page.
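
As a rough illustration of the approach described above, here is a minimal Python sketch that generates .htaccess rules sending a rel="canonical" Link header for each PDF. It assumes Apache with mod_headers enabled. The file names, the example.com URLs, and the helper names (`canonical_rule`, `build_htaccess`, `PDF_TO_PAGE`) are all hypothetical and do not come from anyone's actual site. Because each PDF points at a different page, the sketch emits one rule per file rather than a single catch-all line.

```python
# Sketch only: assumes Apache with mod_headers enabled.
# All file names and URLs below are hypothetical placeholders.


def canonical_rule(pdf_name: str, canonical_url: str) -> str:
    """Return an Apache <Files> block that adds a rel="canonical" Link header."""
    return (
        f'<Files "{pdf_name}">\n'
        f'  Header add Link "<{canonical_url}>; rel=\\"canonical\\""\n'
        f'</Files>'
    )


# Hypothetical mapping of PDF file names to the pages they duplicate.
PDF_TO_PAGE = {
    "white-paper.pdf": "https://www.example.com/white-paper/",
    "case-study.pdf": "https://www.example.com/case-study/",
}


def build_htaccess(mapping: dict) -> str:
    """Join one rule per PDF, sorted by file name for stable output."""
    return "\n\n".join(
        canonical_rule(name, url) for name, url in sorted(mapping.items())
    )


if __name__ == "__main__":
    print(build_htaccess(PDF_TO_PAGE))
```

A site with many authors could keep the mapping in a spreadsheet or CMS export and regenerate the rules from it, so that no developer has to touch .htaccess for each new PDF.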

                    • ASriv
                      ASriv Subscriber last edited by

                      Hi all

                      I've been discussing the topic of making content available as both blog posts and PDF downloads today.

                      Given that there is a lot of uncertainty and complexity around this issue of potential duplication, my plan is to house all the PDFs in a folder that we block with robots.txt.

                      Does anyone agree or disagree with this approach?
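
To make the robots.txt plan concrete, here is a small sketch that checks which URLs a crawler may fetch under such a rule, using Python's standard `urllib.robotparser`. The `/pdfs/` folder name, the example.com URLs, and the `is_crawlable` helper are assumptions made for illustration. Note that a PDF blocked this way cannot be fetched, so a crawler would never see a canonical header on it either.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a single folder holding all PDFs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/pdfs/guide.pdf",
        "https://www.example.com/blog/guide/",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```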

                      • Dr-Pete
                        Dr-Pete Staff @ATMOSMarketing56 last edited by

                        Unfortunately, there's no great way to have it both ways. If you want these pages to get indexed for the links, then they're potential duplicates. If Google filters them out, the links probably won't count. Worst case, it could cause Panda-scale problems. Honestly, I suspect the link value is minimal and outweighed by the risk, but it depends quite a bit on the scope of what you're doing and the general link profile of the site.

                        • ATMOSMarketing56
                          ATMOSMarketing56 Subscriber last edited by

                          I think you can set it to public or private (logged-in only) and even put a price tag on it if you want. So yes, setting it to private would help eliminate the dup content issue, but it would also hide the links that I'm using to link-build.

                          I would imagine that, since this guide links back to our original site, it would be no different than if someone copied the content from our site and linked back to us, thus crediting us as the original source. That seems especially true if we make sure it is indexed through GWMT before submitting it to other platforms. Are there any good resources that delve into that?

                          • Dr-Pete
                            Dr-Pete Staff last edited by

                            Potentially, but I'm honestly not sure how Scribd's pages are indexed. Don't you need to log in or something to actually see the content on Scribd?

                            • ATMOSMarketing56
                              ATMOSMarketing56 Subscriber last edited by

                              What about this instance:

                              (A) I made an "ultimate guide to X" and posted it on my site as individual HTML pages for each chapter

                              (B) I made a PDF version with the exact same content that people can download directly from the site

                              (C) I uploaded the PDF to sites like Scribd.com to help distribute it further, and build links with the links that are embedded in the PDF.

                              Would those all be dup content? Is (C) recommended or not?

                              • EGOL
                                EGOL @Gestisoft-Qc last edited by

                                Thanks! I am going to look into this. I'll let you know if I learn anything.

                                • Dr-Pete
                                  Dr-Pete Staff @Gestisoft-Qc last edited by

                                  If they duplicate your main content, I think the header-level canonical may be a good way to go. For the syndication scenario, it's tough, because then you're knocking those PDFs out of the rankings, potentially, in favor of someone else's content.

                                  Honestly, I've seen very few people deal with canonicalization for PDFs, and even those cases were small or obvious (like a page with the exact same content being outranked by the duplicate PDF). It's kind of uncharted territory.

                                  • EGOL
                                    EGOL @Gestisoft-Qc last edited by

                                    Thanks for all of your input, Dr. Pete. The example that you use is almost exactly what I have: hundreds of .pdfs on a fifty-page site. These .pdfs rank well in the SERPs, accumulate PageRank, and pass traffic and link value back to the main site through links embedded within the .pdf. They also have natural links from other domains. I don't want to block them or nofollow them, but your suggestion of using a header directive sounds pretty good.

                                    • Dr-Pete
                                      Dr-Pete Staff @Gestisoft-Qc last edited by

                                      Oh, sorry - so these PDFs aren't duplicates with your own web/HTML content so much as duplicates with the same PDFs on other websites?

                                      That's more like a syndication situation. It is possible that, if enough people post these PDFs, you could run into trouble, but I've never seen that. More likely, your versions just wouldn't rank. Theoretically, you could use the header-level canonical tag cross-domain, but I've honestly never seen that tested.

                                      If you're talking about a handful of PDFs, they're a small percentage of your overall indexed content, and that content is unique, I wouldn't worry too much. If you're talking about 100s of PDFs on a 50-page website, then I'd control it. Unfortunately, at that point, you'd probably have to put the PDFs in a folder and outright block it. You'd remove the risk, but you'd stop ranking on those PDFs as well.

                                      • EGOL
                                        EGOL @Gestisoft-Qc last edited by

                                        @EGOL: Can you expand a bit on your Author suggestion?

                                        I was wondering if there is a way to do rel=author for a PDF document. I don't know how to do it and don't know if it is possible.

                                        • Gestisoft-Qc
                                          Gestisoft-Qc Subscriber @Dr-Pete last edited by

                                          To make sure I understand what I'm reading:

                                          • PDFs don't usually rank as well as regular pages (although it is possible)
                                          • It is possible to configure a canonical tag on a PDF

                                          My concern isn't that our PDFs may outrank the original content, but rather that we could get slammed by Google for publishing them.

                                          Am I right in thinking that a canonical tag prevents a PDF from accumulating link juice? If so, I would prefer not to use it, unless skipping it leads to Google slamming us.

                                          Has anyone experienced Google retribution for publishing PDFs from a third party?

                                          @EGOL: Can you expand a bit on your Author suggestion?

                                          Thanks all!

                                          • Dr-Pete
                                            Dr-Pete Staff last edited by

                                            I think it's possible, but I've only seen it in cases that are a bit hard to disentangle. For example, I've seen a PDF outrank a duplicate piece of regular content when the regular content had other issues (including massive duplication with other, regular content). My gut feeling is that it's unusual.

                                            If you're concerned about it, you can canonicalize PDFs with the header-level canonical directive. It's a bit more technically complex than the standard HTML canonical tag:

                                            http://googlewebmastercentral.blogspot.com/2011/06/supporting-relcanonical-http-headers.html
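
For anyone who wants to confirm that a PDF response actually carries the header-level canonical, here is a hedged sketch that extracts the canonical target from the value of an HTTP `Link` header. The function name and the sample header values are made up for illustration, and the parsing is deliberately simplified: it does not handle commas or semicolons inside quoted parameter values.

```python
import re
from typing import Optional

# Matches one link-value: <target> followed by zero or more ;param segments.
_LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def canonical_from_link_header(value: str) -> Optional[str]:
    """Return the target of the first rel="canonical" link, or None if absent."""
    for target, params in _LINK_VALUE.findall(value):
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            rels = raw.strip().strip('"').lower().split()
            if name.strip().lower() == "rel" and "canonical" in rels:
                return target
    return None


if __name__ == "__main__":
    # Hypothetical header value as a server might send it for a PDF.
    header = '<https://www.example.com/white-paper/>; rel="canonical"'
    print(canonical_from_link_header(header))
```

The header value itself could be read with any HTTP client that exposes response headers; the sketch only covers the parsing step.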

                                            I'm going to mark this as "Discussion", just in case anyone else has seen real-world examples.

                                            • EGOL
                                              EGOL last edited by

                                              I am really interested in hearing what others have to say about this.

                                              I know that .pdfs can be very valuable content. They can be optimized, they rank in the SERPs, they accumulate PR, and they can pass link value. So, to me, it would be a mistake to block them from the index...

                                              However, I see your point about dupe content... they could also be thin content. Will Panda whack you for thin and dupes in your PDFs?

                                              How can canonical be used... what about author?

                                              Anybody know anything about this?

                                              • MargaritaS
                                                MargaritaS last edited by

                                                Just like any other piece of duplicate content, you can use canonical link elements to specify the original piece of content (if there's indeed more than one identical piece). You could also block these types of files in the robots.txt, or use noindex-follow meta tags.

                                                Regards,

                                                Margarita
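
A PDF has no HTML head to hold a meta tag, so the noindex-follow option mentioned above would have to be sent as an `X-Robots-Tag` HTTP response header instead. The sketch below shows one way a server-side hook might decide which extra headers to attach. The `extra_headers` function and the paths are hypothetical, and how the headers actually get attached depends on the server or framework in use.

```python
def extra_headers(path: str) -> dict:
    """Return extra response headers for a request path (sketch only).

    PDFs get the header equivalent of a noindex,follow meta tag;
    every other path gets no extra headers.
    """
    if path.lower().endswith(".pdf"):
        return {"X-Robots-Tag": "noindex, follow"}
    return {}


if __name__ == "__main__":
    # Hypothetical request paths.
    for path in ("/docs/white-paper.pdf", "/blog/white-paper/"):
        print(path, extra_headers(path))
```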

                                                © 2021 - 2025 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.