Webmaster Central on Duplicate Content

Despite the prevailing fear that Google penalizes websites for content it deems duplicated, there is in fact no such penalty for ordinary sites.

Much unnecessary worry can be avoided by simply accepting that some of your content is going to be duplicated despite all your efforts, and that Google is 'cool' with it. Yes, it is good practice to use measures such as the canonical link element (rel="canonical") to indicate a master or ‘canonical’ page, but wasting time rewriting content just for the sake of it is not a good idea.
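To make the canonical link element concrete, here is a minimal Python sketch that reads the canonical URL a page declares. The page markup, the example.com URL and the names CanonicalFinder and find_canonical are all made up for illustration; only the standard library's html.parser is used.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before checking.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page: a print view that points at its master URL.
page = """
<html><head>
  <title>Blue widgets (print view)</title>
  <link rel="canonical" href="https://www.example.com/widgets/blue">
</head><body>...</body></html>
"""

print(find_canonical(page))
```

Running a check like this over a handful of your own URLs is a quick way to confirm that duplicate versions of a page all point at the same master page.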

A recent Google Webmaster Central office-hours hangout dedicated to duplicate content makes this point abundantly clear.

You can listen to it here: https://www.youtube.com/watch?v=cxWo4ttPgAc

Here is a summary of the main points made by Google's John Mueller in his presentation:

What is Duplicate Content?

  • The exact same page
  • Can appear as www / non-www or as HTTP / HTTPS versions
  • Many other sources too, such as mobile pages and CDNs
  • Every website can have it!

Google is very capable of filtering out duplicates.
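As a rough illustration of how the www / non-www and HTTP / HTTPS variants collapse into one page, the Python sketch below maps each variant to a single form. The preferred form (HTTPS without www), the function name normalize and the example.com URLs are assumptions made for the example; this is not a description of how Google does it.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Map scheme and www variants of a URL to one assumed preferred form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    # Simplification: any port and fragment are dropped, the query is kept.
    return urlunsplit(("https", host, path, parts.query, ""))


# Four addresses that all serve the same hypothetical page.
variants = [
    "http://example.com/page",
    "http://www.example.com/page",
    "https://example.com/page",
    "https://WWW.Example.com/page",
]

for url in variants:
    print(url, "->", normalize(url))
```

All four addresses end up as the same URL, which is the sense in which they are duplicates of one page rather than four separate pages.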

Handling Duplicate Content During Crawls

  • Duplicates waste server resources
  • Setting up parameter handling helps
  • Google has smart systems to detect and handle this as well
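The idea behind parameter handling can be sketched in Python: URL parameters that do not change the page content are ignored, so that many crawlable URLs reduce to one. The list of ignored parameters, the function name strip_ignored_params and the URLs are hypothetical, and the sketch does not reproduce any Google tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def strip_ignored_params(url):
    """Drop ignored parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


urls = [
    "https://example.com/shoes?color=red&size=9",
    "https://example.com/shoes?size=9&color=red&sessionid=abc123",
    "https://example.com/shoes?utm_source=news&color=red&size=9",
]

for url in urls:
    print(strip_ignored_params(url))
```

The three URLs reduce to a single address, so a crawler that knows which parameters to ignore only needs to fetch the page once.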

Duplicate Content During Indexing

  • Duplicates waste storage
  • Google tries to keep one copy
  • This can be tricky for many reasons (webmasters can help!)

Duplicate Content In Search Results

  • Duplicates are confusing for users, so Google tries to show just one result
  • Often shown as 'we have omitted similar results...'

Duplicate Content Problems

  • Unnecessary crawling
  • Harder to track metrics
  • Google can pick a 'non-preferred' URL to show

There is no duplicate content penalty except for:

  • Scraper sites
  • Spam sites that spin content
  • Doorway sites

See also: Syndicating Content


John Mueller