Very few websites are free of duplicated content. Most of the time it is not intentional and is created by the content management system. Other types of duplicated content are intentional, and these can have a negative effect on the website.
At times, having duplicated content can have unexpected consequences.
In short, duplicated content occurs when the same content appears at more than one URL. In principle, this is not grounds for a penalty, as long as duplicated content does not make up a high percentage of your website. Keep in mind that duplicated pages won’t make Google angry with you, but avoiding them signals that you are on the right track.
Even though it isn’t directly penalized, duplicated content can cost you potential PageRank, since search engines won’t know which of the duplicated pages is the most relevant for a specific search. Next, I will give you some examples of duplicated content and how to resolve them.
This is the most common cause of duplicated content. It happens when your home page can be reached at more than one URL:
Each one of these leads to the same page with the same content. With more than one URL and no redirect, the search engine cannot tell which one you want people to access.
You have two options:
- Create a redirect on the server to ensure that users are only ever shown one URL.
- Set which version you want to be the main one (“www” or “non-www”) in Google Webmaster Tools.
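The redirect option can be sketched in a few lines of Python. This is only an illustration of the logic, not a server configuration: the host `www.example.com`, the list of index file names, and the function names are all assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical values: the preferred host and the paths treated as the home page.
PREFERRED_HOST = "www.example.com"
HOME_PATHS = {"", "/", "/index.html", "/index.php"}


def strip_www(host):
    """Drop a leading "www." so both variants of a host compare equal."""
    return host[4:] if host.startswith("www.") else host


def redirect_target(url):
    """Return the URL a 301 redirect should point to, or None if none is needed."""
    parts = urlsplit(url)
    if strip_www(parts.netloc.lower()) != strip_www(PREFERRED_HOST):
        return None  # a different site, so leave it alone
    # Collapse every home page variant onto "/".
    path = "/" if parts.path in HOME_PATHS else parts.path
    target = urlunsplit(
        (parts.scheme, PREFERRED_HOST, path, parts.query, parts.fragment)
    )
    return None if target == url else target
```

With these assumed values, `http://example.com/index.html` would be redirected to `http://www.example.com/`, while `http://www.example.com/` needs no redirect.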
Tags and categorization issues
This occurs on blogs when many tag or category pages list the same content as other pages (something very common in the blogging world). For example, take a blog with three posts that have the following tags and categories:
- Title: How to resolve duplicated content
- Tags: Duplicated content, SEO, advice
- Categories: SEO, advice, content
- Title: How to detect duplicated content
- Tags: Duplicated content, SEO, content
- Categories: SEO, content
- Title: Advice for creating quality content
- Tags: advice, content, quality content
- Categories: SEO, content, advice
This is how the posts are arranged by tags and categories.
We can see that the following pages list the same posts:
- The tags SEO and Duplicated Content.
- The categories SEO and Content.
- The category advice and the tag advice.
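On a larger blog this check is tedious to do by hand. A minimal sketch of how it could be automated, using the three example posts as data (the function names and the data layout are assumptions made for illustration):

```python
from collections import defaultdict

# The three example posts from the text.
POSTS = [
    {"title": "How to resolve duplicated content",
     "tags": ["Duplicated content", "SEO", "advice"],
     "categories": ["SEO", "advice", "content"]},
    {"title": "How to detect duplicated content",
     "tags": ["Duplicated content", "SEO", "content"],
     "categories": ["SEO", "content"]},
    {"title": "Advice for creating quality content",
     "tags": ["advice", "content", "quality content"],
     "categories": ["SEO", "content", "advice"]},
]


def archive_pages(posts):
    """Map each (kind, name) archive page to the set of post titles it lists."""
    pages = defaultdict(set)
    for post in posts:
        for tag in post["tags"]:
            pages[("tag", tag)].add(post["title"])
        for category in post["categories"]:
            pages[("category", category)].add(post["title"])
    return pages


def duplicate_groups(pages):
    """Return groups of archive pages that list exactly the same posts."""
    by_content = defaultdict(list)
    for page, titles in pages.items():
        by_content[frozenset(titles)].append(page)
    return [sorted(group) for group in by_content.values() if len(group) > 1]


for group in duplicate_groups(archive_pages(POSTS)):
    print(group)
```

Run on the example data, this reports the same three pairs of archive pages as the list above.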
The solution depends on how you use categories and tags and how many of each you assign to a post. If you use a few categories and a lot of tags (like most people), adding noindex and nofollow meta tags to your tag pages will make sure that your category pages are the ones that rank in the search results. If you use a lot of categories and only a few tags, it’s the opposite: add the noindex and nofollow meta tags to your category pages.
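As a rough sketch, the rule above could look like this in a template helper. The function names are hypothetical, and treating a tie between the two counts as "hide the tag pages" is an assumption of this example, not something the rule itself settles.

```python
# The robots meta tag recommended above for the archive type being hidden.
ROBOTS_NOINDEX = '<meta name="robots" content="noindex, nofollow">'


def archive_type_to_hide(category_count, tag_count):
    """Pick the archive type to keep out of the index: the more numerous one.

    Assumption: on a tie, the tag pages are hidden.
    """
    return "tag" if tag_count >= category_count else "category"


def robots_meta(page_type, hidden_type):
    """Return the meta tag for a hidden archive page, or "" for any other page."""
    return ROBOTS_NOINDEX if page_type == hidden_type else ""


# Example: a blog with 5 categories and 40 tags hides its tag pages.
hidden = archive_type_to_hide(category_count=5, tag_count=40)
print(robots_meta("tag", hidden))
```

The returned string would go inside the `<head>` of each tag page (or each category page, when the counts are the other way round).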
This happens more and more often because of the enormous increase in mobile traffic over the last year. What’s going on here is that every page on the website has two different URLs: one for the desktop version and one for the mobile version.
There are several possible solutions. The first is to make the mobile version of the site genuinely different from the desktop version, giving all the pages different URLs and different designs that present the information differently depending on the device used to access the website. That requires a lot of time and effort. If you don’t have enough time to do this, we recommend a responsive design that dynamically adapts the website layout to the resolution of the user’s screen.
Finally, the fastest solution is to add a rel=canonical tag to the mobile version of every page, pointing to the desktop version of that page.
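A minimal sketch of how that canonical tag could be generated, assuming the mobile site lives on a separate host and uses the same paths as the desktop site. The hosts `m.example.com` and `www.example.com` and the function name are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts for the two versions of the site.
MOBILE_HOST = "m.example.com"
DESKTOP_HOST = "www.example.com"


def canonical_link_for_mobile(mobile_url):
    """Build the rel=canonical tag for a mobile page, pointing at its desktop twin."""
    parts = urlsplit(mobile_url)
    if parts.netloc.lower() != MOBILE_HOST:
        raise ValueError(f"not a mobile URL: {mobile_url}")
    # Same scheme, path and query; only the host changes. The fragment is dropped.
    desktop_url = urlunsplit(
        (parts.scheme, DESKTOP_HOST, parts.path, parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(desktop_url, quote=True)}">'


print(canonical_link_for_mobile("http://m.example.com/blog/post-1"))
```

The resulting tag belongs in the `<head>` of the mobile page.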
There are many types of URL parameters, above all in e-commerce: product filters (color, size, rating, etc.), sorting and display options (by lowest price, by relevance, by highest price, in a grid, etc.) and user sessions. The problem is that many of these parameters don’t change the content of the page, which results in many URLs for the same content.
In this example, we can see three parameters: color, low price and high price.
Figure: Example of parameters in Google Webmaster Tools
The solution to any problem with parameters is to add a rel=canonical tag that points to the original page. That makes it simple to keep Google from confusing the parameterized URLs with the original page.
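To fill in that tag, you first need the URL of the original page. One way to derive it is to drop the parameters that do not change the content. The sketch below assumes a hypothetical set of such parameters (`sort`, `view`, `sessionid`); which parameters are safe to drop depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change the page content.
IGNORED_PARAMS = {"sort", "view", "sessionid"}


def canonical_url(url):
    """Return the URL with the ignored parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(canonical_url(
    "http://www.example.com/shoes?color=red&sort=price_asc&sessionid=abc123"
))
```

Every URL variant that differs only in the ignored parameters then maps to the same canonical URL, which is the value to place in the `href` of the rel=canonical tag.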
Another possible solution is to tell Google, through Google Webmaster Tools > Configuration > URL Parameters, which parameters it should ignore when indexing pages on your website.
Pagination occurs when an article, a product list, or a tag or category page is longer than one page. Even though the pages have different content, they are all focused on the same topic. This is an enormous problem for e-commerce sites that have hundreds of products in the same category.
Right now, the HTML attributes rel=next and rel=prev let search engines know that all of the pages belong to the same category or post. Once they know this, they will not index every one of these pages separately and will focus on the first page for ranking.
Another solution is to use a pagination parameter in the URL and register it in Google Webmaster Tools so that the paginated pages are not indexed.
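As an illustration, the link tags for one page of a paginated series could be generated like this. The `?page=N` URL scheme and the function name are assumptions of the sketch, and page 1 is assumed to live at the base URL without a parameter.

```python
def pagination_links(base_url, page, total_pages):
    """Return the rel=prev / rel=next link tags for one page of a series."""
    if not 1 <= page <= total_pages:
        raise ValueError("page must be between 1 and total_pages")

    def page_url(number):
        # Assumption: page 1 is the base URL, later pages use ?page=N.
        return base_url if number == 1 else f"{base_url}?page={number}"

    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < total_pages:
        links.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return links


for tag in pagination_links("http://www.example.com/shoes", page=2, total_pages=3):
    print(tag)
```

The first page only gets a rel=next tag and the last page only gets a rel=prev tag; every page in between gets both.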
In addition to these cases, there are many more causes of unintentionally duplicated content, but these are probably the most common, and the ones that have a reasonably simple solution. Duplicated content is something that you always have to be aware of. It is easy to solve when you only have a few duplicated pages, but it can be very tedious work when the number gets very large.