Dealing With Onsite Duplicate Content Issues

Date published: 15 October 2014
Author: Navneet Kaushal
Categories: Content, SEO

Duplicate content arises when a search engine indexes more than one version of a page. Duplication can be onsite or offsite: onsite duplication is when the same content appears on multiple pages within a website, and offsite duplication is when the content on your website is similar to that on another site.

Duplicate content within the same site makes it difficult for a search engine to decide which page to rank.

Here are some of the most common onsite duplicate content issues and how to fix them:

Duplicate Content Issues

Crawl rate can drop, because Googlebot is busy crawling unnecessary similar pages
The wrong page may rank, which results in a poor user experience
New websites may face delays in ranking
Search engines don’t know which page to index
Search engines cannot determine which page to rank for a search query

The Cause of Duplicate Content Issues

URL parameters, such as those added for click tracking or by certain analytics code, can cause duplicate content issues. Google offers advice for URLs containing specific parameters.
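As a rough illustration of how tracking parameters multiply URLs for one page, the sketch below strips them so that variants collapse to a single address. The parameter list and the function name strip_tracking_params are assumptions made for this example; a real site would need its own list.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed list of parameters that only track clicks or campaigns
# and do not change the content of the page (illustrative only).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content", "gclid"}

def strip_tracking_params(url):
    """Return the URL with known tracking parameters removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

print(strip_tracking_params("http://example.com/shoes?utm_source=news&color=red"))
```

Parameters that do change the content, such as color in the call above, are kept, so only the tracking variants are merged.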

Printer-friendly versions of pages can also cause duplicate content issues when both the regular and the printer-friendly version get indexed.

Identical product descriptions for similar products, either within your site or across multiple sites selling the same products, are a problem mostly faced by e-commerce sites that use generic product descriptions, i.e. the manufacturer-supplied copy. Since the descriptions come from the same source, they remain 100 percent identical.

Session IDs are another cause of duplicate content. The problem arises when each user visiting a website is assigned a different session ID and that ID is carried in the URL, so the same page becomes reachable at many different addresses.

Using separate URLs or domains for the mobile version of a website, such as the m. subdomain approach, can also cause problems.

Duplicate content can also arise when both www and non-www versions of a page are available and the same content is served on both.
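The www versus non-www case can be sketched as a small normalization step that rewrites every URL to one preferred host. The function name to_preferred_host and the prefer_www flag are hypothetical, and the sketch ignores URLs that carry user credentials in the host part.

```python
from urllib.parse import urlsplit, urlunsplit

def to_preferred_host(url, prefer_www=True):
    """Rewrite a URL so it always uses the www or the non-www host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Remove an existing "www." prefix so both variants start from the same form.
    bare = host[4:] if host.startswith("www.") else host
    new_host = "www." + bare if prefer_www else bare
    return urlunsplit((parts.scheme, new_host, parts.path,
                       parts.query, parts.fragment))

print(to_preferred_host("http://example.com/page"))
print(to_preferred_host("http://www.example.com/page", prefer_www=False))
```

Whichever variant is chosen, the point is that only one of the two hosts ends up serving or linking to the content.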

Other causes of duplicate content can include scraping and content syndication; paginating comments; similar content on a post page, home page, and archives page; or a site architecture in which there are multiple paths to the same page.

Matt Cutts offers some great advice on what e-commerce sites can do to prevent the problem of duplicate content.

Solving the Problem of Duplicate Content

Redirecting Duplicate Content: Set up a 301 redirect from the page with copied content to the one with the original content. Make sure you redirect all the old duplicate content URLs to the proper canonical URLs.
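Redirects are usually configured in the web server, but the idea can be sketched as a minimal WSGI application. The paths in REDIRECTS are made-up examples and the response body is a placeholder; this is an illustration of the 301 mechanism, not a recommended production setup.

```python
# Hypothetical mapping from duplicate paths to their canonical paths.
REDIRECTS = {
    "/shoes/print": "/shoes",
    "/index.html": "/",
}

def app(environ, start_response):
    """Answer duplicate paths with a 301 and serve everything else normally."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"original content"]
```

For a quick local trial, an application like this could be served with the standard library's wsgiref.simple_server module.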

Use a “rel=canonical” Tag: A “rel=canonical” tag tells search engines which version of a page you want shown on the search results page. The canonical tag is placed in the head section of a Web page.
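To check which canonical URL a page declares, one option is to parse its HTML and read the link element whose rel attribute is canonical. The sketch below uses only Python's standard html.parser module; the class and function names are chosen for this example, and the sample URL is a placeholder.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

page = '<html><head><link rel="canonical" href="http://www.example.com/shoes"></head></html>'
print(find_canonical(page))
```

Running this over every duplicate variant of a page is a simple way to confirm they all point at the same preferred URL.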

Use Meta Tags: Use the robots meta tag with a “noindex” value to tell search engines which pages you do not want indexed.
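As a small sketch, the robots meta tag for a page could be generated from two flags. The helper name robots_meta is hypothetical; the directive values index, noindex, follow, and nofollow are the standard ones for this tag.

```python
def robots_meta(index=True, follow=True):
    """Build a robots meta tag from two yes/no choices."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    return '<meta name="robots" content="{}">'.format(", ".join(directives))

# A duplicate page that should stay out of the index,
# while its links may still be followed.
print(robots_meta(index=False))
```

The resulting tag belongs in the head section of each page that should be kept out of the index.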

Syndicate Carefully: If you syndicate your content on other sites, be careful. Make sure each site to which your content is syndicated links back to the original on your site. You can also ask them to apply a “noindex” meta tag to their copy so that search engines do not index it.

If you have multiple pages that are similar, expand the pages to contain unique content or consolidate them into a single page.
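One way to find candidate pages for expanding or consolidating is to compare their text pairwise. The sketch below uses difflib from the standard library on a word level; the 0.9 threshold and the function name similar_pages are assumptions for illustration, and pairwise comparison only suits small sets of pages.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similar_pages(pages, threshold=0.9):
    """Return (url_a, url_b, ratio) for page pairs whose text is nearly identical.

    `pages` maps a URL to the visible text of that page.
    """
    flagged = []
    for (url_a, text_a), (url_b, text_b) in combinations(sorted(pages.items()), 2):
        # Compare word sequences rather than characters.
        ratio = SequenceMatcher(None, text_a.split(), text_b.split()).ratio()
        if ratio >= threshold:
            flagged.append((url_a, url_b, round(ratio, 2)))
    return flagged

pages = {
    "/a": "red leather shoes for men",
    "/b": "red leather shoes for men",
    "/c": "contact us by phone or email",
}
print(similar_pages(pages))
```

Pairs that come back from such a check are the ones to rewrite with unique content or merge into a single page.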

The Same URL for Mobile Sites: To avoid duplicate content between the desktop and mobile versions of your site, use a responsive design or serve both versions from the same URL.

Check Guest Posts for Duplication: Before you accept guest posts, check that they have not already been published elsewhere. Plagiarism can bring serious penalties, even for reputable websites.

Tell Google How to Index Your Site: Google allows you to decide which page should be crawled and which should not. You can also inform Google how you would like it to index your pages.
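Crawling is commonly controlled with a robots.txt file. As a sketch, Python's standard urllib.robotparser module can show how a set of rules would apply to a URL. The rules and paths below are invented for the example, and note that a Disallow rule blocks crawling rather than guaranteeing that a page stays out of the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers away from
# printer-friendly pages and internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_crawlable(url, user_agent="Googlebot"):
    """Report whether the rules above let the given crawler fetch the URL."""
    return parser.can_fetch(user_agent, url)

print(is_crawlable("http://www.example.com/print/shoes"))
print(is_crawlable("http://www.example.com/shoes"))
```

Testing rules this way before publishing them helps avoid blocking pages that should remain crawlable.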

Be Consistent With Your Internal Linking Strategy: Stick to one URL format for internal links, for example always with or always without www, so that every link to a page uses the same address.

Tools

Google Webmaster Tools: Use Google Webmaster Tools to trace duplicate content in meta descriptions and title tags. Log in to your account, click on Diagnostics, followed by “HTML Suggestions.” You will see a table showing Duplicate Title Tags and Duplicate Meta Descriptions. Clicking on any of the links will show you the URLs where the duplication occurs.

Siteliner: Use Siteliner to check for duplicate content and broken links by entering your website URL and clicking on “Go.” Siteliner will generate a full report on duplicate content, broken links, and skipped pages. Click on “Duplicate Content” in the Site Details section to get an overview of the URLs, titles, match words, match percentage, and match pages.

ScreamingFrog: The ScreamingFrog crawler will crawl up to 500 pages for free and check for issues including duplicate content. Click on Page Titles. Select “Duplicate” in the “Filter” section. You will get a list of the URLs that share a page title. Analyze them and correct them.

Virante Duplicate Content Checker: Submit your domain and Virante scans your site to see if there is any internal duplication. It conducts a Google cache check, a 404 check, a www versus non-www check (by checking the headers returned by both versions of the URL), a PR dispersion check, and a check for supplemental pages in the Google index.

Xenu: Xenu checks for broken links, and its report can also reveal identical titles. Launch Xenu Sleuth, go to File, and click on Check URL. As soon as you click on OK, Xenu will start crawling the URLs. Save the results and export them to MS Excel. You can then go through the table to find identical titles and analyze the spreadsheet for duplicate content issues.

SmallSeoTools: To check for plagiarism, copy the content you want to check and paste it in the yellow box on the tool. Type in the captcha code and click on “Check for Plagiarism.” The tool will tell you how original your content is. Phrases that have been lifted from elsewhere will be marked in red, and you can click on the highlighted text to see the source.
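Several of the tools above report pages that share a title. For a crawl exported to a spreadsheet or a list, the same check can be sketched in a few lines. The input format, a mapping from URL to page title, and the function name duplicate_titles are assumptions for this example and do not reflect any tool's actual export format.

```python
from collections import defaultdict

def duplicate_titles(titles_by_url):
    """Group URLs by title and return only the titles used more than once."""
    groups = defaultdict(list)
    for url, title in titles_by_url.items():
        # Ignore case and surrounding whitespace when comparing titles.
        groups[title.strip().lower()].append(url)
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}

crawl = {"/a": "Red Shoes", "/b": "red shoes ", "/c": "Contact"}
print(duplicate_titles(crawl))
```

Each group in the result is a set of URLs worth reviewing for a redirect, a canonical tag, or a rewritten title.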

Duplicate content is a problem that can be fixed. Replacing duplicate content with unique and informative content that is of value to both users and search engines will give your website a much-needed boost.

If you think we missed out on some important tools for the detection of duplicate content, let us know by commenting below. Apart from this, you can also send us your feedback if you have some additional information and tips to tackle the issue of duplicate content.
