Search Engines Are Winning the War on Content Farms [STUDY]

A University of Glasgow computer scientist tested 50 queries known as attractive fodder for content farmers and found that Google and Bing seem to be winning the ongoing war against lower quality search results.

This time last year, the search industry was focused on the general quality of search results, which eventually caused a storm of protest from all parts of the web about over-optimized websites that offered users little of value and too many ads. The trend taking us into 2012 is that low-quality websites have been falling off the top of the SERPs, largely as a result of Google’s Panda algorithm update.

University of Glasgow computer scientist Richard McCreadie, at the request of NewScientist magazine, examined 50 queries known as content farm targets in March and again in August. The results, according to NewScientist, are “striking.”

For the purpose of this study, McCreadie defined low quality results as “uninformative sites whose primary function appears to be displaying adverts.” He hired people to review the results on Google and Bing during the two given time frames.

Keep in mind that the first Panda update rolled out on February 24th and affected 11.8 percent of results, so some of the test queries were most likely already affected by the beginning of the initial test period. That further progress was seen throughout the year reinforces that subsequent Panda updates did as they were meant to do, according to Google: “reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful.”

Amit Singhal and Matt Cutts explained further how Panda sniffs out low quality sites, in a March interview with Wired:

Singhal: We wanted to keep it strictly scientific, so we used our standard evaluation system that we’ve developed, where we basically sent out documents to outside testers. Then we asked the raters questions like: “Would you be comfortable giving this site your credit card? Would you be comfortable giving medicine prescribed by this site to your kids?”

Cutts: There was an engineer who came up with a rigorous set of questions, everything from, “Do you consider this site to be authoritative? Would it be okay if this was in a magazine? Does this site have excessive ads?” Questions along those lines.

Singhal: And based on that, we basically formed some definition of what could be considered low quality. In addition, we launched the Chrome Site Blocker [allowing users to specify sites they wanted blocked from their search results] earlier, and we didn’t use that data in this change. However, we compared and it was 84 percent overlap [between sites blocked by the Chrome blocker and downgraded by the update]. So that said that we were in the right direction.

One of the queries McCreadie identified as attractive to content farmers was “how to train for a marathon.” In that example, sites with generic lists of tips were present in the March test, but had disappeared from the top 10 by the August test, replaced with higher quality results from reputable publications such as Runner’s World magazine. McCreadie reported to NewScientist that they had found similar trends across the 50 test queries.
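To make the before-and-after comparison concrete, here is a minimal sketch of how rater labels from two snapshots could be turned into a share of low-quality results. It is not McCreadie’s actual method or data: the function name `low_quality_share`, the query names, and every label below are made up for illustration, and we assume each top-10 result simply gets a yes/no “low quality” judgment from the raters.

```python
def low_quality_share(ratings):
    """Return the fraction of rated results labelled low quality.

    `ratings` maps each query to a list of booleans, one per top-10 result,
    where True means the raters judged that result to be low quality.
    """
    total = sum(len(labels) for labels in ratings.values())
    flagged = sum(sum(labels) for labels in ratings.values())
    return flagged / total if total else 0.0


# Hypothetical labels for two placeholder queries (NOT data from the study).
march = {
    "query a": [True, True, False, False, True, False, False, False, False, False],
    "query b": [True, False, False, False, False, False, False, False, False, False],
}
august = {
    "query a": [False, False, False, False, True, False, False, False, False, False],
    "query b": [False] * 10,
}

march_share = low_quality_share(march)
august_share = low_quality_share(august)
print(f"March: {march_share:.0%}, August: {august_share:.0%}")
```

A falling share between the two snapshots is the kind of trend the study describes, although the real evaluation covered 50 queries on both Google and Bing.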

Between the March and August test periods, Panda was updated five times:

  • April 11th, Panda 2.0 introduced signals such as user-blocked websites
  • May 9th, Panda 2.1, minor changes
  • June 16th to 20th, Panda 2.2, more minor changes
  • July 26th, Panda 2.3 acknowledged by Google
  • August 12th, Panda 2.4 rolled out the algorithmic changes globally

Late in April, Forbes took a look at early results to determine how top content farms had been affected by the first two incarnations of Panda. At that time, Google referrals to Demand Media’s Answerbag were down 80 percent, and eHow, another Demand Media property, saw its Google search visibility drop 42 percent. Overall, Demand Media traffic fell 40 percent, according to Experian Hitwise. Here are some other content farm traffic results in the wake of Panda, from around the web:

Hubpages Traffic According to Quantcast

Mahalo hit by Google Panda

Image by Amit Agarwal

Since McCreadie’s study, Google has updated the algorithm a number of times, most notably with the September 28th Panda 2.5 update and the November 3rd Google Freshness update, which affected 35 percent of searches.

Over the course of the last year, we’ve heard loud cries of protest after each update from smaller site owners who felt they’d been unfairly penalized by Panda. In retrospect, though, as we head into a new year, it does seem that Panda is accomplishing what it was meant to do.

Towards the end of 2011, on Webmaster Radio’s Webcology show, host Jim Hedger asked each of the Year in Review panelists what they felt the biggest search story of the year had been. Perhaps surprisingly, Panda wasn’t really on the radar of some of the more recognized names in search as one of the bigger concerns of 2011. In the Webcology chatroom, industry vets including Jill Whalen generally agreed that sites hit by Panda, whether their owners realized it or not, were found time and again to have areas in need of improvement that could well have contributed to their being snagged in the updates: duplicate content, thin or shallow content, and overwhelming ad placement.

Google is clearly committed to continuing its process of identifying signals that indicate lower-quality sites and introducing tweaks to improve results. Bing… well, that McCreadie found similar trends on the two largest engines is hardly surprising, given Steve Ballmer’s revelation at the Web 2.0 Summit, which I reference often and would nominate for Search Quote of the Year, if I could:

“Take any search you want and try it out on Bing, and try it out on Google… 70 percent of the time, you probably won’t care, 15 percent of the time you’ll probably like us better, and 15 percent of the time you’ll like the other guy better.”

How did Panda affect your strategy over the course of the last year, and what do you expect to see as far as algorithm updates in 2012? Let us know in the comments!

