How Twitter Data Can Improve Real-Time Search

It’s time to face an unfortunate truth: the evolution of real-time search has been painfully slow. Efforts to deliver high-quality content to users within a short turnaround time (from 10 minutes to four hours) remain astonishingly unreliable. Like many in the search community, I expected Google’s Latest Results to fade out the way New Coke did in the ’80s.

Rather than burning out (or slowly fading), Latest Results has lingered. The reason is simple: it’s a brand new vertical and the expectation of smashing success at first pass isn’t always realistic.

Instead, real-time search in Google takes a beta form and is noteworthy not for what it does, but for what it might be able to do in the future. Simply put, it’s good enough for now. For it to thrive, Google needs to vastly improve real-time search the same way it improved search in the late ’90s: through a more creative understanding of authority that extends far beyond linking or the strength of a domain.

Before delving into a proposed solution, let’s outline the problem.

Why Is Real-Time Search Ineffective?

Traditional search is eons ahead of real-time search simply because it has access to the trillions of interconnecting hyperlinks used to interpret and identify the power hierarchies within the Internet. Generally speaking, the more quality backlinks a Web site has, the more likely it is to rank for its targeted keywords.

However, it’s unlikely that links can or should ever have a significant impact on real-time search, because links are accrued over time. A link is typically housed within an article, something that can’t be conceived or published instantaneously.

These powerful links can’t be applied in real-time search, so the algorithm is left with an enormous quality gap. The question becomes: how do search engines close this gap in order to provide more relevant information to users?

What Will Replace the Link?

The answer, in part, may lie in Twitter. Twitter is one of the few publishing forums on the Internet in which links can be easily shared within seconds of the release of a story. This makes the retweet a tantalizing metric with the potential to become a key foundation of real-time search. With the press of a single button, users can now share content with their entire audience.

Skeptics might be tempted to immediately point to the fact that spammy retweets are commonplace. Publishers often have zombie accounts that automatically retweet, and attempted manipulation would be inevitable. Would such activities completely invalidate the value of a retweet?

Not necessarily.

Identifying Twitter Authority

In traditional search, not all links are equal. The same goes for the retweet. This simply places a premium on identifying and understanding Twitter power users. Once authority is established, the votes of influential users would carry greater weight than those of their spammy counterparts. Below, I’ve listed just a few potential criteria that could be used to develop a listing of authoritative users:

  • Number of followers: What is the user’s sphere of influence? Is the user read by 10 or 10,000? While this data is helpful, it remains insufficient as these numbers are often greatly inflated by the “if you follow me, I’ll follow you” tendency of less-than-reputable users.
  • Ratio of following to followers: Drilling down a bit deeper, we can see that the most influential tweeters often have drastically lopsided following-to-follower ratios. Influential users typically follow fewer than 500 accounts, yet can have anywhere from 10,000 to 1 million-plus followers. Conversely, fly-by-night spammers typically follow several thousand people in quick rashes, followed by massive sheddings if the favor isn’t returned.
  • Number of original tweets: In scoring news publications, the Google News algorithm measures how prolific the news source is and the ratio of original to AP content. This additional factor could be used to determine authority among power users.
  • Link aggregation data: When the user links out, is it to high authority sites or spammy sites?
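
To make the criteria above concrete, here is a minimal sketch in Python of how they might be folded into a single score. Everything in it is an assumption for illustration: the TwitterUser fields, the authority_score function, the log scaling, and the equal weights are hypothetical, not anything Google or Twitter is known to use.

```python
import math
from dataclasses import dataclass


@dataclass
class TwitterUser:
    """Hypothetical per-user signals matching the four criteria above."""
    followers: int
    following: int
    original_tweets: int   # tweets the user wrote
    total_tweets: int      # original tweets plus retweets
    good_links: int        # outbound links to high-authority sites
    total_links: int       # all outbound links


def authority_score(user: TwitterUser) -> float:
    """Return an illustrative authority score between 0 and 1."""
    # Reach: log-scaled follower count, capped at 1 (about 1 million followers).
    reach = min(math.log10(user.followers + 1) / 6, 1.0)
    # Selectivity: near 1 when followers far outnumber accounts followed.
    selectivity = user.followers / (user.followers + user.following + 1)
    # Originality: share of the user's tweets that are original.
    originality = user.original_tweets / max(user.total_tweets, 1)
    # Link quality: share of outbound links pointing at authoritative sites.
    link_quality = user.good_links / max(user.total_links, 1)
    # Equal weighting is an arbitrary choice made for this sketch.
    return (reach + selectivity + originality + link_quality) / 4


influencer = TwitterUser(50_000, 300, 800, 1_000, 90, 100)
spammer = TwitterUser(4_000, 4_500, 20, 1_000, 5, 100)
print(authority_score(influencer) > authority_score(spammer))
```

The follower count alone would put these two accounts fairly close together; it is the ratio, originality, and link-quality terms that separate them, which is the point of using several criteria at once.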

Now, in Real Time

Applying this methodology, the retweet essentially transmogrifies into a vote-casting system in an unequal democracy (the best kind).
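
As a rough illustration of that unequal democracy, the sketch below tallies retweets as votes weighted by each user's authority. It assumes a precomputed authority score between 0 and 1 for every user; the rank_by_weighted_retweets function, the one-vote-per-user rule, and the account names are all hypothetical.

```python
from collections import defaultdict


def rank_by_weighted_retweets(retweets, authority):
    """Rank URLs by the summed authority of the users who retweeted them.

    retweets:  iterable of (user, url) pairs
    authority: dict mapping user -> score in [0, 1]; unknown users count as 0
    """
    totals = defaultdict(float)
    seen = set()
    for user, url in retweets:
        if (user, url) in seen:   # assumption: one vote per user per story
            continue
        seen.add((user, url))
        totals[url] += authority.get(user, 0.0)
    # Highest weighted vote total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


authority = {"trusted_reporter": 0.9, "bot_1": 0.05, "bot_2": 0.05, "bot_3": 0.05}
retweets = [
    ("bot_1", "scraper.example/story"),
    ("bot_2", "scraper.example/story"),
    ("bot_3", "scraper.example/story"),
    ("bot_1", "scraper.example/story"),          # repeat vote, ignored
    ("trusted_reporter", "original.example/story"),
]
ranking = rank_by_weighted_retweets(retweets, authority)
print(ranking[0][0])
```

Here a single retweet from a trusted account outweighs three zombie accounts retweeting the scraped copy, even though the raw retweet count favors the scraper.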

The retweet could assist in drastically improving both Google News and Latest Results. Because Google News basically works on semantic relevance and domain authority, it would be vastly improved by a user voting system. This would prevent less authoritative sites (often the original creators of the content) from being overshadowed by behemoth scrapers such as the AP and other large news outlets. Moreover, Google News would finally be rewarding the article and not the size of the newspaper.

Additionally, metrics such as quality retweets could be used to assign something along the lines of a JournalistRank, a score measuring a journalist’s total impact, which would affect his or her likelihood of ranking highly in the future. Presumably, this could benefit high-profile writers such as Michael Lewis, and that rank could carry over independently from domain to domain.

It’s unlikely that any one data set will replace the link. Instead, real-time search will thrive only through triangulation of information that effectively indicates authority.
