How Google's Human Search Quality Raters Assign a URL Rating

A leaked URL rater guidebook from Google’s Search Quality Rating Program offers insight into what human raters look for when they evaluate webpages and assign a URL a quality score. Here is how those ratings are determined.

Google ran over 20,000 experiments in 2010 to arrive at the 500 modifications it felt best improved search results for users. The algorithm weighs over 200 factors, or signals, and these constant updates change how relevant each signal is and how it affects search results. It might seem that the algorithm changes so often that webmasters can’t keep up, especially when updates like Panda come out.

However, there are specific qualities and attributes real people – Google’s URL Raters – look for when they visit your page. Those guidelines have been leaked in the form of a 120+ page training manual for new URL raters.

In a strange twist of fate, PotPieGirl found Google’s training manual for human URL quality raters by simply doing a Google search. URL rating seems to be the first step in Google’s manual rating hierarchy; from the introduction: “When you can do URL rating, you will be well on your way to becoming a successful Search Quality Rater!” As a fun little aside, the Adobe PDF (not Google Docs) guide sits in the Amazon cloud and instructs Google raters that they must use Firefox (not Chrome) in testing.

URL raters are given a URL and query. They are instructed to visit the landing page and assign a rating based on the guidelines laid out in the 125-page guide. So what, exactly, are they looking for?

Google’s Manual Rating Scale and Possible URL Quality Flags

[Image: Google URL rating scale]

Important scoring factors for URL Raters (taken from throughout the guide):

  • User intent and page utility
  • Location
  • Language
  • Query interpretation
  • Specificity of the query and landing page
  • Timeliness of the informational need of the query to evaluate whether recent content is necessary
  • The language of video content vs. query and landing page content language

[Image: Do-Know-Go query classifications]

Once the URL rater understands the query based on task language and location, as well as its dominant interpretation (the one most users have in mind), they look at the user’s intent based on “Do-Know-Go” classifications: does the user want to do something, know something, or go to a specific page? They can then begin to evaluate the landing page and assign ratings.

The top rating, “Vital,” is used only in special situations (i.e., when the dominant query interpretation is an entity such as a person or company). Interestingly, a Vital score does not require that the page be useful to a user, only that it be official.

Depending on what the URL Rater determined as the user’s intent, the homepage or a subpage may get a Vital rating. For example, for “Go” queries, the Vital page is the one requested by the user. If that happens to be a subpage, it gets the Vital rating over the homepage. However, if the query is an entity and the user intent is navigation, the homepage automatically gets the Vital marker. There are several pages of examples within the guide.

Pages returned for queries on common names like Bob Smith cannot be marked Vital. How does social media fit in? A social page for a celebrity or person with a unique, identifying name, or for a small group of people such as a band, can be marked Vital. Social media pages for companies cannot. Ever.

Most Queries Do Not Have a Vital Webpage

The scope of Vital, according to Google, is quite narrow. If a query does not have a dominant interpretation, is not an entity or navigational query, or no official webpage exists for the query, there can be no Vital result.

For example, queries such as “knitting,” “ipod reviews,” and “diabetes” will not be assigned a Vital result. In cases where entities have official homepages on multiple domains, all such URLs can be considered Vital. Once a URL is deemed Vital, it is further sub-classified as either Appropriate Vital, International Vital, or Other Vital.
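The eligibility conditions above can be read as a simple checklist. The following is a minimal sketch of that reading, not anything taken from the guide: the `Query` class, its field names, and the `can_be_vital` function are hypothetical, and in practice each of these judgments is made by a human rater rather than computed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical record of a rater's judgments about one query."""
    text: str
    has_dominant_interpretation: bool
    is_entity: bool
    is_navigational: bool
    # Official pages for the entity; may span several domains.
    official_urls: Tuple[str, ...] = ()


def can_be_vital(query: Query, url: str) -> bool:
    """Return True only if none of the article's exclusions apply."""
    if not query.has_dominant_interpretation:
        return False
    if not (query.is_entity or query.is_navigational):
        return False
    # No official page for the query means no Vital result at all.
    return url in query.official_urls


# "knitting" has no entity or navigational reading, so nothing is Vital.
knitting = Query("knitting", False, False, False)

# A made-up company with official homepages on two domains.
acme = Query(
    "acme widgets", True, True, True,
    ("https://acme.example/", "https://acme.example.org/"),
)
```

With these made-up inputs, both official `acme` URLs pass the check and any other URL fails it, which mirrors the point that every official homepage of a multi-domain entity can be Vital.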

So, assuming you’re not a former president, international brand, or celebrity, let’s look at the other ratings to see how you could improve your own URL rating the next time a manual rater comes knocking.

Google’s Criteria for a “Useful” Webpage

A single query can have multiple Useful webpages. Manual URL Raters are instructed to decide whether the page is high quality and a good “fit” for the query. Useful pages may be “highly satisfying, authoritative, entertaining, and/or recent (such as breaking news on a topic).”

They should be organized and seem trustworthy. The information sources must seem reliable. “Spammy” pages should not get a Useful rating. Google notes that users searching for entities such as celebrities and companies are often looking for entertainment and therefore Useful might suit a social networking page or blog that didn’t qualify as Vital.

Sound subjective? It really is, but keep in mind that the manual review and rating process is just one piece of the puzzle.

Earning a “Relevant” URL Score

Pages marked Relevant must be “helpful for many or some users.” They must still fit the query but may not have as many valuable attributes as Useful pages. They might be less recent, address only a piece of the query, or be less comprehensive overall.

Relevant pages are said to be of average to good quality; all other aspects considered, a helpful and somewhat authoritative page will not score a Relevant if it is of poor quality.

URL Ratings You Want to Avoid

Slightly Relevant webpages are not helpful for most users and may contain less helpful information and/or be of lower quality overall, though they are related to the query. The information on the page may be outdated, too broad… or too specific. Shallow pages with little content or information go here.

If a mobile landing page appears in a regular URL rating task for any given query, it automatically gets a Slightly Relevant rating. If the content on that mobile page isn’t related to the query, it is knocked down to Off-Topic or Useless.
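As a rough illustration of the mobile rule just described, here is a small sketch. The `Rating` enum and the `rate_mobile_landing_page` function are hypothetical names, and the numeric ordering is only an assumption drawn from the order in which this article presents the ratings.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Rating names from the article; the numeric order is an assumption."""
    OFF_TOPIC_OR_USELESS = 0
    SLIGHTLY_RELEVANT = 1
    RELEVANT = 2
    USEFUL = 3
    VITAL = 4


def rate_mobile_landing_page(content_related_to_query: bool) -> Rating:
    """Apply the mobile rule for a regular (non-mobile) URL rating task."""
    if content_related_to_query:
        # A mobile page in a regular task is automatically Slightly Relevant.
        return Rating.SLIGHTLY_RELEVANT
    # Unrelated mobile content is knocked down to the lowest rating.
    return Rating.OFF_TOPIC_OR_USELESS
```

Whether the content is related to the query remains the rater's call; the sketch only shows that a mobile page in a regular task never rates above Slightly Relevant.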

Google calls out a few tricks content marketers may use to make a page appear more relevant than it really is, and instructs manual raters to look past query terms in the URL or page title, copied content, and repeated keywords.

How can you earn the Google kiss of death with an Off-Topic or Useless rating?

  • Helpful to few or no users
  • Page content unrelated to the query
  • Deceptive content
  • Links and ads with no actual content
  • Links that redirect to other pages with more links and ads
  • Nothing on the page that the rater can consider helpful to users

Flags and the URL Rater’s Interpretation of Web Designer Intent

From section 6.1: “If you find a page to be ‘spammy,’ but you don’t feel comfortable saying that the webmaster definitely designed the page using deceptive web design techniques, you should assign a Maybe Spam flag.” Raters are instructed to mark pages as Spam if they believe the webmaster designed the page using deceptive web design techniques.

A page will be flagged as Porn if it has pornographic content in images, links, pop-ups, and/or ads. Raters are cautioned to use their judgment to decide if the content qualifies as porn in the country in which the task is located.

Where user intent based on the query shows that they were clearly not looking for porn, Google instructs the rater to give the page an Off-Topic or Useless rating in addition to the Porn flag. One example would be a search for “classic cars,” with a result showing a naked person on top of a car. Porn pages can be scored as high as Vital, but will always get the Porn flag. Child pornography and bestiality must be reported by the rater in accordance with U.S. Federal law, regardless of the task location.

Low-quality pages or those with a lot of ads do not necessarily earn a Spam designation. Further, Spam flags are not query-dependent; a page could get pegged as Spam if it shows up in a manual rater’s task list on a completely unrelated query.
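One way to picture the relationship between ratings and flags in this section is as two separate fields: a rating that depends on the query, and a set of flags that describe the page itself. The sketch below is only an illustration under that assumption. All names (`Rating`, `PageFlag`, `Assessment`, `spam_flag`, `assess_porn_page`) are hypothetical, and the boolean inputs stand in for judgments a human rater makes.

```python
from dataclasses import dataclass
from enum import Flag, IntEnum


class Rating(IntEnum):
    """Rating names from the article; the numeric order is an assumption."""
    OFF_TOPIC_OR_USELESS = 0
    SLIGHTLY_RELEVANT = 1
    RELEVANT = 2
    USEFUL = 3
    VITAL = 4


class PageFlag(Flag):
    """Flags are properties of the page, independent of the query."""
    NONE = 0
    MAYBE_SPAM = 1
    SPAM = 2
    PORN = 4


@dataclass(frozen=True)
class Assessment:
    rating: Rating
    flags: PageFlag


def spam_flag(looks_spammy: bool, sure_design_is_deceptive: bool) -> PageFlag:
    """Choose between Spam and Maybe Spam as described for section 6.1."""
    if sure_design_is_deceptive:
        return PageFlag.SPAM
    if looks_spammy:
        return PageFlag.MAYBE_SPAM
    return PageFlag.NONE


def assess_porn_page(rating_for_query: Rating,
                     user_clearly_not_seeking_porn: bool) -> Assessment:
    """A porn page always carries the Porn flag, whatever its rating."""
    if user_clearly_not_seeking_porn:
        rating = Rating.OFF_TOPIC_OR_USELESS
    else:
        rating = rating_for_query  # may be as high as Vital
    return Assessment(rating, PageFlag.PORN)
```

Keeping the flag apart from the rating is what lets a page be both Vital and flagged Porn, and what lets a Spam flag apply even when the page turns up for an unrelated query.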

What makes a page spammy:

  • Hidden text or links – may be exposed by selecting all page text and scrolling to the bottom (all text is highlighted), disabling CSS/JavaScript, or viewing source code
  • Sneaky redirects – redirecting through several URLs, rotating destination domains, cloaking with JavaScript redirects and 100% frames
  • Keyword stuffing – no percentage or keyword density given; this is up to the rater
  • PPC ads that only serve to make money, not help users
  • Copied/scraped content and PPC ads
  • Feeds with PPC ads
  • Doorway pages – multiple landing pages that all direct the user to the same destination
  • Templates and other computer-generated pages mass-produced, marked by copied content and/or slight keyword variations
  • Copied message boards with no other page content
  • Fake search pages with PPC ads
  • Fake blogs with PPC ads, identified by copied/scraped or nonsensical spun content
  • Thin affiliate sites that only exist to make money, identified by checkout on a different domain, image properties showing origination at another URL, lack of original content, and different WhoIs registrants for the two domains in question
  • Pure PPC pages with little to no content
  • Parked domains

UPDATE: Apparently Google was busy yesterday, as the link to the original document has disappeared, except for those few who were able to download and save the PDF.
