Google & The Approved Cloaking Problem

Google needs to modify its policy on cloaking and provide more webmaster support, so programs like Google Scholar don't leave it open to accusations of hypocrisy.

The launch of Google Scholar shows once again that Google has no problem with cloaking, as long as the cloaking is approved by Google itself.

As you may recall, I wrote back in May about how Google was allowing NPR to cloak content that was in both its Google News and Google Web Search indexes: Cloaking By NPR OK At Google.

There’s no problem on the searcher side with allowing this. Google wants to have more good content in its index. Allowing the cloaking of NPR’s content, and now some scholarly content, helps the searcher. However, it flies in the face of Google’s own stated policy on cloaking:

The term “cloaking” is used to describe a website that returns altered webpages to search engines crawling the site. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they’ll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings.

In both situations, Google is shown content that differs from what regular users see. My past article documents this in the NPR case. In the Google Scholar situation, Google is able to spider the full text of documents, while many regular searchers without password access to the material will see only abstracts.
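To make the mechanics concrete, here's a minimal sketch of the kind of serving logic being described: one URL, but the server decides what to return based on who is asking. Everything here is hypothetical and illustrative, not any publisher's actual implementation; the crawler names and function are my own assumptions.

```python
# Hypothetical sketch of publisher-side cloaking as described above.
# The same document URL returns full text to a search engine crawler,
# full text to a subscriber with password access, and only an abstract
# to everyone else. All names here are illustrative assumptions.

FULL_TEXT = "Full text of the scholarly article..."
ABSTRACT = "Abstract: a short summary of the article."

# Assumed crawler identifiers checked against the User-Agent header.
CRAWLER_AGENTS = ("Googlebot", "Googlebot-News")

def serve_document(user_agent: str, has_password_access: bool) -> str:
    """Return the content a given request would see under this scheme."""
    if any(bot in user_agent for bot in CRAWLER_AGENTS):
        # Crawler gets the full text, so the whole document can be indexed.
        return FULL_TEXT
    if has_password_access:
        # A logged-in subscriber also sees the full text.
        return FULL_TEXT
    # A regular, unauthenticated visitor sees only the abstract.
    return ABSTRACT
```

The point of the sketch is that the "distortion" is built into the serving decision itself: the index ends up describing a page most searchers cannot actually read in full.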

In addition, the presence of such material naturally distorts search rankings. If the material is now in the index, it suddenly has a chance to rank well and push aside other material. This is a GOOD distortion. Users may want to find this formerly invisible material. But it's a distortion nonetheless, and one that falls under Google's current definition of cloaking.

Rewrite The Definition

I’ve suggested to Google that it make some changes to its policies, to help reconcile what it says with what it does. Here are two rewordings I’ve passed along:

The term “cloaking” is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they’ll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings. In some limited cases, Google does have arrangements with publishers where we may crawl material different from what a regular user sees. In these cases, the arrangements are done because we feel they benefit the quality of our searches, not harm them.

Or:

The term “cloaking” is used to describe a website that returns altered webpages to search engines crawling the site without permission. In other words, the webserver is programmed to return different content to Google than it returns to regular users, usually in an attempt to distort search engine rankings. This can mislead users about what they’ll find when they click on a search result. To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking without our permission, if we feel it is harmful to our search rankings.

Perhaps this all seems like word play to Google, but it’s this type of oversight that causes some people to lose faith in the company. Our forums recently had a thread looking at other cases where what Google says differs from how it acts: Google quirks summary (all lies…).

More important, when Google allows cloaking despite saying it is not allowed, web site owners who aren’t permitted special arrangements with Google are left feeling like anything goes. Clarifying the policy is important. Otherwise, as I’ve heard from people since I wrote about the NPR story, some will feel they may as well cloak if they believe the circumstances warrant it.

This leads to two final points. First, cloaking itself is not bad. Rather, cloaking as a technique is often tied to bad intent, but not always.

Intent, Not Technique

If the intent is to try to manipulate the search engine in ways many might find harmful to users, then the ban is really on that manipulation rather than on the fact that cloaking is involved. If the intent is deemed good, as with NPR and Google Scholar, then Google clearly sees cloaking as OK.

I hope Google will make the word changes and finally acknowledge that in some cases, it does approve cloaking. My past article from 2003, Ending The Debate Over Cloaking, looks at how this might help us get back on track to think about intent rather than technique.

Better Support For All

The last point is that Google needs to rapidly develop some system to extend the special arrangements it gives only some publishers to all of them. There is plenty of good, non-scholarly material locked behind password systems. That material — much of it perhaps even more important to the general public — remains inaccessible.

Google takes feeds from merchants. It works with book publishers. Academic publishers now get to have relationships. But general web publishers, upon whom Google has built its business? They remain out in the cold.

I’ve written and written and written in the past about the need for Google to provide some type of webmaster services to such publishers. It’s time for the standard response of “we’re always thinking” or “maybe in the future” to end. Get on with it now.

Failure to do so is going to cause web site owners to lose further faith in Google or, as mentioned, simply decide they might as well do whatever makes sense to them.

A recent forum thread illustrates this: Locked content on Google. There, a web site owner is trying to get his password-protected content indexed. The idea of doing abstracts is suggested, but so is cloaking — and that’s something the site owner considers. How much better would it be for Google to simply establish formal relationships with sites, so such end-runs and other games could end? Much better!

FYI, I’ve had several off-the-record conversations with Google about all this over the past few months, and I sent another follow-up message to them yesterday. I’ve still got no on-the-record comment I can report on the cloaking policy. If that changes, I’ll let you know.
