The future of voice search: “Ubiquitous and seamless”
These are interesting times for voice search, both in terms of its adoption among consumers and its technological development.
We have moved beyond seeing voice search as a futuristic concept with rather limited and stilted realization, to viewing it as an increasingly integral part of our interactions with home and handheld devices.
However, voice search brings with it a lengthy list of questions for technology providers, consumers, and marketers alike.
If we are indeed at something of a crossroads for this technology, it seems a good time to address these questions, giving particular thought to how the landscape will change over the next few years.
These questions include, but are certainly not limited to:
Unfortunately, none of Siri, Alexa, or Google Now was of much assistance when I posed these questions to them, so we will endeavor to answer them the old-fashioned way.
Let’s start with a quick recap of where we are today.
Voice search providers understand a user’s intent not just through what question is being asked, but also through geo-information, browsing history and past behavior, with the goal of instantly answering that query.
At its apotheosis, this technology should be able to alert us to – and resolve – queries and issues before we even become consciously aware of them. Push notifications from Google Now on Android devices provide a glimpse of just how effective this could be.
Voice search has actually been around for well over a decade, but until recently it has been subordinate to its text-based counterpart, hindered by hilarious but damaging bloopers.
Verbal communication, of course, predates written language and, as such, it feels more natural for us to hold a spoken conversation.
However, when it comes to searching for and retrieving information online, we have experienced this development in reverse, starting with written language and progressing to verbal communication.
As a result, marketers have often been left with the unenviable task of inferring user intent from the simple phrases typed into search engines.
This has come with benefits, too. One of the real defining elements of search marketing has always been the predictability of search queries and volumes.
We set budgets aside based on these numbers and forecast performance taking them as facts, so it will affect us if search trends are imbued with the inherent fluidity and transience of speech patterns.
That said, it has taken the collective might of Google, Amazon, Baidu, Microsoft and Facebook to get us to a point where voice search is now a viable (and sometimes preferable) way of requesting information, and there is still some way to go before the technology is perfected.
There are many reasons for this staggered roadmap.
First of all, the task of taking meaningful spoken units (morphemes) from a person, converting them to text units (graphemes) on a computer, and finding the corresponding information to answer the original query is an incredibly complex one.
As such, the list of possible voice commands for a search engine still looks something like this:
We shouldn’t expect such formulaic constructions to remain as the standard, however.
Industry developments like Google’s Hummingbird algorithm have moved us closer to true conversational search than we have ever been before. Voice search therefore seems, logically, to be the area that will develop in tandem with advances in conversational search.
And for us search marketers, developments like the addition of voice queries within Search Analytics mean we will soon be able to report on our campaigns with at least a modicum of accuracy.
So, as natural language processing improves, the anthropomorphic monikers given to digital assistants like Alexa and Siri will make a lot more sense. They will engage us in conversation, even ask us questions, and understand the true intent behind our phrases.
We are already seeing this with Google Assistant. This technology can ask the user questions to better understand their intent, and then perform actions as a result, such as booking train tickets.
This is a fascinating and impressive development that has implications far beyond just search marketing. When combined with Google’s integration of apps into its search index, we can gain a clearer view into just how significant voice search could be in shaping user behavior.
It also moves us a few steps closer to query-less search, where a device knows what we want before we even think to ask the question.
It must be said, nonetheless, that Google is far from monopolizing this territory – Amazon, Apple, Baidu and Microsoft are all investing heavily and there is an ongoing land-grab for what will be very fertile territory.
We know that voice recognition, natural language processing, and voice search are of strategic importance to the world’s biggest tech companies, and a recent quote from Google reveals exactly why:
“Our goal in Speech Technology Research is twofold: to make speaking to devices around you (home, in car), devices you wear (watch), devices with you (phone, tablet) ubiquitous and seamless.”
To be both ubiquitous and seamless means being driven by a unified software solution.
Digital assistants, powered largely by the technology that underpins voice search, can be the software that joins the dots between all of those hardware touchpoints, from home to car to work.
As Jason Tabeling wrote last week, this is a growing hardware market and the onus is on securing as much of this market as possible.
Amazon and Google won’t always want to invest so heavily in the hardware business, however.
It would be far more sustainable to have other hardware makers incorporate Amazon’s and Google’s software into their devices, increasing the reach of their respective virtual assistants much more cost-effectively.
For now, winning the hardware race is a sensible Trojan horse strategy to ensure that either Google or Amazon gains a foothold in that essential software market.
Voice search is used predominantly by younger generations, although this trend is even more deeply entrenched in China than in the West, due to the complexity of typing Chinese phrases and a willingness to engage with new technologies. As such, Baidu has seen significant growth in the usage of its voice search platform.
Google voice search and Google Assistant are significantly improving their recognition accuracy, however; poor accuracy has previously been one of the main barriers to widespread uptake.
In fact, the difference between 95% and 99% accuracy is where use goes from occasional to frequent.
These margins may seem relatively inconsequential, but when it comes to speech they are the difference between natural language and very stilted communication. It is this 4-point increase in accuracy that has seen voice search go from gimmick to everyday staple for so many users.
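To see why a four-point gain matters so much, consider a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the accuracy figures are per-word rates and that errors on each word are independent of one another. Neither assumption comes from the providers themselves, and the function name and the ten-word query length are hypothetical.

```python
def whole_query_accuracy(word_accuracy: float, words: int) -> float:
    """Chance that every word in a query is recognized correctly.

    Assumption (not from the article): each word is an independent
    trial with the same per-word accuracy.
    """
    return word_accuracy ** words


# Compare the two accuracy levels for a hypothetical ten-word query.
for accuracy in (0.95, 0.99):
    p = whole_query_accuracy(accuracy, 10)
    print(f"{accuracy:.0%} per word: {p:.1%} of ten-word queries fully correct")
```

Under those assumptions, roughly 60% of ten-word queries come through without a single error at 95% accuracy, against about 90% at 99%.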
Certain types of queries and searches are likely to require more than just one instant answer, as they require a visual element; for example, planning a trip, or deciding which winter coat to buy.
It is imperative that businesses do not over-optimize for voice search without thinking this through, as voice search does not yet lend itself so readily to these more complex answers.
The graph below shows the different ways in which teens and adults have reported using voice search:
This generational gap is telling, as it strongly suggests that voice search will become more prevalent over time; not just because of the improved technology at consumers’ disposal, but also because of an increased number of people who have grown up with voice search and are accustomed to using it.
It is noteworthy, nonetheless, that so much of this increased usage relates to informational queries. The $37 billion per year search industry is predicated on the notion of choice, mainly within commercial queries.
There may be one true answer to ‘What time is it?’, but ‘What should I buy to wear to the party on Saturday?’ opens itself up to any number of possibilities.
The biggest challenge facing voice search providers as they try to monetize the increasing demand is that the interface simply doesn’t lend itself to advertising.
We saw this very recently with the controversy over the Beauty and the Beast ‘ad’, which was seen as invasive primarily because, if there is only one answer to a question, users are unwilling to accept an advertisement in place of a response.
That issue aside, other questions remain unanswered. If users do start to conduct commercial queries and the response is multifaceted, the traditional SERP seems a much more fitting format than a single-answer interface.
The question of how to monetize voice search has been raised repeatedly on Google’s quarterly earnings calls, so we can surmise that the company will find a solution.
We can expect Google to continue experimenting with ad formats for however long it takes to devise the right formula, while hopefully keeping its huge user base content along the way.
One prediction is that Google and Amazon will use the advent of augmented reality to provide multiple options in response to a voice-based query.
This would be in keeping with the nature of this more futuristic interaction, as it would feel disjointed to speak to a digital assistant and simply see four PPC ads on a phone screen as the response.
By creating an augmented reality-based search results page, visual search engines can sell advertising space and keep users satisfied.
We have seen signs that this could be tested soon, with Amazon said to be exploring the possibility of opening augmented reality homeware stores.
The irony of Amazon, the slayer of so many traditional stores, now taking a seemingly retrograde step by opening stores of its own, will not be lost on most in the industry.
These will be much more than just traditional brick-and-mortar presences for the online giant, however, and will be more in line with its forays into grocery shopping.
Now if we bring voice search and the Alexa digital assistant back into the frame, this all starts to fit together rather nicely.
Voice search suddenly becomes a vehicle to showcase and provide a wide range of products and services, from timely reminders about appointments, to contextual ad placements in response to commercial queries.
The more data is fed into this machine, the more accurate it becomes and – should privacy concerns be allayed or bludgeoned into submission – the happier the consumer will be with the results.
Voice search is not, on its own, the future of the search industry.
One real, if slightly lofty, ambition is to arrive at query-less search, requiring neither a text nor a voice prompt for a digital assistant to spring into action.
Another, more tangible and realistic, goal will be to use voice search to unify the varied touchpoints that make up the average consumer’s day. Though realistic in technological terms, this goal will of course remain tantalizingly out of reach if consumers use a variety of hardware providers and data is not shared across platforms.
Making all of this a “ubiquitous and seamless” experience will be hard for consumers to resist and will make it even harder for them to move to another provider and start the process over again. This will be the bargaining chip used to persuade consumers to stay loyal to Apple or Google products from home to car to work.
We will see the continued rise of query-less search, where digital assistants answer our questions pre-emptively. Think Google Next, rather than Google Now.