In a lively session at last week’s SMX West conference, three presenters made a strong case for the need to think creatively about optimizing content for voice and virtual assistants. Overall, the message was that Google is actively devouring online content to serve up in response to voice queries, and yet there’s still quite a bit of competitive advantage to be gained in an atmosphere where SEOs may not yet have caught on to the range of opportunities for voice optimization.
Sound, search and semantics: How form follows function
Upasna Gautam from Ziff Davis provided a detailed technical explanation of Google’s approach to Automatic Speech Recognition (ASR). She argued that only by learning about the form of Google’s voice processing technology will we be able to properly understand its function and deploy successful strategies.
Gautam explained that Google’s ASR is structured as a three-part process: sound signal processing, which converts speech into mathematical data; speech modeling, which determines the meaning of the utterance; and delivery of relevant search results back to the voice assistant.
At every processing stage, Google uses quality metrics to gauge and improve accuracy. Some examples:
- Word error rate: measures recognition accuracy at the word level
- Semantic quality: measures how closely voice results match results of queries typed in by a user
- Perplexity: measures the quality of a language model by its ability to predict the next word in a sequence
- Out-of-vocabulary rate: measures how many words spoken by a user are not accounted for in the language model
- Latency: the time it takes to complete a voice search
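To make two of these metrics concrete, here is a minimal Python sketch of how word error rate and perplexity are conventionally computed. This is an illustration of the standard definitions, not Google's implementation; the function names are my own.

```python
import math

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with a standard word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

def perplexity(next_word_probabilities: list[float]) -> float:
    """Perplexity of a language model over a word sequence: the exponential
    of the average negative log probability the model assigned to each
    next word. Lower is better — it means the model was less 'surprised'."""
    n = len(next_word_probabilities)
    return math.exp(-sum(math.log(p) for p in next_word_probabilities) / n)
```

For example, recognizing “turn on the lights” as “turn on the light” is one substitution in four reference words, a WER of 0.25; and a model that assigns probability 0.25 to every next word has a perplexity of 4.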
Using these metrics and others, in combination with machine learning and neural networks, Google’s voice processing technology constantly works to improve the results delivered to consumers. In light of this, SEO practitioners need to design well-structured, concise answers even to comparatively vague questions, and to understand the tradeoffs Google’s process is designed to make. Gautam suggested, for example, that Google will sometimes favor speed over accuracy: an answer that scores lower on semantic quality may still outrank a higher-scoring result if it can be delivered more quickly.
Voice search results and SERP features
Next up was Stephan Spencer, author and founder of The Science of SEO, who provided a useful overview of the specific tactics SEOs can use to create content for voice. Spencer pointed out that Google Home and Google Assistant tend to read out verbatim the content of featured snippets, and that many of the same strategies used today to gain featured status in search are also applicable in a voice context.
Special considerations do apply for voice. Of the three snippet types (paragraphs, lists, and tables), the first two work best as spoken responses to queries. Paragraphs are arguably the best fit for voice, which mirrors their popularity in SERPs: according to Spencer, paragraphs make up 81 to 82 percent of all featured snippets, lists 11 to 12 percent, and tables 7 percent.
Spencer offered some specific tactics to employ when creating snippets, aside from proper Schema markup:
- Wrap questions in H1 or H2 tags
- Put answers in paragraph tags
- Keep answers to 40 to 55 words
- Use standard formatting for lists, including proper UL or OL tags
- Structure FAQ pages by topic, not page number
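The tactics above can be sketched in code. The following Python helpers render a Q&A pair in the recommended markup (question in an H2, answer in a paragraph tag, with a warning when the answer falls outside the suggested 40-to-55-word range) and emit schema.org FAQPage structured data. The FAQPage/Question/Answer shape is the real schema.org vocabulary; the helper names and the warning behavior are my own illustration.

```python
import json

VOICE_ANSWER_WORDS = (40, 55)  # Spencer's suggested answer length, in words

def faq_snippet_html(question: str, answer: str) -> str:
    """Render one Q&A pair per the tactics above: question in an H2,
    answer in a paragraph tag."""
    words = len(answer.split())
    lo, hi = VOICE_ANSWER_WORDS
    if not lo <= words <= hi:
        print(f"warning: answer is {words} words; {lo}-{hi} recommended")
    return f"<h2>{question}</h2>\n<p>{answer}</p>"

def faq_schema_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Emit schema.org FAQPage JSON-LD for a list of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)
```

The JSON-LD block would typically be placed in a `<script type="application/ld+json">` tag on the same page as the rendered Q&A markup.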
In addition to these tips, Spencer suggested several creative ways to source content for snippets, such as targeting weaker answers from competitors and checking the “People also ask” questions that appear below many snippets.
He pointed out, however, that snippets are not the only way to be featured in voice. Generally speaking, Google is looking at content outside the traditional “10 blue links” when sourcing voice content, the specific source differing based on the intent of the query. Other content sources that feature prominently in voice search include local packs, knowledge panels, carousels, and recipes.
As for local pack placement, it’s more important than ever to pull out all the stops and compete for the top three positions, since Google voice responses favor the top three and push the rest further into obscurity.
Spencer ended his presentation by suggesting that we are in the midst of a paradigm shift whereby the graphical user interface, or GUI, is giving way to the LUI — the linguistic user interface. This development, Spencer claimed, is happening in a surprisingly rapid manner with big implications over the next decade.
6 steps for voice search
The final presentation, from Benu Aggarwal, president and founder of Milestone, Inc., discussed tactics for deploying FAQ content at an enterprise scale. Aggarwal emphasized an omnichannel perspective that makes FAQ content available from a central hub for syndication across voice search, chatbots, ad campaigns, and other endpoints.
Deploying a voice search campaign involves six steps, Aggarwal explained:
- Conversational content: developing content that lends itself to conversational contexts like voice and chat
- Design and UI: adapting your content to various interfaces through user-centered design
- Technology: employing AI-based technologies to deliver content to users via voice assistants, chatbots, your website, etc.
- Promotion: extending the reach of FAQ content by connecting it with organic and paid search
- Actions and skills: building actions for Google Assistant and skills for Alexa to leverage content
- Measurement of impact: using metrics for campaign success such as queries, clicks, conversations, and post-click actions
Another takeaway from Aggarwal’s presentation concerned the need for relevance and specificity. In contrast to earlier days when SEO tactics could be more blatant in their courting of keyword traffic, today’s voice targeting needs to be both useful to consumers and vertically appropriate. Aggarwal recounted, for instance, that her company sourced FAQ content for a hotel client by going directly to the front desk staff and asking them to list the 100 most common questions from guests.
Each speaker offered actionable advice for gaining an edge in what is sure to become an increasingly competitive space over the next year and beyond. I was left with a few unanswered questions, however. In particular, it strikes me as problematic that those who win the race to be featured as an answer to a Google Assistant query stand to gain very little aside from the thrill of victory. On a SERP, featured snippets can lead to clicks, but a voice response that reads the content of a snippet is a closed circle: a question matched with an answer that doesn’t lend itself to any follow-on action. Isn’t Google, then, merely capitalizing on the effort of others without providing any benefit in return?
Of course, it’s generally in our strategic interest to get content to appear prominently in whatever way Google chooses to feature it, but this is one of several examples where voice as a marketplace may still be said to be in its infancy.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.