Impressive statistics abound in relation to digital assistants and voice search. To cite just a few:
- Over 20% of mobile searches are by voice (Source: Google).
- There are 8.4 billion voice-enabled digital assistants in use worldwide today, according to Statista.
- Voice commerce sales will reach $164 billion worldwide by 2025.
Clearly, this is a growing industry and the physical evidence of the marketing opportunity lies all around us.
Devices like Google Home and the Amazon Echo range became increasingly prominent in living rooms, kitchens, and bedrooms in Western markets in the late 2010s. Furthermore, anthropomorphic assistants like Siri, Alexa, and Google Assistant are embedded in a wide range of smartphones, cars, and even fridges.

Voice search is no longer a trend: it’s a user search experience that exists at the intersection of increased mobile use, sophisticated machine learning algorithms, and the symbiotic relationship between people and technology.
However, despite the initial push and integration into various devices, monetizing these platforms has proved challenging. Amazon’s Alexa division, for example, was reported to have lost almost $10 billion in 2022. This was because many users relied on these devices for basic tasks like setting timers or playing music, rather than engaging in deeper, revenue-generating interactions. As a result, voice technology, though widespread, didn’t fully live up to its commercial potential.
Now, voice technology is poised for a resurgence, fueled by advancements in generative AI and multimodal interactions. Multimodal search is the capability of a search engine or AI model to process and interpret multiple types of input (such as text, voice, and images) simultaneously to deliver more accurate and contextually relevant results.
OpenAI’s ongoing developments, including (as of September 2024) a voice-led and multimodal ChatGPT, are pushing the boundaries of what voice assistants can do by integrating text, voice, and even images into seamless, intuitive experiences. Google and Meta are also making significant strides in this area, enhancing their platforms to allow for more natural and nuanced interactions.
The promise of generative AI lies in its ability to understand and generate complex, context-aware responses that go beyond simple voice commands. When combined with multimodal AI – technology that can interpret and generate content across multiple formats – voice assistants could become even more integral to how we interact with technology. Crucially, these new capabilities can open the door to commercial interactions between voice-led devices and consumers.
But what does all this really mean for marketers? Does voice search require a separate strategy? And, if so, what does it entail?
This article will look at what we know about voice search today and the best practices marketers can apply.
Voice Search: A Natural Extension of Semantic Search
When people first learn to communicate, they usually start with speech and then move on to written language. Search engines have experienced this in reverse.
Early search engines relied heavily on keyword matching: simply trying to find web pages that contained the exact words typed by the searcher. Over time, however, search engines have evolved to semantic search, which focuses on understanding the meaning behind a query, rather than just matching the words.
Semantic search enables search engines to comprehend the context, user intent, and relationships between words. This shift represents a move toward “intelligent” search, which interprets the nuances in language and delivers more relevant results.
Suppose I ask a friend, “What would be the best smartphone for me to buy?” Because they know me, they can tailor the recommendation to my preferences.
Scaling that level of insight requires very significant natural language processing power, along with the information retrieval technology to sift through billions of results and locate the right one. Smartphones provide more contextual information than desktop computers, but a search engine still needs a reliable way to process and utilize so much data.
The image highlights the levels of difficulty for a search engine and the technology required to surmount them.

This matters when we consider voice search strategy. People adapt their behaviors based on the possibilities at their disposal. As marketers, understanding those behaviors is essential if we want to cut through the noise and connect with consumers.
Brands create the content that leads a consumer from question to answer. A search engine is the intermediary that makes the connection.
Google’s Hummingbird algorithm ushered in the age of semantic search. It used the Google Knowledge Graph to understand the relationship between entities and deliver something approaching conversational search.
Ask Google, “Who is the King of Spain?” and it will respond, “King Felipe VI”. Next, ask it, “Who is his wife?” and it will respond, “Letizia of Spain”.
Google infers that “his” in the second search relates to King Felipe, the result of the first search. It is a subtle but significant shift that affects how we should create and promote content through search. We can now have conversations through SEO, rather than one-off exchanges.
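To make the idea of a conversation that carries context concrete, here is a toy sketch in Python. It is only an illustration under simple assumptions and does not reflect how Google resolves pronouns: the `Conversation` class, its methods, and the list of possessives are all hypothetical names invented for this example.

```python
import re

# Hypothetical, deliberately tiny list of possessive pronouns to resolve.
POSSESSIVES = ("his", "her", "their", "its")
PATTERN = re.compile(r"\b(" + "|".join(POSSESSIVES) + r")\b", re.IGNORECASE)


class Conversation:
    """Remembers the entity from the previous answer (toy example only)."""

    def __init__(self) -> None:
        self.last_entity = None

    def remember(self, entity: str) -> None:
        # Store the answer to the previous query as the current context.
        self.last_entity = entity

    def resolve(self, query: str) -> str:
        # Without context, the query is passed through unchanged.
        if self.last_entity is None:
            return query
        # Replace a possessive pronoun with the remembered entity.
        return PATTERN.sub(f"{self.last_entity}'s", query)


chat = Conversation()
first = chat.resolve("Who is the King of Spain?")
chat.remember("King Felipe VI")
second = chat.resolve("Who is his wife?")
print(first)
print(second)
```

The point for content planning is that the second query only makes sense in light of the first, so pages that answer natural follow-up questions are better placed to stay in the conversation.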
Semantic search continues to change how people find information and it has heightened their expectations. The advent of OpenAI’s ChatGPT and Google’s Gemini has further increased the demand for highly personalized responses from chatbots and search engines.
As the technology’s capabilities change, so should ours as marketers!
In essence, this development is a natural and vital component of voice search’s rise.
Google reports that people are increasingly searching for queries using words like “me”, “my”, and “I”. These words all indicate that the searcher expects a personalized response.

As an indicator of the modern consumer’s requirements from online content, this is very telling. People would only ask these questions if they expected the answer to be personalized and unique to them. These searches are typically carried out by voice, rather than text, because people use different language when they speak.
Once more, this points to the difference between voice search and traditional search. Consumers are treating digital assistants as exactly that: a personal helper to get things done quicker and easier than before. We expect the assistant to “know” us.
Another question that is often raised is just how genuine the commercial opportunity is for voice search. As we have seen, Amazon’s Echo devices are popular, but they have not led to a huge increase in sales for the ecommerce giant.
A 2017 study from iProspect revealed that while people predominantly use voice search to get information or enable actions like turning on lights, they are also using it to find stores, research, and purchase. Nonetheless, it is clear that these are minor use cases in comparison with more prosaic actions such as finding out the weather.

Moreover, the distinction between voice search on mobile and with a smart home device is rather marked. This is perhaps to be expected, given that we carry mobile devices with us and they have screens whereas home devices tend not to, but it does bring important implications for brands. Mobile phone screens provide a canvas on which to display choice and information, while a home device must deliver one, authoritative answer.
This is another reason why there is cautious optimism in the industry that the multimodal ChatGPT can help usher in an age of voice-led commerce. It will use images and videos, as well as voice, to assist the consumer in their shopping journey.

From this information, we can start to understand the drivers – both technological and human – that have seen voice search grow so rapidly.
Voice search best practices
Marketers can use several tactics to enhance their voice search strategy:
- Technical SEO
- Content marketing
- Local SEO
- SEO strategy
Technical SEO
- Focus on speed and mobile-friendliness: A study of 10,000 voice search results by Backlinko shows that the time to first byte for a voice search result is significantly shorter than for the average webpage. Since Google’s “Speed Update” made page speed a ranking factor for mobile searches, this should be the first port of call for any mobile or voice search strategy.

- Use structured data on all landing pages: One sizable challenge for digital assistants is that they must comb through trillions of pages to identify the elements that will answer a user’s query. Structured data, taken from the Schema.org standard, helps a search engine to navigate code and understand its contents.
- Experiment with new data formats: Google offers support for the Speakable structured data element.
The application of this format is limited for now, but it is not difficult to imagine a future where digital assistants read content directly from all landing pages. Early movers will seize the advantage in this field.
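As a rough illustration of the structured data points above, the Python sketch below builds a JSON-LD snippet that uses the Schema.org `speakable` property with a `SpeakableSpecification`. The page name, the example.com URL, and the CSS class names are placeholders chosen for this example, and the helper function `speakable_jsonld` is hypothetical. Check the current Schema.org and Google documentation before relying on it.

```python
import json


def speakable_jsonld(name: str, url: str, css_selectors: list[str]) -> str:
    """Return JSON-LD that marks parts of a page as suitable for reading aloud."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # CSS selectors pointing at the sections best suited to speech.
            "cssSelector": css_selectors,
        },
    }
    return json.dumps(data, indent=2)


markup = speakable_jsonld(
    name="Voice search best practices",
    url="https://www.example.com/voice-search",  # placeholder URL
    css_selectors=[".headline", ".summary"],  # hypothetical class names
)

# The snippet would sit inside a script tag in the page's HTML.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Generating the markup from a template like this, rather than writing it by hand on each page, makes it easier to keep structured data consistent across all landing pages.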
Content marketing
Content marketing is one area where voice search can have a huge impact. Marketers can:
- Create conversational content: Voice search lends itself naturally to dialogue and conversational content, so this should be factored into content strategy. Identify common questions or pain points in your industry and, quite simply, answer them better than anyone else does.
- Write for intent states, not keywords: Voice search queries tend to be much more varied than their typed counterparts. As such, trying to target individual queries within content is a challenging and unnecessary approach. Rather than just simply matching keywords, search engines now want to satisfy user intent. Think about why people are searching. Aim to understand and respond to these states, helping people to achieve their task quickly and effectively. This will be more profitable than creating landing pages to target individual queries.
- Develop a consistent brand voice: The future of voice search will involve brands speaking to their audience. This could be in the form of audio clips embedded in content or the search engine reading out text from the page. Either way, brands should be thinking of how they want their company to sound, rather than just look.
- Test voice interactions using AI tools: Use tools like ChatGPT with voice mode or other conversational AI platforms to simulate user interactions. Analyze how your content is being read, and then adjust tone, phrasing, and structure based on these insights to optimize the listening experience.
Google has also outlined the following areas for assessing spoken answers in voice search:
- Information satisfaction: The content of the answer should meet the information needs of the user.
- Length: When a displayed answer is too long, users can quickly scan it visually and locate the relevant information. For voice answers, that is not possible. It is much more important to ensure that we provide a helpful amount of information, neither too much nor too little.
- Formulation: It is much easier to understand a badly formulated written answer than an ungrammatical spoken answer, so more care has to be placed in ensuring grammatical correctness.
- Elocution: Spoken answers must have proper pronunciation and prosody. Improvements in text-to-speech generation, such as WaveNet and Tacotron 2, are quickly reducing the gap with human performance.
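One way to sanity-check the length criterion is to estimate how long an answer takes to read aloud. The Python sketch below is a minimal example under stated assumptions: the speaking rate of 150 words per minute and the 15-second threshold are illustrative values chosen for this example, not figures published by Google, and both function names are hypothetical.

```python
def estimated_speaking_seconds(answer: str, words_per_minute: int = 150) -> float:
    """Estimate how long a text-to-speech voice needs to read an answer aloud."""
    word_count = len(answer.split())
    return word_count * 60 / words_per_minute


def is_voice_friendly(answer: str, max_seconds: float = 15.0) -> bool:
    """Flag answers short enough to be heard in one go (threshold is an assumption)."""
    return estimated_speaking_seconds(answer) <= max_seconds


answer = (
    "Structured data helps a search engine understand what a page contains, "
    "which makes it easier for a digital assistant to find the right answer."
)
print(round(estimated_speaking_seconds(answer), 1), is_voice_friendly(answer))
```

A check like this will not judge quality, but it can flag answers on a page that are clearly too long to work as a spoken response.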
Local SEO
- Ensure that names, addresses, and phone numbers are accurate across all locations.
- Consider using a specialist platform to manage local listings and analyze your local search performance.
- Make it easy for consumers to act on their intentions. This means adding in clear calls to action (CTAs) and directions to further information. Attention spans are at a premium for voice search, so make the most of what little time you do get.
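Checking that names, addresses, and phone numbers match across listings can be partly automated. The Python sketch below is a simplified example that assumes the listing data has already been exported into a dictionary. The business details are invented, and the function names `normalize_listing` and `group_by_nap` are hypothetical.

```python
import re
from collections import defaultdict


def _clean(text: str) -> str:
    """Lowercase, drop punctuation, and collapse repeated whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", text)
    return re.sub(r"\s+", " ", without_punctuation).strip().lower()


def normalize_listing(name: str, address: str, phone: str) -> tuple[str, str, str]:
    """Reduce a listing to a comparable form; the phone keeps digits only."""
    return _clean(name), _clean(address), re.sub(r"\D", "", phone)


def group_by_nap(listings: dict[str, tuple[str, str, str]]) -> dict:
    """Group sources by normalized details; more than one group means a mismatch."""
    groups = defaultdict(list)
    for source, (name, address, phone) in listings.items():
        groups[normalize_listing(name, address, phone)].append(source)
    return dict(groups)


# Invented example data for a single location.
listings = {
    "website": ("Acme Bakery", "12 High St.", "(555) 010-2000"),
    "maps": ("ACME Bakery", "12 High St", "555-010-2000"),
    "directory": ("Acme Bakery", "12 High Street", "555 010 2000"),
}

groups = group_by_nap(listings)
for details, sources in groups.items():
    print(details, sources)
```

The sketch ignores differences in case, punctuation, and phone formatting, but it does not treat abbreviations such as "St" and "Street" as equal, so the directory entry is reported as a separate group to review.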
SEO strategy
- Think beyond the website: Chatbots, apps, and social media are all used to surface information for voice search queries. Optimize your presence across all of these media in a consistent brand voice.
- Create FAQ pages with voice in mind: Develop FAQ pages that answer common questions in a conversational tone. This not only provides value to your audience but also makes it easier for voice assistants to pull concise, relevant information from your site in response to user queries.
- Use voice queries to plan future products and services: Within an app, it is possible to track and store all voice queries. This can be an invaluable resource when it comes to planning new services, as any unanswered queries will provide ideas with proven demand.
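As a sketch of how stored voice queries might feed into planning, the Python example below counts the most frequent queries an app could not answer. It assumes a simple in-memory log: the field names `query` and `answered`, the sample entries, and the function `top_unanswered` are all hypothetical and would need adapting to however your app records queries.

```python
from collections import Counter


def top_unanswered(query_log: list[dict], limit: int = 3) -> list[tuple[str, int]]:
    """Return the most common queries that the app could not answer."""
    counts = Counter(
        entry["query"].strip().lower()  # treat differently cased queries as one
        for entry in query_log
        if not entry["answered"]
    )
    return counts.most_common(limit)


# Invented sample log entries.
query_log = [
    {"query": "Do you deliver on Sundays", "answered": False},
    {"query": "What time do you open", "answered": True},
    {"query": "do you deliver on sundays", "answered": False},
    {"query": "Can I book a table for ten", "answered": False},
]

for query, count in top_unanswered(query_log):
    print(count, query)
```

Queries that appear repeatedly in a list like this point to questions your content or services do not yet cover.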
One real challenge with voice search is that it is not yet possible to segment queries within Search Query Reports or Search Console to see which were typed versus spoken. That will hopefully come, but for now marketers should aim to extract maximum value from the limited data at our disposal.

