
Like earlier digital innovations, a conversational interface requires a pivot in how we think about content and purpose. One of the questions I hear most often from prospects and clients is ‘what should we be considering when designing custom voice assistants?’

It is a big question that can splinter off into several different directions, but for this post, I will focus on the design of the custom voice assistant as it pertains to understanding its moving parts.

The question underlying ‘how do I design it’ is ‘what should it do?’ Let’s assume that question has been answered, and we are moving to the next level: articulating the design itself.

As earlier digital interfaces appeared, their design practices became codified through time and failure. Early websites often played sounds or music when you opened them. Native mobile apps often looked like (and sometimes still do) web pages crammed into a native user interface.

From these failures came frameworks for designing specifically for each medium. And while conversational interfaces have been around for a long time (the first one emerged in the 1950s), the technology is now in place to make Conversational AI a part of your omnichannel experience for customers and employees alike.

To be clear, I am not talking about Alexa Skills or Google Actions. The framework outlined below assumes capabilities beyond what those voice apps allow, given their terms-of-service constraints, technical and analytics limitations, and tightly controlled, walled-garden infrastructure.

Designing custom voice assistants is not copywriting

While copywriting is an important part of conversation design, it is not the entire process. It is not enough to think through the different turns of a conversation (the back and forth). One needs to step up from that layer and consider how much context a custom voice assistant has available to use.

Give deference to desire and device

When designing custom voice assistants, it’s important to remember that they can be deployed to an array of devices. Some brands limit their exposure to one endpoint, like a custom voice assistant powering a drive-through. But the benefit of deploying across multiple endpoints, like a website, native mobile app, and a kiosk, is access to a larger and more diverse audience. For this reason, the design needs to account for the endpoints themselves.

Deference to desire doesn’t imply that a custom voice assistant (CVA) must make everything achievable, but rather that it meets users where they are. Verbal communication is as unique to an individual as their facial features, and a CVA must account for different ways of communicating. As a result, the system needs a robust vocabulary to draw from and should avoid a rigid ‘order of processes.’ Will there be parts of a conversation requiring a distinct flow? Of course. But this should be a safety net in conversation design, not the norm. Avoid creating verbal forms, or you’ll risk frustrating the user and missing the much bigger potential offered by conversational AI.

Clarity of conversation and capabilities

We’ve all had conversations with people who spend too much time getting their point across. And while most copywriters are concise, conversation design must also consider clarity and capabilities. How does this differ from copywriting? Consider the example of booking a test drive for a new car. A CVA can ask for the relevant information and read it back to the user step-by-step. The copywriting for that experience is straightforward. But design can offer a better flow by accounting for the entirety of the user input and confirming at the end. If modifications are needed, the system can reveal a step-by-step process.
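The test-drive flow described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the slot names (`model`, `date`, `time`), the `Booking` class, and the prompt wording are all hypothetical, and the sketch assumes some upstream language-understanding step has already pulled slot values out of the user's utterance.

```python
from dataclasses import dataclass, field

# Hypothetical slots needed to book a test drive.
REQUIRED_SLOTS = ("model", "date", "time")


@dataclass
class Booking:
    # Slot values gathered so far, in whatever order the user offered them.
    slots: dict = field(default_factory=dict)

    def missing(self):
        """Return the required slots the user has not supplied yet."""
        return [s for s in REQUIRED_SLOTS if s not in self.slots]


def next_prompt(booking):
    """Ask only for what is missing; confirm everything once at the end."""
    missing = booking.missing()
    if missing:
        return f"What {missing[0]} would you like?"
    s = booking.slots
    return (
        f"To confirm: a test drive of the {s['model']} "
        f"on {s['date']} at {s['time']}. Is that right?"
    )


def revise(booking, slot):
    """Clear one slot so it is asked again; the step-by-step fallback."""
    booking.slots.pop(slot, None)
```

If the user says everything in one breath, all three slots are filled and the assistant goes straight to a single confirmation. If they then want a different time, `revise(booking, "time")` clears that one slot and `next_prompt` asks only for the time again.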

Depth of modalities

Custom voice assistants have an advantage that earlier digital technology wasn’t able to fully use: true multimodality. A custom voice assistant can live within a native mobile app, a website, headset, kiosk, drive-thru, watch, or even a refrigerator. Each of these devices carries its own supported modalities and contextual cues. Conversational AI design needs to be aware of these so that each chosen device is designed for thoughtfully. As an example, an airline’s custom voice assistant may be deployed to televisions, cars, and watches. While the basic functionality of the assistant should be consistent across the set, the modality usage will be different – even though the devices share similar user interface components (speakers, screens, mics, etc.). Take this example of an airline assistant:

[Figure: custom voice assistant modalities, grouped by modality category]
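One way to picture the idea is a simple lookup from endpoint to supported modalities. The sketch below is illustrative only: the device names come from the airline example, but which modalities each device supports, the `Modality` names, and the `render` function are assumptions made for the sake of the example, not a description of any real product or of the original figure.

```python
from enum import Flag, auto


class Modality(Flag):
    VOICE_OUT = auto()  # spoken responses
    SCREEN = auto()     # visual responses
    HAPTIC = auto()     # taps or vibrations


# Assumed capabilities per endpoint; a real deployment would define its own.
DEVICE_MODALITIES = {
    "television": Modality.VOICE_OUT | Modality.SCREEN,
    "car": Modality.VOICE_OUT,
    "watch": Modality.SCREEN | Modality.HAPTIC,
}


def render(device, speech, visual_summary):
    """Deliver the same response through whatever the device supports."""
    supported = DEVICE_MODALITIES[device]
    output = {}
    if Modality.VOICE_OUT in supported:
        output["speak"] = speech
    if Modality.SCREEN in supported:
        output["display"] = visual_summary
    if Modality.HAPTIC in supported:
        output["haptic"] = "tap"
    return output
```

The response content stays the same across the set; only the delivery changes. Under these assumptions, a gate change would be spoken in the car, and shown with a tap on the watch.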

Consideration of inputs for designing custom voice assistants

On the surface, the input may appear to be ‘voice’ or, more technically, ‘utterances.’ In actuality, the user is offering information digitally that an assistant could be taking advantage of. Web and native mobile managers are all too familiar with this concept. In those mediums, other information includes IP address, device type, location, and more.

Since a voice assistant can be deployed to various endpoints, it can inherit the benefits the endpoint affords it. These include:

  • Camera
  • Voice
  • GPS
  • Gyroscope
  • Accelerometer
  • Ambient noise
  • Locationality
    • Wi-Fi
    • Bluetooth
    • Beacons
  • Speed
  • Engagement with backend systems
    • CRM
    • PIM
    • CMS
    • VMS

Each of these tools or systems houses an input a voice assistant can use. Whether it is location, engagement, movement, or environment data, the CVA should use what is relevant and what the user has approved to make the conversation and its functionality more robust.
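The ‘relevant and approved’ rule can be expressed as a small filter. This is a sketch under stated assumptions: the signal names, the `usable_context` function, and the idea of passing consent as a set of approved signal names are hypothetical and stand in for whatever consent mechanism the endpoint actually provides.

```python
def usable_context(signals, approved, relevant):
    """Keep only signals that are both user-approved and relevant to the task.

    signals:  mapping of signal name -> value gathered from the endpoint
    approved: set of signal names the user has consented to share
    relevant: set of signal names the current conversation can use
    """
    return {
        name: value
        for name, value in signals.items()
        if name in approved and name in relevant
    }


# Hypothetical signals from a mobile endpoint.
signals = {"gps": (41.88, -87.63), "speed_kmh": 62, "ambient_noise_db": 71}
context = usable_context(
    signals,
    approved={"gps", "ambient_noise_db"},
    relevant={"gps", "speed_kmh"},
)
```

Here only `gps` survives: speed is relevant but not approved, and ambient noise is approved but not relevant to the task at hand.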

Final Thought

If the information above felt familiar, it should. It is an evolved framework building from digital channels that came before it. Designing for a CVA can feel challenging because it feels invisible. Don’t let that be daunting; let it be an opportunity for growth and expansion – for your skillset and the capabilities of the brand. Companies that are first movers in this space will define what success looks like for those who follow.
