Sample Dialogs: The Key to Creating Great Actions on Google

September 06, 2018


Posted by Cathy Pearl, Head of Conversation Design Outreach
Illustrations by Kimberly Harvey

Hi all! I'm Cathy Pearl, head of conversation design outreach at Google. I've been building conversational systems for a while now, starting with IVRs (phone systems) and moving on to multi-modal experiences. I'm also the author of the O'Reilly book Designing Voice User Interfaces. These days, I'm keen to introduce designers and developers to our conversation design best practices so that Actions will provide the best possible user experience. Today, I'll be talking about a fundamental first step when thinking about creating an Action: writing sample dialogs.

So, you've got a cool idea for Actions on Google you want to build. You've brushed up on Dialogflow, done some codelabs, and figured out which APIs you want to use. You're ready to start coding, right?

Not so fast!

Creating an Action always needs to start with designing an Action. Don't panic; it's not going to slow you down. Planning out the design first will save you time and headaches later, and ultimately produce a better, more usable experience.

In this post, I'll talk about the first and most important component for designing a good conversational system: sample dialogs. Sample dialogs are potential conversational paths a user might take while conversing with your Action. They look a lot like film scripts, with dialog exchanges between your Action and the user. (And, like film scripts, they should be read aloud!) Writing sample dialogs comes before writing code, and even before creating flows.

When I talk to people about the importance of sample dialogs, I get a lot of nods and agreement. But when I go back later and say, "Hey, show me your sample dialogs," I often get a sheepish smile and an excuse as to why they weren't written. Common ones include:

  • "I'm just building a prototype, I can skip that stuff."
  • "I'm not worrying about the words right now—I can tweak that stuff later."
  • "The hard part is all about the backend integration! The words are the easy part."

First off, there is a misconception that "conversation design" (or voice user interface design) is just the top layer of the experience: the words, and perhaps the order of words, that the user will see/hear.

But conversation design goes much deeper. It drives the underlying structure of the experience, which includes:

  • What backend calls are we making?
  • What happens when something fails?
  • What data are we asking the user for?
  • What do we know about the user?
  • What technical constraints do we have, either with the technology itself or our own ecosystem?

In the end, these things manifest as words, to be sure. But thinking of them as "stuff you worry about later" will set you up for failure when it comes time for your user to interact with your Action. For example, without a sample dialog, you might not realize that your prompts all start with the word "Next", making them sound robotic and stilted. Sample dialogs will also show you where you need "glue" words such as "first" and "by the way".
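Reading the dialog aloud is the real test, but a quick script can flag the "every prompt starts with Next" problem too. The following is only a rough sketch: the function name, the 50% threshold, and the recipe-style prompts are all made up for illustration.

```python
import re
from collections import Counter


def repeated_openers(prompts, threshold=0.5):
    """Return opening words shared by more than `threshold` of the prompts."""
    if not prompts:
        return []
    openers = []
    for prompt in prompts:
        # Grab the first word of the prompt, ignoring case and punctuation.
        match = re.match(r"[A-Za-z']+", prompt.strip())
        if match:
            openers.append(match.group().lower())
    counts = Counter(openers)
    return [word for word, count in counts.items() if count / len(prompts) > threshold]


# Hypothetical prompts pulled from a sample dialog.
prompts = [
    "Next, add the flour.",
    "Next, stir in the eggs.",
    "Next, preheat the oven.",
    "Finally, bake for 20 minutes.",
]
print(repeated_openers(prompts))
```

A check like this only catches one symptom; it can't tell you where "glue" words belong, which is why the sample dialog itself still matters.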

Google has put together design guidelines for building conversational systems. They include an introduction to sample dialogs and why they're important:

Sample dialogs will give you a quick, low-fidelity sense of the "sound-and-feel" of the interaction you're designing. They convey the flow that the user will actually experience, without the technical distractions of code notation, complex flow diagrams, recognition-grammar issues, etc.

By writing sample dialogs, you can informally experiment with and evaluate different design strategies, such as how to promote the discoverability of new features or how to confirm a user's request (for example: should you use an implicit confirmation, an explicit confirmation, or no confirmation at all?).

Check out the Google I/O 2018 Action sample dialogs to see an example. (You can also take a look at the Google I/O 2018 Action code.)

Still not sure if you really need them? Let's hear from a developer who works on Actions, Jessica Dene Earley-Cha, who said in her recent Medium post:

Let's cover how this was built. Before any coding can happen, we need to build a Conversational Design. I originally had skipped this step because I thought that I could build the structure first and then fill in the content (like building a website). However, the structure is tied in with the content. I realized this when I was hitting walls that I thought were technical, but they were there because I didn't have a design.

She makes the great point that designing for conversational systems is different from designing for the web. With a conversational interface, the content itself is part of the structure, so design becomes even more important.

So now that you're (hopefully) convinced, let's discuss four of the common pitfalls developers can avoid by using sample dialogs:

PITFALL #1: Flooding the user with too much information

Suppose you're writing an Action for a bike shop: it can make repair and service appointments, give store hours, and list the latest deals. It's tempting to just start listing out options so the user will know everything they can do. Let's see what a sample dialog looks like using that strategy:

USER
Hey Google, talk to Alyssa's Bike Shop.

ACTION
Hi! Welcome to Alyssa's Bike Shop. Would you like to make an appointment for a repair, cancel an appointment, hear store hours, hear this week's deals, or hear next month's deals?

If you read this prompt out loud to someone else, it will quickly become apparent that too much information is being presented. Humans have a limited capacity for taking in audio, especially if it's the first time they're hearing it.

Here is a better way:

USER
Hey Google, talk to Alyssa's Bike Shop.

ACTION
Hi! Welcome to Alyssa's Bike Shop. I can help you make or cancel an appointment, get store hours, or tell you the latest deals. Which would you like?

Pro tip: an even better experience would be to leave out the "cancel" option if the user has no appointments to cancel.
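One way to sketch that pro tip is to build the welcome prompt from only the options that apply to this user. The function name and the `has_appointment` flag are hypothetical; in a real Action that information would come from your own backend, not from any Actions on Google API shown here.

```python
def build_welcome_prompt(has_appointment: bool) -> str:
    """Build the bike shop greeting, offering "cancel" only when it applies."""
    # Hypothetical flag: whether this user currently has an appointment booked.
    if has_appointment:
        appointment_option = "make or cancel an appointment"
    else:
        appointment_option = "make an appointment"
    options = [appointment_option, "get store hours", "tell you the latest deals"]
    # Join as "a, b, or c" so the list sounds natural when read aloud.
    spoken_list = ", ".join(options[:-1]) + ", or " + options[-1]
    return (
        "Hi! Welcome to Alyssa's Bike Shop. "
        f"I can help you {spoken_list}. Which would you like?"
    )


print(build_welcome_prompt(has_appointment=False))
```

Writing one sample dialog for a user with an appointment and one for a user without makes this branch visible before any code exists.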

PITFALL #2: Keeping what your Action can do a secret

Here's a sample dialog representing a common mistake in many conversational experiences. In this example, the user is returning to an Action they've tried previously:

USER
Hey Google, talk to Cathy's Astronomy Trivia Game.

ACTION
This is Cathy's Astronomy Trivia Game. Welcome back!

USER
Uhhh… what can I do??

When we become immersed in our design, we often forget that the user will be coming to the Action without the background we have. What they can do will not always be obvious.

Here's an improvement:

USER
Hey Google, talk to Cathy's Astronomy Trivia Game.

ACTION
This is Cathy's Astronomy Trivia Game. You can play solo, or join a team. What would you like to do?

USER
Uh, I think I'll join a team.

Remember, you don't have to list all your options up front. Instead, try to guide the user through the process in a clear manner, and end with a call to action.

PITFALL #3: Not inviting your user to speak

Another common issue is not making it obvious when it's the user's turn to speak. As humans, we have many ways to let someone know it's their turn: words, intonation, eye gaze, and body language. For voice interfaces like the Assistant, it's important to end the turn with a clear question or instruction. Here's an example of an Action that leaves the user hanging:

USER
Okay Google, talk to Cool Animal Facts.

ACTION
Welcome to Cool Animal Facts! What animal would you like to know about?

USER
I love penguins!

ACTION
Cool, penguins. Would you like to know about their habitat, or what they like to eat?

USER
Oh, what do they like to eat!

ACTION
Penguins can eat up to 13 pounds of fish, krill and squid a day! And did you know the fathers sit on the eggs?

USER
Yes, I saw…

ACTION
Meanwhile, the mothers go out to get fish.

USER
(thinking) Wait… am I supposed to say something?

In this example we see two problems. First, the Action asked a rhetorical question. When the user tried to answer, the Action ignored them and kept on talking.

The second problem is that after the fact was given, nothing happened. There was no indication of what the user could do next.

Here's a better way:

ACTION
Penguins can eat up to 13 pounds of fish, krill and squid a day! By the way, male penguins spend time sitting on the eggs, while the females go out to get fish. Now, would you like to hear more about penguins, or ask about another animal?

USER
I'm done, thanks.

ACTION
Sure, thanks for talking to Cool Animal Facts. Goodbye!

Pro tip: Users often want to have an explicit way out, such as saying "goodbye".

Pro tip 2: Make your grammar robust! For the question "Would you like to hear more about penguins, or ask about another animal?" be prepared to handle responses such as "yes", "no", and names of animals.
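To make that second tip concrete, here is a minimal sketch of handling "yes", "no", and animal names for that one question. It uses plain keyword matching purely for illustration; the word lists, the function name, and the result labels are assumptions, and a real Action would normally rely on its natural language understanding tool rather than hand-rolled matching.

```python
import re
from typing import Optional, Tuple

# Hypothetical vocabularies, kept tiny for the example.
KNOWN_ANIMALS = {"penguins", "otters", "elephants"}
YES_WORDS = {"yes", "yeah", "sure", "okay"}
EXIT_WORDS = {"no", "nope", "done", "goodbye", "bye"}


def interpret_reply(utterance: str, current_animal: str = "penguins") -> Tuple[str, Optional[str]]:
    """Classify a reply to "hear more about penguins, or ask about another animal?"."""
    words = re.findall(r"[a-z']+", utterance.lower())
    # An animal name is the most specific answer, so check for it first.
    for word in words:
        if word in KNOWN_ANIMALS:
            return ("animal", word)
    if any(word in EXIT_WORDS for word in words):
        return ("exit", None)
    if "more" in words:
        return ("animal", current_animal)
    if any(word in YES_WORDS for word in words):
        # A bare "yes" to an either/or question is ambiguous: ask which one.
        return ("clarify", None)
    return ("no_match", None)


print(interpret_reply("Yes"))
print(interpret_reply("What about otters?"))
```

The bare "yes" case is the one that sample dialogs tend to surface: the question reads fine on paper, but people answer it in ways the designer didn't plan for.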

PITFALL #4: Making your error prompts one size fits all

Gracefully handling the conversation when things go wrong is critical for any conversational system. Writing multiple error prompts for every turn in the conversation can seem daunting, but a default "I'm sorry, I didn't understand" will not help the user. Instead, each error prompt should be based on what has happened up to that point and where the user currently is in the conversation. To make this easier, write sample dialogs for error conditions, which will make it clearer what the prompts should say. Here's what one-size-fits-all error handling looks like:

USER
Okay Google, talk to the Zombie Quiz.

ACTION
Time for the Zombie Quiz! First question: which one of these should you do if you're being chased by a zombie: lock the door, run for the hills, or plant a garden?

USER
(thinking)

ACTION
Sorry, I didn't get that. Which one do you choose?

USER
What? I don't remember them...

ACTION
Sorry, I'm still having trouble. Which one do you choose?

USER
Argh!

ACTION
Sorry, that's incorrect. Question two…

Context-sensitive error handling would have gotten the user back on track:

USER
Okay Google, talk to the Zombie Quiz.

ACTION
Time for the Zombie Quiz! First question: which one of these should you do if you're being chased by a zombie: lock the door, run for the hills, or plant a garden?

USER
(thinking)

ACTION
Which one of these should you do if you're being chased by a zombie: lock the door, run for the hills, or plant a garden?

USER
Uhh… I'm not sure… let me see… maybe the…

ACTION
You can also say the number of the answer, like one, two, or three. Which one do you choose?

USER
Oh, number three.

ACTION
Correct!
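The improved dialog above can be sketched as a small function that picks the reprompt from the kind of error and how many errors came before it. This is an illustration only: the `no_input` and `no_match` labels, the function name, and the third-attempt fallback are assumptions, since the dialog above only covers the first two errors.

```python
QUESTION = (
    "Which one of these should you do if you're being chased by a zombie: "
    "lock the door, run for the hills, or plant a garden?"
)


def error_prompt(question: str, error_type: str, attempt: int) -> str:
    """Choose a reprompt from the kind of error and the attempt number (1-based)."""
    if attempt >= 3:
        # Assumption: the sample dialog doesn't show a third error;
        # moving on is just one possible choice.
        return "No problem, let's come back to that one later."
    if error_type == "no_input" and attempt == 1:
        # Silence on the first try: the user may be thinking or may have
        # forgotten the choices, so repeat the full question.
        return question
    # Otherwise the user is struggling to answer: offer an easier way to respond.
    return (
        "You can also say the number of the answer, like one, two, or three. "
        "Which one do you choose?"
    )


print(error_prompt(QUESTION, "no_input", 1))
print(error_prompt(QUESTION, "no_match", 2))
```

Note that neither reprompt starts with a generic apology; each one gives the user something new to work with.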

Your pre-flight checklist

I know you're itching to take off and start drawing flows and writing code, but take time to write sample dialogs first. In the long run, it will make your coding easier, and you'll have fewer bugs to fix.

Here's a list of "Dos" to keep in mind when writing sample dialogs:

  • Check out the Conversation Design Guidelines for more help
  • Start your design by using written/spoken sample dialogs; diagrams of the detailed flow can come later
  • Read your sample dialogs out loud!
  • Make each sample dialog one path; they should not include branching
  • Write several "happy path" sample dialogs
  • Write several "error path" sample dialogs
  • Do a "table read" and have people unfamiliar with your sample dialog play the part of the user
  • Share your sample dialogs with everyone involved in building the Action, so everyone's on the same page
  • When testing, compare the actual working Action with the sample dialogs, to ensure it was implemented correctly
  • Iterate, iterate, iterate!
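The "compare the actual working Action with the sample dialogs" item can be partly automated. Below is a minimal sketch that replays one sample dialog, turn by turn, and reports where the replies differ. The `respond` callable is a hypothetical stand-in for however you send text to your own Action and get its reply back; it is not a real Actions on Google API.

```python
from typing import Callable, List, Tuple

# One sample dialog = one path: ordered (user says, Action should say) turns.
SAMPLE_DIALOG: List[Tuple[str, str]] = [
    ("Okay Google, talk to Cool Animal Facts.",
     "Welcome to Cool Animal Facts! What animal would you like to know about?"),
    ("I love penguins!",
     "Cool, penguins. Would you like to know about their habitat, or what they like to eat?"),
]


def find_mismatches(
    dialog: List[Tuple[str, str]],
    respond: Callable[[str], str],
) -> List[Tuple[int, str, str]]:
    """Replay each user turn through `respond`; collect (turn, expected, actual) differences."""
    mismatches = []
    for turn_number, (user_says, expected) in enumerate(dialog, start=1):
        actual = respond(user_says)
        if actual != expected:
            mismatches.append((turn_number, expected, actual))
    return mismatches


# Stand-in for a real Action: always gives the same generic reply.
def generic_action(user_says: str) -> str:
    return "Sorry, I didn't get that."


for turn, expected, actual in find_mismatches(SAMPLE_DIALOG, generic_action):
    print(f"Turn {turn}: expected {expected!r} but got {actual!r}")
```

Exact string matching is deliberately strict here; if your Action randomizes its wording, you would compare against a set of acceptable replies instead.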

Happy writing!