*Alexa is an extremely confidential organization, so I am restricted in the detail, the scope, and even which projects I can share. This is a small representative sample of my work.
Alexa in the Kitchen
During longitudinal studies we often ask users where the biggest pain points are in their house. Almost everybody names cleaning, the kitchen, or both. The kitchen has become the central hive of the house in modern America, which means it has also become the most hectic. One of the most difficult tasks in the kitchen is cooking; no one person will ever have enough hands to complete everything at once. Enter Alexa.
Framing and defining the problem
The first step is to define the customer problem. In such a nascent space the question becomes not just "what are the customer pain points?" but also "which ones are the right ones to tackle first?" We worked closely with user research to determine customer habits, and also engaged deeply with our 3P partners to discuss which functions they wish they had.
What we (Smart Home design) found was that we could add the most value by activating the appliances and allowing customers to control them with their voice. The main hero use case revolved around abstracting away times, temperatures, and modes. Just tell Alexa "Cook my dinner just the way my family likes it" and we would do all the rest.
Scoping the problem and approaching it
For this case study, I will keep the focus relatively narrow. A lot of the pieces are still moving and we are shipping entirely new products and features weekly. To do this intentionally, we had to break the work down into a strategic series.
The first large block we focused on was microwaves. We subscribed to an iterative approach, and in this case the MVP was targeted at microwaves for a number of reasons (most of which I can't or won't discuss here), but it can be summarized as: microwaves are simple and cheap. Since we were the first experiment in this space, we wanted to gauge the temperature before we jumped fully in. Microwaves are cheaper, have a faster purchase cycle, and (delicately put) carry the lowest customer expectations.
We used our research to inform the customer tasks for microwaves. We had participants use a test microwave just as they normally would, and wrote a card for each action. EACH. SINGLE. ACTION. AND. DESCRIPTION. "find the duration from the recipe," "open the door," "place food inside," "press the (1) button," "press (0)"... We then put these into a task flow to determine what a customer's base mental model was.
To make sure we were measuring our problem intentionally, we set out metrics that we would use to triangulate and guide by. These metrics (again, I can't and won't discuss them in depth) revolved primarily around adoption, engagement, and successful intent retrieval.
Developing and Iterating Design
It sometimes goes without saying what you ended up designing for, but in this case part of sorting out the go-to-market strategy was deciding which platforms we would build on. Voice was a must, but all screens were a maybe. Of course we wanted to design them in parallel to understand the full breadth, but we knew very early on that we would likely minimize the number of platforms we launched on. To do this we looked at the tasks and flows from the customer research and brainstormed around which modality would work best for each. You may not want to pull out your phone to run your microwave for 30 seconds, but you likely also don't want to query and respond to 10 different names for your special leftover pasta by voice.
Our journey begins when the customer buys a device (be it an Echo or a new appliance). We can't assume the customer found the app and downloaded it; we have to consider customers setting up a device, finding it on the network, connecting the right skills, etc. I won't go into depth here, but it all adds to the complexity the customer faces when using this product.
The customer usage research showed that customers primarily rely on three different stages: prep, control, and query. We began to develop strategies around each and to prototype interactions across voice, mobile, and multi-modal.
The initial prototypes led us to a strong conclusion: we have a very, very small window to get the magic right before it turns into "oh forget it, I will do it myself on the damn keypad." This guided us away from hyper-precise scenarios (getting weights, desired doneness, etc.) and toward simply starting the microwave.
For the Voice UI, we focused on getting a customer to a successful event rather than precision (I give an example below). We went back and forth on how much precision would delight the customer versus how much complexity it would add. Similarly, on the GUIs we went through a process of pruning. We started with a 1:1 replication of what a customer would do on a physical appliance (0-9, preset buttons, modes, etc.). We quickly realized that this was likely not a valuable interface and started to focus on the best information we could provide.
Much of this iteration was based on internal play-with sessions as well as in-"home" testing. It was a particular challenge to test this with real users because of the limited number of appliances in the world and the barrier of simulating one in a lab. To combat this we spent even more time with customers and focused more on the goal of their action rather than "what they were used to doing".
We tested our initial designs in a Wizard of Oz setup. We initially tried to be very precise with our outcome. What we saw over and over in testing scared us:
"Alexa, cook my dinner"
"Okay, is it frozen?"
"Okay, how many plates of food is it?"
"Sorry I didn't get that. How many plates of food?"
To combat this, we kept multi-turn only for the simplest input we could ask for: time. It eliminates some of the magic, but it also results in utility rather than failure. That looks like this:
"Alexa, cook my dinner"
"Okay, for how long?"
Fine-tuning, product go-to-market, and strategic alignment
This product set is unique in that we expect customers to move in and out of platforms within a single session. That allows us to move entire pieces of functionality from one platform to another. For example, we may not need the ability to set explicit cook times on mobile because it's offered on the physical appliance as well as by voice (not to mention it's an awkward use case no customer asked for). This made the QA and hand-off process much more difficult. Rather than running down our typical checklist of end-to-end testing, we are now overlaying multiple teams, platforms, and products to deliver a single experience.
The alignment we landed on post-testing is generally: voice for control + query, mobile for settings + query, and multi-modal for query. This became difficult as we pushed requirements onto multiple teams, but the interaction models we saw in customers were clear. I pushed to approach this as a system.
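One way to picture "approaching this as a system" is a shared capability matrix that every team can check against. The sketch below is purely illustrative: the names PLATFORM_CAPABILITIES, platforms_for, and uncovered are hypothetical and not part of any Alexa tooling, and the matrix only restates the alignment described above.

```python
# Hypothetical capability matrix mirroring the post-testing alignment:
# voice: control + query, mobile: settings + query, multi-modal: query.
PLATFORM_CAPABILITIES = {
    "voice": {"control", "query"},
    "mobile": {"settings", "query"},
    "multi-modal": {"query"},
}


def platforms_for(capability):
    """Return the platforms that offer a capability, sorted by name."""
    return sorted(
        platform
        for platform, capabilities in PLATFORM_CAPABILITIES.items()
        if capability in capabilities
    )


def uncovered(required):
    """Return the required capabilities that no platform offers."""
    offered = set().union(*PLATFORM_CAPABILITIES.values())
    return set(required) - offered


if __name__ == "__main__":
    # Which platforms can a customer use to control the appliance?
    print(platforms_for("control"))
    # Is every capability we need offered somewhere in the system?
    print(uncovered({"control", "query", "settings"}))
```

A check like uncovered() turns the cross-team QA question from "does each platform do everything?" into "does the system as a whole cover every customer need at least once?"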
Launch, Conclusion, and Takeaways
As of January 2018, we have launched our initial feature set for microwaves on Alexa. We launched with a full voice interface, setup on mobile, and informational query states on Echo screen devices (shown here). In late October we actually had the VUI done, and had the choice of whether to launch or hold back for the full GTM suite. We could have served customers with existing microwaves on their account (a very small %) but chose to push the launch back for a stronger marketing strategy. I pushed to release the design early for initial iterative insights, but one of our main launch partners held the launch to fit their own marketing schedule. It was bittersweet to hold back a design on which we could have been getting valuable feedback, but ultimately it was more important to stay aligned with our launch partners.
When we did launch, we closely measured voice interaction friction, engagement, adoption, and CSAT scores. On balance, our utterances were succeeding, but adoption was low. My theory is that customers are either failing on their first intents and never coming back, or succeeding in their initial attempts and then using the product frequently. Similarly, engagement was high, but CSAT was low. This points toward a product that is useful but lacks the enjoyment factor that inspires customers to become promoters.
The most compelling takeaway for me was the balance of engagement and NPS/CSAT. It told me I had been too utilitarian with my design: I had made it too bare-bones and hadn't focused on the WOW features. Because of this we are taking time to focus on improving the recipe/preset list (e.g. "Alexa, cook my chicken dinner"). I am happy with these findings, as we can iterate and build on what is already there.