Successfully implementing Cortana in your UWP app

Cortana is a great way to bring some of that Windows 10 platform goodness into your app.  With Windows 10 and UWP, it is now possible to get Cortana integrated into your app for mobile devices, tablets, and desktops. At InfernoRed Technology, we have worked with several of our favorite clients and partners, including National Public Radio (NPR), Sesame Street, and Nexia, to build successful Cortana experiences into their Windows 10 apps.

There are a couple of approaches you can use to bring Cortana into your app.  The approach you take will depend on what or how much you want to do with Cortana and what you want your end user’s experience to be.  Like many things in mobile app development, just because something is technically possible doesn't mean it's the best way from an end user’s perspective.  As such, it is important to carefully think through and design your Cortana experience up front.

Design up front

At first, especially for a developer, implementing Cortana may seem like a low-level OS integration bound together by some key phrases and commands.  However, it is just as much, if not more, an exercise in user experience (UX).  As with most UX work, if the Cortana integration is not done well and is not easily discoverable, it will not be used.

Here are some key points to keep in mind while thinking through the design of how Cortana will play a role in your app:

1. Follow the Microsoft design guidelines:

Microsoft provides highly detailed design specifications about working with Cortana in different scenarios and how to properly balance the UX between Cortana (within the OS) and your app.

Microsoft also provides documentation on the type of language and syntax to use when structuring your app’s voice commands.

2. Treat each Cortana interaction as an individual feature:

As you think through the many ways you might leverage Cortana, the ideas can become overwhelming and hard to keep track of.  The best way to manage them is to treat each one like an individual feature or story and document it as such.  There are several ways of doing this: you can make a spreadsheet, write a traditional design document, or simply include the details in the story wherever you manage the project (e.g., Trello, Jira, Pivotal Tracker).  Regardless of how you choose to do it, here are some things to include for each voice command feature in your app:

  • Purpose: What will this individual interaction/command allow the end user to do?  (Example: “Listen to a specific podcast”).
  • Command syntax: How should the user phrase the command to Cortana?  This includes optional terminology, where the app name should appear in the command, as well as how to specify search terms.

(Example: “Play [the] {searchPhrase} podcast on NPR One”).

  • Alternative commands: What other possible ways might the end user issue this command?  It is possible for the voice command definition to contain multiple phrases for a single command.  This gives your users some flexibility in how they say or type the command to Cortana.

(Example: “Play [the] {searchPhrase} podcast on NPR One” and “Listen to [the] {searchPhrase} podcast on NPR One” will allow the user to say “Play” or “Listen”, with the same result either way).

  • Cortana’s response: How should Cortana respond to the user’s command?  Depending on the scenario, this might be multiple phrases.  Also consider how Cortana should respond if no results are found or the required action cannot take place for some reason.

(Example: When a user asks Cortana to play a specific podcast, Cortana might first respond with something like “Ok, I’m looking for the {searchPhrase} podcast” and then, once she finds the podcast, respond with something like “Here you go!”.  In other scenarios Cortana might ask for some sort of confirmation (Yes/No) or even display a list of results for the user to select from).

  • App launch: What should happen in the foreground app (if anything)?

(Example: Open the app to the specified podcast page and begin playing the latest episode).

Depending on how your team is made up, this task may fall to you, the developer, or it might be something one of the design architects takes on.  Either way, I believe it is instrumental to the success of integrating Cortana in your app.  As you will read in the design guidelines and the technical documentation, it is important to provide feedback to the user through Cortana every step of the way; otherwise, the command may get terminated.  For example, make sure you plan on letting the user know when Cortana is busy doing something, or when Cortana runs into an issue.
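To make this concrete, here is a minimal sketch of how the podcast example above might translate into a voice command definition (VCD) file.  The command set name, command name, and example phrases are illustrative assumptions, not taken from a shipping app; the element names and the 1.2 schema namespace come from the VCD format itself.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sketch: the Name values and example phrases are illustrative. -->
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-us" Name="PodcastCommands_en-us">
    <!-- The app name the user says as part of the command. -->
    <AppName>NPR One</AppName>
    <Example>Play the news podcast on NPR One</Example>

    <Command Name="playPodcast">
      <Example>Play the news podcast on NPR One</Example>
      <!-- Command syntax: [the] is optional, {searchPhrase} is free-form. -->
      <ListenFor RequireAppName="AfterPhrase">play [the] {searchPhrase} podcast</ListenFor>
      <!-- Alternative command: a second phrase for the same command. -->
      <ListenFor RequireAppName="AfterPhrase">listen to [the] {searchPhrase} podcast</ListenFor>
      <!-- Cortana's response when the command is recognized. -->
      <Feedback>Ok, I'm looking for the {searchPhrase} podcast</Feedback>
      <!-- App launch: hand the command to the foreground app. -->
      <Navigate />
    </Command>

    <!-- Declares {searchPhrase} as a dictated search term. -->
    <PhraseTopic Label="searchPhrase" Scenario="Search" />
  </CommandSet>
</VoiceCommands>
```

Each row of the feature sheet maps to an element here, which makes it fairly easy to check the VCD against the design during review.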

3. Consider authenticated scenarios:

A lot of the sample apps out there demonstrate Cortana very well, but like most samples, they aren't exactly real-world focused.  Imagine your app requires your users to authenticate before you are able to show them any data.  This is a pretty common scenario for most apps.  However, if you follow the implementation from the sample apps, things won't go exactly how you want them to.  If your Cortana integration needs to call an API with an authentication token, for example, the Cortana commands can't do anything useful until after the user has successfully signed in to the app.  To overcome this, design for the scenario where Cortana doesn't have access to the data it needs in order to process the user's command.  The simplest thing to do is have Cortana respond to the user with a message asking them to sign in first.
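Here is a rough sketch of what that sign-in check could look like, assuming the command is handled by a background voice command service (described under the technical implementation below).  The class name, the "IsSignedIn" settings key, and the messages are hypothetical; the VoiceCommandServiceConnection calls are the standard Windows.ApplicationModel.VoiceCommands APIs.  The task also has to be registered as an app service in the package manifest and referenced from the VCD, which is not shown here.

```csharp
using System;
using System.Threading;
using Windows.ApplicationModel.AppService;
using Windows.ApplicationModel.Background;
using Windows.ApplicationModel.VoiceCommands;
using Windows.Storage;

// Hypothetical voice command service; names and messages are illustrative.
public sealed class PodcastVoiceCommandService : IBackgroundTask
{
    private BackgroundTaskDeferral deferral;

    public async void Run(IBackgroundTaskInstance taskInstance)
    {
        // Keep the task alive while we await the calls below.
        deferral = taskInstance.GetDeferral();
        taskInstance.Canceled += (sender, reason) => CompleteDeferral();

        try
        {
            var details = taskInstance.TriggerDetails as AppServiceTriggerDetails;
            if (details == null) return;

            var connection =
                VoiceCommandServiceConnection.FromAppServiceTriggerDetails(details);
            connection.VoiceCommandCompleted += (sender, args) => CompleteDeferral();

            // command.CommandName and command.Properties describe what was asked.
            VoiceCommand command = await connection.GetVoiceCommandAsync();

            if (!IsSignedIn())
            {
                // No token available: tell the user instead of failing silently.
                await connection.ReportFailureAsync(
                    CreateResponse("Please sign in to NPR One first."));
                return;
            }

            // Keep the user informed while the authenticated call runs.
            await connection.ReportProgressAsync(CreateResponse("Ok, looking that up."));
            // (Call the authenticated API here, then report the outcome.)
            await connection.ReportSuccessAsync(CreateResponse("Here you go!"));
        }
        finally
        {
            CompleteDeferral();
        }
    }

    // Assumption: the foreground app sets this flag after a successful sign-in.
    private static bool IsSignedIn()
    {
        return ApplicationData.Current.LocalSettings.Values.ContainsKey("IsSignedIn");
    }

    private static VoiceCommandResponse CreateResponse(string text)
    {
        var message = new VoiceCommandUserMessage
        {
            DisplayMessage = text,
            SpokenMessage = text
        };
        return VoiceCommandResponse.CreateResponse(message);
    }

    // Completes the deferral at most once, however many callers race here.
    private void CompleteDeferral()
    {
        BackgroundTaskDeferral pending = Interlocked.Exchange(ref deferral, null);
        pending?.Complete();
    }
}
```

An alternative worth considering is RequestAppLaunchAsync, which asks Cortana to open the app so the user can sign in right away.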

4. Be aware of what the OS already provides:

When thinking about how you want Cortana to aid your app's users, be aware of what the OS already provides.  There may be some commands built into the OS that you get for free.  Leveraging these OS-provided commands not only saves on development cycles, but also prevents ambiguity.  Take audio apps, for example: there is no need to implement commands for playing, pausing, skipping, or rewinding the audio in your app because Cortana already provides voice commands for the default system-level audio controls.  This means a user can pause your app via Cortana the same way they would pause their music in Groove or any other music app.

5. Consider multiple languages:

Depending on your user base, it might be a nice touch to support multiple languages for your Cortana commands.  Even if your app isn’t localized for other cultures, you can still easily allow your users to interact with your app using their voice in their preferred language.  Think about your end users and what will make their experience with your app the best.  You can support alternate languages by simply providing additional command sets, one per language, within the voice command definition XML file.
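As a rough illustration, a VCD file with one command set per language might look like the sketch below.  The command names and the Spanish phrasing are assumptions made for the example; the point is the second CommandSet with its own xml:lang value, which is the set used when the user's speech language matches.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sketch: names and phrases are illustrative. -->
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.2">
  <CommandSet xml:lang="en-us" Name="PodcastCommands_en-us">
    <AppName>NPR One</AppName>
    <Example>Play the news podcast on NPR One</Example>
    <Command Name="playPodcast">
      <Example>Play the news podcast on NPR One</Example>
      <ListenFor RequireAppName="AfterPhrase">play [the] {searchPhrase} podcast</ListenFor>
      <Feedback>Ok, I'm looking for the {searchPhrase} podcast</Feedback>
      <Navigate />
    </Command>
    <PhraseTopic Label="searchPhrase" Scenario="Search" />
  </CommandSet>

  <!-- Same command name, so the app-side handling does not change. -->
  <CommandSet xml:lang="es-es" Name="PodcastCommands_es-es">
    <AppName>NPR One</AppName>
    <Example>Reproducir el podcast de noticias en NPR One</Example>
    <Command Name="playPodcast">
      <Example>Reproducir el podcast de noticias en NPR One</Example>
      <ListenFor RequireAppName="AfterPhrase">reproducir [el] podcast {searchPhrase}</ListenFor>
      <Feedback>Buscando el podcast {searchPhrase}</Feedback>
      <Navigate />
    </Command>
    <PhraseTopic Label="searchPhrase" Scenario="Search" />
  </CommandSet>
</VoiceCommands>
```

Keeping the Command Name identical across command sets means the code that handles the command only needs to look at the name, not the language.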

Selecting the right technical implementation

Now that you have a nice design definition of how each Cortana interaction should work in your app, it is time to decide how to implement each one.  Some interactions might require secondary actions from the user, whereas others can go directly into the app using just the initial command.  In some cases, you might not even need the app to open in order to complete the task associated with the user's command.

There are two main ways of handling Cortana voice commands in your app.  You can use either one, or a combination of both of them.

  • Foreground: This means you handle the command directly in the app's activation using the VoiceCommand ActivationKind.
  • Background (Voice Command Service): This means you handle the command in a separate background task (IBackgroundTask).

If you want to tailor the experience of how responses are handled within the Cortana interface, then you must use the background task approach.  For example, if your user issues a search command, you might want to display the results directly within the Cortana interface and allow them to select a choice.  The background task approach is also good for confirming things with your user or if you need to complete a task based on the user’s command but don't need to run the foreground app (for example, deleting an item, or marking an item complete in a to-do list app scenario).
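For comparison, the foreground approach can be sketched roughly as follows.  This assumes a VCD command named "playPodcast" with a {searchPhrase} label and a page class called PodcastPage defined elsewhere in the app; all three names are hypothetical.  The activation types and the SpeechRecognitionResult members are the standard UWP APIs.

```csharp
using System.Collections.Generic;
using Windows.ApplicationModel.Activation;
using Windows.Media.SpeechRecognition;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

sealed partial class App : Application
{
    protected override void OnActivated(IActivatedEventArgs args)
    {
        base.OnActivated(args);

        // Only handle activations that came from a Cortana voice command.
        if (args.Kind != ActivationKind.VoiceCommand) return;

        var commandArgs = (VoiceCommandActivatedEventArgs)args;
        SpeechRecognitionResult result = commandArgs.Result;

        // The first rule path entry is the Command Name from the VCD.
        string commandName = result.RulePath[0];

        // Labels such as {searchPhrase} arrive as semantic properties.
        string searchPhrase = null;
        IReadOnlyList<string> values;
        if (result.SemanticInterpretation.Properties.TryGetValue("searchPhrase", out values)
            && values.Count > 0)
        {
            searchPhrase = values[0];
        }

        // Make sure there is a frame to navigate with.
        var rootFrame = Window.Current.Content as Frame;
        if (rootFrame == null)
        {
            rootFrame = new Frame();
            Window.Current.Content = rootFrame;
        }

        if (commandName == "playPodcast")
        {
            // PodcastPage is a hypothetical page that looks up the podcast
            // and starts playback of the latest episode.
            rootFrame.Navigate(typeof(PodcastPage), searchPhrase);
        }

        Window.Current.Activate();
    }
}
```

The background approach swaps the Navigate element in the VCD for a VoiceCommandService element and moves this logic into an IBackgroundTask, which is what lets you respond inside the Cortana interface.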

A note about performance

The Cortana documentation suggests installing the voice command definition (VCD) each time the app runs to ensure it is always available and up to date.  However, I have found that the VCD installation adds upwards of 12 seconds to the app startup time.  This is definitely not a good user experience and conflicts with other UX-related guidelines specified by Microsoft.  There are a couple of ways around this.  Since the installation method (InstallCommandDefinitionsFromStorageFileAsync) is async, I typically start it without awaiting it and attach a ContinueWith continuation using the OnlyOnFaulted continuation option.  This allows the app startup process to continue while the VCD installs, and the installation task can finish silently; if it runs into an issue, the failure can be logged in the continuation action.
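A minimal sketch of that pattern is shown below.  The file name VoiceCommands.xml, the helper method names, and the use of Debug.WriteLine for logging are assumptions made for the example; substitute whatever your app actually uses.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Windows.ApplicationModel.VoiceCommands;
using Windows.Storage;
using Windows.UI.Xaml;

sealed partial class App : Application
{
    // Call this from OnLaunched without awaiting it, so startup is not blocked.
    private static void InstallVoiceCommandsInBackground()
    {
        InstallVoiceCommandsAsync().ContinueWith(
            // Runs only if the installation task faulted.
            task => Debug.WriteLine("VCD installation failed: " + task.Exception),
            TaskContinuationOptions.OnlyOnFaulted);
    }

    private static async Task InstallVoiceCommandsAsync()
    {
        // Assumption: the VCD ships in the package root as VoiceCommands.xml.
        StorageFile vcdFile = await StorageFile.GetFileFromApplicationUriAsync(
            new Uri("ms-appx:///VoiceCommands.xml"));

        await VoiceCommandDefinitionManager
            .InstallCommandDefinitionsFromStorageFileAsync(vcdFile);
    }
}
```

Because nothing awaits the returned task, a failed installation will not crash startup, which is exactly why the faulted continuation matters: it is the only place the error surfaces.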

In conclusion

I hope some, if not all, of these pointers help you think about how you will integrate Cortana into your Windows 10 apps.  Designing and thinking through the little details up front will help ensure you have a successful Cortana experience that your users will love to use.