Thesis Advisors: Dan Lockton, Daragh Byrne
Duration: 8 Months
Disclaimer: I was not affiliated with any of the companies or businesses mentioned in this thesis while conducting this research. All trademarks are used for illustrative purposes only. All registered trademarks and copyrighted materials are the property of their respective owners.
Conversational agents are computer programs that interact with humans through natural-language conversation. Commercially available conversational agents can talk to humans through text or voice. Because these programs promise to communicate with humans in a way that humans are already good at, more and more agents (a.k.a. chatbots) are launched every day.
While all of these agents converse with us, they don't get to talk with each other. Today's chatbots can't help us with complex tasks, and they have only just started referring users to one another to overcome this limitation. In my thesis, I argue that the currency for these hand-off moments will be trust.
My goal with my thesis was to explore what a collaborative future may hold for agents and how trust dynamics would change in such a collaboration. Based on findings from my earlier design experiments and research, I designed a Wizard of Oz prototype to compare two scenarios with a travel-booking conversational agent system, described in the final experiment below.
Then I synthesized my findings into an actionable design guideline: the conversational trust design checklist, which offers interaction designers interested in designing for trust 14 implications, organized into five categories.
Some users assume that every party in a multi-party conversation can access their data. Tell users how their data is used and what each party knows about them.
A conversational agent should not refer users to others (agents or websites) unless it is confident they can handle the task. Communicate uncertainty with an indicator.
When an action carries risk for the user, such as confirming a payment, provide detailed and specific feedback to stay transparent.
Communicate the reasoning behind an agent's actions and recommendations. Give users a way to fact-check the bot's suggestions and decision-making.
Users may want to change or update the information they provide to the agent; enable them to do so efficiently.
In case of failure, give a reason, and offer a safe exit after two failed attempts so you don't lose the user. Offer to complete the failed task later, automatically.
Some users will not yet be comfortable chatting with a bot for high-stakes transactions. Don't be prescriptive; provide alternatives.
Clearly state what the bot can and cannot do, and how well it can understand the user, to prevent communication breakdowns. The agent's name can also shape people's expectations.
Build upon earlier parts of the conversation. Give users a sense of memory, and a way to make the bot forget if needed.
Users expect to see the status of what the bot is doing, and they expect an answer from a virtual agent faster than from a human; late responses raise questions about its reliability. A visual indicator showing that the bot is typing or processing makes users perceive the bot as more human and the interaction as faster, even when it actually takes longer.
Don't make the user feel interruptions, and try not to surface the seams in the conversation. Don't emphasize or humanize the hand-offs. Keep the first introduction in a hand-off concise and connect it back to the conversation.
Relevant visual elements tend to increase the trustworthiness of a conversational interface. To build credibility and show competence, include visual brand symbols such as logos where possible.
Users expect to enter their payment information in secure, encrypted forms that sit at a cognitively higher level than the conversation. Leverage solutions that can show the security level of the transaction, such as a webview that opens a secure https:// page.
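The uncertainty and referral items above could be operationalized, for instance, as a simple confidence gate on hand-offs. A minimal Python sketch, assuming a hypothetical `Referral` record and a made-up confidence threshold (none of these names come from the thesis prototype):

```python
# Hypothetical sketch: a confidence-gated referral that surfaces uncertainty
# instead of silently handing the user off. Names and threshold are illustrative.
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.75  # assumed cutoff; would need tuning in practice

@dataclass
class Referral:
    target: str        # the expert agent (or website) being recommended
    confidence: float  # the bot's estimate that the target can handle the task

def decide_referral(referral: Referral) -> str:
    """Return the message a bot might send, making uncertainty explicit."""
    if referral.confidence >= CONFIDENCE_THRESHOLD:
        return f"I'll hand you over to {referral.target} for this."
    # Below the threshold: don't refer silently; flag the uncertainty instead.
    return (f"I'm not sure {referral.target} can handle this "
            f"(confidence: {referral.confidence:.0%}). Want to try anyway?")
```

In practice the confidence estimate would come from the agent platform's intent classifier; the point of the sketch is only that the referral decision and the uncertainty indicator are driven by the same number.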
In October 2018, I presented this project at the Voice UX Meetup in the San Francisco Bay Area, hosted by Botsociety. The event was held at Google Launchpad in San Francisco. Overall, it was a great opportunity to present alongside Andrew Ku, a conversation designer at Google. The audience was highly engaged and interested in the project, and one attendee, Chaitrali B, even sketched these amazing notes summarizing the talks.
Through literature review, expert interviews, workshops, and formative testing, I learned about the nuances of trust and conversation design.
From my literature review, I developed a model for my thesis, which I call conversational trust. In summary, our trust in conversational agents depends on how predictable they are and how well they complete their tasks, fulfilling our expectations and lowering our perceived level of risk. This relationship also needs to be sustained to stay consistent.
In addition to conversational trust, I also referred repeatedly to a widely used definition of trust from the information systems field, which includes dimensions such as competence, benevolence, and integrity.
To start with, I decided to test some of these dimensions of trust with a design experiment. I developed a chatbot called Botae on Facebook to see to what extent users are willing to share their personal data and how that willingness relates to trust.
Botae was incompetent yet benevolent by design. I intentionally tested it only with my friends, people with whom I have a personal relationship, as I was very aware of the ethical implications of using deception in human-computer interaction research.
Botae also made me realize that trust is highly contextual. For example, you may trust your designer colleagues with designing, but not necessarily with mowing your lawn.
Considering that we rely on trust more as situations become riskier, I decided to scope my thesis down to the context of e-commerce, for two reasons:
To learn more about conversational agents and test my initial assumptions about online shopping experiences, I decided to create another chatbot to survey a potential target user group: university students.
After my interviews with industry experts, I focused on the challenges that current conversational agents face. As a framework, I used the classification that Amir Shevat, VP of Product at Slack, explains in his book "Designing Bots":
Amir also argues that most personal assistants are trying to become meta-chatbots by combining multiple simple tasks or domains, such as setting an alarm, playing music, or adding items to a grocery list.
All four “assistants” also have access to a community of domain-specific agents. In other words, generalist assistants are also becoming marketplaces for expert agents.
While the idea behind them is the same, assistants use two different system architectures for communicating with domain-specific agents:
For my final design experiment, I scoped down my context even further. Informed by a research report on how consumers don’t trust travel chatbots with booking their travel, I decided to focus on trip booking as it is a complex scenario. To understand how trip-planning chatbots work, I reviewed the user experience of 25 travel chatbots on the market.
To understand people's mental models when they experience a challenge that involves multiple actors, I ran a scenario-building workshop with six participants. I identified two insights from this workshop:
The feedback from this brief workshop inspired my final design experiment: using 'experience breakdowns' and agents as seams, similar to Kevin Gaunt's project on smart homes. In that work, Kevin used multiple chatbots to create a "seamful" experience, illustrating Mark Weiser's proposal that experiences should include "beautiful" seams rather than trying to be seamless.
Scoping down to a fragmented experience, the travel-booking journey, I designed a travel chatbot on Slack and asked six college students to test it, without them knowing that humans were pretending to be chatbots behind the curtain. I used the bot-making tool Walkie to design multi-agent conversations by writing sample dialogs.
In designing Destination, my aim was to compare how trust changes across two agent-collaboration scenarios.
Scenario 1: Bot-to-service composition
The first scenario involved bot-to-service composition, in which users interacted with different bots to handle various tasks. As part of their role-play, participants were asked to explore Destination 2.0 to book a trip to New Orleans with one of their friends.
Scenario 2: A negotiation scenario with a meta-chatbot
The second scenario involved a meta-agent, in which participants interacted with a single bot to handle different tasks. As part of their role-play, participants learned that their friend had to come back a day earlier. For this reason, they were asked to change their flight tickets and book a hotel reservation for their trip.
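As a rough illustration of how the two setups differ structurally, here is a minimal sketch. All class and method names are hypothetical; the actual prototype was a human-operated Wizard of Oz setup, not code.

```python
# Hypothetical sketch contrasting the two scenarios. In scenario 1 the user is
# handed off to each expert bot in turn (bot-to-service composition); in
# scenario 2 a single meta-agent talks to the user and delegates behind the
# scenes. Names are illustrative, not from the thesis prototype.

class ExpertBot:
    """A domain-specific agent, e.g. for flights or hotels."""
    def __init__(self, domain: str):
        self.domain = domain

    def handle(self, task: str) -> str:
        return f"[{self.domain} bot] done: {task}"

# Scenario 1: the user converses with each expert bot directly,
# so every hand-off is a visible seam in the conversation.
def bot_to_service(tasks: dict) -> list:
    return [ExpertBot(domain).handle(task) for domain, task in tasks.items()]

# Scenario 2: the user talks only to the meta-agent, which delegates
# to experts internally and replies in a single voice.
class MetaAgent:
    def __init__(self, experts: dict):
        self.experts = experts  # maps domain name -> ExpertBot

    def handle(self, domain: str, task: str) -> str:
        result = self.experts[domain].handle(task)  # delegation stays hidden
        return f"[meta-agent] {result}"
```

The design question the experiment probed is visible even in this toy form: in the first architecture the user sees each specialist's voice and must extend trust at every hand-off, while in the second the meta-agent's single voice absorbs those seams.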