Designing for Trust

Exploring trust and collaboration in 
conversational agents for e-commerce

An actionable guideline on how to design for trust, aimed at interaction designers who build conversational interfaces for e-commerce applications.

Thesis Advisors: Dan Lockton, Daragh Byrne 

Duration: 8 Months

Disclaimer: I was not affiliated with any of the companies or businesses mentioned in this thesis while conducting this research. All trademarks are used for illustrative purposes only. All registered trademarks and copyrighted materials are the property of their respective owners.


The opportunity

Conversational agents are computer programs that interact with humans through natural-language conversation. Commercially available conversational agents can talk to humans through text or voice. Because these programs promise to communicate with humans in a way that humans are already good at, more agents (a.k.a. chatbots) are launched every day.

While all of these agents converse with us, they rarely talk to each other. Today’s chatbots can’t help us with complex tasks, and they have only just started referring to each other to overcome this limitation. In my thesis, I argue that the currency for these hand-off moments will be trust.

Background Image: Bot Ecosystem 2017 by Keyreply.ai

Conversational Trust Design Checklist

My goal with my thesis was to explore what a collaborative future may hold for agents and how trust dynamics would change in such a collaboration. Based on findings from my earlier design experiments and research, I designed a Wizard of Oz prototype to compare two scenarios with a travel-booking conversational agent system:

  1. A negotiation scenario, in which a meta-agent bargained with other agents on behalf of the user.
  2. A bot-to-service composition, in which users interacted with multiple agents for specific tasks.

Then I synthesized my findings into an actionable design guideline: the Conversational Trust Design Checklist offers interaction designers who are interested in designing for trust 14 implications across five categories.

Be transparent

Share what agents (need to) know about the user

While some users assume that every party in a multi-party conversation can access their data, users should be told how their data is used and what each party knows about them.

Be transparent

Refer others cautiously, visualize confidence level

A conversational agent should not refer to others (agents or websites) if it is not confident that they can handle the task. Communicate uncertainty with an indicator.
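As a rough illustration of this guideline (my sketch, not part of the thesis), the referral decision could be gated on a confidence score that is also surfaced to the user; the threshold, labels, and bot name below are hypothetical:

```python
from typing import Optional

# Hypothetical sketch: only refer to another agent when confidence
# clears a threshold, and expose the confidence level to the user.
def referral_message(target: str, confidence: float,
                     threshold: float = 0.6) -> Optional[str]:
    """Return a referral message with a confidence indicator,
    or None when the agent should not refer at all."""
    if confidence < threshold:
        return None  # not confident enough to hand the user off
    label = "high" if confidence >= 0.85 else "moderate"
    return (f"I think {target} can handle this "
            f"(confidence: {label}, {confidence:.0%}).")

print(referral_message("HotelBot", 0.9))  # refers, with indicator
print(referral_message("HotelBot", 0.4))  # None: no referral
```

In a real product the indicator would more likely be a visual element than a percentage in the copy, but the gating logic is the same.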

Be transparent

Give specific feedback to clarify

When there is a risk for the user, such as confirming before a payment, provide detailed and specific feedback to be transparent.

Give control to the user

Enable users to review the bot’s decision-making

Communicate the reasoning behind agents’ actions and recommendations. Provide a way for users to fact-check the bot’s suggestions and decision-making.

Give control to the user

Provide room for revisions

Users may want to change or update the information they provide to the agent; enable them to do so efficiently.

Give control to the user

Fail gracefully, offer auto-recovery

In case of failure, provide a reason, and after two failed attempts offer a safe exit so you don’t lose the user. Offer to complete the failed task later, automatically.
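A minimal sketch of this rule (my illustration, not from the thesis): count consecutive failures, and after the second one switch from a retry prompt to a reasoned safe exit with an auto-recovery offer. The wording is hypothetical:

```python
# Hypothetical sketch of "fail gracefully, offer auto-recovery":
# after two consecutive failures, stop retrying and offer a safe
# exit plus automatic recovery later.
def failure_response(consecutive_failures: int, task: str) -> str:
    if consecutive_failures < 2:
        # First failure: give a reason and invite a retry.
        return (f"Sorry, I couldn't complete {task} because the "
                "service didn't respond. Want me to try again?")
    # Second failure: safe exit plus an auto-recovery offer.
    return (f"I still can't complete {task}. I'll retry it "
            "automatically later, or you can finish it on the "
            "website instead.")

print(failure_response(1, "your hotel booking"))
print(failure_response(2, "your hotel booking"))
```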

Give control to the user

Provide alternatives for agents

Some users will not yet be comfortable chatting with a bot for their high-stakes transactions. Don’t be prescriptive; provide alternatives.

Be relevant

Set the expectations

Clearly state what the bot can and cannot do, and how well it can understand the user, to eliminate communication breakdowns. The name of the agent can also affect people’s expectations.

Be relevant

Remember the context and forget it when asked

Build upon the previous bits of the conversation. Give users a sense of memory and a way to make the agent forget if needed.

Be responsive

Indicate writing and processing visually

Users expect to see the status of what the bot is doing, and they expect to get an answer from a virtual agent more quickly than from a human. Late responses raise questions about its reliability. A visual indicator that shows whether the bot is writing or processing makes users perceive the bot as more human and the interaction as faster, even when it is actually longer.
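One way to sketch this guideline (hypothetical states and wording, not from the thesis) is to map the bot's internal state to a visible indicator so a delay never looks like silence:

```python
# Hypothetical sketch: map the bot's internal state to a visible
# status indicator so the wait never looks like silence.
def status_indicator(state: str, task: str = "") -> str:
    if state == "writing":
        return "typing..."  # composing a short reply
    if state == "processing":
        # Long-running work: name the task to keep users informed.
        return f"Searching {task}..." if task else "Working on it..."
    return ""  # idle: show nothing

print(status_indicator("writing"))
print(status_indicator("processing", "flights to New Orleans"))
```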

Be responsive

Don’t indicate hand-offs

Don’t make the user feel any interruptions, and try not to surface the seams in the conversation. Don’t emphasize or humanize the hand-offs. Keep the first introduction in a hand-off concise and connect it back to the conversation.

Be visual

Use visual elements to increase the credibility

Relevant visual elements tend to increase the trustworthiness of a text-based interface.

Be visual

Include branding where possible

To build credibility and show competence, include visual brand symbols such as logos where possible.

Be visual

Provide secure gateways

Users expect to enter their payment information into secure, encrypted forms at a cognitively higher level than the conversation. Leverage solutions that can show the security level of the transaction, such as a webview with a secure https:// page.

The impact (so far)

Presenting @ Voice UX Meetup #3

In October 2018, I presented this project at Voice UX Meetup #3 in the San Francisco Bay Area, hosted by Botsociety and held at Google Launchpad in San Francisco. It was a great opportunity to present alongside Andrew Ku, a conversation designer at Google. The audience was highly engaged and interested in the project, and one attendee, Chaitrali B, even sketched amazing notes summarizing the talks.


Going Forward

High-level Learnings

  • Trust is a complex phenomenon, and it is crucial to successful conversational agents.
  • Measuring trust, particularly behavioral trust, is a challenge.
  • Fully establishing trust in conversational agents will take time, as it depends on our ever-increasing expectations.
  • Collaboration between conversational agents is ongoing and offers interesting opportunities and challenges for designers.


  • Trust is complex. Working across everything from high-level design strategies and architectures down to granular visual and conversation design decisions made me realize how vital, yet complex, trust is for establishing and maintaining the relationship between humans and technological artifacts.

  • Conversations are for building trust. Combining trust and conversation into a single model taught me how ‘building’ trust runs parallel with conversing. In other words, I learned how trust becomes the outcome of a conversation.

  • Just enough research is what is necessary. The short timespan of the thesis taught me how much primary and secondary research is necessary to advance, and how to research through my skills in design and making.

  • Trust is going to be more important in the future, and I am just starting... With this thesis project as a foundation for my future work on trust, I believe I still have a lot to learn, test, and verify as a researcher and designer in order to design interactions that users will trust and start a conversation with.

A rolling to-do list

  • Re-evaluate findings with voice-based UIs
  • Refine the checklist by testing it with more designers
  • Research ways to make the final checklist accessible to designers
  • Share and pass knowledge and my design methods

Process overview: Research through design

Through literature review, expert interviews, workshops, and formative tests, I learned about the nuances of trust and conversation design.


Conversational Trust: Simplified

From my literature review, I developed a model for my thesis, which I call conversational trust. In summary, our trust in conversational agents depends on how predictable they are and how good they are at completing their tasks, which together fulfill our expectations and lower our perceived level of risk. To remain consistent, this relationship also needs to be sustained over time.

Dimensions of Trust in Technology

In addition to conversational trust, I also referred many times to a widely used definition of trust in the information systems field, which includes dimensions such as competence, benevolence, and integrity.


Botae - An Incompetent yet Benevolent Chatbot

To start, I decided to test some of these dimensions of trust with a design experiment. I developed a chatbot called Botae on Facebook to see to what extent users are willing to share their personal data and how that willingness relates to trust.

Botae was incompetent yet benevolent by design. I intentionally tested it only with my friends, people with whom I have a personal relationship, as I was very aware of the ethical implications of using deception in human-computer interaction research.

From Botae, I learnt that...

  • Social influence and referrals are key to users’ trust in accepting stranger agents.
  • If referred by a trusted party, some may (over)trust conversational agents with their data.
  • Some don’t know what agents already know about them.
Image captions: Quick-reply buttons like “Got it” made the conversation end quicker. The use of certain emojis added visual credibility. Botae sending an image to a user.

Why research trust in e-commerce agents?

Botae also made me realize that trust is highly contextual. For example, you may trust your designer colleagues with designing, but not necessarily with mowing your grass.

Considering that the stakes rise as we place more trust in agents, I decided to scope my thesis down to the context of e-commerce for two reasons:

  1. People often don’t trust e-commerce bots with money-related experiences, one of the most important business metrics for bots.
  2. Typical e-commerce relationships are short-term and transactional rather than collaborative and long-term.

Surveybot – A bot that surveys for other bots

To learn more about conversational agents and to test my initial assumptions about online shopping experiences, I decided to create another chatbot to survey a potential target user group: university students.


From Surveybot, I learnt that university students...

  • trust agents to do mundane, low-value transactions.
  • may want an agent’s help with deciding what to buy, finding where to buy something, completing their purchase, and handling after-hours issues.

  • do not trust agents with managing valuable assets, and are skeptical of agents’ human-level understanding, intents, memory, and the level of data privacy they provide.
  • do not want an agent’s help because of performance issues such as responsiveness, and because of the fear of losing their agency and the joy of shopping.

  • both favor and dislike having a dumb agent; there is a paradox around agent intelligence.

Word clouds: participants’ three-adjective responses describing the bot they would like and the bot they would hate.

Assistants: Tomorrow’s meta-chatbots

After my interviews with industry experts, I focused on the challenges that current conversational agents may face. As a framework, I used the classification of Amir Shevat, VP of Product at Slack, which he explains in his book "Designing Bots":

  1. Expert agents, which are good at solving problems in a single domain, e.g., most of today's chatbots, such as DoNotPay.
  2. Generalist agents, which are good at solving problems in multiple domains, e.g., personal assistants such as Google Assistant and Apple's Siri.

Amir also argues that most personal assistants are trying to become meta-chatbots by combining multiple simple tasks or domains, such as setting an alarm, playing music, or adding items to a grocery list.

Disclaimer: Trademarks are only for illustrative purposes.

Assistants: Today's marketplaces for expert agents

All four “assistants” also have access to a community of domain-specific agents. In other words, generalist assistants are also becoming marketplaces for expert agents.

Disclaimer: Trademarks are only for illustrative purposes.

Assistants: Different system architectures

While the idea behind them is the same, assistants use two different system architectures for communicating with domain-specific agents:

  • Bot-to-bot referral: In this scenario, an agent that people trust and use (a trusted agent) refers them to a stranger agent. For example, when Cortana receives an intent that exists in its knowledge database, it invites a third-party bot into the conversation.
  • Meta-chatbot: In this scenario, when people ask something of a trusted agent, the agent becomes a mediator between the stranger agent and the people. For example, when people ask something of Siri, it interacts with apps on the back end and returns the result to the person. This way, people only have to interact with Siri, an agent they already trust.
Derived from Amir Shevat's work. Disclaimer: Trademarks are only for illustrative purposes.

Why trust in e-commerce and travel?

For my final design experiment, I scoped my context down even further. Informed by a research report on how consumers don’t trust travel chatbots with booking their travel, I decided to focus on trip booking, as it is a complex scenario. To understand how trip-planning chatbots work, I reviewed the user experience of 25 travel chatbots on the market.

Disclaimer: Trademarks are only for illustrative purposes.

Scenario-Building Workshop: Travel experiences with multiple agents

To understand people’s mental models when they experience a challenge that involves multiple actors, I ran a scenario-building workshop with 6 participants. I identified two insights from this workshop:

  1. Participants described travel booking as a fragmented experience with many different actors involved including themselves, their relatives, friends, apps, websites, and brands.
  2. Some participants found managing their travel after booking challenging, such as changing the date of a flight or rebooking accommodations.

The feedback from this brief workshop inspired my final design experiment: using ‘experience breakdowns’ and agents as seams, similar to Kevin Gaunt’s project on smart homes. In his work, Kevin used multiple chatbots to create a "seamful" experience, illustrating Mark Weiser’s proposal that experiences should include “beautiful” seams rather than trying to be seamless.

Disclaimer: Trademarks are only for illustrative purposes.

A Wizard of Oz prototype: Destination Bot 

Scoping down to one fragmented experience, the travel-booking journey, I designed a travel chatbot on Slack and asked 6 college students to test it; they did not know that humans behind the curtain were pretending to be chatbots. I used the bot-making tool Walkie to design multi-agent conversations by writing sample dialogs.

I used Walkie to write my sample dialogs for a multi-agent conversation. Disclaimer: Trademarks are only for illustrative purposes.

Two collaboration scenarios with multiple agents

In designing Destination, my aim was to compare how trust changes across two agent-collaboration scenarios.

Scenario 1: Bot-to-service composition

The first scenario involved a bot-to-service composition, in which users interacted with different bots to handle various tasks. As part of their role-play, participants were asked to explore Destination 2.0 to book a trip to New Orleans with one of their friends.

Destination bot behaved as expected.
Lodging bot behaved unexpectedly by confirming inaccurate information.
Banking bot behaved as expected.
Manager bot behaved as expected.

Scenario 2: A negotiation scenario with a meta-chatbot

The second scenario involved a meta-agent, in which participants interacted with a single bot to handle different tasks. As part of their role-play, participants learned that their friend had to come back a day earlier. For this reason, they were asked to change their flight tickets and book a hotel reservation for their trip.

Destination bot behaved as expected for changing the flight tickets.
When users tried to book a hotel for the first time, Destination bot gave an error and blamed another bot for it. After this, it behaved as expected.
Destination bot behaved as expected for paying the order total.
Destination bot behaved as expected for surveying customer satisfaction.

© 2018 Meriç Dağlı