DPD and the perils of doing chatbots badly

If you missed it, last week the delivery company DPD landed in some hot water after a tweet from Ashley Beauchamp talking about its chatbot went viral. The story was picked up by multiple publications, including the BBC. Here’s what happened.

Ashley was looking for some information about his orders, which the bot was unable to provide. It was then unable to pass him to a human agent to resolve his queries.

So Ashley decided to have some fun with the bot. In the course of a few more messages the DPD bot:

  • Told jokes
  • Swore at Ashley
  • Wrote both a limerick and a haiku about how useless DPD was
  • Commented on how useless DPD was, and told Ashley not to call them
  • Called itself (DPD) the worst delivery firm in the world. 

(Source: Ashley Beauchamp, X)

This was clearly not a good look for DPD! The firm put it down to an update to its existing AI that caused this issue. It’s unclear which part exactly was the error – the inability to do what Ashley wanted in the first place, or the silliness that followed. 

Either way, it is not a good look for AI chatbots that use large language models. As we know a thing or two about this area, we thought it was best to talk about the benefits. 

How did this happen in the first place? What can firms do to stop this when embracing AI? We have a few ideas and tips that may help. 

The benefits of AI-powered chatbots

It is clear why plenty of firms across a range of industries are using AI-powered chatbots. Provided chatbots are able to perform the same tasks and answer the same questions as humans, the main advantages are:

  • Speed – customers or users get answers to their questions much faster.
  • Availability – AI never sleeps, so users can get answers around the clock, and not just during office hours.
  • Accuracy – computers should not be able to get things wrong in the way that humans might. 

You may have noticed the massive caveat in the paragraph above: Provided chatbots are able to perform the same tasks and answer the same questions as humans. 

The truth is that there is a divide between humans and AI in what each is good at. AI is better at handling factual information and following rules to the letter. Humans are better at sensitivity, creativity, and nuance. The gap between what an AI can do and what a human can do is narrowing, but it still exists.

That’s why we would always advocate that AI works alongside humans. Which brings us to our first problem. 

 

AI should always be able to pass to a human

Well-designed chatbots should be able to answer at least some basic questions – the FAQs, if you will – that come up time and again. When they are unable to do so, the default should be to pass the issue to a human to solve.

This DPD bot failed at that first hurdle. It was unable to help, and then it could not help Ashley get in touch with a human. This is CX 101 stuff: if you are going to have a customer service function, it ought to be able to actually serve and help customers.

One retailer we spoke to gave us an example of sensitive conversations that they’d rather a human always answered: specifically, when a customer mentions a death or illness that has affected them. That is a situation you’d always want the sensitivity of a human to handle.
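
As a rough illustration of the "pass to a human by default" rule described above, here is a minimal Python sketch. Everything in it is hypothetical: the FAQ entries, the list of sensitive terms, and the `respond` function are made up for the example, and simple keyword matching stands in for whatever intent detection a real bot would use.

```python
from dataclasses import dataclass

# Hypothetical FAQ entries and sensitive terms, for illustration only.
FAQ_ANSWERS = {
    "opening hours": "Our support team is available 9am to 5pm, Monday to Friday.",
    "returns": "You can return an item within 30 days of delivery.",
}
SENSITIVE_TERMS = ("death", "died", "passed away", "illness", "hospital")


@dataclass
class Reply:
    text: str
    handed_to_human: bool


def respond(message: str) -> Reply:
    lowered = message.lower()
    # Sensitive topics always go to a person, even if an FAQ would match.
    if any(term in lowered for term in SENSITIVE_TERMS):
        return Reply("I'm passing you to a member of our team.", True)
    # Answer only when a known FAQ topic appears in the message.
    for topic, answer in FAQ_ANSWERS.items():
        if topic in lowered:
            return Reply(answer, False)
    # Default: the bot cannot help, so hand over rather than guess.
    return Reply("I can't answer that, so I'm passing you to an agent.", True)
```

The point of the design is the last line: when nothing matches, the fallback is a handover, not another attempt by the bot.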

 

There should be limits on the questions a chatbot can answer

The vast capabilities of generative AI and Large Language Models mean that people are reluctant to put limitations on what they can do. But without limits, the AI can be prone to going off in very different directions.

Take the swearing. If you really want your bot to swear at customers, then you can, but why would you give it that capability? By the same token, the AI can be trained to ignore certain other questions or requests, or to refuse to answer them. Here’s what our generative AI does when you ask it to swear:

Here, we see it refusing to action my request, and instead, recognising that this is beyond its capabilities, it passes to an agent. That’s exactly what should happen. The AI answers what it is trained to answer, and doesn’t engage in the rest.
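
To make the idea of limits concrete, here is a small Python sketch of a scope check that runs before the language model is called at all. It is an assumption-laden example, not a description of how any particular product works: the topic list, `classify_topic`, `handle`, and the `generate` callback (a stand-in for the call to a language model) are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical topics this bot is allowed to handle, with trigger keywords.
ALLOWED_TOPICS = {
    "tracking": ("track", "parcel", "delivery", "order"),
    "returns": ("return", "refund"),
}


def classify_topic(message: str) -> Optional[str]:
    """Return the first allowed topic whose keywords appear, else None."""
    lowered = message.lower()
    for topic, keywords in ALLOWED_TOPICS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None


def handle(message: str, generate: Callable[[str, str], str]) -> str:
    topic = classify_topic(message)
    if topic is None:
        # Out of scope: do not call the model at all, just hand over.
        return "That's not something I can help with. Passing you to an agent."
    # In scope: only now do we call the (stand-in) language model.
    return generate(topic, message)
```

Keyword matching like this is easy to fool, so treat it as a placeholder for a proper intent classifier. The useful property is that an out-of-scope request never reaches the model in the first place.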

Another factor here is cost. A lot of firms are building their bots on top of a licensed Large Language Model, such as one of OpenAI’s models. This means that every time they use the generative capabilities through an API, they are paying a fee to do so.

So in essence, DPD are likely paying for a bot to play stupid games with customers. While the cost might be small in the grand scheme of things, it does seem like a bad use of money.

 

A chatbot should not be allowed to insult the company!

I’m guessing if a human agent called the company they represented “the worst”, they would likely not have a job the next day. But this AI bot was not trained to prevent this from happening. 

If an AI is trained on vast amounts of information from a wide range of sources, when you ask it a question it may pull from one particular source, or it might pull from a different, completely contradictory source. This is one of the things that can result in the “hallucinations” that generative AI is known for.

If instead it is trained on quite a narrow source of information (say, a company’s internal documentation), and told to exclude other sources, then it is much more likely to give the right answer to that specific question.

Once again, I’m guessing that nowhere in DPD’s internal documents does it say that DPD is the worst! So the AI has pulled that from somewhere, or has been “trained” through its conversation with Ashley and is essentially mirroring what it has been told by a disgruntled customer.
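
One extra safety net is to check the bot's draft reply before it is sent. The Python sketch below is a deliberately simple, hypothetical example: the block list, the fallback message, and the `safe_reply` function are invented for illustration, and a real deployment would need something far more thorough than three patterns.

```python
import re

# Hypothetical block list; a real deployment would maintain its own.
BLOCKED_PATTERNS = [
    re.compile(r"\bworst\b", re.IGNORECASE),
    re.compile(r"\buseless\b", re.IGNORECASE),
    re.compile(r"\bdamn\b", re.IGNORECASE),
]
FALLBACK = "Sorry, I can't help with that. Let me pass you to an agent."


def safe_reply(draft: str) -> str:
    """Return the draft only if it matches none of the blocked patterns."""
    if any(pattern.search(draft) for pattern in BLOCKED_PATTERNS):
        return FALLBACK
    return draft
```

A check like this does not stop the model from producing a bad answer, but it can stop that answer from reaching the customer.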

 

How to avoid these mistakes yourself

If you are looking to build AI chatbots like this, but want to avoid the mistakes, here are some tips:

  • Ensure you set limits on what your AI can answer so it doesn’t go off on tangents answering things that are not in your domain. 
  • Use your own proprietary and internal data to train your AI; don’t rely on external sources. For example, you could use actual agent transcripts to train your AI.
  • Know when to use AI, and when to pass to humans. You may need to test and optimise this, but even if AI cuts down 5-10% of tickets coming to your agents, that can have a big impact, and can easily grow as you get better. 
  • If you are looking to buy: look for a provider that specialises in your industry. There will be a lot of the same generic use cases across organisations in your industry that can be adapted to your business. Why look to use a tool that wasn’t built for your purposes?

If you want to find out how to create a good generative AI-powered chatbot, watch our webinar with air up here.