Episode 66
The global workforce crisis is real and huge. Rob Carpenter, Founder & CEO of Valyant AI, talks to Greg and Valentina about the capabilities of conversational AI, its limitations, and its impact on the economy.
Rob Carpenter stated that every single interaction with a customer presents a new opportunity to arrive at some basic reasoning that can be applied from one interaction to the next. However, no system can currently pull that off, which means that a tremendous amount of manual work is still needed to hack through the edge cases.
According to Rob, conversational AI needs a levels system, just like self-driving cars. Both are in fact big umbrella terms that encompass several tools and devices with different levels of automation and reasoning. Google Home, Siri, and Alexa are examples of conversational AI that work well in home settings, assisting you with daily tasks, but they wouldn't work in busier and louder settings such as drive-thrus. For this reason, Valyant AI is working to develop a 'level 3' conversational AI system with which you can carry on a more natural conversation. The tool should be sophisticated enough to roll with you, respond as close as possible to human speed, and carry context across multiple back-and-forths to arrive at the end of the discussion.
There is still a general concern regarding artificial intelligence and automation, especially in the workplace. The current workforce crisis is real and is probably here to stay for a while, but Rob reassured us that what is crucial to monitor is the pace of automation. In his opinion, if it stays as it has been over the last decade, it's no big deal, as economies are naturally dynamic, shedding some jobs and creating others. After all, in the late nineties, or even the early two-thousands, no one would have thought Social Media Manager or AI Interaction Specialist would be actual job titles.
This article summarises podcast episode 66, "The Impact of Conversational AI on CX & the Workforce Crisis", recorded by CX Insider. For more information, listen to the episode, or contact Rob on his LinkedIn profile.
Written by Alessia Trabucco
Full episode transcript
Rob: If you go back to the late 1800s, and early 1900s in the United States, 97% of the work done in the United States was around agriculture. Today, that number is flipped and it's more like 1.5% - 2% of the US is involved in agriculture. But did that mean that we lost and we have a 98% unemployment rate?
Valentina: Hello everybody and welcome to another CX Insider podcast episode. In today's episode, Greg and I interviewed Rob Carpenter, founder and CEO of Valyant AI, a conversational AI that can take orders in drive-thrus. We will talk about the technical complexities of this tool, why there is so much hype around this technology, and most importantly, how the implementation of conversational AI will affect the future job market. So stay tuned and enjoy the episode.
Valentina: As a little boy, Rob was growing up in a little west coast town in Alaska called Dillingham. This town was super isolated, having no roads in or out, and Rob could visit the nearest city only once every two years. His biggest childhood dream was to go to space. So he started planning his journey to make that happen. Rob knew that the probability of being a successful entrepreneur is higher than becoming an astronaut. Therefore, he left his hometown, studied in different cities, traveled around the world, and gathered as much knowledge as possible. On this journey, Rob encountered many difficult obstacles. But the real turning point happened when he moved to Colorado.
Rob: So packed up, moved to Colorado, got my master's degree out here. I started a custom software development company again, continuing to push down the entrepreneurial trajectory. Spent seven years building a company called App Ventures. I grew it from my spare bedroom to a company with 35 employees. I acquired a company in Hyderabad, India. I acquired a company in London, and grew the business, but was starting to feel restless. I was getting kind of bored of sort of the monotony of building custom software. And I really wanted to build something that I could own, I wanted to get into a product-based company. So I brought somebody in to run the custom software development company for me. And then I started Valyant. And the kind of big, hairy, audacious goal, or BHAG, that I have for Valyant is we were trying to build digital employees that we can put in physical locations. The initial concept was essentially a holographic employee that we would render on a transparent display, and you could talk to it, carry on a conversation, and it would help you do something. But what we realized at the time was that the conversational AI technology was frankly trash. It was fine if you asked one thing and got a delayed response back, and that was about it. But you couldn't carry on a conversation with the system like you would if you were talking to an actual employee. So we realized if we were ever going to build this sort of holographic employee technology, we first were going to have to build a higher quality enterprise-grade conversational AI system. So within about six months, I pivoted to conversational AI. That was like mid-2017 and we've essentially stayed the course since then to build out almost an employee-level conversational AI system. So we picked drive-thru in restaurants. We haven't deviated from that course. Fast forward to where we are today.
We are one of only two companies that I'm aware of in the country that have a master services agreement that allows us to sell conversational AI directly to franchisee restaurants, and scale that technology out. We are adding new live restaurants almost every single day right now, and we've made major inroads within one chain called Checkers and Rally's. We're getting ready to launch with another 3,000-unit chain, and then we have a 25,000-unit chain we're hoping to launch right after that. So it's been quite the ride from growing up in a Yup'ik Eskimo Alaskan village to running a conversational AI company in Colorado and trying to build holograms for the market.
Valentina: Before we dive into how this incredibly advanced automation system can impact your customers, employees, business, and even the job market, I really wanted to know why Rob decided to deploy Valyant AI in drive-thrus and not any other industry. Surely the value of the market in the US is really huge, so that's one factor. But what are the other underlying reasons?
Rob: I mean, initially I analyzed probably ten different industries trying to decide where I wanted to take this conversational AI technology. The reasons that I arrived at and chose to focus on the fast-food industry: number one, in the United States, it's large, $865 billion per year in revenue. So it's a huge opportunity. Number two, in my opinion, it had a limited language set. So you might have 50 to 100 menu items, and you might have another 200 to 300 ancillary words like ketchup and napkin and please, thank you, things like that. And so it gave us a smaller corpus of data that we had to work on. Versus, early on, we had conversations with some of the biggest retailers you can imagine. And if you think about some of those big-box retailers like we have here in the United States, they could have 5 million, 10 million products inside their physical stores and then maybe another 10 million that you can buy online. That's a massive number of words for an AI system to dynamically be able to understand without having done specific training. For example, one of the restaurants is rolling out a new product called a Brookie, and the Brookie is a cookie brownie hybrid. Brookie is not a word that we necessarily have in the English language. That isn't something that, out of the box, at least our speech-to-text systems automatically understood. And so we did have to do some manual training on the speech-to-text system to be able to accurately understand what a Brookie is. That would be very, very hard, nigh impossible, if you had millions of different product SKUs that you were trying to always understand perfectly. So: a large market, a limited word set, and also extremely thin margins. And so if you can make even a couple of percentage points difference in profitability, you could be talking about increasing profitability by 50% at some of these restaurants. So they do high volumes of products and revenue. But then the actual take-home profitability could be, in some situations, 5%.
So if you can deliver another two and a half percent, well, now you've increased profitability by 50%. If you drive an additional 5%, now you've doubled their profit. So those thin margins also give conversational AI the opportunity to make outsized impacts on the restaurants' actual P&L and bottom-line finances.
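The margin arithmetic Rob describes can be checked with a short sketch. It assumes revenue stays constant and that the gain is expressed in percentage points of margin; the function name `profit_increase` is illustrative and does not come from the episode.

```python
def profit_increase(base_margin_pct: float, added_points: float) -> float:
    """Relative growth in profit (in percent) when the margin rises by
    `added_points` percentage points, assuming revenue is unchanged."""
    if base_margin_pct <= 0:
        raise ValueError("base margin must be positive")
    return added_points / base_margin_pct * 100.0


# Rob's example: a restaurant taking home 5% of revenue as profit.
for points in (2.5, 5.0):
    growth = profit_increase(5.0, points)
    print(f"+{points} points on a 5% margin -> {growth:.0f}% more profit")
```

On a 5% margin, 2.5 extra points is a 50% increase in profit and 5 extra points is a 100% increase, which is the "doubled their profit" case.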
Valentina: All of you have probably used a conversational AI at some point. Conversational AI can be a simple chat or a virtual agent. It essentially refers to a range of technologies. However, not all of these technologies are equal, and your Google Home assistant would probably not do well in a drive-thru.
Rob: I think Google Home or Siri or Alexa are fine, almost like 1.0 versions of conversational AI. One of the things I've advocated for, and haven't been able to push the industry on yet, but I'll keep working on, is that I feel like conversational AI needs a level system, just like self-driving cars. Self-driving cars is a big umbrella term, and that could literally mean a level one, which is maybe a train in an airport that kind of runs on a guided track but takes you to the different destinations, all the way to level five, which could be a Range Rover in Aspen that's driving you home from the bar in a blizzard up over a mountain pass. Right. And for the record, we're like 20 years away from that scenario, probably, for self-driving cars. But there is a gradient to self-driving cars, and I think there needs to be a gradient to conversational AI. Siri, Alexa, Google Home, that's level one conversational AI. That's fine. You ask it, you wait, and you get an answer back. Generally speaking, they lack any sort of context carry-through, common sense, if you will, or ability to kind of pull in disparate pieces of information to answer those questions. That goes up to a level five, which you could say is like C-3PO, where it's a borderline sentient entity that you're talking to and carrying on conversations with. So what we're trying to develop, I would say, is more of probably like a level three conversational AI system, where you can carry on a more natural conversation with it. You can talk to it open-ended, say whatever you want to say however you would normally say it when you're talking to a person, and the AI system should be sophisticated enough to roll with you in that conversation flow, to respond as close as possible to the speed at which a human responds, and to take the context and carry it over multiple back-and-forths to arrive at new endpoints in the discussion.
So that's really what we're trying to move these systems to, is it's not just a basic ask and answer, but it's a true open-ended conversation like you would have with an employee in one of these kinds of limited retail settings.
Valentina: In drive-thrus, I imagine the noisy background, cars all around, and people in the car shouting what they want, pretty much confusing whoever or whatever is taking the order. So what are the obstacles in creating meaningful and productive conversations with AI?
Rob: I think when we were starting out, the hope was there's a happy path for 80% of conversations and 20% might have edge cases that you have to figure out. But once you kind of hack your way through all of that, then you have a viable product to carry on a conversation with people. The reality is, it's probably 0.01% that is the happy path and 99.99% that is edge cases. So every single interaction with a customer presents a new and unique opportunity. And what you try to get to is some sort of base intelligence level of reasoning that can be applied from one random interaction to the next. But so far, we really haven't seen any systems emerge, either from us or from our competitors or the industry as a whole, that can pull that off. And so what you end up doing is a tremendous amount of manual work to hack through the nearly limitless number of edge cases in terms of the way humans would interact with the system, even in an environment that's as simple as "Can I have a cheeseburger?" We've found no end in sight to the number of unique ways people phrase wanting to order a cheeseburger. So I think that gets to something that's probably along the lines of NLU, natural language understanding, and understanding the intent of what the customer is trying to convey to you.
Rob: And before that, in the logic flow of how a conversational AI system works, is speech-to-text, because you have to first hear the customer and translate the WAV files of what they said before you can do anything. And speech-to-text is a massively difficult problem. So we benchmark ourselves against Google and Amazon for speech-to-text. And in a noisy drive-thru environment, we see Google's about 71% accurate and Amazon is about 72% accurate. If you are missing nearly one out of every three words that a customer says to you, you will never bring a product to market. And so we've spent a lot of time over the last few years building a proprietary in-house speech-to-text engine that is trained and built on top of noisy drive-thru audio. Because you can imagine everything from cars backfiring in the distance to the idling of the car that you're ordering from. The radio's on. People are talking in the car. You could have highway noises, you could have a leaf blower going, even things like birds chirping. These are all noise artifacts that you wouldn't have inside of your home for a Google Home, for example. Or theoretically, if you're using Siri, you're in a quieter environment than these sort of noisy drive-thru environments. And so overcoming that and trying to get the accuracy as high as humanly possible for speech-to-text has probably been one of the biggest, most fundamental challenges.
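To make the "one word in three" figure concrete, here is a minimal sketch of one common way to score a transcript against what was actually said: word-level edit distance, as used in word error rate. The episode does not say how Valyant, Google, or Amazon accuracy was measured, so the metric and the names `word_errors` and `word_accuracy` are assumptions for illustration only.

```python
def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn `hypothesis` into `reference` (Levenshtein distance)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word dropped by the recognizer
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 minus word error rate, floored at 0."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return max(0.0, 1.0 - word_errors(ref, hyp) / len(ref))


said = "can i have a cheeseburger and a brookie"
heard = "can i have a cheeseburger and a cookie"
print(f"word accuracy: {word_accuracy(said, heard):.3f}")
```

In this made-up order, one wrong word out of eight gives 87.5% word accuracy, yet the order is still wrong, which is why a 71% to 72% score is far from usable at a drive-thru.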
Rob: NLU has been a significant source of challenge in understanding these millions of cases. And then both of those two things together really only tell you what the customer is trying to convey to you, which is great if the customer is perfectly accurate. But what we also find in a lot of situations is a customer will say, "I want chicken nuggets," and we say, "Great, you can have four pieces or six pieces," and the customer says, "Five." It's like, well, that wasn't one of the options that I gave you, but you still have to have the intelligence and the flexibility in your system to handle customers giving you that kind of response. Or they'll say, "Can I have the number two?" And you say, "Okay, I've added a spicy chicken sandwich to your order." And they say, "No, I don't want that. I wanted the Big Buford," you know, and so they'll read the number wrong or something like that, and you still have to roll with it. And so there is a third critical component to these conversational AI systems beyond speech-to-text and NLU, which just tell you what the customer wants to convey. The third one is a logic engine, and that's really the brains of your operation, or the kind of common sense engine, if you will, that says: Do we serve this product? Is it the right time of day? Is there other required information I have to get from you? Like, a customer might say, "Can I have a drink?" And we say, "Great, what would you like? We serve Coke, lemonade, and iced tea."
Rob: And sometimes the customer is great and they say, "I want a Coke," or they'll say, "I want a Sprite." And other times they literally just come back and say, "I want a drink." And so what do you do at that point? Do you keep pestering them to pick a drink, or do you just say, "Okay, well, I'm giving you a Coke, I hope you like it," kind of a situation? And those are all things that your logic engine then has to figure out and roll with in that type of environment. So we've also spent years and years and years building our logic engine so that it can handle whatever kind of chaos flows through our system from the customer. Because in some situations they tell us exactly what they want in a way that's clear and understandable. And other times they're all over the place and they contradict themselves. And you have to have the intellectual flexibility in your kind of core system to roll with that in those environments.
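The three questions Rob assigns to the logic engine (do we serve it, is it the right time of day, is required information missing) can be sketched as a small rule check. This is a toy illustration under stated assumptions, not Valyant's engine: the menu data, the `MenuItem` class, the `decide` function, and the "fall back to a default after one re-prompt" policy are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class MenuItem:
    name: str
    available_from: time = time(0, 0)
    available_until: time = time(23, 59)
    # Required option -> values the restaurant actually offers (hypothetical data).
    choices: dict[str, list[str]] = field(default_factory=dict)
    # Fallback used when the customer never gives a usable answer.
    defaults: dict[str, str] = field(default_factory=dict)


MENU = {
    "drink": MenuItem("drink",
                      choices={"flavor": ["coke", "lemonade", "iced tea"]},
                      defaults={"flavor": "coke"}),
    "chicken nuggets": MenuItem("chicken nuggets",
                                choices={"pieces": ["4", "6"]}),
}

MAX_REPROMPTS = 1  # assumed policy: ask once more, then fall back to a default


def decide(item_name: str, given: dict[str, str], now: time, reprompts: int = 0) -> str:
    """Return the next thing to say for one requested item."""
    item = MENU.get(item_name)
    if item is None:                                   # Do we serve this product?
        return f"Sorry, we don't serve {item_name}."
    if not (item.available_from <= now <= item.available_until):
        return f"Sorry, {item.name} isn't available right now."  # Right time of day?
    resolved: dict[str, str] = {}
    for option, allowed in item.choices.items():       # Required information?
        value = given.get(option)
        if value in allowed:
            resolved[option] = value
        elif reprompts >= MAX_REPROMPTS and option in item.defaults:
            resolved[option] = item.defaults[option]   # stop pestering the customer
        else:
            return f"For the {item.name}, you can have: {', '.join(allowed)}."
    details = ", ".join(f"{k}={v}" for k, v in resolved.items())
    return f"Added {item.name} ({details})."


noon = time(12, 0)
print(decide("drink", {}, noon))                       # asks which drink
print(decide("drink", {}, noon, reprompts=1))          # gives up and picks the default
print(decide("chicken nuggets", {"pieces": "5"}, noon))  # "five" was not an option
```

Even this small sketch shows why the engine grows quickly: every item needs its own rules for invalid answers, and the re-prompt policy is a product decision rather than something the speech or NLU layers can settle.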
Valentina: Ironically, when scientists invent such advanced technologies that aim to resemble a human, they realize that people are incredibly creative when it comes to language and that we can compound literally anything from letters and sounds. And others will somewhat understand what we mean, except robots.
Rob: I mean, I think just the pure variety. Even within something as specific as English, in a drive-thru, wanting to order this basic set of 50 to 100 items, you look and find nearly unlimited numbers of variations of how humans can construct language, which is great for all of us as creative and interesting humans. But computer systems, despite the use of AI, still roughly work off of a kind of rails, if-then type of core infrastructure, and it is very, very hard for them to adapt to that. And a lot of what we're doing with conversational AI finds a tremendous number of parallels with self-driving cars, and that's an area that more people tend to be familiar with. And so you could say, well, a 16-year-old, for the most part, learns how to drive a car without killing people. Like, how hard can it be? And yet here we are, right, 15, 20 years since it was initially starting to be talked about, with, I mean, probably north of $100 billion of investment across the entire industry, and we still don't really have any kind of widespread self-driving car system yet. And that just really comes down to the pure variety of the number of different things that can happen on a road. Is that a bicyclist or a bag that's going in front of me? Tesla is doing a huge recall because their self-driving car would do a rolling stop at a stop sign versus a complete, kind of by-the-book stop. The self-driving car industry is in a very big quandary because humans don't drive by the book like driving rules say they should. And so they've tried to adapt cars to drive more as humans do, so they fit within the existing system. But regulators want AI systems to follow things by the book. And so all of this is to say there's just a huge amount of variability that occurs in these types of environments when trying to deliver these systems. So I think variety is a key one.
We see a lot of situations too where people are saying stuff that they don't mean or they're trying to talk at the same time that they're thinking, you know, and that can kind of degrade the accuracy of what they're trying to say to us. People also talk with huge amounts of underlying expectations of shared knowledge bases. People are not exact, which is what computers need. You know, they use euphemisms and colloquialisms. They leave huge amounts of gaps in the things that they convey because there's a certain baseline understanding that humans have of all of these things. You know, I might say, "Can I have a burger with a ketchup packet?" Right. Well, intuitively, I know that means the ketchup packet goes on the side of the burger. And not that an AI is making your food right now, but unless the AI has the rules, it would just stick your ketchup packet right on the burger and then put the bun on top of it. Right. So it's important that we kind of understand those things as we build these systems out.
Valentina: It is evident why conversational AI in drive-thrus is such a good use case. The language is limited, but more importantly, from the CX perspective, it is highly convenient for the customer in this specific situation, and customers don't really care about talking to a human. They want to order quickly and leave quickly. According to a couple of recent surveys, 40% of customers don't care whether they are served by an AI tool or a human as long as they get what they need. And more than 50% of customers expect 24-hour service, which is quite unattainable without an automation system. So what are the other industries in which this tool can be useful?
Rob: Look, I think hospitality is a great one. I think that a lot of the labor problems that you're seeing in restaurants are also evident within hospitality, hotels, and things like that. So you can imagine a situation, especially with the digital AI employees that I talked about, where you would walk into a hotel and then, instead of waiting five people deep in line for a human hotel front desk person to become available, you just walk up to this sort of digital person, talk to it, carry on a conversation, and it checks you into your room. Car rental companies are really big over here. Almost every single movie theatre I've been to has moved to kiosk-based systems, which are fine, but they're still a little slow and clunky, and it'd be nice to find ways to kind of speed that transaction process up. And so I think some of these digital employees in movie theatres could be another source of major value. Retail centers could use them for directions, university campuses could use them, and hospitals could use them to check people in. I think really what we're trying to talk about, and really trying to expand kind of the focus of this market opportunity to, is that it's not about just a microphone-based system like the drive-thru. It's really looking at entry-level customer service positions within physical retail environments. And as these labor crises get worse and worse, we're seeing that some of those bottom-rung jobs are the ones that are being hollowed out the most, because very understandably, everybody is trying to move up the labor ladder and move into better positions. And so that's creating a huge hole at that kind of entry-level customer service position. And so over the next 10 to 15 years, I think there's a nearly trillion-dollar market opportunity to automate entry-level customer service work in physical locations.
Valentina: Well, here we're getting to the controversial part. People don't quite know how to tackle the future impact of AI on the workforce and job market. This technology can and probably already is eliminating some job positions, but such a disruption will create new positions. However, those jobs that are eliminated are also the easiest to automate. On the other hand, newly created job positions will require a high degree of technical expertise. How will the job market treat people whose jobs were automated and who cannot be trained to become machine learning researchers? How will this impact the economy?
Rob: Yeah. I think we have to be a little careful too, as we talk about these things, to be geographically sensitive, because what might be true in the US might not be true in Europe, might be true in Indonesia, might not be true in India. Right. And so we just have to be kind of thoughtful about that. So I'll add the disclaimer that most of what I'm going to talk about is going to come from a US-centric perspective, just because that's what we're focused on. So the US right now is approaching historically one of, if not the, lowest levels of unemployment that we've ever seen in our kind of country's recorded history. Specifically within the restaurant industry, or specifically within the hospitality industry, there are currently 1.7 million unfilled jobs. One in every six job postings in the United States is for an open restaurant position. And if you're a small business owner that's a franchisee of a restaurant chain and you own two or three restaurants, which is a lot of our customers right now, you are also facing situations where there's 150 to 300% turnover per year. So it's excruciatingly hard to find somebody. And then if you do find somebody, you have to know that within 4 to 6 months, you're probably going to lose that person. So you're going to spend a significant amount of time training that person to be able to run that restaurant just to then know you're going to lose them in four months and you're going to have to hire somebody else. And so I think, number one, the labor shortage right now is so critical. It's bordering on, in some cases, life-threatening for these small businesses.
There was a report of a bunch of Dunkin' Donuts in Colorado Springs, out where I'm at, that had to shut down because they couldn't find people. We actually had an extremely interesting situation from our perspective about 30 days ago, where at one of our restaurant customers, two out of the three employees didn't show up for their shift, but one employee, because they had AI, was able to run the entire restaurant and keep the restaurant open. And without our AI, that restaurant would have shut down, which would have lost money for the small business owner, could have impacted hiring more people or opening up new stores and things like that. So all of that to say, if we look at it in kind of a US-focused, at-this-moment-focused way, I think it's pretty safe to say there's no problem because there's a massive shortage of labor. I think if we say, okay, now let's back up, let's look at a 30,000-foot view, longer-term or bigger-picture perspective here. What happens if you automate a bunch of jobs within the restaurant industry and then you automate taxicabs and you automate truck drivers and you automate hospitality and you have robots that clean hotel rooms and you automate that? If you look at all of these things, if they all hit simultaneously, you are going to have a massive problem on your hands, because then you're automating way more jobs than the market can keep up with, and that will be bad. And that was a real concern I had probably about a decade ago, that that would happen. I think that some of the shine has come off of AI.
There is still a lot of AI out in the market. There's a lot more that's come in and is delivering hundreds of billions of dollars in value today, but it's not hitting at the pace that we thought it would. We thought we'd have self-driving cars within five years. Truck driver is the number one job in 26 states in the United States. So if you lose all of those truck driver positions, that's a huge problem. Well, the reality is we're still probably 10 to 15 years away from really having all truck drivers be automated. And so I think what we need to pay attention to is the pace of automation. If the pace of automation stays like it's been over the last decade, we're fine, because economies are naturally dynamic and they will naturally shed some jobs and then create other jobs. Even in the late nineties, who would have thought social media manager would be a job title, right? Or AI interaction specialist? If you go back to the late 1800s and early 1900s in the United States, 87% of, excuse me, 97% of the work done in the United States was around agriculture. Today, that number is flipped and it's more like one and a half to 2% of the US that is involved in agriculture. But did that mean that we lost and we have a 98% unemployment rate? No, we just got rid of some jobs through automation and we created new jobs. So dynamism is healthy in a normal economy. And I think that we'll continue to see that play out as long as the pace of innovation isn't too severe.
Valentina: I hope you enjoyed this episode as much as I did. If you're interested in continuing this conversation, feel free to send us your questions or come on our podcast. If you are interested in what Rob does, feel free to connect with him on LinkedIn. The link to his profile is provided in the episode description. And before we finish the episode, there are still some rapid-fire questions. 'What is the most unusual use case of AI you have seen so far?'
Rob: Somebody built a smart speaker with a hardware integration to flush your toilet so you could tell their product to flush your toilet. I don't know why you would want to do that. The handle seems perfectly functional, but hey, I mean, if you don't want to touch it, I get it.
Valentina: What advice would you give to your younger self?
Rob: Honestly, I think to stay the course. I do try to live with intentionality and not have too many regrets along the way. You know, a lot of times it was easy to think about quitting. And I've always kind of just kept going and sort of had this belief that with enough time and energy, I'd get to where I wanted to be and feel like I'm seeing a lot of the fruits of that labor. So I think if anything, maybe it would just be motivation to keep going, to keep the dream alive.
Valentina: Yeah. What's your favorite space movie?
Rob: Armageddon. I am just cheesy for space movies like that, and my dad razzes me about it pretty regularly. But almost every Christmas we watch Armageddon together, and it's just, it's hard to go wrong. You've got Bruce Willis, Ben Affleck, Ving Rhames, Steve Buscemi, Luke Wilson. It's got a stacked cast, and they go to space and blow up an asteroid. How can that not be awesome?
Valentina: And my last question, what do you do when you encounter a grizzly bear in your backyard?
Rob: Don't leave your house.
Valentina: But what do you do when it's literally like there's no wall? You know, you're not in the house. It's right in front of you.
Rob: If it's right in front of you, honestly, it depends a little bit on the bear. If it's, you know, if it's got a cub close by and it's looking to protect its cub, then if it's a brown bear, you basically have to play dead because it'll make sure you're not a threat. It might take a couple of bites out of you, but then it'll walk away and you'll live, which is better than trying to fight it, because then you won't live. If it's a black bear, at least as we were always raised, black bears won't leave you alone. You can't play dead. And they're better tree climbers than you are. And they're faster than you are. So with a black bear, you have to make yourself really big and tall. You have to yell, you have to scream. Pick up rocks, pick up sticks. Maybe you have a gun with you if you're in Alaska. With a black bear, you have to defend yourself or it'll kill you. With a polar bear, generally you're around a polar bear when they're on the ice, and they've got pads and fur. So when they're walking on the snow and ice, you won't always hear them. So generally speaking, if you're encountering a polar bear in the wild, that's probably because it's already eating you and you didn't even notice that it was behind you.