Chris Butler

Today’s guest speaker has built design thinking systems for some of the world’s top startups that grew into enterprises. He has scaled crowdsourcing initiatives at Waze, integrated mobile design strategies at Kayak and recommended product integrations with AI and machine learning for Fortune 500 companies.

Listen in as Chris Butler of IPSoft and I discuss the future of home automation, how you can hack AI for the greater good and how human-centered design can create inclusive AI systems. This is HumAIn.

Welcome to HumAIn. My name is David Yakobovitch, and I will be your host throughout this series. Together, we will explore AI through fireside conversations with industry experts. From business executives and AI researchers, to leaders who advance AI for all. HumAIn is the channel to release new AI products, to learn about industry trends and to bridge the gap between humans and machines in the Fourth Industrial Revolution.

If you like this episode, remember to subscribe and leave a review.

David Yakobovitch

Welcome back everyone to the HumAIn podcast, where we talk all things on human-centered design, AI applications and the future of our digital workforce. Today, our guest speaker is Chris Butler, in New York City. Chris is IPSoft’s Chief Product Architect and has worked on developer experiences at companies like Microsoft, Kayak and Waze.

He’s also led AI design research initiatives, including prototyping, at Philosophie. Chris, thanks for being with us today.

Chris Butler

Thanks for having me. I’m really excited to talk. 

David Yakobovitch

I love how AI is changing. Very soon I’m going to be having a conversation done entirely in AI, where you won’t even hear my voice; there’ll be a robot David’s voice instead.

Chris Butler

Exactly. Well, definitely, there’s a company called Lyrebird that can help you out with that. 

David Yakobovitch

I love them. I actually heard all about this company on another podcast called Sleepwalkers, which is an iHeartRadio show. Their podcast is very dystopian, all about the end of the world, but somewhat promising.

And that startup, actually, I’ve been following Lyrebird since the early days, since they were in pre-pre-alpha. About three years ago, I turned my voice into a robot voice and said, ‘Happy birthday, dad. This is robotic David’. So it’s amazing to see how technology, especially in AI, has just taken off in the last two or three years.

Chris Butler

Absolutely. There are some really interesting things happening with text to speech and speech to text. I don’t know, I guess my personal opinion is that, long-term, this type of stuff will be more commoditized than the actual semantic understanding of things. We’re in a really interesting place right now where we’re kind of halfway there for a lot of this stuff, so it’s definitely an interesting time to be alive.

David Yakobovitch

When we think of AI design and thinking of that for speech to text and text to speech, so many companies are working on conversational AI and audio systems, including your own. How are you seeing better design principles going into systems today? 

Chris Butler

We’re finally starting to dig our way out of this idea that it’s all about the research or the specifics around theory when it comes to these brand-new state-of-the-art systems, and we’re starting to get into the range of what it actually means to practice with these systems. To be fair, there are places that have been doing this for a very long time; if you work at a credit card company, the people in fraud detection have been doing this for 20 years. But now, when we talk about the idea of IT automation support or conversational agents, and the way that we do them at IPSoft, we’re finally getting to the point where it’s about how we’re practicing these types of NLU and NLP techniques to understand what someone is asking for, and then the other half, which is maybe less about AI and less about the machine learning aspects of things: how we actually get something done with all the legacy systems that are already there.
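To make the first half of that concrete, the part about understanding what someone is asking for, here is a minimal sketch of a baseline intent-recognition model. It uses toy utterances and scikit-learn, not anything from IPSoft’s actual stack, and the intent labels are invented for illustration:

```python
# A minimal sketch (toy data, scikit-learn; not IPSoft's Amelia pipeline) of an
# intent-recognition model: map a user utterance to an intent label such as
# "password_reset" so a backend workflow can be triggered.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

training_utterances = [
    ("I forgot my password and can't log in", "password_reset"),
    ("please reset my password", "password_reset"),
    ("my laptop is broken and won't start", "hardware_issue"),
    ("the printer on the third floor is down", "hardware_issue"),
    ("how do I request a new VPN token", "access_request"),
    ("I need access to the finance share drive", "access_request"),
]
texts, intents = zip(*training_utterances)

# TF-IDF features plus a linear classifier is a common, simple baseline.
intent_model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
intent_model.fit(texts, intents)

print(intent_model.predict(["i can not remember my password"]))
# likely ['password_reset'] on this toy data
```

The harder half, as the next part of the answer gets at, is wiring a predicted intent like password_reset into the legacy systems that actually carry out the request.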

And that’s maybe the biggest problem, really, a lot of the time when we’re building something. It’s not necessarily all the tools that are required to train the intent recognition models or the entity extraction models; those are things that we’re constantly making better over time. But for a particular instantiation, it really comes down to how we integrate with systems that have been around for 10 or 20 years in some cases.

Especially for enterprises: if your company has been around for a few years, you probably have tens of systems; if it’s been around longer than that, you probably have hundreds. There was a meeting I was in maybe two weeks ago where a particular company was talking about how they had counted 850 different individual systems or services that they have to integrate on their backend. And so when you talk about conversational agents, it’s really about how you make something that is potentially complex or complicated, that most people don’t have expertise in, approachable. And that is basically IT systems; it could be any type of organization when it comes to the systems they use to run their businesses.

But regular people need to ask those systems to do things for them. You need to have a machine fixed because it’s down. You need to be able to request a password reset, which seems to be our bread and butter for some reason. It’s very simple for consumer companies, but when you talk about the 5 to 10 different systems that are required within an enterprise just to reset your password, it gets very complicated.

And the conversational agent is really meant to help that employee, or that customer, make a request that makes sense to this huge set of machinery that’s on the backend. And so that’s the thing: we’re starting to see more integration of this with the rest of the system.

Definitely, there’s the meme that was going around showing the entire system for a particular company, where machine learning was this little tiny speck within that system and everything else having to do with the actual assistant being built was what mattered most.

David Yakobovitch

Why do you think it’s taking so long to see this design thinking be applied in all systems?

Chris Butler

When we think about design thinking especially, it’s about how we’re being better at being human-centered for the things that we’re building. And the problem has always been that there’s a lot of certainty that people have, especially in corporate organizational structures, about what needs to be done.

And so that’s why, probably, when I was just graduating college, waterfall wasn’t a bad word. It was actually one where, if we planned upfront, we would know what we needed to do, and we would just get it done. When it comes to understanding the world as a complex system, to the way we start to understand how we’re training up models, how we’re adding data to these systems so that both the system and we understand the patterns better and can then take action, that’s where human-centeredness is really, really important. Because if you’re building something just because one person said it was the right thing to build, you’re probably building the wrong thing.

I definitely discovered that early on in my career, where the biggest specification I wrote was somewhere in the range of 500 pages. And that was a system that didn’t even do that much when it came to any type of data understanding; it was really just about how we get these three or four different systems to understand requests between each other. So, it can get very complicated.

The thing that’s interesting about how we design these types of systems is how you still harness the best state-of-the-art technology that’s available today to do a particular task, but then how you make it understandable to a human being. There’s this whole movement towards explainable AI. Maybe I’m too pedantic to really talk about it as explainability, because to me, explainability is like one practitioner to another. But what I really think is most important is: how do we get the people that are actually using things to be able to understand what the machine is trying to do? These machines are going to be creating rules or patterns that we can’t discern ourselves. And we need to do things like build trust within these interfaces.

And so a lot of my work at Philosophie was about how we build trust or interpretability within these systems where we have classifiers or prediction capabilities. Even something as simple as the confidence score that comes out of a prediction is not actually confidence. The machine doesn’t have confidence in a way. It believes, based on the data and the patterns it has been able to pull out, that there’s a likelihood it’s correct based on what it has seen before, but not on what it actually will see in the future. And so trying to make sure that that uncertainty comes through to the people that are trying to understand or use these things becomes really paramount when we’re talking about how machine learning or even data science is used in these types of tools.
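To make the point about confidence scores concrete, here is a minimal sketch, using scikit-learn and synthetic data rather than anything from Philosophie’s projects, of checking how a classifier’s score lines up with the frequencies actually observed. The score is an estimate derived from past data, not a guarantee about future inputs, which is why checks like this matter:

```python
# A minimal sketch of why a model's "confidence" score is not a calibrated
# probability: compare predicted scores with observed frequencies per bucket.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]  # the "confidence" a UI would show

# Fraction of actual positives in each score bucket vs. the mean score there.
frac_pos, mean_score = calibration_curve(y_test, scores, n_bins=10)
for s, f in zip(mean_score, frac_pos):
    print(f"model says ~{s:.2f}, observed frequency {f:.2f}")
```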

And I would even go a step further: a lot of the time, even great data scientists don’t truly understand uncertainty at a deep level. It’s very hard to harness that type of information or understanding. And so we’re seeing that it’s no longer these straightforward, simple linear models. We’re starting to build things that are harder and harder to understand in terms of how they actually work. And when I say how they work, I mean why they specifically said that. And while there are techniques that are able to provide counterfactual examples, things like that, we’re still in the early stages of that. Probably 10 years ago we were just figuring out what the heuristics for mobile usage were, and I would say we’re really good at that now. When it comes to how we actually harness machine learning models, prediction, classification, even just the basics around that within an interface that people can use, we’re still in the super early days.

David Yakobovitch

A lot of that uncertainty comes from the data that we collect, whether it’s real data or synthetic data. And when you’re looking at conversational agents and personal assistants, I’ve had this conversation with some of the leaders at Microsoft and Amazon and Apple and all of these other companies who have products similar to Amelia, the one from IPSoft, and one of the big, seemingly consistent messages is that there’s still so much bias in these personal assistants. And when you dig deeper into this, it’s: what if I come from a lower-income community? What if I come from an underrepresented minority? Where is that bias present in my experience? And how does that result in this uncertainty? Could you dig a little bit deeper into that for us from a design perspective: how can we minimize that uncertainty?

Chris Butler

The thing that’s really important when it comes to conversational agents is that we’re including the actual people that will use the system in the design of it. This is something that comes from design thinking and human-centered design, but I would say it goes a step further: if we’re building something for a company in banking or insurance, they have very different terminology inside the organization than the people who are actually going to be asking the conversational agent something.

So the example of that would be insurance: they talk about policies, they talk about underwriting, they talk about all these things. But if someone is going to be calling in to that insurance company, they’re not going to say ‘my policy is out of date and I need to update it’. They’re going to say, ‘why does my insurance cost this much?’ Or, ‘my life is changing because I got into a car accident. What do I do?’

And so those are the key aspects when it comes to avoiding bias. A lot of the time, not including those people is really the main problem. And when we talk about bias in data, it’s also because a lot of the time people are underrepresented when they should be represented in the data.

So, a classic example of that was the app that was created a couple of years ago for detecting potholes in Boston. The issue there was that this was maybe slightly before smartphones were affordable enough that low- to middle-income people could really have them on a regular basis. And so what that meant was that, while people were driving around, it was probably people in affluent neighborhoods who were reporting more potholes. Essentially, that means we’re not including everybody, so the data set is biased in a way that it doesn’t include everybody that’s impacted by the thing you’re trying to understand.

And so that’s what I would say. Reducing bias is including the people that need to be included. And I would even go so far as to say, when we talk about autonomous systems for weapon systems, there’s still someone that’s impacted by that. And that’s the person that’s being shot at. 

And so, it’s not something that we usually think about when we refer to the idea of human-centered design, because this may even be a good way to talk about what ethics really is when we talk about artificial intelligence. But it’s very murky a lot of the time when we talk about what true ethics would be around that, because there’s probably a time and a place where violence is necessary. War is necessary in some cases. But the idea of then trying to understand when that is appropriate is incredibly nuanced and is not an easy decision. And so this idea that we’re in some way automating these things to make perfect decisions is definitely not true.

We are the ones that have to make decisions. And maybe that’s my main advice to a lot of people when we talk about conversational agents as well: there is a time and a place for escalation to a human. Because there are some things, especially when we talk about anomalous behavior, that machine learning and those types of systems or algorithms are really great at: determining when there’s anomalous behavior.

But what they don’t know is how to then react to that in some way. So the example I would give is: there’s someone who is incredibly upset. This is some type of conversational agent for a flight booking service, and they’re incredibly upset; they’ve missed their flight for some reason. To a machine, that doesn’t mean anything. But a human being talking to that person may realize that it’s actually a very big hardship for them at that moment, and then use some type of budget they have to provide them with another flight, even though it’s maybe borderline when it comes to the standard operating procedures of that particular organization.

So human beings are the ones that actually make these types of decisions. Kevin Kelly is a really great writer who wrote the book “What Technology Wants”; he’s written a lot of other great books, and I really love a lot of the stuff that he’s done. He had a really interesting, short debate on the IMTF site about autonomous weapons. And what he was saying was that he actually hopes there’s more movement towards autonomous weapons, so that there does have to be this type of hard conversation about when it is acceptable to kill someone.

Because what that ends up taking us all the way through is that we have to actually have these real discussions, and those discussions are very hard. If you look at MIT, it did some work around the trolley problem, which to me is the biggest red herring of an AI discussion.

Whenever I hear it, I try not to roll my eyes. But the thing that’s interesting about the study they did was that what they found was incredibly cultural, based on what people’s ethical values were. In the United States, it really was meaningful to try to save little babies, but somewhere like China, it would be much more meaningful to save the elderly. If anything, that shows it’s very hard to even get to some type of common ethical code when it comes to artificial intelligence. Joanna Bryson, who’s out of the University of Bath, makes a great point around this: it’s not actually AI that needs to be regulated, it’s businesses or outcomes that need to be regulated.

There’s a nuanced discussion about a lot of this. It really comes down to the fact that we need to understand what the true impacts of these types of systems or algorithms are on the application we’re actually building, rather than the idea of just a common general case.

David Yakobovitch

So, we looked at the MIT trolley problem. That’s the classic AI example: the trolley, maybe it’s run by AI, it breaks down, is it going to kill the baby or the elderly person, and what does that look like? And as you mentioned, today humans are making these decisions. Trolleys are run by humans, but even with the flight example that you shared, perhaps a company like Kayak is run by a human, and so is this process. But what if in the future we change the narrative? The narrative is no longer run by humans, but by machines. And it’s not just run by machines: you’re doing your escalation for your flight, and you know you’re going to get your bill discounted or your refund, and instead of a human doing it, the machine might try to do it. But what if you, as a human, know the machine is doing it? In essence, you’re hacking AI. This is a new term that’s been coming up in the last couple of years. What are your thoughts on hacking AI?

Chris Butler

This is really interesting because as human beings, we’re a very wild bunch. We’re always trying to figure out two things: how do we get what we want and how do we do the bare minimum to do it?

There’s a writer that I really love, Venkatesh Rao, who writes a newsletter called Breaking Smart. And one of the things he talks about when it comes to AI is that there’s this graph where, on the far-left end, there are these super rote, super simple types of transactional things, and machines do those things really well; machines have been doing that really well for hundreds of years. We’ll build a machine to be able to just manufacture something. And then, on the far right-hand side, there are all these things that are very complex.

They require a lot of ability to remember many things. They require things like AlphaGo: it’s a very complex system, it requires a lot of forethought, it requires this idea of probabilities between different moves, and it’s something that a human being cannot keep in their head. But then there’s this middle area where human beings are really probably the best, and that’s judging when enough effort is good enough. We’re the best at that; he calls it being ‘mediocre at being mediocre’ or something like that, I can’t remember the exact term he used, but I really love the idea that it’s really human beings who are the ones judging what the right amount of effort is. Machines don’t necessarily care about time. They don’t care about anything in reality, but if you’ve started a machine to do something, as long as it’s able to keep doing it, it will just keep doing it.

But the reality is, when we talk about the no-free-lunch theorem, there’s no free lunch: more generalized algorithms will be able to do anything eventually, but it will take, maybe, all the power and all the time in the universe. Human beings operate on a timeframe that is very different from that. And so it’s up to us to give that type of purposeful understanding when we talk about hacking AIs.

I love this idea, because this is something that we do to each other all the time. There’s always a balance; it’s not that every human being is truly selfish in some way, but how do we balance self versus our group? How are we balancing ingroup versus outgroup? So the question will be: will that person consider that algorithm or that system to be part of their ingroup or their outgroup?

If you think about it from the standpoint of all the botnets that have been interfering with elections in the United States, I’m pretty sure whoever runs those botnets, maybe Russia, feels like that botnet is part of their group. It’s actually part of their ingroup in how they’re doing things now. It’s not necessarily what I feel, but this idea of how we start to build relationships with machines is a really interesting question. And I do wonder if that changes how we start to define the types of interactions.

Part of my discussion around design thinking for AI, and especially around the particular stage of what types of prototypes you build to understand whether something is a good system to build or not, has focused an awful lot on the interaction model. We have a lot of things that are assistive today: the idea of doing spell checking on your iPhone or your Android device is basically an assist from a machine learning system; it looks at what the most common mistakes are and what people actually wanted to say in the end. Then you have these automated or agentic technologies, where it’s off doing something on your behalf. And then there are generative models: how has it created new possibilities for you to understand? This is where a lot of the stuff around novelty-based search comes in. Picbreeder was a really interesting study in how you use genetic algorithms to then create interesting imagery.

And it was the human being that was actually providing the loss function, deciding which things were actually good. And what you could see from this work, which was really interesting, is that you would start from very rudimentary shapes, black and white, very boring, but eventually you get to these pictures that look like cars and skulls and butterflies. And it’s because the human being was really the optimization function in that case.
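Here is a minimal sketch of that idea, the human acting as the loss or fitness function in an interactive evolutionary loop. It is not Picbreeder’s actual code; the ASCII rendering stands in for Picbreeder’s evolved images:

```python
# A minimal sketch of interactive evolution: the human picks the survivor each
# generation, so the human is effectively the loss/fitness function.
import random

POP_SIZE, GENOME_LEN = 6, 16

def random_genome():
    return [random.random() for _ in range(GENOME_LEN)]

def mutate(genome, rate=0.2):
    return [g + random.gauss(0, 0.1) if random.random() < rate else g for g in genome]

def render(genome):
    # Stand-in for Picbreeder's rendered images: a crude ASCII bar pattern.
    return " ".join("#" * max(1, int(abs(g) * 5)) for g in genome[:8])

population = [random_genome() for _ in range(POP_SIZE)]
for generation in range(3):
    print(f"generation {generation}")
    for i, genome in enumerate(population):
        print(f"  [{i}] {render(genome)}")
    pick = int(input("Which one do you like? "))  # the human judgment step
    population = [mutate(population[pick]) for _ in range(POP_SIZE)]
```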

And the last part, maybe a little bit more abstract today, is this idea of animistic design for AI systems. And when I say animistic, I mean the concept that comes from, say, Shinto spiritualism or Native American spiritualism: that every object has a spirit of some type, and that spirit operates in a certain way that makes sense for that object. And so what this means is that when we talk about AI and machine learning being in our world, we try to humanize them as much as possible. That’s why conversation is interesting in this case, because when we build conversational interfaces, it’s a way of interacting with these systems that feels natural to us.

But there are plenty of other objects in this world that we interact with that don’t act like humans or talk like humans. And so I wonder, when we start to get into this space of extreme distributed computing, IoT, a computing-everywhere type of model: I don’t think that we want everything to talk to us. I don’t think we’re going to want every object in our room to actually say something to us.

There are going to be some things we want to talk to, but we want things to act the way they should. So a lamp is a great example. I’m not going to say it right now because Alexa is nearby, but the idea of having some type of device listening in order to turn things on or off, that’s one way to go about it. It could also be that the lamp understands me in some way and could interact with me in something more like a one-on-one relationship, rather than a hub-and-spoke type of model, which is really what these home automation systems today are: very much hub and spoke.

So we need to think about different interaction methodologies for this. How do we have AI systems that are not only reactive to me in some way, but also proactive? And then, how are they being reactive and proactive in a way that makes sense for the relationship I need from them? Because I’m not going to have much of a relationship with my lamp, but there are probably things like personal assistants, or the way I handle communication with the world, that I actually want to have more of a relationship with.

Anyway, that’s where we’re getting into very interesting territory when it comes to interaction models for these systems. But it comes down to: how do you build the right level of trust between a human being and these devices or these systems? How do you actually get them to understand the world in a way that makes sense? And that’s basically what we talk about with the practice of human-centered design for AI; Google, Facebook, they all have groups focusing on this. It comes down to how these machines actually fit into the world of humanity, rather than humans fitting into the world of machines. That really is the key aspect of it.

David Yakobovitch

It’s super interesting because, just recently this summer, the Raspberry Pi 4 was announced. The Raspberry Pi 4, for those who don’t know, is this mini computer device that’s affordable and accessible for everyone. And the new one has up to four gigabytes of RAM, two HDMI ports, gigabit Ethernet, and you can attach sensors. The too-long-didn’t-read: it’s a computer. It’s a full computer that can do everything and even run some machine learning.

And I have some friends who have 8 to 10 of these in their houses and apartments monitoring the weather, opening doors, checking that certain conditions have been met. And when we talk about home automation, that’s one of the biggest arenas that consumers can relate to and can see where AI can maybe help them.

You mentioned what types of objects you don’t want to talk to. Do we want our toaster to say your toast is done? Do you want your toilet to say that the pipe needs changing? Where does it go too far? But, from a system design perspective, home automation systems, this Internet of Things, have been everywhere until recently, and now we’re seeing a lot of universal systems. What do you think about this Raspberry Pi 4? Is that going to change the game for hobbyists, or even go professional and help everyone have an AI-powered home?

Chris Butler

The first step for those things is usually data collection. There was a really great meetup here in New York that was the intersection between the IoT and machine learning groups. The person who was presenting was talking about when he worked at Hershey out of Pennsylvania, and in particular was working on Twizzlers, which is a licorice candy.

And so the big issue for them was trying to figure out the appropriate mechanisms to allow for flow out of a particular heating system. One of the things he did first off, and this was probably during the Raspberry Pi 2 or 3 era, was take a whole bunch of these Raspberry Pis with different sensors and magnets on them, and essentially put a Raspberry Pi at 20 different places on the machine to monitor them. He then took all of this information and pushed it into, basically, Microsoft Azure machine learning systems. And so he was able to do some really interesting work to find out what the two or three actual influencers on the system were.
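A minimal sketch of that kind of analysis, using synthetic data and scikit-learn rather than Azure ML (the sensor names and the three “true” influencers are invented for illustration): log readings from many sensors, then see which few actually drive the outcome via a model’s feature importances.

```python
# Synthetic stand-in for logged sensor data; in the anecdote this came from
# ~20 Raspberry Pis pushing readings to a cloud ML service.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_samples, n_sensors = 2000, 20
X = pd.DataFrame(rng.normal(size=(n_samples, n_sensors)),
                 columns=[f"sensor_{i}" for i in range(n_sensors)])

# Pretend only three sensors really drive the flow rate we care about.
flow_rate = (3 * X["sensor_2"] - 2 * X["sensor_7"] + X["sensor_11"]
             + rng.normal(scale=0.5, size=n_samples))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, flow_rate)
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(5))  # the "influencers"
```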

We’re going to see initially that a lot of it is about data collection to a larger service. There are going to be a lot of privacy concerns and problems that come up because of this. But my dream someday is this idea of being in a community meeting with a set of Raspberry Pis and figuring out: what does that community need? What can we pull off the shelf that would help make this community better in some way? That, to me, is maybe this civic, urban dream that is eventually going to happen. There are some interesting things coming out. AWS DeepLens is one, or there was the Intel Movidius, which was basically like a USB key you could plug into any computer, including a Raspberry Pi, to then get GPU-level machine learning inference, things like that.

When you look at things like TensorFlow, there’s the mobile version, TensorFlow Lite, or the tiny one, something like that. The idea is that we’re going to constantly be shrinking down how we do things like inference at the edge. But the question of how we get the right data pipelines back is a tough problem, because you don’t want to be streaming high-quality data from every single device in the world; there’s no way to consume and process all of it. So there’s this constant trade-off between how much you understand that’s new about the world versus pushing inference out to the edge and training on these large-scale things.
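As one hedged illustration of shrinking a model for edge inference, here is a minimal sketch using TensorFlow Lite with a toy Keras model; the model itself is a placeholder, and the quantization settings would depend on the device:

```python
# A minimal sketch of converting a trained model to TensorFlow Lite so the
# lightweight interpreter can run inference on a device like a Raspberry Pi.
import tensorflow as tf

# Toy model standing in for whatever was trained in the cloud.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables basic quantization
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)

# On the edge device, only the small interpreter is needed for local inference.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
```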

Maybe that’s one of my biggest concerns right now. We talk about global warming or climate change concerns, and training one of these XL models now requires the lifetime exhaust of three cars, or something like that. And I assume these models are going to be trained more often than not. So what’s the real energy impact when we talk about these things? It’s really interesting. But IoT is the way that people tend to understand, or have always fantasized about, what the future of this is. That’s true about personal computers as well.

For a long time, personal computers were all about how you were going to collect all of your recipes on your local machine and balance your checkbook on your computer. And we haven’t really figured out what the real uses are for all this stuff within your own home environment around AI and machine learning, other than that, yes, a lot of this stuff tends to be very rule-based right now.

Maybe Nest is a good example of where they tried to take it somewhere that wasn’t just explicitly heuristic, but was learning from the environment. I don’t know. I participate in some futurist stuff here in New York as well, and one of the things I did was through Extrapolation Factory, which did this meetup and provided a bunch of different themes for the future. But I do wonder about the idea of my children having an imaginary friend as one of those things in the future. Maybe The Diamond Age is one of those books that talks about these concepts of what it means to have something self-adapting interacting with children.

So there are a lot of really interesting places to go there, but maybe that’s a little too sci-fi to talk about today, and definitely not practical in the sense of what we’re building today. But just getting to your point about conversational interfaces: in the end, what really matters is that I’m able to get something done that is meaningful to me, and conversational agents are one way to do that. And we’re not as bad as we used to be with what people referred to as chatbots a couple of years ago, where essentially it was just jamming a web form into a conversational interface.

But I also don’t think we’re quite at the place where there’s true deep semantic understanding of someone. The problem is that when we talk about natural language understanding or processing, there’s so much context that goes into it. And so there are going to be hacks or tricks that make it a lot more understandable to machines.

So the example I like to give is password reset. It’s one of those things where, if I can see the last five places you tried to log into, I probably know what you’re talking about, because that context is supremely important to this conversation. Now, if I don’t have that context, then I have to ask you a lot of clarifying questions as a conversational agent.

And so that’s where the design of this type of thing really matters. Maybe going back to your point about the toaster: the way you know the toast is done is that the toast has popped up. You don’t need it to say that it’s done. So this idea of cues and affordances is really interesting too, because there are absolutely cases where you’re asking for something because it’s a complex thing that doesn’t necessarily make sense in a simple, physical way in your environment. Being able to then figure out that complex understanding, disambiguating meaning, is where a lot of this work is going.

And especially when we talk about Amelia, the work that goes into it is not so much about just intent, like utterance-to-intent accuracy matching. It’s not even necessarily just about this idea of a business process. It really is a lot about how you allow the flow of conversation to be natural.

And so context switching between intents, switching between domains, being able to provide dialogue acts like ‘hold on’ or ‘go back’, all those things that are very natural for human beings to do, is really key. And there’s a real tension here too, which is that most corporations or organizations want non-deterministic AI or machine learning systems like NLP and NLU, because human beings come off as non-deterministic.

We’re asking for things in ways that don’t always make sense because, again, in that conversation about insurance, it could be a five-minute-long story about how I got into a car accident, which in reality means I need to change my policy. So how do you understand that? But then, on the other side, all of these organizations actually want a deterministic process.

Whatever you’re asking that conversational agent to do, they want that conversational agent to actually follow the standard operating procedures or business process that’s there. And they have to do that; otherwise regulatory compliance, even business strategy problems, start to arise. So there’s this constant tension between the non-determinism of what people are going to be doing and the determinism of what they actually need from a business aspect.

And so, now that we’re actually building these things at scale and deploying them with Fortune 500 companies, we’re starting to see this reality. And it’s very hard for people to get that balance right, right now.

David Yakobovitch

When you think from a civic perspective, you can think of non-determinism: it’s not that predictable, but could it be predictable is the interesting question. And I bring up a use case. I was traveling in New York City, and the other week, suddenly, the building I’m in says we’re shutting off the water. It’s an emergency; it’s being shut off for the next 8 hours. And I said, what’s going on? And I go downstairs and I take out the dog, and suddenly the first floor is just dripping with water.

You wouldn’t believe it, because a pipe had burst. And this is New York: pipes burst, temperatures change. In the grand scheme of things, this is going to cost the building hundreds of thousands of dollars, because now they have to tear apart walls and repair the damage that’s done. And who knows if there are lawsuits; it’s such a complicated process.

But the reason I mention this from a civic perspective is that when we’re looking at AI today, it really comes down to two big use cases: you’re either going to monitor, which is the Big Brother thing where they see what’s going on that could be wrong, or you’re trying to log something. And I think the logging is the more interesting, less threatening part. This is when we talk about the Raspberry Pi monitoring the humidity and the weather in our apartment. We’re even thinking of the example you shared earlier of the Boston potholes, the app called Street Bump, where the data was being crowdsourced and you’re able to log it.

So other drivers know there’s a street bump on this highway, and just in case you don’t want a flat tire, we’re giving you that warning. And so, from a human-centered design perspective, when we think of inclusivity and we think of just that logging piece civically, I know you’ve also done a lot of work in crowdsourcing, so it’s similar there. But we’re moving into a new society where it could be about making sure that we’re inclusive for groups and take care of the disadvantaged.

What do you think are some techniques in design thinking where we can maybe do crowdsourcing or do logging to think about everyone? The reason I’m asking this is that Chris and I have had the chance to meet in person, and you have some design thinking games. So perhaps it’d be interesting to do an inclusive design thinking game prototype on the podcast today.

Chris Butler

Absolutely. One of the things that I do is something called empathy mapping for the machine. Generally, empathy mapping is when you’re trying to understand another person: you get everybody in the room together that is part of the team and you write down all of the assumptions you have about them. Those assumptions are: what are they doing? What are they thinking? What are they feeling? And doing this a lot through the design practice about people is what started to make me think about what my expectations or assumptions are about the way the system should work.

When we talk about everybody that could be impacted by it, what we started doing is flipping this around: how do we get people into a room where we’re doing this? We have people, not just necessarily data scientists, engineers, designers or even product people, but the people that are actually impacted by it: the person who is trying to make sure that there’s ethical practice inside an organization, legal compliance, all these people that have to deal with the fact that the solutions we build are not always good or perfect. Getting them together to then think about what we expect from the system: what does it do in the world, what does it understand about the world, what data does it collect, and then what does it really try to optimize in some way?

We’ve done these types of exercises from the standpoint that you need to be including the people that will be impacted by the system to be able to truly get a 360-degree view of what that system is expected to do. And so, we talked about the inclusiveness of that type of thing. I don’t think we do enough to actually understand it; maybe we don’t even do a good enough job with the first-order effects a lot of the time. And when I say first-order effects, I mean the people that are, like, operators of these systems in some way. We don’t even necessarily always bring them into the fold, but then there are the other people that are second- or nth-order out, as far as impact.

And this maybe gets at the fact that the world is very complex. The interactions between everybody are very complex. And so, maybe one thing, in reference to your story about the pipe that was broken: there was a really interesting project in Detroit where they were trying to understand which lead pipes they should actually fix. And what they used was basically a machine learning model to detect, or predict, where they should actually dig, because they couldn’t dig everywhere in an expeditious way.

And so they needed to find the highest-profile places they should dig. Now, they had this system, a third party started taking it over, then scrapped that and went back to the standard way of doing it. And so one of the bigger problems is really how you get the clarity, or the buy-in, from your organization to actually do what leads to a good outcome for people.

And that’s the biggest concern I have. To me, organizations are the ones that are not going to be inclusive in some way. There’s a great paper by Parasuraman about trust in automation, and part of this is what is referred to as abuse. So there’s use, disuse, misuse and abuse. Use is just regular trust of a system: you turn it on and off. Disuse is where you don’t trust it, so you never use it. Misuse is where you trust it too much, and so you use it in cases that don’t make sense; in the case of the Uber autonomous driving death that happened, not only did Uber overtrust their systems, but so did the safety driver. And finally, when we talk about abuse, it’s where you’re not taking into account the actual people that are impacted by this.

And it’s usually because, again, we have too much certainty in the world: too much certainty that this is the thing to do, and that we don’t need those people’s opinions. So one way I see us using machine learning in the future is: how does it help us actually remove some of that certainty, or apply randomness more to the things that we do on a regular basis? As an example, as part of these workshops I tend to use a lot of card decks that ask additional questions.

I’ve actually worked on a card deck with a group called Tri Triggers. They created a deck of basically 60 questions that try to get at what the really meaningful things are for machine learning projects. These questions included: how are you making it so it democratizes the process? How are you making it so this machine learning algorithm actually serves everybody that’s involved? Things like that. Now, you would ideally want to include all 60 of those questions when you’re making a decision about this, but you don’t have time to answer all 60.

So there’s this idea of randomness: how do you have objects or machines or systems in your life, what people refer to today as generative AI systems, push you to consider more than you would have otherwise? Because our biases are insidious, and we don’t even know that we have them necessarily.
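A minimal sketch of that “apply randomness” idea: draw a random handful of prompts from a larger deck so each session considers angles the team would not have picked on its own. The question texts below are placeholders, not the actual deck described above:

```python
# Draw a random subset of prompt cards for a decision-making session.
import random

deck = [f"Question {i}: placeholder prompt about impact, data, or consent"
        for i in range(1, 61)]

def draw_prompts(deck, k=5, seed=None):
    """Return k randomly chosen prompts for this session."""
    rng = random.Random(seed)
    return rng.sample(deck, k)

for prompt in draw_prompts(deck, k=5):
    print(prompt)
```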

I see a real possibility that in the future we’re not going to remove all bias, because bias in some ways is good. But when we talk about the civic use of this type of work, how do we try to remove as much as possible of the bias that is really just people wanting to get their own thing, versus the idea of how we work towards a better community? That is something where machines can maybe help point out some of these biases to us. If anything, when we talk about a lot of the techniques around bias detection, like IBM’s AI Fairness 360 toolkit, there are a bunch of different groups that have done work on this, and in the end the bias is an anomaly to a machine.

That’s the interesting thing. And so, being able to detect that is great. You could imagine that a machine learning algorithm, or a model that’s been trained with our data, is actually an exemplification of the biases that we have. So what you need to do is understand what those biases are, and understand whether the biases you’re okay with are generalizations that are good enough for the trade-off you need to make when it comes to automating a system. That is a really interesting place for work right now; even things like differential privacy, all those aspects of understanding how the data impacts the way machine learning models actually develop, are to me probably some of the more interesting work. I don’t think we’re ever going to remove bias completely. But what we will do is understand our biases better, and machines can help us do that.

David Yakobovitch

Now, what are some of the most common biases that you think we should be thinking about on a day-to-day basis, so that we’re more level-setting and thinking about everyone, not just ourselves?

Chris Butler

Under-representation in data is a really good one: who would be impacted by this where I’m not collecting their data? This could be because the person is actually not the main user of the system, but is the person impacted by its use. So a key aspect of it is under-representation.

And then that also gets into things like class imbalances in the data set, and there are a lot of toolsets that are trying to help identify that, at least. I would also say intersectionality. This is the concept that there are many different ways I may be classified in a particular system, usually each column that I exist in within that data set. Intersectionality is a study that comes out of feminist studies of the sixties and seventies, and there was some really great work done out of Google, around Jigsaw, for this. Jigsaw was the tool that was trying to identify toxicity in comments online.

And one of the things that they found was that there was a problem where certain things, like LGBT terms or ethnic terms that were actually okay in a particular context, were being misidentified as toxic. And that’s problematic, definitely, because you then can’t have a real conversation about something like race or sexual preference, because those comments are being identified as toxic when in reality it’s a real discussion.

Now, that’s what they identified as a first-order effect. Once you start to talk about intersectionality, once you combine these different types of terms together, you get even worse misclassification of toxicity in these comments. So if you combine sexual preference with ethnicity, it gets even worse as far as toxicity and misrepresentation or misclassification.
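A minimal sketch of how you might check for that kind of intersectional failure, with synthetic labels and predictions (the column names and rates are invented): compare false-positive rates, benign comments flagged as toxic, per group and per intersection of two groups.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000
df = pd.DataFrame({
    "mentions_lgbt_term": rng.integers(0, 2, n).astype(bool),
    "mentions_ethnic_term": rng.integers(0, 2, n).astype(bool),
    "actually_toxic": rng.random(n) < 0.1,
})

# A hypothetical model whose false-positive rate worsens when both kinds of
# terms co-occur, mimicking the intersectional effect described above.
p_fp = (0.02
        + 0.05 * df["mentions_lgbt_term"]
        + 0.05 * df["mentions_ethnic_term"]
        + 0.15 * (df["mentions_lgbt_term"] & df["mentions_ethnic_term"]))
df["predicted_toxic"] = df["actually_toxic"] | (rng.random(n) < p_fp)

benign = df[~df["actually_toxic"]]
fpr = (benign.groupby(["mentions_lgbt_term", "mentions_ethnic_term"])
       ["predicted_toxic"].mean())
print(fpr)  # the (True, True) intersection shows the worst false-positive rate
```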

And so the idea of intersectionality gets at what the worst-case scenarios are when it comes to the other side of your accuracy measurement. This is where another exercise we created at Philosophie, called confusion mapping, comes in. From the standpoint of a confusion matrix, there’s this idea of false positives and false negatives, but it doesn’t really get at what the impact of those things is, especially because the value surface of that algorithm, the value function, could be uneven in different places.

When we talk about false positives and false negatives: a false positive in the case of autonomous vehicles is that I think there’s a car ahead of me, so I slam on my brakes, versus a false negative is that I don’t think there’s someone in front of me, so I just drive through them.

And there’s a clear difference in which one of those two things is worse. Confusion mapping is about imagining what the worst-case scenarios are, and the impacts or severity of those things based on how our algorithm will perform, and then stack-ranking them to try to understand what we need to do in addition. That could be: how do we add heuristics? How do we put guide rails on the algorithm? How do we even build ensembles of models that are good at detecting this type of problem in particular?
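A minimal sketch of that confusion-mapping idea with toy numbers and scikit-learn: the four cells of a confusion matrix carry very different real-world costs, so weight them instead of looking only at accuracy. The severity numbers here are hypothetical, the kind of thing a workshop would produce:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])  # 1 = obstacle present
y_pred = np.array([0, 0, 0, 1, 0, 0, 1, 1, 0, 1])  # model's detections

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Hypothetical severity scores: a false negative (driving through someone) is
# far worse than a false positive (braking for nothing).
costs = {"tn": 0, "fp": 1, "fn": 100, "tp": 0}
total_cost = (tn * costs["tn"] + fp * costs["fp"]
              + fn * costs["fn"] + tp * costs["tp"])

print(f"accuracy={(tn + tp) / len(y_true):.2f}, weighted cost={total_cost}")
```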

And so that idea of intersectionality gets at the point that there’s this special case, an intersection between multiple different classes in your particular use case, that will be really horrible. So trying to consider those things upfront, and trying to understand how you detect that it’s happening, matters. And then, maybe most importantly from the standpoint of the Western world, when we talk about ethical thinking or ethical thought, agency is one of the top things that’s there. From that perspective, maybe this is a question everybody should be asking themselves: what would happen if people could opt out of my system, and I was penalized by the fact that they were opting out? What is the lever that allows them to escape out of this?

And when we talk about conversational interfaces, it’s very much the same type of thing: when someone is on a phone call into an IVR system and they just scream ‘operator’ over and over again, that’s them opting out of the system, and that’s because the system hasn’t built the right level of trust.

It’s not operating in a way that really makes me feel like it’s going to take care of me. The idea that you have ‘recently changed your menu options’ is probably a lie; it’s probably just trying to get me to actually press the right button rather than just hit zero.

So there’s a lot around trust when it comes to these types of systems. And allowing that type of opting out, or agency, is one where, if you go through that thought process when you’re building the systems, you’ll probably consider the fact that you’re not going to build the right thing. And if that’s the case, what do you do to detect that you didn’t build the right thing?

And then, how do you allow people to still be served in some way, by that machine or by a human being? And how do you learn from that long-term so that you can build a better system over time?

David Yakobovitch

And learning is at the heart of everything, every single day. What you shared with our audience today is, in fact, an underrepresented field: AI design thinking. More conversations should be had about this, and you took us into several deep dives today with some fantastic use cases. So Chris, thanks so much for sharing your wealth of knowledge on HumAIn today.

It’s been an absolute pleasure. 

Chris Butler

Great. Thank you. 

Now, I’m really hoping that people start to utilize these things in their practice; that’s where it’s really key. You can write as many codes of ethics as you want, but unless you make it part of your daily practice while you’re building these things, it doesn’t really matter. So thank you for giving me the chance to share them.

David Yakobovitch

Absolutely.

David Yakobovitch

Hey humans, thanks for listening to this episode of HumAIn. My name is David Yakobovitch, and if you like HumAIn, remember to click subscribe on Apple Podcasts, Spotify or Luminary. Thanks for tuning in, and join us for our next episode. New releases are every Tuesday.