DUE TO SOME HEADACHES IN THE PAST, PLEASE NOTE LEGAL CONDITIONS:

David Yakobovitch owns the copyright in and to all content in and transcripts of The HumAIn Podcast, with all rights reserved, as well as his right of publicity.

WHAT YOU’RE WELCOME TO DO: You are welcome to share the below transcript (up to 500 words but not more) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The HumAIn Podcast” and link back to the humainpodcast.com URL. For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

WHAT IS NOT ALLOWED: No one is authorized to copy any portion of the podcast content or use David Yakobovitch’s name, image or likeness for any commercial purpose or use, including without limitation inclusion in any books, e-books, book summaries or synopses, or on a commercial website or social media site (e.g., Facebook, Twitter, Instagram, etc.) that offers or promotes your or another’s products or services. For the sake of clarity, media outlets are permitted to use photos of David Yakobovitch from the media room on humainpodcast.com or (obviously) license photos of David Yakobovitch from Getty Images, etc.

Welcome to our newest season of the HumAIn Podcast in 2021. HumAIn is your first look at the startups and industry titans that are leading and disrupting ML and AI, data science, developer tools, and technical education. I am your host, David Yakobovitch, and this is HumAIn. If you like this episode, remember to subscribe and leave a review. Now onto our show.

David Yakobovitch

Welcome back, listeners, to HumAIn, your first look at the newest products in the AI-powered, ML-powered, and data-powered landscape. Today on the episode we're featuring the co-founder and CTO of AKASA, Varun Ganapathi. Varun is working at the crossroads of machine learning and human-in-the-loop technology to build better systems where humans and machines work together, which, as many of you know, is our big theme here on HumAIn. So I'm very excited to have Varun with us today. Varun, thanks so much for joining us on the show.

Varun Ganapathi

Thank you, David, for having me.

David Yakobovitch

We are really excited to dive into these topics because they are the core tenets that we talk about all the time on HumAIn: data-first principles, humane principles, and responsible and ethical principles. But before diving into AKASA and these principles, let's take a step back and share with our listeners a little bit about your journey and what brought you to AKASA.

Varun Ganapathi

Sure. I've been excited about artificial intelligence for a long time, since I was an undergrad at Stanford working in physics. I worked with Andrew Ng on helping helicopters fly autonomously using machine learning, then went to Google, where I worked on a project scanning all of the world's books, and I used machine learning to help index the books and extract information from them automatically.

Then came my Ph.D., where I worked on computer vision and machine learning more generally, and I started a company during my Ph.D., in the computer vision space, that was acquired by Google. I've been working on machine learning and AI for a long time, with a couple of other startups in between, and I was the head of AI at Udacity for a while. The way I got to co-founding AKASA was really thinking about how AI can help healthcare.

And I saw all of these companies working on diagnosis, and I thought that's really cool. However, it felt like there were a lot of other areas of healthcare that could be improved with AI. When we started thinking about that, we saw there was a lot of discussion about physician burnout, about all the paperwork involved with healthcare, and about how patients were filing for bankruptcy because they got these huge surprise bills that were due to medical billing errors. It seemed like a really awesome place for AI to make a difference and to just make healthcare more efficient. And so that's how we got started with AKASA.

David Yakobovitch

That's super exciting. As an employee and someone in the venture space, I have health insurance, and I go through the system. Even in New York, I'll tell you, working with these hospitals and these providers, I get sticker shock all the time. And I'm like, are you kidding me? I have insurance. What is this bill? What's going on? I've seen firsthand, in my own conversations and in those of some of my close friends, the back and forth with medical billing. It's a complex process that takes a lot of human intervention and isn't as automated as it could be.

Varun Ganapathi

Exactly. That's exactly what AKASA hopes to correct: making the process as automated and easy to use as possible for everyone, saving everyone time. Making sure patients get the care they deserve while paying the right price they should be paying. Making sure providers don't have to spend tons of time figuring out how to collect on the work that they provide. And also making sure insurers get the right claims, with no mistakes in those claims when they're submitted. So, there's a lot of stuff that could be automated in the middle of that process and improved, and that's exactly what we hope to do.

David Yakobovitch

And you are a leader in the AI space, whether that's computer vision or, now, these healthcare improvements across the greater economy. There's a lot of technology that has been used for a long time; in fact, a classic in the healthcare industry is RPA technology. And now we're going into this next phase of digital transformation where it's no longer just about RPA; there's a lot more to be seen. Are you seeing some of those changes and new waves?

Varun Ganapathi

The new waves come from realizing that a lot of these processes involve decision-making in the middle. Beyond just collecting information from one system and copying it directly into another, there's decision-making that has to occur in the middle of that process. There's reconciling information from different sources. There's translating between different formats, for instance from doctors' notes into the medical codes that actually go on the claims. So the place where I see AI fitting in is really extending the reach of automation beyond the simple stuff that was possible before, to actually handling the complex situations that humans are currently dealing with.

And the way we do that is by using a human in the loop, the human in the middle, which we call experts in the loop. Their job is to help teach the system how to handle all of these edge cases. The analog in the self-driving car world is how all these self-driving car companies have drivers who essentially show the car how to drive by driving around collecting data, and the car learns from that experience.

And so we do the same thing, except we’re learning how to process information on a computer in the computer world. And the way we integrate with our customers is actually trying to simulate being a person as much as possible. So we interact with the systems using the same tools and methods that a normal human being would. 

Our goal is just to fit in and dramatically optimize the productivity of a person in that system. And our company actually provides the entire service end to end. So we essentially give you the equivalent of a virtual FTE that you hire from us, or a lot of them actually, and we produce work in your system using AI, essentially taking work off your hands so that you don't have to deal with all of those cases manually.

David Yakobovitch

So I'm ultimately the provider, let's frame it that way. And by provider, that means I could be anything from a dentist's office or a chiropractor, to an outpatient CityMD, to a hospital, to any sort of healthcare provider. Is that what we're really looking at here? That this process of medical billing, coding, and getting everything set up right involves a ton of human error; there are a lot of mistakes, and there's a lot of room for improvement.

Varun Ganapathi

Exactly, yes. I should have clarified that: our primary customers right now are the providers; our technology serves to help them automate their processes. We're looking at other areas of healthcare as well to see where our technology can be useful, but that's our primary market right now.

David Yakobovitch

That's such a great beachhead market, because I know some of the providers, myself being in New York, you think of the NYUs, the Columbia medical centers, the Mount Sinai Healths of the world. Some of these providers see thousands of patients a day, and I can just picture an army of people going through medical billing and coding, doing their best. I don't think anyone objectively intends to misdiagnose or misbill or miscode. It's just that there are so many nuances and complexities. If you can build a system that augments the best of the humans for these complex cases, alongside the routine, standard billing practices, it sounds like that's where AKASA is fitting in.

Varun Ganapathi

That’s exactly right.

David Yakobovitch

And so beyond that, one part to dive deeper into is the complexity. When we think of complexities, there could be a scenario where someone has an immunocompromised health condition, and this can cause exceptions and outliers, and those exceptions and outliers make for differences. How do you see AKASA being able to automate those? I imagine that's really where the human in the loop comes in.

Varun Ganapathi

That's where the human in the loop comes in. The idea is to use humans to identify the so-called mattress in the road. When our systems have seen an example several times, and they see what the human did in that situation, the algorithm will learn to replicate that behavior and then do it by itself in the future. When it sees an edge case, it sends it out to a person; if it sees the same type of edge case enough, it no longer is an edge case, it becomes a normal case. You're essentially eating down the long tail of edge cases, so to speak, where the most common next edge case is the thing the algorithm learns to solve.

And so over time, you're essentially squeezing down what people need to do manually, the exceptions that people have to deal with. But yes, at any point in time, there will be the really hard cases, the exceptions, and that's where we bring people in to help. I don't view this as a goal of complete automation. It's really about improving the productivity of people; if we can create leverage, that itself is extremely helpful.

And the good thing about the space that we're in, in terms of automation, is that it's very easy to pass work back and forth between humans and computers. There are no real-time requirements. With a self-driving car, you need to react in milliseconds: if you're about to crash and you need someone to jump in and help the car navigate a difficult situation, you need connectivity and you need to be really quick. With us, these are offline processes.

And so there's no real need for extreme real-time performance here. If the algorithm sees something it doesn't know how to do, it can say, okay, I'm going to put that in a queue for someone to handle and tell me what to do, and I'm going to go on to the next task that I know how to do. And so over time, we can gradually reduce the number of tasks that we're having people do.
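To make that hand-off concrete, here is a minimal Python sketch of the kind of confidence-based queueing Varun describes. The `predict_with_confidence` model interface and the threshold value are assumptions for illustration, not AKASA's actual API.

```python
from queue import Queue

CONFIDENCE_THRESHOLD = 0.95        # assumed cutoff; a real system would tune this
human_review_queue = Queue()       # offline queue: a person resolves items later

def process_task(task, model):
    """Automate a task when the model is confident; otherwise defer it to a person."""
    prediction, confidence = model.predict_with_confidence(task)  # hypothetical model API
    if confidence >= CONFIDENCE_THRESHOLD:
        return prediction          # handled automatically
    human_review_queue.put(task)   # no real-time pressure: someone handles it later
    return None                    # move on to the next task the model does know
```

Because the process is offline, the queue can be drained whenever reviewers are available, which is exactly why the lack of real-time requirements matters here.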

But in the meantime, we're always providing value, which is really important for AI companies in general. Some classes of AI companies seem to be in the position of: we need to solve some extremely hard problem, and until we solve it, we can't really use it at all. What's nice here is that we can leverage AI as we make it better; the better it gets, the more it can do, but it's useful the whole time.

David Yakobovitch

So, at its heart, AKASA is solving a variety of problems. Some of these are the business problem of: let's build a more robust, streamlined healthcare system, take it from the 1980s and move it to the 2020s, so we have a modern healthcare system. And beyond that, it's also a data and AI company. It sounds like this data capture layer is so important, because as your systems continue to read and stream more data, the insights become that much more valuable, the models become that much more precise, and this generates more novel working solutions and opens up the door for other use cases over time. Would you say that AKASA is a data capture company that is also innovating in engineering around creating algorithms or implementing new AI technologies?

Varun Ganapathi

That's a great question. I would say our focus is really on helping our customers. Our goal is just to help them do their job better; that's our primary focus, and what we do is bring cutting-edge machine learning to that problem for them. Honestly, the data we capture is really on behalf of the customer, to solve the edge cases that they have personally. The types of things we're learning are changing on a constant basis.

Every year, insurance plans change, policies change, contracts change, and there are new codes being added all the time. So the real benefit of machine learning is to learn really quickly from all of those changes, update itself almost instantly, and then leverage those new changes in its workflow as it proceeds. What's hard for health systems today is that every time these changes happen, there are so many people they have to tell about those new changes.

And there are so many little changes all across the board, because for every single interaction between a particular hospital provider and a particular insurance company, there could be new rules about that interaction or new policies in place. Keeping track of all of those details is really hard for people, but for a computer, that's exactly what it becomes really good at. So that's really where the data adds value to the customer: by reacting to changes and then enabling them to roll out, essentially using AI as a tool, the new updated behaviors based on those changes. We're really primarily focused on that part, on serving our customers in the best way possible.

David Yakobovitch

And leaning into this topic of delighting customers and providing this modern healthcare experience, there's a lot that other companies can glean from what AKASA is doing for the entire market, which is leaning more on this human-powered AI. This human-powered AI movement has continued to evolve, whether you call it humane, human-in-the-loop, or even explainable to some degree, instead of the classic garbage-in, garbage-out algorithm. Can you share more about why you're very bullish on this approach?

Varun Ganapathi

I am bullish on this approach because it essentially helps make sure that the algorithms don't do something that you don't want them to do. A lot of the time, you are just modeling what you see. There are a lot of different ways to use AI. Netflix uses AI to predict what movies you're going to like to watch, and in advertising there's a lot of AI being used to predict what will make you click on the ad. But these can sometimes have incentives that aren't exactly what you would want if you were a human doing that task.

So what I see as the value of humans in the loop is that it helps make sure the system is actually achieving the goal that you want it to achieve, not optimizing some objective that we put in front of it but that may not actually be the true objective. These objectives that we create for our algorithms, if they're even subtly wrong, can create extremely incorrect behavior, especially as we use AI more and more. And so the human in the loop really helps make sure that you keep things in check.

For us in particular, we use humans in the loop in a variety of ways. I discussed the edge case method, where we make sure the algorithms have a confidence rating on what they're doing and send only low-confidence things to humans. We also use humans for quality control all the time, so we're having people check a small percentage of our tasks on a constant basis to make sure that the algorithms are not misbehaving. For me, it's the human in the middle that is critical to avoid algorithms going haywire. You really want something constantly checking and making sure the system is doing what it's supposed to, and that we're getting the results we want out of it, because ultimately it's there to serve us. That's the goal.
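As a rough illustration of the quality-control sampling described above, the sketch below randomly audits a small fraction of automated outputs. The 2% rate, the `model_output` attribute, and the `human_redo` callback are assumptions for illustration, not details of AKASA's system.

```python
import random

QC_SAMPLE_RATE = 0.02  # assumed: audit roughly 2% of automated tasks

def sample_for_quality_check(completed_tasks):
    """Randomly pick a small fraction of automated tasks for a human to re-check."""
    return [task for task in completed_tasks if random.random() < QC_SAMPLE_RATE]

def disagreement_rate(sampled_tasks, human_redo):
    """Fraction of sampled tasks where a human redo disagrees with the model's output."""
    disagreements = sum(1 for t in sampled_tasks if human_redo(t) != t.model_output)
    return disagreements / max(len(sampled_tasks), 1)
```

Tracking the disagreement rate over time is one simple way to notice an algorithm starting to misbehave before it affects many tasks.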

That's why I feel strongly about this. To a large extent, many problems in AI are really about a new way to teach computers how to do what we want, by showing them rather than trying to explicitly program it. That's really what a lot of machine learning is about. I want to teach a computer how to identify a car, and it's basically impossible to program that by hand. But a person can say, well, that's a car, that's not a car, and the algorithm can learn how to do that. That's machine learning.

So to some extent, honestly, a lot of AI already has humans in the loop in the form of these labelers who are teaching the algorithms what to do. We're just going one step beyond that, to leverage the humans more: not only labeling once, but understanding that because there's drift, because the domain can change, you really need labeling involved the whole time. Essentially, you can never stop labeling; you have to always have labeling going on, in order to make sure the algorithms are always adapting to whatever changes occur.

David Yakobovitch

I love, Varun, how what you're sharing is guidance about humans and machines working together. What I've appreciated seeing in the industry, for example, is that Carnegie Mellon (CMU) recently came out with design standards on how you keep algorithms in check and how you ensure you're building systems that are human-controlled or human-augmented by these machines. There have been a lot of frameworks coming out; that's one from recent years, from their Software Engineering Institute, among others, and there's a lot of great material there. I'm curious, from your vantage as a leader in the field, are there any standards that you look toward, or material that you've appreciated, as you've been exploring these human-in-the-loop systems?

Varun Ganapathi

That's a great question. Off the top of my head, I can't point you to resources; we've honestly had to figure it out as we go. I've read a lot of papers, and I'm always up on the most recent research, but on this exact topic of deploying ML into real systems and making sure they operate well, I don't myself know of any recently written standards on how to do that. It feels like, currently at least, it's a case-by-case basis.

From a statistical standpoint, really, it's about making sure that you model the fact that the world can change. So you want to ensure that you have a person or some other system continuously telling you about the world, or confirming that what you think about the world is true, so that you can adjust to that change. I've had conversations with my former advisors like Sebastian Thrun, Andrew Ng, and Daphne Koller about this. But I don't really know of any explicit public standards about that. That would be a great thing to write about, though, and more and more people are writing about how to do this well.

David Yakobovitch

And you're not alone there, Varun, because I've seen a lot of that myself, having led this show for the last three-plus years, speaking with pre-seed to pre-IPO founders and having looked at data-first principles, even from an investment perspective. The challenge is that it's such a blank canvas; it's a new market, and it's a market that's evolving over time for all industries: healthcare, finance, and others.

And so the standards aren't fully set yet. I would hypothesize that, just as the GDPR standards have become a golden source in the last few years, both in the States and globally, we'll see a similar shift as more writings and publications come out at these leading AI and ML conferences.

Varun Ganapathi

Yeah.

David Yakobovitch

Well, that was definitely an interesting topic for sure, and you've given me a lot to think about there.

Varun Ganapathi

Thank you. I'm really excited about AI overall. There are these industries that are extremely labor-intensive today, and as a result, their cost as a percentage of GDP keeps going up and up. There's something known as Baumol's cost disease: as we automate other industries, the industries that remain labor-intensive become more expensive, essentially because the cost of the people doing those things goes up, since a person could prefer to do something elsewhere where they have higher productivity.

Education and healthcare are two of the top ones where the cost as a percentage of our overall GDP just keeps going up and up, because they're so labor-intensive. That's why I feel it's imperative for us to figure out how we can deliver healthcare more efficiently, because it's something everyone needs and something we want to provide to everyone; everyone cares about their health. So we need to figure out ways to bring down the cost while keeping the quality high, or even improving it. AI can really help with that, and there are other places where it could be useful as well.

David Yakobovitch

Thinking about all the topics that we've discussed here, Varun, it sounds like the AI community is evolving and the standards are still being defined across data ops, ML ops, and AI ops. But perhaps there is also a need to go back to the basics. One of the core areas in the data science workflow that was predicted to be among the fastest-growing, and which we saw grow during the pandemic, is data labeling; think of companies like Scale AI.

And now these new AIOps companies like Roboflow and others are helping with both natural language and computer vision data labeling, though it is still the early days. The classic problem that I've shared with students in New York is that if you have a dataset, for example in healthcare, and there are no participants in that dataset with a certain comorbidity or underlying condition, well, are you going to get it right if the data doesn't exist? So I'd like to hear your take on why the AI community might need to get back to the basics with data labeling.

Varun Ganapathi

Everyone I talk to talks about how data labeling is one of the most important things; you're totally right. For us, it's a major focus. It's really hard for us to use any public labeling solution, because the things we have to label are very customer-specific, potentially, and obviously are covered by HIPAA. So we need to make sure the labelers are our own people, that they're aware of HIPAA guidelines, that they're trained, and so on.

So, 100%, data labeling is critical. A huge part of making AI work well is making sure your data is clean. There are a variety of ways to create loops where you have the algorithm run and identify the surprising cases, and then you can focus your labeling efforts on relabeling those to make sure the data is clean. There are other methods, like active learning, where you identify what now needs to be labeled, where you have sparsity in your input dataset. For us, this naturally happens, because since we're only automating the high-confidence cases, the edge cases grow.

Essentially, that's the fringe, the place where we have low confidence, and that's where we need to get more labels. Our system automatically does that by involving labeling as a core part of the system, rather than thinking of it as: we label once, we train the model, and then we're done. We actually think of this as a continuous virtuous cycle, where labeling, ML, and deployment all have to coexist continuously; the systems need to be continuously maintained and improved by having labeling and people there the whole time.
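One way to picture that continuous cycle is the simplified active-learning loop sketched below. The `confidence`, `request_human_label`, and `retrain` hooks and the batch size are placeholders for illustration rather than a description of AKASA's system.

```python
def active_learning_cycle(model, unlabeled_pool, request_human_label, retrain, batch_size=100):
    """One pass of the label -> train -> deploy loop: label the fringe, fold it back in."""
    # Score the pool and pick the lowest-confidence items -- the "fringe".
    ranked = sorted(unlabeled_pool, key=lambda item: model.confidence(item))
    fringe = ranked[:batch_size]

    # Experts in the loop label exactly the cases the model is least sure about.
    new_labels = [(item, request_human_label(item)) for item in fringe]

    # Retrain with the new labels; in practice this cycle never really ends.
    return retrain(model, new_labels)
```

Running this cycle repeatedly is what "eating down the long tail" of edge cases looks like in code: each pass moves the next-most-common edge case into the set the model handles on its own.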

So you're 100% right. I think the reason it maybe gets forgotten, or the focus is not as high, is that there are a lot of public datasets that people use where you're just given the pre-labeled dataset. And there's a lot of, let's just try to improve our numbers on that dataset to prove a new state of the art. So when you create a new dataset, in some sense, you can't compare against anyone, right?

Because it's your dataset. That's where it becomes tricky, so the ML community basically decided, well, let's at least hold one variable constant, we'll hold the dataset constant, and then just see how much better my algorithm is than someone else's algorithm on that fixed dataset. It's really just a way to address comparability and to know, are we actually making progress? But you're totally right: in the real world, when you actually deploy machine learning, labeling is extremely critical.

David Yakobovitch

There are a lot of changes occurring in the entire data ecosystem today. There's been talk of the modern data stack, the decade of data, and a lot of new tooling and technology coming out. What are some of the trends or things that are exciting you about the industry that you and your team are working with?

Varun Ganapathi

There are new things coming out constantly. We're always looking at the newest papers being published, the newest algorithms. I would say our focus has really been on the modeling side and labeling side in-house, and a lot of the things we have to solve are very specific to us. But if I were to come up with general trends, well, this is not a particularly new trend.

But something that is really cool is self-supervised learning: how can you use unlabeled data to make your algorithms work better? That's one way to partly solve the labeling problem, by figuring out ways to leverage the dataset in an unlabeled form in order to learn something about the underlying data distribution, and therefore help you solve new problems more efficiently.

That's really exciting, because to some extent it avoids, or at least lets you make a lot of progress before you need to solve, your labeling problem, and it also enables you to get more out of the labels you have. Those are the primary things I can think of right now to answer your question, but I'll think more about it, maybe I'll come up with something else.
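For readers unfamiliar with the idea, one common flavor of self-supervised learning is masked prediction: hide parts of the unlabeled data and train a model to reconstruct them. The toy sketch below only builds such training pairs and is illustrative of the general technique, not a description of what AKASA trains on.

```python
import random

MASK = "<mask>"

def make_masked_example(tokens, mask_prob=0.15):
    """Turn unlabeled text into (masked_input, targets) pairs -- no human labels needed."""
    masked_input, targets = [], []
    for token in tokens:
        if random.random() < mask_prob:
            masked_input.append(MASK)
            targets.append(token)    # the model must reconstruct the hidden token
        else:
            masked_input.append(token)
            targets.append(None)     # nothing to predict at this position
    return masked_input, targets
```

Because the "labels" come from the data itself, every unlabeled document becomes usable training signal, which is exactly the appeal Varun describes.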

David Yakobovitch

That's very cool and very thoughtful. And myself, as someone who likes to roll up his sleeves, I'm always diving into different frameworks in Python, R, SQL, and other languages. Have there been any cool tools or frameworks on the horizon that you've been playing around with, or that your team has been exploring or implementing?

Varun Ganapathi

PyTorch, we love to use a lot of that in-house; almost all of our models are built using PyTorch. I will say, one of my friends started a company called Weights and Biases, and that's something we've been looking into using. Our team is really excited about it as a way of helping visualize our results in a standardized way. Other than that, AWS is, of course, adding tons of new things constantly, and we attempt to leverage them as much as possible. For example, we use AWS Textract for OCR and things like that. So those are some of the tools that we've found pretty exciting.
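As a small illustration of how PyTorch and Weights and Biases can fit together, here is a minimal training loop that logs its loss to a W&B project. The project name, stand-in model, and dummy data are assumptions for illustration only, not AKASA's setup.

```python
import torch
import wandb

wandb.init(project="claims-model-demo", config={"lr": 1e-3})   # hypothetical project name

model = torch.nn.Linear(128, 2)                                 # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=wandb.config.lr)
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(32, 128)                                    # dummy batch of features
    y = torch.randint(0, 2, (32,))                              # dummy labels
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    wandb.log({"loss": loss.item()})                            # appears in the W&B dashboard
```

Logging every run to the same project is what gives the standardized, comparable visualization of results that the team is excited about.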

David Yakobovitch

That's awesome, and it's a great shout-out for Chris from Weights and Biases. We actually did a feature on Chris Van Pelt in our first season of HumAIn, when he was just launching after his seed round, so it's great to see the community and the ecosystem always growing together. Great shout-out there. And Varun, you're continuing to grow AKASA as a company. In late 2021, you had raised a Series B round of funding, so you're continuing to scale and grow, and there's a lot of hiring happening. Can you tell us a little bit about the scale at AKASA, and are you hiring?

Varun Ganapathi

We are desperate to hire people, like everyone else. We're a remote-first company, and we're working on a mission that we think people will find personally rewarding; when you look at some of the other companies out there, what we're working on is going to make the world better for sure and help everyone in healthcare. So 100%, we're trying to scale as much as possible. We've been growing a lot; our customer base has been growing heavily, and we're now helping serve a very large percentage of the hospital market, which is a big responsibility. We're scaling really aggressively in order to handle all the customer demand in that sector.

David Yakobovitch

Very exciting. I also know that back in the latter part of 2021, AKASA was named by CB Insights as one of the most innovative digital health startups, so it's exciting to see all the opportunities there. Where would you recommend listeners check out more about AKASA? I know they can go to akasa.com/careers. Any other actions or next steps you'd love to share with our audience?

Varun Ganapathi

AKASA is still a startup, but we've grown pretty rapidly in the last year; we've more than tripled in size to more than 200 employees. And I'm proud to say our customer base has, in total, more than $100 billion of net patient revenue per year, which is a sizable proportion of the $1 trillion hospital market in America. So we're growing rapidly, both on the hiring front, in order to serve all those customers, and on the customer side, as we expand our product space to all sorts of functions within healthcare. akasa.com is a great place to go.

There are other podcasts that I've done as well, so listeners who are listening to this may be interested in listening to the AON healthcare podcast on Apple Podcasts. That's the main place I would recommend. We also publish papers at ICML and MLHC on some of the algorithms we're using to actually push the state of the art in healthcare, so that might be another interesting place to look. But the starting point would be akasa.com; everything links from there.

David Yakobovitch

Fantastic. This has been an exciting episode learning about machine learning, human-in-the-loop systems, and bringing this to the healthcare market. Today we featured Varun Ganapathi, who is the co-founder and CTO of AKASA. Varun, thanks so much for joining us on the show.

Varun Ganapathi

Thank you, David. I appreciate it.

David Yakobovitch

Thank you for listening to this episode of the HumAIn podcast. Did the episode measure up to your thoughts on ML and AI, data science, developer tools, and technical education? Share your thoughts with me at humainpodcast.com/contact. Remember to share this episode with a friend, subscribe, and leave a review, and listen for more episodes of HumAIn.