How Synthetic Data has Revolutionized the AI Industry With Jeremy Kaufmann, Principal, Scale Venture Partners

David Yakobovitch

Today’s guest speaker is an AI investor in some of the largest scaling startups of 2019. His firm’s investments include KeepTrucking, Cognata, Solvvy, and TechSee. Listen in as Jeremy Kaufmann and I discuss his thesis for investing in AI, how the cold start problem can be overcome with synthetic data, and why action detection is a new trend for the AI industry. This is HumAIn.

Welcome to HumAIn. My name is David Yakobovitch and I will be your host throughout this series. Together we will explore AI through fireside conversations with industry experts. From business executives and AI researchers to leaders who advance AI for all. HumAIn is the channel to release new AI products, to learn about industry trends and to bridge the gap between humans and machines in the fourth Industrial Revolution.

If you like this episode, remember to subscribe and leave a review. Welcome back to the HumAIn podcast. My name is David Yakobovitch, your host in bridging the gap of humans and machines in the fourth industrial revolution.

Today, our guest speaker is Jeremy Kaufmann, who has a wide variety of experiences. He is a Principal at Scale Venture Partners. He started his career as a statistician and economist at the New York Federal Reserve and became very interested in healthcare outcomes research. So he took his love of data, gained SaaS experience at Salesforce, and then brought that to Scale Venture Partners, where he has focused the last three and a half years on the world of AI and machine learning. Jeremy, thanks for being with us.

Jeremy Kaufmann

Thanks so much for having me, David.

David Yakobovitch

So Jeremy, it’s super great that you’re here on the podcast today. I’ve had a chance to look at the Scale portfolio and one of my favorite companies is KeepTrucking. I actually have a few friends who work there and I’ve seen it scale up over time. What’s going on with them today?

Jeremy Kaufmann

So KeepTrucking is one of the companies we are super excited about. In the world of SaaS, it’s one of the fastest growing companies ever, to be honest. And it’s a company that we think is really a symbol of how you operate well in a vertical software market that others might overlook. A lot of SaaS investors talk about finding the next Slack. They don’t talk about helping truck drivers become more productive. So we’re really excited about KeepTrucking. When we invested, there was a four-point thesis.

The first was that a large segment of long-haul truckers were going to be forced to buy ELD technologies, electronic logging devices, because of government regulation. The second was that this mobile bring-your-own-device product was going to be the preferred solution. Ten or 20 years ago there was an onboard computer in a truck, but the reality is, today, truck drivers, like most Americans, like everyone, have mobile phones. So wouldn’t it be great if you could integrate the mobile phone into the trucker workflow?

The third part of this thesis was that KeepTrucking was going to be able to sell into the smaller trucking fleets, which was something that no one in the industry had ever done before. People don’t realize that the trucking industry is one where the fleets are super small in size, and KeepTrucking’s freemium approach basically allowed them to sell into these smaller trucking fleets. And then finally, we’re super excited that now, going forward, KeepTrucking is going beyond the ELD into more exciting driver safety products and thinking about freight marketplaces.

David Yakobovitch

Fantastic. And there have been other companies too, that I know you’ve had opportunity to work on specifically around facial recognition with autonomous driving. One of them is Cognata. What have they been up to?

Jeremy Kaufmann

Cognata is super interesting. It’s in the autonomous vehicle simulation market. So this is the idea that RAND tells us it’s going to take seven to eight million miles to basically train an autonomous vehicle to operate in the real world with complex corner cases, pedestrians, swerving vehicles.

And the idea of simulation is really simple, which is, if you can use a machine to train the vehicle and simulate 6 million or 6.5 million of those miles, that only leaves a million or half a million miles for a human to have to drive. So that’s really important, and we think it’s an on-ramp to getting vehicles ready for the world of autonomous vehicles.
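
To make the arithmetic above concrete, here is a minimal Python sketch. The function name `human_miles_remaining` is hypothetical, and the mileage figures are simply the ones quoted in the conversation (a total of roughly seven million miles), not verified numbers.

```python
def human_miles_remaining(total_miles, simulated_miles):
    """Miles left for human drivers once simulation covers part of the total."""
    if simulated_miles < 0 or simulated_miles > total_miles:
        raise ValueError("simulated_miles must be between 0 and total_miles")
    return total_miles - simulated_miles


# Figures quoted in the conversation; assumed here purely for illustration.
total = 7_000_000
for simulated in (6_000_000, 6_500_000):
    remaining = human_miles_remaining(total, simulated)
    share = simulated / total
    print(f"simulate {simulated:,} miles -> {remaining:,} human miles "
          f"({share:.0%} of training simulated)")
```

Under these assumed figures, simulating most of the training miles leaves a human-driven remainder of between half a million and one million miles, which is the point being made.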

David Yakobovitch

Okay. That’s a super exciting startup because I remember just a couple of years ago, Mr. Zimmer from Lyft said, we’re going to be autonomous. We’re going to be self-driving by 2025. And here we are in 2019, we’re almost there. I know a lot of what you do is around research and trends and seeing the direction of an industry. Where do you think we are with autonomous driving?

Jeremy Kaufmann

So I think one of the very challenging things in the world of AI is that experts, who are truly the most knowledgeable people in their field, have a pretty poor track record of trying to nail down timing. And the world of AI investing, and the broader world of AI, is all about understanding timing risk. And if you look historically, a lot of the experts have just gotten this miserably wrong. If you go back to the 1955 Dartmouth Conference, a lot of the technologies using computer vision and natural language processing that were discussed then only came to fruition in 2014, 2015. So you have historical evidence of people being 60 to 70 years wrong.

So what we say to investors is, it’s not good enough to get the mega-trend right. Yes, autonomous driving is a thing, it’s going to be the future, but it behooves us to get the exact timing right. And to that extent, we’ve always said that going from zero to 95% in terms of accuracy is one step function. But to go from 95% to 99.9% and get all the corner cases is a much harder job. So our perspective around Cognata was that it’s an on-ramp to the OEMs and the startups getting to the level of safety required. And we think that Cognata is a way to move responsibly into the mode of autonomous driving.

David Yakobovitch

And what are some of these corner cases that Cognata is working through? For example, I know Tesla is another case. They announced earlier, in April 2019, about corner cases that, oh, we’re doing a lot of recording with data, but we still don’t know if a car is flying in the air, that is, not on the ground. And these are corner cases that could put you at risk. What are some of the challenging ones that could be keeping autonomous vehicles from reaching critical mass?

Jeremy Kaufmann

It’s really interesting. For example, one of the common corner cases, as dumb as it sounds, is making a left-hand turn. If you remember, early on, the way these companies solved the problem was they just never made a left-hand turn; they just made a series of right-hand turns. Because left-hand turns are obviously harder, at least in the US, given which side of the road you drive on.

Other common corner cases might be pedestrians at a crosswalk. Human beings can look at them in an attempt to derive their intent, the probability that they’re going to cross the street. Machines can’t do that today. So, the intent of a pedestrian to cross: is a child going to run across the street? The human brain recognizes that if a baseball rolls in front of you, you should probably slow down because the child might come next. AI isn’t there yet.

And then the other common example we’ve seen is the perception system sometimes not distinguishing between, oh, that’s a piece of cardboard or that’s a plastic box, don’t slam on the brakes, versus, oh my God, this is a jaywalker who shouldn’t be doing this, now you should slam on the brakes.

David Yakobovitch

All these examples are so fascinating, especially the first one you mentioned, because I remember growing up in Florida that I would always go outside to get the mail from the UPS driver. And one time he told us, oh, we only make right-hand turns. And I always thought that was so peculiar. I thought, oh, this must be a gas-saving or a time-saving maneuver, which could be true with humans running the machines. But when the machines are running the machines, then it gets more complex, as you’ve described.

So these are two of the companies that I know you get to work with really closely. I want to hear about Solvvy as well. How is that helping customers understand what’s going on in the world?

Jeremy Kaufmann

So Solvvy is a company broadly in the conversational AI space. We invested in mid-2017, and what Solvvy does is it deflects, basically, customer questions. Every business generally has a customer service department, which is responsible for answering the questions of their customers. And what Solvvy says is, wouldn’t it be great if the agents responsible for responding to inbound queries could really devote their time to the hardest and most complex queries?

And you could have AI basically deflecting the easiest-to-answer questions. So when somebody asks, where is my package, or how do I reset my password, or questions of lower complexity that might not need a human agent to intervene, that’s where Solvvy really does its job. What Solvvy does is, it basically ingests information from the company’s knowledge base and then applies that knowledge base to the actual inbound query. So what we think is the beauty of Solvvy is that it’s going to respond to inbound questions not only with information; we think responding with actions is actually more important.

It’s great to tell somebody, this is how much money there is in a bank account, for example, in the world of conversational AI. But the step-function improvement is to say, here is how you automatically send a wire transfer. So we’re excited about Solvvy not only answering questions, but automating actions.

David Yakobovitch

Would Solvvy be a product that I would use through, say, an Amazon Alexa, or would this be something that if I worked in a call center, perhaps that they would be empowering the call agents to help me with my concerns?

Jeremy Kaufmann

So Solvvy, in today’s form, is actually a customer-facing product. Meaning, if you actually go on to a given business’ help site and you press contact me or send a question, Solvvy pops up right on that screen. And Solvvy tries to deflect the question before it even reaches the call center, the customer service agent. So Solvvy likes to try to get ahead of the query itself and likes to deflect right at the point of being asked, at the source.

David Yakobovitch

So if I’m using a website and my data is being tracked, that data is being used to make recommendations on things that I may have questions about before I even know that they’re questions I have?

Jeremy Kaufmann

Absolutely. What Solvvy has shown is that if you can deflect questions at the origin, you can not only reduce the cost of responding to questions, but you can actually increase the percentage of times that a given action is taken. So take the likelihood of a customer pressing buy on a shopping cart or requesting an item to be sent to them: if you can just solve the question right at the point of origin, you increase the completion rate of the action. And as venture investors, we’re always drawn to businesses that really increase revenue rather than just decrease cost.

David Yakobovitch

That makes a lot of sense. And deflection is such an interesting word to use in the sense of saving, and helping make it a better situation for both the customer and the company. I find that fascinating. I’m wondering what data points would have to be captured to help predict what questions will be asked. So is this like standard cookies and things being tracked from Google Analytics and everything on the website? I don’t want to know the secret sauce in Solvvy, but what would empower that engine to be able to effectively respond to a customer?

Jeremy Kaufmann

So Solvvy broadly uses natural language processing. It’s less about the actions a customer takes on a given website. It’s more that, at the time they submit the question, the technology behind Solvvy allows Solvvy to parse the question and basically compare that language against a company’s knowledge base, and a knowledge base is basically, think of it as the FAQ page. So one of the key learnings about Solvvy is that customers are often asking questions where the answer is pretty easy to find if they simply were to look for it.

And Google does a great job of searching the world and looking up websites, but there’s a difference, which is on-site search. A lot of these tools just aren’t very effective: customers have a question, they type it into the on-site search, and they don’t get an answer. Then they get confused, and that forces them to ask the question to a human being. And what Solvvy says is, if we could just deflect the question and basically direct the customer to the right answer, they would never have to ask the human in the first place. So it’s not so much about predicting based on the customer’s actions on a website, although that’s exciting. It’s more about NLP technology.
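
The general idea of comparing an inbound question against FAQ-style knowledge base entries can be sketched in a few lines of Python. This is only an illustration using bag-of-words cosine similarity, which is an assumption chosen for simplicity; it is not a description of Solvvy’s actual technology. The names `best_answer` and `FAQ`, the sample entries, and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def best_answer(query, knowledge_base, threshold=0.3):
    """Return the answer of the closest FAQ entry, or None to hand off to a human."""
    q = Counter(tokenize(query))
    best_score, best_entry = 0.0, None
    for question, answer in knowledge_base:
        score = cosine(q, Counter(tokenize(question)))
        if score > best_score:
            best_score, best_entry = score, answer
    return best_entry if best_score >= threshold else None


# Hypothetical knowledge base entries, for illustration only.
FAQ = [
    ("How do I reset my password?",
     "Use the 'Forgot password' link on the sign-in page."),
    ("Where is my package?",
     "Open 'Orders' and select 'Track shipment'."),
]

print(best_answer("I forgot my password, how can I reset it?", FAQ))
print(best_answer("What is your refund policy?", FAQ))  # no close match: None
```

A question that closely matches an entry is answered immediately, which is the deflection described above, while a question with no close match returns `None` so that it can be routed to a human agent.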

David Yakobovitch

That’s incredible. On one of our podcast episodes, I was talking with guest speaker Mark Sears from CloudFactory, and we were talking about how the two areas of AI that hold the most promise for massive adoption with consumers are computer vision and natural language processing. And looking at the first few ventures that we’re talking about here in the episode, KeepTrucking is very focused on sensors and automation and understanding the world around us, Cognata on simulation of that data, and Solvvy on natural language processing. The fourth venture that you also do a lot with at Scale Venture Partners is TechSee, and they do some work with computer vision. So let’s hear, what are they up to today?

Jeremy Kaufmann

So we’re very excited about TechSee. TechSee is, actually, a company in Israel. Scale Venture Partners sits in San Francisco, but invests internationally. So TechSee is an example of some of that international investment. What TechSee basically does is, think of it as the digital instruction manual for the 21st century. So in the old world, if you had a question about installing a camera or a hardware device or your set-top box on your TV, you probably looked it up in the instruction manual, or you called the customer service agent. But the core problem is that the customer service agent never saw what you were seeing in the real world.

So they’d ask you, what model type is it? Have you tried to plug the blue cord into the specific port? You don’t know that. So what TechSee does is, it uses your camera right on your phone. You simply point the camera at the object in question, and all of a sudden, the customer support agent has eyes into the real world. So they can actually help you in that installation process.

The future of the product is basically self-help installation. So rather than looking at an instruction manual as you put together a complex piece of hardware, you can use computer vision and augmented reality to actually self-serve an installation, which we think is going to be the future.

David Yakobovitch

It’s incredible to see how fixing the world around us can be powered by the world around us, but from a digital space. And it sounds like TechSee is using augmented reality. So what’s your take on augmented reality to better understand that documentation versus maybe virtual reality?

Jeremy Kaufmann

Sure. So let’s be very honest and clear about what TechSee is doing today versus what they’ll do in the future. So today, what TechSee does is the agent would actually send an image: this is the port you plug it into. And we think in two or three years, as AR technology becomes more evolved and more able to be used in the real world, TechSee is going to transition from sending diagrams to creating that AR overlay.

Now, we like TechSee as a bridge into AR investing. As I said earlier, oftentimes venture is about understanding timing risk, and AR and VR are technologies where, ever since 2013 and 2014, the common saying has been that next year will be the year.

And what TechSee is saying is, we think that this is a trend that’s going to play out in the next five to seven years. And there are a lot of common issues that need to be solved before AR becomes ubiquitous: understanding the real world, basically the tracking, and making humans comfortable with those displays. So we think that TechSee was an interesting bridge into the world of AR.

David Yakobovitch

That’s super fascinating, looking at all four of these investments, from KeepTrucking and Cognata to Solvvy and TechSee. They’re all, in their own right, parts of the AI game, and AI investing is growing very fast. And you mentioned, Jeremy, timing risk, and also that there’s a thesis to investing, especially in the AI space. I’d love to understand from your perspective how you look at AI investments, even in some of the subfields like conversational AI.

Jeremy Kaufmann

So that’s a very interesting question. It’s very broad. So what we do very well at Scale Venture Partners is we take a fundamentally thesis-driven approach to the world. So we start with the mega-trend, which is some ubiquitous technological trend that is going to have an outsized impact over the next decade or so. So let’s say that that’s artificial intelligence. What we do next is we say, okay, what are the markets where that impact is truly going to be greatest? Because you could spend days or weeks going through market maps of AI companies. Guess what? There are hundreds of them. And as an investor, thinking about market maps of hundreds of companies is going to be a really hard way to do it.

So what we say is we look for verticals and industries where we think the promise is highest. So I’d say broadly, it’s a four-point framework for us. First, we start with the traditional business fundamentals. So AI companies are just regular software companies. We’re interested in market size. We’re interested in financial traction. We’re interested in how well they solve customer pain points and ROI. And then there are a couple of interesting elements to investing in AI which are different from just general SaaS investing.

To outline three of them: the first is that AI is fundamentally a probabilistic technology and not a deterministic one, meaning it’s going to make errors, and business buyers aren’t necessarily comfortable with buying a product that’s going to make errors. They don’t understand how many errors are too many and how wrong a product can be. So error rate is super important. We talked about autonomous driving earlier. It’s also very important in industries like the legal industry or the medical industry, where the tolerance for errors is very low. Why do lawyers bill $1,500 an hour? Because customers want to make sure that they don’t make a mistake.

So AI and its tendency to make mistakes is one of the fundamentally challenging problems in the discipline. The second element that’s unique to AI companies is the idea of proprietary data advantage and building a sustainable data moat. These companies have to compete against some of the largest cloud vendors in the industry, like Google and Amazon and Microsoft, and venture investors are typically asking, what is it about their data and their business model that’s going to allow them to be a standalone business? And then, third is a question of engineering talent. At the moment, the going rate for some of these top-notch computer vision engineers is millions of dollars. And it’s a real race for talent. And talent is truly a differentiator in some of these companies.

David Yakobovitch

It’s incredible to see how the AI industry is evolving so rapidly. And you mentioned that you invest behind big trends and AI is one of those big trends. Are there any predictions you’re starting to see for the industry, including development in human and AI relationships?

Jeremy Kaufmann

In the past, there’s always been a little bit of a divide. It’s an either-or statement, either humans or AI. In the immediate term, what we’re seeing is this combination. Basically, AI plus human beings are more productive and able to do things that either individual actor alone couldn’t do.

For example, we’re investors in a company called Unbabel in the business translation space. And the premise of Unbabel is that Google Translate is great for the horizontal, for the broad, but if you’re trying to translate for a specific business, you need to have a domain-specific understanding. And human beings are a way to get that specific domain understanding. So a lot of the commentary has always been human versus AI.

Going forward, there’s going to be more comfort with humans and AI working together. We’re seeing that in a lot of different markets: in the medical world, with AI assisting doctors when it comes to making diagnoses. In the world of physical security guards, think about going through the TSA. It’s not inconceivable to think that within the next few years, TSA agents will have somebody looking over their shoulder and really supporting them. So the human and AI trend is something I’m very excited about.

And then the second trend going forward, that I’m also thinking a lot about is the amount of data required to make predictions. And the reason why that’s important is because the world of AI to date and deep learning is all about massive quantities of data. And that’s great from an academic perspective, but in the real world, sometimes you don’t have the necessary data to get what you want.

So that creates a couple of interesting business problems. There’s the cold start problem: how companies go from zero data to lots of data. It creates the idea of data acquisition as a unique differentiator. And what I’m thinking about in the long run is, as AI algorithms get better and transfer learning improves, and the world of synthetic data erodes some of the value of having an original proprietary data advantage, business models and how companies solve this cold start problem are going to change.

David Yakobovitch

That is one of the crutches that has been holding back a lot of the companies. And when I spoke to Mark Sears from CloudFactory, he mentioned that a lot of companies come to his organization because it, in essence, has these teams that empower the data. He calls them work streams: cloud workers who go through computer vision and text analysis tasks and help classify and work through the data so it can be better understood.

This is so valuable because the cold start problem is that if you don’t have data, where do you get it from? Do you just go to Google Images and scrape lots of images without authorization, or do you create simulated data, somewhat like how Cognata is doing that? And that is interesting. Would you have any recommendations on how startup founders or other ventures could think about data to overcome this problem?

Jeremy Kaufmann

So I would say, when I think about techniques to overcome the cold start problem, there are broadly five different techniques in my mind. Technique number one is what I call SaaS first, AI second, meaning you sell a first product that doesn’t actually have AI capabilities, but whose main goal is to gather the data necessary for that second product to come. So TechSee is a classic example. Product number one is all about helping people, helping customer support agents, but the big beneficiary of product number one is that second product, that self-serve product. So, premise number one: start with SaaS, go to AI.

Number two is publicly scraping data. We talked about that earlier. Number three is the idea of offering price discounts to your customers. Basically, offering deals and price discounts is a way to get that proprietary data and then combine it with the publicly gathered data. Fourth is partnerships, mainly with external agencies. If I’m building an AI company in the healthcare world, who do you think has the data? It’s the hospitals. Or if I’m a security company and I’m trying to understand what shoplifting looks like or what fighting looks like, I probably need to partner with a police agency because they’re the ones with the data.

And the fifth technique, which is becoming more ubiquitous now, is this world of synthetic data: creating your own data to solve the cold start. And this world of synthetic data is going to be a very interesting approach because you no longer have to have the largest quantity of data to be successful. It basically gives startups some more freedom and wiggle room to compete against those cloud giants.

David Yakobovitch

When you look at artificial intelligence and computer vision, one of the classic examples I like to share is looking at fruit and knowing whether the fruit is ripe or rotten. If you were to take an orange or a tomato, you can actually just use the Python programming language and, with a few lines of code, change that image to make it darker or lighter, or have spots or not have spots. In essence, would you say that is synthetic data at its finest?

Jeremy Kaufmann

Absolutely. I love synthetic data and the ability to adjust for position and quality of light. Also, just gathering images in difficult-to-reach environments that you generally don’t have access to in the real world. So maybe images that are in a dangerous location or underwater. The real world is full of these human complexities around gathering data. So the ability to simulate it is going to be one of the major trends I’m looking for in 2019 and 2020.
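
The fruit example can be sketched with nothing but the Python standard library. This is a minimal illustration that assumes a tiny grayscale image stored as a list of rows of 0 to 255 pixel values; real projects would more likely use an imaging library such as Pillow or NumPy. The helper names `adjust_brightness`, `add_spots`, and `augment` are hypothetical.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping the result to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def add_spots(image, count, spot_value=30, seed=0):
    """Return a copy of the image with up to `count` dark single-pixel spots."""
    rng = random.Random(seed)          # seeded so the variants are reproducible
    out = [row[:] for row in image]    # copy rows so the original is untouched
    height, width = len(out), len(out[0])
    for _ in range(count):
        r, c = rng.randrange(height), rng.randrange(width)
        out[r][c] = spot_value
    return out


def augment(image, seed=0):
    """Produce several synthetic variants from one source image."""
    return {
        "original": image,
        "darker": adjust_brightness(image, 0.6),
        "lighter": adjust_brightness(image, 1.4),
        "spotted": add_spots(image, count=5, seed=seed),
    }


# A uniform 8x8 gray patch stands in for a photo of a piece of fruit.
fruit = [[200] * 8 for _ in range(8)]
variants = augment(fruit)
for name, img in variants.items():
    print(name, img[0])
```

One source image yields four training examples here, which is the sense in which simple transformations can stretch a small dataset.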

David Yakobovitch

So speaking of synthetic data and difficult-to-reach environments, one of those cases you mentioned earlier is shoplifting. And you’re seeing today, with Amazon Go and a lot of new companies, work on how to solve such a challenging problem, which involves these different spaces and an event that occurs very infrequently. Synthetic data could help in this scenario as well. What do you think about autonomous shopping and solving the shoplifting dilemma?

Jeremy Kaufmann

So let’s take those two problems separately. First off, let’s talk about shoplifting, and then we’ll get into stores. So it’s interesting. The last four or five years, we hear the term image recognition thrown around often. This is a commonly understood problem in the computer vision space. Here’s an object: classify it, detect it. Is it a dog? Is it a cat? That is what I call object detection. When I think about recognizing things like shoplifting or recognizing a car crash, that involves something more, which is what I call action recognition, which is thinking in three dimensions.

It’s also about putting time into the equation, because, for example, in shoplifting or detecting unusual behaviors in the security world, if somebody is loitering in front of a building and they stand there for one second, that’s probably not loitering. They’re just tying their shoe. But if it’s 2:00 AM and they stand there for 12 minutes, and you understand things like, oh, it’s two in the morning, people shouldn’t be there, or, oh, why has it been 14 minutes? If you add that element of time to the equation, it gets you a step function higher. So it’s going beyond basically object detection to action detection.
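
To show how adding time changes the question, here is a small rule-based Python sketch. It is not a learned action recognition model; it only illustrates combining dwell time with time of day on top of per-frame detections. The `Sighting` type, the `flag_loitering` function, the 10-minute threshold, and the midnight-to-5 AM window are all hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sighting:
    """One per-frame detection of a tracked person (hypothetical structure)."""
    person_id: str
    timestamp: datetime


def is_night(ts, start_hour=0, end_hour=5):
    """True when the timestamp falls inside the assumed quiet-hours window."""
    return start_hour <= ts.hour < end_hour


def flag_loitering(sightings, min_dwell=timedelta(minutes=10)):
    """Flag people seen for at least `min_dwell`, starting during quiet hours."""
    first_seen, last_seen = {}, {}
    for s in sorted(sightings, key=lambda s: s.timestamp):
        first_seen.setdefault(s.person_id, s.timestamp)
        last_seen[s.person_id] = s.timestamp
    flagged = []
    for person_id, start in first_seen.items():
        dwell = last_seen[person_id] - start
        if dwell >= min_dwell and is_night(start):
            flagged.append(person_id)
    return flagged


two_am = datetime(2019, 5, 1, 2, 0)
two_pm = datetime(2019, 5, 1, 14, 0)
sightings = [
    Sighting("A", two_am), Sighting("A", two_am + timedelta(minutes=12)),
    Sighting("B", two_am), Sighting("B", two_am + timedelta(seconds=1)),
    Sighting("C", two_pm), Sighting("C", two_pm + timedelta(minutes=12)),
]
print(flag_loitering(sightings))
```

A single frame looks identical for all three people; only the duration and the time of day separate the person tying a shoe from the person standing around at 2:00 AM.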

And then, the second question you asked was my thoughts on autonomous shopping. And basically, to date, what we’re seeing is really efforts to constrain the problem. When you think about successful AI implementation, the way to be successful is to first constrain the problem and then, secondly, allow for the general case. So what people are doing very well there is, they’re limiting the number of people walking into the Amazon Go store. They are limiting the number of SKUs being sold, and they’re basically constraining the size of the store.

And the reason why this is happening is because the cost of the hardware is extraordinary. The business economics of this is you’re only going to automate your store if it’s cheaper than having a cashier. And today the cost of the hardware and the sensor is just dramatically more expensive than one would think. So the most successful companies in this space right now have constrained the problem. It’s actually more constrained than a store. The best ones that I’ve seen are basically doing vending machines. So you start with constraining the problem.

And in the next 5 to 10 years, you’re going to see a reduction in the cost of sensors. And you’re going to see this technology going to larger and larger stores. So, over the next 5 to 10 years, I’m looking forward to the price of sensors coming down. But unfortunately, in the immediate term, if the sensor prices don’t come down, the economics really aren’t going to be there to get that cost savings. So for us, it’s all about the business case and the economics, not only about the AI.

David Yakobovitch

And that’s super interesting because from a sensor perspective, one that has come down in cost is cameras. And that’s leading to the ubiquity of cameras from security cameras and passive recording devices. And that’s going to allow this whole computer vision, fourth industrial revolution to come about.

Jeremy Kaufmann

Totally. Cameras are a great example of a sensor where five or six years ago, people were talking about cameras as one of potentially six or seven interesting sensors in the world of IoT, or internet of things. And then what’s happened is the price of the camera has come down so dramatically that now it’s just cost-effective to put cameras where they’ve never been before.

An interesting industry that I’ve been spending some time in where you might not think cameras and AI are core to the industry’s success is actually the construction industry. This is one of the oldest industries in America. It’s an industry that has not seen the productivity improvements that you’ve seen in sectors like manufacturing and other real world industries.

And what we’re seeing in the world of construction is, all of a sudden, the largest construction sites now have cameras at the entryways to basically track people coming in and out, and to track the trucks carrying materials like sand and gravel and bricks coming in and out. So as cameras get cheaper, they’re just going to show up in newer places, and there are going to be some really exciting applications that you might not have ordinarily thought about.

David Yakobovitch

Now, let’s also look at some more surprising industries, not only in construction sites, which is important in New York City. There’s so many buildings coming up. In fact, I live nearby what would have been one of Amazon’s new buildings for HQ. And so you do see cameras outside every single building. But there are other frontier technologies also that are surprising, where AI is getting involved, drones and robotics. What excites you about these industries?

Jeremy Kaufmann

Totally. First off, our firm has been very excited about drones and robots. We’ve actually made three investments in the space over the last three years. We invested in DroneDeploy two or three years ago, a software layer for operators of drones, helping pilots plan their flights, helping with photogrammetry. And we’ve been really excited about that company’s progress. And then in the world of robotics, we invested in Locus Robotics, robots for the e-commerce warehouse, and Soft Robotics, soft gripping technology. So broadly, what’s going on here is software’s entering the physical world in a way that we haven’t seen before.

So what excites us in the robotics world is, first off, robots are simply becoming smarter than they have been in the past. They are able to grip items more dexterously. In the past, human beings could pick up a piece of fruit, but a robot, no way. So we’re really excited about that trend. And what we look for in the world of robotics is, obviously, people are coming up with interesting applications of robots. Boston Dynamics has great jumping robots. They can bound up and onto tables, over chairs. It’s amazing. But what we’re excited about is use cases where the ROI is really here and now.

And one thing we love about robots for the e-commerce warehouse is it’s just such an obvious use case. Amazon purchased Kiva several years ago to make their warehouses more productive. And we think as robotics broadly takes over, and as e-commerce grows in its ubiquity as a percentage of all items purchased, robots in the e-commerce warehouse driving productivity is super logical. So we are broadly excited about the space. And, for us, there are many cool technologies and robots can do different things, but it’s really about where the robots are going to be most reasonable, cost-saving, and productive for the business.

David Yakobovitch

Now, looking at all of these ventures, from those in automation and AI to drones and robots, one of the primary goals of any venture firm is to see the success of every investment, and ultimately to lead each and every venture to an exit. And whether it’s consumer cases or enterprise cases, every exit looks different, and you’ve had many exits with your portfolio. What are your thoughts on how AI companies can scale to have a successful exit?

Jeremy Kaufmann

So I would say again, broadly, of the exits to date, the largest IPOs, most of these are not what people would call traditional AI companies. However, this AI technology is the next generation. And when our firm, Scale, thinks about AI and getting to an exit, we want our companies to IPO.

That is what we’re going for. And that’s what we want our portfolio CEOs to go for. There’s a couple of items that make scaling an AI startup different than scaling other startups. So point number one is that oftentimes, the sales process in selling an AI product is just really hard because, right now, AI is somewhat of a black box. It’s not very explainable. So you need to sell your product in such a way that you get someone else to trust your product.

We made investments in Forter, which is basically detecting credit card fraud, and Socure, which is helping banks ensure customer identity. And with both these products, there were internal teams responsible for building models. And what these two companies do is they say to these teams, ‘We have a third-party model and third-party data that can essentially really drive business productivity’, and it’s all about getting those customers to trust you. And it’s very hard to do that because customers are going to ask, well, what’s your error rate and what drives the model? And unfortunately, sometimes these models are black boxes, so it makes the sales process hard. Another element we think about is the role of data, and data moats over time.

So we talked earlier in the podcast about bootstrapping data and five different strategies there. When these AI companies grow up, what we tell our companies is that AI data moats and data network effects are not always going to drive the long-term success of a business. You still need to invest in all the other drivers of differentiation that later-stage companies have. So if you’re an AI company, unfortunately that data advantage might not hold over time. So what can you do about it? You should be thoughtful about your data acquisition strategy. Just mopping up more data isn’t necessarily going to make your algorithm the winner. You should think about verticalization.

You should think about sales force effectiveness. You should think about all the other ways that enterprises scale, possibly moving upmarket. So what we say to our companies is, we invest in companies broadly at the same stage, we have a set of playbooks around go-to-market strategy, and we help our companies go through that phase. We are mindful that there are certain unique challenges that come from being an AI startup, but broadly, the goal is the same. We want to invest in the most ambitious founders and we want to help coach them to IPO.

David Yakobovitch

That’s fantastic. And if we’re looking particularly at AI, it is such a fast-growing market. Yes, selling the product, as you mentioned, is very challenging. There are playbooks and go-to-market strategies to help founders do that more successfully at scale. But beyond selling and beyond data moats, for AI and AI companies, what have you seen are some of the common misconceptions around AI that we can help our audience better understand?

Jeremy Kaufmann

Unfortunately, there’s a common saying, and I don’t know where it came from or who started it, but it bothers me, to be honest. It says data is the new oil. I honestly couldn’t agree any less with that statement. It drives a lot of misconception in the industry. So let me give you an example. If I was a manager at an oil firm, if I worked at Exxon and my team discovered a new field of oil in the Permian Basin, if I knew the price of oil and I knew the size of the oil field discovered, I knew the value of that oil field. But in the world of data, if an entrepreneur comes to me and says, I have a data set of 3 million observations, and another entrepreneur comes and says, I’ve got 8 million observations, well, that leaves me confused, because that’s not telling me which idea is better or which data set is better. So this idea of counting observations is just wrong.

A friend of mine, formerly at Zetta Venture Partners and now at Point72, blogged about it, so I’m giving her credit for the idea. But what I’m looking at is basically the number of variables in the data set, for example, how well the data set does in covering the corner cases, and making sure that there’s enough proprietary data in that data set that it’s not going to be replicated. So this idea of an entrepreneur pitching a VC and saying, I’ve got 7 million observations, I’ve got a great company, is somewhat deceiving.

And then I’d say the second area that is commonly misunderstood is this idea that just because the problem is constrained, meaning the AI might not work overall but only works in a small, narrow subset of the problem, and just because it can’t work everywhere today, it’s a failure. And people err when they make that decision, because basically what venture investors are looking for is improvements over time.

So it’s almost foolhardy for people to make claims like, I’m going to solve autonomous driving in two years. But I’m actually excited about entrepreneurs that are thoughtful and say, I’m going to start on campuses with senior citizens, or I’m going to start on college campuses where there are fewer cars on the road. So I like entrepreneurs that constrain the problem upfront. Chatbots are a classic example. Three years ago, people were saying chatbots were going to solve all our problems and unconstrained language was the future. And the entrepreneurs today are saying, if I can constrain the problem, that gives me a better match to the customer need, to what the consumer wants. And I don’t look down upon people that constrain problems. It’s a way to show that you’re thoughtful.

And you actually recognize the technical limitations, because this is some of the most frontier technology in the world, and you need to be thoughtful about risk. And people that undermine people that constrain problems, or talk down on that, are missing the big picture.

David Yakobovitch

With chatbots and conversational AI, there was some of that hype and there were some of those empty promises, but we’re now starting to see real use cases where these products are becoming valuable, whether it’s scaling on platforms like Amazon, Google, and Microsoft, or many of the ventures that Scale has. AI is evolving quite quickly. And from your perspective, Jeremy, what are some of the other advances in AI that you think are feasible in the near future, and maybe some others that are over-hyped or empty promises?

Jeremy Kaufmann

Totally. You just actually mentioned one that we’re pretty excited about at the moment, which is the world of conversational AI. There are three large megatrends driving this. The first is just the explosion of messaging and voice endpoints: Facebook Messenger, the Alexa device in your home. The second is just overall improvements in ASR and text-to-speech technology, both in the cloud and on the device. And then the third is basically improvements in natural language understanding and the ability to handle multi-step conversations while maintaining state. So because of these three broad trends, we think that the so-called chatbot over-hype of 2015 and 2016 is now being addressed in a more feasible and reasonable way.

We actually have made an investment in the chatbot space, which has not yet been announced. And that’s our first one. And we were always very skeptical of it because we thought that it might not always be the best interface. Back in 2016, a lot of VCs were running around screaming that chatbots are the new UI, and they never paused to say, well, sometimes I want to talk through Facebook Messenger. Sometimes I want to talk through an app.

But in a different circumstance, maybe voice should be the guiding UI. For example, if I’m a doctor and I’m in a hospital room, maybe I do want to dictate my notes rather than type them into an app. So you need to go use case by use case, but conversational AI broadly is something we’re really excited about right now. So that is something that I’m definitely spending some time in.

David Yakobovitch

And going a little bit deeper into conversational AI, I know voice overall is predicted to grow massively. Reports from McKinsey and from Deloitte say that by 2030 voice is going to be somewhere between 30% and 50% of all communications and interactions with applications. What’s your take on the voice market?

Jeremy Kaufmann

So the voice market is very broad. It’s one of these markets where you call it a market, but there are a lot of really different sub-markets going on. And we try to spend a lot of time really teasing out what’s going well in the market. So, two or three years ago, Amazon came out with Alexa and it started to explode. The number of Alexa skills is exploding, but stepping back today, if you look at the usage and distribution of those skills, there’s only a couple hundred that have large-scale adoption. So most of these skills aren’t being used by consumers. So I’d say that’s an example of a trend that was running away from us.

And now we’re getting a better handle on it. You have to recognize that sometimes people want to talk to a person, not a computer. Sometimes you have to understand that, like all questions of AI, it’s a question about error rate and error tolerance. It’s just frustrating when you’re using voice and trying to order at a fast food restaurant, for example, and the machine just doesn’t work.

So what we’re thinking about voice is, we’re excited about it. We think it’s going to evolve over time, and there’s a couple of narrow domains where we’re most excited. First, the ordering market in fast food: automated speech recognition in quick service restaurants. We think that’s an interesting application where consumers want to use voice and voice technology is getting better. Secondly, we think it’s going to be very important, as I mentioned earlier, in the world of medical AI. If you look at companies like Nuance, back from the 2000s, if you actually look at some of their financial statements, half of their revenue is actually coming through medical applications.

So that’s a pretty interesting market. And then, stepping back, venture investors are often faced with this dilemma. Do I approach a problem as a fox or a hedgehog? If you go back to the broad construction of the problem from Isaiah Berlin, he basically claimed that there are certain people who are hedgehogs. They’re experts. They try to boil the world down into one concrete statement, like voice is taking over the world or voice is the new UI. There are other people that are foxes.

And they say, where is voice going to be best used? What are the best use cases? How do I predict the error rate going forward? And the team at Scale and myself, at least, I think of myself as a fox. I’m trying to overlay broad market trends with market feedback and individual use cases. So I would say I tend to err on the side of foxes. I tend not to think in those broad statements, like voice is the new UI, but rather that there are certain places where it’s going to be really impactful.

David Yakobovitch

That’s super cool. And for all of our listeners on HumAIn today, I can tell you that both the voice of Jeremy and myself are real voices. These are not built on AI yet.

Jeremy Kaufmann

But if they weren’t, how would anyone know? The trend of deepfakes is certainly something that is very interesting and is going to have to be addressed by technology. But yes, I agree with David. It is really me.

David Yakobovitch

That’s right. What if I’m really Joe Rogan and Jeremy is Tim Ferriss here and we’re just changing our voices? It could be fun, but perhaps in the future. And the exciting part about humans and machines is the more we can augment our experiences. And as we’ve learned today from your ventures, all of them are trying to augment those experiences, to take out the routine and the repetition that humans do, and to allow us to work on the more cognitively challenging experiences to better serve humanity.

So it’s amazing to hear what you’re doing at Scale. Jeremy, thanks so much for being with us today.

Jeremy Kaufmann

Thanks for having me on.

David Yakobovitch

Hey humans. Thanks for listening to this episode of HumAIn. My name is David Yakobovitch. And if you like HumAIn, remember to click subscribe on Apple Podcasts, Spotify or Luminary. Thanks for tuning in and join us for our next episode. New releases are every Tuesday.

Works Cited

¹Jeremy Kaufmann

Companies Cited

²Scale Venture Partners
