DUE TO SOME HEADACHES IN THE PAST, PLEASE NOTE LEGAL CONDITIONS:

David Yakobovitch owns the copyright in and to all content in and transcripts of The HumAIn Podcast, with all rights reserved, as well as his right of publicity.

WHAT YOU’RE WELCOME TO DO: You are welcome to share the below transcript (up to 500 words but not more) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The HumAIn Podcast” and link back to the humainpodcast.com URL. For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.

WHAT IS NOT ALLOWED: No one is authorized to copy any portion of the podcast content or use David Yakobovitch’s name, image or likeness for any commercial purpose or use, including without limitation inclusion in any books, e-books, book summaries or synopses, or on a commercial website or social media site (e.g., Facebook, Twitter, Instagram, etc.) that offers or promotes your or another’s products or services. For the sake of clarity, media outlets are permitted to use photos of David Yakobovitch from the media room on humainpodcast.com or (obviously) license photos of David Yakobovitch from Getty Images, etc.

Welcome to our newest season of HumAIn podcast in 2021. HumAIn is your first look at the startups and industry titans that are leading and disrupting ML and AI, data science, developer tools, and technical education. I am your host David Yakobovitch, and this is HumAIn. If you like this episode, remember to subscribe and leave a review. Now onto our show. 

David Yakobovitch

Welcome back, listeners, to the HumAIn podcast, your channel to launch AI products, discover tech trends, and augment humans. Today on the show I'm bringing on our guest, Stephen Miller, who's the Co-founder and Senior Vice President of Engineering at Fyusion, a Cox Automotive company. Stephen has worked in the industry with 2D and 3D data, computer vision, and deep learning, and is part of the next wave of artificial intelligence. Stephen, thanks so much for joining us on the show.

Stephen Miller

Thanks for having me.

David Yakobovitch

To start out, let's take a step back. Can you share with our audience a little bit about your background, what got you to become a co-founder of Fyusion, and how you're now part of Cox Automotive?

Stephen Miller

I actually started in robotics at UC Berkeley when I was an undergrad, back in 2010, and we were working in this lab that had this vision of automating mundane tasks, basically having robots do things that humans wouldn’t want to do.

The two key things we worked on were a surgical robot, so having a robot that ties sutures, and personal robotics, so having a robot that would fold laundry, put it in the washing machine, put it in the dryer. That was really fun to explore, especially as an undergrad. But what we kept beating our heads against was that it was relatively easy to make a robot move in the way that we wanted, to learn how to do a folding motion, but it was very hard to make a robot understand the world it was moving around in. Understand, like, what is a piece of laundry? How do I know what I'm looking at?

So I started focusing in grad school on that task of, basically, how do we understand the world in 3D. The most traditional methods that the computer vision community used back then were very 2D-focused: the world is a great wall of pixels of different colors, let's see what we can understand from that. But in order to make robots work, you need something that can move around in a 3D world and not bump into things, and that led me to focus pretty hard in grad school on 3D understanding and 3D perception.

A few years into this process, new hardware started coming out. The Microsoft Kinect made a big splash around 2011-2012, promising that we would actually have easy-to-use hardware that could give us 3D data, the kind of thing that we would have mounted on a robot in a lab. And that led me and my co-founders to think: what if we started a company in this space? What if we tried to take commodified hardware that everyone already has, namely the mobile phone in your pocket, and build a company that uses all the technology we came up with for robotics to power things people can actually use today, rather than waiting for a half-million-dollar robot to be available on the market?

So, basically, we went from working on high-profile robots to working on smartphone apps, trying to make technology that lets people image and understand the world in 3D by walking around it with a camera. And as newer and newer sensors come out, like the LIDAR sensor on the iPhone or stereo cameras, our company has just kind of grown up with them.

David Yakobovitch

There are a few things that I want to unpack, and that I really loved as you shared them, Stephen. First, the Microsoft Kinect. I remember owning the Xbox 360 and having a Kinect, and it was really fun in those early days to be able to move around and see myself on the screen. This was pre-Oculus days, and it was so fascinating.

Stephen Miller

Very pre-Oculus days. When the Kinect came out, many people saw it as a video game tool: track your posture in 3D, and you can use it to dance or control a character. But the research world, at least for quite a few years, started using it everywhere. We had drones with Kinects mounted on them. I had robots with Kinects mounted on their heads. Because it was this really inexpensive piece of hardware that gives you 3D data and images together. And it was kind of a game changer in the world of research in a way that I'm not sure people who don't work in robotics realize, just how much progress we made.

David Yakobovitch

I remember that the Kinect originally came from an Israeli company, until Microsoft partnered with them, in a joint venture, to bring that technology to market. So it was really interesting to see that. And of course, after the Kinect, Oculus came out, Nintendo came out with their own version for the Wii, and other companies have followed since then.

Of course, that's given rise, fast-forwarding to where we are in 2021, to our enlightened futurists. Elon Musk is coming out with the Tesla humanoid; we saw him talking about that just a few months ago. Of course, it was a human in a costume pretending to be a robot. But I think back to all these videos I've seen on YouTube of Boston Dynamics and these parkour robots. The example that you gave us earlier, Stephen, is that they're thinking in 2D, but they don't understand the context of the world. And then I see these videos of robots doing somersaults and navigating a course, and I start to think: how have we made that much progress in the last decade?

Stephen Miller

So as a roboticist, I have to be a little bit of a healthy skeptic. I remember 10 years ago, when we were asked in interviews how soon we would have robots in the house doing chores, we said, pessimistically, five years. That was 10 years ago, and I don't have a robot in my house. I do think technology has come a very long way. With the rise of deep learning, especially as people have learned how to apply it to multiple sensor streams, some of the stuff is very exciting and very real.

Self-driving cars really are on the road and being used every day. Boston Dynamics has so many interesting demos that show the robots must be reasoning in 3D with some kind of spatial awareness. And I think of drone companies, Skydio comes to mind as an exciting one, where they have drones that are really clever at being able to follow you, film you, and avoid obstacles at the same time.

So the hype is real to an extent. We are getting to the point where robots can understand the world in 3D. What is tough is when what the robot has to do isn't just moving itself around, like walking and not hitting things, but delicately interacting with the world in some way: understanding that this is this model of car and there's a little scuff on the paint right here, or that this is a piece of clothing, this is a t-shirt that is this large, and if I gently lift it up and spread it like this, I'll be able to fold it. I think that kind of really delicate reasoning is still pretty far out.

David Yakobovitch

I wonder if we are really just solving for the TikTok generation, or are we solving for good use cases to make humanity and the world a better place? I think about one of our portfolio companies at DataFrame Ventures, Embodied, which created Moxie, one of Time's inventions of the year for 2020. Basically, it's this little robot that could go into hospitals and kids' rooms. It's really simple: it's a robot on wheels, it's connected to the cloud, it has an OLED display, and the robot shows different emotions and moves around, and can actually, in real time, take the input that the child shares and give back some feedback: I'm happy, I'm sad, or a lot of other really interesting cases. So they're starting to get there, but I liked your skepticism there, Stephen.

I don't know if Elon's dancing humanoid robot is there today. I don't know if I want to place a long bet on whether it will be there in 2025, 2030, or 2050. But I was talking to my dad about this a couple of weeks ago and I said: Dad, are you ready to live in the age of the Jetsons? And my dad said: David, I've been waiting for this since the 1980s.

Stephen Miller

This is a thing we've been dealing with forever, especially in the world of robotics: any demo that we make is not just juxtaposed against the state of the art, it's juxtaposed against Terminator, against things that people have been seeing in popular entertainment for decades. That gap between what people want their technology to be able to do, what movies have kind of taught them it can do, and the real world is pretty wide. But I do think there are great applications, to your point.

When you talk about Embodied, and what we can use today, that keys in on an important aspect, which is that you don't need to make a technology that solves everything perfectly. You need to understand its limitations and have it interact with people to do the rest. So in this case, having a robot that drives around, interacts with children, displays things, assesses emotions: that's great. I imagine when they built that, they came up with lots of things on top of the pure robotics to make it a great user experience. That makes me really excited, when we think about technology interacting with the human element, rather than just having perfect automation, having a Terminator running around.

David Yakobovitch

That's right. One thing that I'd like to dissect further, which you shared in that example, is that the technology needs to be better, but not perfect. It's looking to do something better. One of the portfolio companies that we're also excited to have invested in is HyperSpec AI, out of San Francisco, and they're actually taking this self-driving approach of: we are going to work on the maps, which traditionally haven't been mapped very well.

Typically, when you're using LIDAR and 3D technology, you might only get something like 5% of scenarios mapped, because of all the miles and the amount of data required. They're augmenting that with synthetic data, edge computing, and other devices. So I'm sure you have a strong opinion on this. When you're thinking about the hardware and the software, there is a gap. The gap might be a data gap. The gap may be a theory-and-practice gap. Can you start unpacking for the audience a little bit about the gap that we're hearing about in these scenarios?

Stephen Miller

Absolutely. So this was one of the reasons I was excited to take a leave of absence from academia, which is at least eight years and counting now, and go into industry: it did seem like there was a large gap between theory and practice. Basically, in the academic world, when you have to publish, you need to evaluate on a dataset of some sort. And typically that means that someone in a lab collects a ton of photos, or they aggregate photos from Google or something, and they run on those datasets.

So you want to see how well your algorithm can detect people. You run it on a data set of 10 million images of people. You see that it works 98% of the time. And you say: we’ve solved people detection. That’s it. But as you point out, data is very important. In particular, when you have to solve a problem in the real world there’s a huge gap between the kind of data we tend to collect for academic purposes and the kind of data that would be required to really solve something in practice.

You mentioned self-driving cars, and that is a key point to focus on. The companies that have historically done very well, Waymo at least, were pretty far out in front; today that's a bit more in question. But the reason they were is because Google had mapped so much of the country in 3D already. They have just so much data that they had painstakingly collected.

Now we're seeing more clever ways to kind of augment that, to share datasets and make it so the rest of the world can catch up. It's really exciting, because one key thing when we think about deep learning and computers is that we're often, again, thinking in that pixel space, about a flat 2D image: looking for a square that looks vaguely like a stop sign, or something that looks vaguely car-like.

In practice when you’re actually moving around in the world, whether you’re a robot or a person holding a cell phone or a self-driving car, what you’re really seeing are so many varied angles of the thing.

You're seeing a combination of the sun shining on the world and reflections on the windshield of a vehicle, and how it looks as it turns slightly. And all that kind of nuance is really difficult to reason about if you limit yourself to 2D. So I'm very excited about this world of collecting 3D datasets and teaching our algorithms to kind of learn more in that space instead.
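
To make that 2D-versus-3D distinction a little more concrete, here is a minimal sketch, assuming a standard pinhole camera model, of how a single depth image from a sensor like the Kinect or a phone LIDAR can be lifted out of flat pixel space into a 3D point cloud. It is an illustration only, not Fyusion's pipeline; the intrinsics and depth values below are made up.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a depth map (H x W, metres) into a 3D point cloud in camera coordinates."""
    h, w = depth.shape
    # Build the pixel grid: u is the column index, v is the row index.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1)   # H x W x 3
    return points[depth > 0]                # keep only pixels with a valid depth reading

# Example: a fake 4x4 depth map at 2 metres with made-up intrinsics.
depth = np.full((4, 4), 2.0)
cloud = backproject_depth(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(cloud.shape)  # (16, 3)
```

Once the scene lives in 3D coordinates like this, reasoning about geometry, occlusion, and viewpoint changes becomes much more direct than it is on raw pixels.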

David Yakobovitch

It sounds like that space does include a lot of deep convolutional neural nets, perhaps things like 3D CNNs, and even more advanced methods than that.

Stephen Miller

Absolutely. One recent development that has gotten me pretty excited in the academic world is that last year a few friends of ours, including Ben Mildenhall, published a paper at ECCV called NeRF, or Neural Radiance Fields, for view synthesis.

I'm not sure how much this has crept into industry culture outside of pure academia, but in academia it's been huge. We see Google Research, Facebook Research, everyone suddenly focused on how we represent the world in 3D. How do we represent 3D space with neural networks? And I would hazard a guess that the reason this is happening is because people are realizing that if we can compactly understand the world in 3D, that is going to be a really rich dataset to reason about, as opposed to the 2D photos that deep CNNs have kind of been brought up on.
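
For a concrete picture of what "representing 3D space with neural networks" looks like, here is a heavily simplified sketch of the NeRF idea in Python with PyTorch: a small MLP is queried at 3D points and viewing directions and returns a color and a density, which a volume renderer would then integrate along camera rays. This is only an illustration of the paper's core idea, not the authors' code; the layer sizes are arbitrary, and positional encoding and the rendering step are omitted.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """A minimal NeRF-style field: (x, y, z) plus view direction -> (RGB color, density)."""

    def __init__(self, hidden=128):
        super().__init__()
        # Real NeRF applies a positional encoding to xyz first; omitted here for brevity.
        self.backbone = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)       # how much "stuff" is at this point
        self.color_head = nn.Sequential(               # color also depends on view direction
            nn.Linear(hidden + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        feats = self.backbone(xyz)
        density = torch.relu(self.density_head(feats))
        color = self.color_head(torch.cat([feats, view_dir], dim=-1))
        return color, density

# Query the field at a batch of random 3D points and viewing directions.
model = TinyNeRF()
points = torch.rand(8, 3)
dirs = torch.nn.functional.normalize(torch.rand(8, 3), dim=-1)
rgb, sigma = model(points, dirs)
print(rgb.shape, sigma.shape)  # torch.Size([8, 3]) torch.Size([8, 1])
```

The appeal is that the whole scene gets compressed into the weights of one small network, which can then be queried from any viewpoint.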

David Yakobovitch

Thinking about this research: NeRF was presented at ECCV 2020, and as you mentioned, it's about working with this data in a 3D capacity, whereas a lot of the methods we've seen to date have been mostly 2D. What I wonder, with everyone diving into this new research on NeRF, Neural Radiance Fields, is how much of this is real time versus, like, batch processing?

Stephen Miller

Well, that's an interesting question, and it goes back to one of those gaps between theory and practice in the academic world. One thing that often frustrated me is that a paper would say: hey, we achieved state-of-the-art 3D understanding, we can understand the shape of a room from five photos, or we've built something that can track the distance between two photos to figure out how the camera moves. And then in the little asterisk at the end of the paper, it would say this takes three hours to run on a pair of photos, which, when you think about an actual application, would never work in practice.

What excites me is that, and I can't totally speak for the large corporations, but I'd hazard a guess that they are also thinking this way, a lot of these technologies are designed to run very rapidly, to be able to be optimized for the edge. I know here at Fyusion, we've been working very hard on similar techniques.

We had a paper come out a couple of years ago that was kind of a NeRF precursor of sorts, and there we worked on real-time rendering for that exact reason, because everything we do, we want to be able to fit in the palm of your hand. So give it a couple of years and we are going to see a lot of companies come out with near-real-time versions of this. Now, optimizing that for mobile, especially for low-end devices, is a never-ending journey, but we're going to see a lot of progress there.

David Yakobovitch

It sounds like the challenge for mobile devices is, would you say it's storage, is it compute, is it the battery, the cost? Of course, we have the big machines, the big tech companies backed with large GPU and TPU server farms, where everything's possible, but on a device that takes a lot of effort.

Stephen Miller

A lot goes into optimizing things for mobile, and there are quite a few challenges. One of them right off the bat is you have so many different versions of devices to support in the wild. You think of all the flavors of Android, all the different types of hardware you have to support. And each one has different, special ways that it wants to reason, especially about images.

But beyond that, mobile phones just give you a lot of constraints. As you mentioned, the CPU and battery are really heavy ones. For Apple, at least, if you go above a certain level of CPU usage, the OS will just kill your application. They won't really tell you why; it just decides that if you use too much, you aren't worth it, because on mobile your application might be one of 30 that are running at the same time.

Memory is another big issue to work around on mobile phones; there's way less memory for us to work with than on a traditional computer. Then there's battery life, kind of the bargain we always have to strike: the cooler the thing we port to mobile, the more intelligent we want our system to be, and, inevitably, the more battery it is going to drain.

So if we have someone who is going to be out in the field for eight hours a day, walking around cars at an auction, we can't drain 20% of the battery every time they image a vehicle. So there are a lot of things that we have to worry about here in order to make mobile actually work in practice. Most of that comes down to some really clever low-level optimizations, taking something that would be really heavy to compute and approximating it with something a little bit lighter, and then other tricks of the trade to make sure that we're not doing any unnecessary computation.
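
As a small illustration of the kind of trade-off Stephen describes, not Fyusion's actual code, here is a Python sketch of two common tricks: shrink the input before running an expensive model, and only run that model every few frames, reusing the last result in between. The heavy_model stub and all the numbers are placeholders.

```python
import numpy as np

def heavy_model(frame):
    """Stand-in for an expensive neural network; returns a placeholder prediction."""
    return frame.mean()

def downscale(frame, factor=4):
    """Cheap approximation step: average-pool the image so the model sees 16x fewer pixels."""
    h, w = frame.shape
    cropped = frame[: h - h % factor, : w - w % factor]
    return cropped.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

class ThrottledInference:
    """Run the heavy model on a small frame, and only every `stride` frames; otherwise reuse the last result."""

    def __init__(self, stride=5):
        self.stride = stride
        self.count = 0
        self.last_result = None

    def __call__(self, frame):
        if self.count % self.stride == 0 or self.last_result is None:
            self.last_result = heavy_model(downscale(frame))
        self.count += 1
        return self.last_result

# Simulate a 30-frame capture session.
infer = ThrottledInference(stride=5)
for _ in range(30):
    frame = np.random.rand(480, 640)
    result = infer(frame)
```

The accuracy cost of approximations like these has to be weighed against the battery and thermal budget of the worst phone you intend to support.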

David Yakobovitch

It sounds like there's a divide to bridge not only between theory and practice, but also in making sure that we have a good user experience, and that user experience can also mean bridging the divide with performance, what you might call the implementation details. Can you speak more about that to our audience?

Stephen Miller

Definitely. So, again, these things get listed as implementation details, as if an engineer will solve them eventually and they aren't interesting enough for a person to write a paper about. But so much of that is the difference between a successful product and an unsuccessful product. Say I have a computer vision algorithm, and it requires someone to upload a video to the internet, wait for me to do processing, and then download it.

That may not work for 90% of the population, for people who aren't on LTE networks, for people who are on 3G and don't have a chance to upload. So a lot of what we do really goes into: how do we make an ideal user experience? And that means pushing whatever we can to the edge, so we never leave people waiting due to a lack of connectivity, and then gracefully handling errors and other issues like that.

One thing that I'm quite passionate about in this space is the question of: what do we do when our AI fails? Very few things are 100%. We can look at things like QR code readers, one of the most solved AI problems that we have.

And still, we've all been in situations where we point our phone at a digital menu and it doesn't work, because for whatever combination of reasons we're in the 0.1% where it failed. Now, how do we handle those situations? A poor user experience is one that tells someone to just keep pointing and hope that eventually it will work. Most successful implementations of AI find a way to wrap this uncertainty in a user experience that feels predictable and intuitive, things that provide fallbacks, like: hey, I didn't catch that, can you type this number by hand instead? Or: hey, I feel like I didn't fully understand when you walked around that vehicle; it would help me a lot if you went back to this part and took a photo.

So a lot of what we do now at Cox Automotive, and more broadly in the AI world, is reasoning about how we take things that are not 100% accurate and wrap them in a pleasant user experience that accelerates the person who is working with them. How do we make sure that, even when it isn't perfect, it is still providing value to the end user? That's a very interesting challenge in the world of AI: wrapping things that are not always going to be perfect in a layer that feels intuitive and easy to use for people.
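
As a sketch of the fallback pattern Stephen describes, here is a minimal Python example; the function names (scan_code, identify_vehicle), the confidence threshold, and the console prompt are all hypothetical stand-ins for a real detector and a real mobile UI.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ScanResult:
    value: Optional[str]   # decoded text, or None if nothing was read
    confidence: float      # model confidence in [0, 1]

def scan_code(frame) -> ScanResult:
    # Stand-in for a real decoder (QR reader, OCR model, etc.); always "fails" here.
    return ScanResult(value=None, confidence=0.2)

CONFIDENCE_THRESHOLD = 0.9  # arbitrary cut-off for trusting the automated result

def identify_vehicle(frame, ask_user: Callable[[str], str]) -> str:
    """Try the automated path first; fall back to the human when the model is unsure."""
    result = scan_code(frame)
    if result.value is not None and result.confidence >= CONFIDENCE_THRESHOLD:
        return result.value
    # Graceful fallback: explain what went wrong and give the person a manual path.
    return ask_user("I didn't catch that. Could you type the number by hand instead? ")

# Example wiring, with a console prompt standing in for a mobile UI.
if __name__ == "__main__":
    vin = identify_vehicle(frame=None, ask_user=input)
    print("Recorded:", vin)
```

The design choice is that the automated path is a shortcut, not a gate: when the model is unsure, the person always has a clear, fast way to finish the task themselves.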

David Yakobovitch

It's building an AI-augmented experience, or a human-augmented experience. I can completely relate to that scenario, Stephen. I live in New York City and I ride Citi Bike. Citi Bike is a ride-share bike system in New York City, now owned by Lyft. Lyft, to improve the experience, has put QR codes on all the Citi Bikes, so you can scan to undock and dock your bikes. But sometimes the QR codes don't work, so they actually have a separate serial number on the bike that you can grab and type into your phone. And if they don't get the scan, they make it pretty clear: you can click a button and type it in, so that you're not left hanging. I didn't realize it until you just shared that, but it sounds to me like this was Lyft intuitively designing around the AI.

Stephen Miller

Absolutely. That's why some of the work in AI that most excites me is when we move from these mostly solved problems, like QR codes, where the edge case only happens 1% of the time, into the world of AI where we're usually looking at things that are 80%, 85% accurate. I'm really interested in people who have found a way to make AI be seen as a tool that expedites the productivity of an end user and provides graceful handling when it doesn't work. A very recent example I watched was my friends over at OpenAI showcasing this new system, Codex, that is meant to do just that. It's a deep network that writes software. So, a very exciting idea.

You give it a command, like: I want to make a game that moves the cursor 10 pixels to the right, and it outputs code. But it also lets you talk to it afterwards and kind of refine it, because they've realized their network will not be 100% accurate. If we want people to use this, we need to let them inspect it, talk to it, and modify it over time, and say: oh, that didn't quite work, let me tweak this parameter here.

Things like that are more like smart autocomplete: the AI will do the bulk of the work to save you time. Scanning a QR code is faster than typing a serial number, but if you don't still provide that serial number at the end, the way for the person to intervene and feel that they have control, then it's very difficult to make a product that will satisfy end users.

David Yakobovitch

Even with the OpenAI Codex that you just mentioned, it's so fascinating. I remember when I was running hackathons for AngelHack about five years ago, we would see some solutions where these high school and college kids would talk into a microphone, which would transcribe their speech into triggers or inputs that would then make the background yellow or add text in red.

Those were simple use cases. Now they've evolved with this OpenAI Codex, where you could build an entire dynamic website and have the code available. Will it still be perfect in all scenarios? Not necessarily, but it could be good enough, especially for prototyping a POC and getting the point across.

Stephen Miller

Absolutely. And where we see computer vision having been effective today is use cases like that. You think about when we use Zoom or Teams or something, and we have this little button that lets you blur out our background. We all know it doesn’t work perfectly, but it solves a problem for a lot of people, and it’s in a domain where we have a bit of flexibility where it doesn’t need to be perfect to provide value to us.

Whereas in the past, especially at the beginning of the deep learning revolution, in the last decade, we would see companies that would claim to do perfect understanding or point your phone at a thing and figure out how much it costs or tell me everything that is in the scene.

Most of the time, those companies don’t seem to have worked too well because they failed to see this idea of providing incremental value on the way to perfection. What we’re really seeing now is the technology getting better and better because we found these ways to provide value to the end user before we hit 100% accuracy. 

David Yakobovitch

Now, Stephen, I know you've taken an approach, as you've been describing here, of building a system that brings humans and machines together. We can call that a hybrid approach to AI in these human-driven processes. It sounds like the QR code example was one of those, and even the OpenAI Codex could be one. Can you speak to us more about the hybrid approach to AI and human-driven processes?

Stephen Miller

Absolutely. I can speak to a personal example, which is Fyusion at Cox Automotive. One of the challenges that we are trying to solve is how you make vehicle inspections more reliable. Basically, if you've ever gone to a rental shop, for instance, you've had this experience: when you're returning a car, they have you walk around it with this diagram and draw a little X by hand next to every scratch, every dent, everything you've seen that could be wrong. And you always have this intuition that there must be a better way to do this.

This feels inherently fragile because it’s relying so much on kind of me eyeballing and hoping I don’t miss anything. So with Cox, we’ve been working on automated condition report generation, where people walk around the car with a cell phone and use that to get a 3D image of the vehicle that can then be assessed for damages. But, as we mentioned before, hybrid approaches are key. 

So we would never want to build a system that just says: trust the automation 100% at the end. What we're doing instead is building tools that will make the lives of the people who actually do the inspection process easier, kind of pre-populating the sorts of damages that they would be able to find. What we try to do, again, much like smart autocomplete, is do things that help you catch what you might've missed. Let the AI help you, but ultimately keep the human expert in the loop as well.
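
To make the pre-populate-then-review workflow concrete, here is a tiny Python sketch. The names (Damage, ConditionReport, detect_damages) and the sample data are hypothetical; the detector is a stub, and this is not Fyusion's or Cox Automotive's actual system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Damage:
    panel: str          # e.g. "front left door"
    kind: str           # e.g. "scratch", "dent"
    confidence: float   # model confidence in [0, 1]
    confirmed: bool = False

@dataclass
class ConditionReport:
    vehicle_id: str
    damages: List[Damage] = field(default_factory=list)

def detect_damages(capture) -> List[Damage]:
    # Stand-in for the computer-vision step: these are proposals, not final answers.
    return [Damage("front left door", "scratch", 0.92),
            Damage("rear bumper", "dent", 0.55)]

def review(report: ConditionReport,
           decisions: Dict[Tuple[str, str], bool]) -> ConditionReport:
    """Human-in-the-loop step: the inspector keeps or rejects each proposed finding."""
    report.damages = [d for d in report.damages
                      if decisions.get((d.panel, d.kind), True)]
    for d in report.damages:
        d.confirmed = True  # only findings the inspector has reviewed get confirmed
    return report

# Pre-populate from the model, then have the (simulated) inspector reject the uncertain dent.
report = ConditionReport("VIN-123", detect_damages(capture=None))
final = review(report, decisions={("rear bumper", "dent"): False})
print([(d.panel, d.kind) for d in final.damages])  # [('front left door', 'scratch')]
```

The model accelerates the inspector by drafting the report, but nothing ships until a person has looked at it, which mirrors the smart-autocomplete framing above.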

David Yakobovitch

Thinking about human experts in the loop: why is it so critical to keep humans in the loop? I know that I, and many of the listeners of the show, believe in augmenting humans; we believe in a human-first society augmented by AI. What's your perspective, having worked with the machines and with the software, on why it's important to keep humans in the loop?

Stephen Miller

Well, it's critical for a few reasons. One is just pragmatic. We all know that with AI, the challenge is that it works until the moment it doesn't. It works until it hits an edge case that it has never seen before. For those kinds of situations, human expertise is really key, so you have to make sure that you have people involved in the process. It's also important for building trust in any kind of technology that you develop. Generally, if someone told me we have a system that is just a black box, where I'm going to hand it a picture and it is going to give me a dollar value back, I would feel very distrustful of that, because there are so many decades of industry experience that go into deciding how these things work.

I also feel, just on a moral level, that it's important for us to build technology that makes the lives of everyday people better, rather than technology that just becomes a kind of archaic system. So the human element is good for practical reasons, experts know so many things that we couldn't possibly know, and it also builds trust in the end users and keeps them there. But I also think it's just important, as we grow technology, to make sure that we're doing it in partnership with people rather than in opposition to them.

David Yakobovitch

I really liked that, especially the point about partnership. And talking about partnership, as you mentioned, Fyusion was acquired by Cox Automotive. How has the experience been for you, being an entrepreneur and now turning into an intrapreneur?

Stephen Miller

It's been great. It really does mirror that idea of the gap between theory and practice. I left grad school because I wanted to try building products in the real world, and when you become an entrepreneur, compared to writing a research paper, that is the real world. But really you're building technology and throwing it out into the ether, and you're hoping that you find people who will use it.

Compare that to this new intrapreneurial experience, where all of a sudden the people are there, guaranteed. There are so many people with concrete needs they have to solve today, and clear areas where we can provide value to them. And for me, as someone who cares much more about building systems that work than about having just a theoretical breakthrough, it's been really fun to see how we can apply cutting-edge AI techniques to solve very real, very immediate problems. In every state in the country there are so many people using our technology today, in ways that, as an entrepreneur in the startup world, I never really would have anticipated. So I love the new challenge of seeing how we fit technology in to help augment existing processes.

David Yakobovitch

Thinking about how technology is evolving, there's always the classic question of hype versus reality. Now, in 2021, there's been a lot of progress. What's some of the progress you've seen, especially in your own neighborhood of the field, over the last five or six years?

Stephen Miller

So, the clearest one has been self-driving cars. I've joked before that self-driving cars were two years around the corner every year for the last decade. That was always the line, that they were coming any day. Now they're coming. Now we finally are at a point where I can't step outside for a cup of coffee without seeing a Waymo or Cruise Automation vehicle drive by. We see Teslas with Autopilot being used in the wild as well. So we do see things that are more real now than they used to be.

But in general, the successful technologies are the ones that find more subtle ways to integrate themselves. You think about things like Snapchat filters: they're live, they're assessing the world, they're using really cool tech under the hood, but the stakes are a little lower than in areas where we need high-stakes performance. It's also been really nice to see how hardware has evolved. The dream when we started Fyusion was: the Kinect is coming, so when is new hardware like that going to come to cell phones? And that dream has finally happened. Your newest iPhone has a LIDAR sensor in the back of it. We also have technologies like ARKit and ARCore that let people finally do what, as a grad student, I wished I could do, which is reason about the 3D world without being an expert in 3D.

So especially over the next few years, we’re going to see more technology that really leverages these real world constraints, that thinks about the world as 3D, rather than as pixels. So those things get me very excited. I’m also a bit of a film nerd. So I love how all of these can relate to the entertainment industry, and we’ve seen really cool things happening in marker-less motion capture and the ability to kind of scan a person in 3D and then render scenes around them. 

When you think about the Cloverfield movie, how they have a shaky camera and a monster that is clearly digitally added in post, that comes from this kind of understanding of the world in 3D. In the last few years this technology has really become commodified, making it easier for amateurs to do this as well. So I find that really exciting.

David Yakobovitch

It's amazing to see how LIDAR has also come of age. I mean, did anyone expect that we would have LIDAR sensors on the iPhone 12 and iPhone 13? We saw it on Teslas, and how it got onto these small devices is quite incredible.

Stephen Miller

Exactly. The line used to be: you need LIDAR to do self-driving cars, and LIDAR will cost you a hundred thousand dollars, therefore self-driving cars will not happen. Now we have these commodified LIDAR sensors. It is really amazing, the progress that hardware has made.

David Yakobovitch

For our audience who's listening: I was making a joke there before. Tesla does not use LIDAR technology; there is a different technology there, and some self-driving systems have LIDAR, some don't. Just as we see many different AI algorithms, there are different approaches to sensing the 3D environment around us. Stephen, can you tease our audience: from what you saw, to where we are, to what's coming, what's exciting you about the next two years around the corner, so to speak?

Stephen Miller

Certainly. So in the research world, I'm really excited about this push towards 3D representations; NeRF and neural rendering are great. If I had a hunch about what that is leading towards, it's going to be a kind of explosion in the use of 3D for solving practical applications. We are finally going to see the dream of 3D datasets being used rather than purely 2D.

And that should lead to a lot of real-time experiences that were hard to build before. In the industry, I do finally feel that self-driving cars really are around the corner. So that excites me after so long of being told that it was coming. It finally seems like the technology is really starting to measure up with that.

And just in general, I’m excited to see what people can do having advanced technologies in the palm of their hand. We’re going to see a lot more real time understanding, a lot more 3D reconstruction techniques being used to build cool experiences for people, and hopefully finally see robots that can navigate the world in an intuitive way. 

David Yakobovitch

I am ready to have my Tesla humanoid. So perhaps that will be here, not necessarily in the next two years, but in the next two years after the two years after the two years.

Stephen Miller

We'll be pessimistic and say five years. Five years is infinity in technology.

David Yakobovitch

Perfect. Well, I'm super excited about that, and Stephen, it's been such a pleasure. For the audience: Stephen Miller, the Co-founder and Senior Vice President of Engineering at Fyusion, a Cox Automotive company. Stephen, thanks so much for joining us on the show.

Stephen Miller

Thanks for having me. 

David Yakobovitch

Thank you for listening to this episode of the HumAIn podcast. Did the episode measure up to your thoughts on ML and AI, data science, developer tools, and technical education? Share your thoughts with me at humainpodcast.com/contact. Remember to share this episode with a friend, subscribe and leave a review, and listen for more episodes of HumAIn.