How to Enable AI in Software Development with Chris Van Pelt
Chris Van Pelt joined Weights & Biases as Co-Founder in 2017. Chris co-founded CrowdFlower. He was previously a technical product manager at Powerset, Inc., a natural language search technology company later acquired by Microsoft. Chris has worked as a studio artist, computer scientist, and web engineer, and pours his diverse background into his role as Chief Technology Officer. He combines deep design insight with coding abilities that enable him to produce almost anything, sometimes within minutes. Chris studied both art and computer science at Hope College.
Chris Van Pelt’s LinkedIn: https://www.linkedin.com/in/chrisvanpelt/
Chris Van Pelt’s Twitter: @vanpelt
Chris Van Pelt’s Website: https://wandb.ai/site
Podcast website: https://www.humainpodcast.com
YouTube Full Episodes: https://www.youtube.com/channel/UCxvclFvpPvFM9_RxcNg1rag
Support and Social Media:
– Check out the sponsors above; it’s the best way to support this podcast
– Support on Patreon: https://www.patreon.com/humain/creators
– Twitter: https://twitter.com/dyakobovitch
– Instagram: https://www.instagram.com/humainpodcast/
– LinkedIn: https://www.linkedin.com/in/davidyakobovitch/
– Facebook: https://www.facebook.com/HumainPodcast/
– HumAIn Website Articles: https://www.humainpodcast.com/blog/
Here are the timestamps for the episode:
(00:00) – Introduction
(01:55) – Enabling AI is a paradigm shift in software development; it’s going to change the way software is written.
(02:32) – Enabling AI by opening up to the community through “benchmarks”, which are mini Kaggle competitions, often focused on social good or making positive change in the world.
(02:54) – Drought Watch exemplifies one of these benchmarks: it takes satellite imagery of drought-prone regions around the world and challenges the machine learning community to create an algorithm that predicts drought conditions before they happen, so that we can take appropriate action and minimize the impact on humanity.
(05:58) – Developer tools for machine learning follow two different approaches in the marketplace: end-to-end data science as a service, from data ingestion and transformation through model training to deployment, versus a more focused platform. Weights & Biases takes the second approach, building a platform as a service centered on the training and experimentation involved in creating models.
(07:59) – Figure Eight can give companies labeled data in a highly scalable, efficient, and accurate manner. Weights & Biases is a tool intended to build a model, but first you need to label the data. Figure Eight calls this “human in the loop”: targeting examples the model didn’t handle well, sending them back through a labeling pipeline, and using the new labels to further improve the model as it is retrained.
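The human-in-the-loop pattern described above can be sketched in a few lines: rank the model’s predictions by confidence and route the least-confident examples back to the labeling queue. This is an illustrative sketch only, not the Figure Eight API; all names and the threshold are assumptions.

```python
# Illustrative human-in-the-loop selection step (not the Figure Eight API):
# examples the model was unsure about go back through the labeling pipeline.

def select_for_relabeling(predictions, threshold=0.7):
    """predictions: list of (example_id, confidence) pairs.
    Returns the ids of examples below the confidence threshold."""
    return [ex_id for ex_id, conf in predictions if conf < threshold]

preds = [("img_001", 0.95), ("img_002", 0.42), ("img_003", 0.88), ("img_004", 0.61)]
queue = select_for_relabeling(preds)
print(queue)  # → ['img_002', 'img_004']
```

After labelers annotate the queued examples, the model is retrained on the augmented data set, which is the loop Van Pelt describes.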
(09:56) – Companies like Google have spent tens of millions of dollars and hundreds of years of compute and processing power curating and labeling data sets to reach a steady state good enough to outperform a human, and they still keep humans in the loop. It is a core aspect of any mature, real-world machine learning application.
(12:38) – Tooling in the deep learning space was lacking. Weights & Biases first set out to address the problem of keeping track of what you had done, and thereby to better enable teams to reproduce any results obtained in the past.
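The experiment-tracking idea mentioned here can be illustrated with a toy sketch: record each run’s configuration alongside its metrics so the result can be reproduced later. This is a minimal stand-in for the concept, not the actual Weights & Biases API; the class and field names are assumptions.

```python
# Toy sketch of experiment tracking (not the wandb API): keep the config
# and metric history of a training run together so it can be reproduced.
import json

class RunTracker:
    def __init__(self, config):
        # config holds the hyperparameters that define the run
        self.record = {"config": config, "metrics": []}

    def log(self, step, **metrics):
        # append one row of metrics for a given training step
        self.record["metrics"].append({"step": step, **metrics})

    def save(self):
        # in a real tool this would be persisted to disk or a server
        return json.dumps(self.record)

tracker = RunTracker({"lr": 0.01, "batch_size": 32})
tracker.log(step=1, loss=0.9)
tracker.log(step=2, loss=0.7)
print(tracker.save())
```

The point is that the config travels with the metrics: given the saved record, a teammate can rerun training with the same hyperparameters and compare curves.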
(16:13) – We’re at least a few years out before we see any meaningful use of the technology, let alone before it becomes autonomous.
(18:42) – Computer vision started the hype around deep learning a few years back. And it’s been really exciting to see the advances in natural language processing over the last couple of years. Image captioning merges both worlds.
(22:50) – We are going to continue to need humans for cognitively challenging tasks such as authentication, fingerprinting, and spoofing. Any time there’s some underlying pattern in your data that is not getting at the core of what you’re trying to predict, but instead reflects something systematic in your data collection process, that is bias.
(25:00) – Reducing bias starts with understanding your data sets. During the initial training data creation and curation process, pull statistics across various axes. And once you’ve created a model, measure how its outputs perform across an evaluation data set.
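One concrete way to do the measurement step above is to compute a metric per slice of the evaluation set and look for gaps between groups. A hedged sketch, assuming a simple accuracy metric and illustrative field names:

```python
# Sketch of per-group evaluation: a large accuracy gap between groups
# is one signal of the systematic bias discussed above.
from collections import defaultdict

def accuracy_by_group(examples):
    """examples: list of dicts with 'group', 'label', 'prediction' keys.
    Returns accuracy computed separately for each group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for ex in examples:
        totals[ex["group"]] += 1
        hits[ex["group"]] += int(ex["label"] == ex["prediction"])
    return {g: hits[g] / totals[g] for g in totals}

eval_set = [
    {"group": "region_a", "label": 1, "prediction": 1},
    {"group": "region_a", "label": 0, "prediction": 0},
    {"group": "region_b", "label": 1, "prediction": 0},
    {"group": "region_b", "label": 0, "prediction": 0},
]
print(accuracy_by_group(eval_set))  # → {'region_a': 1.0, 'region_b': 0.5}
```

The same pattern extends to any metric (precision, recall, calibration) and any axis you pulled statistics over during data curation.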
(27:42) – As we create deep learning models with tens of thousands or millions of parameters, it becomes really difficult to explain why any given output was chosen or what the model’s reasoning was.
(29:15) – Reinforcement learning is definitely more on the frontier of ML. Some companies use Weights & Biases for reinforcement learning, at least in an experimental context.
(31:11) – Research trends include unsupervised machine learning use cases: taking data that hasn’t been labeled by any human and surfacing or unearthing patterns simply by looking at all the data.
(32:30) – Data sets are going to continue to become larger, and compute is going to become less constrained. It’s all about custom hardware: many startups are trying to build chips that can do all of this matrix math quickly and in a highly parallelized way. Expect continued innovation there, and likely some big step gains as the market matures.
(37:46) – Using Weights & Biases tools will help you unearth any underlying bias or issue with your model and enable you to debug it quickly.