Craig Wiley has journeyed from the hand-deployed models of yore to the pinnacle of automated AI. During a decade at Amazon, he led SageMaker, the company's cloud-based machine learning platform, from concept to rollout. Today, as director of product management for Google Cloud's AI services, he's making advanced tools and processes available to anyone with a credit card. Funny thing: He spent the early part of his career managing YMCA summer camps. Maybe that's what enables him to view the AI revolution with a child's eye, marveling at its potential to transform entire industries and imagining the bright future of streamlined model deployment, so he can build it for the rest of us.
The Batch: There's a huge gap between machine learning in the lab and in production. How can we close it?
Wiley: We used to talk about how to bring the rigor of computer science to data science. With MLOps, we're beginning to see it happen.
The Batch: People have different definitions of MLOps. What is yours?
Wiley: MLOps is a set of processes and tools that helps ensure machine learning models perform in production the way the people who built them expected. For instance, if you had built models based on human behavior before Covid, they probably went out of whack last March when everyone's behavior suddenly changed. You'd go to ecommerce sites and see wonky recommendations because people weren't shopping the way they had been. In that case, MLOps would notice the change, pull in the most recent data, and start making recommendations based on it.
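To make that loop concrete, here's a minimal sketch of automated drift detection and retraining in Python. This isn't Google's implementation: the feature names, the significance threshold, and the retrain hook are all hypothetical, and the two-sample Kolmogorov-Smirnov test stands in for production-grade drift monitoring.

```python
import numpy as np
from scipy.stats import ks_2samp

P_VALUE_THRESHOLD = 0.01  # hypothetical significance level for declaring drift


def has_drifted(train_values: np.ndarray, live_values: np.ndarray) -> bool:
    """Flag a feature whose live distribution has shifted away from the
    distribution the model was trained on (two-sample KS test)."""
    _, p_value = ks_2samp(train_values, live_values)
    return p_value < P_VALUE_THRESHOLD


def monitor(train_data: dict, live_data: dict, retrain) -> None:
    """Check every feature; if any has drifted, retrain on recent data.

    `retrain` is a hypothetical hook that kicks off a training job."""
    drifted = [name for name in train_data
               if has_drifted(train_data[name], live_data[name])]
    if drifted:
        print(f"Drift detected in {drifted}; retraining on recent data.")
        retrain(live_data)
```

A real system would run a check like this on a schedule against a window of recent production traffic, rather than in-process as shown here.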
The Batch: Describe an experience that illustrates the power of MLOps.
Wiley: In 2019, Spotify published a blog post saying it used some of our pipelining technology and saw a 700 percent increase in the productivity of its data scientists. Data scientists are expensive, and there aren't enough of them. Generally we would celebrate a 30 percent increase in productivity; 700 percent borders on absurd! That was remarkable to us.
The Batch: How is it relevant to engineers in small teams?
Wiley: If nothing else, it saves time. If you start using pipelines and everybody breaks their model down into its components, it transforms the way you build models. No longer do I start with a blinking cursor in a Jupyter notebook. I go to my team's repository of pipeline components and gather components for data ingestion, model evaluation, data evaluation, and so on. Now I'm changing small pieces of code rather than writing 3,000 lines from beginning to end.
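As an illustration, here's a minimal sketch of that component-based style using the open-source Kubeflow Pipelines SDK (kfp v2), the kind of pipelining technology referenced above and the basis of Google Cloud's pipeline offering. The component bodies, names, and output URI are placeholders; a team's real components would carry the actual ingestion, validation, and training logic.

```python
from kfp import dsl, compiler


@dsl.component
def ingest_data(source_uri: str) -> str:
    # Placeholder: a real component would copy and stage raw data.
    return source_uri


@dsl.component
def evaluate_data(data_uri: str) -> bool:
    # Placeholder: a real component would run schema/statistics checks.
    return True


@dsl.component
def train_model(data_uri: str) -> str:
    # Placeholder: a real component would train and upload a model.
    return "gs://example-bucket/model"  # hypothetical artifact URI


@dsl.pipeline(name="component-based-training")
def training_pipeline(source_uri: str):
    # Each step is a reusable component from a shared repository,
    # not code written from scratch in a notebook.
    data = ingest_data(source_uri=source_uri)
    evaluate_data(data_uri=data.output)
    train_model(data_uri=data.output)


compiler.Compiler().compile(training_pipeline,
                            package_path="training_pipeline.yaml")
```

Each decorated function compiles into a reusable step, so building a new model usually means recombining existing components and editing one of them, not rewriting the whole workflow.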
The Batch: How far along the adoption curve are we, as an industry?
Wiley: I think the top machine learning companies are the ones already using these kinds of tools. Once we start struggling to name those companies, we're getting to the ones that are excited to start using them. A lot of the more nascent players are trying to figure out who to listen to. Someone at a data analytics company told me, "MLOps is a waste of time. You only need it if you're moving a model to production, and 95 percent of models never make it into production." As a Googler and former Amazonian, I've seen the value of models in production. If you're not putting models into production, the machine learning you're doing is not delivering its maximum value for your company.
The Batch: What comes next?
Wiley: Think about the early days of distributed systems, when you needed a PhD in distributed systems to touch them. Now every college graduate is comfortable working with them. I think we're seeing a similar thing in machine learning. In a few years, we'll look back on where we are today and say, "We've learned a lot since then."