Josh Wills is currently a software engineer at WeaveGrid, a startup focused on decarbonizing transportation. He has worked as an engineer and data science leader at companies like Slack, Google, Cloudera, and IBM, and is also a member of the Apache Software Foundation and founder of the Apache Crunch project. Josh recently sat down with Nihit Desai, CTO and co-founder of Refuel.AI and instructor of the MLOps course on Uplimit (formerly CoRise), to talk about his experience applying machine learning to real-world challenges.
The following excerpts from Josh and Nihit’s conversation have been edited and condensed for clarity.
Nihit: Josh, can you tell us a bit about the experiences you've had building data and machine learning teams at companies of different sizes and stages of maturity?
Josh: Good question. I’ll start by talking about Google, which was in many ways the first machine learning company. Google was always a data company through and through, so machine learning was a very natural progression. It was already a reasonably well-established practice there when I joined. And it was my first exposure to big machine learning, so I was working on a lot of different things, getting a sense of how to apply machine learning to different types of problems.
At Cloudera, I got to see how other companies were approaching data and machine learning. Some of those classic big companies — like Capital One, State Farm, Walmart — are actually phenomenal data companies. They already had care and stewardship around things like data quality, and they took data seriously, so it was relatively easy to introduce them to machine learning.
Slack was really interesting because there wasn’t much of a foundation at all when I got there. So I got to have that experience of building the data culture, the infrastructure, all that stuff. And then I chose to join WeaveGrid because of their focus on climate tech. Like Google, it’s a data company through and through, but what I really wanted in my next role was to work on a big, meaningful problem like climate change. That’s what led me here.
Nihit: Can you talk us through what WeaveGrid does?
Josh: Our job is to help large utilities manage electric vehicles on the grid. The hope is that most people will be driving electric cars within the next couple of decades — but right now, the grid isn’t designed for that load. So utilities are very interested in understanding the demand from EVs and evolving to meet it. They want to know roughly where EVs are, when they're charging, etc. We do a bunch of machine learning around those problems. It starts with collecting and analyzing meter data from utilities, and then there are really interesting optimization problems — for example, if you have all these EVs on the grid, how do you schedule their charging across a 24-hour period so you keep the load minimized?
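To make the scheduling question concrete, here is a minimal sketch of one simple heuristic, greedy "valley filling": each car charges in the least-loaded hours during which it is plugged in. This is an illustration only, not WeaveGrid's method. The function name, the record shape, and all the numbers are assumptions made up for the example, and a real system would also have to handle forecasts, tariffs, and grid constraints.

```python
from typing import List, Sequence, Tuple

# Hypothetical record shape: (hours_needed, charge_rate_kw, hours_plugged_in)
EV = Tuple[int, float, Sequence[int]]


def schedule_charging(
    base_load_kw: Sequence[float], evs: Sequence[EV]
) -> Tuple[List[List[int]], List[float]]:
    """Greedy valley filling over a 24-hour day.

    Each EV, in turn, is assigned the currently least-loaded hours
    among those in which it is plugged in. Returns the chosen hours
    per EV and the resulting hourly load.
    """
    load = list(base_load_kw)
    plan: List[List[int]] = []
    for hours_needed, rate_kw, plugged_in in evs:
        if hours_needed > len(plugged_in):
            raise ValueError("EV cannot finish charging in its plug-in window")
        # Lowest load first; ties are broken by the earlier hour.
        chosen = sorted(plugged_in, key=lambda h: (load[h], h))[:hours_needed]
        for h in chosen:
            load[h] += rate_kw
        plan.append(sorted(chosen))
    return plan, load


# Made-up base load in kW: quiet night, flat day, evening peak from 17:00 to 21:00.
base_load_kw = [20.0] * 6 + [30.0] * 11 + [60.0] * 5 + [30.0] * 2
overnight = [18, 19, 20, 21, 22, 23, 0, 1, 2, 3, 4, 5]
evs = [(3, 7.0, overnight), (3, 7.0, overnight)]

plan, load = schedule_charging(base_load_kw, evs)
print(plan)       # [[0, 1, 2], [3, 4, 5]]
print(max(load))  # 60.0
```

In this toy example both cars end up charging in the early-morning valley, so the 60 kW evening peak is unchanged. If both cars instead started charging on arrival at 18:00, the peak would rise to 74 kW.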
Nihit: When you’re coming into a new company, how do you home in on the problems you need to solve? What are some questions that any machine learning engineer should ask or think through?
Josh: I think the best advice I got on this was from a fellow named Chris Wiggins, who was the chief data scientist at the New York Times for several years. I remember asking him, how did you get started with machine learning at the New York Times? What was the problem you chose to go after first? And his answer was that he figured out what the executive team was most afraid of, and he built a model focused on that. There are so many interesting things you could do, but at the end of the day, that’s what gets you taken seriously.
Nihit: Let's switch gears a little bit. What parts of the machine learning life cycle do you think have become easier over the last, let's say, 3-4 years, and which parts remain hard?
Josh: Model training's gotten pretty good, and feature engineering has gotten a lot easier. We have a lot of good tools and practices. But I think data collection is one of the great unsolved problems of our time. Even people who think they take data collection seriously underestimate how important and how hard it is. And monitoring is the other one — in the same way, people don’t take monitoring seriously enough. Honestly, it’s just not fun. Running your model in production for the first time is fun. Having a model degrade and drift, and having to figure out why, that’s not fun.
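As one illustration of what drift monitoring can look like in practice, the sketch below computes a Population Stability Index (PSI) between a reference sample of a feature (for example, from training time) and a live sample. PSI is one common drift metric among many, and Josh does not name it in the conversation. The function name, the equal-width binning, and the sample data are assumptions made for this example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = fractions(reference), fractions(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Made-up data: the live sample is shifted toward the upper half of the range.
reference = [i / 100 for i in range(100)]
live = [0.5 + i / 200 for i in range(100)]

print(population_stability_index(reference, reference))
print(population_stability_index(reference, live))
```

Identical samples give a PSI of zero, and the value grows as the live distribution moves away from the reference. A commonly cited rule of thumb treats values above roughly 0.2 as worth investigating, though the right threshold depends on the feature and the application.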
Nihit: If you look at the next, say, five years, what do you think will change in terms of how we build machine learning applications? Do you think there's a difference in the types of application that will be built in the next five years compared to the last five?
Josh: It's a good question. I think we have largely settled on the architecture for a certain class of machine learning models. I think things have become a lot more standardized and automated, which is fantastic. The honest answer really comes down to how creative we are with respect to data collection. In my opinion, any problem where we can collect sufficient data in digital form is amenable to machine learning right now. The unknown is, do we get better at acquiring data across different fields? If you find yourself at a company that for whatever reason happens to have an amazing treasure trove of medical data, political data, image data, you name it, the actual modeling is going to be fairly straightforward.
Nihit: A lot of folks in the audience are fairly early on in their machine learning careers, and I think we all want to know — what advice or insights would you share with someone who's just getting started?
Josh: I would say yes, learn the tools and the technical skills, but more than that, focus on learning to frame questions and make the best use of the data you have. I think the most valuable skills are really around problem selection and formulation. That’s the thing I’m really looking for when I hire people. The technology changes, the tools change, but those ways of thinking about data are fundamental. That's the stuff that matters.
Nihit: Awesome. Thanks so much. Let's do some audience questions. Here’s one: when Josh started at Slack, how did he decide on the tech stack?
Josh: When I joined Slack in 2015, the only large-scale data architecture I was familiar with was Google stuff. So I mostly just went out of my way to replicate Google's data systems and architectures and tooling.
The one thing I did that had a huge downstream impact was to make sure every single record that went to the data warehouse had a Thrift schema to go with it. That was the single best decision I made at Slack. Everything else was built on top of that solid foundation, and it just saved an enormous amount of time. If I was doing this again today or starting over, I might pick something other than Thrift, but I would pick one thing and make it the standard.
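The "one schema for every record" idea can be sketched as follows. This is not Slack's actual setup and does not use Thrift: a plain Python dataclass stands in for a Thrift struct so the example has no dependencies. The event type, its fields, and the `parse_record` helper are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class MessageSentEvent:
    """Hypothetical record type standing in for a Thrift struct."""
    user_id: int
    channel_id: int
    timestamp_ms: int
    client: str


def parse_record(schema: Type[T], raw: Dict[str, Any]) -> T:
    """Reject any record that does not match its schema before it is stored."""
    expected = {f.name: f.type for f in fields(schema)}
    missing = expected.keys() - raw.keys()
    extra = raw.keys() - expected.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
    for name, expected_type in expected.items():
        # Simple check; assumes every field is annotated with a plain class.
        if not isinstance(raw[name], expected_type):
            raise ValueError(f"{name} should be {expected_type.__name__}")
    return schema(**raw)


good = {"user_id": 1, "channel_id": 42, "timestamp_ms": 1700000000000, "client": "web"}
event = parse_record(MessageSentEvent, good)
print(event.channel_id)  # 42

bad = {"user_id": "1", "channel_id": 42, "timestamp_ms": 1700000000000, "client": "web"}
try:
    parse_record(MessageSentEvent, bad)
except ValueError as err:
    print("rejected:", err)
```

The design point is the one Josh makes: whichever schema system is chosen, records that do not conform are caught at the boundary, so everything downstream can rely on the shape of the data.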
Nihit: Awesome. One more question from the audience: what do you think of the state of model testing and behavioral testing?
Josh: I assume you're asking about safety and risk. And the answer is, it’s not great, broadly speaking. We don’t talk about it enough, because it’s super hard.
There's a lot of stuff we can do with machine learning that's a really bad idea. For example, a company like Slack or Microsoft or Google could easily tell from its usage data who is about to quit their job. Given that, should you build a machine learning model that would tell a company when someone is about to quit? Is that an ethical thing to do? Probably not, but we don’t have a good way to make those decisions on a large scale, or to make sure our models aren’t used in a way we didn’t think about or intend.
I think the best solution we’ve come up with at this point is just to have a diverse and thoughtful and empathetic group of people around a table talking about what it is we're trying to do. And if you’re working in this field, make sure you don’t just study the technology. Study philosophy, study ethics, read literature. Think about what it's like to be a person who’s on the receiving end of whatever you’re building.