Have you ever wondered how your favorite beverage gets to the store for you to purchase, at a price you can afford? We're taking you on a virtual field trip to see how big data and machine learning can be used to solve real supply chain problems for real companies. This means we'll talk with expert research scientists here at MIT CTL to see how they're using the techniques that you learn in class. And we'll also travel to New York City to work with our industry partners, AB InBev.
Welcome to Anheuser-Busch InBev, come on in. I'm Michael Kress, I'm what we call Tier 2 Logistics. I'm the director of Tier 2 Logistics Transportation, which means I'm overseeing projects and processes related to taking our beer from the warehouse out to our customer network. So how can we make our drivers more efficient, provide better service, be safer?
So my name is Matthias Winkenbach, I'm a research scientist here at MIT CTL, leading the Mega City Logistics Lab.
My name is Esteban Mascarino, I'm a graduate research assistant at the Mega City Logistics Lab.
Take Brazil as an example: we have 110 warehouses or so there, so if I want to look at yearly efficiency at the DC level, that's 110 data points. If I get down to daily, it sounds like a lot, but actually we can manage that; our daily routines manage that. If I want to look at routes, I've got a million, and if I want to look at customers, I have 16 million. I remind our executives that Excel breaks at a million rows.
Yeah, this project was interesting because we actually were sitting on a large pile of data that ABI provided us with, first only for Mexico City, later on then also for other cities around the world.
We have all of this data from Mexico City, you know, if we did this and looked at it this way, maybe we can uncover some insight.
So essentially the company had a huge amount of data, they wanted to do something useful with the data, and that specifically useful thing was coming up with a systematic approach in order to measure and benchmark productivity across the globe.
So ABI has customers all over the world, and logistics teams all over the world who serve the stores that you and I buy their products from. But the company wants to know, why are some of these logistics teams performing differently?
So we have 500 warehouses or so around the world. We operate in 30 countries, and we try to compare and say, how do I push one warehouse to be as efficient as another warehouse? So it's a little challenging to try to think about how you cluster or compare different realities against each other. And so what we worked on a lot in this project that we'll talk about with MIT is how do you put a little bit more math and methodology behind that comparison set?
In a classroom environment, be it in person or online, your teacher will show you some new technique or concept. You'll get to practice with it by solving problems that the teacher creates. But as many of you might know, in practice, you don't always start with knowing the exact question or even knowing the appropriate method. ABI doesn't need to predict anything, not yet anyway. What they need is to organize their data, to probe it to see what it can tell them.
So clustering was one of the main methods that we used because we first of all wanted to basically describe what's going on. We wanted to provide Anheuser-Busch InBev with a metric that they could apply to similar data in any other city that would basically classify their routes.
But I don't know if we had thought through how to use a clustering methodology to apply to this project. I'm not sure any of us really knew, oh, clustering would be the solution, right? And I think that's when some of your students can say, oh, I know a technique.
So once you've decided that you're going to do a clustering approach, do you just cluster and then you're done?
I wish, I wish so, but no. First you get tons of data, hopefully. I mean, if you get data, that's great, because then you can clean it up, which takes four, five months.
Cleaning the data, making sure it's consistent, making sure it's correct, making sure it's complete is a huge effort.
If you just input garbage, you'll get garbage from your model. And it's not your model's fault. It's your fault because you didn't do a proper audit of the data cleaning process. It came out naturally from the development of the project that we should, at some point, segment the routes by their different operational characteristics. So for instance, urban routes with low drop size or urban routes with high drop size. So we opted for a categorical classification of the operational environment, and that added an additional complexity to the whole model because we had to deviate from k-means and start mixing both k-means and k-modes. And that's k-prototypes, which is a well-validated clustering methodology.
So the team cleaned the data, which was a major effort, and then they decided which machine learning algorithm was appropriate to their goals. Did anything surprise you when you looked at the result?
We got the same insights from the same clusters, so the methodology we came up with was pretty robust.
There were still a lot of unexplained differences in route performance that we had some hypotheses on but that we actually struggled to find the right data to describe.
So a big learning that we've had, probably the biggest learning actually, is the need to be more focused on data and be more of a data-driven logistics operation. I think that's a fundamental shift that we've begun with MIT; we're at the beginning of that journey. But I do think that will be one of the changes: that we use data as an asset going forward.
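The mix of k-means and k-modes described above can be illustrated with a short sketch. This is a minimal toy version of the k-prototypes idea in plain Python, not the project's actual code: the function names, the `gamma` weight, the starting rows, and the six example routes (a scaled drop size plus an urban or rural label) are all hypothetical and chosen only for illustration. Numeric features are compared by squared distance as in k-means, categorical features by counting mismatches as in k-modes, and each cluster's prototype is the mean of its numeric values together with the most common category.

```python
from collections import Counter


def dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """Squared Euclidean distance on numeric features plus
    gamma times the number of mismatching categorical features."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * mismatches


def k_prototypes(numeric, categorical, init_indices, gamma=1.0, max_iter=20):
    """Toy k-prototypes: the rows listed in init_indices seed the prototypes."""
    protos = [(list(numeric[i]), list(categorical[i])) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its closest prototype.
        new_labels = [
            min(range(len(protos)),
                key=lambda c: dissimilarity(n, k, protos[c][0], protos[c][1], gamma))
            for n, k in zip(numeric, categorical)
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: numeric mean (k-means) and categorical mode (k-modes).
        for c in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the old prototype if a cluster is empty
            means = [sum(numeric[i][j] for i in members) / len(members)
                     for j in range(len(numeric[0]))]
            modes = [Counter(categorical[i][j] for i in members).most_common(1)[0][0]
                     for j in range(len(categorical[0]))]
            protos[c] = (means, modes)
    return labels, protos


# Hypothetical routes: drop size scaled to 0..1, plus operating environment.
drop_size = [[0.1], [0.2], [0.9], [0.8], [0.5], [0.6]]
environment = [["urban"], ["urban"], ["urban"], ["urban"], ["rural"], ["rural"]]

labels, protos = k_prototypes(drop_size, environment,
                              init_indices=[0, 2, 4], gamma=0.5)
print(labels)  # [0, 0, 1, 1, 2, 2]
for numeric_center, category_mode in protos:
    print(numeric_center, category_mode)
```

In this toy data the three clusters come out as urban routes with low drop size, urban routes with high drop size, and rural routes. The `gamma` weight controls how much a category mismatch counts relative to numeric distance, so the numeric features are usually scaled before it is chosen.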
Our course is about supply chain technology and systems. Can those things help a business? Yes. We've also covered data analysis. Can that provide new insights? Of course. But let's talk to our experts about how they use those things in practice.
My challenge to all colleagues would be: let a data scientist, or people who are thinking about your business in a new way, take you along that journey. And it's not that you don't know what you're doing or you don't know your business, it's that there's just somebody looking at it a different way who might find something. You might find a couple of nuggets here or there, and you say, oh wow. And those couple of things can transform your business.
They actually went and integrated our analysis into their ERP systems so that you could roll it out easily all over the globe. And that's what makes me a little bit happy, because it started as a small analysis here at MIT and is now globally used within ABI.
Maybe you want to do a project like this yourself, either with your own company or in the company that you might work for in the future.
So first of all, they should all take courses in statistics and probability. That's a must. And then they can step up and go for more advanced things. But they should start from scratch with simple things: linear regression, clustering methods, classification methods, I don't know, KNN, hierarchical clustering, and look a little bit into neural networks, random forests, and that kind of stuff. And they should know the advantages and disadvantages of each of the methodologies in order to pre-select them or to have them in their toolbox and to know when to apply each of them or when to mix them.
And very closely related to that, you need a decent understanding of how to code and how to actually implement these analyses at a large scale. Go one level deeper and try to be able to at least formulate a first version of the code, for instance, in a language like Python.
So I think for any of your students, you know, it's focusing on storytelling or consulting types of skill sets, or understanding that journey that you need to take people on, and that the answer is the beginning. You can do a lot of work to get to a number. And you might be very proud because it got you the A in the class, but that doesn't get you an action or a business accomplishment. So my greatest advice would probably be to focus on that journey and on rounding out your skill sets, because it doesn't end with the answer; it starts with the answer.
Matthias, thank you so much for taking the time. Thank you so much, man. Thank you so much, guys. It was a pleasure. Michael, thank you so much for your time. Of course. It's so valuable.
Sometimes the best way to learn a new concept or technique is to see how it's applied in practice. Thank you so much for joining us on this virtual field trip.