Hello, everyone. Can you hear me? Yes? Great. So we are going to talk, Vikash and myself. My name is Alex. We are data scientists. We are going to talk about how we at RedMart fulfill the demand for tissue paper, water, and all the thousands of items that we are selling, using a lot of cool techniques, including fractal curves. I think that many of you probably have not worked in warehouses or in transport and logistics, so we will first try to give you an idea of what's happening on the ground and what problems we are actually trying to solve, then frame it in computer science terms and explain what particular difficulties we are facing at RedMart with this problem. And then Vikash and I will present our engineering solution. So the context is what I call the outbound transport chain. This is everything that happens to deliver your order once you have placed it on the website, once it starts being processed at our warehouse and then by our transport team. It starts with what I call delivery scheduling, and this is where the piece of software that we are going to talk about runs. It runs in order to later allow the efficient and on-time delivery of all of your items while keeping certain quality metrics and so on. The delivery scheduling part consists of getting the data about all of the orders that you have placed, then running the data through the VRP solver that Vikash will detail later. The VRP solver outputs things that get translated into a route plan. This is a snapshot of the real interface that our routers use to make manual adjustments to the route plans when necessary. Those route plans correspond to the actual trajectories that the delivery trucks will be taking on the ground, and they physically get printed out on actual delivery manifests that the drivers can have a look at and that are useful for ground operations.
So this is really the final output of all that process. I mean, the final output is what you get in your fridge, but this is what the delivery guys get. The timeline of events goes as follows. Once the delivery scheduling has taken place, at some point picking needs to happen. Picking means that there is a person who goes through the warehouse with a little trolley and those red boxes that you see, and fills them up with the contents of your order. Then there is QC. I don't have a picture, so I'm putting cartoon characters on other pictures. QC is just the quality control of some randomized sample of the boxes that have been filled up. Then staging: staging means that all of those boxes get stacked up on pallets somewhere in the warehouse. And then the feeder logistics. The feeder logistics is where those pallets get loaded on feeder trucks. The feeder trucks are larger trucks that transport several of those pallets, and they get dispatched to intermediate delivery centers that we call cross-docks, or depots. At these cross-docks you need to unload the trucks, and once you have unloaded them, you need to load the boxes onto the vans as per the routes that were scheduled earlier. So these are the cross-dock operations, from the big trucks to the small trucks. And then there is the last-mile bit, which is what we are going to talk about the most, of course, because this is what the VRP solver outputs: the routes that are followed by the delivery guys in their trucks, all starting from those cross-docks. And then you get your groceries, but it's not the end of the story. There is the reverse logistics part, which consists of returning the boxes to where they came from. So that's the timeline of events.
But logically, when the delivery scheduling happens, the thing that is being calculated is the last-mile routes, based on the constraints that you have specified: for instance, the delivery time slot and your address. Also, if you have fresh items, things should not stay out too long, and so on. This scheduling of the last-mile delivery trips has a direct influence on the feeder trucks, on which feeder truck the orders should go. There may be one feeder truck that leaves at 1 p.m. and one that leaves at 3 p.m. If your order is scheduled for delivery at, let's say, 4 p.m., then maybe it cannot go on the 3 p.m. one, so the consequence is that it has to go on the 1 p.m. one. That's a made-up example, but there is a whole chain of consequences when we do this scheduling. So the VRP solver eventually also has to decide which feeder truck each order should go on, and this in turn means that picking has to start by a certain time and date so that things are ready to be loaded on the feeder truck on time. We are going to mostly talk about the last-mile delivery round trips, and the name of that game in the academic world is the VRP, the vehicle routing problem. In our case it's a capacitated vehicle routing problem, because each truck has a capacity, with a heterogeneous fleet, because the trucks have different capacities and different capabilities. There are also service time windows: in this little drawing that I made, the numbers 1, 2, 4, and so on correspond to some imaginary time windows. For instance, you can imagine this one is from 2 to 4, this one is from 1 to 3, and so on. And our VRP also has multiple depots.
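As a toy illustration of that chain of consequences, here is a minimal sketch of the feeder-cutoff logic. The function name, the representation of times as hours since midnight, and all the numbers are made-up assumptions for illustration, not RedMart's actual code:

```python
# Hypothetical sketch: an order can only go on a feeder truck that reaches
# the cross-dock before its last-mile route departs. Times are hours since
# midnight; all values are illustrative.

def latest_feasible_feeder(feeder_departures, transit_hours, route_departure):
    """Return the latest feeder departure time that still arrives at the
    cross-dock before the last-mile route leaves, or None if none fits."""
    feasible = [d for d in feeder_departures
                if d + transit_hours <= route_departure]
    return max(feasible, default=None)

# A route leaving the cross-dock at 15:30, with a 2-hour feeder transit,
# can only be fed by the 13:00 feeder, not the 15:00 one:
latest_feasible_feeder([13.0, 15.0], 2.0, 15.5)  # -> 13.0
```

Picking a later feeder then pushes the picking deadline later as well, which is exactly the backwards chain of deadlines described above.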
So, modulo some simplifications to the problem, we need to decide from which cross-dock, from which depot, a particular order will be served. That's roughly the academic context of the computer science problem being solved. Here I just made up a solution by hand, just by looking at it; normally you can't really do it by hand, because the capacitated VRP with time windows and so on is an NP-hard problem, so you need some quite sophisticated technology to solve it, which Vikash will talk about in more detail. But you can see that I tried to make the routes deliver things in the order of the time-window requirements. And here is where the real difficulty comes in. The real hard problem is actually not the academic side of things but the real-world side, because there are many extra constraints that are usually not discussed in the academic literature. For instance, is the objective function being optimized by the solver actually a good function that models the optimal thing to do on the ground? This is a function that returns numbers, and do those numbers really reflect the difficulty and the benefits for RedMart and for the customers on the ground? How can we model the capacity of the vans? I said that the vans have different capacities; you can put more or less stuff in them. How can we model that? It needs to run quickly, because the VRP solver makes many iterations, many trials of various combinations of orders and vans. How can we estimate the durations of traveling from one point to another? How can we account for traffic predictions, if we can make traffic predictions on these routes? How can we model and reduce risk overall? There are many risks.
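To make the objective-function question concrete, here is a minimal sketch of the kind of weighted cost a solver might minimize. The terms and the weights are illustrative assumptions, not RedMart's actual model:

```python
# Illustrative weighted objective: penalize lateness and frozen-goods
# exposure more heavily than raw driving time. Weights are assumptions.

def route_cost(travel_min, lateness_min, frozen_exposure_min,
               w_travel=1.0, w_late=10.0, w_frozen=5.0):
    """Lower is better; the solver compares candidate routes by this number."""
    return (w_travel * travel_min
            + w_late * lateness_min
            + w_frozen * frozen_exposure_min)

# A route that is 5 minutes late costs far more than one that simply
# drives 5 minutes longer:
route_cost(65, 0, 0)  # -> 65.0
route_cost(60, 5, 0)  # -> 110.0
```

The hard modeling question the talk raises is exactly whether numbers like these actually reflect what is good on the ground, not how to compute them.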
There is a risk of damage, a risk that frozen stuff will unfreeze, maybe a risk that we will be late because it's the central business district at peak hour, or something like that. How can we estimate the duration of each delivery, once we reach the parking spot, to go and deliver the stuff? So here I'm going to quickly talk about how to estimate the duration of deliveries, for instance. The first question is: do we actually have data about the duration of deliveries? No, at the moment we don't. So what do we do? We need to find a proxy that will most closely approximate this information, so that we get the smallest overestimate of the duration of each past delivery. Here we happen to have the GPS location and the engine signal, that is, when the engine of the truck was turned on or off. So we can use the engine signal around the customer signature time to determine when the driver was probably out of his truck. This best guess is an overestimate, because maybe the driver had to do something in his truck, maybe he had to check something on the phone, or he had to wait a little bit after his delivery because there was some slack time before the next one. But that's the best data we have, so we need to make do with it. And this is not really solving the problem yet, because it's just observations. How do we actually build a model that will predict future delivery durations for new customers, or for deliveries at the same location but with, for instance, a larger volume of items than we have ever seen? This is a problem that requires some modeling. Also, once we are facing this problem, we can look at what new data to collect that will help us solve it better in the future. So here we are starting to collect data on the proximity between the driver and the truck.
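The engine-signal proxy described above can be sketched as follows. The event representation and field names are assumptions for illustration; the idea is simply to take the engine-off interval surrounding the customer's signature time as an (over)estimate of the service duration:

```python
# Hedged sketch of the proxy: the length of the engine-off interval that
# contains the customer signature time overestimates how long the driver
# spent making the delivery. Timestamps are in seconds.

def service_duration_proxy(engine_events, signature_ts):
    """engine_events: time-sorted list of (timestamp, 'on'|'off') pairs.
    Returns the length in minutes of the engine-off interval containing
    signature_ts, or None if the engine was running at signature time."""
    off_since = None
    for ts, state in engine_events:
        if state == 'off':
            off_since = ts
        else:  # engine turned back on, closing an off-interval
            if off_since is not None and off_since <= signature_ts <= ts:
                return (ts - off_since) / 60.0
            off_since = None
    return None

# Engine off at t=900, back on at t=1500, customer signed at t=1000:
service_duration_proxy([(0, 'off'), (600, 'on'),
                        (900, 'off'), (1500, 'on')], 1000)  # -> 10.0
```

This is exactly an overestimate in the sense described: any phone-checking or slack time inside the off-interval is counted as service time.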
So that gives us a better idea of how long it takes to make deliveries. Other members of the transport team who will be talking later may touch on this. There are many other problems embedded in the vehicle routing problem that are being solved, and Vikash will now touch on some more of those problems and solutions.

Okay. So Alex gave you an overview of the VRP problem and also explained how we are calculating the service duration matrix. So what is the input to our VRP solver? What we have is a set of orders, with their locations, the items within those orders, and the time at which they're supposed to be delivered. We also have a list of vehicles, with their start and end locations and the times during which they are able to operate. The output is an assignment of orders to these vehicles and a sequence, plus some additional statistics, like how much time and distance each route takes. This is a fairly standard constraint optimization problem. Some constraints are called hard constraints, because if they are violated the solution is infeasible and hence can't be used: the routes must start and end at the right location for each vehicle, each delivery must be served by exactly one route, routes must not exceed the capacity of the van, because otherwise it's impossible to fill it up, and deliveries must be done within their time windows. Now, filling a van is a 3D bin packing problem, an NP-hard problem within an NP-hard problem. And the solution has to be really, really fast, because these estimates have to be made very quickly. So what we did was approximate our capacity.
This is the standard RedMart tote. We pre-calculated, for each of the different vehicles, how many of these we can fit, and then we calculate a tote equivalent for each of the deliveries. This way we get a very good approximation of capacity, and it actually overestimates the space needed by about 10 to 15 percent, which is actually good: if we were very, very accurate in fitting our vans, there would be an exact sequence in which the items would need to be placed, and that would be very hard for our delivery reps to follow when filling the van. So we need some slack so that they're able to work with it. And each delivery has to be done within its time window, as we said. Once these hard constraints are satisfied the solution is feasible, but we can still improve upon it, and that is what the soft constraints are for. Deliveries must be made within the time window, but it is better if they're made well before the window ends. We also have to take into account the total time from cross-dock back to cross-dock, and some driver preferences: our drivers get used to a location and become really, really fast there, so we try to model that to make it easier for them to operate. So, for example, if this is a time window, we prefer that the delivery is done at least 15 minutes before the time window ends, because otherwise the risk of delay increases exponentially. As for driving duration: in this sample delivery route, you would start from the depot, travel to location A, service location A, then travel to location C, service location C, and then travel to location B. So in addition to the travel times, the entire route duration is a soft constraint. And this is an example of a travel time matrix, which is an input to the solver.
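The tote-equivalent heuristic can be sketched in a few lines. The tote volume and the exact slack fraction are illustrative assumptions; only the 10-15 percent figure comes from the talk:

```python
# Sketch of the tote-equivalent capacity heuristic: pre-compute how many
# standard totes fit in each vehicle type (with deliberate slack), and map
# each delivery to a fractional number of totes.

TOTE_VOLUME_L = 50.0  # assumed volume of one standard tote, in litres

def vehicle_capacity_in_totes(usable_volume_l, slack=0.125):
    """Computed once per vehicle type: how many totes fit, deliberately
    leaving ~10-15% of slack so packing stays easy on the ground."""
    return int(usable_volume_l * (1.0 - slack) / TOTE_VOLUME_L)

def delivery_tote_equivalents(item_volumes_l):
    """Map a delivery's item volumes to a fractional number of totes."""
    return sum(item_volumes_l) / TOTE_VOLUME_L

# Under these assumed numbers, a van with 4000 L of usable space holds
# 70 totes, and a delivery of 10 L + 15 L of items counts as half a tote.
```

The point of the heuristic is that capacity checks inside the solver become a single scalar comparison instead of a 3D bin packing problem.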
So how do we restrict routes so that they follow our drivers' previous experience of working in a particular area? We have these small polygons, and we create a graph of adjacent polygons. We then use a point-in-polygon algorithm to find out which deliveries fall within each polygon, and if consecutive deliveries fall in polygons that are not adjacent, we put an additional cost on that. We also prefer routes that are geographically compact and close together. How do we do that? This is called a space-filling curve. For each delivery we compute an index on this curve, and deliveries that are close to each other will have close indices. So if you sort them linearly, you get deliveries that are close to each other, and then, based on the capacity heuristic we discussed before, we fill our vans and create an initial solution. This is just an initial solution, and it may not be feasible, so we have to run a search on top of it. What we use is a class of algorithms called metaheuristics, where during the solving phase the algorithm can accept a solution that is worse than the previously best-known solution in order to escape local minima. So we take the initial solution from the earlier slide, create multiple candidate solutions from it, evaluate the objective function, choose the best, and keep doing this until we hit a termination criterion. How do we create multiple candidate solutions? We make certain kinds of moves. For example, if these are two trucks, we try to move a drop from one truck to the other and then recalculate our objective function. This is called a change move.
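The talk doesn't say which space-filling curve is used, so as one common choice this sketch uses a Z-order (Morton) index: interleave the bits of quantized x/y coordinates so that nearby points tend to get nearby indices, then greedily fill vans in index order. Grid coordinates and capacities are illustrative assumptions:

```python
# Z-order (Morton) index as one possible space-filling curve, plus the
# greedy construction of an initial solution from the sorted deliveries.

def morton_index(x, y, bits=16):
    """Interleave the low `bits` bits of integer grid coordinates x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return z

def initial_routes(deliveries, van_capacity):
    """deliveries: list of (grid_x, grid_y, tote_equivalents).
    Sort along the curve, then cut a new route whenever capacity is hit."""
    ordered = sorted(deliveries, key=lambda d: morton_index(d[0], d[1]))
    routes, current, load = [], [], 0.0
    for d in ordered:
        if current and load + d[2] > van_capacity:
            routes.append(current)
            current, load = [], 0.0
        current.append(d)
        load += d[2]
    if current:
        routes.append(current)
    return routes

# Two clusters of deliveries end up in two separate vans of capacity 2:
initial_routes([(0, 0, 1.0), (11, 10, 1.0), (1, 0, 1.0), (10, 10, 1.0)], 2.0)
```

As the talk notes, this initial solution may violate time windows or other constraints; it only needs to be a geographically sensible starting point for the search.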
Another type of move is called a swap, where we swap two deliveries between two trucks and then again recalculate the objective function. There is also something called a tail change swap, where within a truck we swap the positions of the deliveries so that we get a much better route. Now, these problems can have a really huge search space, so we try to reduce it so that we converge faster. To reduce it, we use a random distribution to select which deliveries to exchange, swap, or tail-change-swap: the deliveries we pick should be either nearby based on their coordinates or in a nearby polygon, as we discussed earlier. For example, it makes more sense to swap A with E rather than A with Z. So, just to reduce the search space, we use something like a parabolic distribution, where the closer you are, the more likely you are to be chosen for these kinds of moves. As for termination, right now we have a hard limit of 5 minutes, or a number of rounds without improvement: if we are not able to improve the solution for 200 rounds, it is better to terminate. Even when using metaheuristics, which are good at escaping local minima, we can still get stuck in a local minimum. So what is the best way to escape? What we are doing right now is running 16 parallel searches at the same time instead of one, and in each round 2,000 candidate solutions are evaluated per search. So for each round we are evaluating 32,000 candidate solutions before we reach our termination criterion. That's it. Thanks.
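To make the search loop described in the talk concrete, here is a minimal, hedged sketch of a metaheuristic of this flavor. The acceptance rule, the parameters, and the neighbourhood function are illustrative assumptions, not RedMart's exact implementation:

```python
import random

# Minimal local-search sketch: generate candidate neighbours via moves,
# occasionally accept a worse solution to escape local minima, and stop
# after a round budget or after `patience` rounds without improvement.

def local_search(initial, cost, neighbours, max_rounds=1000,
                 patience=200, accept_worse_prob=0.05, seed=0):
    rng = random.Random(seed)
    current = best = initial
    best_cost = cost(best)
    stale = 0
    for _ in range(max_rounds):
        # In the real solver the candidates would come from change, swap,
        # and tail-change-swap moves biased toward nearby deliveries.
        candidate = min(neighbours(current, rng), key=cost)
        if cost(candidate) < cost(current) or rng.random() < accept_worse_prob:
            current = candidate  # sometimes accept a worse solution
        if cost(current) < best_cost:
            best, best_cost, stale = current, cost(current), 0
        else:
            stale += 1
            if stale >= patience:  # e.g. 200 rounds without improvement
                break
    return best

# Toy usage: minimize (x - 7)^2 over integers with ±1 moves; the search
# finds x = 7 and then terminates via the patience criterion.
local_search(0, lambda x: (x - 7) ** 2, lambda x, rng: [x - 1, x + 1])
```

In the setup described above, 16 such searches would run in parallel with 2,000 candidates per round; here the neighbourhood is just ±1 on an integer for illustration.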