You might think, okay, you just order the right amount so the customer gets his items earlier, like Amazon Prime and things like that, but we're talking about 100,000 to 200,000 items in this mail order company. We've got seasonality, the delivery time of goods varies, and we only get an estimate from our customer, which is about four days from order to delivery. There are a lot of items, and a lot of them are slow-selling goods that are quite hard to predict. And we've got all the returns from the customers, so when you order something you have to keep in mind: okay, I'm ordering, but I'm getting stuff back with some delay of about one, two, or three weeks, depending on the product group. Well, that's one case.
The other one is retail. Let's imagine a supermarket where some amount of goods is sold every day, let's say meat, milk, things like that, and you have to order the correct amount to minimize out-of-stock situations. Out of stock means your shelf is empty. You also have to handle the excess, the surplus of goods in the evening, because we're sometimes dealing with expiry dates. We've got weather, we've got seasonality, we've got special events like a soccer championship, and we take care of everything. We're talking about some hundred items and some hundred locations, so you've got a multiplication here. In the mail order example you don't have locations, it's only one location. And here you usually have fast-moving goods, because milk and bread and things like that are sold in amounts of about 100 per store.
Okay, so what is the essence of this? You have to know the demand and the stocks for certain periods of time in advance, and you have to know the frequency of orders. To order the right amount of items, you have to calculate the correct order amount considering the boundary conditions. So, what do we need in a framework?
Well, my talk covers more or less the prediction, simulation, and replenishment modules and the testing. Everything else is necessary, but I had no time to really put it into the framework. Input and output is usually a big deal. We are dealing with companies that have to bring their data to us. Sometimes it's corrupt, sometimes we've got problems defining interfaces, and output interfaces are also quite difficult sometimes. So I just boiled it down to CSV files: no database, nothing else, just CSV files. I will talk about predictions, but only very low-level predictions like a moving average or rolling mean. In the replenishment, I will make some assumptions so the calculation is easy and you can follow the formula. In testing I will talk for some minutes about pytest, and I've written some unit tests for the functions I use in the framework. Plotting, logging, documentation, deployment, reporting, and monitoring are also very important. So keep this list in mind: if you think about it, you've got a minimal set for a framework.
Okay, predictions. We need to know tomorrow's demand, or the demand for another period of time, let's say two or three days in advance. I prepared some easy example models: take yesterday's sales; take last week's sales for the same weekday (well, I left that one out); and the rolling mean or moving average. You can also use gradient boosting or whatever you like from scikit-learn.
The order calculation. I make some assumptions. We've got a stock count of zero in the evening, so we're throwing everything away each day, because it's easier to predict and make an order forecast for items that are thrown away each day, and I don't have to do a complicated stock simulation. My demand interval is one day, so everything gets delivered after one day. Here I've shown it: you know the stock in the evening is zero, and you calculate an order for that.
So you know, okay, my order is coming in tomorrow and I've got a demand to fulfill so the customer is happy. Then the order is tomorrow's demand minus the stock in the evening, and the stock in the evening is zero, so the order is the demand. A very simple case.
For reasons I'll explain a little later, I'm doing a simulation. I predict the demand for each day on a test sample. I have two cases: take yesterday's demand, or calculate the rolling mean of the last five days. Then I calculate the order for a given strategy: for example, take the expectation value of your prediction or, more complicated, take a quantile of an assumed probability density. Usually you can assume that goods sell according to a Poisson distribution, or something more difficult like a gamma-Poisson distribution. You can calculate this distribution from the expectation value you forecast and then take a quantile of this assumed probability density.
Okay, let's switch to a notebook with some simulated data. There's a simple data generator available in the repository, which you can download. It produces a Poisson-distributed time series with a mean between one and 10. You can put in the number of items; I took about 100 items. So I'm plotting one product. That's the whole time series in a histogram, where you see we've got a very low number of zero sales, then about 40 sales of one, and so on. You see the mean is about four here, so that's a Poisson distribution with a mean of four. And I can plot the time series of this product. It looks like that. I simulated a time series from 2014 until the end of May 2015, and you can see the sales are fluctuating between four and 10, more or less. You can also plot the sum of all sales of all products. I've got about 100 products and the mean is about five, so you see some Poisson fluctuations each day.
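A minimal sketch of what such a data generator could look like, using only the standard library. This is not the generator from the speaker's repository: the function names, the start date of 1 January 2014, the fixed seed, and the choice of Knuth's multiplication method for drawing Poisson numbers are all assumptions made for illustration.

```python
import math
import random
from datetime import date, timedelta


def poisson_sample(mean, rng):
    """Draw one Poisson-distributed integer with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def generate_sales(n_products, start, end, rng):
    """Return rows (day, product_id, sales), one fixed Poisson mean per product."""
    n_days = (end - start).days + 1
    rows = []
    for product_id in range(1, n_products + 1):
        mean = rng.uniform(1, 10)  # assumption: product means spread between 1 and 10
        for offset in range(n_days):
            day = start + timedelta(days=offset)
            rows.append((day, product_id, poisson_sample(mean, rng)))
    return rows


rng = random.Random(42)  # fixed seed so the run is repeatable
rows = generate_sales(100, date(2014, 1, 1), date(2015, 5, 31), rng)
product_1 = [sales for _, pid, sales in rows if pid == 1]
print(len(rows), "rows; mean daily sales of product 1:", sum(product_1) / len(product_1))
```

Writing the rows to a CSV file with the `csv` module would give the kind of input file the rest of the talk works with.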
You can plot the mean for each product. My product ID goes from one to 100, and you see each one has a mean between zero and 10.
I've written some demos on simulation where you can try different strategies: order one, order 10, order the expectation value, order a quantile, and take different predictions. I can show you in a shell; it takes some time. The simulation will take about 30 seconds, and I'm giving it a config file. You can have a look into this config. It's the moving average, and I give it a name. The ID is more or less the output name here. And it will run.
Meanwhile we can have a look at the configuration file I'm giving to the program. It tells the program: simulate quantiles from 10 to 99 in some steps, take the model simple prediction, window zero. I'll show you the code for the prediction; window zero means nothing here. You've got a start date and an end date for the simulation and the replenishment rule you want to use.
This is what I call the strategy simulation code. It more or less reads in a data frame, takes some arguments, evaluates a prediction function with those arguments, does the replenishment, calculates what's too much and what's missing, and writes it back. So as a result of this calculation you've got a rate for each product and each day.
The prediction is in here. Right now I took the simple prediction. That's just a shift: I've got sales in my data frame and they're shifted by zero days, because the shift value I'm putting in is also zero. That means I'm at the evening and I know I have nothing in stock, but I know the last sale, for example five for this product. So my guess for the order, the expectation value for tomorrow, is five.
Okay, it did run and we got some output, it's called AVMR here, it's 14.41. I got a result, so we can have a look.
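The two predictions used in the talk can be sketched on a plain list of daily sales. The talk's code works on a pandas data frame; the function names and the sample history below are hypothetical and only illustrate the idea of "take the last known sale" versus "take the mean of the last five days".

```python
def simple_prediction(sales, shift=0):
    """Forecast for tomorrow: the sale observed `shift` days before the latest one."""
    index = len(sales) - 1 - shift
    if index < 0:
        raise ValueError("not enough history for this shift")
    return float(sales[index])


def rolling_mean_prediction(sales, window=5):
    """Forecast for tomorrow: mean of the last `window` observed sales."""
    if window < 1 or len(sales) < window:
        raise ValueError("not enough history for this window")
    recent = sales[-window:]
    return sum(recent) / window


history = [4, 6, 3, 7, 5]  # made-up daily sales, latest day last
print(simple_prediction(history))        # last sale was 5, so the forecast is 5.0
print(rolling_mean_prediction(history))  # (4 + 6 + 3 + 7 + 5) / 5 = 5.0
```

With a shift of zero the simple prediction just repeats the latest sale, which matches the "last sale was five, so I guess five for tomorrow" reasoning above.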
We've got an index, then the surplus. That means: how many items did I sell and what's left in my stock? It can be bigger than one, which means more was left over than I sold. That's bad. I've got an out-of-stock rate. That means: on how many days was this item out of stock, averaged over all items. For low quantiles it's 0.8 here, and for bigger quantiles it's getting quite low. So you're ordering a lot, but you've got a lot of leftovers.
You can do it for every prediction and replenishment combination and you get curves like this. You've seen the output, and here I wrote down the definitions of excess rate and out-of-stock rate: surplus amount versus sold amount, and item-days out of stock versus all item-times-day combinations. Now you're getting a working curve, and it depends on what the customer wants. I simulated ordering the expectation value, which would be the usual way: we order what was sold yesterday. Then you get an out-of-stock rate of about 50% and an excess rate of 25%. But you can do better. We can take the rolling mean, for example with a window of five, and you see this yellowish curve, I hope you can see it. At the same out-of-stock rate, about 40%, you get about an 18% excess rate. So with this strategy your customer would benefit from a surplus about 6 percentage points lower. You can choose the working point for your replenishment solution, put it in the config file, and you get a replenishment.
I did this with some parallelization, because I didn't write it in a very optimized way. So I thought, okay, I can parallelize it so it gets a bit faster, and that's quite easy if you use multiprocessing from Python's standard library. I've got four cores on this laptop, and the result is a map of a function: the function is the simulation wrapper, and I put in a list of values.
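A rough sketch of how a per-quantile simulation and the `multiprocessing` map might fit together. Everything here is an assumption for illustration: the sales numbers are made up, `poisson_quantile` and `run_simulation` are hypothetical helpers, the prediction is simply yesterday's sales, and a day counts as out of stock when demand exceeded the order. The two rates follow the definitions given in the talk (surplus versus sold amount, out-of-stock days versus all days).

```python
import math
from multiprocessing import Pool

# Made-up sales history for one product, latest day last (illustration only).
SALES = [4, 6, 3, 7, 5, 2, 8, 4, 5, 6, 3, 5]


def poisson_quantile(mean, q):
    """Smallest k whose Poisson CDF reaches q, for 0 < q < 1."""
    if not 0 < q < 1:
        raise ValueError("q must lie strictly between 0 and 1")
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def simulation_wrapper(quantile_percent):
    """Simulate daily orders for one quantile.

    Returns (quantile, excess rate, out-of-stock rate).
    """
    sold = surplus = out_of_stock_days = 0
    days = range(1, len(SALES))
    for day in days:
        forecast = SALES[day - 1]                # prediction: yesterday's sales
        order = poisson_quantile(forecast, quantile_percent / 100)
        demand = SALES[day]
        sold += min(order, demand)
        surplus += max(order - demand, 0)        # thrown away in the evening
        out_of_stock_days += int(order < demand)  # demand exceeded the shelf
    return quantile_percent, surplus / sold, out_of_stock_days / len(days)


def run_simulation(quantiles, processes=4):
    """Map the wrapper over the quantiles, one job per quantile."""
    with Pool(processes=processes) as pool:
        return pool.map(simulation_wrapper, quantiles)


if __name__ == "__main__":
    for q, excess, oos in run_simulation([10, 30, 50, 70, 90, 99]):
        print(f"quantile {q:2d}%: excess rate {excess:.2f}, out-of-stock rate {oos:.2f}")
```

Because each quantile is independent of the others, `Pool.map` needs no shared state, which is what makes this kind of scan easy to parallelize.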
The list of values are my quantiles, so each parallel job is calculated for one given quantile. In our projects, like this mail order company, we commonly use Redis for calculating on multiple hosts, because there we're dealing with, let's say, 650,000 products that we have to simulate for about 25 quantiles and over a time period of 60 days. It takes about one hour with 90 CPUs, and that code is a little more complicated and a little optimized for speed.
Okay, testing. You've seen I've got some functions in my code: the predict functions here, three functions, and in the simulation I've got some functions as well. Here I've got the simulation per quantile. I've written some tests, and the tests are in tests. They're very simple: assume some data frame, put the data frame into the function, and test the result. I'm using pytest: just type pytest and it looks for all the test files in the directory. I've got 10 unit tests and they all pass. I'm also using pytest coverage, so I can say, okay, take those two directories and tell me what percentage is covered. I think the output is an HTML file; we can have a look into it. I've tested those two directories, the replenishment and the simulation directory, and you can see the test coverage is 99%, which sounds good. We can look into it, and you've got the code and everything's green, so this function has been tested. Usually you don't test every case, but you can see you can test every function here. The stock simulation is tested. The init files are just for importing, there's nothing in them. And the simulation is tested. Well, I didn't test the wrapper; you can see it here, it's red. This function is not tested, so I've got 96% coverage: one missing, 25 run.
I'm also testing my config. I'm using the voluptuous package from PyPI. It's quite nice if you want to test config files.
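To illustrate both ideas, small pytest-style tests and config checking, here is a dependency-free sketch. It does not use voluptuous: `validate_config` is a hand-written stand-in that applies the kind of rules the talk's schema enforces (quantiles as ints from 1 to 99, a string model, an int window, dates as year-month-day, a string input file). The key names and sample values are hypothetical.

```python
from datetime import datetime


def validate_config(config):
    """Return a list of error messages; an empty list means the config is valid."""
    errors = []
    quantiles = config.get("quantiles")
    if not isinstance(quantiles, list) or not all(
        isinstance(q, int) and 1 <= q <= 99 for q in quantiles
    ):
        errors.append("quantiles must be a list of ints between 1 and 99")
    if not isinstance(config.get("model"), str):
        errors.append("model must be a string")
    if not isinstance(config.get("window"), int):
        errors.append("window must be an int")
    for key in ("start_date", "end_date"):
        try:
            datetime.strptime(config.get(key), "%Y-%m-%d")
        except (TypeError, ValueError):
            errors.append(f"{key} must be a date formatted as YYYY-MM-DD")
    if not isinstance(config.get("input_file"), str):
        errors.append("input_file must be a string")
    return errors


# Hypothetical config, as it might look after parsing the YAML file.
VALID = {
    "quantiles": [10, 50, 99],
    "model": "simple_prediction",
    "window": 0,
    "start_date": "2015-04-01",
    "end_date": "2015-05-31",
    "input_file": "sales.csv",
}


def test_valid_config_passes():
    assert validate_config(VALID) == []


def test_string_window_is_rejected():
    broken = dict(VALID, window="five")
    assert validate_config(broken) == ["window must be an int"]


def test_quantile_100_is_rejected():
    broken = dict(VALID, quantiles=[50, 100])
    assert validate_config(broken) == ["quantiles must be a list of ints between 1 and 99"]
```

Saved in a file whose name starts with `test_`, these functions would be collected automatically when `pytest` is run in that directory.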
As I showed you earlier, I've got some config files, like the one for the simulation with the prediction, and I've written a validator. I parse the config first, it's in YAML, and then I validate it. Here's the validator: from voluptuous I'm importing Schema, Object, and so on. Look into the documentation, it's quite nicely documented. Here I'm defining some lists, so a list of ints for testing my quantiles. Then I've got a check for dates, and here is the schema that I'm testing against. The quantiles have to be between one and 100. Okay, 100 is not in the range, so it's really up to 99. The model has to be a string. The window has to be an int. This one has to be a date in a certain format, year, month, day, and the input file is a string.
Okay, let's test it. I type in the simulation config and run it. And nothing happens, so everything's fine. Let's change it. I go back to the simulation config, put a string into the window, and run it again. And I get an invalid message. I couldn't print it nicely, but I really do get an exception. Here: "extra keys not allowed" in data, prediction, start date. Oh, why in start date? Actually, it's before that. Interesting. Sometimes I don't know what's happening. And now it's working. Yeah, now it's in the window. Okay, that was the config testing and pytest.
Replenishment. It's nearly the same code I use in the simulation, just a little different. You don't scan over a range of quantiles; you define your working point from the nice curve I showed you earlier. Here I take the 60% quantile, so I hope I'll get an out-of-stock rate of 33% and an excess rate of 23%. That means in 66% of cases, two out of three days, the item is available, and on one day out of three the customer stands in front of an empty shelf. And the grocery store, or whoever it is, has to throw away about 23% of what it's selling.
So that's quite high, but I'm throwing everything away each evening, so that's really a worst-case scenario. Usually in grocery stores you've got expiry dates of a week, or two or three days, depending on the product.
The code is nearly the same. Let's go into the code. It's just called order. I've got a prediction model and I've got the replenishment rule, same as in the simulation. First I calculate the prediction, which is my demand for the next day. Then I apply a lambda on it, and that's the replenishment function. The replenishment function is defined in replenishment rules, so I can name it in the config and say: order one, order 10, which is not good, or order the expectation value of the prediction, which is more or less the prediction itself. And here is the standard replenishment I'm doing: I take the Poisson distribution and put in the quantile, and the forecast for the sales is the lambda. The order is then just sales minus stock; the stock is zero and the open orders are also zero, because I'm just ordering for tomorrow, so there are no incoming goods. And then I'm doing a stochastic rounding.
So let's run it. It's just an order for one day. It takes about five seconds, or two or three. So I've just calculated the orders. Usually you put that into a database, then write it to an output file, and the customer retrieves the output file and does an automatic order. So that's the optimization. The order is just a simple CSV file with a date, a product ID, the sales, which is the truth and which you usually don't know, the prediction, and the order. So my prediction is six, and I'm ordering, I think, the 70% quantile, so it's a little bit higher because it's above the mean.
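The last two steps, subtracting stock and open orders and then rounding stochastically, can be sketched as follows. The function names and the target value of 4.4 are made up for illustration; the talk's own rule first takes a Poisson quantile of the forecast, which is left out here to keep the focus on the rounding.

```python
import math
import random


def stochastic_round(value, rng):
    """Round down or up at random so that the expected result equals `value`."""
    lower = math.floor(value)
    return lower + (1 if rng.random() < value - lower else 0)


def order_quantity(target_stock, stock=0, open_orders=0):
    """Order what is needed to reach the target, never a negative amount."""
    return max(target_stock - stock - open_orders, 0)


rng = random.Random(0)  # fixed seed so the run is repeatable
target = 4.4            # made-up fractional forecast, e.g. a rolling mean
orders = [stochastic_round(order_quantity(target), rng) for _ in range(10_000)]
print("orders are 4 or 5, average:", sum(orders) / len(orders))
```

Plain rounding would turn 4.4 into 4 every day and systematically under-order; rounding up with probability 0.4 keeps the long-run average at the forecast.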
You can have a look into a file I prepared earlier in a notebook: load the orders and plot everything into a histogram. You'll see the order is distributed between zero and 17, and the prediction is a little bit lower, because I said it's the average. And what's my average order for this day? It's six. So on average I'm ordering an amount of about six per item per day, while my randomly simulated data has more or less an average of five. So it's not so bad.
But what is missing? I didn't talk about logging, but I think it's quite important. If you've got a big project with a lot going on, you want to analyze what went wrong, because a system is usually quite stable, but sometimes something happens, like a customer sends corrupt data or you've got memory usage you didn't expect, and then you need a logger. Everything is covered by the logging module in the standard library. You can use Graylog or something like that to evaluate the logging data.
Then reporting, monitoring, and plotting are very important because, well, data in CSV files is nice, but I am a visual guy, so I want to see some plots, some data visualized. You can use different tools like matplotlib and, on top of it, seaborn for doing some simple regressions, or you use Bokeh, about which we had some talks around here.
Documentation is also important. I just wrote down which parameters go into the functions as a comment, but for good documentation you can use Sphinx. And if you've got a production system or a testing system where you want to deploy code: at the moment I'm using Ansible. There are some nice talks on it around here, and you can just look at the web pages.
Putting it all together, what do we have?
We've got a prediction module and replenishment rules, put together into a replenishment module. We've got a stock simulation, which is very simple but easy to enhance, and we've got a simulation. What is also nice to have is documentation, logging, monitoring, tests, and configuration, which I showed you, and deployment and reporting. Reporting means the customer wants a daily report of how many items were out of stock, how much was left over on the shelves, what the average amount of items ordered is, things like that, because usually if you send him the data and he puts it in the database, it's sometimes very hard for our customers to get the data out again. So reporting plays a big role in other companies. We don't like reporting too much because it takes time and you have to prepare data, but usually we have to look at the data ourselves anyway, as a cross-check with our customers. The customer says, okay, the excess rate was 12%, and we say, okay, we've got monitoring and reporting, we just made a plot, it's 8.5%, you're doing something wrong. You always have to cross-check with your customer because in the real world simulating is nice, but double-checking is even better.
Okay, what do we have? We've got a simple replenishment framework with a basic solution for an automated order forecast system in the case of everyday ordering and everyday replenishment. It's quite simple to put this together with the tools that Python's standard library, pandas, NumPy, and so on give you. And what's the lesson? Writing Python code is easy, you can use it in everyday life, and you can use it to produce real value for companies, because throwing away fewer goods or shortening delivery times is very good for our customers, and they are quite thankful that we are supplying them with good forecasts. So thank you, and visit us at our Blue Yonder booth. I hope I see you soon, and if you'd like to ask some questions, feel free.