Hey, everybody, I'm Adam. I'm a software engineer and community lead at a payments company that I'm not going to name. Can I just get a quick show of hands here? How many people here have taken an algorithms class, either in college or on Coursera? And who has never taken one? OK, that's me. It's not just symbolically raising my hand. So hopefully this talk should have something for both groups, but especially the second one. So this talk is called Python for Home Ec. And the slides and my speaker notes will be posted to GitHub after the talk is done. So this is Wikipedia's picture for Home Ec. And so, yeah, Home Ec is defined as the profession and field of study that deals with the economics and management of the home and community. So I'm going to walk through two problems that show you how to do things in your home more efficiently and hopefully remind you that programming can help you reach a better solution to problems in your everyday life. So this is not a picture of me with glasses on. This is a picture of a co-worker of mine, Jonathan. He's the one who inspired this talk, as well as being an awesome software engineer. He's also a farmer and a do-it-yourselfer. This is his garage right after he moved out the cabinets that I really wanted to get a picture of. But the problems I'm going to talk about today are real problems that he had at home last fall. These aren't made-up examples. They're real. And the first problem came up when he was working on this countertop right here in the picture. Butcher block countertops like this are made up of lots of small pieces of wood. Hopefully, you can see that. And this particular one is L-shaped and joined along a diagonal, which means that every piece of wood you see there is a different length. No two are the same. So Jonathan's problem was, how does he cut all those different length pieces out of the standard lengths of wood that he already had at home?
So just for some terminology, I'm going to talk about boards and cuts. Boards are the pieces of wood he has. Cuts are the lengths of wood he wants to have. So in this example, you have a seven foot long board and a five foot long board. And you want two four foot cuts, a three foot cut, and a one foot cut. So as you see, if you start out by cutting the three foot cut out of the shorter board, you're not going to have enough leftover to make both four foot cuts. But if you start out by first making both four foot cuts, then you actually have enough wood to get all the pieces you need. So it's important when you're trying to cut boards into smaller lengths that you think about what order to cut those pieces in. So let's figure out an algorithm for this. We know that our goal is to find places for the biggest cuts first. So let's just do that. Let's start from the biggest cut and go through all the boards trying to fit it in. So in step one up there, you can see that first I take a four foot cut out of the smaller board, then in two I take a four foot cut out of the bigger board, and then I take the three foot cut out of what's left of the bigger board, and I take the one foot cut out of what's left of the smaller board. So let's look at the countertop again. Our example was great. There were only four cuts that we needed and two boards. Well, as you can see in this countertop, there are actually 64 pieces you have to cut. So way harder to think about by hand. And instead of having two boards to cut them out of, there are actually 35, and there are three different lengths. This is all, again, real information from the actual countertop he made. And so keep in mind that there are only three different lengths, despite there being 35 boards, because that'll be important later. And one thing I haven't mentioned so far is if you see those little black lines in the middle of the boards, you actually lose some length every time you use the saw to make a cut.
And that's called the kerf, K-E-R-F. And that's what you lose every time you use the saw. So even though here we pretended that you can make a one foot cut and a four foot cut out of a five foot board, really you can't. So all these details are why we need code to solve this problem, rather than just doing it in our heads or on paper. So here's the code. We're going to walk through it, so you don't need to get it all right now. This is the entire algorithm. It's less than 20 lines. And don't worry if, even as I walk through it, you have some trouble following exactly what I'm saying. You can always come back to the video or look at the code on GitHub later. So first of all, we're going to call this method greedy approximation. And the reason it's greedy is because we take the biggest cuts first. We take as big pieces as we can at the very beginning. And it just needs three pieces of information. It needs a list of the lengths we want to cut, a list of the lengths of the boards we have, and how wide the kerf of the saw is. So different saws will have different ones. So the next thing we're going to do is we're going to sort the cuts in reverse order. So that means we want the biggest cut first, then the next biggest one, and the next biggest one all the way down to the smallest. Because like we said, we want to be greedy. We want the biggest cut first. And then finally, we're going to sort of annotate these lists of lengths. So instead of just having the length of the cut and the length of the board, we also want to keep track of which cuts we found a place to make and which we didn't, in case we can't find somewhere for every one. And we want to track how much of each board we've used so far, so we know how much is left over. And we want to track which specific cuts we made out of each board. And this is really the output of this program, is this list of cuts for each board. Because that lets us go use the actual saw and make the cuts physically.
So for each cut, and remember that they're already sorted, we want to go through each board. And if the board has already been used, we don't just need to cut the length of the cut. We also need to take into account the kerf. So we don't need to do that for every cut, but only every one after the first. Because if you cut a board in half, you're only making one cut, not two. So you need to adjust the length to account for that. And then if the length remaining on the board, which is its length minus the used bit, is at least the length that you need, you can mark that you found a place to make the cut. You add the new length to how much of the board you've used so far. And you mark that you made this specific cut from this specific board, so that you can again use the actual saw later. And since we found a place to make this cut, we don't need to keep looking at more boards for a place to do that. So we just break out of the loop. So you may have noticed here that I skipped something. We sorted the cuts, but we didn't sort the boards. This is actually kind of important. The order does matter. And this problem is a particularly good example of that. If we were to sort all the boards from smallest to largest, we wouldn't be able to find places to make 17 of the cuts we need given this algorithm. But if we sort them from largest to smallest, we can actually find places to make all of the cuts we need except for one. So it's just a huge difference depending on how you sort them. So if we can make every cut but one, that's actually good enough. Since this is a real world problem, it's OK if one of those pieces of the countertop is made up from two smaller boards. So that's great. And we already really have a solution. But it is possible to do better here. Remember that Jonathan's boards are only of three different lengths. So we tried smallest first, and we tried largest first.
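The walkthrough above can be sketched roughly as follows. This is a reconstruction from the spoken description, not the code from the slides: the function name follows the talk, but the dictionary keys (`length`, `used`, `cuts`), the return shape, and the sample lengths (the small two-board example converted to inches) are assumptions made for illustration.

```python
def greedy_approximation(cuts, boards, kerf):
    """Place the biggest cuts first, trying the boards in the order given.

    Returns (unplaced_cuts, plan), where plan holds one record per board.
    """
    # Annotate each board with how much is used and which cuts come from it.
    plan = [{"length": b, "used": 0, "cuts": []} for b in boards]
    unplaced = []

    # Greedy: biggest cut first.
    for cut in sorted(cuts, reverse=True):
        for board in plan:
            # Only cuts after the first on a board cost an extra kerf.
            needed = cut + kerf if board["cuts"] else cut
            if board["length"] - board["used"] >= needed:
                board["used"] += needed
                board["cuts"].append(cut)
                break  # found a home for this cut, stop looking
        else:
            unplaced.append(cut)  # no board had room

    return unplaced, plan


# The 5 ft and 7 ft boards from the example, in inches (hypothetical data).
boards = [60, 84]
cuts = [48, 48, 36, 12]

ideal_unplaced, ideal_plan = greedy_approximation(cuts, boards, kerf=0)
real_unplaced, real_plan = greedy_approximation(cuts, boards, kerf=0.125)
```

With a kerf of zero every cut finds a place, matching the diagram. With an eighth-inch kerf the 36-inch cut no longer fits anywhere, which is the "really you can't" case. Because the boards are tried in the order they are passed in, reordering the `boards` list is how you would experiment with smallest-first, largest-first, or a manual mix.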
What if we tried smallest, then largest, then the ones in the middle? And it turns out if you do it that way, which you can just do manually since there are only three different lengths and it's really easy to reorder them and try them differently, you actually can make every cut. So don't be afraid to use a little manual work to add to what the computer is doing for you. Sometimes it can get you a better solution. And so in this case, he was able to make every single piece of that butcher block countertop from a single piece of wood. So let's jump from practice, from this example, to theory. There is a very classic computer science problem that this is a version of. Can anybody, based on what I've said or based on this picture, tell me what that is? Yeah, the knapsack problem. So you've got a bag, and it can hold a certain weight, in this case, 15 kilograms. And you have a bunch of items, each with both a weight, 12 kilograms, and a value, $4. And your goal is to maximize the value of the objects you put in the backpack without going over that 15 kilogram weight limit. And how to pick those items best is the knapsack problem. So if we wanted to apply our algorithm to the knapsack problem, where you have a separate weight and value instead of just one, you'd have to combine the two by taking the value to weight ratio. Since our algorithm only looks at, in this case, length, only looks at one criterion, not two different ones. So there are lots of different versions of the knapsack problem. Some of you may have seen this XKCD comic before. If the value and the weight are the same, and you have to exactly hit the weight limit instead of just being under it, it's called the subset sum problem. And so if you had a menu with different prices on it, and you said, I want exactly $15.05 of appetizers, then the waiter trying to figure out which appetizers that meant you wanted would be an example of the subset sum problem.
So on the other hand, if instead of having to hit the weight limit exactly, your weights equaled your values, but you had multiple knapsacks, you didn't just have one, you would have the cutting stock problem. And it's often thought of in two dimensions, like in this image, think of this as a big piece of paper, and we're cutting a bunch of smaller pieces of paper of all different shapes and sizes out of it. And so how to pack those different cuts into this two dimensional space is the cutting stock problem. And so this is what we just solved, except that when we solved it, we only had one dimension, length, instead of length and width. But it's the same basic idea. So why is all this important? Why is the knapsack problem classic? Why did I just spend 10 minutes talking to you about countertops? To answer that, let's think about your data center that you have at work. Maybe it's in the cloud. Maybe it's a physical data center. It doesn't matter. You've got a bunch of servers, and you have a bunch of VMs or processes you want to run on them. And each of those servers, maybe one has seven gigs of RAM, and one has five gigs of RAM. And two of your VMs need four gigs of RAM. One needs three, and one needs one. If you want to pack those VMs efficiently onto your physical servers, you need to solve the knapsack problem to figure out where to put each VM based on its RAM needs. And you end up with wasted server capacity and not enough space if you don't do that efficiently. Now, in real life, it's not quite that simple. VMs don't just need RAM. They need CPU time. Sometimes they need multiple cores, and they need IOPS. So it is a multi-dimensional problem, much like the cutting stock problem pictured here rather than the version we've solved. But it's the same general idea. So this is a problem that you do actually have to solve if you're Amazon, especially, and you have a huge number of VMs to allocate.
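As a rough illustration of the multi-dimensional version, the same biggest-first idea can be applied when each VM needs more than one resource. Everything here is an assumption for the sake of the sketch: the function name `place_vms`, the VM and server names, the core counts, and the choice to order VMs by RAM first and cores second. The RAM figures are the ones from the example above.

```python
def place_vms(vms, servers):
    """Greedy placement: biggest VM first, first server with room wins.

    vms and servers map a name to a (ram_gb, cores) tuple.
    Returns (placement, unplaced), where placement maps VM name to server name.
    """
    free = {name: list(capacity) for name, capacity in servers.items()}
    placement, unplaced = {}, []

    # Sort by (ram, cores), largest first.
    for vm, need in sorted(vms.items(), key=lambda kv: kv[1], reverse=True):
        for server, available in free.items():
            # The VM fits only if every resource fits.
            if all(a >= n for a, n in zip(available, need)):
                for i, n in enumerate(need):
                    available[i] -= n
                placement[vm] = server
                break
        else:
            unplaced.append(vm)

    return placement, unplaced


# Hypothetical fleet: RAM from the talk's example, core counts invented.
servers = {"small": (5, 4), "big": (7, 4)}
vms = {"web1": (4, 2), "web2": (4, 2), "db": (3, 1), "cache": (1, 1)}

placement, unplaced = place_vms(vms, servers)
```

Real schedulers weigh many more constraints than this, so treat it as the countertop algorithm with a second dimension bolted on rather than a recipe for production placement.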
So if you're interested in learning more about the knapsack problem and more advanced algorithms for solving it other than just greedy approximation, Wikipedia is a perfectly good place to start. So let's jump from here to talking about the second problem that Jonathan had. The second one involved his pantry rather than his kitchen. Unfortunately, unlike the picture of the countertop, this is a fake. This is not actually his pantry. But just like in this picture, Jonathan has lots of different kinds of preserves because he's a farmer. So he grows stuff and then preserves it for the winter. And his family eats this stuff all winter and all spring until they get fresh produce again at the beginning of the summer. So what he really wants is to have as much variety as possible during the winter. He doesn't want to eat all the raspberry jam first and then all the blueberry jam. He wants to spread them out. And this seems like a simple problem, but it's actually a little bit complicated. So let's talk about why. Let's say you've got the same number of three different types of jars, red, blue, and green. If you wanted to sort of have a lot of variety, it's not that hard. You just sort of roughly space them out evenly through time. And you're going to have a variety from one week to the next of which one you're eating. But let's say suddenly you had way more red jars than you did blue or green. Now it's a lot harder to make sure the red jars are spread out evenly through time. If we try and spread the blue jars out evenly, that means that the red jars are going to end up bunched up. And you're going to have red, whatever, jelly, two weeks in a row. And that's not what we want. So we want the red, the most common type of jar, the red jar, to be spaced as evenly as possible. So you can see here, by evenly spacing the red jars first, you don't have any color twice in a row, but blue is a little bit imperfect.
It's not quite evenly spaced, but it's still better than it was in the previous version, where there were two reds right next to each other. So we need a solution that, when both the blue and red jars would need to be in the middle in order to be evenly spaced, picks the more common jar, the red jar in this case, to be in that middle spot. So the first thing I thought of was let's just put all the most common jars evenly spaced on the shelf, and then put the next most common ones between them, and then the least common ones in between at the end. But the problem is if you actually do that, the gaps left on your shelf at the end for the least common ones aren't at all evenly spaced. So you see here, the two green ones, if you try and evenly space both red and blue, end up together in the middle. And so this isn't as bad as two red ones being together, but it's still not ideal. They're still grouped up. So how do we avoid this? What if instead of putting all the jars on the shelf spaced out, we put them all next to each other and then move them apart and stick new ones in between? Well, if you start out with the most common jar, the red jars, you put them all on the shelf, and then you stick the blue ones in between them. It looks pretty good. But then when you stick the green ones in between, nothing ever separates those two red ones that are together in the center. So you still end up with some of the most common ones bunched up, and that's really the last thing we want. So what happens if we do the same exact thing, but we start from the least common type of jar and move to the most common? We start out with two green ones, and then we stick the four blue ones in between them, and then finally we take the red ones and stick them in between those. Because we're doing the most common one last, we can put them exactly where we want them to be rather than having to worry about them getting moved around by other jars coming later.
So this is the algorithm we want, and the code for it is even shorter than the code for the previous problem. As is often true, it was figuring out what code we wanted that was tricky and took me some time rather than actually writing it. So we're gonna walk through it again. Don't worry again if you don't get every little bit. You can always pull it down from GitHub and play with it later. So first of all, this is, I thought, quite interesting. These are the actual contents of Jonathan's pantry at the beginning of last winter. So you can see he's got three jars of tomatillo, cape or soup, five of spicy marinara sauce, 20 of salsa verde. So there really are very different numbers of different jars, different types of preserves. So it really is important that that salsa verde ends up spaced out as evenly as possible, or he's gonna be eating it almost every day. So, and this is gonna be stored in our program in a simple dictionary, a mapping of name to frequency or quantity of jars. So the first thing we do in our code is we sort those jars in increasing order because we wanna start with the least common type of jar and move up to the most common type. So we're just gonna sort them and then we're gonna create a variable list to represent our shelf. And of course it starts empty because we haven't put anything on the shelf yet. So then for each item that we wanna put into the pantry, we figure out how many gaps there are gonna be between those jars because what we're trying to figure out here is every X jars, I'm gonna insert one of the next most common type. So first when there's nothing on the shelf, every jar, I'm just gonna put them all in one after another, but then after that every one or two jars, I'm gonna insert the next most common one. So I want to keep track of what this ideal spacing between the new jars I'm putting on should be. 
So I calculate the gaps, I look at how many jars there are on the shelf and I divide that number of jars on the shelf by the number of groups I need them in in order to get a spacing. And note that since this is a division, this isn't necessarily going to be an integer, this could be fractional. So and then you go through all the jars you have in this new set that you're just putting on the shelf and you put them in after each group of jars. So after the first one goes in at position zero at the very beginning, the second one goes in after the first group of jars, the third one after the second group of jars, et cetera. So you multiply the number you're on by the spacing and then you round that to an integer because again spacing might be fractional and then we add the number of the jar it is that we're adding and that's because every time I stick a jar on the shelf, the index of all the jars to its right is gonna go up by one. So if you insert a new item in the beginning of a list, all the others are gonna go up by one. So yeah, so if you follow this algorithm, you actually end up with a really good solution and hopefully we'll have time for me to show you at the end. But it works really well for a relatively small number of jars. If you're familiar with how fast or slow things are in Python, you might have noticed this insert statement. Insert takes time proportional to the length of the list you're inserting into. So since I'm inserting into the shelf once for every item that goes on the shelf, the time it takes is gonna go up by the square of the number of items you have. I do think that you can actually correct this implementation to be linear time but I haven't taken the time to do it because we only had fewer than 100 jars. But it does mean that this might not scale up to giant problems where you have maybe 100,000 jars. So let's step back for a bit and think about this type of problem in general.
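Pieced together from the description above, the pantry algorithm might look something like this. It is a reconstruction, not the code shown on the slides: the function name `arrange_shelf` is hypothetical, the gap count is assumed to be one less than the number of jars being added (so the first and last jars of each type land at the two ends), and the jar counts use the red, blue and green example with six red jars as an assumed figure.

```python
def arrange_shelf(jars):
    """Spread jars out, handling the least common type first.

    jars maps a jar type to how many of that type there are.
    Returns the shelf as a list of jar types in eating order.
    """
    shelf = []
    # Least common first, so the most common type is placed last
    # and nothing can push it out of position afterwards.
    for name, count in sorted(jars.items(), key=lambda kv: kv[1]):
        # Assumed: count jars leave count - 1 gaps between them.
        gaps = max(count - 1, 1)
        spacing = len(shelf) / gaps  # may be fractional
        for i in range(count):
            # Each earlier insert of this type shifts later slots right by one,
            # hence the "+ i".
            shelf.insert(int(round(i * spacing)) + i, name)
    return shelf


# Hypothetical counts for the colored-jar example.
shelf = arrange_shelf({"red": 6, "blue": 4, "green": 2})
```

Each `list.insert` call is linear in the length of the list, which is where the quadratic running time mentioned above comes from.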
We've got some criteria that we can measure that we're trying to optimize. Here we wanna space jars out evenly so we don't get tired of eating any one thing and we care more about the more common types of jar and because of that we were able to find a good approximation for figuring out how to make them as evenly spaced as possible. But our approximation is very specific. It only works for sorting jars. We can't use it to solve other types of problems where we wanna optimize some criteria but there are techniques that allow you to optimize many different types of problems and I'm gonna talk about one here. One we could use is called simulated annealing. Annealing is actually a physical process that involves heating up a metal to a certain temperature above its crystallization point and then cooling it back down and you do it in a specific way that when you're done the metal is softer and easier to work with so it's less likely to crack, it's less hard and basically its structure is just more uniform internally so there isn't a lot of stress caught in the metal. It's very easy to mold because it's all the same on the inside. So simulated annealing is just what it sounds like. We use a computer to simulate the process that the metal goes through as it cools. You calculate the total energy held in the structure of the metal and as it cools you generally tend to move to states that have less energy pent up in that structure. So let's watch this animation from Wikipedia for a second. You can see as the temperature goes down we tend to reach a point that has an optimal energy level. While the temperature is high we're jumping around to points that may or may not be optimal in any particular sense but we get closer and closer as the temperature drops. So think back to our problem. We want our jars to be spaced as uniformly as possible and that's exactly what annealing does. It makes the structure, it simulates the structure of a metal becoming more uniform. 
So if you treat the jars as having more energy the closer together they are then when you simulate annealing they'll tend to move farther apart until they're evenly spaced because that is their lowest energy state. So that's how you could apply simulated annealing to a one-dimensional problem like this. So another place that you might use simulated annealing is when you're designing a circuit board. You want the board to be as small as possible, compact as possible but you also want all the wires on the board to be as short as possible so you don't wanna have to run them all around the outside, you wanna be able to run some of them through the middle, and you have all these different pieces of different shapes that have to fit on the board. So what you can do is define a criterion that combines all of those things, the total size of the board, how the pieces are located and the length of the traces, and you then use simulated annealing to optimize that and you end up with a better board layout than you could come up with by hand. So just like with the knapsack problem Wikipedia is a good place to learn more about simulated annealing and I didn't really explain exactly how you actually use simulated annealing since obviously we didn't actually use it to solve this problem, we used a different approximation technique, but the specifics of it are explained pretty directly in pseudocode on the Wikipedia page. So hopefully these examples reminded you that programming can be used to solve everyday problems and you've learned a little bit about thinking through a problem and coming up with an algorithm to solve it.
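For anyone curious what simulated annealing on the jar problem could look like, here is a small sketch. Since it was not actually used for the pantry, everything in it is an assumption chosen for illustration: the energy function (one over the distance between every pair of same-type jars), the move (swapping two random jars), the geometric cooling schedule, and all parameter values and names.

```python
import math
import random


def energy(shelf):
    """Higher when jars of the same type sit closer together."""
    total = 0.0
    seen = {}  # jar type -> positions seen so far
    for i, name in enumerate(shelf):
        for j in seen.get(name, []):
            total += 1.0 / (i - j)
        seen.setdefault(name, []).append(i)
    return total


def anneal(shelf, steps=5000, start_temp=1.0, end_temp=0.001, seed=0):
    """Swap random pairs of jars, cooling as we go; return the best shelf seen."""
    rng = random.Random(seed)
    current = list(shelf)
    current_e = energy(current)
    best, best_e = list(current), current_e

    for step in range(steps):
        # Geometric cooling from start_temp down toward end_temp.
        temp = start_temp * (end_temp / start_temp) ** (step / steps)
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        new_e = energy(current)
        delta = new_e - current_e
        # Always accept improvements; accept worse states less often as it cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_e = new_e
            if new_e < best_e:
                best, best_e = list(current), new_e
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap

    return best


# Start with every type bunched together (hypothetical counts).
start = ["red"] * 6 + ["blue"] * 4 + ["green"] * 2
result = anneal(start)
```

Because the function returns the best arrangement it has seen, the result is never worse than the starting shelf by this energy measure, though there is no guarantee it is the true optimum.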
I don't know about you but in my day-to-day job as a programmer I'm not really coming up with algorithms, I'm just implementing them, so this is a good exercise because you really practice thinking algorithmically, and also I hope that you learned a little bit about how fairly simple everyday problems can relate back to big, important and classic computer science problems and algorithms. So we're gonna try and take a quick code break and I'm going to run these programs for you so that you can see that I'm not making them up and they really work. So let's start with the pantry problem. Hopefully you can all see that we have our list of jars that are in the pantry and our algorithm down here and then at the end all we do is print out the list we end up with. So if you look here we know salsa verde is most common, right? So we definitely want that to be at both ends because we know if it's not at the very ends of the list then it's not spaced apart as far as it could be, which is what we want from the one that's most common. So you see salsa verde is there at the bottom and then again about six items up and if you look it's also at the very top so it looks like a good result. And then if we jump over to our greedy approximation method here you can see that in addition to what I showed you before there is a little bit more code that just sets up the lengths of cuts we need and the lengths of boards we have. Notice again the boards are not in order. We take the shortest one first, then the nine longest ones, and then the 25 in between, and we've set our kerf, our cut width, to be an eighth of an inch, and then we get the result of the function and print it out. And in this case you can see we found there are no remaining cuts to make, every single one was made, and for every single board we have a list of the cuts we need to make out of that board and how much is used, and so we can then just print this out, take it over to our table saw and make all of our cuts.
So that's about all I have. Happy to take questions now and like I said everything will be posted on GitHub after the talk if you want to learn a little more about it. Thank you very much.