I'm going to talk to you about using EC2 spot instances to get a lot of RAM for a relatively short amount of time to do some processing. I stole a trick from Julia, and that's the URL for these slides if you want to follow along on your laptops. So this is motivated by the fact that sometimes you might have a big file, big data, well, you know, for some definition of big, that you can't process on your laptop. My laptop's got four gigs of RAM. If stuff starts going into swap, then it's going to be really, really slow. It's motivated by me. I wanted to do something also kind of bike-related. I have a whole bunch of historical Bixi data and I wanted to process it a little bit. So we're talking about a 15 to 30 gig CSV file. And with four gigs of RAM, that's not going to go very far. So I need a lot of RAM, but only for a couple of hours to do some processing. And the great thing is EC2 sells their surplus capacity at about a tenth of the regular price. So you can get a machine with not quite 250, but 244 gigs of RAM for 34 cents an hour. So yeah, quite a bit cheaper. And I'm going to show you six relatively easy steps to get this going. The first step is a lie. I gave this talk at PyCon Canada almost exactly one month ago and unfortunately, or more fortunately, really, in that time they released a new set of command line tools, and the first step is to install the command line tools. The new ones are in Python, so that's cool, right? The old ones are in Java. They're really slow and really horrible. So here are some examples of the commands, like ec2-describe-spot-price-history. There's a whole bunch of these. These are the ones that you use if you're doing stuff with spot instances. They do have short, very unpronounceable abbreviations, like ec2dsph for ec2-describe-spot-price-history. These are all out of date. Go get the new command line tools, and they look different. And then here's the key part.
If you're using EC2 and you're using an Ubuntu image, the really key part is this line up here, cloud-boothook. This is the script that runs when your instance boots up, and it's basically just saying I want to install these packages from Ubuntu and certain packages from pip; like Julia said, always install IPython from pip. This runs very early in the boot. The cloud-boothook is the earliest possible thing you can do with EC2, and you want the stuff installed, no, not before it tries to connect to the network, because then it wouldn't be able to fetch the packages, but really early on, before anything else is running. And then the other part is this, which is an Upstart file. This is why you want stuff to start really early. You want the Upstart file to be in place so that when the system boots up, it starts up IPython Notebook with pylab inline, as Julia told us. So what I like to do, or what I have done when I'm doing this, is connect to an IPython instance and do all my processing in the browser as if it was my local machine, except I have so much more processing power, so much more RAM. The next step is you have to choose the right Amazon image, and they make this really hard because it depends on the region you're in and the type of instance you're using, like whether it's an instance store or whatever, and they all have these really annoying names that you cannot shorten into something memorable. And this one is actually incorrect, probably, because I most likely just copied the wrong one. So don't use that, find yours here. Step four is to request the spot instance. So this is where you tell Amazon I want an instance with loads of RAM and I want it with this image. So: this image; loads of RAM, the 8xlarge is 244 gigs of RAM; use that cloud-boothook script I showed you earlier; and this says don't charge me more than 70 cents an hour, because spot instances will die if the current price goes above that.
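As a rough sketch of what that request carries, the Python below assembles the pieces mentioned so far: the image, the instance type, the boot script, and the price cap. It uses only the standard library and builds a dictionary shaped like the parameters of EC2's RequestSpotInstances call, without actually sending anything. The AMI ID, instance type string, key pair name, and package list are all placeholders, not values from the slides.

```python
import base64

# Hypothetical boot script. The "#cloud-boothook" first line is what marks it
# as a boothook; the package names are illustrative, not taken from the slides.
BOOTHOOK = """#cloud-boothook
#!/bin/sh
apt-get update
apt-get install -y python-pip python-numpy python-pandas
pip install ipython
"""


def build_spot_request(ami_id, instance_type, key_name, max_price, user_data):
    """Return a dict shaped like EC2's RequestSpotInstances parameters."""
    # The API expects the user data as base64 text.
    encoded = base64.b64encode(user_data.encode("utf-8")).decode("ascii")
    return {
        # The price cap is sent as a string; the instance is terminated if the
        # spot price rises above it.
        "SpotPrice": "%.2f" % max_price,
        "InstanceCount": 1,
        "LaunchSpecification": {
            "ImageId": ami_id,
            "InstanceType": instance_type,
            "KeyName": key_name,
            "UserData": encoded,
        },
    }


request = build_spot_request(
    ami_id="ami-XXXXXXXX",            # placeholder: look up the right image for your region
    instance_type="EXAMPLE.8xlarge",  # placeholder: the talk only says "8x large"
    key_name="my-key-pair",           # placeholder key pair name
    max_price=0.70,                   # the 70 cents an hour cap from the talk
    user_data=BOOTHOOK,
)
print(request["SpotPrice"])
```

Keeping the request as plain data makes it easy to check the price cap and the boot script before handing it to whatever client actually submits it.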
And they've got SSH key pairs, so this says which one to use. Step five, there's a lot of waiting. I'm sorry, there's about eight minutes of waiting. The spot price request has to get into their system, it has to be evaluated, it has to be pending for a while, it has to launch and start, and so on. But after six or seven minutes, you're good to go. I couldn't do a demo because it's a five minute talk. But then step six, this is the good part: use your gobs of RAM. So the way I do this, like I said, is I connect over SSH. This is a long command you can find on the slides. It's just setting up an SSH tunnel between localhost on, let's see, I can't remember which is localhost on the local machine and which is localhost on the remote machine, but it's just saying forward everything on port 8888 over SSH. And then this way you can connect to it through your web browser. And it looks like this. This is free -g, in gigabytes, and the machine has 236 gigs of RAM, and you're in IPython and you get to do all the pandas stuff Julia just showed you, just with more RAM. Thanks.
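For the tunnel in step six, here is a minimal sketch of how such an SSH command could be put together in Python. It is not the command from the slides. In ssh's -L option, written as local_port:host:remote_port, the first port is opened on your own machine, and the host (here localhost) is resolved from the remote machine's point of view. The host name, key path, and the "ubuntu" login are assumptions for illustration.

```python
def tunnel_command(host, key_path, port=8888, user="ubuntu"):
    """Build an ssh argument list that forwards a local port to the same port remotely."""
    # -L <local port>:<host as seen from the remote side>:<remote port>
    forward = "%d:localhost:%d" % (port, port)
    return [
        "ssh",
        "-i", key_path,   # private key matching the key pair named in the request
        "-N",             # do not run a remote command, just forward the port
        "-L", forward,
        "%s@%s" % (user, host),
    ]


cmd = tunnel_command(
    host="ec2-host.example.com",        # placeholder for the instance's public DNS name
    key_path="~/.ssh/my-key-pair.pem",  # placeholder key file
)
print(" ".join(cmd))
```

With the tunnel running, pointing a browser at port 8888 on localhost reaches the notebook server on the instance. The list form can be passed to subprocess.call as it is, except that the ~ in the key path would need expanding first (for example with os.path.expanduser), since no shell is involved to expand it.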