Thank you all for coming. My name is Nate Pinshot, and this talk is E-commerce with Django at Scale: Effective Performance Lessons Learned. I'm going to take you through some of the things we learned while building our e-commerce site at Undercover Tourist. I've got four topics for you: Database Read Replicas and Failover, Data Caching Strategy, Two-Pass Rendering with Class-Based Views, and Migration Rules. We've got a lot to cover and just a little bit of time, so we're going to go quickly.

First up we have Database Read Replicas and Failover. I'm going to show you how we can implement a simple custom database backend that allows for read replica failover. So why do we need read replicas, and why are they important? They're one of the easiest ways to scale your database read operations. Of course there are other options, such as sharding and partitioning, and if you have tons of money you can just pay Oracle and everything will magically work.

So what can we get out of the box? Let's take a look before we implement a custom database backend. Django has a functional example in its documentation, and you can use it as a great starting point for getting read replicas working. You can get up and running quickly by adding your read replicas to the DATABASES setting and by using Django's built-in database routers. So this is a quick example, and it's taken from the documentation. It's extremely simple and straightforward. You may have seen this before. If you haven't, please go check out Django's documentation. It's great.

There is a problem with this implementation though, and it could be kind of a big one: there's no support for failover. Just because you have read replicas doesn't mean you'll have 100% uptime for your site, because if one of them goes down some of your queries might fail. Read replicas are used for scaling, but they can also be a great fault tolerance mechanism.
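For readers without the slide, here is a minimal sketch of the kind of router the Django documentation describes. The aliases `default`, `replica1`, and `replica2` are assumptions made for illustration, not the names used in the talk, and the class is written so it can be run on its own without Django installed.

```python
import random

# Hypothetical aliases; in a real project these are keys of the DATABASES setting.
PRIMARY_ALIAS = "default"
REPLICA_ALIASES = ["replica1", "replica2"]


class PrimaryReplicaRouter:
    """Send reads to a random replica and writes to the primary."""

    def db_for_read(self, model, **hints):
        # Any replica will do for a read.
        return random.choice(REPLICA_ALIASES)

    def db_for_write(self, model, **hints):
        # Writes always go to the primary.
        return PRIMARY_ALIAS

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations are always fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Only run migrations against the primary; replicas receive
        # schema changes through replication.
        return db == PRIMARY_ALIAS


# In settings.py the router would be enabled with something like:
# DATABASE_ROUTERS = ["path.to.PrimaryReplicaRouter"]
```

Note that nothing here knows whether a replica is actually reachable, which is exactly the gap described above.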
So if one of your read replicas were to go down, some of the people visiting your site who could have been potential customers are now going to see an error page at best, or a loading bar until a connection timeout at worst. Either way you're going to end up losing potential customers.

So let's go through the implementation. I should note that this read replica database backend is only for Django's ORM and not for third-party ORMs like SQLAlchemy. I'm also going to be referencing MySQL here, but this should also work for Postgres. Before we look at the code, I'm going to warn you that this uses monkey patching in order to fail over from read replica connection errors. Why? Well, we don't strictly need to, but it's the most reasonable option. The other choice would be to implement a full database backend. Either way, the best place I've found to recover from a database connection failure is inside the database backend. Unfortunately there's no way to handle a connection error in a database router. Also, Django unfortunately stores references to the database on private properties of the ORM instance objects, and these references are used in multiple places whenever Django needs to reconnect to the database. So the best place to recover from a database connection failure is inside the database backend.

This is an overview of the code we're going to use to implement our backend, so let's walk through it. In order for Django to recognize our backend, we need to implement the DatabaseWrapper class. Since we're piggybacking off Django's implementation, we can just import it and use it like it's our own. Next we need a reference to the get_new_connection method so we can call it whenever our custom method is called. After we've stored a reference to the original method, we can overwrite it with our custom method that handles the read replica failure. That's the monkey patch. Our custom method is very simple and straightforward.
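The shape of that monkey patch can be sketched roughly as follows. To keep the example runnable without Django or a database, `DatabaseWrapper` below is a stand-in class; in a real backend module it would be imported from `django.db.backends.mysql.base`. The use of `ConnectionError` and the host names are also assumptions for illustration.

```python
class DatabaseWrapper:
    """Stand-in for Django's MySQL DatabaseWrapper (assumption for this sketch)."""

    def get_new_connection(self, conn_params):
        # Pretend that one particular host is unreachable.
        if conn_params["host"] == "down.example.invalid":
            raise ConnectionError("cannot connect to " + conn_params["host"])
        return "connection to " + conn_params["host"]


# Step 1: keep a reference to the original method.
_original_get_new_connection = DatabaseWrapper.get_new_connection


# Step 2: define the custom method that wraps the original.
def get_new_connection(self, conn_params):
    try:
        return _original_get_new_connection(self, conn_params)
    except ConnectionError:
        # The failover logic belongs here. For now the error is re-raised,
        # so behaviour is unchanged.
        raise


# Step 3: overwrite the method on the class. This is the monkey patch.
DatabaseWrapper.get_new_connection = get_new_connection
```

With a real driver the exception to trap would be the driver's own connection error rather than Python's built-in `ConnectionError`.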
We call the original method and trap any connection errors. At the moment you'll see we're just re-raising the exception; the heart of our custom backend will be the exception handling. But before we get into that, we need to add a few custom settings to our DATABASES. So here's an overview. Notice we're using our MySQL failover backend engine. The first custom setting serves a dual purpose. It's how we identify read replicas that have the same master database when we're trying to find another one in the event of a failure. In addition, it's also the master database that we'll use if all the read replicas have failed.

This next setting is not for our custom failover, but it's still very important. In order to fail over within a reasonable amount of time, rather than waiting on a long connection timeout, we need to define a connection timeout for the database. For Django's MySQL backend this gets passed directly to the database driver. And note this is a connection timeout, not a query timeout. I've used five seconds here, which is actually what we use in production for our read replicas, but please test and see if you need something different.

So getting back to the implementation, let's drill down into our method. We need to do something better than just re-raising the errors, so let's implement the exception handling. The first thing we want to do is make sure we're working with a read replica that has failover support, so we check for a failover master setting. I'm going to bring that back so you can see it. The setting is only on the read replicas that we want to support failover for. Next, we may be hitting this exception from another failover, so we grab the database alias, and we also have an optional keyword argument. The database alias is the key in the DATABASES setting dictionary.
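As a rough sketch of what such settings could look like, together with one way to pick the next database after a failure: the key name `FAILOVER_MASTER`, the engine path `myproject.db.mysql_failover`, the host names, and the helper `choose_failover_db` are all hypothetical names chosen for this example, not the ones from the talk's code. `connect_timeout` is the connection timeout option understood by the MySQL driver.

```python
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "shop",
        "HOST": "master.example.invalid",
    },
    "replica1": {
        "ENGINE": "myproject.db.mysql_failover",  # hypothetical custom backend
        "NAME": "shop",
        "HOST": "replica1.example.invalid",
        "FAILOVER_MASTER": "default",             # hypothetical custom setting
        "OPTIONS": {"connect_timeout": 5},        # seconds; connection, not query
    },
    "replica2": {
        "ENGINE": "myproject.db.mysql_failover",
        "NAME": "shop",
        "HOST": "replica2.example.invalid",
        "FAILOVER_MASTER": "default",
        "OPTIONS": {"connect_timeout": 5},
    },
}


def choose_failover_db(databases, alias, failed=None):
    """Return (new_alias, failed_aliases) after a connection failure on `alias`."""
    # Remember every alias that has failed so far, including this one.
    failed = set(failed or ()) | {alias}
    master = databases[alias].get("FAILOVER_MASTER")
    if master is None:
        # Not a replica with failover support; the caller should re-raise.
        return None, failed
    for candidate, config in databases.items():
        # Look for a replica of the same master that has not failed yet.
        if candidate not in failed and config.get("FAILOVER_MASTER") == master:
            new_db = candidate
            break
    else:
        # The loop ended without a break: no usable replica, use the master.
        new_db = master
    return new_db, failed
```

A backend could then copy the host (and, if they differ, the port and credentials) from `databases[new_db]` into the connection parameters and try again, passing the `failed` set along.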
This also may not be the first failure, so we keep track of the previous failures with the optional keyword argument and add the current failure. And here's the heart of our failover. This is very simple, but let's step through it quickly. We go through our databases, looking for one that has not previously failed and that is a read replica for the same master database. If we find a usable read replica, we store it in the new DB variable, and if we don't find one, we fall back to the master database. In case you're not familiar with the else keyword on a for loop, it means the loop finished without hitting a break. And we're all set. We've got a new database to use, so we just override the host in the connection parameters and we can get a new connection. I should note that you may also need to replace the port, user name, password, et cetera, in case your different database hosts have different values for those. So now whenever someone visits your site, everybody can be a potential customer. And that's how we implemented automatic read replica failover with a custom database backend. In case your hands are not very fast and you didn't scribble all that down, the code will be online on GitHub.

So, data caching strategy. Why? Well, hitting the database is expensive. Cache isn't expensive, and not hitting the database multiple times for the same data is even better. For our data caching strategy, first we need a database. Our database populates our shared data objects and our data objects, and our data objects are also populated by our shared data objects. Our views are populated by our shared data objects and data objects, and finally our shared data objects, data objects, and views are all stored in the cache. We also need a way to refresh our cache, so we can use a scheduling mechanism such as a cron job to automatically refresh the frequently used data objects. So that's our data caching strategy.
I don't have any code for this section, but this is an important stepping stone for the next section. The main takeaway here is that you want to make sure you're building your data objects or view models using shared data objects or shared view models whenever possible. For example, if you sell products that appear on different pages of your website, you might have a page that lists products only for a certain category, and you might have a page that lists all products. With this caching strategy you want to make sure that you're caching each product data object or view model, which would be a shared data object or view model at that point, and then you can use that cached data object on both of those views.

So, two-pass rendering with class-based views. First, as we did with the other topic, let's take a look at what we can get out of the box with Django. Caching your views with Django is simple, and there are examples in the documentation. You set up your cache server with the CACHES setting, and then you wrap the view function in the URLconf with cache_page or use the decorator version. Very simple. This is what it would look like. This is directly from the documentation so I'm not going to go through it. But say you've decided you want to implement this because you want your site to be very fast. So your first potential customer, Mr. Hackercat, comes to your site and logs in, and everything works perfectly and it's really fast on page reload. He can't DoS your site, because your database servers are still happy. Everything's coming from cache. Then your next potential customer visits your site and, well, it's still welcoming Mr. Hackercat. This person is devastated by your website and thinks your site might be a giant security risk, so good luck getting that potential customer to enter their credit card data.

The good news is Django has a solution for this, actually two. The first one is Vary headers. You add another simple decorator and each user's views will be cached separately.
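To make the Mr. Hackercat problem concrete, here is a small simulation in plain Python. It is not Django's cache middleware; it only imitates the idea that a whole-page cache keyed on the URL alone serves the first visitor's page to everyone, while a key that also varies on the user keeps pages separate. The function and variable names are made up for this sketch. In Django itself, the relevant decorators are `cache_page` from `django.views.decorators.cache` and `vary_on_cookie` or `vary_on_headers` from `django.views.decorators.vary`.

```python
def render_page(username):
    # Stands in for an expensive view render with user-specific content.
    return f"Welcome, {username}!"


def cached_view(cache, path, username, vary_on_user):
    """Return the page for `path`, using `cache` (a dict) as a page cache."""
    # Without varying, the key ignores who is asking.
    key = (path, username) if vary_on_user else (path,)
    if key not in cache:
        cache[key] = render_page(username)
    return cache[key]


# Keyed on the URL only: the second visitor sees the first visitor's page.
shared = {}
print(cached_view(shared, "/", "hackercat", vary_on_user=False))
print(cached_view(shared, "/", "alice", vary_on_user=False))

# Varying on the user: each visitor gets their own cached copy.
per_user = {}
print(cached_view(per_user, "/", "hackercat", vary_on_user=True))
print(cached_view(per_user, "/", "alice", vary_on_user=True))
```

The per-user cache ends up with one entry per visitor, which is the storage cost that comes with this approach.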
The second option is template fragment caching. You make use of the cache template tag and specify a parameter such as the user ID, and everything will just work perfectly. So this is what these could look like. Again, this is from Django's documentation so I'm not going to review it.

So that seems really good. What's the problem? We get caching per user, and we can even specify sections of our templates to cache per user. The problem is that if many or most of your views have user-specific content, you'll need lots of cache storage, which means the solution is not infinitely scalable. I do realize that cached HTML has a very small footprint, but there's another problem as well. With this solution, your views need to be rendered and cached for each user. So if you have a view that doesn't render very fast, each user has to wait n seconds for that view to render.

You're probably saying, wait a moment. You just showed us we should use a multi-layered caching solution. So why are you saying Django's built-in support won't be good enough? If your data objects or shared data objects are large, or if you have many of them to load for a page, then it can still take a while for all these objects to load and the template to render. If you have lots of products to sell, there will be a lot of template tags to render. And you want your pages to be rendered very quickly in all cases. So after we've decided to use Vary headers or the cache template tag, we can actually do a little bit better, because if we're using all of those and loading a lot of objects through the cache, it could still take a while for our views to render. What we can do is use what I've called two-pass rendering. The next section is actually code. So hopefully, all right, let's see how this works.

So, two-pass rendering. The idea is very simple. Let's take a look at the process flow. First we load the data in our view.
We render the template on the first pass. This is going to be all the data that's not specifically tied to a user. This pass will have lots of template tags to render and could take a while. For example, something that would be rendered in this pass would be blog comments or products. Next we cache the first-pass result, and finally we render the second pass. This pass renders all of the user-specific data, and it renders much more quickly than the first pass. For example, this would include the number of items in your shopping cart or the person's name. So this is a layered approach to content rendering and caching. When the next person visits the page, the only thing we need to do is load the first-pass render from cache and render the user-specific content. This will be extremely fast because there will only be one cache hit and a few template tags to render.

So let's take a look at how we can implement this. First, we've got a very simple class-based view which inherits from a class called CacheView that we'll define in a moment. We've also got a couple of extra methods here that will be used by our CacheView parent class. So let's take a look at those. The first is get_first_pass_context_vars. This is where we implement all of our context variables for our template as you would normally think of them: non-user-specific data, so again blog comments, products, anything not specific to the person viewing the page. Next, we've got get_second_pass_context_vars. This will be anything that's specific to the person viewing the page, perhaps a shopping cart count or a user-specific message. There's also one other very important item that is specific to everyone who's going to submit a form: the CSRF token. You may be thinking we don't need to worry about that because Django handles it for us. And you're right. But it does need to be specific to each person viewing the page.
So I'm going to show you how we handle that in a moment. Let's build out the CacheView implementation. We need a few imports from standard Django functionality, and you can see those at the top. I'm going to remove those so we can focus on the CacheView. The main functionality for the HomeView is a call to the render method on CacheView. So let's break that down. The first thing in the render method is a call to render_first_pass. So I'm going to drop out the HomeView and pull out the render_first_pass method, and let's break down what's going on in there. First, we try to load the rendered template from the cache, and if we can't load it, we build it. To build the template, we need the first-pass context variables, and these are the non-user-specific context variables for the view. Then we render the template to a string using Django's render_to_string function. You may notice here we're not using a RequestContext, and this means you won't be able to render CSRF tokens. I know I've mentioned this a couple of times now, and I'll come back to it in a moment. So we save our rendered template to the cache and we return our first-pass template.

Back in our render method, we need to do the second pass. Like we did with the first-pass render, we need the context variables, and these will be the user-specific context variables. Next, we turn the result of the first-pass render into a template. This is part of the magic of the two-pass templates, and I'll go into more detail on this in a moment. Then we render the second-pass template. Notice we're using RequestContext here, so this will allow CSRF tokens to render. And finally, we return the second-pass render.

So let's get back to the CSRF token. Now that you've seen the implementation, I'm going to expose the magic behind it. Let's take a quick look at a simple Django template.
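The render flow just described can be approximated without Django, which may help make the two passes concrete. This sketch swaps Django's template engine for Python's standard-library `string.Template`, where `$$` is the escape for a literal `$`. That means a placeholder written as `$$name` survives the first pass as `$name` and is only filled in on the second pass. The template text, the dict used as a cache, and the function names are assumptions for illustration, and CSRF handling is not modelled here.

```python
from string import Template

# "$product_list" is filled on the first pass (shared by all users).
# "$$greeting" and "$$cart_count" are escaped, so they are left behind
# as "$greeting" and "$cart_count" for the second pass.
FIRST_PASS_TEMPLATE = (
    "<h1>$$greeting</h1><ul>$product_list</ul><p>Cart: $$cart_count</p>"
)


def render_first_pass(cache, key, products):
    """Render the shared part of the page once and keep it in the cache."""
    if key not in cache:
        items = "".join(f"<li>{name}</li>" for name in products)
        cache[key] = Template(FIRST_PASS_TEMPLATE).substitute(product_list=items)
    return cache[key]


def render(cache, key, products, user_vars):
    """Second pass: treat the first-pass output as a template of its own."""
    first_pass = render_first_pass(cache, key, products)
    return Template(first_pass).substitute(user_vars)


cache = {}
page = render(
    cache,
    "home",
    ["Ticket A", "Ticket B"],
    {"greeting": "Welcome, Alice", "cart_count": "2"},
)
print(cache["home"])  # first-pass result, still containing $greeting
print(page)           # fully rendered page for this user
```

One caveat of this simplified version: shared data that itself contains a `$` would be misread as a placeholder on the second pass, so real content would need escaping before it goes into the first-pass output.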
So here's a simple template that has a welcome message for the user who's viewing the page and a simple form where they can input a quantity of our products to purchase. Unfortunately, if we render this through our CacheView, you'll see a lot of warnings like this, and the CSRF tokens won't be rendered. The reason we get this error is very straightforward, so let's take a quick look back at our CacheView class. Remember when I mentioned that the render_to_string function wouldn't render our CSRF tokens? This is why. So let's go back to the HTML template and address the issue.

Remember, we're going to render this template twice. On the first pass, we want to render these lines, which have non-user-specific data. And on the second pass, we want to render these lines, which have the user-specific data and, of course, the CSRF token. So how can we accomplish that? If you watch closely, and the animation works and doesn't speed ahead 50 slides, you'll see that this will be our new first-pass template. So what's going on here? The solution is that we render the first-pass template so that it becomes the second-pass template, by using Django's templatetag tag with its openblock, closeblock, openvariable, and closevariable arguments. And that's the magic. Then we render our first-pass template and we end up with this, which we can then render as the second-pass template. At this point, the only template logic that remains in our template is the second-pass template variables and the CSRF token. So when we render our second-pass template, we get two very important things. First, we can render the CSRF token. And more importantly, we're able to render the template extremely quickly, because we only need one cache hit and the Django template only has a few template tags to parse. So that's two-pass rendering with class-based views.

Migration rules. This is a quick one. So raise your hand if you've never had to take down your production site during migrations for a deploy. Okay.
I don't have anything for you in this section. So hopefully you'll enjoy it still. These are some simple rules we've put in place so that when we deploy to production in our multi-server environment, we can do it while keeping our site online and without spewing a bunch of errors. These may seem like common sense once I get started, but I think it's still helpful.

First, don't go from a more precise to a less precise value type. For example, don't go from a double to a float or to an integer. Instead, make a new column with the new value type and copy the values from the existing column. Second, don't rename columns or tables. Instead, make a new column or table and copy the data from the existing one into the new one. And last but not least, don't delete columns or tables. There's no instead on this one, just don't delete them. You can delete them eventually. Obviously you don't want your database to fill up and run out of space, but wait a few versions and then delete them, in case you need to roll back or in case the deploy fails.

So that's it. Thank you. I appreciate your time. Sorry for the technical difficulties. I hope that's been helpful. Again, the code will be on GitHub. I can give you a link if you come find me, but the user name is npinshot, just my first initial and my last name, on GitHub. It's already up there. Feel free to check it out. It's a working, very simple Django project that shows all the code from this presentation. And one quick note, if I may. We are looking for Python programmers. So if you are looking for a company that's self-sustained and fun to work with, check out undercovertourist.com and look at our careers page. That's it.