 So I'll go ahead and get started. Thanks, everybody, for coming. The title of this talk is Putting Rails in a Corner, Understanding Database Isolation. And that was originally a reference to, quote, from the movie Dirty Dancing, but much like a lot of lines from 80s movies that kind of came out of nowhere and doesn't really go anywhere. So this doesn't really be a dirty dancing-themed talk, but instead you get this lovely botanical illustrations, which I find quite nice. So I'm Emile, and I'm now, as of the last hour, so a software writer at Binti. And Binti is a great company. We work in the GovTech space, working with foster agencies with the goal of helping to find every child a family. And we're hiring, if you wanna talk to us about that. I'm also on Twitter, I've got my handle at the bottom right hand corner, if at any time you want to tweet something about the talk. I talk about computers on there sometimes, but mostly like music theory and jazz and sometimes cute animals. Cool. So let me answer the first question that's all on your minds right now, is should you try and sneak out or else who might benefit from this talk? So say that you have some controller actions that you wrapped in a transaction and you're finding that you're still having inconsistent data afterwards, stick around, this is probably for you. If you have big slow background jobs that use active job or something like that and those need to run transactions, this is another good candidate. Or if you simply write anything to the database that's based on something that you read from the database. So really quite general. In other words, people who need confidence in their data and probably use the SQL database, otherwise this might not apply as much. But if on the other hand somebody just called you and said that they've got some ice cream and it's gonna melt, like totally that sounds better. So feel free to duck out. Great. So thanks for staying. And let me just make a kind of meta note before we really dive in. So this is this kind of a dense topic and I wouldn't, I think the takeaway here is not that you'd be able to walk away and like immediately write code based on sitting and configuring database isolation levels. So I'd recommend that you just kind of get a sense of the kind of problems that are affected by the space and how you might think about configuration and then just remember some terms and go back and search for it later. Cool. So let's consider the context that we're talking about. So we're talking about transactions. We're talking about how they interact with ORMs and especially active record. And it was awesome that DHH set this term up in our mind of leaky abstractions because that's what this is all about and why we still need to talk about this. So just a reminder on transactions. They're really like a sequence of interactions with the database that really you hope have in a lot of cases, these falling properties. So if you've heard the term acid, it's an acronym that stands for atomicity, consistency, isolation, durability. Atomicity means that either everything that you tried to do happened or nothing happened. Consistency means that when you run a transaction, your transaction starts with a consistent state of the database and it ends with a consistent state of the database. It might be the same state or it might be a different one but it should be consistent. Isolation is kind of the primary focus of this talk and I think it's really fair to use your intuition about what isolation means but the official definition of this is something along the lines of if you have a bunch of transactions that are running at the same time, they can be run as if they were run one after the other serially. And then durability means that the things that you write and committed, you expect to be able to that they stay there until you delete them at least. So what we're gonna talk about is how we can control isolation and then affect control consistency. So not too much about durability, maybe a little bit about atomicity in this talk. Active record is awesome, it gives us this really helpful object model for dealing with our data and it even gives us some controls for transactions. So we can start a transaction, we can roll back a transaction as of Rails 4, we can even control isolation level. So that's super helpful. But unfortunately it might hide some of the finer points about what's going on with the database that we need to know about to make our code behave like we expect. So many of you may have used transactions and you take a block of code which you think should have those kind of acid properties and you wrapped it in a active record-based transaction block and then we hope that everything goes right. Well, let's take a look at a scenario of an app called the Congrats app. So this is just an imaginary app but it's an app to send a card from a group of people to one person usually probably to congratulate them. So you can go in, you can create a card, identify a recipient and set a group of people who can sign that card, maybe live a note and then set a send-by date. So people can come in and sign and remove their signature or update their note and then we have a send-by date. So say that not everyone has signed by that send-by date, we go ahead and send it. If you've ever tried to collaborate and get a bunch of people to sign a card for somebody, you know there's always some stragglers. So we'll just put that into our app. So let's say that we have this timeline. So ED gets a promotion, that's pretty awesome. ED has a friend named Pat who is really cool and decides that they wanna send a congratulatory card to ED and invites Dana and Reese. So Dana is totally on top of things, signs right away, but then some drama happens between Dana and ED. I don't know what it is, but I'd love to speculate. So also it's a totally imaginary scenario so I can make it up whatever I want, but that's not the important bit. The important bit is that the next thing that happens is that Reese finally signs and Dana at the exact same time decides to remove their signature. So we've set up our app so that ED is getting a card, but when are they gonna get it? Whose signatures are gonna be on it? It's not really obvious from this, right? So let's zoom in on that simultaneous event. So this could be one way that happens. So Reese says, clicks the sign button, it comes to our controller. We say load the card with the signatures, add Reese's signature, see if everybody signed, and if everybody has signed, then go ahead and send the card. At the same time, Dana clicks, like remove me, I no longer wanna be associated, and we load their signature and then delete it. So this could happen this way, depending on garbage collection or whatever, it could happen this way. Who knows? So what's supposed to happen in either of these scenarios? What do we intend to happen? Well, it could go two ways as far as I can see. One is that we don't send the card until the send by date, and only the people who intended to sign during that entire timeline are represented, and I think that's probably what I would want. So like Dana's signature is not on there, but everybody else's is. But we could also end up sending the card before the send by date, and still having deleted Dana's signature from the database. So at some point, even in our, not only have we misrepresented what everybody wanted, in our database, we have a card that was sent before the send by date without all the signatures. So that seems bad. So why can we end up in these inconsistent states? Well, when we actually do the sending, we're assuming that the context from before hasn't changed. We just made a decision based on that context. So the thing that we have to know is, even though we talked about those asset transactions, what went wrong is that the database actually makes trade-offs. So we relax, the database is relaxed isolation to improve concurrency and performance. So the database says in certain cases we don't have to be a strict, and I put this really cute dog photo on there so that you'd remember this really key important part. So remember the blissed out dog. So there's actually a spectrum of isolation that we can choose, and it trades off between performance cost and how isolated you are. And don't try and read all these right now, we'll go through them. But you are always in one of these isolation levels no matter what. And depending on which database you're in, you might be in one by default, or another one by default. And also depending on which database you're in, it might implement these differently because the requirements are only minimum requirements every level it could actually be more isolated than is specified. So let's go through these. Read uncommitted TLDRs, there's no guarantee about isolation, right? There's nothing that is required of the database to isolate your transaction. So you can in fact even read rows that have been updated or inserted in other transactions that haven't even committed yet. And that means that if you read data during your transaction, if you read it again, it could be totally different. There's no warning and you don't know where it came from. And in fact, it might even roll back. So it's really, you have no guarantees whatsoever. I can't think of a really good reason to use this for production code, but I have actually used it for a sneak and a peek at what's going on in my SQL in production if I have a really long running transaction. So for the next level up is read committed. So in this case, when you read something from the database, you're guaranteed that it has at least been committed by another transaction. However, that means that you still may be able to read data twice and it be different without you having changed anything in that transaction. So it still doesn't sound super isolated, but it's better than reading somebody's kind of like half done work. And then there's no real warning if anything happened. And this is what Postgres chooses its default. Repeatable read is the next level up. So this means that once you've read a row, if you try and read it again, you'll only see changes that you've made to it. So at this point, we're now in a place where the things that we kind of touched during our transaction are now consistent within the transaction. So that's a pretty nice guarantee. And what happens is if you read something and then start updating things, the database will warn you and say, hey, you just did something based on something that changed and committed to a different state by the time you were done with your transaction. So you may want to do something else. So it won't fix anything. It's not an automatic fix, but it is a really helpful warning. You can still end up with inconsistent data in this level. It's a little bit harder, but it's definitely possible. And then this is what MySQL chooses as its default. Although, as again, I said it's implemented differently from Postgres. Serializable is basically full isolation. So transactions can only happen in a way that they could have been written serially. So this is hopefully an implementation of that full I in the ACID spec. It's pretty, I could not think of a way to get inconsistent data without incorrect code in this level. But if you think of a way or you know of a way, let me know, I think it'd be really interesting. And then this is the most expensive. So let's go back to our scenario and see what would happen if we used repeatable read or serializable with this scenario that happened before. So if we got to the end of the transaction in Reese's transaction and say Dana's committed before, then we would get a warning from the database that says, hey, sorry, your assumption's changed. Whatever you did, you can't rely on that. We're going to roll back everything and try again. So that's at least a helpful warning to you to help you know that you have to do something else. It's also possible that Dana could have gotten a roll back and said, sorry, we couldn't have deleted your signature before we sent the card. But at least you'll be able to warn the user in that case. So everything I talked about before is basically just for SQL. We didn't really need to know anything about Rails or Ruby. But as I mentioned, you can enable isolation in Rails as of Rails 4. And what you do is when you have a transaction block, you specify an isolation level as an argument to the transaction block. So cool. We're all done, right? It was less than one line of code for this whole talk. It's simply a parameter change. Yeah, not quite. So as I mentioned, as you increase the isolation guarantees, your performance may suffer. And it may suffer because there is actually more load on the database as well. You may hold locks, things like that. You may have to repeat transactions, but that's the cost of doing business there. And depending on how you implement things, you may introduce deadlocks. It's not the worst thing in the world. It actually looks a lot like those serialization errors that we saw before. But it's another thing that might come up. So what's actually special about ORMs and an active record? Well, there is this great abstraction that reads from the database for you at times that allow caching and then it caches things. And it also writes to the database. And it may do that at any time. So you actually don't always really know if the data that you're reading has been read into a model yet. So we have to go through special procedures like preloading and eager loading or refreshing. So in order to get this great facility from the database that warns us when our isolation has a problem, we need to give a hint to the database to tell it, hey, I'm using this data and I'm making assumptions based on it. And my application logic needs that. So let's consider how we might implement that send-by-date that we mentioned. So one thing that we could do is use an active job implementation. And if you're familiar with active job, it's really cool. Or if you're not, it's also cool. One thing that you can do is call these jobs and say perform this later on this model. So maybe we've got a card model and we say, send the card at the send-by-date later on with this card. And it does this really cool magic. Underneath the hood, it says, I know what this card is. I'm going to take an ID, stick that into whatever our store is for our background job queue. Maybe it's Redis or ActiveMQ or something like that. And then when I call you back in this perform method, I'm going to reform that card for you and pass it in. Unfortunately, what that means is that the read for the database happened outside the transaction. So we thought we could use the serializable isolation level for the body of this active job. But it turns out that the database has no way of knowing that we're using the data in the card to make our assumptions. So one thing that we can do to improve that, unfortunately, we have to kind of not let active job get the card for us. But it's not too bad. We can just introduce an additional line here. Pass in the card ID to perform the job later and then grab it within the transaction. So now the database knows, hey, this is something that we're using for our logic. But there's a problem. What if we sent a card in the actual mail and then the database told us, hey, you shouldn't have done that. Somebody's going to have to go digging through the postbox. And that's not great. So we can try something else. So this is another technique that you can use, which is we'll still continue to pass the card ID and we'll still grab it within the course of the transaction. And then maybe we'll add a Boolean field to the card table that says, are we committed to send? And what we want to do in terms of getting the logic here is we want to say, we're only going to send the card in the mail when this field goes from false to true. Never again. So it should happen really at most once. And depending on how we, it might happen at least once as well, depending on how we deal with our jobs. But here we kind of guarantee that happens at most once. And so in this second line here with the return, we're making a decision based on what we read from the database. Is this committed to send? If not, then continue on, say, we are committed to send. And then the only way that we can either exit the transaction and continue on to the next line or exit the transaction without a rollback is if our assumption went from false to true. And we use the database to helpfully get us from one state to the next just once. Got to watch out for loops, too. So one thing that you might think about doing is sending a reminder email to everybody who hasn't signed a card yet, and it's before their send by date. So one thing you might think is, I'm just going to do this loop in a transaction, do them all at once. And that might be safe, and it might be OK, too, if you know that this doesn't happen, or if there's not very many cards like that, or not very many reminders to send out. But you may also experience a lot of rollbacks. So really, you've got to watch out for the surface area of your transactions. So it's better if you don't make a lot of assumptions and then make a lot of actions based on those assumptions. So classic advice, keep your transactions small. If you do decide to do that, though, at least try and put an ordering, a consistent ordering, on the way that you access resources. Here, we're only accessing them from one table, but we at least should try and access them in, say, an ID order or something that's consistent between transactions. Otherwise, you could lead to deadlocks. I'm not going to talk too much about that, but that's another thing that I would recommend digging into a little bit more if you haven't come across database deadlocks. That's another huge topic. So maybe a better idea would be to go and get all of the cards that you think might need a reminder and then do a transaction around each one and then explicitly do a reload within the transaction block to say, hey, go get this from the database again. Tell it that we're using this data and then make our decision based on what we read within the transaction and then send it if need be. And a little trick you can do is just select the ID and then if you do a reload, it'll actually reload all the fields for you. So we talked about how to configure and use all of the database isolation levels, but how do you actually identify those rollbacks so you can make a decision about whether or not you should do something? Well, if you're running a transaction, what will happen is it will throw this active record statement and valid error. Unfortunately, at the current state of things, the exact nature of that statement and valid could be anything. It could be that your SQL is invalid or something else. So active record does provide this cause field on this statement and valid error. And then what's attached is database driver specific, not just database specific. Postgres is pretty well typed. And then within there, you can decide to try or do not retry, depending to paraphrase Yoda. And my SQL has done less with the inheritance tree here. So it's just error. And these also might all be different for JDBC as well. So the best way to do this is try and create scenarios where you will run into isolation errors and see what's thrown, unfortunately. We haven't talked at all about tests, but also I'm sorry. This leads to trouble with tests. So between each test, for most tests, ideally what you can do is set up a transaction before the test runs, run all your logic. And then when it's done, roll it back and get back to the state that you started at, possibly empty or possibly with some seed data. And that's great. It's super fast. And it's a really consistent way of dealing with test data. Unfortunately, once a transaction has been started, you cannot change the transaction level. So if you're testing code that changes the transaction level, you probably are going to want to use deletion or truncation strategies for your database cleaner. So I wrote a little poem to mention this. This is just to say that I've slowed the tests that were in transactions, and what you're probably hoping would remain fast. Forgive me, they were so simple and so clean and so clear. But now they're not. Sorry. Also apologies to William Carlos always. So some more testing considerations. How do you actually test the concurrency here? It's actually really hard. I don't have a great answer for you. I'm not going to lie. So one thing you can do is load testing. This is more likely to come up with load testing. And then the other thing you can do is kind of manually test it. And this really stinks. But you can try and add random sleeps to things just to prove, at least to yourself, what kinds of errors come up and that they're handled. Yeah, I can't say sorry enough for that. But it totally works, and it works really consistently. But it's really hard to automate. So if anybody comes up with some great ideas for testing, that would be awesome. So let's do a quick review. So remember my top happy dog slide. Databases try to trade isolation for performance. The database and ActiveRecord will let you choose the level. But choosing it may require code and test changes. And it's a performance hit. It's not just a performance hit. And I say it's worth it for the correctness. I mean, when I say it's worth it, I mean, it's required. There's not really much else you can do. You just kind of have to do it. So yeah. So I added this slide after I saw both DHHs and Eileen's talk. And I realized as I was going through this talk, this kind of sucks. This is really hard. And when you saw the final form of that job, I hope that like, or I don't know if it was like me, I was like, wow, this really looks like threading code. And that's really kind of rough. So I don't think it has to be this way. I think we can do more with this. And I think the more people understand this and kind of get an idea of the use cases for this, maybe we can do better and try and push some of this stuff further down the stack. Maybe we can figure out when we entered a transaction that changed the isolation level, it was very specific about isolation level, and we explicitly reread things so that we can hint to the database that we made assumptions on that data, something like that. But we can definitely compress the concept down into Rails and ActiveRecord. I'm not totally sure how yet, but there's definitely work that we could be doing with that. Cool, I just love it. So thanks to Slides Carnival, which I use this nice slide deck under Creative Commons Attribution. And thanks, everybody.