Yeah, so I'm just gonna start this thing off. Accord is the next great Bitcoin transaction protocol. That's right, this is your crypto bro path to happiness, and I'm gonna talk about how you can make a million dollars today with Accord. You didn't expect that, did you, Blake? Yeah, yeah, on the money.
How's everyone's summit going? Is this good? We're almost through, right? Day two is hard, huh? Because you burn hard the first day and you're like, I could do this for two days. Then you wake up and you're like, no, I can't. I always feel like we should have day-and-a-half conferences. Like a two-day-lite conference. And re:Invent is literally a week now. It's Sunday to Saturday now. What's that? A long morning? Yeah, we could do that too. We start at 3 a.m. and end at noon. We'll start at midnight. I know some of you are probably like, that's the middle of the day for me.
All right, 4:10. I think this is when I'm supposed to start. I don't know, I'm kind of losing track of time. So I'm supposed to be entertaining. Josh. Let's do this. All right. Everybody loves ACID. Yes, it is. Funny, I just read an article about how Silicon Valley is the acid capital of the world. And I'm like, yeah, that's my talk tomorrow. Not that. All right, come on, the jokes are easy, right? Let's just go.
Okay, so here's everybody. When I've talked to any user about Cassandra since, it was probably around 2011, 2012, the thing I keep hearing is: oh yeah, Cassandra's awesome, man, uptime is awesome, scale is awesome, but I need transactions. And I get it. I mean, we're like Barbies living in a Barbie world, right? We're living in a relational database world. I've been using relational databases since a long time ago, and that's what we all expect. They were built around ACID transactions, right?
Oracle started this whole thing in the 80s and it's just gone crazy since. So it's not weird. We're just gonna have to deal with this.
I'm gonna talk about three use cases, macro use cases, that Cassandra can't do yet. This is what I've always had to talk around. It's either you don't do it or you make up something super crazy. Hey guys, how you doing? We've done that. You're like, oh, it doesn't have transactions, let's use magic and code to make it work, right? And you can fake it, but you're putting a lot of work on yourself and not on the database.
So number one, bank transactions, the classic, right? Even though banks would actually work like this. You know, it's this thing of exclusively moving from one table to another. That's the big deal, right? There's this order: this thing happened before that. And the exclusivity, that it's only one person or one process making a change. Many of you know, because you were at dinner with me last night, this is a Midjourney image. I asked it to give me an Alice to Bob, and there's Alice giving Bob a big fat stack of Benjamins. But that's a transaction. This is a hard one, right? This is probably the number one thing people talk about when they talk about transactions: oh, I want to be able to do a bank transaction. Because people love money, right? And if money gets lost, that's a data loss that hurts. If your bank loses your money, you're probably not gonna work with that bank anymore.
Inventory management. Once you understand the problem, this is really more of an issue with inventory being finite. It's not that you have inventory, it's that eventually you're gonna go to zero. A good example is if I want to buy a PlayStation 5 and there's only five left. You all had this problem, right? Am I the only one?
The PlayStation 5, there's five left, and everybody clicks. Who wins? Well, what you don't want is to find out that you now have negative 300 PS5s in inventory. That's just not a normal number, it's not okay. So you need to be able to be precise, and first come, first served. And it's a high-concurrency problem too. If you have one record in your database that says PlayStation 5 and here's your inventory count, that's a problem. So this is another one that we definitely stay away from.
And then finally, the distributed ledger. I told you there was gonna be Bitcoin in here. But actually this is much more than just Bitcoin. It's a more generalized problem, and this is one that's really tough. The changes have to be in order, but they also have to adhere to some sort of timing as well. This happened before that. And this is wall clock. If you talk about consensus protocols, there's the real time. If I'm sitting there looking at the clock and the second hand ticking, I wanna know that the order is: I clicked here, and then that person clicked here, so I win, right? There's no funny math when it comes to time. Time is supposedly an arrow. And that's a really hard problem, sticking to that.
So what needs to change in Cassandra? All right, I should say what has changed in Cassandra because, spoiler alert, this has been happening for a while. So what needs to change? Well, if you wanna know more about consistency levels, I've been going back to this Jepsen website for a long time, just because it's as good an explanation of consistency levels as you can find, and they're gathered in one place. I love this chart. I've been using it for a while, and it just shows you the hierarchy. The higher you go, the stronger the consistency. The lower you go down, YOLO.
And Cassandra has fit in various places in that. But if you notice, there's a fork in the road here. Over here is linearizable, God, these words, and then serializable. I don't even think those are real words, but we've made them words. They fork off, they're two different things, right? Well, serializable is really the current limit of distributed databases, where people are at. And these are like the Spanners, Auroras, Cockroach, Yugabyte. And they're very clear about the reason. You go to the Cockroach website and they say, hey, we are serializable and this is why: to make it linearizable is super hard, and it's not that it's impossible, the costs are too high. To do it just takes too much time and resources, and users would have a terrible experience. So it's a trade-off. It's not a terrible trade-off, but it is a trade-off. It's really about maintaining that strict sequence, so you don't have this race condition where one process overwrites another. And that current limit of distributed databases is good enough for a lot of things, but not the best.
Oh, "current," that's the current limit. I'm sorry, my slide is messed up. It should say the word Cassandra. What we're trying to accomplish in Cassandra with Accord is taking it all the way to the top, to strict serializable. So you get the strict sequence and the real-time ordering. And that's a pretty squishy term, strict serializable. There's a lot of definitions out there, but really the thing you need to understand is that it's not just the order of when you did it, it's also the wall clock time and maintaining that time order. Which is a great way to not lose data in the long run. Because the only reason we care about consistency in a database is that we have situations that could lose data or overwrite it. And the lower you go down, the more of those cases can happen.
The higher you go up, and this is the best thing, the more you can be super lazy. The developer experience, as I said, is around this whole business of: we in the Cassandra community decided to keep the features in the database lighter at first. It's very much put and get, but we put a lot of complexity on the developer. That has been evolving and changing for a while. And that's good, because developers are like, yeah, no, remember Thrift, it was awesome. But you had to really understand how to write good code to not destroy your data. So as we make more features available in the database, we make it easier for developers to write code against it. It really just transfers all that complexity over to the database, where it belongs.
And that really is where databases have been for a while, right? I was an Oracle DBA, don't judge. But yeah, there was a lot of complexity put into that. Whenever I wrote a very simple SQL command, I knew lots of stuff happened, but I also trusted it, right? Well, after like nine. Eight was, well, but after nine it was totally solid. You just trust what it does, and I didn't have to write a lot of complex code. I just threw it all at the database and it figured it out, right? Because I'm a lazy developer. We should all be lazy developers.
And the difference too is this observer reference frame, and this is not a physics thing. This is about how, as you're doing something with your application, you wanna feel like you're the only one there. This is you: I'm the only one using the database, right? Even though there could be thousands and thousands of processes using the database. You don't wanna have to think. I mean, if anyone's had to think through a race condition, yeah, yeah, I see the faces.
Thinking through race conditions is hard because they're kind of amorphous and weird and they usually aren't firmed up. There's no deterministic way to say this is a race condition, because they just occur, and that's why they suck. So let's just avoid that completely. You just wanna look like you're the only person working on the database. All right, liven up everyone, we're almost done.
So I talk about this a lot, but it is a thing. Pat Helland, I love this article. This uncoordinated business is: the more we have to think about coordination, the worse off we are, because it is a hard problem. It requires a lot of thinking. So just start working as uncoordinated as possible, because coordination is expensive. It is the most expensive operation you do on a database. That's why whenever you write data into Cassandra at a consistency level of ONE, it's super cheap, right? What's the fastest way? Well, ANY. Don't use ANY. But ONE is fast, right? And why? Because it just YOLOs the data right into the database, drops it in, and doesn't do anything to check it. It comes right back. That's a very cheap operation because it's mostly a single-node operation. And when you start thinking about all the coordination that has to happen otherwise, it gets really expensive. So we don't want that. We need to have a single-system experience with a distributed system. And I'm not saying it's the holy grail, but I'm saying it's pretty tough.
So back in the day, 1989 to be exact. Yeah, that's right. Top movie in 1989 was Lethal Weapon 2, and behold, Paxos. I saw Lethal Weapon; I didn't know about Paxos in '89. But if I look at Paxos now, after years of looking at it, it's actually not too hard. It's easy to reason through, it's impossible to implement. I think we just nailed it in the Cassandra project this week, last week. It's been a process, right?
But it's kind of the headwaters of all consensus protocols, Paxos back in the day. It's just saying: if one system in the distributed system says, hey, I wanna make a change, it proposes that to everyone else. All the other nodes are like, cool. And if it gets a "cool" back from everybody, it does it. And if one says no, then it starts again. It's a consensus protocol, meaning it has to get consensus. You can't have naysayers, right? And Blake, remember that time we argued about EPaxos? Oh yeah, that's where I get my Paxos knowledge. Back in the day, we had a great time arguing about it. You won, by the way.
Paxos is not an overly complicated thing, but when you look at it, you're like, oh, there could be a lot going on there. Yes, that's actually true. It can go back and forth. Under high contention, where there's one thing that all the processes are trying to change, it can get really noisy. Because everyone's like, I want to change it, I want to change it, I want to change it. And they overlap and they're like, no, yes, no. It's like Congress.
Anyway, Spanner saved us all. Just kidding. Spanner was a good idea. 2012, when that came out, a little later than Paxos. But Spanner is really just Multi-Paxos. Paxos still applies, but it uses some method to make sure that from one Paxos domain to another it maintains that sequencing, that serialization. With Google Spanner, they have their own TrueTime, some atomic clock sitting in a dark room somewhere. But there are other databases that derive from the Spanner protocol or the Spanner paper. Cockroach is a good example, where they said, hey, we don't have atomic clocks, right? So they implemented their own way to keep things synchronized. I believe they use hybrid logical clocks. And then Yugabyte is also a Spanner derivative, but they used another thing as well.
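As an editorial aside, the propose, reply, retry loop described for Paxos above can be sketched in a few lines of Python. This is a textbook single-decree Paxos model, not Cassandra's implementation: the names `Acceptor` and `propose` are hypothetical, everything runs in one process with no network or failures, and it uses the standard majority quorum rather than waiting on literally every node.

```python
class Acceptor:
    """One node's Paxos state (toy, in-memory)."""

    def __init__(self):
        self.promised = 0           # highest ballot this node has promised
        self.accepted_ballot = 0    # ballot of the value it last accepted
        self.accepted_value = None

    def prepare(self, ballot):
        # Phase 1: promise to ignore older ballots, report any accepted value.
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted_ballot, self.accepted_value
        return False, 0, None

    def accept(self, ballot, value):
        # Phase 2: accept unless a newer ballot has been promised meanwhile.
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors, ballot, value):
    """Run one round; return the chosen value, or None if the round lost."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [(b, v) for ok, b, v in replies if ok]
    if len(promises) < majority:
        return None                 # too many "no" answers: retry with a higher ballot
    prior_ballot, prior_value = max(promises, key=lambda p: p[0])
    if prior_ballot > 0:
        value = prior_value         # a value may already be chosen: adopt it
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None


nodes = [Acceptor() for _ in range(3)]
first = propose(nodes, ballot=1, value="A")   # uncontended round wins
stale = propose(nodes, ballot=1, value="B")   # reused ballot is rejected
retry = propose(nodes, ballot=2, value="B")   # higher ballot learns "A" was chosen
print(first, stale, retry)
```

The retry is where the noise under contention comes from: a proposer that loses a round has to come back with a higher ballot, and competing proposers can keep bumping each other.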
So it's all kind of based on the same things. You have these Multi-Paxos groups and then you have this coordination layer. What that means is inserts can be really expensive, and I think that is the downside of Spanner. Reads are not that hard because you turn to read replicas. But that's really bad for Cassandra, and the reason I'm pointing it out is that it creates an unnatural situation for Cassandra. Now we'd have to do all this massive coordination between data centers, which is too much. We wanna scale the way we wanna scale, like add a node. We wanna do five data centers, not just two. And we don't wanna have to change up your data modeling. Like, you can't change that partition key because it's in a different data center. Those are weird things that we'd start throwing at users. Remember, developer experience is a good thing. The easier it is to use, the fewer mistakes you'll make.
So, Accord. I know there's probably something wrong on here, but I'll keep it really high level. It really was designed for Cassandra, for the way Cassandra works. It's leaderless, it scales like Cassandra, and the failure modes match up. That's one of the things with consensus protocols that's really hard: what happens when you have a failure? A node goes offline. You don't wanna have to stop everything, hold on, re-elect the leader, and come back online. That is not a Cassandra experience. Y'all love to talk about it, but it's a thing, and I'm gonna bring it back: with Accord, I'm gonna start destroying Raspberry Pis with a hammer. Because that's the experience you want, right? You wanna have lightning strike your data center and not have any downtime. That's the dream we should all have, and Cassandra can do that. But we shouldn't say, oh, but if you want ACID transactions, then all that's out the door. That's super lame, let's not do that.
So what about usage? Let's get into using it.
Look at my clock, we're doing good. All right, what is it about Accord, and what do we do with ACID transactions in Cassandra? There is some syntax, stand by. And I think this is still pretty firm. If that's not the case, I don't know, Blake, Ariel, somebody tell me I'm wrong. Caleb, I think you're in here too. But anyway, this is the current version, and what's new is adding this BEGIN TRANSACTION, COMMIT TRANSACTION. That's your block, right? That's the thing that says, inside of here, ACID does occur.
I'm gonna break the inside down a bit too. First things first, there's a way to collect the current state of the data before we do a mutation, and that's this LET command. You can put it into a tuple and basically do a pre-select of what am I gonna change. This comes in handy in a minute. You can also do a SELECT that will show you the change after you run, if this was successful. So this is like a before-and-after setup.
This is where all the magic occurs, and it's really cool. We have this IF statement that checks that tuple condition, so that's why we grabbed that tuple. We can look at what the state of that data was and then make a decision. I will walk you through some examples that will bring this home, but like I said, this is where the magic is, because inside of that conditional you can have updates, inserts, deletes. That conditional is where you make some really interesting decisions. This is gonna come in super handy for a lot of things, like inventory control.
So I'm gonna walk you through a couple of use cases here. The bank transaction, right? Here we go. Alice is gonna give Bob some money. So here's my setup. Here's my table. I have an account table, real simple: who has the money and how much money do they have. And I'm giving Alice and Bob 100 bucks each. It's a decimal, real money that you can use. Probably Bitcoin at this point, right?
Crypto. So they both have 100, and I'm gonna walk through the transaction. The first thing I'm gonna do, because I'm moving money from Alice to Bob, is grab Alice's bank balance. How much does Alice have in her account? I'm also gonna emit how much Alice has after the transaction is done.
Like I said, this is where the magic is. Because I'm gonna move $20 from Alice to Bob, I wanna make sure that Alice has that money in there. I do not want to overdraw. I know this is not a really good bank thing. Of course banks wanna overdraw you, because that's how they make money, right? But anyway, let's just imagine that we're a good bank and we don't wanna have our people overdraw. So what we do is check to see if that account balance is greater than or equal to 20. If it is, Alice has enough money to give Bob, and it happens.
If you look over here, you see the update? There's two updates that happen. And there's another piece of syntax that I absolutely love. Now the update can do a plus-equals, minus-equals. He's smiling because he knows I love this so much. I wish we just had this in the regular syntax, by the way, and so does Aaron. But we're able to increment in place. So we'd say update: take 20 bucks from Alice, give 20 bucks to Bob. That all happens wrapped up in a transaction. I commit the transaction, we're done. Now, Bob, before you ask, no, there's no rollback. I know. You wanna leave? All right, I'll keep going. Rollback's a different problem.
Inventory management. This one is also super cool because this is one I've never been able to solve well with Cassandra. It just is a super hard problem. It's an intractable problem in many cases. So I have some products in a product table. I have my shopping cart. Probably all of you have created this at one point in your life. So I'm going to insert that PlayStation 5.
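As an editorial aside, the Alice-to-Bob transfer walked through above might look roughly like the sketch below. The CQL string follows the shape the speaker describes (BEGIN TRANSACTION, LET, SELECT, IF, in-place `+=` and `-=`, COMMIT TRANSACTION), but the feature was still on a development branch at the time, so treat the keywords as provisional; the table and column names are assumptions. The Python class is only a toy in-memory model of the semantics, with a lock standing in for the isolation the database would provide. It does not talk to Cassandra.

```python
import threading

# Hedged sketch of the statement shape from the talk; `accounts`, `holder`,
# and `balance` are illustrative names, not taken from the slides.
TRANSFER_CQL = """
BEGIN TRANSACTION
  LET from_account = (SELECT balance FROM accounts WHERE holder = 'alice');
  SELECT balance FROM accounts WHERE holder = 'alice';
  IF from_account.balance >= 20 THEN
    UPDATE accounts SET balance -= 20 WHERE holder = 'alice';
    UPDATE accounts SET balance += 20 WHERE holder = 'bob';
  END IF
COMMIT TRANSACTION
"""


class Accounts:
    """Toy stand-in for the accounts table (hypothetical, not a driver API)."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()   # models transactional isolation

    def balance(self, holder):
        with self._lock:
            return self._balances[holder]

    def transfer(self, sender, receiver, amount):
        """Apply both updates only if the sender can cover the amount."""
        with self._lock:
            before = self._balances[sender]      # the LET: read current state
            if before < amount:                  # the IF: refuse to overdraw
                return False                     # nothing is written
            self._balances[sender] -= amount     # both updates apply together
            self._balances[receiver] += amount
            return True


accounts = Accounts({"alice": 100, "bob": 100})
applied = accounts.transfer("alice", "bob", 20)    # Alice can cover 20
rejected = accounts.transfer("alice", "bob", 500)  # would overdraw: no change
print(applied, rejected, accounts.balance("alice"), accounts.balance("bob"))
```

Note that a failed condition simply means no writes happen. That matches the "no rollback" remark: there is nothing to undo, because the check and the updates are decided as one unit.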
I'm gonna put 100 PlayStation 5s into the product table, because that's probably not even how many were there back in the day. It was weird: Best Buy would get 20, Walmart would get 20. Like, what? What do they need somewhere? But anyway, limited inventory. There's only 100.
Now, same type of setup. I'm gonna look to see what the current inventory is, and this is all locked. When I start this transaction, this is all locked now. I look at the current inventory for PlayStation 5s. I'm gonna emit the inventory after I'm done. But then, again, this is where I make sure that I don't go past zero. If the inventory is greater than zero, then I'm gonna update the inventory, decrement it by one, so there's one less PlayStation in the world, and I'm gonna insert it into my shopping cart. Yes, I got a PlayStation. Yeah, I know, thanks, Max.
But that's all happening in a transaction and locked under a highly concurrent workload. All those idiots clicking on the Buy button, right? It's gonna be okay. Relax, because the database got you, and that's impossible to do without it being in the database. I'm just gonna say that: it's impossible. It needs to be closer to the wire, where the data exists. And here's where the order and the linearity come in. Under a highly contended load, if I have two people that click really close to each other, they're still milliseconds apart. So, because Max was really excited too and he was just a millisecond late, I got the PlayStation. Sorry, Max. Sorry, last one. But you know, you don't wanna send your customer an email saying, sorry, we totally blew past the inventory, here's a coupon. Is that what you guys do at Walmart? Yeah, pretty much, yeah.
But think about it: this isn't just shopping cart or inventory. Think about other things where this applies. Again, these are general patterns, but it's a good example of how this works.
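As an editorial aside, the "everybody clicks Buy at once" scenario above can be modeled in plain Python. This is a sketch of the semantics only, assuming a hypothetical `Store` class where a lock plays the role of the transaction: the check (inventory greater than zero), the decrement, and the cart insert happen as one unit, so concurrent buyers can never push the count below zero.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Store:
    """Toy model of a product row plus a shopping cart table (hypothetical)."""

    def __init__(self, stock):
        self.stock = stock
        self.carts = []
        self._lock = threading.Lock()   # stands in for the transaction

    def buy(self, shopper):
        """Check, decrement, and add to cart as one indivisible step."""
        with self._lock:
            if self.stock > 0:              # the IF: never go past zero
                self.stock -= 1             # one less PlayStation in the world
                self.carts.append(shopper)  # insert into the shopping cart
                return True
            return False                    # sold out: nothing is written


store = Store(stock=100)
shoppers = [f"shopper-{i}" for i in range(300)]

# 300 shoppers race for 100 units across 32 worker threads.
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(store.buy, shoppers))

winners = sum(results)
print(winners, store.stock, len(store.carts))
```

Exactly 100 shoppers succeed no matter how the threads interleave. Without the lock, two threads could both read a positive count and both decrement it, which is the overselling problem the speaker describes.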
I built this thing, it's funny, like, yeah, I wanna make this to sell my book. But, yeah, I can't even give them away, so there's still 10 in inventory. But this is the thing that I think about: oh, the thundering herd. People are like, get it now, click. This used to be scary, now it's not.
Now, I don't have a really good distributed ledger one. I actually was trying to come up with a really good one. Yes, that's an interesting problem. Yes, that's the one we can't do with Cassandra. But I actually wanna go after something that has been impossible to do well and that I think has more applicability for the Cassandra community, which is this real atomic batch. Now, when I say real, some of you may know about this controversy: at one point, somewhere in the docs, the batch in Cassandra was called atomic. Yeah, you wanna start a fight on the Cassandra mailing list, call it atomic batches. Oh, people start throwing out definitions. That's not atomic. And of course, they weren't really. It just got named that, unfortunate naming. But I'm here to save the day. I'm gonna be a peacemaker. You can actually do it now.
So think of this, and this is a very common pattern in Cassandra, where you have a base table like user, but then you create lookup tables: user by email, user by location. These are index tables. Now, if I was to change something in my user record, like their city or their email address or something like that, I need to update the index tables as well. You can do that with a batch, but the batch is non-deterministic, right? It will eventually happen. It's eventually consistent, but it isn't an ACID transaction. I'm not changing all my tables at the same time. And here it's the same thing: I'm gonna check, I'm gonna say, does this person even exist? So I'm gonna make sure not to do anything to something that doesn't exist.
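As an editorial aside, the base-table-plus-lookup-table pattern can be sketched as below. This is a toy in-memory model, assuming hypothetical names (`UserTables`, `create_user`, a `users` table and a `users_by_email` table); the lock stands in for a multi-partition transaction, so either both tables get the new row or neither does, and an existing user is never overwritten.

```python
import threading


class UserTables:
    """Toy model of a base table and one lookup table (hypothetical names)."""

    def __init__(self):
        self.users = {}            # base table, keyed by user id
        self.users_by_email = {}   # lookup table, keyed by email
        self._lock = threading.Lock()   # stands in for the transaction

    def create_user(self, user_id, email, city):
        """Insert into both tables only if the user does not exist yet."""
        with self._lock:
            existing = self.users.get(user_id)   # the LET: read current state
            if existing is not None:             # the IF: row must be null
                return False                     # don't stomp on an existing user
            self.users[user_id] = {"email": email, "city": city}
            self.users_by_email[email] = user_id
            return True


tables = UserTables()
created = tables.create_user("u1", "alice@example.com", "Austin")
duplicate = tables.create_user("u1", "other@example.com", "Boston")
print(created, duplicate, tables.users["u1"]["city"], sorted(tables.users_by_email))
```

The second call changes nothing in either table, which is the property a plain batch cannot promise: readers never see the base table and the lookup table disagree about whether the user exists.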
And this is that example of that: if the tuple is null, right? That is an interesting problem. What I'm doing is creating a situation where, if this person exists, I'm not gonna stomp on them, but if the account doesn't exist, then I'm gonna create a brand new one. Now, I can also do updates, like if I wanted to change my account, and that would be a different conditional. But in this case, what I'm doing is making sure that... lightweight transactions, let me quickly stop here. Lightweight transactions do this for one partition. If you need to do more than one, wah-wah. So this is how you do that multi-partition change. And it's just built into the database. So, cool. Okay, doing great on time.
So when can I get it? Ariel, thank you. CEP-15, Accord, there's a branch right now. You can go download it and build it yourself. If you need help with that, hit me up. I was gonna build a Docker image, but I didn't get time because I was hanging out with you guys, which was worth it. It's all good. But I think I'm still gonna do that, maybe this weekend if I have time. I wanna create a Docker image around this, and because it's not a part of any official release right now, I have to do it as an individual, not as an official project thing.
The target release is 5.1. As you maybe heard me say in the keynote yesterday, it was originally slated for 5.0. After some discussion on the mailing list, some light banter back and forth, some easy discussion about differences between one version or another, because we just love each other on the mailing list and this is what we do, we decided that we're gonna decouple Accord and transactional cluster metadata, TCM, from 5.0, just to let 5.0 ship. It's got everything. It's in beta right now, go play with it. But 5.1 will be a fast follow, and we'll have Accord in that.
I think the important thing is that we all realize this is a major change to Cassandra. Was it like 88,000 lines of code that got touched, or does that seem like a number? Anybody wanna say yes or no? No, that was just TCM. Just TCM was 80, so yeah, it's a big one. When you touch that much code in any consensus protocol, we wanna really make sure we test it a lot, because we as a project hold quality dear, right? That's important. That's what we all agreed to on the mailing list. It's a matter of just making sure to give it enough room, and I think that's fine.
But I do think it's important for everyone who's interested in using transactions with their next application to get out there and start getting used to the syntax, because it is different. You're gonna have to learn about it. You're gonna have to figure out how it fits in your application, and it's just knowing how it works. Now, understanding that trusting it is on the way, there's a lot of testing going on, that sort of thing, just get the syntax right. I think that's what I wanna recommend. This QR code goes directly to that branch if you wanna go there. And then finally, thank you very much. I think I nailed the timing. Max, questions?
Transaction template.
What I would really, really love to see is this find its way into some frameworks. You know, there's a lot of frameworks now. Aaron's talked about Mongoose, for instance. Mongoose is a JavaScript framework for Cassandra. I just think it would be cool to see that just magically get awesomer, while under the covers it's using transactions. So, no particular templates right in Cassandra, but I would like to see this go out into the ecosystem and just become more of that. Spring is another one. Spring Data has some hooks in it that I would love to now be able to use. That's a good example, yeah. Second question? Oh yeah.
Yeah, thank you for the mic. Second question. I trust you created a lot of different ways to check the condition, but these are very chance, but at some point in time, are we going to be able to upload our jar to Cassandra and do very complicated checks for the condition before we enter the locked state?
A very complicated check. Like, what would be an example of a complicated check? Like a subselect maybe? Or, no, you already do a select off your data. So anything that's supported in a SELECT statement is available; you can create a tuple with it.
What if we have some, you know, complicated math, with math functions which are not supported by the tuple sender?
I don't know if that's really... does anyone know? I mean, if you wanted to do like a max or a min or something like that, is that what you're talking about? Like math functions.
Yeah, math functions, but maybe more complicated than just max, I mean.
Or a count is another one that could be done. Like, how many of these things? Yeah, I know. I got everyone laughing and I was like, oh God, here it comes. See, you should never release code, because now everyone wants more, right? There's no plans right now, but this is where I encourage you as a user of the system to join us in the Apache Software Foundation Slack. There is a channel called Cassandra Accord. Right now it's just a bunch of people trying to figure out how to compile it and make it work, and there's a lot of stuff going in there about the details. But I would love to see that transition more to: hey, I'm using it and here's my experience. Yeah.
Yeah, JIRAs are awesome. Whenever you create a JIRA, and I'll say this because the camera's on, you can say what subsystem it is, and you can say, yeah, this is for Accord. And yes, those are very valuable because, first of all, it's official, it's in the system, and it can be followed up on, it can be routed, assigned, all these things.
So the JIRA system is very good for that. This is what we use in our Cassandra project, yeah. I think we're kind of... wait, do you have one more question over here? Yeah.
So I noticed all of your examples had a single LET.
Yes.
Is it possible to have more than one?
Yes, you can have more than one LET.
Awesome, so what's the limit?
You know, I don't think we have one.
Ah, nice.
Yeah, welcome to the world of guardrails. We have not defined any for Accord yet. Guardrails is a 5.0 or a 4.0 concept, but yeah, I think part of this is we're gonna find our guardrails. Right now it's a little bit go for it and let's see where it breaks. So yeah, I expect we're gonna find somebody saying: what do you mean I can't do 10,000 LETs? Well, that's unusable. All right, thank you very much, we'll be outside, but thank you. Thank you.