and talks about how CRDTs are the data types for the apocalypse.

So a few disclaimers. CRDTs are strange [unclear]. The academic papers are very hard to understand. There are very few general-purpose CRDT libraries out there. And most people are very interested in them but can barely understand them. My hope is that some of you might be interested in implementing some. But mostly I want to get people interested in actually using them. I do have a library on Hex called Loom. It's a bit older, but it supports some of the recent delta CRDTs, and it works. I have a new API coming out soon, but it probably won't be ready until Christmas.

Now, it's ironic that the barriers to understanding this can be really high, because they aren't actually that complicated to build. You just have to be extremely precise about what you're doing. There's actually not that much code involved. It's just that it can't be wrong at all. There's no "almost" in CRDTs. You will corrupt your source of truth if you get them wrong.

So what are CRDTs? Generically they're known as conflict-free replicated data types. There are two flavors. One is commutative data types. Those are the log-based (operation-based) ones, where you keep a log of all the updates that happen, read over the log, apply the operations, and spit out a result. And then there are the ones called state-based, which are a lot easier to reason about. But they can get very large, and it's very hard to ship very large CRDTs around. They can lead to downtime, just like a garbage collection pause.

In general, you issue an update operation to mutate the state of a CRDT. Updates are defined as being commutative, associative, and idempotent. There are exceptions to this, but generally people don't like to talk about them. CRDTs are trivial to communicate over a network.
The previous properties are great because you can always repeat a message, send it five times, send stuff from the past, send stuff out of order. As long as the latest messages are communicated, you will arrive at a consistent state once everyone has seen all the messages. And in general, as long as you have reasonably recent updates, the answers from all your different servers will be close. So they're great for distributed systems.

Some of you are saying, "I'm not building a distributed system." But you most likely are. If you're involving a browser, you're building a distributed system. The client expects things when they interact with the user interface. Then, when they save, there is a disagreement between what the browser last saw and what's on the server now. Someone else could have edited something in the meantime. Frankly, you are making a distributed system if you're making anything for the web. But even beyond that, if you are using Elixir and Erlang processes, unless you serialize everything through one process, a lot of the properties of distributed systems show up within the BEAM itself. You can't really have two processes in identical state without tying them together so strictly that they might as well be serialized in one process anyway.

So what are some common CRDT types that are around? You've got flags. (There's a bug on the slide. OK.) Flags are Booleans that are either true or false. There are counters. They go up or down. Sets maintain a collection of items. There are registers, which hold one thing most of the time. Some of them retain more than one thing in special cases. There are maps, which associate a name or a term with a value. Maps have the interesting property that many of them can nest CRDTs. The great thing about that is that you can nest maps in maps in most cases.
And you can create very large, rich objects that you can pass off to another system, and all of your updates will be preserved pretty well. There are also graphs. They help manage a graph system, though the semantics for those get kind of wonky in some cases. And there are also CRDT documents, though I would not look at using them at all. For collaborative editing, the storage requirements often end up being 10 times or more the actual space it takes to store the document. And it can get even worse if there's a lot of concurrency.

OK. So even though many application properties can be built on CRDTs, there are kinds of application logic that require strong coordination. But you can actually get away with a lot by contorting your problem to fit into the CRDT itself.

Forging ahead without reading the academic literature is currently a mistake. There are a lot of subtleties in a lot of the libraries, and you have to understand how actors work and what a unit of concurrency actually means in the literature to be able to understand most of the libraries. They're not documented right now to the point where you could just pick them up and go. I'm hoping that changes eventually.

If you make CRDTs, you must test them very heavily. Just because you get 100% code coverage does not mean you've covered all the possible interleavings, which are the source of a lot of different kinds of bugs. I have been bitten by this, because in this kind of system you can't engineer your way out of the problem. There are inefficiencies in CRDTs that you can engineer your way out of. But if your properties have logical contradictions in them, you're up against a mathematical certainty, and it's just hopeless. I wasted six weeks.

So here are some examples so you can actually see how they work. I wish I had a laser pointer, but I probably couldn't point anyway. So there's a G-counter. It counts up.
It's most easily implemented as a simple map. Each actor increments its own entry in the map, and actors can only mutate their own value. (Oh, the laser pointer is on. OK.) Each actor messages all the others with its whole counter, and the maps are merged. Duplicate entries for the same actor resolve via an upper bound, the max of that actor's values. And that's generally what it looks like.

Let's see if I can actually, whoops, point. So here A is adding one to its counter. Then it adds two, so it's three here. This is a communication down here. And you can see the merged state has both entries as the communication goes on. Now say there was an out-of-order message, and this one was delivered straight over here. Because you're taking the maximum, the old red value could not overwrite the red three. The red three would remain. So all of these messages can arrive in any order. And then at the end, whenever you want to read the value, you sum up each individual entry in the map, and you're done.

It's one of the simplest kinds of CRDTs, but there are some problems. It only goes up. You can't go down. But you can get around this by using two of them together, one for increments and one for decrements, and then you subtract one from the other. This is called a PN counter. This is the kind of contortion you have to do in order to actually maintain CRDT properties in a real-life application.

So this is a grow-only set, which is interesting. If you restrict set operations to only add, sets are CRDTs. It's pretty simple. If something already exists, it exists. The problem, obviously, is that you can't easily remove items from that set. If you think about doing something similar to the PN counter with sets, you end up with a two-phase set (2P-set), but you can only remove an item once.
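The G-counter and PN counter described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Loom's API: the class names `GCounter` and `PNCounter` and their method names are assumptions made for this example.

```python
from __future__ import annotations


class GCounter:
    """Grow-only counter: a map from actor ID to that actor's own count."""

    def __init__(self, counts: dict[str, int] | None = None):
        self.counts = dict(counts or {})

    def increment(self, actor: str, amount: int = 1) -> None:
        # An actor only ever bumps its own entry, and only upward.
        if amount < 0:
            raise ValueError("a G-counter can only grow")
        self.counts[actor] = self.counts.get(actor, 0) + amount

    def merge(self, other: GCounter) -> GCounter:
        # Per-actor max: commutative, associative, and idempotent, so
        # duplicated, stale, or reordered messages cannot lower a value.
        actors = self.counts.keys() | other.counts.keys()
        return GCounter(
            {a: max(self.counts.get(a, 0), other.counts.get(a, 0)) for a in actors}
        )

    def value(self) -> int:
        # Reading the counter sums every actor's entry.
        return sum(self.counts.values())


class PNCounter:
    """Two G-counters: one for increments (p), one for decrements (n)."""

    def __init__(self, p: GCounter | None = None, n: GCounter | None = None):
        self.p = p if p is not None else GCounter()
        self.n = n if n is not None else GCounter()

    def increment(self, actor: str, amount: int = 1) -> None:
        self.p.increment(actor, amount)

    def decrement(self, actor: str, amount: int = 1) -> None:
        self.n.increment(actor, amount)

    def merge(self, other: PNCounter) -> PNCounter:
        return PNCounter(self.p.merge(other.p), self.n.merge(other.n))

    def value(self) -> int:
        return self.p.value() - self.n.value()
```

Merging a stale copy of an actor's counter into a newer one leaves the newer value in place, which is the out-of-order case walked through on the slide.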
And there's another variation where, when you remove an item, you remove it from the first set, and then when you add it again, you can only re-add it once. And then there are even more solutions using tombstones of various kinds, but they're inflationary, and solutions that are inflationary just grow unbounded as your application's life goes on and on.

There is a way out. It's called an ORSWOT, which is an observed-remove set without tombstones. This is kind of complicated. So each of these is the add operation itself. The interesting part happens when I take bread off the list in this one's copy. Bread's still back here. I removed B's bread. For every event, I increment this counter, and each value is tagged with a counter. Then, when I merge this with that, the algorithm checks whether both sides should have seen the add for bread. If both clocks cover it but only this one still has it, that means someone must have removed it, which means it can disappear from the state forever. As long as I keep these clocks consistent, so that for every event that happens I add one to this clock and tag the values correctly, I can add and remove things at will.

What you have to do is decide on an ordering between add and remove. There are add-wins and remove-wins variants. With add-wins, if there are concurrent updates that didn't see each other, the add wins the day. With remove-wins, the remove wins the day. Pretty straightforward.

So to explain a little bit more, this is kind of like a vector clock, but not really. There's a slight difference, and I'll go over it really quickly. So that's what I explained. Each actor has a value in time for each event. Each actor starts at zero. On each internal event you bump the counter and tag the value. In vector clocks, you have to bump when you prepare to send a message.
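The observed-remove logic described above (a per-actor clock, values tagged with the counter that added them, and a merge that asks "should the other side have seen this?") can be sketched as follows. This is a simplified add-wins sketch in Python under stated assumptions: the `Orswot` class and its method names are hypothetical, removal is a plain local delete, and it does not reproduce Loom's or any other library's internals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Orswot:
    # clock: actor -> highest event counter seen from that actor
    clock: dict[str, int] = field(default_factory=dict)
    # entries: item -> set of (actor, counter) tags ("dots") that added it
    entries: dict[str, set[tuple[str, int]]] = field(default_factory=dict)

    def add(self, actor: str, item: str) -> None:
        # Every add is an event: bump the actor's counter and tag the value.
        self.clock[actor] = self.clock.get(actor, 0) + 1
        self.entries[item] = {(actor, self.clock[actor])}

    def remove(self, item: str) -> None:
        # No tombstone is kept; the clock still remembers the add happened.
        self.entries.pop(item, None)

    def _seen(self, dot: tuple[str, int]) -> bool:
        actor, counter = dot
        return self.clock.get(actor, 0) >= counter

    def merge(self, other: Orswot) -> Orswot:
        merged = Orswot()
        for actor in self.clock.keys() | other.clock.keys():
            merged.clock[actor] = max(
                self.clock.get(actor, 0), other.clock.get(actor, 0)
            )
        for item in self.entries.keys() | other.entries.keys():
            mine = self.entries.get(item, set())
            theirs = other.entries.get(item, set())
            # Keep a tag if both sides have it, or if the side lacking it
            # never saw it. A tag one side saw but no longer holds was removed.
            keep = (
                (mine & theirs)
                | {d for d in mine - theirs if not other._seen(d)}
                | {d for d in theirs - mine if not self._seen(d)}
            )
            if keep:
                merged.entries[item] = keep
        return merged

    def value(self) -> set[str]:
        return set(self.entries)
```

In this sketch a concurrent re-add survives a remove, because the remover's clock does not cover the new tag; that is the add-wins choice.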
With version vectors, you don't have to bump when you send a message. And the term "dotted version vector" simply refers to the tagging of the value. And then, of course, you send a copy of the clock in every message so that you can actually perform the logic of: should this be in both sets? If it should be in both sets, and it isn't in both sets, you remove it. This is kind of essential to getting a more intuitive view of the way the tombstone-less stuff works in CRDTs.

This is from the Wikipedia page on vector clocks, and it's kind of amazing. Relative to this dot here, everything here is in the past. So when I remove an element from a set that's tagged in this blue area, I know it's not there, because it is causally in the past of this point. Because we don't really have a unified time, everything is actually based on causation. Time is a fiction, basically. Especially when you have multiple servers, time can move around on the server itself. That's why we just use numbers. There's actually a phenomenon called doomstones, with the tombstone varieties, where if you use wall-clock time and the time is in the future, you won't be able to perform an operation until real time gets past that point. And if you set it really far in the future, it's just doomed.

Everything after here is an effect of what you're doing. And then everything where not every value is larger than the other's, and not every value is smaller than the other's, those relationships are concurrent operations. This is the case even if something doesn't communicate for three days. It might have happened two days ago, and it's still a concurrent operation, because they're not causally related. This is actually an interesting property, because in the set cases we can perform some intuitive, semantic kinds of reasoning to create more complex types, so that whenever users edit separate fields, both of those edits will be preserved.

No, it's a partial ordering.
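The past, future, and concurrent regions in that diagram come down to a pointwise comparison of two clocks. A minimal Python sketch, assuming clocks are plain dicts from actor to counter (the function name `compare` and the returned labels are illustrative):

```python
def compare(a: dict, b: dict) -> str:
    """Classify clock `a` relative to clock `b`."""
    actors = a.keys() | b.keys()
    # Missing actors count as zero: no events seen from them yet.
    a_le_b = all(a.get(x, 0) <= b.get(x, 0) for x in actors)
    b_le_a = all(b.get(x, 0) <= a.get(x, 0) for x in actors)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is in b's causal past
    if b_le_a:
        return "after"       # a is in b's causal future
    return "concurrent"      # neither dominates: no causal relationship
```

Two clocks can be concurrent no matter how far apart they are in wall-clock time, which is why this is only a partial ordering.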
So everything is ordered on each server. But you don't know the order relative to everything else, because it's all relative to where you are as well. If we switch to a different dot, the causes and effects are going to be completely different. I already covered this.

[Audience question, partly unclear.] Yes. Yes. Yeah, the time scale is completely different. There's no unified notion of time. You have to just abandon that illusion. It's quite nice, but. Yes. This is just like reasoning through relativity. There's a reason why the word "observation" is used. What happens is based on what you see. And there are some pseudo-paradoxes with Einstein's theory of relativity, where you get things that seem like paradoxes, and they're very interesting. I went over that.

So ORSWOTs are kind of a big deal. I've abstracted them in a library into something called a dot kernel. I took that work from Carlos Baquero, who has a very amazing C++ implementation of CRDTs that is simultaneously hard to read because of all the single-letter variables.

So there's a feature called delta CRDTs, where you can take a kind of empty CRDT (this is the methodology, it's slightly different) and apply operations to it. Instead of using a dot context or a dot clock, you have a dot cloud, which is just the little atomic pieces of the future that that delta represents. And you can ship just that delta. If everything is actually contiguous, you'll be able to ship just a subset of your state changes, which is amazing. It solves one of the biggest drawbacks of state-based CRDTs.

There's another feature, which is incompatible but also desirable, which is upsetting: we can actually support causation in a very large nested CRDT structure if we share the context.
So if I were to pass that context down through some nested maps and then way back up, I could do a lot of reasoning about the kinds of things that change. And it would be very easy to support very complex data structures. But it's not easily compatible with delta CRDTs at the moment. It might become easier if you give up nested maps, which I think might be possible with a graph, but I haven't thought enough about that at all. It takes me about a week to get through one of those papers. You have to translate it into English.

So how or why can you use these? They seem fairly restrictive. One of the things I really like about the Elixir community is a lot of the community values, where we want to be able to reach people at a lower skill level. And not only for that sake, but because when I don't have to think about something, I can focus all my mental energy on something actually hard that I'm interested in. CRDTs really should be a means and not an end.

A lot of times you'll hear, "I know I can solve my problem using a CRDT, but I have no idea how to do that." A lot of the current CRDT adopters seem to only use them when faced with strong external pressures, like the fact that they can't fit all the data on one machine. There was a lot of anguish in the talk from SoundCloud about how they implemented an observed-remove set without tombstones. They implemented a remove-wins version of that, and they were very pained at having to implement it. They actually implemented it in Lua on top of Redis so they would have to do the least amount possible, and they would use Redis operations, I think, to actually send the updates.

I think the status quo on this can be broken. One of the strategies is Christopher Meiklejohn's Lasp, but I think there are also some other use cases where we can get some really quick wins that aren't necessarily the most efficient, but will work in most people's use cases.
But some of the solutions will require rich client context, like single-page applications, native apps, et cetera. You're going to have to keep track of state in the client and apply updates yourself for some of the more complicated and nicer things. But the nice part is you can defer your updates and apply operations locally, on the device or in the single-page application, and not wait for a response from the server to get a source of truth. So you can make very fast progress within the application. It's a big benefit.

So here's one of the easy wins that I thought about a week ago, and I've been impressed with it ever since because it just blew my mind. We can create some Ecto support for CRDTs where we add a CRDT field. I kind of took inspiration from the optimistic locking support in Ecto. What this actually does is create an optimistic lock that can retry and reapply operations in the face of conflict, for Ecto and the database.

How it works is probably easiest to see on this slide. We add a CRDT, we binary-term encode it, and we sign it with an HMAC so no one can edit it, then stick it in a hidden field in a form in Phoenix. When the user submits the form, we can diff the new values against the old ones in the CRDT when it comes back to the server. The CRDT might have changed in the database, but we're using the copy that was submitted by the user. After that we do a diff, find the differences, extract operations, and apply them back onto the CRDT.

That sounds complicated, but look at the third statement there, the update. That's an optimistic lock update, where if the constraint is violated we actually fail to update. So you can't write your new values if someone else wrote between your old CRDT and the current one. But what you can do is select the conflicting row, merge the CRDTs, and reapply.
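The form round trip described above (sign the snapshot, diff the submitted values against it, then merge with whatever is now in the database) can be sketched in Python. This is a sketch under several assumptions, not the Ecto integration itself: each field is a last-write-wins entry `[timestamp, actor, value]`, timestamps are logical counters supplied by the caller, JSON stands in for the binary term encoding, `SECRET` is a hypothetical signing key, and the database compare-and-retry step is left out.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; a real one comes from config


def sign(state: dict) -> str:
    """Serialize the CRDT snapshot and attach an HMAC for a hidden field."""
    payload = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + tag


def verify(token: str) -> dict:
    """Reject the snapshot if the client altered it."""
    encoded, tag = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(encoded)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("hidden CRDT field was tampered with")
    return json.loads(payload)


def apply_form(state: dict, form: dict, actor: str, ts: int) -> dict:
    """Diff submitted values against the snapshot; write only changed fields."""
    updated = dict(state)
    for name, value in form.items():
        if name not in state or state[name][2] != value:
            updated[name] = [ts, actor, value]
    return updated


def merge(a: dict, b: dict) -> dict:
    """Per-field last-write-wins: highest (timestamp, actor) wins."""
    merged = {}
    for name in a.keys() | b.keys():
        candidates = [e for e in (a.get(name), b.get(name)) if e is not None]
        merged[name] = max(candidates, key=lambda e: (e[0], e[1]))
    return merged
```

If two users load the same signed snapshot and each changes a different field, merging both submissions into the stored state keeps both edits, because untouched fields keep their old timestamps and lose to the newer writes.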
So imagine a very simple case where you're really lazy and you don't want to think about anything, and you just add last-write-wins to every single field. You might think that this is kind of trivial and useless. But you can have a very large table and a very large form with a lot of different fields. Someone might only edit two or three, someone else might edit three or four, and what'll happen with last-write-wins is that you preserve both sets of writes, because the things that didn't change won't conflict.

And if you want to do more work (you can't do this in the easy case), you can write a little bit of JavaScript logic and use something called a multi-value register, where on conflict it'll preserve both values. Then you can pull out the latest one and prompt the user to pick between the values that conflicted at some later point in time. So you can still preserve some of this stuff and keep it in a very simple state.

What's really great about this is, if I go back, I'm still preserving all my fields, so I can still run complex select queries. I don't give up anything. I can still do all the stuff I want to do. The only thing I can't do is insert directly without touching the CRDT. Everything else, with select statements and complex reporting, I get that, and it is great.

So those are some good short-term goals for me. I'm hoping to have that stuff working by Christmas.

Medium-term goals. Can we do agent.Lattice as an agent.CRDT, which is from Jose's talk, I think at Erlang User Conference? We can kind of do that, kind of. So let's see. A lot of the CRDT use cases only become really possible once we have more cooperation, because Elixir is so locked in on the server and we don't really have Elixir iPhone apps yet. We're going to need to rely on some other technologies catching up before we can really take off with CRDTs.

So, things on my wish list that we really, really need.
I really want a good gossip protocol that's freed from another Erlang project and doesn't have really complex dependencies. I don't want to include an entire project just to get its gossip protocol.

An external representation for these CRDT types is really important and really difficult to reason about, because they don't fit neatly into JSON, and a lot of the other, nicer serialization formats don't really have much buy-in. So we might have to implement something on top of JSON that we would just run a quick map over to transform it into something we can use.

Like I said, other language support is going to be required to use some of these CRDT properties in larger systems, because we're all making fairly heterogeneous systems. I think one of the easy ones might actually be library support, particularly Ember and React. And GraphQL would be a great way to do things, because you could include the CRDT context with your updates, and instead of inferring the events, you can apply events directly from the user. And if they're well-tested, the state is going to be well-guarded, even with the browser being a lot more stateful. It's a lot safer when you're in CRDT land.

So why are they the data types for the apocalypse? So this is... Yeah. I worked for a short period of time in emergency operations. Not in an emergency operations center, but for a vendor who sold stuff to emergency operations centers. The stuff they use and the stuff they contract out is held together with duct tape. It's very poor. The interfaces look like they're about 15 years old, and probably are. I don't think many of the vendors hire permanent programmers at all, except to support the new version of Windows. The systems don't really communicate well. They assume your emergency operations center is online. And if you remember Hurricane Katrina, and I've read some of the retrospectives recently, there's been...
One of the biggest issues was that there was no communication. The cell phone networks were down, and they would send someone out to go find out what had happened, and that person would disappear for two, three, four days while they were helping solve problems. But none of the decision makers up the tree knew what was happening on the ground. People would just disappear. New Orleans was like a black hole that would suck in volunteers and government officials, and no one would see them for three or four days. And you can't make decisions on what to provision and send in if you don't know what they need.

One use case that would be really great: once you have CRDTs on the phone, you can communicate peer to peer. If you're worried about security, you can sign everything with CA certificates, and there's a certificate chain. You can solve that. These are things where we can go grab libraries off the shelf. The missing part is actually the CRDTs, and not many languages have them. And they're one of the ways that we could probably save people a lot of pain.

We lose a lot of our technological advantage when a natural disaster happens, because a lot of the stuff we use to automate logistics relies so heavily on connectivity. And not many people are actually trying to make this work in a disconnected state. And for the ones that do, I mean, how many of you have phone apps where you do something on the phone, and you have the browser app, and nothing matches up? You have to sync. And sometimes you have to close the browser, sync, and then log in again just to see the updates. So I think CRDTs can provide a lot of serious value, and I hope you're tempted to use them.
You get a lot of resiliency and speed, and on some of the models you can defer synchronization for days. Say you've only got 50 or 60 people in your organization, and no one communicates for a week, and then everyone comes into the office and everyone's CRDTs sync up together. There are only a very few conflicts that you have to work through, and everything works fine. You don't even have to worry about real kinds of consistency issues. You can just wait a week or two.

It also helps network-connected apps that want to work on the subway, where you might want to authenticate against a server for a login, but once you have that, you want to let users apply operations locally for a while and always have those edits be able to sync up, even if they or someone else has done something in the browser.

And I could also use help. Pull requests and discussions are welcome. I can talk about this stuff for a very long time. Some people here have probably already realized that.

So here are some really interesting people and work that you can look up. Lindsey Kuper does a lot of really interesting reasoning. One of the things I discovered reading some of her blog posts is that you can apply what are called inflationary updates without doing any synchronization, and you don't need to preserve the commutative and associative properties of your updates locally. One example would be Phoenix presence. Say you want to protect against join floods, and you're syncing all the servers every two or three seconds. What you can do is, when a user joins, you have a local set that you can add to and remove from naively. Then, when those three seconds are up, you collapse all of that and merge it into the CRDT set as an operation. If someone joins the channel 10 times and leaves the channel 10 times, all those events get collapsed down, by adding to or removing from just that local set, to at most one.
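The join-flood idea above can be sketched as a small local buffer that is mutated naively between syncs and flushed as net operations. This Python sketch is an assumption-laden illustration, not how Phoenix presence is implemented: `PresenceBuffer`, `join`, `leave`, and `flush` are hypothetical names, and the returned operations would be applied to whatever CRDT set is actually replicated.

```python
class PresenceBuffer:
    """Collects joins and leaves locally; emits net changes once per sync."""

    def __init__(self):
        self.baseline = set()  # who was present as of the last flush
        self.local = set()     # naive local view, mutated freely

    def join(self, user: str) -> None:
        self.local.add(user)

    def leave(self, user: str) -> None:
        self.local.discard(user)

    def flush(self) -> list:
        """Return at most one ('add' | 'remove', user) operation per user."""
        ops = [("add", u) for u in sorted(self.local - self.baseline)]
        ops += [("remove", u) for u in sorted(self.baseline - self.local)]
        self.baseline = set(self.local)
        return ops
```

However many times a user toggles within one sync window, the flush compares only the start and end of the window, so the replicated set sees zero or one operation for that user.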
And so at most once per sync they'll be able to flood the channel with a join or leave message, which is a really great property. And so I have her to thank for that.

Christopher Meiklejohn: about twice a year he publishes a list of all the papers on CRDTs that are going around in academia. Carlos Baquero: I'm really indebted to him for the ORSWOT stuff. He's really pushed that far. The riak_dt source is really interesting. That's Riak's data types. That's the other main inspiration for the current library that's on Hex, called Loom. And this is a summary paper that goes into a lot more detail, and while it has a lot of Greek letters and everything, it also has really good diagrams. And that's pretty much it. Thank you.