All right. Hello everyone. Welcome to my session on data consistency across cloud data systems. I'm Jimmy Zelinskie. This talk is going to be relatively technical. I'm going to cover some fundamental distributed systems topics, but I'm going to start with an example at a very high level that introduces the concept: we're going to ask and answer the questions of what consistency is and why it's important, and then walk through that example. Then we're going to look at a couple of different systems that are ubiquitous components in cloud data architectures and talk about some aspects of them that are interesting, some that may be surprising, the history behind the ideas they embody, and what they've contributed to the overall landscape of architecting software with an eye towards the consistency of the data being used. So, jumping into that: I am Jimmy Zelinskie. I am the co-founder of a company called Authzed. We are the creators of a database system called SpiceDB. SpiceDB is a data store specifically designed for storing and computing authorization data. That means it is basically the core engine your business would use to compute whether a person has access to perform an action or not. I like to use the term permission system rather than authorization system; I think it's far more approachable and defines the problem way better. But you can understand how consistency is a core part of this, because fundamentally, permission systems for software have to be correct or else there is a security flaw. If some software allows someone to perform an action that they otherwise shouldn't, that is absolutely mission critical in most software. 
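As a deliberately tiny sketch of the kind of check a permission system has to compute on every request, here's a hypothetical example; the data model and function names are illustrative only, not the actual SpiceDB API:

```python
# Hypothetical, minimal permission check -- illustrative only.
# Stored relationships take the shape (resource, relation, subject).
RELATIONSHIPS = {
    ("document:readme", "viewer", "user:alice"),
    ("document:readme", "editor", "user:bob"),
}

def check_permission(subject: str, permission: str, resource: str) -> bool:
    """Return True only if an explicit relationship grants access.

    Correctness is the whole game: a stale or wrong answer here
    is a security flaw, not just a glitch.
    """
    return (resource, permission, subject) in RELATIONSHIPS

print(check_permission("user:alice", "viewer", "document:readme"))    # True
print(check_permission("user:mallory", "viewer", "document:readme"))  # False
```

A real system computes this over a graph of relationships rather than a flat set, but the contract is the same: the answer must reflect the authorization data correctly.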
And doing this kind of work at scale and at low latency matters because absolutely every action that takes place in your software system has to check whether the access to perform that action is allowed. It puts us in the critical path. So data consistency is super important for SpiceDB. But before Authzed, I was previously working in the cloud native space. I worked at a company called Red Hat by means of the CoreOS acquisition. So I've been working in this space since before the CNCF was created. CoreOS was building distributed systems and container technologies, basically the foundation of cloud native software, since before the CNCF and this whole Kubernetes ecosystem emerged. In that time, I have contributed to a bunch of cloud native projects and co-created some. I am also a maintainer of OCI, which is the standards body for container images. This all folds back into my passion for distributed systems. Even before working at CoreOS, I always had an eye towards distributed systems, as an early adopter of a project called etcd, which ultimately became the data store used by Kubernetes. I'm going to talk a bit about etcd later in this talk. And as part of building large scale SaaS systems on cloud native software, I have also run MySQL, Postgres, these types of relational databases, at scale. I've seen where they fall over; I know their sharp edges. When you build enterprise software, for example, you try to do things without introducing new dependencies on other systems, so your customers don't have to set up yet another software dependency. And you start to bend a lot of these products to your will, in ways that they should not bend; you're trying to get database properties out of databases that were never designed to do certain things. I've been down those dark roads as well. 
So I definitely have some scary knowledge of some of these systems, and I know when you're doing the right thing and when you're doing the wrong thing, and how you should really focus on architecting things for success in this space. I also left my contact information on this slide, so if at any point you want to reach out, feel free to shoot me an email with a question. On the final slide I'm also going to link to a Discord community that you can join to discuss distributed systems in general or anything related to consistency. But I prefer email, and you might also see me around on Twitter or GitHub under these handles. Enough about me. It's time to move on to the actual primary subject. You may have seen these terms thrown around in your software development career. A lot of them are used particularly in the database world: if you're selecting a database, reading database documentation, or maybe taking a college course on how to build databases, you're definitely going to hear these terms. But fundamentally these concepts are not unique to databases. So many software systems store data; eventually they punt it off to a database that is then responsible for maintaining that data. But those systems are still offering views of data, and potentially modifying data, before they pass it all around. So whether we're talking about databases or microservices, the consistency of the data you're working with is always going to be relevant. It's actually really interesting: ACID, which is one of the super acronyms thrown around in this space, stands for Atomicity, Consistency, Isolation, and Durability. That acronym has a story around it where folks believe the C was just made up to make the acronym work. Well, that C is consistency, which is the topic of this whole presentation. 
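To put one of those letters in concrete terms before we focus on the C, here's a minimal illustration of the A, atomicity, using Python's built-in sqlite3 module; the schema and the simulated crash are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

# Atomicity: both legs of the transfer commit together, or neither does.
try:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        raise RuntimeError("simulated crash mid-transfer")
        # the matching credit below never runs
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'bob'")
except RuntimeError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 0} -- the debit was rolled back
```

The half-finished transfer never becomes visible: the database rolls back the debit rather than exposing a state where money vanished.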
So I hope by the end of it I can formally explain at least how I think about consistency, and why, even if that was the initial intent, it is definitely not a made-up concept that exists just to make an acronym work. You'll find it used in many other places and discussions outside of just the term ACID, so it clearly has relevance on its own. I'm going to use some of these terms in different contexts so that their definitions become clearer, rather than trying to describe them abstractly in a vacuum. I've talked a lot about these things, but I still haven't covered the very basics: what actually is consistency? Now, I didn't use the Wikipedia definition that you can just Google for and find. Instead I defined it the way I like to think about it, and the way I feel most engineers colloquially use it. I think that's really important, because you can go look up a lot of this terminology and read a very dense article, or read research papers that talk about these concepts, but that doesn't matter if you're just trying to communicate something to a fellow engineer; what actually matters is that you have an effective communication tool and you both have a shared understanding of the topic. So I tried to define it in my own words rather than the mathematical terms you might find elsewhere. How I define it is strictly around the contract for how data can be observed in the system. I often talk about freshness alongside this; the concern of how fresh the data you're working with is becomes part of that equation. But fundamentally, I think the core concept, and the way people most often use this term, is largely around quote-unquote "correctness", and correctness is context dependent, which makes it tricky. 
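Freshness as part of that observation contract can be sketched in a few lines. Here's a hypothetical primary with an asynchronously updated replica, where the contract explicitly allows a read to return stale data until replication catches up:

```python
from collections import deque

class Primary:
    """Accepts writes and queues them for asynchronous replication."""
    def __init__(self):
        self.data = {}
        self.log = deque()  # changes not yet applied on the replica

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

class AsyncReplica:
    """Applies the primary's changes lazily -- reads may be stale."""
    def __init__(self, primary):
        self.primary = primary
        self.data = {}

    def apply_one(self):
        if self.primary.log:
            key, value = self.primary.log.popleft()
            self.data[key] = value

    def read(self, key):
        return self.data.get(key)

primary = Primary()
replica = AsyncReplica(primary)
primary.write("orders:alice", ["book"])
print(replica.read("orders:alice"))  # None -- the write is not visible here yet
replica.apply_one()
print(replica.read("orders:alice"))  # ['book'] -- eventually consistent
```

The data is never "wrong" by this system's own contract; it is only stale. Whether stale counts as incorrect depends entirely on what your application promised, which is why correctness is context dependent.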
It depends on what type of system you're trying to build: you're going to first talk about the problem, then work backwards to find the consistency model that is going to work for your solution. So why does consistency matter, and why are we working backwards to arrive at it? Because when you're building applications, you fundamentally have a contract between your expectations of the data you're going to use in the application, the data in the database, and what your users will see. If that contract is broken, systems can explode in catastrophic ways: silent errors can occur, data corruption can occur. And if you want to solve the problems caused by inconsistency, certain things will actually just be impossible for you to do without totally re-architecting your software around something more consistent. The door closes behind you when you move to a less consistent system. You don't have the capability of adding consistency in retroactively, and that's the scary part: you really need to understand your problem and your domain first, because if you pick something that is not going to jive with the system in the future, you are going to be in a world of pain, probably re-architecting, or carving out some subsection of your application that has to be treated specially, with completely isolated data that works at a higher consistency level. All that might not mean too much right now, but I'm going to go through a concrete example, and then eventually we'll talk about some real-world components and how this all plays out in them. 
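Here's a tiny, hypothetical simulation of a broken contract producing exactly that kind of silent error: two actors each check state and then act, but the second actor's decision rests on a read that has gone stale by the time it acts. Nothing crashes; the data just quietly ends up wrong:

```python
orders = []  # shared order history, visible to both actors

def has_order(item):
    return item in orders

def purchase(item):
    orders.append(item)

# Both actors read *before* either one writes.
child_saw_order = has_order("bike")   # child reads: no order yet
parent_saw_order = has_order("bike")  # parent reads: still no order

if not child_saw_order:
    purchase("bike")                  # child buys
if not parent_saw_order:
    purchase("bike")                  # parent buys again, acting on a stale read

print(orders)  # ['bike', 'bike'] -- a duplicate purchase, with no error raised
```

No exception is thrown and no invariant check fires; the corruption is only visible to whoever later inspects the order history, which is what makes these bugs so dangerous.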
So here I've got an example. It's hypothetical, but it's a real problem actually faced by everyone designing e-commerce systems, unless they're building on top of a pre-existing system from someone who has already solved this problem for them. Even then, as you extend those systems with your own, you still have to continually think about consistency and see how it plays out. But I digress; here's the hypothetical scenario. There are two humans involved: a child and a parent. The parent is supervising the purchase of an item by the child online. The child is going to review the orders on their account and see if the item has been purchased yet, then purchase the item, and then the parent is going to double check and make sure the child did the correct thing. So we have this flow over time: the child first reads the orders and sees that an order hasn't taken place yet, so they purchase the item, and then the parent reads the orders and finds that the child successfully purchased the item. The child was good; everything went accordingly. And I just want to make this concrete one more time: nowhere have I mentioned servers, databases, or microservices. This is all purely from the external-facing side of the system, the user. At the end of the day, sometimes your users are real people, sometimes your users are other services in your microservice architecture, and sometimes they are the actual service and you are the database. The point is that the types of problems I'm going to describe in this scenario play out regardless; it does not matter what those actors are, it's equally capable of happening in all of these cases. So here is the problem. This is another way the order of events can take place: the child reads the orders, the parent reads the orders, the child buys the item, and then the parent buys the item. Now why did this happen? It's because the parent checked the orders right before the child actually purchased the item, and at that point in time the parent looked at the order list and said, "my child didn't purchase this, so now I, as the failover, have to go and purchase this item because the child was not successful." But actually the child was successful; the parent just checked too early, and these events got interleaved. This is a problem because the parent fundamentally made their purchase based on stale data. By the time they made their purchase, the read they had made was invalid, because the child had already purchased the item. They would have had to reread before finally making that purchase to do this successfully, but they had absolutely no signal to tell them that they needed to reread. Now, computer scientists like to sound smart and borrow words from math and physics, so there's actually a term for the relationship between these two events: causality, or causal ordering, or causal dependency, because the purchasing of the item is dependent on the read. If you look at this from a physics-relativity perspective, the only reason the purchase happens is because the read happened, and the outcome of the read was: there's no item already in the purchase history, so I'm going to proceed with purchasing. Fundamentally, this is the type of language a lot of people use to talk about these orderings of events in systems, because events with causality involved, when they overlap, lead to inconsistencies in data. Moving on from here, there is a really obvious way a lot of people think about solving this problem, and it truly does solve it, which is to just combine things with causal dependencies: why can't they happen at one point in
time? When folks think about this, they typically think about transactions, and atomic operations in the sync libraries of their programming languages. This does solve the problem, and it leads to a pretty good segue, which is that so far I've really been describing atomicity in this example. That's the A in ACID, if you'll recall, but not the C. That's because while we have grouped all these things together, we've been working under the assumption that every single time an action takes place, as we follow the flow of time, it is immediately visible to all of the outside actors in the system. And this is where we really get deep into the physics and relativity analogies that exist in distributed systems. You can imagine scenarios where, when you perform these actions, the visibility happens later. With the child and the parent, the exact same story plays out: the child performs the atomic operation, but the parent's read happens before the atomic operation is visible to all participants in the system. This is obviously far more common in distributed systems, because, for example, you might have a read replica that is getting changes streamed to it asynchronously, making a best effort to stay up to date with the most recent information, so that folks in another geo region on the planet get fast performance in exchange for stale data, or at least data that is not at the most consistent level. So this is quite a common real-life scenario, largely in distributed systems. The difference between atomicity and consistency is this visibility aspect that shows up in relativistic systems, and it's what will play out time after time in any distributed system. It may seem far-fetched looking at this example: hey, how can the visibility actually be delayed if it's not a distributed system? But even in a normal database, running on a single machine, there is a window between the time at which a transaction commits and the time at which it is propagated to the visible data that is queried, and in that window you can race against the visibility. So this can even happen on single nodes. That's getting a little bit into the weeds, but I just wanted to be clear that this is not purely a distributed systems problem. So I have this consistency spectrum where I lay out the problem in two different dimensions when we're talking about the types of consistency that systems can have. Across the bottom, on the x-axis, I have immediate consistency, which is what I was talking about for most of the example: once a change happens, it is immediately visible to everyone in the system. On the right-hand side of the x-axis I have eventual consistency: time passes and eventually folks will receive an update, they'll eventually see the change that has occurred, and it could take a basically arbitrary amount of time until that happens. This is what's most commonly discussed, I feel, when folks talk about consistency: do we need immediate or eventual consistency, what does this system look like? Especially a few years ago, when there was a lot of discussion of NoSQL systems, a lot of those were making consistency tradeoffs and opting for more eventually consistent designs. So folks talk about that a lot, but they talk less about what I have portrayed here on the y-axis: going vertically, we have strict and weak ordering. I think the other important aspect of consistency, which often gets overlooked, is the ordering of the operations that go in. That means that if something occurs first, it's guaranteed to be ordered before the thing that happens after it. That might sound a little vacuous, but I'll get into why it's relevant, and the systems that benefit from it, later. I dropped in some examples here of these types
of systems. For example, linearizability: you can see it sits at the corner that is both the most immediate and the most strictly ordered, so that is the strongest guarantee you can find in a system. What it actually means is that there is a total global ordering across all of the changes in the system, and when those changes are applied, they are immediately visible to everyone in the system. That is the strongest guarantee you could possibly get. On the polar opposite corner, I have eventual consistency that is also weakly ordered. There's an interesting bit of technology there called a conflict-free replicated data type, and CRDTs are a building block that a lot of systems are exploring right now. What they let you do is propagate changes in scenarios where it doesn't matter in what order the changes are applied, so you can use them as a very effective way to synchronize data. They basically rely on the property of commutativity. You might remember this from algebra class, or maybe just from learning addition as a child: one plus two equals three, and two plus one equals three. It doesn't matter in what order you receive the changes; when you sum together a bunch of numbers, they always converge to the same result. So if you're performing operations on your data where, regardless of the order you apply them in, you always converge at the same result, that's great: you can use one of these systems and still get correct answers. There are other variants in this space. I use serializability, which means there is an order, but not a total global ordering: everything happens independently and isolated, in some order. And then we also have sequential consistency, which is more similar to what you see out of a lot of the NewSQL systems. I just wanted to show that there are varying levels in the spectrum; it's not just the opposite corners and the middle ground. But I want to dig a little deeper, because there are super important properties to understand here. If you look at the bottom, I've replaced immediate and eventual on the x-axis with slow and fast, because in the real world that is the consequence of choosing one of these. It is way, way less performant to choose something that is immediately consistent, because before you acknowledge a write you have to make sure it is going to be visible to everyone, which means it probably has to be replicated everywhere before it becomes visible and is accepted as a write. With CRDTs, on the other hand, you can basically dump out a stream of changes and trust that eventually everyone gets all of them, and then they're good. What is really interesting here is that we have this middle ground, this box, and I call this box "cleverness", because this is where you're going to find a lot of the real-world systems that are compromising to make themselves viable. If you have a problem that needs to be solved with linearizability, or a problem that can be solved with CRDTs, your choices are pretty much made for you: your problem domain has made it really obvious, and you're stuck in one corner of the spectrum. But most systems will not actually have those problems; instead they're going to live somewhere in this middle ground, and this middle ground is where there are a lot of interesting tricks you can partially apply to gain a lot of benefits in your system without necessarily paying the costs globally across all of the data you're working with. Excuse some typos in those slides. So that was basically what I wanted to cover at the high
level of the conceptual side of consistency. Now I want to jump into some examples of live system components that folks are using in industry to solve their problems in the cloud native ecosystem. First, S3. This is actually kind of funny: I wanted a good example of a ubiquitous system that's eventually consistent, which a lot of folks are using, and I immediately thought of S3. I built a product in the past on top of S3, and you would submit blobs to S3, it would tell you "hey, I wrote it", great, but if someone else immediately tried to pull down that blob, it would not be available yet. So there was not necessarily a guarantee that after you had written something, it was immediately visible to external actors. But as I went to make this presentation, I found an article: AWS actually fixed this a couple of years ago. I don't know if this is true across all the implementations of blob storage, so for Google Cloud Storage, or Azure Blob Storage, or MinIO if you're running something yourself, I'm not actually sure they'll make those same strict guarantees. But certainly for the vast majority of the lifetime of these blob services, eventual consistency was the status quo. What's interesting, and I'm going to dive into this deeper later, is that the system backing S3's metadata storage was given additional consistency capabilities, and that is what made it possible for the developers of S3 to change the consistency model, to make it more consistent. That is not very typical; this is actually an example rebutting my earlier conjecture that it's impossible, or prohibitively hard, to add consistency after you've designed a system without it. But S3 is sufficiently simple that it was not much of a hurdle once the underlying dependency they had offered that capability. So remember to take everything I say with a grain of salt, because you know the system you're working with, and I have to speak in generalities about the systems I think are, and have seen to be, the most commonplace; maybe you're building something that is not exactly that. Now here's one that's going to be really fun: relational databases. This is the kind of system a lot of people are familiar with, and I think in the general case a lot of developers believe that by simply adopting transactions in their usage of a relational database, they have solved their consistency problems. I'm here to tell you that transactions are not a silver bullet, not nearly. What dictates your consistency in these systems, far more than transactions, is the isolation level set in the database. Even if you don't use transactions whatsoever, the statements you send to the server are implicitly wrapped in transactions; everything is a transaction, an atomic unit, inside a relational database, whether you use the keyword or not in the SQL you're writing. All right, so what are isolation levels? Isolation levels are usually set per transaction or per session, and effectively they dictate the consistency of the data you're working with in those transactions. These levels come from the SQL standard, and MySQL and Postgres agree that these are the basic isolation levels. In MySQL, the default isolation level is repeatable read, the second from the top, and in Postgres it's actually read committed, the third from the top. So Postgres is by default more lenient: it can give you less consistent responses by default if you don't go in and clarify what
isolation level you need in the SQL you're writing. I want to run through the different scenarios outlined here. Dirty reads are when, within your transaction, you read a row and see changes from another transaction that has not been committed yet, that hasn't actually been written to the database. So I open a transaction and modify something; you open a transaction and go to read that thing, and you see my change. That is a dirty read. Unless you're explicitly choosing to go inconsistent, that is unlikely to ever be a scenario you see when using a relational database, unless you specifically say "I want read uncommitted". So it's not a super common problem, but it's interesting to note that it's even possible, since it effectively eliminates some of the benefits you get from what's called MVCC, or multi-version concurrency control; it's super atypical unless you're working with a database that is not MVCC. Then we have the non-repeatable read: this is when you reread data that has been modified and committed by another transaction. It means that within your transaction, another transaction modifies some data you've already read, and if you go to read that data again, you'll see it updated within your transaction. This breaks the snapshot behavior that a lot of people assume transactions provide, but you'll find that the non-repeatable read is actually possible in read committed. So by default in Postgres, this is totally a scenario that can happen to you, even though that's probably shocking to people who believe "hey, I'm supposed to be reading from one particular snapshot, I'm not supposed to see these types of changes". It's quite possible. And finally we have the phantom read. This is the quietest one, because you're not going to see it, but it's also the one that's actually going to corrupt your data, and in the most surprising way. Basically, you read some data in your transaction, another transaction modifies it, and when the database goes to apply both, it just happily does so; it doesn't try to rerun your transaction. Your transaction performs a read and then, depending on the value of that read, performs its write. If another transaction comes in before it and totally swaps that value out, it doesn't matter; your transaction is going to proceed anyway. If you had reread the value, as in the non-repeatable-read scenario, it would not have helped: it would have lied to you and said the value hadn't changed. But fundamentally, by the time these transactions get committed, the value will have changed, and you're going to be SOL. Phantom reads are possible even at the default isolation level in MySQL, so unless you have explicitly configured your data store to be serializable, which is the only level at which none of these can occur, this can happen to you. Now, the interesting thing you get in that "cleverness" box I showed in the diagram earlier is that there are constructs in SQL that allow you to do individual row-level locking. In SQL there is a SELECT FOR UPDATE clause, which says: I'm going to read this row because I am going to causally update some other row based on its value. That lets you describe this causal dependency, this causal ordering, in terms the database can understand. What actually happens internally when you use SELECT FOR UPDATE is a row-level lock on that data, which prevents any other transactions from modifying it for the duration of your transaction, until you
can actually commit your transaction. This is what gives you the guarantee that no one else changed that value out from underneath you, and it's having your cake and eating it too: you don't have to turn on full serializability to get that guarantee, you can do it in any of these modes. That is the cleverness box again: you're selectively choosing "I need consistency for this operation right here, locally, but I don't need consistency across the board everywhere". There are a lot more tricks deep inside these relational databases for working with data, but I think this is the most important high-level takeaway. If I had ten minutes to teach someone about consistency and relational databases, this is it: walk away knowing that isolation levels are a thing you should constantly keep in mind when writing schemas for relational databases, and that if you need to read-then-write in a relational database, you should most likely be using SELECT FOR UPDATE. Cool. So let's talk about a less commonly used, but equally interesting and very relevant, class of system: lock services. That's what I call this class of software, although while they were originally designed as lock services, they have larger scopes these days. These are projects like etcd and ZooKeeper. So what are the guarantees here? Or actually, let's first talk a little about what lock services are. In probably the mid-to-late 2000s, Google wrote a paper about a system they built internally called Chubby, a distributed lock service. The point of the system was: we have distributed systems, a bunch of different applications that all need to coordinate, so they need some mechanism by which one of them can safely acquire exclusive access to some resource. They need a lock, a distributed lock. That's very tricky, and it turns out that formally, to solve that problem, you need a linearizable system. So what ended up happening is they described this distributed lock service in a paper, and ultimately we see projects inspired by it. ZooKeeper, I believe, is definitely inspired by that paper; it doesn't implement it directly, it implements its own unique algorithm called ZAB. Then we also see systems like etcd, inspired by the later computer science research into the same consensus problems in distributed systems. Ultimately, these are linearizable systems. As they matured, people decided: hey, this is a really useful property, having something linearizable, so we always have guarantees about whatever critical data we have in our distributed system; let's save it all over there, not just locks. So a lot of these services expanded beyond locking as a capability and became general purpose key-value stores. Notably, etcd solves this exact problem: it's a key-value store that is actually the core data store used by Kubernetes. What's really interesting is that while it is linearizable, there are lots of tricks you can use in the protocol and under the hood to optimize the performance of such a system without impacting the external, user-facing freshness of the data being seen. There are a lot of really interesting distributed systems tricks here; I'm just going to call them tricks because I don't want to dive too deep into them, but there are lots of variations of algorithms under the hood that shortcut things to make this really fast. And then there are also capabilities in these systems to relax the consistency that you can
You can actually choose to use the system that way if you'd like to trade consistency for higher performance. The important thing to note is that when this consistency is being relaxed and really played with, it's the authors of these systems doing it, at the protocol level and the API level, so it's not really exposed to anyone on the outside consuming the system; it's more an optimization around the guarantees they can provide. Some do provide APIs where you can choose, for example, whether a read needs to be a quorum read or not, but by and large the tricks are internal to these systems. So the most critical, most strictly required strong consistency gets served by lock services. Then, basically, what happened was the NewSQL revolution, which eventually turned into what we all now refer to as distributed SQL. I think of the databases in this space as CockroachDB and TiDB, though there are a couple more. This era is really interesting because these systems set out to solve the problem of scaling out a relational database horizontally: you can keep spinning up individual nodes and scale up the system without replication lag, and you don't have to direct writes to one particular node. They're solving the traditional problems in scaling relational databases, and doing so by applying some of the research from the lock service optimizations. The creators of these databases looked at the research that had gone into the lock services and said: the internal bookkeeping data the database needs in order to scale, we can store that using these lock service techniques, using the consistency tricks those systems developed, and that is going to make it so we can actually scale and serve our SQL systems.
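A rough sketch of that layering, with invented names: the cluster's bookkeeping (which node owns which key range) lives in a strongly consistent metadata store, the role the lock-service research enables, while ordinary user rows live on plain shard nodes. Real systems do this with Raft groups or a placement driver; this is only the shape of the idea.

```python
# Toy model of distributed-SQL layering: strongly consistent metadata
# routes requests; user data lives on ordinary shard nodes.

class MetadataStore:
    """Stand-in for the consensus-backed store (the 'lock service' role)."""
    def __init__(self):
        self._routes = {}          # range start key -> owning node

    def set_owner(self, range_start, node):
        self._routes[range_start] = node

    def owner_of(self, key):
        # route to the greatest range start <= key
        starts = sorted(s for s in self._routes if s <= key)
        return self._routes[starts[-1]]

class Cluster:
    def __init__(self, meta, nodes):
        self.meta = meta
        self.nodes = {name: {} for name in nodes}  # node -> its local rows

    def put(self, key, value):
        self.nodes[self.meta.owner_of(key)][key] = value

    def get(self, key):
        return self.nodes[self.meta.owner_of(key)].get(key)

meta = MetadataStore()
meta.set_owner("a", "node1")   # keys from 'a' -> node1
meta.set_owner("n", "node2")   # keys from 'n' -> node2
cluster = Cluster(meta, ["node1", "node2"])
cluster.put("apple", 1)
cluster.put("pear", 2)
print(cluster.get("apple"), cluster.get("pear"))  # 1 2
print(meta.owner_of("pear"))                      # node2
```

The point is that only the routing table needs the expensive strong consistency; the hot path for user rows can scale out across nodes.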
Now, you'll notice that as part of this process, they're not actually passing any of that on to the end users. They're still providing the same isolation levels, the same transaction logic, the same SELECT FOR UPDATE logic, all the things I mentioned previously about relational databases. The guarantees are still here in these systems; they've just been made possible in this new auto-scaling world. But that's not to say these distributed SQL systems are magically linearizable. In fact, of the examples I've given here, CockroachDB and TiDB, neither is linearizable. So really, the folks benefiting from the consistency being used here are the database admins, the operators, the SREs running these systems, because they can scale out these relational databases by using the stronger consistency only for the core data that requires it, and not for any of the end users' application data. Cool, so that unlocked a bunch of new capabilities for the SQL databases, but the end users themselves didn't necessarily get anything better. And this is where a less well-defined but, I would say, interesting new era of databases comes in, which I'm calling "ad hoc" here, but which I think of as flexible consistency systems. I use two examples here, Cosmos DB and DynamoDB, which are foundational data stores at Azure and AWS respectively. Dynamo is actually notorious for having been described in a paper published many years ago, probably in the mid-to-late 2000s, and the Dynamo described in that paper is not at all like the DynamoDB that runs at Amazon now, which is why they've been able to keep adding consistency capabilities, to S3 as well. Systems like this now expose consistency in the end-user API, so the folks consuming these databases can choose, on the fly, what level of consistency they want for the data in each response.
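Here is a toy store with that kind of per-request knob. The class and method names are invented for illustration; the real-world analogue is DynamoDB's `ConsistentRead=True` parameter on `GetItem`, where a strong read costs more than a default eventually consistent one.

```python
# Toy key-value store exposing per-request consistency, in the spirit of
# DynamoDB's ConsistentRead flag. Replication lag is modeled as an
# explicit step so the staleness is visible.

class FlexibleStore:
    def __init__(self):
        self._primary = {}    # always current
        self._replica = {}    # lags until "replication" runs

    def put(self, key, value):
        self._primary[key] = value           # writes land on the primary

    def replicate(self):
        self._replica.update(self._primary)  # async replication, as one step

    def get(self, key, consistent=False):
        # consistent=True  -> read the primary: costlier, always fresh
        # consistent=False -> read the replica: cheap, possibly stale
        source = self._primary if consistent else self._replica
        return source.get(key)

store = FlexibleStore()
store.put("plan", "v2")
print(store.get("plan", consistent=True))   # 'v2': strong read sees the write
print(store.get("plan"))                    # None: eventual read hasn't caught up
store.replicate()
print(store.get("plan"))                    # 'v2' once replication completes
```

The caller decides, request by request, whether freshness is worth the extra cost, which is exactly the shift this era of databases made.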
So if we touch back on the other systems: lock services give you very strict, strong consistency no matter what you do, and relational databases don't give you any options there beyond the isolation level and the SELECT FOR UPDATE mechanism, which are very domain specific and still non-obvious; there's no way in those systems to say "treat this whole glob of things, this whole operation I'm about to perform, with this amount of consistency." So this is very, very different from what we previously had. This is actually what I would describe as unifying all the benefits I just talked about, because now the users are more in control of the consistency of their data. They pay for exactly what they use in terms of the performance cost of consistency: they can slow things down just for the part that needs it, and for the things that don't need to be strict, they can take advantage of the optimizations and really go for it. And what's interesting here is that SpiceDB, the database my company builds, is an example of one of these ad hoc systems. Users can specify, per request, the consistency they need, and there's a default for when they don't. But what is really, really interesting is that this is an opportunity for a lot of user experience research, because, at least in our domain, since we're specifically handling authorization data, we can actually tell users what is going to happen at the different levels of consistency. Because we know more about their data, because we know the domain they're operating in, we don't have to explain distributed systems research concepts to every user of the system. Instead, we can make it obvious that if you check a permission with a given option, the result will be based on a point in time, where time is rounded for performance reasons. So we can draw on analogies that folks understand from our own domain.
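To make the per-request idea concrete, here is the shape of those options sketched as plain dicts. The field names follow SpiceDB's v1 consistency options as I understand them (minimize_latency, at_least_as_fresh, fully_consistent); the real client builds protobuf messages via the authzed library, so treat this as a sketch, not the authoritative API.

```python
# Sketch of per-request consistency on a permission check, using plain
# dicts in place of the real protobuf request messages.

def check_request(resource, permission, subject, consistency):
    """Assemble a hypothetical check-permission payload."""
    return {
        "resource": resource,
        "permission": permission,
        "subject": subject,
        "consistency": consistency,
    }

# Fastest: allow the server to answer from a slightly stale snapshot.
fast = check_request("doc:readme", "view", "user:alice",
                     {"minimize_latency": True})

# Read-your-writes: at least as fresh as the token returned by a write.
zedtoken = "<token from a prior write response>"  # opaque placeholder
fresh_enough = check_request("doc:readme", "view", "user:alice",
                             {"at_least_as_fresh": zedtoken})

# Strongest (and costliest): evaluate against the latest committed data.
strong = check_request("doc:readme", "view", "user:alice",
                       {"fully_consistent": True})
```

Notice that the caller never reasons about quorums or replication: they pick the option whose name matches their use case, "fast", "as fresh as my last write", or "fully up to date".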
Rather than trying to teach people very low-level concepts, we can speak in terms they already know. I'm actually super excited for more of this. I would really like to see APIs where you can sneak the concepts of consistency in, so that folks, by choosing the proper API and thinking about their use case, no longer think about consistency. They think about their use case, and by virtue of picking that, they have chosen the API with the proper consistency for what they need. It's very cool stuff, and I'm really excited about this space generally. It wouldn't be possible without an approach where you start with a very consistent worldview and then allow folks to relax it over time. I feel like when you work with the relational database model, where things are very relaxed and you then layer on more and more strictness, you just have to know too much, both about the domain details and about the internals of the data store or system you're working with. I think it's only by starting from a safe posture and relaxing from there that we're able to see more of these ad hoc style systems, where adding or using consistency is way more flexible. And with that, that's all I have for today. This is a very deep subject, and I just want folks to know that I have only scratched the surface here; people spend their whole professional careers researching this stuff. (That's a cuckoo clock going off; that's how I know I've hit my time for this webinar.) But yeah, consistency is a super important subject. I still see very seasoned developers overlooking or downplaying it in their systems, especially when they make architectural decisions for the overall design of their system, and it really shouldn't be overlooked. It's one of the most important things you need to discuss and decide when you're designing something.
With very few exceptions, we have historically not worked with systems that give you the flexibility to adopt more consistency later if you need it. So if you've foregone consistency and chosen a system where you would somehow have to layer more on, you often just can't go back. You're often stuck, and you'll break the contract and things will start breaking. There are many large companies, ones that folks in industry consider the bastions, the pinnacle of engineering, that hire experts from all around the world and pay well because they're giant companies building really cool products, but a lot of these companies have huge problems because of the way they grew. They were incapable of dealing with consistency, they didn't have the technology at the time to deal with the consistency of their systems, and they have outages as a result, they have data loss as a result, and they're stuck in a state where they're forever trying to migrate critical data, or data with different consistency requirements, into new systems. I do not want to see more developers in that scenario, especially since we've made so much progress on this subject as an industry. And with that, I'd like to thank everyone for watching. If you want to discuss more topics like this, or learn more about how, for example, SpiceDB uses ad hoc consistency, you can join our Discord; the link is right in front of you. That's not exclusively for SpiceDB users: it's an open source community where folks who care about distributed systems can talk about all kinds of research and practical usage in industry. So thanks for your time.