All right. Let's get started. My name is Greg Farnum, and we are here today to talk about testing Ceph, the distributed storage system, and the things that have gone well with that over the past decade and the things that have gone poorly. I say the past decade because that's actually about how long I've been working on Ceph. It's very exciting for me. Quick show of hands: who here thinks they know anything at all about Ceph? All right.

A whirlwind introduction, because the particulars don't matter too much. What's important is that Ceph is a distributed, scalable storage system. It runs a whole bunch of daemons on a whole bunch of different servers and aggregates all the storage in them together, so you have hundreds or thousands of processes coordinating and working together in one system, and we need to be able to test that. At the bottom layer of Ceph we have RADOS, the reliable autonomic distributed object store, and that's responsible for storing all the user data and keeping it safe in the event of power outages and server failures. You talk to RADOS using the librados library, and on top of that we've built an S3- and Swift-compatible HTTP service, the RADOS Gateway; a virtual block device, RBD; and the POSIX-compliant CephFS distributed file system. At the RADOS layer there are two main kinds of components. There are object storage devices, or object storage daemons (OSDs), of which you have tens to ten thousand in a cluster. They're responsible for actually taking data from clients and storing it on disk, and for noticing when their peers fail and recovering data back and forth. And then you have a small number, three to five, of monitor servers, who are responsible for taking in reports and saying, oh yes, this OSD over there failed, and sending out updates to the system so that the OSDs can adjust their recovery and who's in charge of serving reads and writes. From a client's perspective, your application runs, it uses librados, and there's just sort of a network pipe out to the RADOS cluster. That's a very quick overview; if you want to know more, I'm actually giving an intro-to-Ceph talk tomorrow where we'll go into more detail. From the testing perspective, though, it's a distributed system with lots of processes running, and we need to be able to test different kinds of failures, different behaviors across processes, and the ways they interact live.

So, Ceph is short for cephalopod, the biological class of creatures that includes squids and octopuses. Teuthology is the study of cephalopods, and so that's the name of our testing system. Teuthology started out almost eight years ago now, when we needed to formalize the way we were testing Ceph. At the time, the Ceph team consisted of three people sitting in an office together writing code, and we had some scripts, and we had some tests like PJD or, I don't know, dbench, whatever, that we would periodically take and run against a CephFS mount and see if they passed, and if they didn't pass, we'd figure out why. But it was all very ad hoc. We didn't have anything that was running on a regular basis.
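As a quick aside, that client's-eye view looks roughly like this with the Python rados binding. This is just a sketch for orientation, not anything from our test code; the config path, pool name, and object name are placeholders.

    import rados

    # Connect to the cluster described in a ceph.conf (path is a placeholder).
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()                     # the "network pipe" out to RADOS
    try:
        ioctx = cluster.open_ioctx('mypool')      # I/O context for one pool
        try:
            ioctx.write_full('greeting', b'hello world')   # store an object
            print(ioctx.read('greeting'))                  # read it back
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()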
We also didn't have anything that was guarding git commits. So at one point I ran a test, and I said, oh, this test failed, and I tracked down why, and I found an if condition, and I said, oh, that if condition is backwards. So I swapped it and pushed that commit into git. A few weeks later someone else ran a different test, and they said, huh, something's not working right in this test, and they debugged it, ran through, found an if condition, and said, huh, that if condition is backwards, and they swapped it back. Then I happened to run my test again, noticed that the if condition I'd changed had been swapped back by someone else, and went and looked, and realized that depending on which test we ran, we were swapping a less-than or a greater-than back and forth. In fact the problem wasn't that the if condition was backwards; it was that the logic test in the code wasn't good enough, and nothing in our regular testing was telling us that. That's not a sustainable way to build a distributed system with a lot of people.

So, eight years ago, nothing like this existed for distributed systems. We had a new teammate who was very excited about building a test system, and he first tried out Autotest, which I think came out of the Linux kernel community. But Autotest was all about doing lots and lots of things to one server at a time, and it was very important to us to be able to manipulate multiple machines at once, so that was unsuccessful. So he sat down, and the first thing he wrote was a module called orchestra. It was in Python, and it used SSH and gevent. Nowadays this is probably not anything super exciting, but it was not quite as common back then, and it was really important to us. Orchestra was cool because it let us issue SSH commands to different nodes from a controller, and it was real-time and interactive. We could do things like reach into a server, run a process, get an object back that represented that running process, send it standard input, and receive its standard output. And we could do things on a particular server, or on all the servers of type x86, or on all the servers that aren't of type x86, within our cluster.

Teuthology itself was initially just a test runner. We would run tasks that told the system to do something on targets, meaning machines, and those tasks would map onto particular machines via roles. It would automatically set up the Ceph cluster, monitor the health of Ceph, run the test, archive whether the test had passed or failed along with the logs generated by the running processes and any core dumps that happened during the test, and then clean the machines back up. Targets, this is all in YAML format, were just a list of usernames and machines to connect to. Roles would be a mapping saying, hey, on my first machine I want to run one monitor, one metadata server, and one OSD; on the third machine we have another monitor and a client. And the tasks would be something like, hey, I want to invoke the ceph task, I want to mount a kernel client on the machine with the client.0 role, and then I want to run a workunit, the one located at the file suites/dbench, on all the clients. These tasks are what Python calls context managers.
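To show the shape of such a task, here is a rough sketch of one written as a Python context manager. It is illustrative only, not the real teuthology code; ceph(ctx, config), setup_cluster, and teardown_cluster are made-up stand-ins for the real work.

    from contextlib import contextmanager

    def setup_cluster(ctx, config):      # hypothetical stand-in for the real work
        print('starting monitors and OSDs on the target machines')

    def teardown_cluster(ctx):           # hypothetical stand-in
        print('stopping daemons and cleaning the machines back up')

    @contextmanager
    def ceph(ctx, config):
        setup_cluster(ctx, config)
        try:
            yield                        # nested tasks (kclient, workunit, ...) run here
        finally:
            teardown_cluster(ctx)

    # The runner enters each task in order and unwinds them in reverse when done.
    with ceph(ctx={}, config={}):
        print('kernel client mounted, dbench workunit running')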
So the ceph task, despite only appearing once, will set up a Ceph cluster, and then when all the other tasks are done it'll tear that Ceph cluster down. The kernel client task will mount the kernel client, and then when all the tests have run it will unmount it before returning, et cetera. The teuthology interface combines all the YAML fragments you give it on the command line, so I can have a targets.yaml file that I use over and over again that consists of three machines, and I can have a kernel-client roles.yaml and a kernel-client dbench.yaml that specify, oh, this one's for running the kernel client, and I want to run this particular dbench test, and I'm archiving the results in this folder. Teuthology just shoves all those YAML fragments together into one file that it uses as its configuration. And when the test is run, it logs everything. We've got, oh, it passed or failed, and if it failed, here's the line that caused it to fail. We have a log of all the input and output that the main teuthology process saw, which includes all the standard output of the daemons running on other servers. We have the configuration that we started with, the SHA1 of the version of Ceph we're running, and also all the logs that were generated on our remote nodes.

That was in 2011. Teuthology today looks a little different; some of this was done later in 2011, some in 2012, some of it last year. First of all, instead of running on a cluster of like five shared dev servers, teuthology mostly now runs in the Sepia lab. This is a community lab devoted to Ceph development, consisting of several hundred machines that any committed Ceph contributor can get access to. More than a hundred people can get into it right now, and all these machines are just running Ceph tests 24/7 to ensure stability and test incoming code. When you have that many different people running tests, it's very important that you not accidentally use the same machines, because things go very wrong if two different runs are issuing commands to turn clusters on and off on the same nodes. So you can lock machines. It's not a very sophisticated thing, it's just a database sitting somewhere that says, oh, these machines are available, I've given you three of them. And instead of running teuthology directly from your server or from your laptop, you can schedule it. You just invoke the teuthology-schedule command instead of teuthology, and it'll take all the YAML files you give it and put them into a Beanstalk queue. Then there's a server running in our lab with something like 50 teuthology worker processes that go, oh hey, I'm idle, are there any teuthology jobs available for me? Hey, here's one. Let me lock the three nodes that are required for this job, run the job and store the results, and then, hey, are there any more jobs available for me? It's super primitive.

Second of all, instead of having whatever random YAML fragments you want, we've organized them all into suites and have a command that can assemble those suites into specific jobs. Suites are just collections of directories of YAML fragments that are combined in various ways. So here's a small example from one of them: we have a verify suite inside of our rados suite that runs through a bunch of validation. Now, there are a lot of different ways that you can configure Ceph.
So this example just does a couple of different things. First of all, we might thrash, either the default thrashing or no thrashing, and that's doing mean things to the cluster during the test. Next, we can choose the object store, which decides how data is stored within an OSD: either our classic backend, where we have an XFS file system to store data in, or this newer thing called BlueStore, which is our own custom backend that manages the block device directly. You can configure BlueStore in a couple of different ways. And then we have a bunch of different tasks: we can test the recovery of the monitors, or that the librados API works, or that something called RADOS classes function. When you run teuthology-suite against this directory, the first couple of entries up there are shared across all the jobs, and then it goes through and selects one YAML fragment from each directory to create a job. So this first one is running with the default thrashing, the bluestore-bitmap configuration, and monitor recovery, and that's job one. Job two is with the librados API test, and job three is with the RADOS classes tests. Then job four switches the object store type to the BlueStore compression configuration and cycles back through. So we get a combinatorial explosion of all the different kinds of fragments, which means that if we add some new feature, we can just add one new fragment that controls that feature and run it in combination with all the others in the system. (There's a rough sketch of this assembly below.)

The QA suite test coverage is pretty extensive. These are... well, I don't know if these are still all the suites we have, but they were all the suites we had when I generated this slide. Most of these are different Ceph components; some of them are targeted suites used for specific things. I alluded before to the thrashers. Thrashers are sort of chaos monkeys that run while other tests are in progress, and they can do things like say, hey, I'm going to randomly turn object storage daemons on or off, or change the way we shard our data across the OSDs, or do other mean things, maybe bring OSDs into the cluster or take them out.

So that's the basic function of teuthology. Here's how we use it. First of all, developers are granted access to the lab, so they can run tests on work-in-progress branches as they're building up new features. We can push a branch to a special ceph-ci.git repository on GitHub and it gets automatically turned into Ceph packages. Then we invoke the teuthology-suite command saying, all right, I want to run the rados suite on this set of machines against my work-in-progress branch. That sends off all the jobs in the rados suite into Beanstalk, it churns through them, and the results get posted publicly at pulpito.ceph.com, which is our test result viewer. Second, pull requests get submitted to us on GitHub, and the tech leads and reviewers go through those PRs, and when they think they might be ready for merge, they build integration branches out of several PRs at once and run them through the appropriate suites to check for issues. We have some Python tooling to help with building those branches, pushing them for builds, and running the tests. Nothing merges to the Ceph master branch without passing these tests, and the results are publicly available.
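Here is that rough sketch of the combinatorial assembly. It's a toy illustration, not the real teuthology-suite code; the directory and fragment names are just stand-ins modeled on the example above.

    import itertools

    suite = {
        'thrash':      ['default.yaml', 'none.yaml'],
        'objectstore': ['filestore-xfs.yaml', 'bluestore-bitmap.yaml',
                        'bluestore-comp.yaml'],
        'tasks':       ['mon-recovery.yaml', 'rados-api.yaml', 'rados-cls.yaml'],
    }

    # One job = one fragment chosen from each directory.
    jobs = [dict(zip(suite, combo)) for combo in itertools.product(*suite.values())]
    print(len(jobs))   # 2 * 3 * 3 = 18 jobs from this small example

Adding a new fragment to any directory multiplies it against everything else, which is exactly why the suites grow so quickly.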
When a test run completes, the reviewer either says, well, something failed, I think it might be this PR, so I'm going to have that person look at it, or, hey, everything passed, I can merge everything. And then finally, we have a variety of nightly jobs. Most of the suites are run somewhere between once a week and every single night, and again, all the results are publicly posted. You can see which jobs we actually run in the crontab configuration that's on GitHub. Super nice.

However, there are some gaps in teuthology where we have not succeeded. I don't want you to think that we're doing too bad a job; we are actually successful in a lot of ways. We're pretty good at this, it turns out. Functional coverage is great. We have a lot of specific tools to test behaviors. The ceph_test_rados binary is a thing that generates random API calls and knows what results it should be getting based on what it has put in in the past, and we run that against all the thrashers in the system to make sure the client doesn't see different results depending on what failures happen, because that's one of Ceph's guarantees: you don't see that. We deliberately inject failures into test daemons, like saying, hey, I want you to randomly fail to deliver one message out of a hundred to the OSD in the messenger layer. We do things like fiddle with the RADOS objects and make sure that CephFS or RBD on top of them notice those failures and recover from them in the ways we expect. We can go poke at files in XFS and make sure that the OSD notices, and that the total RADOS system notices, and recovers in the ways we expect. And of course that's not the only way we test Ceph. Different sub-components and associated projects, like the ceph-ansible installer or the ceph-volume configuration and OSD deployment tool, have their own testing systems. And we have a whole bunch of unit tests, some of which are not really unit tests but are pretty sophisticated things, that run inside of our make check build target, which is built on every single pull request and whose results are posted so you can see whether it's passing or not.

But now that I've covered my butt about how we're great, there are some gaps. Teuthology handles daemons directly: it just SSHes into a test node and runs the process in the foreground, which means it doesn't test our init system configuration and scripts. Teuthology does not do performance testing. We have a newer perf suite that uses a separate thing called CBT, the Ceph benchmarking tool, but there's no automated analysis of it and the tests are very limited. We expect in teuthology that tests run within hours and in as little space as possible, so there's not a lot of scale testing or long-term testing that the system does. And finally, teuthology handles the Ceph installs and cluster config on its own, so it does not test deployment tools.

There are some fixes that we've engaged in and some that we've planned for these issues. We want to expand the API so that we can restart and signal daemons using the init system instead of literally saying, hey, I want to go to the monitor node and run this command with everything filled in by hand, which is how a lot of things, not everything, but a lot of things happen right now. In terms of performance testing, for several months now we've been gathering some historical data in our perf suite, and we want to start guessing at a rough limit for how long is anomalous and failing tests when the times start getting too long, but even that's pretty primitive.
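As a sketch of what even that primitive check might look like (this is illustrative, not the actual perf suite code; the function name, the five-sample minimum, and the three-sigma threshold are all arbitrary choices):

    import statistics

    def is_anomalous(history_seconds, current_seconds, sigmas=3.0):
        """history_seconds: past runtimes of the same job; True means flag it."""
        if len(history_seconds) < 5:                  # too little data to judge
            return False
        mean = statistics.mean(history_seconds)
        stdev = statistics.stdev(history_seconds)
        return current_seconds > mean + sigmas * max(stdev, 1.0)

    print(is_anomalous([610, 598, 640, 625, 615], 1900))   # True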
So long term, it would be great to track performance numbers and alert when they're out of bounds based on some prediction of how we expect a cluster on these nodes to perform. For scale testing, we just decided, you know what, teuthology is the wrong place for scale testing; it's an integration test suite. Within Red Hat we've got a whole separate group that runs tests on long-term allocations in their own lab space, and there are a few things that do this, not too many. We'd also like to start mocking up and injecting aged data and large-scale data within shorter tests by, you know, running something for a while, cloning the disk state, and being able to copy that back in very quickly without going through Ceph, and seeing what happens.

This last one is a pretty big problem for us, actually. When we started teuthology, it was our expectation that all of the Ceph users would be writing their own Chef or Puppet scripts, because they were managing data centers and Ceph would just be another thing they managed in their data center. That turned out to be very wrong, but teuthology never got updated. So we need to write a new API to request installs and farm them out to things like ceph-ansible or SUSE's DeepSea or Rook on Kubernetes or whatever is next. And that's what I'm working on right now. And finally, sort of related to that problem, we have a new component in Ceph called the Ceph Manager. The manager is a Python service, unlike the rest of Ceph, which is all in C++, and it handles some reporting from the nodes, aggregating data, and things like running a web dashboard. But it's also got one set of cool new functionality coming in called orchestrators, which let us do day-two operations, things like saying, hey, I want to provision some new OSDs on these servers, or, hey, I've got an NFS cluster that's backed by CephFS and I need to deploy a new NFS server. Those orchestrators rely on ceph-ansible or DeepSea or Rook to do those things, because the manager doesn't do that work itself, and because there's no integration of those tools into teuthology, this is a key new piece of functionality that you can't test in teuthology right now. So it's really important that we do that. And it's a good example, or a very bad example, of how missing testing integrations can have long-term consequences that you don't initially foresee and that you need to be aware of.

So that's some stuff we're missing just in terms of the tests we've written within teuthology. But there are also some weaknesses of the framework itself that you want to be aware of if you're starting to build this sort of system for your own project. First of all, teuthology is strongly tied to Ceph. When it was written, it was intended not to be: orchestra should be usable elsewhere, although I don't know if there's any value to it at this point anymore, and teuthology was designed to support other multi-node systems. But nobody else uses it, so Ceph-specific things creep in. In retrospect, we should have found at least one other community to use teuthology with us. That would have been good. There were other communities that needed things like teuthology and that we probably could have worked with if we'd put the effort in, but we didn't think about it. At this point, we're not going to worry about it; orchestra and teuthology are reliable code bases that serve our needs. And if you're building a new testing system: you shouldn't build a new testing system, you should find one that exists.
You're not gonna get everything you need in one coherent package, but there are things like Zuul, there's Jenkins, there's a lot of different CI and test automation available now that you can build systems out of, and there's a lot more interest in distributed systems testing.

Second of all, and this is a big problem: teuthology is strongly tied to the Sepia lab. It's not supposed to be, but hard-coded values sneak in, like assuming, oh, all my new packages are located on the ceph.com domain and not my private downstream repo domain. And because it's a very internal project, the documentation, while it exists, is often unclear, and there are a bunch of different services that you need to wrangle together. Now, the upstream project community is not the only group running it. There are other groups, either distributors like Red Hat or private companies running their own clouds, who do downstreams and run teuthology for their testing, but they tend to do a local port where they go through and scrub all these things out, and that makes collaboration between us harder than it should be. It also makes developing new tests difficult, because we need to get machines in the Sepia lab, image the machines, and install packages, only to find out that, oh, we really should have run pyflakes on our Python code first. It means you need built Ceph packages to deploy to the Sepia lab. Once upon a time, it actually just built a tarball of the Ceph executables you'd built locally inside your source directory and SSHed them over to the machines, but that doesn't work when your machines might be different and you have more than four shared dev servers. And that's sad, because it means you need to go through a whole package build cycle instead of just testing local source code. All of this together makes it hard for third parties to contribute to mainline teuthology development, and for new Ceph contributors to test their patches, and that's bad.

There are things we should have done to make this better, in hindsight. If you don't test something, it doesn't work, so we really should have made regular teardown and setup a part of our practice with the teuthology system. It would have been good, when we discovered groups having trouble setting it up and installing it, to have worked a lot more closely with them, because we've had some attrition of people who have the skills to work in the system, and that's scary. And when we did make changes, like switching to requiring packages, we didn't think a lot about people who weren't us, and that was detrimental to our community. I'm not sure we would have ended up making a different choice given the advantages, but it wasn't even something we considered. There was a thing called teuthology-openstack to solve some of these problems. It was a set of scripts that would talk to an arbitrary OpenStack cloud, turn on teuthology, and run whatever test suite you asked for. That's a great thing to have in theory, but it died upstream due to lack of maintenance and interest, and because it was using, I think, the actual OpenStack CLI tools rather than some library, and all the commands changed over the years.

At this point we have made some strides to move away from this problem of being strongly tied to Sepia. The file system and some of the RGW tests run against something called the vstart runner.
This is a restricted API, not the full teuthology interface but a specified subset of it, that lets you write a single test that runs either on a local dev system or in the full teuthology lab. We're doing a lot more work to try to foster community around Ceph testing: we have a weekly meeting to talk about testing issues and the things people are having trouble with or working on in that part of the project. We've tried to reach out to the known users, because we found out about several of them only about a year ago at our first Cephalocon, users we had no idea existed, and we're working to merge enhancements and fixes upstream from those groups. In the future we'd also like to find out what localizations people have done, what things they've had to change, what assumptions we made about resources being available that might not be true, and turn those into configuration options instead of things you need to change in the source code. And it would be great to rebuild teuthology-openstack with libcloud or some other newer, more reliable API.

Okay, so teuthology as a framework is good for looking at an individual test and saying whether it passed or failed. We can make it fail based on all kinds of things that might happen in the system: core dumps; it scans the logs and can say, oh, I see a warn message or an error message and I want to fail; we can say in the code, hey, I want to fail if this thing is true or fail if this thing is false. It's very robust about that. But there's not a good way to say, oh, this YAML fragment has failed the last 15 times it was included in any test, or to say, oh, this test is now taking twice as long as it was three weeks ago, and that means we're weak at certain kinds of problem identification. It hasn't become a critical issue yet, so we don't have any specific plans, but this is something I hope one of the AI-based tracking systems might eventually be able to help us with.

A big problem for a while, which we've mostly mitigated, is that suites have really exploded in size. They are a geometric combination of all the fragments, so the rados suite, last time I checked, was well over 100,000 possible jobs, and there's not really any way to prioritize any of those tests: to know, if 15 things failed, which is the most important; or to say, okay, here are the fastest tests and I want feedback from them really quickly while working on longer-running ones; or, for resource-constrained users, to know which tests are most important to run. A couple of years ago we added something called subset functionality, which is a big, critical thing for this. The subset functionality lets you say, hey, I want to run subset one of 500, and it will make sure that we run every single YAML fragment but not every single combination. So when we do this one of 500, we end up with 397 jobs instead of 124,310, and that won't do all the combinations, but it covers everything at least once. And importantly, if we then run subset two of 500, and three of 500, and four of 500, all the way up to 499 of 500 and 500 of 500, then we will have run all 124,000 jobs. So this is a good way for us to step through everything incrementally across the nightly runs while still getting fast feedback that something's wrong, even if we're not testing every single combination every single time. (A rough sketch of the idea follows below.) In addition to the subsets, we also have filtering options that let you say, hey, only run things that match this particular regex in the description of the test.
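Here is one rough way to picture the subset mechanism: striding through the full list of combinations. This is only a sketch; the fragment names are placeholders, and the real teuthology implementation is more careful about guaranteeing that every fragment appears within each subset.

    import itertools

    def subset_jobs(facets, i, n):
        """facets: one list of fragments per directory; return subset i of n."""
        all_jobs = list(itertools.product(*facets))
        return all_jobs[i::n]          # every n-th combination, starting at offset i

    facets = [['a1.yaml', 'a2.yaml'],
              ['b1.yaml', 'b2.yaml', 'b3.yaml'],
              ['c1.yaml', 'c2.yaml', 'c3.yaml']]
    print(len(subset_jobs(facets, 0, 4)))   # 18 combinations in total, 5 land in subset 0

Running subsets 0 through n-1 over successive nightlies then covers every combination, which is the property the nightly rotation relies on.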
We can also say, hey, I want to rerun all the tests that failed in this last invocation, because I think I fixed the problem I found and I want to check. We'll probably do more here in the future, but this hasn't become a critical issue since we added the subsets.

More important: scheduling for us is very primitive. Jobs are picked off of Beanstalk, and then one of the 50 worker processes that has taken a job tries to lock the servers for it. So if you have a five-node job, it never runs, because we have a bunch of two-node jobs that just always say, hey, do you have two nodes, and the system says yes, and then the five-node job says, do you have five, and it says no, and the two-node jobs always win that race. Besides those jobs never running, we can also find that scheduled nightlies get delayed and then run on very old code. So we've come up with a number of hacks, some of which have patterned themselves into our brains in ways that are pretty bad, but we usually, though not always, notice. We make all our jobs use the same number of nodes at this point so that they actually run, but that means we're restricted to jobs that run on the same number of nodes, and sometimes we forget that when we're talking about what our coverage is actually like. We have people on our QA team who monitor the queue and say, oh, we've got 7,000 jobs running and 7,000 jobs queued up, and 6,000 of them are nightlies that are more than three days old, so we're just going to kill those. One of the earlier things coming up in the framework itself is to update this so that we can actually lock the number of nodes a job requires before it comes off the queue, so that five-node jobs are guaranteed to run. We'd also like to write a more robust scheduler, but we're not sure we're going to have any developer time for that. One thing I also want, but don't really know if we'll be able to do, is to say, hey, we've had seven people manually schedule a run of rados on master, so we probably don't need to run that one in the nightlies today.

And a new problem that's come up recently is that physical machines are first-class citizens in teuthology. We say, hey, I need to install Samba on these two machines but not this third one, and that's okay. Some of our suites do things like say, hey, I want to power cycle that server. And daemons, when we run them, also belong to the machine we run them on. But none of that works for Kubernetes, because none of those concepts really make sense there, and Kubernetes is becoming very important to Ceph: we're the main storage provider for Rook, which is the main storage project in the CNCF that lives within Kubernetes. So our plan is to define new interfaces for making those packages available and starting them, but we haven't made a lot of progress on it yet, and that'll come right after the deployer interfaces.

So that's what we've learned over the past decade. Are we going to stick with running our own test system? Yes, we're keeping our tests. It's been a bunch of years of development, and critically, teuthology has proven effective at exploring the state space of Ceph and testing recovery from failures in ways that the other systems we have, and that other components work with, just don't.
At this point it's operating successfully with low maintenance requirements, and we have developer experience with the quirks, but we are trying to make life easier for new users and update it for new development requirements.

Should you be doing automated testing? Yes, you should definitely be doing automated testing. Manual testing is terrible; you can never scale it the way you need to. But you probably should not write your own test framework the way we did. There are a bunch of good ones out there now, and lots of projects with custom frameworks. If you want to build your own, I have some questions you should probably ask yourself. First of all, are you sure there isn't something public that you can use? It might be that one of them won't solve all your problems, but you can use it in combination with others. There are things like GTest, designed for unit tests, or pytest, or whatever's in your language of choice, all the way up to OpenStack's Zuul, which is designed for multiple projects integrating together. And even if you don't see something like that, there's probably a software project that does something like what yours does, and that project probably does testing somehow. So if you really think you want to build your own, you should know why your test needs are different from projects like yours and why the capabilities that exist out there aren't sufficient for you. If you go through that and you decide you need to write your own tests or your own test framework, do the simplest thing that fits your needs. Small frameworks are a lot easier to embed in other systems later on, or to extend. That's one of the problems we've noticed with teuthology: because we built our test framework up around SSHing in and running commands on machines, most of the things we do are luckily pretty straightforward, like restarting a process or killing it or whatever, but sometimes we just ship off a bundle of Python code and say, execute this over there, and that is also a thing we can automate, but sometimes it's a more complicated build-up of shell commands. And it's good to try to keep your components discrete so they can be replaced later. This is another one of the problems we've had: we've had new developers come on board working on things that are sort of peripheral to Ceph, and they say, wow, teuthology is huge and complicated, and I don't want to have to go through it when I'm testing my system. So they built a whole new system out of completely separate components, because it was hard to extract the useful bits of teuthology for them.

I've sort of alluded to this already, but there are some comprehensive changes coming. Ceph is going in really big on Kubernetes and Rook, and we're actively discussing how to test that. We aren't sure yet how we're going to run Kubernetes inside teuthology, but it's got to happen, and like I said, we're figuring out how to do these deployer interfaces so that the tests we write for this kind of functionality can be agnostic to new environments and new ways of deploying things in the future.

So, I guess I'm done a little early, but we have plenty of time for questions. Sounds like a Ceph question: can you resize the volume, or is this about the testing? So we're starting with Ceph usage questions.
Okay, well, if everyone has lots of those non-testing questions then we can do that, but I'll also say we have like five Ceph sessions tomorrow that are actually about Ceph and maybe better suited for that. Anyway, I have no idea what requirements OpenStack imposes on changing volume sizes; Ceph allows it. If it's caused you a lot of tears, I'm sorry. I thought I was going to run out of time, or I would have included some extra material about community engagement. Yes? Do you know about the Jepsen test? Ah. Do you have something similar? Yes, so: do we know about Jepsen? Yes. Jepsen is a framework for testing distributed systems and the way they respond to different kinds of standard failure scenarios. Jepsen was created after teuthology was, and they do a lot of the same things, though not exactly the same things. Jepsen is focused a lot more on specific message flows than we are, but the thrashers and the RADOS API tests that we're doing are pretty equivalent to what Jepsen does, except that Jepsen is mostly black-box testing because it doesn't know what the system looks like on the back end. But yes, Jepsen is a great example of a thing that, if you're building a distributed system, you should be aware of and either run or know why your thing is better than it. In that case, I'll let you go. I'll be around, and we also have a Ceph booth up in the lobby; if you have specific questions, feel free to come by there, and someone should be around most of the time to talk with you about your use cases. Thanks very much.