Okay. We've got about one minute here. Before I get started, how many people in here are already OpenShift users? Yeah. Let's hear it for OpenShift users. Okay. So for the people who are not OpenShift users, how many people here use Docker, at least on the desktop? Okay, so we've got a few people. How many people here were just introduced to Linux containers when they came to DevConf? There we go. So we've got our whole range of people here. Are you sure? Our next speaker, George Blackwood, from Red Hat, will be talking about deploying StatefulSets. Okay. Well, thank you. Yeah. Well, StatefulSet, of course, is a Kubernetes and OpenShift feature, and that's what I'm going to be talking about, if you weren't clear on that. StatefulSet is actually several different features that have been rolled up under one object for running stateful containers in a container cloud. Up until about two months ago, StatefulSet was called PetSet, which was a bad name. It was a working name that we gave it temporarily that was never supposed to be the release name, and then somehow we released under that name. Somebody pointed that out, and so it's been renamed to StatefulSet. But for a lot of you who are on, for example, OpenShift 3.3, it's still called PetSet there. The examples that I give will still work; you just need to name stuff PetSet instead of StatefulSet. Oh, and there's code for this. The example code is under that URL (github, jberkus, atomicTV), where I'm going to be continuing to add more examples of running clustered Postgres under Kubernetes. So I got interested and involved with Docker about two and a half years ago now, something like that. It was around Docker version 0.4, and as somebody who had worked previously with things like Solaris Zones, this was really interesting to me. But I quickly ran across... well, of course, I've also worked on PostgreSQL for 18 years.
So my whole reason for being interested in containers was that I wanted to use them to produce automated, highly available, scalable Postgres. And I almost immediately ran into some issues with Docker and in the Docker community, because everything was about stateless applications. Stateless applications: here's your web server, here's your PHP application, here's your Node.js thing, and so on. And I'm like, yeah, but what about databases? This was even codified as a policy. There was this big paper published called the Twelve-Factor App, and if you look at it, one of the factors is that twelve-factor processes are stateless and share-nothing: anything that needs to persist must be stored in a stateful backing service, typically a database. So they've codified the idea that you're supposed to use an external backing store which is not your responsibility, which means what they're really talking about is Amazon RDS. Well, I'm a database geek. I'm not interested in a system that requires me to turn my data over to a database run by somebody else. I want to run the database. And so I was like, well, this is a big pile of... Well, you know what? What I really felt like when I was working on this was: here I'm looking at this container system, and there's all this hype, all this good stuff, but what it really feels like is that we built a house without a foundation. I mean, I'm sure Facebook would be lots of fun if it couldn't store any data. Wikipedia would really work well without sharing any kind of state. State is actually what makes applications valuable. It's what we need. And so later on, I get involved in the Kubernetes project, and we start working on this PetSet thing, and we start looking at, well, what is state exactly?
Because we wanted to come up with a Kubernetes object, a feature, that would handle stateful applications. In general, it's a way to ask: if we're talking about stateful applications, what exactly needs to be done differently from quote-unquote stateless apps? So, can anybody give me a capsule definition of state, a one-sentence definition? Anyone want to take a stab at this? An implicit argument to the rest of the system, did you say? Okay. That's interesting. Anyone want to try something else? Okay, something you retain between runs of the container. That's pretty good. Here's the one that I like: the difference between code and running applications. Because when we started talking about state, one of the things we realized is that there aren't actually any stateless applications. There are no applications that are purely stateless; there are applications with extremely minimal state. Even your quote-unquote stateless application has some consciousness of a current task. It usually has at least a small amount of stuff cached in memory. And importantly, it has a location and a status within the cloud itself. So we're not really talking about stateless versus stateful. What we're talking about is a scale of statefulness, where applications have more or fewer stateful features that they depend on, ranging anywhere from a so-called serverless code architecture, which is sort of our minimum case for statefulness, all the way up to a transactional database, which is our maximal case in terms of all the stateful features it needs. Another way to look at this, and this was very close to your description, is switching cost. That is, if I have to drop one container on one node and switch to another one, how much does it cost me? How much lost time and how much lost data am I looking at if I have to do that switch?
Absent any sort of supporting infrastructure to reduce those things, that is. Here we're looking at anything from minimally measurable switching cost, for applications that just run a code snippet and stop, all the way up to potentially long periods of loss if you have a giant data warehouse where the data needs to be migrated and everything else. Within the StatefulSet project, we were able to break this down into four different kinds of state, some of which ended up as Kubernetes features and one of which didn't. The obvious one, and the one that people always talk about when they talk about state, is storage. If you have data, you need some place to store it: a file system, a remote object store, or something else. Storage is the obvious one. Do people know what the other three are? Want to guess? Anyone? A cache, like memcache? That's kind of a different form of storage, but it's the right direction. Something else? Identity. Yeah. Configuration. Open network connections. Yeah, we got most of these, actually. So: node identity; session state, which covers the open network connections and that sort of thing; and cluster role, which goes beyond configuration. Stateful applications often carry some information about their role within the cluster, and I'll explain each of these in more detail. So again, we'll start with the obvious one, storage. When you're doing a containerized stateful application that has storage, you want a way to move that storage around if you need to move the container around; you want a way, if you have to re-instantiate the container, to retain the storage and assign it to the new container; and you want some way to make write access to that storage exclusive to a particular container.
Now, there were some really awkward workarounds for this before StatefulSet, which included writing stuff into the containerized application itself that would create directories named after the pod and try to separate things that way. There was this whole thing called Flocker, by ClusterHQ, that did its own sort of segregation as a Docker plugin but never integrated particularly well with Kubernetes. And I'll show you in a minute the StatefulSet persistent volume template, which was the Kubernetes solution. The idea here is that when you start up a pod under a StatefulSet, rather than having a persistent volume claim... how many people were at the persistent storage in Kubernetes talk yesterday? Yeah, okay. So a persistent volume claim is when you say this pod, this container, needs to be allocated this amount of storage. For a normal pod, you're allocating a connection to a particular piece of storage. What we do with StatefulSet is create this thing called a persistent volume claim template, and the template generates a persistent volume claim for each pod that you deploy, so each pod gets its own persistent storage, separate from the other pods of the same kind, and that storage is permanently associated with that particular pod. It'll make a little more sense when I actually show it to you. The idea is that if I have, for example, a pod postgres-1, and I lose the running container for postgres-1, but its data was on, say, redundant network storage, then if I restart postgres-1, it should get reassigned that same persistent volume that was attached to the previous instance, so that I can recover my data. Here's actually a very simple example of this.
You've got your volume claim template, you've got some metadata about it and your storage request, and this is all persistent volume claim stuff; it looks very much like the persistent volume claim you would have for a normal pod as opposed to a StatefulSet pod, only we call it a template. And then when you actually start it up and allocate all of these things, you see that each pod has a bound claim attached to a generated persistent volume claim. Now, one of the corollaries to this, at least in the current state of StatefulSet, is that this only works if you have network storage that you can dynamically allocate. Examples would be things like Amazon EBS, Gluster, Google's persistent disks, and that sort of thing, where you can actually ask the network storage for a new share when you deploy the pod. There's a to-do, an issue in Kubernetes, to be able to do things like allocate directories on a shared volume, but that's not currently implemented. Now, let's talk about the second part of this, because you'll notice I've got these numbered things here. People who've used Kubernetes or OpenShift are like, hey, how can we have these zero, one, two, three things? Well, that comes to the question of identity. The basic idea with a minimally stateful application, what we call a stateless application, is that your goal is to make each individual running container look like a mutually interchangeable, identical member of a set. Any differences between those running containers are things you want to minimize, not things you want to cater to. So basically you've got your image, say our NGINX image, and we deploy one to many individual NGINX containers running in pods.
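To make that concrete, a volume claim template stanza looks roughly like this. This is a sketch, not the exact manifest from the talk; the name, access mode, and size are made up for illustration:

```yaml
# Part of a StatefulSet spec. Each pod the set creates gets its own PVC
# generated from this template (named <template>-<podname>), and that claim
# stays bound to the same pod identity across restarts.
volumeClaimTemplates:
- metadata:
    name: pgdata
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 10Gi
```

Compare it with a normal pod's persistent volume claim: the spec section is the same shape, it's just stamped out once per pod instead of written once per definition.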
And we want to treat those as not being different in any way. For that reason, they all get hexadecimal suffixes that get regenerated and are not stable; they can get moved around and that sort of thing. Whereas with a stateful application, what happens is you've got your base application image, you deploy the stateful application, and then once you've deployed it, each individual container starts accumulating state, and therefore getting more and more different after the point of deployment. And because that's going to be true, we need to be able to tell the difference between this container and that container and that container, because they are not the same; they have accumulated state, which means they need to have an identity. On top of which, most of the applications we would want to deploy into a StatefulSet themselves require an identity concept in order to do their work. For example, all of your multi-master databases require a peer list: the different nodes need to connect to each other, and in order to connect to each other, they need a list of the other nodes, which means we need to know in advance what the names of the other nodes are going to be. This includes etcd, Cassandra, Postgres replication slots, that sort of thing. Also, a lot of stateful applications have their own clustering concepts within the application, and for those clustering concepts, they need to assign specific roles to nodes even though they're part of the same set. For example, in a single-master application like Postgres, which is what I'm going to show you, we've got a replication master, some replicas, reporting nodes, shard nodes in the case of sharded Postgres, and that sort of thing, which need to have different roles. Those require having a distinct identity. Now, there are four aspects to making identity work. It has to be individual, obviously.
Identity only makes sense if it's attached to only one container; if an identity attaches to more than one container, you've already broken things. It needs to be durable, as in we don't change the name of a container on the fly just because something restarted. It needs to be predictable, so that I don't run into the bootstrap problem of "I can't start etcd because etcd needs a list of nodes, and I don't have the list of nodes until I start them." And it needs to be addressable; that is, we need to be able, via DNS or networking, to connect to these nodes by those identity names. Now, StatefulSet implements this, and I'll show you this again during the demo, with a very simple concept: when you deploy a StatefulSet, the first node starts with an ordinal number of zero, and we just go upwards from there. If a container dies, if, you know, patroni-1 dies and we have to replace it, it gets replaced with another container named patroni-1. And if we grow or shrink the number of nodes, it always does it from the end. That way we can not only address the individual nodes, but in fact we know what set of nodes we will have based on the parameters of the StatefulSet. Now, our third kind of statefulness is cluster role. Yes? Yeah, I can do that a couple of ways. One is obviously you can get the list of pods and just look. You can also describe the StatefulSet, and it'll tell you how many nodes it's supposed to have and how many nodes it actually does have. We'll look at that in a minute. Because obviously, if you've told it, for example, that it's not allowed to put two nodes on the same machine, and you're short on machines, then you could actually have fewer nodes than you're supposed to have. And again, it'll always do that from the end, right?
If you told it to put five in the StatefulSet but it can only deploy three, it'll deploy zero, one, and two; it won't deploy three or four. Now, generally with StatefulSet we're talking about highly available applications, and for stateful applications, the application itself needs to implement at least some of its own high availability, because Kubernetes doesn't understand enough of the internals of the application. We're talking about things like databases here, right? Kubernetes doesn't understand enough of the internals of the database to implement high availability for the database. And so if we're talking about applications that have high availability, they have to have an understanding of what their role in the high availability cluster for that application is. In the case of a replicated database, you have a master and you have replicas. In the case of a sharded dataset, you'll have shard numbers. In the case of sharded file storage and that sort of thing, you'll have individual storage buckets. Some services require a bootstrap node when you're first starting the cluster; things like ZooKeeper, for example, have to start up with a bootstrap node. So these are all cluster roles. Now, that portion of it is not directly handled by Kubernetes. It needs to be handled by the application, possibly using Kubernetes itself to store the information, in the form of labels. Kubernetes doesn't supply any special functionality for cluster role; it just makes it possible to implement, because we have identity. And the other thing about cluster role that makes it different from identity is that, unlike identity, cluster role can actually change. Again, take the simplest master/replica database: when you have a failover, which container is the master and which one is the replica will change.
Sometimes these roles need to be exclusive, which means you need some form of cluster-level locking in order to make sure, say, there's only one master or there's only one bootstrap node. Some of this means leader elections. There's actually an open proposal for Kubernetes to support leader election as a primitive; that's still at the RFC stage. Yes? Yeah. So this is actually one of the reasons I said this ended up being implemented as several different features, because that's not included in the StatefulSet object. That's part of a general concept called affinity/anti-affinity, which is a Kubernetes feature. With affinity/anti-affinity you make these rules, which are not terribly user-friendly, but it is an existing feature and it does work, where you can say things like "don't put this with X" and "don't put this with Y." I think it's for 1.6 that they're talking about shipping a simplified anti-affinity declaration for the most common cases, like "only one per node," "not with service X," or "always with service Y." But I've lost track of the status of that, to be honest. So that goes to affinity. Other things, like leader election, you handle right now through a distributed consensus store, which you deploy yourself. Part of the reason there hasn't been a big push on having Kubernetes support that is that it's so easy to do it yourself; there hasn't been a huge amount of urgency around building it into Kubernetes. For people who are not familiar with distributed consensus stores, we're talking about things like etcd, Consul, ZooKeeper, and embedded stuff like Python Raft libraries, which basically allow you to have locking and consistent metadata among a distributed group of nodes, using the Raft or Paxos algorithms for sharing data. So here's an example, from the Postgres system I'm going to show you.
We just use an external DCS, an etcd cluster, and write stuff to it. So, for example, we have the cluster role here in etcd, which says, okay, this particular node, now that it's spun up, is the master, and we save the information about who's the master in the shared DCS. And then what we go ahead and do, and this is an example of how you can extend Kubernetes and OpenShift yourself, is write a label to Kubernetes saying what the role is. The importance of writing a label is that once it exists as a label in Kubernetes, I can then use it for other purposes, like managing connections, which I will show you. So, statefulness portion number four: session state. Otherwise titled "not everything is a RESTful request." A bunch of things require maintaining some information about where you were connected and what you were last doing while connected: things like downloads and streaming, database transactions, and authentication, where you have authenticated and received tokens from authentication servers and that sort of thing. Starting over with that session state is fairly expensive, so you want to maintain the sessions if you can. Now, this is not entirely solved, and it falls under the "ideas welcome" state of things. Right now, what we generally do is one of two approaches. One is to rely heavily on discovery in Kubernetes: you use discovery, but then you connect directly to the individual node, and that gives you some sort of session state. But if we run out of resources on an individual node, pods may get automatically migrated, in which case you've still lost your session. Another way to do it is through actual smart proxies. But the current state of the art in smart proxies for various kinds of stateful applications is not great.
I mean, I'm working with Crunchy on one for PostgreSQL, to have a better smart proxy for that, but that feels like an interim solution rather than a long-term solution. In the meantime, the easy solution is to use discovery, and it can be very easily implemented through Kubernetes services, because we have node identity, we have StatefulSet, and we have the ability to write labels. So let's go from that to actually showing you some of this stuff, because all of that was fairly abstract. Oh, I got kicked off. So first, let me show you some of the definitions. Let's look at a simpler one: a pure peering service, which doesn't require different cluster roles. That would be etcd, for example, the distributed consensus store, which I need anyway in order to store my Postgres state. The first thing you do when creating a StatefulSet is create a service for that StatefulSet. And this is a special kind of service: you're generally not going to use it to connect to the individual nodes; you establish separate services for that. This service just supports the StatefulSet. It has to be clusterIP: None, and then you give it a selector for selecting the particular nodes. The clusterIP: None is there to tell it not to assign an IP, and I call that out because it's a little confusingly documented. Importantly, it will fail silently if you don't say clusterIP: None, and you won't realize you've done something wrong until you can't use the StatefulSet. And then the secret is just in there because... secrets. The secret is not a requirement of StatefulSet; it's just that we need a password for connections. And then here we have the StatefulSet itself. Again, like I said, if you're using OpenShift 3.3, this will be PetSet and it will be v1alpha1, but otherwise it will be the same.
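As a sketch, the governing service being described looks something like this. The names and ports here are illustrative (they assume an etcd-style peering service), not the exact manifest from the demo:

```yaml
# Headless service that supports the StatefulSet. clusterIP: None is
# mandatory: it tells Kubernetes not to assign a virtual IP, and instead
# gives each pod a stable DNS name of the form <pod>.<service>.
apiVersion: v1
kind: Service
metadata:
  name: etcd
spec:
  clusterIP: None
  selector:
    app: etcd          # must match the StatefulSet's pod labels
  ports:
  - name: client
    port: 2379
  - name: peer
    port: 2380
```

The point of the selector is exactly what the talk says: this service exists to give the set's pods identity and DNS, not to be a client-facing endpoint.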
And so a StatefulSet starts out a lot like a replica set, except that one of the things that changes is that we need to link it with the service that defines the StatefulSet. So the serviceName field links to that service up above. Then you give it some number of replicas, which can later be scaled up or down, give it your various labels and everything else; none of this is different from a replica set. And I'm passing it the various parameters I need to start the etcd cluster. But here's where I show you where the whole predictable-identity thing becomes really valuable. One of the problems with starting a peered application like etcd is that when I start an etcd node, it needs to know all of the nodes in the cluster for consensus to work, and every node has to have a list of all the other nodes in the cluster; otherwise, consensus fails. The problem with auto-generated IDs is that you have this bootstrapping problem, right? Well, here I know that the first node is going to be named etcd-0, the second one etcd-1, and the third one etcd-2. So if I know how many I'm going to have, then I can very easily give it a list of peers even before I've started any nodes. Now, obviously, if you wanted to support dynamically scaling the size of the cluster, you'd add a little bit of scripting here rather than hard-coding the numbers in, and there are some online examples of that. But this makes it very easy. To give a comparison, I don't actually have it loaded up, but take a look at the CoreOS website at the pre-StatefulSet version of how you boot up an etcd cluster. It's this goofy thing with bootstrap nodes and stuff, and it's about sixteen pages long, whereas this is about two screens. And then, because with this particular etcd cluster, I don't actually want the data to be durable.
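The predictable names mean the peer list can be generated with a few lines of shell. This is a sketch of the idea only; the set name, service name, and port are assumptions, not the exact script from the demo:

```shell
# Build an etcd initial-cluster string from predictable StatefulSet names:
# pods will be etcd-0, etcd-1, ... and reachable via DNS as <pod>.<service>.
SET_NAME="etcd"
SERVICE="etcd"
REPLICAS=3

PEERS=""
for i in $(seq 0 $((REPLICAS - 1))); do
  # Append a comma only when PEERS already has an entry.
  PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SERVICE}:2380"
done
echo "$PEERS"
# prints etcd-0=http://etcd-0.etcd:2380,etcd-1=http://etcd-1.etcd:2380,etcd-2=http://etcd-2.etcd:2380
```

Because the names are known before any pod exists, this list can be computed at container start, which is exactly what kills the bootstrap problem.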
I want to dispose of it if we lose an individual container and re-instantiate it by syncing from the peers. So I'm using emptyDir, which Kubernetes will take care of cleaning up for me if I lose an individual pod. So that's the simpler one. Let's see if I can reconnect to my cluster. Oh, yes, I can. Yay. Oh, by the way, if you didn't see where it is... yeah, there we go. If people saw my presentation on Friday, this is a Kubernetes 1.5 cluster that I just spun up on Atomic Host on AWS. I added a couple of nodes to it to make it big enough that we can kill off nodes. You see, I haven't even renamed the directories. So let's start by spinning up the etcd cluster. You see one of the other interesting things about this, which is that the nodes always get spun up in order by default. They get spun up serially, one at a time, in order. And that actually helps a lot of applications where we need to know the startup order. With Cassandra, for example, during the bootstrapping phase there has to be a specific node that triggers everything else. Once the cluster is fully operational, the nodes are all peers, but that's not what happens during bootstrap. So we needed something like things starting up in order. And the advantage of that is that if you do have an application where you have to have an original master, you know the original master will always be number zero. So there we are; we've got our etcd cluster up. Oh, yeah, that would help, wouldn't it? There we go. I'm just adding a secret. Yeah, there's the secret. If anybody wants to reverse-engineer that, it's a really dumb password. Obviously, in a production instance, we would be using something like Vault. Oh, and I was having trouble with network storage this morning, so I'm actually creating ephemeral nodes, but I'm going to show you the non-ephemeral version because it's more interesting. Oh, we also need to run create.
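The emptyDir arrangement mentioned at the start of that section looks roughly like this inside the pod template. This is a sketch; the image name and mount path are assumptions:

```yaml
# Pod template fragment: per-pod scratch storage. Kubernetes deletes the
# emptyDir when the pod goes away, which is what we want for an etcd member
# that can simply re-sync its data from the surviving peers.
containers:
- name: etcd
  image: quay.io/coreos/etcd
  volumeMounts:
  - name: data
    mountPath: /var/lib/etcd
volumes:
- name: data
  emptyDir: {}
```

Swapping this for a volumeClaimTemplates entry is what you'd do when the data does need to be durable, as with the Postgres pods shown next.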
So this will take a little longer to spin up. You can see it spinning up there. Let me start to show you what's in it. So, Patroni is a high availability tool for Postgres that automatically sets up a master/replica replication cluster, given a group of Postgres nodes. It doesn't require a containerized environment, but it works a lot better in one. So here we've got the Patroni setup. Again, we've got three nodes that we're setting up: a master and two replicas. I have to pass it a bunch of information: I have to tell it what Postgres cluster it's part of, I have to pass through a bunch of configuration variables, and I have to tell it where to connect to the DCS, in this case etcd. And then we've got a couple of volumes on here, one of which is passing through the Postgres configuration and the other of which is the data. And here we're going to have a volume claim template in order to template out the storage, which, if I hadn't somehow messed this up when I was setting up the demo, would be creating the individual storage volumes for each backend as it creates new nodes. Now, let's see how we're doing on the deploy. There we go. Yeah, let's go down there. Okay, that's working fine internally; I'm mislabeling this. Oh, patroni-role. Right. Sorry. There we go. So what's happened here is that internally, my Postgres high availability system has had a leader election and decided who's the master. Now, because we deployed the nodes in order, the initial master is inevitably going to be patroni-0. And again, this is where having a predictable order helps you in the initial deployment of services; as opposed to, if we look at number one here, we've got a secondary.
And then what we have to do is have it write a label to Kubernetes. Now, why would I want to write a label to Kubernetes? Well, because we're going to use that label to support Kubernetes discovery and services. For example, in a single-master cluster, you can only send writes to the master, so it's important that applications be able to connect to the master if they need to write. The way we do that is to use the label we've created in order to route write connections to that master, and we can change that label on the fly when the master fails over. Let's create that service, by the way. Oh, and there's a second service here, because of course we want reads to be load-balanced among all the nodes. Now, once you've set this up, the combination of Kubernetes with a database system that has actually been engineered for high availability allows us to automate failover. So now zero is our master. This took a limited amount of time. It also works if we kill off the individual Amazon instance; it just takes longer, because Kubernetes takes a while to decide the node is really gone. That changed in 1.5, which will affect us when we get to OpenShift 3.5, presumably, where it's a lot more conservative about automatically marking nodes as failed. But if we look at the high availability system in there, we'll see that we lost the connection to the master, and it will take a minute here. I guess I hit the cycle wrong. There we go. So they're now going to hold a leader election. And this node did not win the leader election, so it became a replica of the new leader, which is patroni-2. So you see we've now switched which one is the master and which one is the replica. Oh, and the reason we still have patroni-0 is that Kubernetes automatically regenerated it.
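The pair of services being described can be sketched like this. The label keys and values here ("app: patroni", "patroni-role: master") are assumptions for illustration; the point is only that the write service matches whichever pod currently carries the master label:

```yaml
# Write service: selects only the pod whose role label says "master".
# When the HA system relabels pods on failover, this service follows
# the new master automatically.
apiVersion: v1
kind: Service
metadata:
  name: patroni-master
spec:
  selector:
    app: patroni
    patroni-role: master
  ports:
  - port: 5432
---
# Read service: selects every pod in the set, so reads are load-balanced
# across the master and all replicas.
apiVersion: v1
kind: Service
metadata:
  name: patroni-read
spec:
  selector:
    app: patroni
  ports:
  - port: 5432
```

Since a service's endpoints are recomputed whenever pod labels change, failover needs no service edits at all; relabeling the pods is enough.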
So the label is set by Patroni, and we're taking advantage of an, as far as I know, undocumented Kubernetes feature: if you set up a label as part of the pod definition when you create the cluster, then it's not modifiable at runtime by the application; but if the label doesn't exist in the definition of the pod, then the pod itself can add new labels and modify them. Yeah. So we take advantage of that to drive the service and that sort of thing. I do sometimes worry about this implementation, just because as far as I know it's not documented as part of the API, which means it might change in the future. I'd like it if it didn't, because it turns out to be very useful functionality, and if it does change, I'll be in the Kubernetes issues advocating pretty strenuously for a replacement, because it's pretty indispensable. So, yeah, we've got this set up now. So etcd is supporting the Patroni high availability system, which in the current production version requires an external distributed configuration store. There actually is an alpha version that implements its own configuration store embedded in the node, so that you're not required to use an external one. The thing is, it's a trade-off. There are advantages to using an external one in terms of redundancy; it's obviously more setup, but it's in some ways a little more reliable to have it separated. But we also want the option of an embedded one, because an embedded one means you have X number of database nodes, those nodes are also consensus nodes, and you don't have to worry about it otherwise. So, single-master databases are one use case. Another big use case for this is supporting sharded databases, and because I'm a Postgres guy, what I'm going to show you is sharded Postgres.
I won't run through the full thing, but we'll actually do a deployment. So Citus is a sharded version of Postgres. Here again we've got the service, defining our service for Citus; we have a stateful set; and then we have a number of replicas. Now, one of the things I have to do: Citus needs to know how many nodes there are going to be so it can shard its data appropriately. So I'm passing through the number of nodes there. I actually have an issue open about exposing that as a variable within Kubernetes, which it isn't currently. Right now you have to set it manually, because I don't have any way to expose at runtime an environment variable telling me how many nodes there are in the pet set, in the stateful set. But at some point that will be a feature, hopefully. And, you know, we're passing through some other things like secrets, et cetera, in order to set it up. And here, like I said before, having an ordering for these, the order they deploy in, is very useful. And for Citus you see why it's useful, because Citus actually has two different kinds of nodes: shard nodes that hold data, and query nodes that accept connections. So I know node zero will always be a query node, and I can immediately have it automatically deploy as a query node from node zero. So we've got our secret for a password. Oh, except it looks like that's left over from the last one. I forgot to delete it. It's fine; it's already there. So again, deploying in order is valuable here, because I've added setup logic to the Citus container that says: hey, check what your name is. If your name ends in dash zero, then you are the first query node, and the rest of the nodes become shards. And if your name doesn't end in zero, then you are a shard, and you connect to node zero because it's your query node.
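That ordinal-based startup logic can be sketched as a one-liner; this is a simplification of the container's actual setup script, and the function name is hypothetical:

```python
def citus_role(pod_name):
    """Decide a Citus node's role from its StatefulSet pod name.
    StatefulSet pods are named <set-name>-<ordinal>, so a name ending
    in "-0" is ordinal zero, which becomes the query node; every other
    ordinal becomes a data shard that registers with node zero."""
    return "query" if pod_name.endswith("-0") else "shard"
```

Because a StatefulSet guarantees ordinal zero is created first, the query node is already up and accepting connections by the time the shards start and try to register with it.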
And so all of this can be easily automated, compared to the process of setting it up manually. And that's really all I have for demos. So do we have other questions that people did not already ask? Go for it. You mean it could restart too quickly? So the question is: if patroni-0 is being restarted, how does the Patroni application know not to make it the master again? And the answer is it doesn't, because sometimes it should be the master again. It's based on timing, really. It's a question of why the pod died, right? If the pod died because of some sort of ephemeral problem and came back up within five seconds, then it probably should just resume its role of being the master. On the other hand, if we actually lost the whole machine, then recreating patroni-0 on another machine, where it has to instantiate the data directory and so on, is going to take more than 30 seconds, and in that amount of time one of the other nodes will get elected. By the way, there are some failure cases that are almost impossible to automate, like if we lose the entire network and then come back up. You can come back up in an incoherent state where we can't decide which node should be the master. These are problems that are not completely resolvable even on paper. But if we can at least handle the case where you can lose all but one node and still have a database you can use, then we're in good shape. More questions? So the question was: are all the database nodes sharing the same data store for the database files, with Kubernetes not doing anything with that? Well, in the version with volume claim templates, the answer is no, actually. Before I broke it, what it was doing was creating a separate EBS volume for each pod that spun up.
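A minimal sketch of the volumeClaimTemplates stanza that produces those per-pod volumes, built as a Python dict; the claim name `pgdata` and the 10Gi size are assumptions for illustration, not the demo's exact values:

```python
def volume_claim_template(name="pgdata", size="10Gi"):
    """Per-pod storage for a StatefulSet: Kubernetes stamps out one
    PersistentVolumeClaim per pod from this template (e.g. pgdata-patroni-0),
    and with dynamic provisioning on AWS each claim gets its own EBS volume.
    ReadWriteOnce mounts a volume on one node at a time, which is the
    exclusivity that keeps two pods from writing the same data files."""
    return {
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }
```

Because the claim name is derived from the pod name, a regenerated patroni-0 reattaches to the same volume and comes back with its data directory intact.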
So each one of those actually has its own storage. And in the case of dynamic network storage, that means that if we lost patroni-0, it would come back up fairly quickly, because all we're doing is starting a container whose data already exists on the redundant network storage, and it gets attached to that. And it's also exclusive, because if patroni-0 and patroni-1 could write to the same network share, then we'd have data corruption. More questions? Well, thank you very much.