So, this will be an overview of CoreOS and the entire ecosystem that has been evolving around containers and scheduling and that sort of thing. It's going to be a fairly technical talk, but given the time constraints, the demos will be fairly short, if they exist at all. But I encourage you to ask questions, because there is a lot of ground to cover if you're not familiar with these sorts of systems.

So I'm the CTO and co-founder of CoreOS. I've been a systems engineer for a long time. I started my career at SUSE Linux as a kernel developer and then worked most recently at Rackspace, but I've been doing systems software for a while now.

So the first question that we want to answer, actually the question that we're trying to answer with this talk, is why we built CoreOS and why we're building CoreOS. A couple of years ago, Google came out with a white paper. Well, it's difficult to call it a white paper, because it's like 60 pages long; it's a small book, called "The Datacenter as a Computer." And the basic tenet is that you want to treat infrastructure not as something that's focused on the individual server, but as something focused on applications that are running on lots of logical servers: a data center.

And so the basic properties of this system, this datacenter-as-a-computer thing, are that you add more machines, you get more capacity. The individual server doesn't matter, so you design for the inevitable failure of hard disks and CPU and RAM, et cetera, and don't let that be a reason for a human being to get involved. The focus is on the application, not on the operating system. You have APIs for deploying applications; you're not thinking about SSH-ing into servers and installing packages. And there are no maintenance windows. This goes back to the idea that hardware failures are okay, the idea that you don't take downtime because your hardware is tired or faulty. And instead of using really smart hardware, you use commodity hardware and build smarter software to make all this happen.

All right, so how do we build this? The first part of this is containers. Who's familiar with containers or Docker or anything like that? Who has used the technology? Okay. I'll give just a brief rundown of all this stuff, because about a quarter of the room is not familiar. But this is the first piece of technology that you need in order to build this. And the reason you need it is that it isolates the application from the operating system. If you want to think about and be focused on the application, you need to care less about the underlying operating system that's running it.

And so what exactly are these things? What are these containers? There's a lot of confusion about what a container is, and it's through no fault of the people who are trying to understand the technology, because "containers" is a very nebulous term that's really a spectrum of technologies. The basic idea, and generally how it's been talked about most recently, is the idea of an application container. For a long while we've had the idea of a system container, essentially a VM that shares the host's kernel. And so we've had things like LXC and OpenVZ that are very much focused on the idea that I spin up a full init system, and that init system has SSH and it has its own dedicated networking, et cetera. But the thing that we've primarily been calling a container for the last year or two, because of Docker, is the application container.
So I'll just call it an application container. The idea is that traditionally in a Linux operating system, we've had this full stack, in red, that's provided by the OS, and then you bring in your last 400 lines of code that's actually your application. And so you are tightly, tightly coupling your application with the underlying stack of the operating system: you use the Python from /usr on your host, and that runs your application.

A container, in one sense, makes the problem worse, because you now have a proliferation of copies of libc and Python all over your host that you have to manage. It kind of just moves the problem around, but it moves it around in a useful way. Instead of worrying about what version of Python your host is running (is the next version of my operating system going to break my app because it's using Python 2.7 instead of 2.6?), you bundle all of those together with your application and you ship them as a single unit.

And generally what we do with that is we give it a name, and ideally we put it in the DNS namespace. So this is example.com/myapp. This way we can talk about these things across hosts, and we can pull them down over the internet or over the network and place them. Similar to how we do apt-get install today, you have to give it some sort of name. And the reason we use the DNS namespace is that, as we've seen with the package namespace, we end up with all sorts of creative munging of strings because four people wrote an application called foobar, and now we have foobar-wm, et cetera, et cetera. And the basic way that you interact with these systems is via tooling: you have some container runtime, call it Docker, Rocket, or whatever, and you go fetch this name and then you run the name.

All right, so there are a number of technologies that make this possible. The Linux kernel divides its APIs up into a number of namespaces. When we're talking about virtual machines, these namespaces are implicitly isolated from the host. But in a container, we can isolate each individual namespace of the Linux kernel. So we have these global namespaces in the kernel: when I say kill PID 1400, that means something in particular. But once we add the PID namespace, 1400 inside of a container and outside of it mean different things. Similarly, with user namespaces we're able to map UIDs inside and outside the container, and with network namespaces, eth0 outside the container means something different than it does inside the container, et cetera, et cetera. This one is really just chroot: the idea that I can change to a different root file system. And this is what gives us the ability to, say, have a different version of Python inside the container versus outside the container. So that's partitioning up the namespaces that exist in the Linux kernel.
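None of this requires Docker itself; the kernel exposes these namespaces directly. Here is a minimal sketch of the primitive (not how any particular runtime actually does it) in Go, the language etcd, fleet, and Rocket are written in. It starts a shell in new PID, mount, and hostname namespaces, and it needs root and a Linux host:

```go
// A minimal sketch of entering new kernel namespaces from Go (Linux only,
// run as root). These are the same primitives container runtimes build on.
package main

import (
	"log"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		// New PID, mount, and UTS (hostname) namespaces: inside the shell,
		// PID numbering and the hostname are isolated from the host.
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWUTS,
	}
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```

Inside that shell, `echo $$` prints 1, because the shell is the first process in its new PID namespace, while the host still sees its real PID: exactly the inside/outside split described above.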
The other piece is that we want to be able to isolate resource consumption. On a VM, on the command line of KVM, we give it the maximum amount of RAM it's able to consume, and we give it a block device that only has so many gigabytes, and that's how we do resource constraints. In the kernel, we have a way of containing individual processes called cgroups. And this manages resources in a similar way to how the budgets of your engineering team are managed: there's a maximum amount that you can spend, and then you have somebody counting how you're spending it. So it can put constraints in place and then count how those resources are being consumed.

Unfortunately, the kernel APIs aren't the most useful in their raw form, so a number of tools have been built over the last year or two to make consumption of these resource counts easier. One particularly interesting and good project from Google is called cAdvisor. What it does is take these resource numbers coming out of the kernel and give us a time series, because really what we want to see is: hey, this container or process is consuming 100 megabytes right now, but what was it consuming 15 minutes ago? What was it consuming two hours ago? And so cAdvisor provides an API and a way of collecting these metrics and giving you a time series of what happened and why, which is super useful for making scheduling decisions, which we'll talk about later; for giving feedback to developers, telling them how broken their code is, which is something we all like to do to other developers, not to ourselves; and maybe even for monitoring our own code and seeing how it performs. So cgroups essentially allow us to limit resources and count them.
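The raw interface cgroups expose is just files. As a rough sketch, assuming the cgroup v1 layout of the era and a made-up group name "demo", this is the kind of limit-and-count plumbing that tools like cAdvisor sample:

```go
// A rough sketch of the raw cgroup v1 interface: cap a group at 100 MB of
// RAM, join it, then sample usage. "demo" is a made-up group name, and the
// samples are the raw input cAdvisor turns into a time series.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	cg := "/sys/fs/cgroup/memory/demo"
	if err := os.MkdirAll(cg, 0755); err != nil {
		panic(err)
	}
	// Constrain: no process in this group may exceed 100 MB of RAM.
	os.WriteFile(filepath.Join(cg, "memory.limit_in_bytes"), []byte("104857600"), 0644)
	// Join: move this process into the group.
	os.WriteFile(filepath.Join(cg, "tasks"), []byte(fmt.Sprint(os.Getpid())), 0644)
	// Account: sample current consumption a few times.
	for i := 0; i < 3; i++ {
		usage, _ := os.ReadFile(filepath.Join(cg, "memory.usage_in_bytes"))
		fmt.Printf("%s memory: %s", time.Now().Format(time.RFC3339), usage)
		time.Sleep(time.Second)
	}
}
```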
So all these technologies, the namespaces and cgroups, are combined together into what we call a container. This is a completely abstract thing that exists only in user space; the kernel is not really aware of it as a container. So we need user space tooling to bring it all together. The one that's been really exciting for the last two years, I guess year and a half, and that a lot of people have been talking about, is the Docker engine. The Docker engine did a couple of things that we hadn't had before. One, it gave a name to these things, and a name that you could fetch over the internet. For a long time we had all these technologies with LXC and other tools, but there wasn't a transport for them. A lot of the reason Docker became so popular, I think, is that it gave us a transport: a way of saying, hey, this name that I'm running on my host, I want to run it on this other host, and actually have that work. And then it also defined an HTTP API so that people could interact with it and build dashboards and that sort of thing. Those two things combined made it the exciting piece of technology that it is.

Another application container runtime (arguably Docker was one of the first super popular ones) is Rocket, which is a standard and an application runtime engine that we've built at CoreOS. It's definitely very much in a prototype stage, but what we want to do here is have multiple independent implementations of what we're calling the app container spec, which is a specification independent from our implementation, which is Rocket. So you can imagine that we have multiple implementations of these things for other operating systems, but we all agree on how to lay out the file systems, and how those are transported over the internet and discovered. I gave a talk on Monday about this, and I'm happy to answer questions, but it's really not the focus of this talk. And there have been lots of other container runtime systems, as I mentioned. There's Let Me Contain That For You, or lmctfy as we call it, from Google. Cloud Foundry had, or has, a project called Garden. Mesos has defined its own container standard for use in its scheduling system. And then there are the more system-focused container things like LXC and systemd-nspawn.

All right, so great. How are these container things created? Really, it's very similar to how we're all happily packaging our code into tarballs, downloading them to our hosts, and extracting them into /opt. But it's a little more formalized: you take some sort of code, run it through a CI system, that gets uploaded to some sort of registry or HTTP server, and then that gets downloaded to the host. That's the basic workflow that people are using to take advantage of these application containers.

And so in each of these sections, as I introduce the technologies, what I'm going to do is talk through what sort of superpowers you gain by using them. With containers, we gain the ability to do two things. One, we isolate our application from the underlying operating system. And two, we gain the ability to talk about an application and run it in an identical way across multiple hosts. Before these named application containers, we would use Chef or Puppet or our own hand-grown scripts that would download things from the internet or from some sort of build server, lay them out on disk, and then start or restart them. So the two things we've gained are this naming property, with downloading in a consistent manner, and the ability to not care as much about the underlying operating system. All right. Oh, I forgot about this slide. So, I'll go through the same things I just talked about.
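As a toy illustration of that fetch-by-name transport: the URL below is hypothetical, and a real runtime like Docker or Rocket also does signature verification, caching, and image layering, but conceptually fetching an image is just this:

```go
// A toy version of "fetch a name, then run it": resolve a DNS-style image
// name to a URL, download the tarball, and unpack a root filesystem.
// example.com/myapp is hypothetical; no verification is done here.
package main

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
)

func main() {
	resp, err := http.Get("https://example.com/myapp.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	gz, err := gzip.NewReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		dst := filepath.Join("/var/lib/demo/myapp/rootfs", hdr.Name)
		switch hdr.Typeflag {
		case tar.TypeDir:
			os.MkdirAll(dst, 0755)
		case tar.TypeReg:
			os.MkdirAll(filepath.Dir(dst), 0755)
			f, _ := os.Create(dst)
			io.Copy(f, tr)
			f.Close()
		}
	}
	// From here a runtime would chroot into the rootfs, apply the
	// namespaces and cgroups shown earlier, and exec the application.
}
```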
All right, so now that we've separated the application from the operating system, we're able to start doing something interesting with the operating system: we're able to reduce the API contracts that the operating system is making. So, by show of hands, how many people love complex, large, interdependent APIs? Okay. What we've forced our operating systems, our Linux distros, to do since the beginning is give us a very complex set of APIs. Our application relies on all these things: a Python runtime, a Java runtime, nginx, our HTTP server, our database server, our TLS implementation, our kernel. And what we've asked these operating systems to do is: please don't break anything, but keep it all up to date. That's a very difficult thing for anyone who's maintained any large system, let alone people maintaining a Linux distribution, where they might not know every line of code because they're not its original developers. It's a very difficult task, and what I would assert is that many of us choose some LTS release and then hope and pray that we're not around in the organization when that LTS release finally gets retired and our application needs to be moved to the next operating system. I would argue that a lot of us have done that. Anyone going to refute my point? Okay.

So what we can do, because we have application containers, is start to say that the operating system can make fewer promises. It's in charge of keeping fewer APIs stable, and it has a lot fewer things to be concerned about. We can rewrite this contract. And since the application brings libc along with it, essentially all we have to do is maintain a kernel, if you think about it. We just have to maintain that kernel, plus some sort of system to download, verify, and get an application container running.

And is Linus Torvalds in the room? Okay. The reason we're able to do this is that Linus and the rest of the kernel team are extremely serious about not breaking the kernel ABI. There have been regular LKML postings where Linus, in no few words (probably far more words than the developer would have cared to see), has said that we never, ever, ever break the kernel ABI. And because the kernel developers are so serious about that, we're able to update the kernel underneath the application and keep it up to date. That's essentially the only API and ABI that the operating system is now in charge of, besides the application container runtime. Right, so essentially your distro ends up looking like this, where you have each of the containers running in user space and bringing in all their own tooling, and the OS provides the kernel and the container runtime.

And this allows us to do another interesting thing. About six or seven years ago, I think Internet Explorer and Firefox were the incumbent internet browsers. And we had this terrible thing, if any of you were working in IT at the time, called Patch Tuesday, where Microsoft would release some patch and then everyone would scramble around trying to get Internet Explorer updated on all their client machines. Firefox came out and was slightly more secure, so it was like patch every month, not every week. But it didn't really solve the problem; we still had the next, next, next, continue, yes, yes, admin-password process. And then Chrome came along and said: we know that we can maintain this software fairly well, and we know that a lot of people are having a hard time doing that. And what we saw was that they pushed the security of the web forward in a pretty substantial way, and Internet Explorer and Firefox followed suit. So we've done this awesome thing: the front-end internet is the most secure it's ever been, because all these browsers are automatically updating themselves, and they're able to revoke TLS certificates that get hacked in Eastern Europe or Asia. And that's great; we can worry less about the front-end internet. Unfortunately, what's happened, thanks to the cloud, is that all of our data now lives on the back-end internet. And we are all just running LTS releases of whatever it is on the back end, and not actually taking care of our infrastructure as well as we should be.

So what we're able to do now, with the operating system decoupled from the application, is update the operating system automatically. And with how we've done updates in CoreOS, you're able to do an atomic rollback. CoreOS, like a router, has an A and a B partition. These are like good routers, like Cisco routers, not the $50 ASUS routers that you have at home, because that would have cost them an extra $0.50. But you have an A and a B partition, and while you're running the A copy of the operating system, or the firmware, however you want to call it, the B partition is updated in the background. Then, when your system is ready to update, we roll over to the B partition via a reboot, and you're running on the B partition. And once that's actually come up successfully, we're able to update the A partition again. So we have this flip-flop atomic updating system. And we do all sorts of adorable things.
For a while, we were using kexec as a bootloader, which wasn't that great for a variety of reasons. And recently, just in the last week, we wrote a bunch of code in GRUB 2. I really want to thank Michael Marineau, who did all that work so that none of the rest of us had to. But we wrote GRUB 2 modules so that we are able to read the GPT table, do an atomic bit flip of metadata that we keep in the GPT table, and then do this atomic rollback safely, in a proper bootloader, without using kexec. The problem with kexec is that it works on probably about 80% of platforms. The notable platform where it doesn't work is Xen, and so that's a huge number of users; AWS, essentially. The other problem is that a lot of this commodity hardware has the worst firmware ever, and it always ends up setting up some memory mapping incorrectly, and then the kernel will just hang after a kexec.

Right. So what are the superpowers that we've just gained here? The first is that we now have this new opportunity for atomic updates. We're able to have a fully consistent set of software across all the hosts: because we have an A and a B partition, what we actually do is essentially dd the block device to disk, so the file system is fully consistent. It looks identical everywhere and can be cryptographically verified at the block level. And, yeah, the OS is independent from the app.

All right. So what we have now is a way of separating the app and the operating system. We've enabled the operating system to be automatically updated, and reduced the API contract the operating system vendor has to make to the application. The next piece is that we have to design for host failure. So how do we do that? The first thing we have to do is replicate all of the important data elsewhere. Important data can't exist in a single place; otherwise, we've broken the very first tenet of what we're trying to build. So we built a piece of software called etcd. It got its name through a very clever method: we wanted to have something like /etc, but distributed over multiple hosts. It's implemented as a key-value store that is accessible over HTTP, but the basic idea is that the keys and values are very similar to the sorts of files we might store in /etc. So it's configuration, but it's configuration that needs to exist at the cluster level. So, for all of you that missed our clever naming scheme: "etc", "d"; etcd.

All right. So etcd has a few important properties. It's open source software. Yeah, I actually meant to update this slide. It's sequentially consistent, meaning that the order in which things are updated and applied to the data store is seen in the same order across multiple hosts. It's exposed via HTTP, so everyone who has Bash and curl is able to interact with it and write leader election; I wouldn't recommend doing that, but it's possible. And it's runtime reconfigurable, so you're able to remove or replace hosts in the cluster without taking downtime on the cluster itself. The interface is essentially what you would expect: you GET keys out, you PUT keys in, and you DELETE keys, using the HTTP verbs.
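For example, here is that interface exercised from Go, assuming an etcd member listening on localhost:2379 (older releases used port 4001); it's exactly what you could also do with curl:

```go
// A minimal sketch of etcd's v2 HTTP key-value API: PUT a key in, GET it
// back out. Assumes etcd on localhost:2379.
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	key := "http://127.0.0.1:2379/v2/keys/message"

	// PUT a key in, like writing a file under /etc, but cluster-wide.
	req, _ := http.NewRequest("PUT", key, strings.NewReader("value=Hello"))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	if _, err := http.DefaultClient.Do(req); err != nil {
		panic(err)
	}

	// GET it back out; every member sees the same sequentially
	// consistent view of this value.
	resp, err := http.Get(key)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```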
So what does an etcd cluster look like? Generally it's an odd number of hosts; three to five nodes is the usual number of machines in the consistency set. This is where people get scared: if there are only three to five hosts, then how do I scale up CoreOS? Essentially, you want a small number of hosts because what you're trying to do is design for a failure domain of individual pieces of hardware. You can move these hosts around within your cluster as time goes on and not take downtime, but you want some small number of hosts because you're replicating this data every time changes happen. Within a Google data center, according to the Chubby paper, what they essentially do is have five dedicated hosts that run Google Chubby, which is roughly the equivalent of etcd, and it's the consistent data store for the cluster.

So etcd is able to be resilient to individual host failures. Once you lose two hosts, it's still available, but you should be paging somebody, because the general pattern is that you have one unplanned outage plus one planned outage, and that's your buffer; the system still works correctly. But once you lose a third machine in a five-member cluster, the service is unavailable: you're not able to write keys into the data store any longer. You can optionally still read keys out of the data store, but you're not able to write. So that's the fault tolerance of the configuration data in this cluster.

etcd also does leader election automatically, based on the consensus protocol it uses, called Raft. So it's resilient to having a single follower fail, and also to having the leader fail. At that point, since there's no leader, we have to pause work until we do a leader election. Writes will still come in to the implementation, but we hold those writes until the leader election finishes and completes, at which point a leader is elected and the cluster is able to move forward and do useful work again.

So essentially what we've done is make the individual server less critical, because this critical data that we need in our cluster is replicated over multiple hosts. The idea is to share this configuration data, be resilient to host failures, and design for consistency across hosts. And what I mean by consistency is that you're able to do things like take semaphores or mutexes across the network. Because of how the underlying consensus protocol works, you're able to implement things like a semaphore across multiple hosts. It's obviously bounded by your network latency and your disk latency, but for important things, like doing leader election for a database or leader election for a scheduler, you need this primitive of an atomic mutex across hosts.
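A minimal sketch of that primitive, using etcd's v2 compare-and-swap semantics: prevExist=false means the PUT only succeeds if the key does not already exist, so exactly one host wins. The key name and TTL here are arbitrary choices for illustration; the TTL releases the lock if the holder dies.

```go
// A cross-host mutex built on etcd v2 compare-and-swap: the PUT succeeds
// for exactly one caller. Key name and TTL are illustrative choices.
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// tryLock returns true if this machine acquired /v2/keys/leader.
func tryLock(machineID string) (bool, error) {
	url := "http://127.0.0.1:2379/v2/keys/leader?prevExist=false"
	body := strings.NewReader("value=" + machineID + "&ttl=30")
	req, _ := http.NewRequest("PUT", url, body)
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// 201 Created: we got the lock. 412 Precondition Failed: someone holds it.
	return resp.StatusCode == http.StatusCreated, nil
}

func main() {
	ok, err := tryLock("machine-a")
	if err != nil {
		panic(err)
	}
	fmt.Println("elected leader:", ok)
}
```

The reboot lock that comes up a bit later in the talk is built from this same kind of operation.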
All right. So we've gotten ourselves to the point where individual hosts don't matter. We have a way of sharing configuration across hosts and making that configuration data resilient to individual host failures. Now the piece that we're actually interested in is running an application. We've rethought and re-laid-out the infrastructure stack, but now we actually want to get useful work onto the servers. Who's used a scheduling-based infrastructure, whether through working at Google, or using Mesos, or working at Twitter? Okay, a very small number of hands. So I'll give you the really high-level overview of how a scheduler works.

First, it begins selfishly with you: you being the user who wants to actually go home at night and also get your job done. Say you have written an application that needs to be exposed to users. Let's say it's an HTTP service and you want to have 100 of these running. So you tell an API: here's the name of my container, or whatever it is; I want to have 100 of these running in the infrastructure; and it has these constraints, it requires a gig of RAM, and it needs this much disk I/O and this much network I/O. So you tell this to the API: essentially, run this 100 times. The scheduler is a really simple algorithm that takes your desired state, the state that you wish to be true in the infrastructure, and makes it true. Computers are really good at this sort of stuff: looking at a bunch of available resources, deciding, hey, these are the things I need to fit into these resources, and scheduling it out. Essentially, the scheduler takes these 100 things, divides them up, and tells individual machines in your infrastructure where they should land.

To give a practical example of how this might look: we have a simple scheduler that we wrote for CoreOS, one of the schedulers you can run on CoreOS, called fleet. You define the service that you'd like to run, and then you tell fleetctl, or the API directly: I want to run this file, and I don't care where it runs. Fleet uploads that to the scheduler, the scheduler takes that definition of work, and then runs it on an individual machine. So you can see here that fleetctl start landed it on some machine with a machine ID of e1-something-something, with a public IP address of that.

And the algorithm, like I said, is really straightforward for these sorts of schedulers. Obviously you can get very sophisticated in the diff function, but the basic algorithm is that you take the desired state from the user (I want 100 of these things running), and you take the diff against what's currently running in the infrastructure. Well, there are zero of these things running, so I need to get 100 of them running, which gives you the to-do: the things that need to be scheduled out. And then, using a variety of heuristics and bin-packing algorithms, you schedule that out to individual hosts. That's what takes the list of work that's not currently running in the infrastructure and transforms it into the machines that should be running each thing. And this just runs forever; that's why it's a while-true loop. It's constantly checking: hey, is the current state of the infrastructure what the user told me it should be? No? Okay, then I'm missing two of the 100 that the person asked for, because of a hardware failure or whatever, so get two more of those things running. All right, now the entire infrastructure is in the state the user told me. Like a thermostat in your house, it's constantly trying to drive the state towards where the user has requested the state to be.
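That loop is simple enough to caricature in a few lines of Go. This is not fleet's actual code; all the types and the bin-packing pick() function are stand-ins, but it's the shape of the algorithm:

```go
// A caricature of the scheduler's while-true loop: diff desired state
// against current state, then bin-pack the remainder onto machines.
// In a real system, desired and current state would live in etcd.
package main

import (
	"fmt"
	"time"
)

type Task struct {
	Name string // e.g. "myapp/1"
	RAM  int    // constraint, in MB
}

type Machine struct {
	ID      string
	FreeRAM int
}

// pick is the bin-packing heuristic: here, first machine with room.
func pick(machines []*Machine, t Task) *Machine {
	for _, m := range machines {
		if m.FreeRAM >= t.RAM {
			return m
		}
	}
	return nil
}

func main() {
	machines := []*Machine{{"e1ab", 4096}, {"f2cd", 2048}}
	desired := map[string]Task{"myapp/1": {"myapp/1", 1024}}
	current := map[string]Task{} // nothing running yet

	for { // run forever, driving current state toward desired state
		for name, t := range desired {
			if _, running := current[name]; running {
				continue
			}
			if m := pick(machines, t); m != nil {
				m.FreeRAM -= t.RAM
				current[name] = t
				fmt.Printf("scheduled %s onto %s\n", name, m.ID)
			}
		}
		time.Sleep(5 * time.Second)
	}
}
```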
And there are a lot of schedulers that have been developed. Fleet is the one in this example; it's a simple scheduler that gives a user interface similar to systemd, but systemd distributed over lots of hosts. Mesos is an Apache project that's used by a lot of companies; most notably, Twitter has adopted Mesos internally. Kubernetes is an open-source project that was started by Google but has lots of other contributors, including ourselves, and Kubernetes uses etcd internally, along with fleet. And then the Docker folks recently released a prototype called Swarm, which is another scheduling tool. But essentially, all these things are trying to accomplish the same goal of taking user input and transforming it into running instances of processes on hosts.

But there are other things that need to be scheduled, not just my application. This tool called locksmith is actually one of the original reasons we built etcd. What locksmith does is free the administrator from thinking about when updates are going to be applied to the infrastructure. You run locksmith on your infrastructure, and as updates land on a host, the host will ask locksmith: hey, I'd like to reboot, so I can get this latest update applied to my kernel or whatever. It's implemented as a semaphore. The system administrator says: allow one machine, or N machines, to reboot at a given time. The host acquires a lock, giving the locking service its machine ID, reboots, and then unlocks with its machine ID on the other side of the reboot, if it successfully comes up. So you can imagine, if you have 100 hosts and one host downloads a bad update, reboots, and never comes back, it holds onto that lock forever, and then system administrator intervention is required. But if the rollout of the update is going great, then the hosts reboot in lockstep, and you're able to update your infrastructure without actually thinking about it.

So the basic idea here is that schedulers give us the superpower of freeing us, as human beings, from thinking about a lot of the infrastructure, and from thinking about the individual server as something we need to manage. We put our desired state in a while-true loop and reap the benefits. It gives us the ability to think about app capacity, and to take advantage of compute resources more efficiently, because the computers are landing applications on hosts based on how much of each host's resources are available, whether it be RAM or CPU, et cetera. And it allows us to build for resiliency against host failures, because the scheduler will land something on a host, that host will fail, the scheduler will come back through the loop again, notice that only 98 out of 100 of the things are running, and then reschedule.

All right. So what we've now done is gotten ourselves to the point where you have 100 servers and we've just sprayed your application all over the place, and you have no idea where anything is running. So, perfect. I'll just leave it there. No, the next piece of this, and a rather tricky piece, because we have a very heterogeneous set of applications that people want to use, is service discovery. What we've seen is that a variety of service discovery mechanisms have been built on top of etcd, and on top of these schedulers too. Things that export the location of the applications in the infrastructure to DNS, or export them to an HTTP API that people can query. And DNS isn't very good at a lot of things. For example: who's the closest service to me? I have this information about myself, the network I'm on, the rack I'm in, et cetera, but DNS isn't very good at answering that, so there are more sophisticated discovery mechanisms for it. And then there are the things like HAProxy or NGINX that want to have a configuration file: things that take the scheduler's metadata, flatten it down into a configuration file on disk, and then HUP a process like NGINX or HAProxy.
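Here's a sketch of that flatten-and-HUP pattern; this is roughly what tools like confd automate, and the etcd key layout and nginx paths below are assumptions for illustration:

```go
// Flatten scheduler metadata from etcd into an nginx config, then reload.
// The /services/web key layout and the nginx paths are illustrative.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"os/exec"
)

type listing struct {
	Node struct {
		Nodes []struct {
			Value string `json:"value"` // e.g. "10.0.1.17:8080"
		} `json:"nodes"`
	} `json:"node"`
}

func main() {
	// Read the backends the scheduler registered under /services/web/.
	resp, err := http.Get("http://127.0.0.1:2379/v2/keys/services/web")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	var l listing
	json.NewDecoder(resp.Body).Decode(&l)

	// Flatten them into an nginx upstream block on disk.
	conf := "upstream web {\n"
	for _, b := range l.Node.Nodes {
		conf += fmt.Sprintf("    server %s;\n", b.Value)
	}
	conf += "}\n"
	os.WriteFile("/etc/nginx/conf.d/web.conf", []byte(conf), 0644)

	// HUP nginx so it reloads the new backend list.
	exec.Command("nginx", "-s", "reload").Run()
}
```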
Another solution to this, and the one taken within Kubernetes and a few other things, is the idea of a magic proxy. Within my container, I have my own network namespace. Let's say I want to be able to talk to a Redis server, so I declare that my application requires a Redis server. What you can do, and what Kubernetes does, is run a proxy for you, and you connect to that proxy over TCP just like a regular Redis server. But on the back end, it's doing a reverse proxy, based on information it knows about the infrastructure, to, say, the nearest Redis server, or it round-robins your connection across the set of Redis servers that exist in the infrastructure. This sort of magic proxy frees you from thinking about DNS or configuration files at all. And it also makes you resilient to host failures again, because DNS has a latency between when a host has failed and when the IP address is updated to point at the new host. With a magic proxy, since it's doing a reverse proxy, it's able to detect: hey, this host failed, the TCP connection terminated; where is the new IP for the thing?
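The mechanics are just a TCP proxy plus a lookup. A bare-bones sketch, with pickBackend standing in for the real service-registry query that Kubernetes does:

```go
// A bare-bones "magic proxy": the app connects to localhost as if Redis
// were local, and the proxy forwards bytes to a discovered backend.
// pickBackend is a stand-in for a real registry lookup.
package main

import (
	"io"
	"log"
	"net"
)

func pickBackend() string {
	// In reality: ask etcd or the scheduler for the nearest healthy Redis.
	return "10.0.1.17:6379"
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:6379")
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(c net.Conn) {
			defer c.Close()
			backend, err := net.Dial("tcp", pickBackend())
			if err != nil {
				return // backend gone; the next connect retries a fresh lookup
			}
			defer backend.Close()
			go io.Copy(backend, c) // client to backend
			io.Copy(c, backend)    // backend to client
		}(client)
	}
}
```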
All right, so service discovery is pretty critical in all this stuff, and there are solutions, but a lot of it is still evolving. The basic idea is that you either use a proxy, or you try to shoehorn existing service discovery mechanisms into this way of building infrastructure.

So essentially, from the top of the stack all the way down, we've rebuilt how we think about infrastructure a little bit. And for a lot of us, because a lot of us haven't used scheduling systems, it's a new way of thinking about these problems. I think it both requires and enables us to do interesting things at every layer of the stack that we haven't really been able to do before.

So, I have a couple of other things coming up at LCA. I'm giving this exact same talk later today at the Auckland Continuous Delivery Meetup. On Friday I'm giving a tutorial that walks through all these technologies on the command line and talks through how you'd actually use them and how they work. So if you found this talk intriguing from a high level, that's going to be more of an I'm-at-the-bash-prompt, showing-you-how-it-all-works-in-reality sort of talk. And I'm also giving a talk at the GoLang AKL Meetup on etcd, and there have been some requests to talk about Rocket too; those are two projects written in Go. All right, so that is all I have. Happy to take questions; we have about four minutes, I guess. Yes, you just want to shout it out and I'll repeat it.

First question: for the host operating system that the container runs on, would you use terms like immutable and stateless for the sort of shell that the container runs in? So, I don't think I've quite followed the question. Well, this whole concept of, I suppose, your ability to move the containers from one place to another: is it that there shouldn't be too much data outside of the container that needs to exist? Right, correct. So this sort of infrastructure right now works very well for horizontally scalable web infrastructure, where you have the classic two-tier app of a database and then a web tier. There are options you have if you need to write to the local file system. What large organizations like Twitter and Google have done is essentially say: don't write to the local file system. Other options are NFS, which quickly becomes a huge bottleneck; distributed file systems like Ceph or something like that; or lazily replicated file systems, which is sort of how Plan 9 or AFS do it. There's also a tool called Flocker that lazily replicates file systems. But that's not so great if you're running MySQL or something, because if a transaction that says you're going to ship somebody their teddy bear completes, and you replicate it lazily and lose it, they're going to be real upset.

Okay, and just quickly: from your experience, which one of the container solutions is the leader in terms of the performance metadata you can get out of the container? Yeah, so really you need to run something like cAdvisor. There's not really anything that's done a better job at the time-series performance metadata. And that uses the generic cgroup file system, so you can run it whether you're using nspawn, LXC, Docker, or whatever.

I've also got two questions; hopefully they'll be very quick. Sure. You talked about the A and B partitions and switching back and forth. Does that double the storage requirements, or does it use some sort of technology to compress the changes? Yeah, so it does double the storage requirements, but CoreOS is currently at about 100 megs and shrinking, so it really doesn't matter in practice. So it's a minimal base OS? Sorry? Is it just a minimal base OS? Yeah, so it's only the kernel and the container runtime, and that's about it; and a PID 1, like systemd, and libc, I mean.

Okay. And that OS update functionality, is that host initiated or container initiated? So, it can be used by containers. The protocol used for that update system is called Omaha, and it's actually identical to a protocol that Google uses for updating its software. It's just an XML protocol that tells you the payload and ships some metadata around. Yeah, but which side actually starts the OS upgrade process? Oh, sorry. There's an agent running on the host, and you point it at an HTTPS endpoint that it's able to get updates from. So the host knows about all the containers running on it, and it does the updates of them? The update process only applies to the base operating system. You need to use a higher-level thing like Kubernetes if you want to manage the rollout of updates across the applications, because that's a scheduler-level concern, if that makes sense.

I also have CoreOS stickers, too, if anybody wants to put them on their laptops or their kids' foreheads or anything like that.

Hi. For someone who's completely new to CoreOS, what's the difference between the standard open source CoreOS and the commercial offering that you guys have? So, they're bit-for-bit identical. The commercial offering essentially gets you support, and then you get your own private dashboard where you can see how the rollout of your updates is going. That's the primary difference: just like with any other OS vendor, you're paying for support and then value-added features. So are the patches available for the open source version? Yes. And there's no proprietary software in the commercial version either. Yeah.

This whole container approach of packing all dependencies and shipping them everywhere reminds me of the Windows world, where you have CD-burner software that's a gigabyte in size. Are we heading the wrong way? So, this is a really good question.
The question is essentially that we end up with huge, bloated containers. So, part of why we started this thing called the app container spec, or appc, is that we want to enable people to experiment a bit with how containers are put together, by specifying just the underlying format and how images are overlaid on top of each other; essentially defining the equivalent of a dpkg for the app. What you have to do is have tooling that is aware of the underlying application and the runtime. You can imagine that we're going to have to have tooling that's aware of Java, tooling that's aware of Python, et cetera, so that we can compose containers and not duplicate software that's used across multiple containers. You can also do tricks like using a content-addressable store so that you automatically de-duplicate files; that's actually how the initial spec looked, using crypto hashes. The problem is that you need to make sure that you're using identical versions: a crypto hash doesn't know that this is Python 2.7 but with slightly different debug timestamps. Yeah, so there's a lot to be experimented with in that space, for sure.

I think that's all the time we have. All right, thank you so much everybody. We have a small...