Let me just lay it down there. OK, welcome, everyone. Thank you for coming to this talk. This talk is called "To Persist or Not to Persist Container Data," so it's about the idea of persistent storage for containers. If that's not what you came for, you need to leave. No, I'm just kidding. So we're going to get started. There'll be a couple of us presenting, and then John's going to attempt a live demo showing how you could use Cinder as the service to tie persistent storage to a container system. So, John, you're up. So we'll start with some quick intros. John, you want to start?

So my name is John Griffith. I work at SolidFire. I've been working on OpenStack for about five years now and also doing a good bit of Docker stuff. And so that's what we're going to talk about today. And you'll get to see my live demo possibly fail exquisitely.

Great. My name is Ken Hui. I am the OpenStack evangelist at Rackspace. So that's what I do.

Hi, my name is Shamail Tahir. I'm an offering manager with IBM, predominantly working in the Product Working Group and other user groups within OpenStack.

OK. So the talk is really broken into three parts. One, just as a level set, Shamail is going to talk a bit about what exactly we are talking about when it comes to container persistence, or persistent storage for containers. Then in my section, I'm going to piss off as many storage vendors as possible. And then John will try to save me by showing how it can be done.

OK, so just a level set real fast. How many of you are already familiar with container technology, as far as the architecture and what it does? OK, a good number of you. So just to quickly recap, what is a container to begin with? What a container does is use functions within the operating system itself, namely namespaces, to isolate the container's view of processes, users, and other resources. And then it uses cgroups to limit the resources that are available to the container itself. The container runs from an image, and the image is actually built on top of a file system. It layers the different components, so the OS, the application, the dependencies, et cetera, are each a read-only layer within that image. And then, of course, at the very end, you get a read-write layer at the top. That is where you make your changes while the container is actively running. And then, of course, when you hear about terms like runC, LXC, et cetera, those are the container formats, and they really define what you can do from an operational perspective with the container, as well as setting up the resource reservations, et cetera.

But really, the interesting thing about containers is not the container format itself, but what it enables you to do from an application perspective. So really, containers lead you into a realm of possibility where you can leverage a microservices architecture. And what that really means is you can take an application that used to be running on a single instance, for example. It was what we call a monolith, where you have the database, or the persistence, the application logic, the UI, the API, everything, embedded into a single instance. And you could scale that up, or maybe even scale it out. But really, the value from a microservices perspective is you can take those different pieces and components that we had. Which one's the laser, by the way? Green? That's the laser. Yeah.
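As a rough illustration of the namespace and cgroup point in that recap, here is what it looks like from the Docker command line; the image and flag values are illustrative, not from the talk.

    # cgroups cap the resources this container may consume; namespaces mean
    # the process inside sees only its own PIDs, mounts, network, and users.
    docker run --rm --memory=256m --cpu-shares=512 ubuntu:16.04 ps aux

    # The read-only layers that make up the image (OS, dependencies, app),
    # stacked under the thin read-write layer added at run time:
    docker history ubuntu:16.04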
So you could take all those different components, and you can actually containerize them. So now I can have my UI running independently from my API and my logic, and each function within my logic could be an independent container. This really allows me to scale each of them independently, as well as upgrade or maintain them individually. So it's those benefits that really make containers attractive, more or less.

And of course, then I have the data layer as well. And there are multiple ways to do data. If it were as simple as saying there's one way to do data, and this is how you do it, we'd be done. But why is there a whole session? Well, as we discussed at the very beginning, when we had the container architecture image up, there's a container read-write layer. The data in that layer is ephemeral. So as soon as the container goes away, the data is gone. And if you're running something like a stateless service in that container, you don't need to worry about persistence. The data goes away, I really don't care, it's gone. Or I could say, OK, I need to save this data, and now I'm going to create a new image. And my image continues to get larger and larger and heavier and heavier. So I could do it that way, but that's not optimal. There has to be a better way to do this. There has to be something between losing everything and keeping everything in the image. There have to be different ways of doing this right.

So from that perspective, there are numerous ways to do persistence in containers. One way is you could have storage locally that you expose as a volume into the container itself, and that's where you store your data that needs persistence. The other way is, from a container perspective, it's connected the same way, except the storage is not on the host itself. It's running somewhere else, maybe on a dedicated storage provider, and you're just consuming the volumes into your container. Likewise, you could use distributed storage, where you have pools of storage at each node that you pool together in a scale-out manner, and then you expose volumes from that layer. And then the last way is, of course, using something like a backing service. So instead of storing data in files and file systems and storage, you use either a database as a service, or you could use an object store, et cetera. So you bring the service to the container instead of the volume, and you interact with the service instead.
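As a rough sketch of how the first couple of those options look from the Docker side; the paths, image, and driver name here are placeholders, not anything from the talk.

    # Local storage: expose a directory on the host into the container.
    docker run -d -v /srv/pgdata:/var/lib/postgresql/data postgres:9.5

    # Dedicated or distributed storage looks the same from inside the
    # container; the difference is that the named volume is backed by a
    # volume driver plugin that talks to the external storage system.
    docker volume create --driver some-driver --name dbdata
    docker run -d -v dbdata:/var/lib/postgresql/data postgres:9.5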
So now that we've covered what persistence in containers is and why there's a whole session on it, because there are so many different ways of doing it, Ken is going to come on now and cover the dos and don'ts around what I just said. Before we do that, actually, because I feel like we covered a lot of information very quickly, particularly on this slide, does anyone have questions? If you do, come up to one of the two mics. I just want to make sure folks are clear on what these different storage options are for containers. Go ahead.

Can you please clarify the third option?

The third option. So the third option is basically leveraging storage within the host, the disks within the host itself, but then using something like a software-defined storage provider to pool that disk capacity together into a logical pool, and then creating volumes from that resource pool of storage and assigning them to the actual containers. So effectively, it's similar to local storage, except instead of it being one disk on one host, we have a bunch of hosts providing the storage, and we're pooling it together and then assigning volumes from that instead. Anyone else? Any other questions? OK.

So I want to talk a bit about some of the dos and don'ts for container storage. I will start with the don'ts and then end positively with the dos. The first thing you've got to ask yourself, if you're thinking about doing persistent storage, is: should I be doing that at all? And if I'm doing it, am I doing it in a way that's appropriate for what a container is meant to be? So I'm going to talk about a few don'ts to keep in mind. And I'm being purposely binary about this just to get a point across.

So the first don't is actually one level up: how many of you have heard the analogy that a container is a lightweight VM? That's what I thought. OK, purge that from your mind. That was a useful paradigm when people were initially trying to explain what a container is, but at this point, hopefully people know enough that they realize a container is not anywhere close to being a lightweight VM. The reason I cite this first is because I've talked to a number of customers who, because they've been taught this idea that a container is nothing more than a very lightweight VM that can run on a single operating system, have done things like taking an entire server that used to be a VM in VMware and turning it into one large image. So instead of having images of a few megs or gigs, I've seen customers with 200-gig Docker images. And then they want to know why it takes so long to boot up. So that's why I bring up the point: large images are really nothing more than anchors. If you're going to treat a container like a VM, then you might as well stay with VMs, because you're getting some benefit, but I don't think enough benefit to make that change worthwhile.

The other thing about treating containers like lightweight VMs is that, because you're encapsulating everything into that image, and if it dies you've lost everything, you're forced to treat that container like it's a pet. I assume by this time all of you have heard the cattle-versus-pets analogy. VMs often are like pets. Containers should always be cattle. But if you think in your mind that a container is a lightweight VM, you're going to end up treating it like it's a pet, or doing the things that would make it seem like a pet to you. And one of the things that Shamail spent time talking about was this idea that where you get the full benefits of a container system is when you run microservices on it. And I would argue it's impossible to have a real service-oriented, microservices-based architecture if every container you have is basically a very large Oracle VM. Believe me, I've had people try to do that.

Okay, number two: don't attach volumes to all your containers. Again, I've talked to these same customers. They're like, well, you know, I can't really follow the 12-factor app; it's too hard for me to figure out how to save configuration data or persistent data somewhere else, so I'm just going to take a volume and connect it to every single one of my containers. Again, treating it like it's basically a VM. So why is that a bad idea?
So again, Shamail talked about the idea that containers were designed to be ephemeral. And while for some people that seems like a terrible idea, right? Why would I want things running in my production environment where, when I blow them away, everything disappears? But in fact, there's actually great benefit to that ephemerality, because when things are ephemeral, it actually means they're much more portable, right?

And let me ask: are all of you familiar with concepts like Docker Swarm, Kubernetes, Mesos? Raise your hand if you understand those concepts. Okay. So what is the great benefit of a container management system like Mesos or Kubernetes? It's this idea that I can fully leverage my entire data center by having my microservices in containers move to any machine I want based on workload and scale requirements. What happens when I tie persistent storage to those containers? You typically end up in a situation where I've got to say, okay, these 20 machines have my database data on them, so I'm going to tell Kubernetes or Mesos that these are the only 20 machines in my data center that can be used for this application. So you've basically created an anti-pattern for what you want to do with containers. The more persistent storage you attach to these containers, the less mobile these containers become. And there are some ways around it. Obviously you can use scale-up NFS or scale-up file systems, but then you're adding complexity, and you're also adding the overhead of possibly having to transfer data from one host to another, or having to attach and reattach volumes.

And the next point kind of goes along with that. If I want to scale out and scale back, and I'm going to use persistent storage everywhere, I have to figure out a way to make that persistent storage available and exposed on every single container host that I have, right? In order to be able to scale across the entire environment. And again, that becomes an anti-pattern for what you're trying to do, because the reality is, if you're running a very large Google-like environment someday, you really can't attach traditional storage, or the same traditional storage, to every single one of your container hosts.

And the third don't is: I really advise against using traditional SAN storage for your containers, for a couple of reasons. One, you limit your scalability. The minute you decide that you're going to do a lot of persistent storage and tie it to a single array, your scale limit is basically that array. Unless you can figure out a way around it, if you start running containers across tens of thousands of machines, you would need an array that can be visible to all those machines. The other thing is, one of the benefits of having a container system is that there is no single point of failure. At least within a single data center, there's no single point of failure, because every machine, in theory, should be available for you to run your containers on. But that storage array that you've connected to those containers, that becomes your single point of failure.
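To make that anti-pattern concrete, here is a rough Kubernetes sketch of what pinning a database to the nodes that happen to hold its data looks like; the labels, image, and paths are made up for illustration.

    # Anti-pattern: the pod may only land on nodes labeled as holding its
    # data, so the scheduler loses most of its placement freedom.
    apiVersion: v1
    kind: Pod
    metadata:
      name: db
    spec:
      nodeSelector:
        has-db-data: "true"        # only these machines can run the database
      containers:
      - name: mysql
        image: mysql:5.7
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
      volumes:
      - name: data
        hostPath:
          path: /srv/mysql         # the data lives on the node's local disk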
Okay, so I'm not telling you that you can't use persistent storage, because the reality is there is data that you would like to keep around even when a container goes away. What I'm trying to say is that it's better to be prejudiced towards not using it, and then, in the exceptions where you do need persistent storage, figure out a way to do it without making it a standard part of your environment.

So a couple of dos. Some of these come from the 12-factor app, if you're familiar with it. One of the concepts is to use loosely coupled backends. What I mean by that is you want to try to avoid storing persistent data on things that are connected over a wire where you've got to mount a volume, when possible. You want to use things like object storage, where you're basically accessing the data over a URL, over the network, because then you have much more flexibility and it's not so tied down to the container hosts that you have. When possible, use a database service. You don't want to make it your problem to figure out how to make that database scale across your container infrastructure; let that be the concern of your cloud provider.

And again, only use persistent storage when necessary, because you want to keep containers as stateless as possible. What you don't want to do is make all your containers stateful. The cases that I think would be considered good use cases for persistent storage with containers are data stores that may be difficult to replicate. Maybe you've got a database that you can't shard, so now you've got, say, a 500-terabyte database. Trying to replicate that is not always practical. So in that case, maybe the only way to handle it is to run it on one or more large volumes and then use some kind of persistent storage in order to keep that state when the containers move.

And the last piece of advice is: use distributed storage. Shamail talked about that, and John's going to talk about it a little bit too. Distributed storage, if you're going to use persistent storage with containers, works well because it scales along with your container hosts. So again, you're not tied to essentially a single head or dual heads and saying that's your scalability limit. In theory, depending on the solution, your storage can scale as quickly and as high or as wide as your container system itself. These are just some of the leading scale-out systems that are on the market today. I'm sure there are more; these are just some samples.

Okay, at this point I'm going to turn it over to John so he can walk through, if you won't listen to my advice and you decide you are going to tie persistent storage to everything, at least a right way to do it. And then we'll take some questions.
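As a concrete sketch of the loosely coupled, backing-service style being recommended here: the app talks to object storage over HTTP instead of writing to a mounted volume. The endpoint and token below are illustrative placeholders for an OpenStack Swift account.

    # Write the data to object storage over the network...
    curl -X PUT -T report.csv \
         -H "X-Auth-Token: $OS_TOKEN" \
         https://swift.example.com/v1/AUTH_demo/reports/report.csv

    # ...and read it back later from any container on any host.
    curl -H "X-Auth-Token: $OS_TOKEN" \
         -o report.csv \
         https://swift.example.com/v1/AUTH_demo/reports/report.csv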
All right, thanks, Ken. So we're going to swap out and I'm going to talk while I do this, so this will be interesting. A couple of things to start. A lot of the stuff that Ken pointed out, I don't necessarily disagree with; actually, most of it I do agree with. The only thing that's kind of interesting about this whole world is that we always use companies like Google and Facebook, and the 12-factor app and all these things, as our examples, right? The reality is 99% of us are not going to be running a 1,500-node Docker cluster. I mean, it's just not going to be that way. We may run 50 nodes, 100 nodes, a couple hundred nodes, and stuff like that. And in those cases, I think there are good ways to do this. There are good options in terms of storage platforms; the ones that Ken had on that slide are great examples. And there are also ways to get cross-data-center replication and things like that and make it all work.

So what I set up here is a couple of OpenStack instances, and I'm doing what I think is a good model, and I could be completely wrong, but I've got it on this machine, if I can get to it. Just so you know, I tried to do a similar demo yesterday at a talk that I gave, and it failed miserably. It wasn't actually the storage, though. For this particular setup, I am using a Docker volume plugin that I wrote for a SolidFire cluster; I'll talk about that a little bit in a second. I'm also writing one to actually use Cinder from OpenStack. I wrote it at four o'clock in the morning the other night, and it wasn't as done as I thought it was, so when I tried to demo it, it didn't go well. Hopefully today will go better.

So that being said, hopefully you can still hear me. All right, here's what we have. I am going to go into this demo app that I have called Moby. Let me make the font bigger for you in a second. That's not too bad. Can you see that okay? Okay, so this is just a simple app. There's going to be one container that's running a web app, and it's just going to capture clicks. This app has been around for a while; you may have already seen it somewhere else. It gets used a lot. And then the other is a database container. Now, for that database container, what I did is I went ahead and said, hey, I want a volume to be attached to this database container. So when I do a docker-compose up — and again, you can do this at higher levels, whether it's Mesos or whatever — what I say is: when I do a compose up and you build this deployment for me, I need a volume that has this name. I need you to either go out and find it for me and attach it, or I need you to create it and attach it. So it's going to do that for us, and that's where my database and all my persistence are going to be. The web app itself, all those other components, they're not going to have any persistence. They're not going to have any state. All my state is relegated just to the database and just to that one volume. So to Ken's point, I'm not attaching volumes to everything and just going crazy willy-nilly with persistence. I want to be able to rebuild things.

So what I'm going to do is, I have two OpenStack instances that I built, xenial and xenial2. You can probably guess what OS that is. Those two are running and I've got them ready to go. So I'm going to come on here, and the first thing I need to do is make sure I enter this command. I'm going to come over here, open another session into that same machine, and start my SolidFire Docker driver, okay? So I've got this driver and I am going to just start it. Give it a second and cross our fingers. No feedback is good feedback. So now I'll go into my Docker Compose directory here, with that compose file we looked at, and I am simply going to say docker-compose up and run it in daemon mode.
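The compose file being described would look something along these lines; this is a sketch, and the image names, ports, and volume name are assumptions rather than the exact file used in the demo.

    version: '2'
    services:
      web:
        image: example/click-counter-web   # stateless web front end
        ports:
          - "80:80"
        depends_on:
          - redis
      redis:
        image: redis:3                     # the one stateful piece
        volumes:
          - redisdata:/data                # all persistence lives on this volume
    volumes:
      redisdata:
        driver: solidfire                  # the volume plugin running on the host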
So you can see here, I already had this built in the past, so I didn't actually have to go fetch anything, no big deal. Images are up to date, containers are up to date, everything's running, okay? So now, in theory, we can go over to the URL of this machine. Okay, so now I can come in here and make my little Docker things, yay. All right, so yay, right, big deal. But this is actually kind of cool, because what I'm going to show you now is I am going to kill the node that this is running on. I'm going to shut that instance down, okay? In theory, what that means is: hey, your container host just died and dumped everything. So now it's up to Kubernetes or Mesos, or you, or whatever automation you have, to bring that environment back up and actually still have all your data. So that's the key. So I'm going to come over here and do it the hard way and shut that down. Then when you come back here, it's cached for a minute, but it'll go away. But let's go to our second instance, which is xenial2.

So our first one is gone now, you can see that. Why can't I get to it? All right, well, we can do this another way if we have to. We can rerun it on the same machine again. I'm not sure why I can't get to this. Maybe if I do this... it seems OpenStack networking has let me down; I'm unable to connect. So what you would do is: that machine actually has the same Docker Compose file, and it has the same SolidFire driver running on it. So we would just go into that machine, except now I can't get to anything. Live demos, when will I learn? Never again. This is why Shamail and I were off the stage getting the laptop moved up for the demo. So what we can do, I can show you: for some reason SSH isn't working, but it looks like maybe everything else is. So let's try this. This is the xenial2 machine that I had mentioned, right? SSH does not work. What's that? It shouldn't be, because I actually tested this before demoing it today and had it working, because I was not going to fall for that again. That's right, I'm not on the same network, so that could be the problem. Interesting. All right, well, that was an easy demo.

So I guess the question I would ask at this point is whether that process and what I'm describing make sense, whether you get the gist of it. I have a demo of this actually recorded that you can go and watch. It's at solidfire.com; I have a blog post on it. If you go here... Can you play the video? I can. How short is it? Yeah, it's kind of long, but what I can do is fast-forward to the interesting pieces, if you're interested. Oh yeah, I will. Absolutely.

So if you come down to here, this is getting the driver. Okay, so if we look here, you can see this is the same environment that I was just describing. I've got two machines. I've got the driver set up and running on both; you can see the daemon start, and so on. There's that Docker Compose file, the same compose file. Since I've done this demo, I've actually built more complex things, with seven or eight containers, one database, one log share, and things like that. So you can scale this and expand it however you like. So we run the containers up just like we did over there. It creates the database, it creates the web server, does all that. You can see all the activity on the driver for the create and the attach.
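Boiled down to commands, the flow being shown in the recording is roughly the following; the host names and the plugin binary name are illustrative.

    # On the first host (xenial): start the volume plugin, bring the stack up.
    sudo solidfire-docker-driver &    # the Docker volume plugin daemon
    docker-compose up -d              # creates and attaches the named volume

    # Kill that host, then on the second host (xenial2), with the exact same
    # compose file, the plugin finds the existing volume on the cluster and
    # re-attaches it, so the data comes back along with the app.
    sudo solidfire-docker-driver &
    docker-compose up -d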
And then this was kind of where we left off, right? So we're on the first machine, demo one. We create a bunch of our little icons and do our thing. And what's going to happen is, we come back over here and we'll go ahead and kill that. So we do a stop, and then you come back and you see, again, we get into that state where the web app is no longer available, right? So then I'll come up here and go to the other machine, the demo-two machine. And I just wanted to show that it is the same exact Docker Compose file, because that's the whole point: we don't want to do anything different. So I do the up and start it, and you can look down here and see we went out to the SolidFire cluster and said, hey, where's the volume named redis, or whatever it is I used in the compose file — yeah, redis. It'll find that redis volume, attach it, and connect it to the container for us. So that's all good. And then we'll go to the URL for demo two. Now, if you're actually doing this in production, you're going to want to use a VIP, right? And automatically transfer that IP back and forth to whichever node. But you'll look and you see we get back exactly where we were, right?

That's a good question. So you detach the volume from the host? Yeah, so that's the key. The key to how all of this works, the model that I use and the model that I recommend for people — of course, because I'm a block storage vendor that uses iSCSI — is that the Docker volume driver, the plugin, will do an iSCSI attach to the node that is hosting the container, and then it will pass it through using Docker's file system magic and everything else, right? And give you a mounted file system in the container itself. So you don't actually have to know anything; you don't have to know about file system types, any of that stuff, unless you want to. If you want to specify those things you can; otherwise it's as generic as possible. And that's the whole key.

So, sure, go ahead, you want to step up to the mic? So this particular demo right now is not Cinder. I am putting together a Cinder project that does this, and as of an hour ago, that's working now. So I will push that out, it will be on GitHub, and it may become a Cinder project. So any of you that are running Cinder can actually go ahead and load this up and just have your Docker systems talk directly to Cinder and do all of these things. Yes. Correct. The volumes, yeah. So the volumes are actually on the SolidFire cluster. So if you look — no, no, no, it creates them dynamically, it creates them on the fly. Now, you can do it one of two ways. So back to one of the things that Ken mentioned earlier, right? If you look, this is the SolidFire cluster I was using for our live demo that failed, right? You'll notice the plugin did all of its stuff; unfortunately my networking is screwed up and I can't SSH. But you can see it created the volume. It created the volume for us and did all that. So the plugins that I have will go ahead and create and delete volumes, all that stuff, for you. But the key is, if the volume's already there, it'll just use it, right? So to Ken's point, if you move things around and your container moves and stuff like that, your app doesn't need to know anything.
All it needs is a tag or an identifier for that volume, so that it can go to the cluster and either find it or create it, whichever one. Absolutely, yes. Yep, absolutely. So one of the things that's really nice about the Docker volume plugin is that it's really, really simple. There are basically seven calls that you can make — things like create, remove, list, get, attach, and detach. It's really basic. But they've kept it basic, which is the way it should stay. I've been begging people: please don't let it get more complex, right? But the thing is, they also allow you to pass in options. So inside of your YAML file, or on the command line, however you decide to use it, you can pass in options to specify different things like size, QoS, replication, pairing, consistency groups, whatever you want. But the key is, all of that is ignored from Docker's perspective.

How would that work with Cinder? Would it just pick up the extra specs? So yes, exactly. With Cinder, the beauty is that this is hopefully going to help promote volume types inside Cinder a little bit better, because volume types in Cinder are kind of misunderstood and not used effectively, in my opinion. You can put any kind of special characteristics you want in volume types, whether it be QoS, replication, whatever. There are all kinds of things you can do with that, and that's the place you would do it. So, sorry, thank you. So what would happen is, when you use that Cinder plugin for Docker, you would go and say: when you create these volumes, or when I ask to use these volumes, this is the type that I want. And it will read all of those extra specs and just use that. So it works just like Cinder, except it abstracts away all the complexities that you have in OpenStack and uses it for containers.

We need to end. Are there any burning questions? If not, I think the three of us will be around for a few minutes before we probably get chased out by the next session. So we're happy to answer any questions. Happy to debate anything. Happy to have a fight. Have a fight. We can all fight tomorrow. That's right. No? Okay. Well, thanks everyone for coming. Thank you guys.
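For anyone who wants to try the volume-type idea discussed above, the flow would look roughly like this. The type name, extra-spec keys, driver name, and option keys are illustrative, and the Cinder plugin mentioned in the talk may expose its options differently.

    # 1) Define a Cinder volume type carrying whatever extra specs your
    #    backend supports (QoS, replication, and so on), for example:
    #      cinder type-create gold
    #      cinder type-key gold set qos:minIOPS=1000 qos:maxIOPS=5000
    #
    # 2) In the compose file, pass options through to the volume driver;
    #    Docker does not interpret them, it hands them to the plugin.
    volumes:
      dbdata:
        driver: cinder       # hypothetical name for the Cinder volume plugin
        driver_opts:
          type: gold         # volume type whose extra specs Cinder will apply
          size: "20"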