It looks like there might be some seats up front if you guys are looking. Let's start. Next up we have folks from Mesosphere, Dell EMC, and Diamanti talking about CSI and storage support in Mesos. Let them introduce themselves, and thanks for coming to the talk.

All right, so my name is Jie Yu. I work for Mesosphere right now, and I've been an Apache Mesos committer and PMC member since 2013. My Twitter handle is jie_yu.

Hi, I'm James DeFelice. I'm an engineer at Mesosphere. You can find me on GitHub at jdef.

Hi, I'm Steve Wong, an open source engineer with the {code} team sponsored by Dell EMC; that's a group that works on community open source projects. I've been working on Apache Mesos for about two years now. I also work on the REX-Ray project, which is a storage provider enablement for CSI, and I've dabbled a bit in Kubernetes.

Okay, this is an overview of the agenda today. We're going to start out with the state of storage and container orchestration today. Moving on from that, we'll cover the benefits of standardization, which is what the Container Storage Interface is about, and then we'll get into a more detailed overview of what the Container Storage Interface looks like. Then, since this is MesosCon, we'll move on to cover the specific details of how Mesos is going to adopt the Container Storage Interface. This is a work in progress, so finally we'll go on to the roadmap: our plans to deliver a first release and follow it up with more features. Then we're going to wrap up with some directions on how you can participate yourself, how you can help move this forward and have a voice in how this gets done.

As I said, we're going to cover three perspectives on the state of storage and containers today: the perspective of a user of a container orchestrator, the perspective of somebody implementing a container orchestrator, and the perspective of a storage provider.
We'll start with users. Over the past couple of years there's been a huge shift where stateful applications running in containers have become pretty mainstream. There was a time when the twelve-factor methodology advised you that everything in a container should be stateless. What that really meant is that you couldn't deliver a practical application with no state anywhere; you put the state off to the side, and it wasn't managed in a container or by your orchestrator. But in the last couple of years Mesos added support for external persistent volumes, and this made it feasible to run these stateful apps inside a container, with the whole thing under the management of Mesos. You can see here just a few icons of stateful apps that can be deployed on the Mesos platform in containers.

From the perspective of container orchestrator developers, and these would be Mesos but also DC/OS, Cloud Foundry, Kubernetes, and Docker Swarm, these platforms have over the past years essentially all added support for external persistent volume mounts. But they did it independently, and as a result the implementations are inconsistent. They have variations in their couplings to vendor-proprietary APIs on the storage front, and some of the platforms even have instances where the storage interface, all the way to the storage provider, is implemented in code that lives within the source tree of the orchestrator. This will be covered later in the presentation, but some of the orchestrator vendors have discovered that maybe that wasn't the best decision.

Finally, from the perspective of storage providers, of which I'm here representing one: we've been in the position where we have to support all of these container orchestrators for our customers, and it's tough to keep up, because the interfaces we have to deal with on each orchestrator are all a little bit different. This hinders our ability to rapidly adapt to new features that come out in these orchestrators. It also means that when we put our staff on these projects, their effort gets watered down, because they have to be trained on all of these platforms; you have finite resources and you end up diverting them to this horizontal spread just because of these variations.

As an example, here is a summary of the state of the world today. This is a spreadsheet showing the storage plugins across all orchestrators. It's just a cut and paste of a little spreadsheet that, if the screen were bigger, would probably go down a couple of floors below us. On Amazon EBS alone there are something like five storage plugins being implemented and maintained by various vendors.

The analogy I use, and this is kind of my wrap-up of this whole section as to why we want CSI: this is a picture of the electrical outlets and plugs of the world. Nobody wants the world of container orchestrators and storage providers to evolve the way electricity has. In the current situation, users have problems using devices portably. If you buy a hair dryer and you travel to Europe, what are you going to do, buy a new hair dryer?
Or you buy an adapter for 30 bucks, move to a different country, and you have to do it all over again. This doesn't make sense. So it's bad from the perspective of the user, but say you're an appliance vendor: it's bad for you too, because if everything were uniform you'd get a better economy of scale. You could have one product that you could probably sell more cheaply, because you'd design it once and sell exactly the same thing worldwide. When appliance vendors have to come up with all these permutations, it adds cost; even for you, the user, the stuff is just more expensive, and the vendors can't put engineering effort into adding unique, novel features. It also adds cycle time to how quickly these things get delivered, because in the electrical appliance market they typically have to be retested for every market, and it just takes longer. So at this point I'm going to turn it over to James to cover some of the specific goals of the Container Storage Interface.

Thanks, Steve. So the primary goal of CSI is to provide a neutral, standard protocol for container orchestrators (COs) to interact with proprietary storage systems. You get presented with a consistent set of behaviors and expectations, both for orchestrators and for vendor implementations of CSI. For vendors, this means less work to support n container orchestrators. For container orchestration systems, you get access to a broader storage ecosystem, you get to leverage open APIs, and you end up with a looser coupling to storage back ends. Some other goals: we're aiming to present a small set of APIs but still enable many use cases; in other words, we want to drive towards kind of a lowest common denominator of APIs. We also want a low barrier of entry for CSI plugin writers, so it should be very easy to get up and running building a new CSI plugin.

From a high level, CSI presents a control plane interface that's largely focused on volume lifecycle. The interface is service oriented rather than a command line interface; this allows plugin services to easily co-locate with other required long-running services, such as FUSE daemons or Gluster or NFS services. Services are exposed via gRPC. There are some advantages to this: there are well-understood mechanisms for proxying and load balancing gRPC calls; gRPC supports streaming responses, which is something we're considering post v1; gRPC scales well; it's an open specification; and there's great community support.

From a configuration and operation perspective, CSI places an emphasis on protocol over operational specification. CSI allows plugin supervisors, whether that's a container orchestrator or something else, to decide how to deploy and isolate plugins. CSI does not specify security protocols; an operator is responsible for protecting a Unix socket just as they would any other file system object. With respect to packaging, CSI does not mandate a container image format; the spec suggests you use something that's cross-CO compatible. There are pretty minimal expectations with respect to supervision; for example, a plugin should terminate upon getting a SIGTERM. And last, isolation of a plugin is not guaranteed, but it's very likely.
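To make that service model concrete, here is a minimal sketch in Go of a plugin process exposing a gRPC server on a Unix domain socket and shutting down cleanly on SIGTERM. The socket path and the commented-out registration calls are placeholders; a real plugin would register the Identity, Controller, and Node servers generated from the CSI protobuf definitions.

```go
package main

import (
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"

	"google.golang.org/grpc"
)

func main() {
	// CSI leaves endpoint choice to the plugin supervisor; a Unix domain
	// socket protected by ordinary file permissions is the common case.
	const endpoint = "/var/run/hypothetical-plugin/csi.sock"
	_ = os.Remove(endpoint) // clear a stale socket from a previous run

	lis, err := net.Listen("unix", endpoint)
	if err != nil {
		log.Fatalf("listen on %s: %v", endpoint, err)
	}

	srv := grpc.NewServer()
	// A real plugin would register the servers generated from the CSI
	// .proto files here, for example:
	//   csi.RegisterIdentityServer(srv, &identityService{})
	//   csi.RegisterControllerServer(srv, &controllerService{})
	//   csi.RegisterNodeServer(srv, &nodeService{})

	// Supervision expectations are minimal: exit cleanly when asked.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	go func() {
		<-sigs
		srv.GracefulStop()
	}()

	if err := srv.Serve(lis); err != nil {
		log.Fatalf("gRPC server stopped: %v", err)
	}
}
```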
So what makes up a plugin? The CSI spec defines three gRPC services: Identity, Controller, and Node. The composition of a particular plugin binary may depend on the deployment requirements; for example, a headless plugin will probably bundle all three of these services together. To illustrate this, there are some diagrams on this slide. On the right we have a headless model: in the middle there's the orchestrator, and around it you've got the nodes where your containers run. On each node there's a plugin instance running, and that plugin instance exposes both the CSI Controller and Node services. On the left side, again you've got the orchestrator in the middle, but you've also got a plugin co-located with the orchestrator, and that plugin exposes the Controller service; then the nodes where the containers run, the ones on the outside, are running plugin instances that expose the CSI Node service. The storage vendor is left to decide which deployment strategy is appropriate for their plugin implementation, and it's up to the operator to configure the container orchestrator according to the storage vendor's documentation.

This slide shows the lifetime of a volume from creation to deletion. It shows the different states a volume goes through as different CSI RPCs are invoked, and it shows that the container orchestrator is driving the provisioning process. So a container orchestrator will invoke a create volume call; it will invoke the controller publish volume call, which is like saying "attach this to a node"; and it will invoke the node publish volume call, which is kind of like saying "I want to mount this volume". Plugins advertise support for these lifecycle operations through capability RPCs, and a couple of examples of those operations are the create and delete volume RPCs and the controller publish and unpublish RPCs.

That brings us to the API. This is just going to be a brief walkthrough. The API is really intended for consumption by container orchestrators like Mesos, not by end users; that said, it's still useful to understand what's happening under the hood. We'll start with the Identity service. This is important for version negotiation: a container orchestrator invokes the GetSupportedVersions RPC, the plugin responds with a list of supported versions, and then the orchestrator selects which version to use for future RPC calls to the plugin. All CSI endpoints are required to support this service regardless of the deployment mode.

The next service is the Controller service. This runs either in a central location or on all the nodes themselves; it really depends on the deployment mode. The first call here, ControllerGetCapabilities, is important because it reports which RPCs are actually implemented by the plugin; all the RPCs below it are optional and may or may not be implemented by a particular plugin. So, create and delete volume: CreateVolume accepts a name, capabilities, and a size for the volume you want to create, and DeleteVolume is the inverse, deleting the volume from the storage provider. ControllerPublish and ControllerUnpublish started out as attach and detach, but we decided those names really don't fit all the workflows, so we landed back on publish and unpublish and just prefixed them with "controller". These calls may be useful for centralized deployments; they could also be useful for headless deployments. It really all depends on the plugin implementation.
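As a rough illustration of that orchestrator-driven lifecycle, here is a sketch in Go of the call order a container orchestrator might follow. The ControllerClient and NodeClient interfaces and the Volume struct are simplified stand-ins invented for this example; they are not the generated CSI message types, which carry more fields (capabilities, credentials, and so on).

```go
package main

import (
	"context"
	"fmt"
)

// Simplified stand-ins for the CSI-generated gRPC clients and messages.
type Volume struct {
	ID       string
	Metadata map[string]string
}

type ControllerClient interface {
	CreateVolume(ctx context.Context, name string, capacityBytes int64) (*Volume, error)
	ControllerPublishVolume(ctx context.Context, vol *Volume, nodeID string) error
}

type NodeClient interface {
	NodePublishVolume(ctx context.Context, vol *Volume, targetPath string) error
}

// provisionAndMount shows the order of operations the orchestrator drives:
// create the volume, "publish" (attach) it to a node via the controller
// service, then "publish" (mount) it via the node service running there.
func provisionAndMount(ctx context.Context, ctrl ControllerClient, node NodeClient, nodeID string) error {
	vol, err := ctrl.CreateVolume(ctx, "my-data", 10<<30) // 10 GiB
	if err != nil {
		return fmt.Errorf("create volume: %w", err)
	}

	// Roughly "attach this volume to the machine identified by nodeID".
	if err := ctrl.ControllerPublishVolume(ctx, vol, nodeID); err != nil {
		return fmt.Errorf("controller publish: %w", err)
	}

	// Roughly "mount this volume at the path the workload will use".
	if err := node.NodePublishVolume(ctx, vol, "/var/lib/containers/my-data"); err != nil {
		return fmt.Errorf("node publish: %w", err)
	}
	return nil
}

func main() {
	fmt.Println("sketch only: wire up real CSI clients to use provisionAndMount")
}
```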
The ListVolumes call is useful for discovering pre-created volumes; it does not imply that the create and delete RPCs are supported. The ValidateVolumeCapabilities RPC allows a container orchestrator to determine whether some volume, maybe a volume returned by ListVolumes, supports a given set of capabilities and parameters. And lastly, GetCapacity allows a CO to determine the available space on the back end, typically ahead of a CreateVolume call.

Last is the Node service. This service runs on the actual nodes upon which volumes are mounted, so these are the nodes that are running all your containers. ProbeNode is an important call that checks the configuration of the plugin: it checks for required software and any required devices, and if that call fails, the container orchestrator decides that the plugin is not ready to service requests. NodePublish and NodeUnpublish you can think of roughly as mount and unmount. GetNodeID presents a consistent identifier for the node from the perspective of a plugin instance. And the node get-capabilities call is really just a placeholder for now; there's no meat there yet.
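To ground those node-side calls, here is a hedged sketch of what "probe, then publish" might look like for a plugin that mounts an already-attached block device. The function names, the shell-out to mount, and the paths are illustrative only; a real node service would take these values from the CSI request messages and would usually use mount syscalls rather than the command line.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// probeNode mimics the spirit of the node probe call: verify that the
// software and devices the plugin depends on are present before the
// orchestrator starts routing publish requests to this node.
func probeNode() error {
	if _, err := exec.LookPath("mount"); err != nil {
		return fmt.Errorf("required tool missing: %w", err)
	}
	// A device-backed plugin might also check /dev entries, kernel
	// modules, or a vendor daemon here.
	return nil
}

// nodePublish mimics "node publish volume": make an already-attached device
// visible at the target path the workload will use.
func nodePublish(devicePath, targetPath, fsType string) error {
	if err := os.MkdirAll(targetPath, 0o755); err != nil {
		return err
	}
	out, err := exec.Command("mount", "-t", fsType, devicePath, targetPath).CombinedOutput()
	if err != nil {
		return fmt.Errorf("mount failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	if err := probeNode(); err != nil {
		fmt.Println("node not ready:", err)
		return
	}
	fmt.Println("sketch only: call nodePublish with a real device and target path")
}
```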
That was a brief walkthrough of the API. I'd like to turn it over to Jie, who's going to cover how CSI is going to be integrated into Mesos.

Right, thanks James. All right, so James explained how CSI itself works and what the APIs are, and I'm going to cover our plan for Mesos to adopt CSI as part of the storage work. What we plan to do for the CSI integration is to introduce a new concept called a resource provider. If you think about Mesos right now, we do have a way to customize the use of resources: you can write your own framework to customize how resources are used. But we don't really have a way to customize the providing of resources. Right now that part is basically hard-coded: the agent boots up, discovers a bunch of resources on the machine, and reports those resources to the master, and there's no way to customize that. The whole concept of a resource provider is to provide an abstraction in Mesos that allows operators or cluster administrators to customize the resource-providing side.

A resource provider can be local or external. A local resource provider means the resources it provides are tied to a particular agent; think about CPU, memory, and disk, where some of the disks are local. An external resource provider provides resources that are not tied to a particular agent; think about resources like remote disk storage or IP addresses. We want to support both of these concepts, local resource providers and external resource providers. If you think about it, the agent itself today can be treated as a local resource provider that provides the traditional CPU, memory, and disk resources, plus the other part, which is task management. As I said, the reason we're introducing this is that we want to allow users to do customization and extension, and Mesos doesn't currently have a concept of global or external resources; the resource provider abstraction is meant to support that.

Given this resource provider interface that we plan to introduce, for storage in particular we want to introduce a first-class storage resource provider. As I said, it can be both local and external, so we call the local one the storage local resource provider and the remote one the storage external resource provider. The storage resource provider will talk to the CSI plugin to get the available storage space from the plugin itself. For example, as James mentioned, there's a GetCapacity call in the Controller service; that's something we plan to call to get the actual capacity from the storage provider and expose it to Mesos as disk resources. We want to abstract away all the details of the plugin itself and only expose resources to the Mesos master. Another job of the resource provider is to handle operations. We have some existing operations like reserve and create persistent volume, and we want to extend that to support things like create volume and create block, which provision and deprovision volumes from those storage providers. The goal of the Mesos support is pretty simple; I'll phrase it like this: the storage vendor just needs to provide Mesos a single Docker image that contains all the CSI plugin bits and the configuration, and Mesos will handle the rest.
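Here is a small, hypothetical Go sketch of the idea just described: ask the plugin's controller service for its capacity and wrap it as a disk resource. The CapacityClient and DiskResource types are invented stand-ins; the real resource provider lives inside Mesos and works against the Mesos and CSI protobufs, so this only illustrates the shape of the translation.

```go
package main

import (
	"context"
	"fmt"
)

// Stand-in for the CSI controller service's capacity query.
type CapacityClient interface {
	GetCapacity(ctx context.Context) (availableBytes int64, err error)
}

// Stand-in for a Mesos-style disk resource advertised to the master.
type DiskResource struct {
	Name       string  // always "disk"
	Megabytes  float64 // advertised capacity
	ProviderID string  // which resource provider owns this resource
	Profile    string  // an operator-defined storage profile name (illustrative)
}

// advertiseCapacity asks the plugin how much space the back end has and
// wraps it as a disk resource; a real storage resource provider would also
// subtract space consumed by existing volumes before updating the master.
func advertiseCapacity(ctx context.Context, c CapacityClient, providerID, profile string) (*DiskResource, error) {
	bytes, err := c.GetCapacity(ctx)
	if err != nil {
		return nil, fmt.Errorf("get capacity: %w", err)
	}
	return &DiskResource{
		Name:       "disk",
		Megabytes:  float64(bytes) / (1024 * 1024),
		ProviderID: providerID,
		Profile:    profile,
	}, nil
}

func main() {
	fmt.Println("sketch only: wire a real CSI controller client into advertiseCapacity")
}
```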
This is kind of the high-level architecture of how we're going to integrate CSI into Mesos. As you can see here, we have a master and an agent, and we have two types of resource provider: the storage external resource provider and the storage local resource provider. We do things differently for those two types. For the external ones, we run the controller CSI plugin in a centralized location, probably scheduled by a framework like Marathon or Aurora, and the storage external resource provider uses gRPC to talk to that controller plugin. In this example we have an EBS plugin that the storage resource provider talks to in order to dynamically provision and deprovision volumes. Once a framework gets those disk resources and tries to launch a task using them, the master sends the task to the agent. There's another EBS plugin running on the agent node, because, as James mentioned earlier, for CSI you have to implement the node plugin interface as well as the controller plugin interface. The node plugin is responsible for making sure the resource you want to use on that agent node actually shows up; for example, with EBS you do an attach to make sure the volume actually shows up on the node, and then you do a mount. The whole reason CSI separates the node plugin from the controller plugin is that there are certain operations that have to be performed on the node, for example file system mounts and mkfs, things like that.

So that's for external volume support. For local storage support, for example if you have an LVM plugin, what we end up doing is running both the controller plugin and the node plugin in the same container, so that one container provides both services. The local storage resource provider will talk to that CSI plugin, and the agent will also talk to it, to properly do provisioning and deprovisioning and to make sure the resource actually shows up on the node, which is kind of a no-op in the local resource provider case.

All right, so this is the roadmap of storage support in Mesos. We're going to do the local resource provider integration first and then integrate with CSI, and then we're going to do the external resource provider. That part is a little harder, because we don't really have a notion of global resources inside Mesos, so we decided to do it later, but it's part of the roadmap. You can track the progress in this epic. The local resource provider (LRP) support is targeted for the next release, and the external resource provider (ERP) support is targeted for the release after that, which is 1.6. Our release cycle is two months, so expect the LRP support to be available in about two months and the ERP support four or five months out. Now I'm going to hand it over to Steve.

Thanks. So I'm back on stage again to represent CSI from a storage vendor's perspective, and what you see here is a quick overview of our REX-Ray storage provider. At the top you see container orchestrators; they go through the Container Storage Interface spec and get to the REX-Ray plugin. REX-Ray is actually written as kind of an out-of-tree generic plugin that can have plugins implemented on top of it, so you see here that we can deliver NFS, block, and VFS, as well as a number of other storage providers. REX-Ray has been delivering persistent volumes to Mesos since October of 2015, so it has quite a history; I'm sure some of you in the audience have it in production today with the pre-CSI storage interface. On September the 12th we actually released REX-Ray version 0.10, which is the first implementation of the proposed CSI specification in the world. Somebody had to go first.
So at this point in time, as Jie said, it isn't supported in a container orchestrator yet; to my knowledge it's not supported in any container orchestrator. But we felt that we wanted to get something out there in the world to enable projects like Mesos and DC/OS to do work on the other side. And a key here, and I don't want anybody to get bent out of shape, is that I'm calling it a proposed specification, because at this point in time the container orchestrators get a vote on this, and we actually discovered a couple of issues as we were building the REX-Ray implementation of a storage provider. So the spec, I think, is pretty darn close to what the final, officially declared release spec will be, but technically it is a proposed specification at this point. Should that change, I think we'll be pretty rapidly on board with adapting to whatever change takes place. This was made just to enable some early foundational work for orchestrators like Mesos, Kubernetes, DC/OS, and Docker Swarm.

Here is kind of an architectural overview of the plan for REX-Ray. We start with the Container Storage Interface; we're planning on doing a conformance-to-spec test suite; we build REX-Ray; and the container orchestrators talk through the CSI spec, which calls for gRPC, to these plugins. These are the plugins that we just announced. We've got coverage in the public clouds for all the flavors of storage you're likely to use: Amazon EBS, EFS, S3, Google Persistent Disk, and DigitalOcean Block Storage. If you run on-prem, we've got support for Ceph RBD. We've got three more-or-less generic flavors: block, NFS, and virtual file system. For the Dell EMC on-prem product line we shipped with initial support for ScaleIO and Isilon. And finally we have support for OpenStack Cinder in this first release. Now, if you use REX-Ray today with the pre-CSI interfaces, you're probably aware there's even broader support; some of the support for all of the storage plugins available with REX-Ray today isn't quite enabled on CSI yet, but it's on our roadmap. The roadmap calls for the next release, REX-Ray 0.11, roughly December of this year.

I'm just one storage vendor, but the goal of CSI is to have broad industry support, so to emphasize that I'm going to bring another vendor on stage. And by the way, if there are any storage vendors in the audience, I'd be happy to talk with you if you haven't gotten on board with CSI yet. Chakri?

Thank you, Steve. We are Diamanti; we develop storage and network solutions for containers. Initially, when we started, we looked at Kubernetes, and we realized there was no way you could develop your plugin outside of the Kubernetes code base: all your plugins had to live inside the Kubernetes code base, very tightly coupled with the Kubernetes release cycle. So we developed FlexVolume. It was a good start, and it provided a simplified, vendor-neutral interface, but it was limited to one orchestrator. So now here comes CSI, and we are very excited to be part of the CSI community. I'm actually grateful to the ecosystem for bringing this together and making it happen, so thanks to Google, Mesos, Docker, and Cloud Foundry; they worked hard to make it happen in the last few months. So what is CSI from a vendor's perspective?
It is one storage plugin interface that supports multiple orchestrators. As a vendor you only have one plugin, and it's going to work with multiple orchestrators. Without this, we had to develop multiple releases of our plugin for multiple orchestrators, and we were spending a lot of time developing and testing across all these environments. With a united ecosystem there is now a single interface, and we don't have to keep supporting the whole test matrix we used to. Also, with a united ecosystem, what we realized is that every orchestrator had been innovating separately and developing their own interfaces, so there was a lot of duplication of effort. Once we got together, there's now only one interface, there's a lot of new innovation happening in that area, and there's no duplication. And as a vendor, what this gives our customers is a proven interface, so that people can easily and readily adopt container infrastructures. Okay, thank you.

So we're getting to the wrap-up here. If you're wondering who's involved with CSI, this is a broad initiative: you see there the logos of essentially all the major container orchestrators as well as a lot of the major storage providers. We had to do this deck a couple of weeks ago, because the Linux Foundation wants the decks submitted early, so I may have even missed a few; it's hard to keep up because this thing is gaining momentum. So if you're a storage provider out there that is committed to doing CSI enablement, let me know, so that if I deliver this deck again I can add you to the chart.

What we're delivering now is just the first step, so I want to be open about what isn't in CSI yet. The list is up there, but on the roadmap past the initial release we have plans to support snapshots, volume resizing, quotas, support for Windows containers, and finally user ID and credential pass-through all the way from the container orchestrator to the storage provider, because some storage providers have features that rely on user identification as a means to implement additional forms of security.

Finally, on community: this work is underway now. We've got a number of storage providers working on this, as well as all the major container orchestrators, but there's room here, if you're a user of a container orchestrator, a consumer, to get involved. And it's a great opportunity, because as I mentioned, on the roadmap we have plans to support snapshots, which would enable backup strategies and a number of other features. If you've got features that you really want, I'd invite you to get involved with the Container Storage Interface group and get your feature wish list on the table. It's currently operating in the form of an online one-hour meeting every two weeks, and there's a link to get there. There is also a Google group that supports a mailing list, so you can look at the historical communications there to get a feel for where this has come from and where it's going. Why would you want to get involved now?
Well, you can shape this at this stage; it's still in its formative stage. But more important than that, perhaps, is that if you're somebody who runs stateful applications at scale, and these are critical to your organization, getting involved at this architectural planning stage is a great way to train your staff. You're going to get a thorough understanding, top to bottom, of how this is put together, which should enable you to do some serious troubleshooting and maybe make plans for how you'd implement your own infrastructure: laying out your compute nodes, your storage interfaces, what kinds of storage you'd like to procure. Often, as you watch these architectural decisions being made, you get to a point where you understand at kind of an expert level how these things work under the covers. That's a great opportunity, because if you wait until it's all shaped and formed, some of those opportunities to see why the decisions were made are lost. Theoretically you could go back and do Google searches to find a lot of these notes, but actually coming to these meetings, and we occasionally even have some face-to-faces, is a good opportunity if you're really a scale user where this is critical to your business.

That said, that's our presentation. It looks like we've got some room for questions, if anybody has any. Yeah, we have one there.

Yeah, so the question is: I mentioned in one of the slides that Mesos is going to take just one Docker image name from the storage provider and take care of the rest, so how does that workflow actually work? So, yeah, there's a configuration. As I mentioned, there's a storage resource provider, and there's a configuration for it. You specify the Docker image name in that configuration, and you can hit an endpoint on the agent; or, if it's a remote resource, an external resource provider, then you can use Marathon to launch that external resource provider, which then talks to the Mesos master. For the local resource provider, there's a single plugin container.

Can we go back to the architecture slide? All right, so if you look at that graph: for the local case, you have a configuration that you pass to the storage local resource provider, which includes the Docker image name for your plugin. The storage local resource provider will make sure that this plugin container is running on that agent, using the same containerizer we have inside Mesos to run the container and keep it running. It talks to that container using gRPC, and it also monitors the health of that container; when it fails, it will restart it and make sure it can talk to the plugin again.
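As a loose illustration of that monitor-and-restart behavior, here is a hypothetical Go sketch of a supervision loop. The Containerizer interface is an invented stand-in, and the real logic inside the Mesos agent and resource provider is more involved (backoff, health probes over gRPC, reconnecting clients); this only shows the general shape.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Stand-in for the piece described above: something that can launch the
// vendor's plugin image and report when the container exits.
type Containerizer interface {
	Launch(ctx context.Context, image string) (containerID string, err error)
	Wait(ctx context.Context, containerID string) error // returns when the container exits
}

// supervisePlugin keeps the CSI plugin container running: launch it, wait
// for it to die, then relaunch it so the resource provider can keep talking
// to it over gRPC.
func supervisePlugin(ctx context.Context, c Containerizer, image string) {
	for ctx.Err() == nil {
		id, err := c.Launch(ctx, image)
		if err != nil {
			fmt.Println("launch failed, retrying:", err)
			time.Sleep(5 * time.Second)
			continue
		}
		// Block until the plugin container exits, then loop to restart it.
		if err := c.Wait(ctx, id); err != nil {
			fmt.Println("plugin container exited:", err)
		}
	}
}

func main() {
	fmt.Println("sketch only: provide a real containerizer and plugin image to supervisePlugin")
}
```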
And yeah, for the local one, that's how it works: you just need to tell Mesos about the configuration of that storage local resource provider, which contains an image name, and then we take care of the rest. For the external ones, most likely the storage external resource provider is going to be launched by another scheduler like Marathon or Aurora. One way to do that is to package the controller plugin container alongside the storage external resource provider, as a pod for example, so that they can talk to each other over a Unix domain socket or a local address. Then there's a container image name in the configuration of the storage external resource provider that gets passed from the Mesos master to the agent when you're actually launching a task. There's a component in the agent that wants to make sure the resource appears on that node: it checks whether a node plugin container is running or not, and if it isn't, it will try to spin one up. It can get the name of the container because it's passed all the way from the master to the agent as part of the configuration of the external resource provider. Does that answer your question? Cool.

The next question is where the configuration of the plugin container resides, and the answer is: it depends. Different plugin vendors have different configurations; for example, when launching a plugin container you may want to set some special environment variable. That's part of the configuration you supply to the storage local resource provider or the storage external resource provider. It's a JSON config in which you specify which image name you want to use, what the command is, and what the environment variables are, pretty much the same things as when you're launching a task. We use the same mechanism and ask you to specify the CommandInfo and ContainerInfo of that container.
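To make that concrete, here is a hypothetical example of what such a JSON config might look like. The field names and values below are illustrative only, not the final Mesos schema; the point is simply that the operator supplies the plugin's container image plus CommandInfo/ContainerInfo-style details, and Mesos launches and manages the plugin container from that.

```json
{
  "type": "org.apache.mesos.rp.local.storage",
  "name": "example_lvm_provider",
  "storage": {
    "plugin": {
      "name": "example-lvm-plugin",
      "containers": [
        {
          "services": ["CONTROLLER_SERVICE", "NODE_SERVICE"],
          "command": {
            "value": "/usr/local/bin/lvm-csi-plugin --endpoint=unix:///csi/csi.sock",
            "environment": {
              "variables": [{"name": "LOG_LEVEL", "value": "info"}]
            }
          },
          "container": {
            "type": "DOCKER",
            "docker": {"image": "example/lvm-csi-plugin:0.1"}
          }
        }
      ]
    }
  }
}
```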
Yeah, so the next question is how CSI exposes access to additional metadata about the volume, so that a container orchestrator, or someone else, can reason about it to place a workload. The CSI spec defines two things that are part of a volume handle: one is the ID and one is the metadata. It's up to a plugin to expose whatever additional metadata it wants about that volume through the volume metadata protobuf. It's an opaque field; there are no standard keys. So if you end up writing specific logic in your CO, then you're kind of coupling it to a particular storage provider, and if you're an end user, then maybe you're coupling your app to a specific storage provider. There's no conventional set of keys for that metadata field, but the information can be there as long as a plugin implements support for it. Does that answer your question?

Yeah, so basically, if you go to the CSI spec repo right now and look at create volume, the return value actually gives you a volume ID and the volume metadata. We're going to stick that into the resource that we provide, so the resource provider gets that information and creates a disk resource, and the metadata we stick into that disk resource can contain the ID as well as the metadata. But whether the scheduler wants to leverage that information to make scheduling decisions, that's kind of a question mark. Some people might prefer not to write logic for a specific storage provider, because that metadata is very specific to a given storage provider; you probably don't want to do that. A different approach is to provide some kind of high-level abstraction, say a storage class or a profile, where you specify a name for the profile, and then for each provider you specify different parameters and so on. Then the scheduler makes scheduling decisions based on the profile name rather than on the raw parameters in the volume metadata. So it depends on which direction you want to go; Mesos is not going to be opinionated about that.

And just one final bit of information, because the question was kind of related to placement: there is an open action item right now in the CSI working group. One of our partners, Saad from Kubernetes, is working on the concept of domain, how domain is going to fit into the CSI spec, and how that would affect placement of volumes in the cluster. So it's an open ticket, an open item; if you want to be involved, I recommend attending the community syncs that we have.

Yeah, just one more thing: I think domain is pretty important. Think about EBS: you do have restrictions on where you can use a volume. If you create an EBS volume in zone A, you cannot use that volume in zone B. You do need to expose some domain or topology information to the framework so that it can make better placement decisions; you don't want to place a volume in a zone where you cannot use it. Any other questions?

Okay, we are out of time. If you have any questions, feel free to come over and ask. Thanks. Thanks, everyone.