All right, now we get down to it. Only the dedicated are left. For those of you listening to the recording, it's about 5:20, and there's a very dedicated core group in the room with us here today. So hi everybody, my name is Alex Jauch. I'm a product manager at VMware, part of the software-defined storage team, and Dan asked me to come by to talk about storage, specifically about vSAN and how it relates to OpenStack. In the question section I'm more than happy to talk about general topics around vSphere and storage, but the talk today is mostly focused on vSAN.

The reason for that is that when we talk to vSphere customers about implementing OpenStack, what they usually say is: we really love the stability of ESX, but the problem is that the way I build ESX clusters today is designed for a workload that OpenStack usually isn't. I have this use case where my OpenStack environment is supporting these newer apps that do their own replication and have their own backup strategies; they don't need the high-end SAN services traditionally associated with an ESX cluster. Well, funny enough, we have a product that has all the performance and all the stability you expect from ESX, but doesn't have the back-end SAN services you'd expect from a VMAX or a VNX, and it's called vSAN. So, magically, here we are today.

I would like to claim that we had this all planned out and knew exactly the use cases we were building vSAN for, but I have to be honest, that wasn't the case. It just so happens that we have a piece of technology that's come along at just the right time for a lot of reasons, and it just so happens that OpenStack has come along, so we have a very nice confluence of events here.

Let me set some context before we dive into the internals of vSAN. And by the way, I will get into the internals of vSAN, and I'm happy to go super deep if that's what you want. From a VMware perspective, this entire discussion is part of the greater software-defined storage conversation. We fundamentally believe that the storage industry is going through a transition towards software-defined storage, and like any of these transitions, there's a lot of conflict, discussion, and debate about the words. I'm not really religious about this; I'd just like to explain what we mean when we use the term. That's not to say other people's definitions are wrong; we just have to define it for ourselves so that when we use the term, we know what we're talking about.

So when a VMware person uses this phrase, we mean something extremely specific. What we mean is that the storage subsystem is broken down into three fundamental component parts. The first is the data plane, which is not a huge surprise; it's been around for a long time, and it contains things like vSAN and VVols and SAN and NAS and all those wonderful things we've had for years. But it also contains a virtualized data services tier, and that's something that's a little newer and can, in fact, be controversial. If you're not a storage geek, don't worry about it. What we're talking about is services that SANs have traditionally provided: things like backup, replication, compression, and caching. If you buy a NetApp FAS series controller or an EMC VMAX,
they have those features right in the box. Nothing wrong with that; it works great, it scales up, it can be expensive, but it certainly does the job very well, as do our friends at IBM and Dell and Hitachi. It's a very rich industry with lots of support. On the other hand, what happens if you want to bring these services up into the software layer and have them aggregated across all your different storage implementations? How do you do that? The answer is virtualized data services, and again, there are a lot of people doing really great work here. This could be a software-based replication tool; this could be a caching layer; there are lots of interesting things going on. The difference is that the services are abstracted from the persistence, which is to say they can be combined in arbitrary combinations.

For those of you in the OpenStack world, this probably makes a lot of sense, because in the OpenStack world you don't really want to know what the storage back end is. All you want to say is: I want a Cinder volume, and I want a Cinder volume of this class. Does it sit on a Fibre Channel array? Does it sit on a direct-attached disk? I don't know, and you know what, I don't care, and I don't want to care. That's a different way of looking at the world than traditional IT operations, especially enterprise-class IT operations.

And then there's the part I actually manage; I'm the product manager for the storage control plane at VMware. There's a virtualized control surface here, similar to the virtualized data services. This control plane needs to be, surprise, abstracted; it needs to be standard; it needs to have policy constructs. Again, in the OpenStack context this makes a ton of sense: when Cinder asks for a volume, it gives a very abstracted command. It says, I want a volume of this class, of this size, attached to this VM. That's all it says. From a software-defined storage perspective within the vSphere product group, this is a command we readily understand, and we're very happy to honor the request. But traditionally, that's not the way vSphere used to work, so this is kind of a new thing for traditional enterprise-class shops, which are much more used to asking for very specific things, like: I want the VMDK on that datastore. It might seem like a slight difference in statement, but it's actually a very fundamental operational change.

Sorry, what's a VMDK? A VMDK is a virtual disk. Okay, he was joking; you got me with that one, sir. An OpenStack conference, and I used the term VMDK. Sorry. All right, so that's minus one point for Alex for being a VMware bigot. Okay. You're lucky I didn't start throwing Hyper-V terms at you; I spent 12 years at Microsoft, so it's very hard to unlearn the Microsoft-isms.

So today we're really going to be focusing on vSAN, which is this little yellow box here, but I just wanted to make sure you understood the broader context. This is something we're doing as an industry and as a company: the storage industry is moving this way, VMware is moving this way, our competitors are moving this way. It just so happens that the OpenStack community, we feel, can take huge advantage, because the operational model and the workloads associated with OpenStack are pretty well suited to this type of storage and compute environment. So hopefully that makes sense.

Let's talk about the basics, then. At a very basic level,
what are we talking about? Well, the interesting thing in the vSphere context is that we've actually had storage abstraction in the product for a long time. We call it a datastore. If you're not a vSphere person, don't worry about it: a datastore is what we traditionally use to abstract LUNs, or disks, or collections of disks. We've always had this abstraction; it's been around for a long, long time. It was convenient for us to think about random blobs of storage as these things called datastores. We also use that exact same mechanism to abstract away implementation details, like, oh, this one's sitting on Fibre Channel and this one's sitting on NFS. That's perfectly okay; it's all a datastore. A datastore is a datastore is a datastore. This is not a new thing in the vSphere world.

We also have this notion of a VMDK. Now you might think, well, a VMDK, a virtual disk object, that's not very revolutionary, Alex. It's not today, but when it was originally invented it was a pretty cool thing that the guest thinks it has a disk, a block object, when what it actually has is a file. In fact, if you dig inside ESXi, the thing we still roughly refer to as a .vmdk file, because that's the original implementation, isn't really that anymore. It's a virtual disk construct that could be stored on an object store, a file system, or a block device, completely abstracted away from the guest. The guest has no idea we're doing this; a disk is a disk is a disk. It just works. So this abstraction is not new, but it's important to realize there's this history of abstracting away implementation detail. What we're really doing is taking the next step and continuing to abstract away detail, as we have been doing for some time.

Within vSphere there's another construct that we call SPBM, Storage Policy-Based Management. This is not data plane abstraction like datastores and VMDKs; this is control plane abstraction. What we're saying is: when you ask for storage inside vSphere, tell me what class of storage you would like. Some people refer to this as t-shirt sizing, or gold/silver/bronze. Tell me the kind of thing you want: I want a high-performance disk that I'm going to use for OLTP transactions; I want an encrypted disk that's going to contain credit card data; I live in Japan and this VM may not leave Japan. Whatever the class of thing you want, that's what I care about. Within vSphere, and not that the implementation detail necessarily matters to an OpenStack consumer, but between us friends we'll talk about it, the way we do that is through storage policy, SPBM. This is not a new feature; it came out in vSphere 5. What's nice is that the abstraction mates up very cleanly with things like Cinder and Nova, because Cinder and Nova don't want to know what a datastore is, they don't know what a LUN is, and they really don't want to care about the difference between a high-performance, FAST-enabled Fibre Channel LUN on a VMAX and a really slow ZFS-based NAS that I built myself out of component parts that's lucky if it can do 10 IOPS an hour. Those things shouldn't matter to OpenStack, and the way we make them not matter in our implementation is this thing called SPBM. I love SPBM because I'm the PM for SPBM. Anyway, every feature has a mommy and a daddy.
Yeah, so anyway. The other interesting thing going on inside vSphere and VMware is that we're moving away from LUNs. One of the big trends you're seeing in our product line is that we're attempting to move towards VM-granular management of all things. Again, this might seem like a trivial change, but if you get into the guts of how it works, it's a pretty big deal. Traditionally, if you look at most enterprise customers deploying vSphere today, what they do is take LUNs, usually large ones, two terabytes or so or larger, pre-allocate a group of LUNs, a group of datastores, into the cluster, and then consume against those LUNs until they're full, and then start over again. That's a pretty normal implementation model for a vSphere customer, which is cool if you only want to do one thing. But what happens if I have some VMs that need encryption, some that need replication, some that need high performance and some that don't, some that are expensive and some that are not? You see where I'm going: being able to carve those LUNs up into multiple classes of service, and to provide additional data services like replication and backup, becomes very complicated. Carving those really big buckets up into little tiny boxes is actually pretty hard; try to carve up a bucket of water, right? You can't do it.

So instead we're moving away from that model, towards VM-granular management. In a vSAN or VVols use case, and vSAN and VVols are both relatively new features, vSAN shipped this year and VVols will ship next year, when you ask us for storage you don't get a LUN. What you get is a virtual disk object, and it's actually just that: an object-based storage system. Both VVols and vSAN are object-based. So you say, okay, I want this virtual disk, and here are the properties I want it to have. This should be starting to sound familiar, I hope, because that's exactly the way Cinder works. So now our plumbing looks a lot more like the cloud operating model that people like OpenStack are asking for. This is not unique to OpenStack, by the way; it's exactly what other cloud platforms want, and it's what our own product, vCAC, vCloud Automation Center, wants too. From a plumbing perspective, as the hypervisor, we know we have to serve multiple masters, but for the context of this room we're talking about things like Nova and Cinder requesting virtual disks.

So when we wrote a Cinder driver last year, we made sure that the driver was based on these virtual disk objects, these VMDKs. When you get an object from Cinder using our driver, you don't get what we refer to as an RDM, a raw device mapping; you actually get a virtual disk. The reason we do that is that it future-proofs you against technologies like vSAN and VVols, which don't support raw disks.

So what's the workflow? What does it look like?
Hopefully this is pretty simple and obvious to you, but I'll cover it real quick. The first thing you need to do is set up your capacity pool. In the Havana release that meant you had to make the datastores available; in the Icehouse release it means you're going to use SPBM to discover your storage tiers, basically. Then your cloud admin, your OpenStack admin, creates their Cinder volume types. The reason we do this is that it's actually the volume type that allows us to inject metadata into the request, through the extra-specs mechanism. I have a little demo of this later so I can show you how it works. Then, when the consumer creates a volume and selects the volume type, because that's tied to the metadata injection in the extra spec, we see the request coming down saying, okay, I want an object of this class. We use the Storage Policy-Based Management infrastructure to select a container to put it in, we can set properties against it if we have to, we provision the object, and then we present it to the VM.

The only kind of weird thing about the implementation, and Dan mentioned this earlier in his presentation, but that was like two and a half hours ago so you may not remember, is that we lazily create the virtual disk. We do not create it when you create the Cinder object, and we do that for a couple of reasons. One is that you could provision a thousand Cinder volumes and never use them, so why should I allocate space on my back end that you don't need? The other reason is that once we know where you're going to attach the Cinder volume, we know what datastores that VM can see. Why create it on datastore X and then immediately Storage vMotion it to datastore Y? That doesn't make sense. If we know, oh, I'm going to attach it to this VM, and this VM can see these ten datastores, maybe I should create it on one of those ten datastores instead of making it over here and moving it. So that's the other reason we lazily create: performance is better, and it helps us decide where to put it.

After it's created, if I detach the volume and then present it to another VM running on another cluster that can't see that storage, we silently move it to a datastore the VM can see. The vSphere feature we use is called Storage vMotion, but it doesn't really matter what we call it; we just silently move it in the background. It looks like you detach and reattach, but what actually happens is we detach, move, and then reattach, all in the background. The question is: is that only relevant for vSAN? No, that's for any datastore, any class of datastore: NFS, Fibre Channel, iSCSI, doesn't matter. Not all datastores are visible to all clusters, so there may be a case where I need to do a Storage vMotion because VM1 is on a different cluster than VM2; there are lots of reasons why I might have to do that, and the code just does it generically in the background. The question is: I thought vSAN was going to make it available to all? The answer is that vSAN is available to all members of a single cluster. If you're within a cluster, you're good; if you're moving across clusters, you'll still have to Storage vMotion. Okay.
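To make that workflow concrete, here is a minimal command-line sketch of the admin and consumer steps. It assumes the Icehouse-era clients and the vmware:storage_profile extra-spec key, which is how I understand the "storage profile" spec mentioned in the talk is spelled in the driver; the type name gold, the profile string "gold profile", and the sizes are illustrative placeholders, and exact flag names can vary between client releases:

```
# Admin, one time: create a volume type and tie it to an SPBM policy by name.
cinder type-create gold
cinder type-key gold set vmware:storage_profile="gold profile"

# Consumer: request a volume of that class. With the VMware driver, no VMDK
# exists yet at this point; the request is only recorded (lazy creation).
cinder create --volume-type gold --display-name my-db-disk 50

# The backing virtual disk is created on first attach, on a datastore that the
# target VM's cluster can actually see.
nova volume-attach <instance-uuid> <volume-uuid>
```

The only coupling is that literal profile-name string: as long as it matches the name of the SPBM policy on the vSphere side, the request flows through.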
The other weird thing about the Cinder implementation, just to give you the nitty-gritty, is that because of the way vSphere works, we don't actually manage disks the way Cinder does. Cinder knows what a disk is; that's all it does. It assigns a disk a UUID, detaches it, and then sometime later comes back and says, remember that disk I made, like, two years ago? Yeah, I want it back now. vSphere doesn't work that way: vSphere manages VMs, and disks are children of VMs. When you detach a disk from a VM, we can kind of forget about it; it may still be there, but we don't really know where. So what we do is cheat, and I'll fully admit this is a hack, but we make it work: we create a fake VM, a metadata-only object, and we make the Cinder volume a child of that shadow VM. The only reason we do that is so that we never lose track of the disk. If you detach the disk and come back a year from now asking for it, we can find it, because the name of that fake VM is the UUID of the Cinder object, so we can always find it. Just a little hack to get around the way vSphere works. This will be fixed in a future version of vSphere, but today we have to work around it. It turns out that making a VM is a relatively cheap operation, so it's not a huge deal. We hide them in a special folder so they don't clutter up your main inventory, but I just wanted to let you know that if you see weird things in your vSphere UI, that's what they are and why they're there. If you delete that VM by hand, we lose track of the volume, so please don't do that.

So how does vSAN fit into all this, Alex? Well, I'm glad you asked. It turns out that vSAN, because it is inherently local storage, has a couple of interesting properties in the OpenStack world. One is that it's directly connected to the hypervisor, so when you scale the hypervisor, you scale the storage. One of the things about cloud, as we all know, is that cloud is all about the perception of infiniteness. In a cloud world we pretend the world is infinite; it's not, but we pretend it is, and the way we achieve the appearance of infinity is by being able to scale very quickly and be very flexible. Well, what's one thing we know for certain about traditional SAN architectures? They don't magically appear: somebody has to install them, somebody has to set them up, and in most corporate environments those are two separate teams, so you have to plan ahead. Usually what people do is buy SAN capacity in advance and pre-provision it, and that can get a little expensive. In this case, by bringing the storage into the cluster, every time you add a node to a cluster, or every time you add a cluster, you automatically add storage capacity, because compute and storage are now one thing. That, to some extent, solves the scaling and planning problem.

I'm also adding storage in much smaller increments. Most storage arrays, and I'm talking about traditional storage arrays, not some of the new players doing scale-out scenarios, have a head unit, or probably a pair of head units, and then you scale out with shelves. If you think about it, every time you bring a new head unit online, that's a pretty significant scale factor, because you just brought on a lot of IOPS capacity, and then you start consuming against it as you add shelves. More modern storage architectures don't work that way: they operate in a peer model and scale out linearly. vSAN is like that.
So vSAN adds capacity with every single member added to the cluster; it doesn't have this big scale factor. You don't add a hundred thousand IOPS in one chunk; you add them in much smaller chunks. We support this today in Cinder as of Icehouse, and we're adding support for Nova and Glance. Actually, the code is already there; we've published it to the community and we're just working with the reviewers to get it upstreamed.

The interesting thing about vSAN is that it was designed as a hybrid storage system from the get-go. Again, for the non-storage people, hybrid is storage-speak for both flash and rotating media. It's kind of like the Blues Brothers joke: what kind of music do you have here? Well, we have both kinds, country and western. So what kind of disks does vSAN support? Both kinds: flash and rotating. A vSAN node always has both a flash disk and rotating media, always. In fact, the minimum configuration for vSAN is three physical hosts, and each of those hosts must have two spindles, one flash and one rotating. Once I get to the architecture slide you'll understand why that's the case. So the absolute minimum number of disks you can use to build your own vSAN system is six, two in each of three hosts. The reason we need three hosts is that we have to have a witness. We scale up to 32 nodes, but we only scale down to three; that's the minimum.

We don't use traditional RAID; we use an array of nodes. When we do failover and availability, we always base it on complete node failure. We're not doing parity striping, no RAID 5 or RAID 6; we take the object and replicate it n times, depending on the settings of the object. The interesting thing here is that the replication setting, that availability setting, is actually a property of the virtual disk, not of the entire datastore. That's the other difference from traditional storage arrays: if I wanted a high-availability LUN, I'd probably have to set that availability down at the RAID group or shelf level, and then start putting things in there because they happen to need that RAID level. In vSAN that's not the way it works: every time I provision an object, I make that decision. Do I want n availability, n+1, n+2, n+3? So you could have two VMs, one hugely important and one completely unimportant, sitting on exactly the same datastore at the same time, running at completely different service levels. vSAN doesn't care; that's just built into the way vSAN works. And how do I get those different service levels? Through storage policy, as I already said: you set the policy, you apply the policy to the object, and that's how we decide. Do I replicate this thing? How much flash cache do I reserve? How many stripes do I make?
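To make that per-object policy idea concrete, here is an illustrative sketch of two policies an admin might define for a vSAN datastore. The capability names are paraphrased from the vSAN rule set as I understand it, and the values are invented for the example, so treat this as a shape rather than a reference:

```
Policy "gold profile"
  Number of failures to tolerate    : 2     (object survives two host failures)
  Number of disk stripes per object : 2
  Flash read cache reservation      : 10%

Policy "bronze profile"
  Number of failures to tolerate    : 1     (the default)
  Number of disk stripes per object : 1
  Flash read cache reservation      : 0%
```

Two virtual disks, one provisioned with each policy, can then sit side by side on the same vSAN datastore while getting completely different availability and performance treatment.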
Now, for those of you familiar with VMware terminology, we have these things called VSAs, virtual storage appliances. Very important to note: vSAN is not a VSA. vSAN is in the kernel; this is an ESXi feature, a kernel-level storage feature, extremely high-performance, high-scale, enterprise-grade storage. So don't be confused about that. If you don't know what a VSA is, don't worry about it, but for those of you who are more vSphere-knowledgeable, we want to be really clear on that point.

So we had three seemingly conflicting goals: we wanted to make something that was hugely simple, we wanted to make something that was very high performance, and we wanted to make something with very low TCO. What's interesting is that if you look out in the marketplace, right now it's kind of a pick-two scenario: you can have any two of those. We wanted all three at once, and to do that we had to invent a completely new way of doing storage. That's why the architecture is so different.

I mentioned this before, so I'll go through this slide quickly: the VMs themselves have individual storage policies, and those policies control the way vSAN works. The policies can concern things like availability, striping, performance, IOPS, and use of flash; all of those are controlled through policy. The policy is assigned to the object when the object is created, and when I say object, I mean in this case a virtual disk. When the object is created, that information is handed to vSAN, and vSAN takes the appropriate action. There are no LUNs here, none at all. vSAN is an object store, an extremely specialized object store that really only stores two things: VM metadata and virtual disks. That's it. In theory we could have implemented a generic object store, but instead we chose to implement a very, very focused object store, and the reason is performance. We're highly optimized for a small number of extremely large objects, because we wanted to make sure we had enterprise-grade performance, and we were pretty successful.

The scale limits of vSAN are quite high: you can have 32 hosts in a single vSAN cluster. Why 32, Alex? Because that's the limit for an ESX cluster; we scale to ESX's limits. That's the point: it's an ESX feature, not a separate thing. 3,200 VMs in one cluster, 2 million IOPS, 4.4 petabytes. Now, that petabyte number is not really crazy amazing until you consider that we're just running in the hypervisor. There's no storage system involved; this is just hypervisors running on local disks, and these are just regular old disks, by the way. I was not part of the team that built this thing, but I have to say I'm very impressed with their work.

There are two ways to build these things out. Some customers come to us and say, look, Alex, we really want something simple: I just want a SKU, a part number I can order on the internet. Fine, no problem; it's called a vSAN Ready Node. It's a preconfigured node with everything in it: buy it from your favorite vendor, plug it into the rack, wire it up, turn it on, and you're good to go. Some people are like, no, no, no, I want that disk, that controller, that motherboard. Fine, no problem: as long as it's on the vSphere compatibility list as vSAN-supported, full stop. The only component of the system that's vSAN-specific is the storage controller itself, and the reason for that is that we need to be able to see the disks. If you have a storage controller that's doing caching, or abstracting disks into LUNs, and things like that, vSAN is not going to work with it.
vSAN wants direct access to the disks, so there is a list of storage controllers that we support; every other component of the system is just standard ESXi. The way you fine-tune this thing is by changing the number of SSDs in a node, changing their capacity, and changing the ratio of SSD to rotating media, so you can have an extraordinarily fine-tuned configuration even within a single node. I can go with two SSDs per node, or slightly larger SSDs per node. By default we recommend about a 10% ratio: if I have a terabyte of rotating media, then a hundred gigabytes of SSD. But that's just a guideline; it depends on your actual workload, so that's going to vary. Yes, sir?

The question is: I thought you could only put one SSD in a datastore? That's not actually correct. It's one SSD per disk group, and you can have as many disk groups in a datastore as you'd like, and more disk groups means more throughput. By definition, a disk group is an SSD with its backing rotating media. If you just leave us in completely automatic mode, which is the default, we'll take every SSD you have, make a new disk group for each one, and then keep adding rotating media until we run out.

The question is: what if I don't have any SSDs? The answer is that vSAN requires SSD. You must have at least one SSD in every participating member of the cluster. Notice I said participating member of the cluster; not all members of the cluster must participate, that's not required in the vSAN infrastructure. And a minimum of three physical hosts: minimum of three, maximum of 32. Okay.

So really we're just talking about an ESX feature, and this is a screenshot of the production product. You can see it sits right alongside all the other features: DRS? Sure. HA? Sure. vSAN? Yes. Notice that down here it's grayed out, but the default is automatic mode. If you leave it in automatic mode, we will self-select the disks and do everything for you. You can turn that off and manually configure it if you want to, but by default it's one checkbox and you're done. There is one extra little step I didn't mention, which is that the hosts must be able to see each other over an IP network, and we recommend that to be a 10-gigabit network. But assuming you have a fully connected cluster with high-speed interconnects, it'll just work. Okay.

So, disk groups. Disk groups are, by definition, an SSD and its associated rotating media, and the reason we build them that way is the way vSAN works: when you write a block, we write it to SSD, always, exclusively. We never, ever write directly to rotating media; we always write to SSD. Sometime later, asynchronously, we de-stage that write from SSD to rotating media, and, this is the fun part, we do it based on the policy. So some virtual disks may never get de-staged; that's perfectly fine.
Some disks may be de-staged right away. Now, when I read a block, if it hasn't been de-staged, I read straight from flash, because it's already in flash. If it has been de-staged, then I have to go hit the rotating media; when it comes back, it gets cached on the SSD tier again, and if I hit it again, I'm back in cache. So we're inherently using the flash as a read/write cache all the time; the way we use it, though, varies depending on the class of the object we're talking about.

We can take big objects, like a VMDK virtual disk, split them into their component pieces, which we call stripes, and then spread those stripes across the cluster. Why do we do that? For availability and performance. When you set a rule, say this virtual disk is n+1, what that means is that the data must be written to at least two physical nodes before the write is committed to the guest. So we write in parallel to two physical nodes, and when those writes commit, then and only then does the guest receive a write acknowledgement. When you read, it will try to read from the local node first; if you're striped, it'll grab the local stripe, and if the data isn't local, it'll go across the network, grab the stripe remotely, and carry on. So the guest perceives one common storage pool across the entire cluster; what's actually happening is that we take the object, stripe it up, and push it down across the cluster based on the rule set.

What's interesting about this is that we can scale up within a single node, or scale out by adding additional nodes. As we build up, we can keep adding hard drives to individual nodes and continue to scale up, or we can scale out by adding additional nodes on demand. Not all nodes need to be the same size. You're going to get the most consistent performance if your nodes are similar, but there is no requirement that they be the same, so you could have ten terabytes on node one and one terabyte on node two; perfectly fine. You could have three SSDs in node one and one SSD in node two; that's fine. Operationally you probably want them to be similar, because that way all the VMs receive similar performance as they get moved around the cluster, but it's not a requirement, and the nodes don't need to be from the same manufacturer. You can have a mix of, you know, HP and Dell, or rack mounts and blades; it doesn't matter.

And what that gives us is a very linear scalability factor: we scale linearly based on the number of nodes. Whatever one node's performance is, you take that times the number of nodes you have. So if you have eight nodes and you add another eight, you basically double your performance; it's a very linear curve as the cluster size increases. From a storage perspective, that's exactly what you want. It turns out the dirty secret of storage is that if you have twice as much gear, you don't always get twice as much performance; in our case you do, because of the way we're architected.

Okay, that's a lot of stuff. Any questions? How about if we take a look at it actually working? How about that? Nobody wants to see it working? So, I am not as brave as Dan, so I brought a recording. Let's say we have a vSAN cluster.
What you actually see is a datastore. When the cluster is enabled, you just see it as one of the many datastores that are attached. Normally, when you set this up, you'll build out your physical cluster and then go in and create storage policies, and storage policies are going to be whatever classes of storage you internally want to support. For a lot of my customers there's only one class of storage, you know, gold, for lack of a better term. But you may have a situation where some of your VMs are more equal than others, and you may want to promise them a higher level of IOPS, or more redundancy, and the way you do that is through storage policies. The policies can be whatever you want, and they're configured by the administrator.

This is just showing you what we've got here: a very simple vSAN implementation with three physical hosts. The next thing is that we need to create our Cinder volume types, and because we're real hard-core developer types, we're going to use the command line instead of the wimpy UI way, but obviously this works either way. Actually, you can tell this was done by my engineer, because it's all command line, all the time. So what we're going to do is create a gold volume type, and the next step is to add the extra specs that allow us to connect it to the SPBM policy we saw on the previous screen. Remember what we said: extra specs are just a delivery vehicle. You can see that the VMware extra spec is called storage_profile, and it passes the string "gold profile". If you recall, on the previous screen the policy was called "gold profile"; that's what connects the two. It's a very simple mechanism, just a literal string we're passing, and as long as those two match, everything's golden.

Now we've gone forward in the video a little and created a couple of different classes. Once that's set up, you're really probably only going to do it once. The actual consumer experience is much simpler: you go to the website or to the command line, you request a storage object, you say what kind you want, and we give it to you. A very simple user experience; again, the implementation detail underneath is completely hidden from the user. I'm not going to go all the way through this, because I'm assuming you all know how Cinder works. From this point forward we're basically talking about normal, regular Cinder-isms: it appears as a volume type, you consume the volume type, nothing really amazing or special. On the back end, what happens is we translate that Cinder request into a storage policy management request, pass it down to vSAN, and create the object. I'm going to go ahead and pause here; this video is up on YouTube, so you can take a look at it. Also, it's in the lab, so if you want to go build a vSAN lab, you can do that; it's pretty straightforward.

Okay, I need to mind the time. All right, so, in summary: what we've seen from OpenStack customers, what customers are telling us, is, you know, we really need low-cost, high-performance storage here. We don't need high-end replication solutions, we don't need synchronous replication,
we don't need offline snapshots, we don't need all these fancy things. We need something that's performant, stable, and low-cost on commodity hardware. Surprise: that's exactly what vSAN is. It's all of those things, and it's very simple to deploy and operate. From our perspective, the best part is that it's integrated with vSphere; it's a vSphere feature, not a separate thing. So from our perspective this makes a lot of sense. We have a huge commitment to OpenStack within VMware, we have a storage product that seems to fit these use cases, and when we talk to customers about this, what they tell us is, yeah, this makes a lot of sense. So we're seeing a lot of people take this up.

Does this mean we expect all of our OpenStack customers to go directly to vSAN? Probably not. The vast majority of vSphere customers today are running on SANs, and most of them are really happy with those SANs. That's great; we love SANs, they're fantastic at what they do. So if you are implementing OpenStack in your production environment and you want to carve off a piece of your existing SAN and put OpenStack on it, it'll work just great. Everything I just talked about works perfectly well against SAN infrastructure: Fibre Channel, iSCSI, NFS, it all still works. This is just another option to look at, okay? So with that, I think I am right up against my time. Thank you all very much for your attention; I'm happy to take questions. Thank you.

The question is: what happens if I'm running a VM on a cluster that's not a vSAN cluster, can I consume vSAN storage? No. What if I'm on the same network? No. What if I have a great personality and I'm sitting really close? Still no. vSAN only manages storage within a single ESX cluster, exclusively. Despite the name, we're not actually a SAN: we don't support NFS, we don't support iSCSI, we don't support external SAN protocols. If you want a centralized storage entity serving multiple clusters, there are some really great products out there to do that; that's not what vSAN does.

The question is: what if it's a host in that cluster? If you're a host in the cluster, then you can consume that storage, whether you have local storage or not. It's a cluster-level asset that can be accessed evenly by all the members of the cluster, but only the members of the cluster, not across clusters. And you can have a non-uniform cluster configuration; that works fine. Now, there are performance implications to non-uniform clusters, so take that with a grain of salt, but will it work? Can you consume the storage of a foreign machine? Absolutely.

The question is: so that means I don't have to have SSDs in every single host? Correct, keeping in mind that you could have performance implications from non-uniform access; some VMs may experience higher performance than others. The other thing is that if you have members of the cluster that are not participating, it will limit the total number of VMs you can support on a single cluster, because we distribute the metadata across all members that are vSAN-enabled, and the metadata limit is a per-ESX limit.
We can support 4,000 objects per ESX server, but those 4,000 objects are only distributed to participating members. So if you have a 16-node cluster with eight vSAN nodes, you're going to get half the scalability of a 16-node cluster that's all vSAN nodes, in terms of the number of objects we can support. So there's some subtlety there. If you read the vSAN deployment guide, we strongly suggest that all members of the cluster participate, because it's more predictable that way and it's the safest option, even if it's only two disks in a host. You may have a case where you have 16 members of a cluster, eight of which have two disks and eight of which have ten disks; that is totally fine. The classic case is: I have blades and I have rack mounts, and I want the blades to participate. We'd say that's fine, but you probably want to take the two spindles available in each blade and have them participate, even though it's a relatively small amount of storage, because that way they can take part in the process: they can be a witness, they can store metadata, they can form quorums. Our design assumption is that most members participate, but the reason it has to work when that's not the case is: what happens if I have one SSD in a host and that SSD fails? You don't want the host to just fall down and die at that point. So we have to support the mode where not all members are participating, and since that already has to work, you can use it by design, as long as you're willing to accept the performance limits you're imposing there. Yes, sir?

The question is who's doing the scheduling, and the answer is you can do either. You can specify a datastore, in which case Cinder is basically doing the scheduling, but we would prefer that you just tell us what kind of object you want and let us place it, because we know much more about what's going on in the datastores than Cinder does. But some people want more control, so we have to allow both ways. We could have what I'd call a three-beers conversation about who should be doing scheduling; it's more of a philosophical debate. Mechanically, we have some advantages because we're closer to the disks. There's also policy handoff: when we talk to the arrays, we give them policy hints, which Cinder can't do. So if you're not using our policy infrastructure, you don't get the advantage of the policy hints, and your performance will drop. That's the other reason to use our policy infrastructure. Other questions? Yes, sir.

No, it's a persistence tier; the question is what goes into that tier. If you're a storage guy, it would be more accurate to think of it as dynamic auto-tiering at a block-granular level, or, sorry, that was inaccurate, at a stripe-granular level. Are you a storage guy? Oh, okay. In the storage world those words mean things; you say auto-tiering to a storage guy and they light up: oh, you're doing auto-tiering. What happens is that if a stripe lands on an SSD, we consider that a write commit. If it were only a cache layer, that wouldn't technically be a commit, that would be a dirty buffer. For us it's a commit, so we consider it a valid commit and report it to the guest; later we may move it. So to the big-S Storage world, the pointy-haired storage guys, that's not caching to them,
that's auto-tiering. To a normal human I think it's the same thing, but we have to use our words carefully, because in the storage world those terms mean something. There are actually two factors that decide when something gets de-staged: one is how often you're accessing it, and the other is the policy you've set for the object. Some objects may have higher priority than others, causing them to be de-staged sooner or later; we call it the elevator mechanism, you take the elevator down. So let's say you have VM1 and VM2. VM1 is set to 100% flash, VM2 is set to 0% flash. They both commit a write at exactly the same time; both of those writes commit to SSD, and both guests receive exactly the same acknowledgement at exactly the same time. One millisecond later, VM2's write gets de-staged; VM1's write is not de-staged. Then they each read the same block back: one gets a really fast access, the other gets a slow one. So, is that caching or is that auto-tiering? Fine, close enough. What I'm saying is that for all practical purposes the distinction between those things is not that big. Mechanically, what's happening is different, but the experience of the user, the experience of the VM, is identical: we're using flash for IOPS and rotating media for capacity.

SPBM? Now you're talking about my baby here, man: Storage Policy-Based Management. Yep. The policy is actually not expressed as how long data stays in flash; it's a percentage of object size guaranteed, so it's a reservation guarantee. But mechanically it's basically the same thing: the bigger your guarantee, the longer your writes are likely to remain in flash. So it's expressed as a percentage of object size.

One SSD and one rotating medium at a minimum? Yes. Well, you can lie to us and tag the SSD as rotating media, we wouldn't know, but yes, we require it: we will not enable a disk group unless you have at least one of each. It won't work otherwise. So the question is, why that crazy requirement, Alex? This doesn't make any sense to me. The reason is that architecturally we wanted to make sure we had a uniform de-stage layer, which gives you a more even performance experience. The problem is that if you have SSD without rotating media, you have no de-stage target, so architecturally we can't assume we can take the de-stage down to rotating media. We assume two classes of disks, fast disks and slow disks. If you take the slow disks away, now we're just an all-flash array, and those things already exist; that's Pure or Violin. We're just not in that business. If you want the world's fastest storage with ultra-low latency and a million IOPS, buy a Violin; they're really good at that.

I don't think I said that; I think I said that it happens that vSAN is very well attuned to OpenStack workloads. vSAN is not an OpenStack-only product; it's a generic storage product. And the reason we use both SSD and HDD is that in our research, what we found is that on SSD the cost per IOPS is very low but the cost per gigabyte is extremely high, and HDD is the opposite: the cost per IOPS is high, the cost per capacity is low. So by combining the two, you get low cost per IOPS and low cost per gigabyte on the same platform. It's an architectural decision we've made, and you guys will tell us whether it's right or wrong, because if it's wrong,
We've made you guys will tell us whether it's right or wrong right because if it's wrong You won't buy it But we're pretty confident in this design and if you look at what's going on in general in the storage industry Right a lot of people are moving to this this hybrid SSD HDD model There are definitely use cases like high-frequency trading right NASDAQ where you want the absolute minimum possible latency with millions of IOPS We're not that we're a general purpose storage system for 80% of your workflows Those storage systems are designed for 5% of your workloads And they're really really really good at that and we didn't think that we could be a better High-performance low latency array than violin or pure or the others right on the other hand We thought that we could produce a system that was had a much better ROI for 80% of your workloads And that's the system that we designed You can definitely argue we made a mistake, but that's that's the rationale Yeah, I think we had a question back here. I think I'm gonna have to cut this out They're gonna kick us out. I love this conversation by the way The next step is you're gonna have to buy me beers to continue answering questions Which is totally legal bribing your presenter with beers Totally cool and open stack summon one more question and then I think they're gonna kick us out of here But I'm happy to continue the conversation. Yes, sir The question is does vSAN have distance replication? vSAN does not but vSphere does So vSphere has replication if you want to use it vSAN does not have its own replication engine Absolutely using a Using vSphere replication service keeping in mind that vSphere replication service has a minimum rpo of 15 minutes So if that's what you're looking for then then that'd be an appropriate way to do it Okay, I'm gonna have to stop the questions here I love the questions. Happy to talk to you outside, but they're gonna kick us out of the room because it's after 6 o'clock Thank you all very much. Thank you