All right, folks, we're getting started here shortly. Let me first of all introduce myself. My name is Kamesh Pamaraju, one of the speakers in today's session. I'm a product manager at Dell focused on OpenStack solutions. And here with me today is Neil Levine, who is now a director of product management at Red Hat. As some of you probably know, Red Hat bought Inktank, the maker of Ceph, what, like two weeks ago, Neil? Just recently, so it's a new development. Lots of exciting things going on. Just as a quick bit of background, Dell had been working with both Inktank and Red Hat prior to this acquisition, so we've had a lot of experience working with both companies. In fact, this morning we just did a case study with the University of Alabama, which I'll talk a little bit about later; they have successfully implemented Ceph in their university environment with HPC clusters.

So quickly, the kind of things we're covering today: Neil is going to talk a lot about what's coming in Ceph. There's Inktank Ceph Enterprise that they're working on, with exciting new features coming in, things like tiering and erasure coding. And there's CephFS, the file system, which some of you may know about, coming down in one of the upcoming releases. So let's take a quick check of the audience. How many of you actually use Ceph today? Okay, about 20, 25% of the room. How many of you have heard about Ceph? Okay, everybody has. That's why you're here, right? Excellent. So in the first part, Neil's going to talk about what's coming in Ceph. He'll cover a lot of the use cases. Ceph can be used for a variety of workloads, whether it's databases or even Hadoop; he's going to talk a little bit about that. Whether you have a capacity or a performance use case, we will talk about that. And then later, I'm going to focus on reference architectures. What does it actually take to go build a Ceph cluster and implement it? What are some of the considerations around hardware, around redundancy requirements, networking requirements? So we'll get through that. We each have about 20 minutes. We'll reserve questions towards the end of the session, and of course, feel free to talk to either one of us offline. So with that, I'm going to hand it over to Neil, who will cover the first part, and then I'll come back later for the second half. Thanks.

Thank you, Kamesh. So as Kamesh said, I'm Neil Levine. I was the VP of Product for Inktank. That changed about two weeks ago, and I'm now working for Red Hat. So I'm going to give a little bit of background about what Ceph is, just a little bit on the technology, explain about the product, and then I'll go through the roadmap. There's a lot still being worked out, and there are a lot of questions people have about how Ceph is going to fit into the wider Red Hat portfolio. Feel free to ask questions. I may not have answers, but I can certainly give you as much information as I have so far. Okay, so Ceph is an open source, massively scalable, distributed storage system, which is a lot of words for "it does storage." And it does storage in three different ways. So we talk about it being a unified storage platform, which means it does object, block, and file all in the same technology. Not necessarily the same single cluster, though you can run it in a single cluster if you want, but it's the same software for delivering all three capabilities. The object storage is very similar, for those of you who know S3 or Swift, and we support both of those protocols as well as having our own native protocol.
And from an OpenStack perspective, we can replace the standard Swift implementation with our own, but we still fully hook into Keystone for authentication. We have geo-replication capabilities, an API for pulling out your billing details, and so on and so forth. So whether you're doing a public cloud or a private cloud, this is the object storage piece. The block storage, which is probably what most of the OpenStack community uses it for, is a pretty sophisticated "enterprise-grade", in quotes, block storage system, which supports snapshots and cloning, which I'll go into in a second, and again is fully integrated into OpenStack through the Cinder API service. But you can also access it through the Linux kernel, though that's not typically done within OpenStack deployments; if you're running things on bare metal, that's how you'd use it. And there are iSCSI interfaces and so on and so forth as well. And finally, the file system, which is actually the oldest part of the project, or the original part of the project, is a distributed POSIX file system. So there's some overlap with what Gluster does, though it has a very different architecture. And again, you can access that file system through the Linux kernel, or through traditional protocols like NFS and CIFS. And I'll cover a little bit about the roadmap for the file system in a bit.

So from an architectural point of view, people sometimes get confused, and certainly analysts pigeonhole us in odd places. At the lowest level, we are an object store. That is what everything is built on. And we call that RADOS, which is an acronym I won't spell out. So RADOS is where almost 10 years' worth of development has now gone to build this incredibly resilient, self-balancing, self-healing distributed object store. And this system here is the foundation for everything that we do within the Ceph project. So there is a library called librados, and that gives us the flexibility to then build API translators like RGW, the RADOS Gateway, which provides the S3 and Swift connection, or to build the block device, RBD. And those software libraries have been hooked into KVM and other hypervisors to allow the block device to appear as a normal disk to the virtual machines. And then finally, of course, the file system itself, which stores both its data and its metadata all within the object store underneath. So yes, we are an object store architecturally, but from a use case and a product point of view, we do object, block, and file.

Just to go a little bit more into the components side of things: Ross Turk, our VP of marketing and community, often says we have more in common with BitTorrent than we do with a NetApp. RADOS at the lowest level consists of two main components, which are the monitor nodes here on the left, and the OSD nodes, or OSD processes, which are the software processes that look after an individual disk or a RAID set. All of these software processes communicate with each other through a gossip protocol, maintaining a peer-to-peer communication mechanism, which is how we're able to dynamically respond to failures where a single node or a set of disks goes out of service. The monitors keep an eye on all of these OSD processes and maintain what's called a CRUSH map, or a cluster map, so we can formally tell the clients these are the current processes and disks which are running and have data for you, and these ones don't.
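As an aside, to make "clients talking directly to the storage system" a little more concrete, here is a minimal sketch of a client writing and reading an object through the Python librados bindings (python-rados). The pool and object names are purely illustrative, and it assumes a reachable cluster whose configuration and keyring are at the usual /etc/ceph paths.

```python
# Minimal librados client sketch: connect, write one object, read it back.
# Pool name 'data' and object name 'hello-object' are illustrative only.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

ioctx = cluster.open_ioctx('data')                         # open an I/O context on an existing pool
ioctx.write_full('hello-object', b'hello from librados')   # write a whole object in one call
print(ioctx.read('hello-object'))                          # read it back directly from the OSDs

ioctx.close()
cluster.shutdown()
```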
And so all of these components interact continuously, which is why we have this incredible scale-out: you just keep adding software processes, the cluster map gets updated, and the clients start talking to the storage system on the back end. So we created a product around Ceph about six, seven months ago, where we have Ceph as the upstream project, which is definitely how Red Hat thinks about things, and Inktank Ceph Enterprise as the downstream, or the commercial product that we have. The product consists of four core components, with the open source software at the very base, or the core, of the product here. And importantly, and this is the thing people often get confused about, they ask, what more do I get with a commercial product? But actually, from a software point of view, there's less in the commercial product than there is in the full community version. There are a lot of things which we do not consider to be production ready, or we haven't done formal testing of. So we ensure that the bits that we ship only contain stuff we have tested and QE'd. And right now that is the object and the block side. So when we talk about our product, we're positioning it as the object and block storage. The file system exists for now in the community, or upstream, project only. One of the things which is going to change is Calamari, which is the management platform. For those of you familiar with the Hadoop space, it's kind of similar to Ambari: it handles a lot of the management and the graphing and the sort of higher-level interactions with the cluster. That was a proprietary piece. That's now going to be open sourced as well. And it's kind of the natural third component to your storage cluster, in addition to the monitors and the OSD processes: you've now got the management node as well, which you can interact with through REST and other means. Enterprise plugins, I won't go into those, but we're going to try to get stuff available, so if you're running in a Windows or a VMware ecosystem and you want to access a Ceph cluster, then you can do it through those plugins. And finally, of course, the support services.

Okay, so a little bit about, and I'll skip that screenshot, the use cases, and the thing that I guess most of you are probably interested in and which relates directly to what we're doing with Dell, which is the OpenStack use case here. And here you can see all the places where we interact with the components of an OpenStack infrastructure. So on the object storage side, we fully implement the Swift API. We lag a little bit behind, obviously, because there's no sort of formal standard and we try to keep up as best we can, but we still fully hook into the Keystone API for authentication. And then on the block side, we hook into Cinder. So the ability to create a volume, delete a volume, or ask for a snapshot is all done through the Cinder API, and RBD, as we call the block device, has been supported since the Folsom release. And typically people will also use RBD as a back end to Glance. So you'll store your images, your sort of canonical OS images, your gold masters, as well as snapshots and so on, potentially in Glance. But because it's all stored in RBD, it makes it very efficient to then boot from those volumes at a later date.
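Since the Cinder and Glance integration comes up a lot, here is a rough sketch of what pointing both services at RBD looked like in Icehouse-era configuration files. The pool names, Ceph user names, and the libvirt secret UUID are placeholders, not the definitive settings for any particular deployment.

```ini
; cinder.conf : point the volume service at an RBD pool (values are placeholders)
[DEFAULT]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <libvirt-secret-uuid>

; glance-api.conf : store images in RBD so boots and clones stay copy-on-write
[DEFAULT]
default_store = rbd
rbd_store_pool = images
rbd_store_user = glance
rbd_store_ceph_conf = /etc/ceph/ceph.conf
```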
As I mentioned before, the way that the actual VMs connect to the block device is through the hypervisor: the hypervisor, through QEMU/KVM, or you can use Xen as well, will natively talk to the block devices. It doesn't go through the kernel RBD driver; it goes directly from the user-space hypervisor process down to the block device. So the reason that Ceph has become pretty popular is specifically the way we handle things like copy-on-write. If you want to boot up 100 VMs instantaneously, we don't create a hundred separate individual images or block devices; there's one authoritative image and it's simply copy-on-write for all of the other 99. As they make modifications while they're booting, to logs and so on, then the bits start to change on the back end. So you get very, very quick booting through the copy-on-write mechanism. We also do snapshots and incremental snapshots, so it's very easy to take deltas and to back those up off site, or even to put those backups into the object store if you want to, whether it's on the same site or a remote site. And the other thing which was worked on for Icehouse, with still some work ongoing in Juno, is around ephemeral volumes, so you can have your ephemeral boot disks also backed by RBD, and with one or two things left to fix we should be able to have almost diskless nodes on the Nova side, so you can just run very, very cheap compute with all of the storage held over the network, remotely on Ceph. So the good news is we were already a Red Hat partner before the acquisition, and all of this had been formally certified with Red Hat, so all of the dependencies and the APIs had all been tested and given a gold stamp of approval. So if you're using RHEL OSP 4, the current version of the OpenStack product, this is all ready to go and fully supported, with the Ceph components running on RHEL 6 currently and a RHEL 7 RC build due out any moment now.

Very quickly, just to go through some of the other use cases. Some people are obviously doing object storage outside of the OpenStack framework, and there we see fairly traditional storage as a service, or cloud storage, depending on your buzzword, using just the RADOS gateways and our high-availability mode here. But we're also seeing a large uptake in a number of customers who are actually getting rid of the RESTful interface entirely and just going straight to the native protocol, which will give you almost a 20x performance increase in some instances, and really allows you, if you've got a custom application on the top end, to talk to the object store with some really fine-tuned parameters, where you can control stripe sizes and other such things, which will have significant performance impacts if you're running latency-sensitive applications. So for some of these sort of bigger customers, this is becoming an interesting trend that we've seen.

Okay, so just to touch on the roadmap. In the next couple of weeks, don't ask me for the formal date just yet, but hopefully the end of this month, maybe early June, we've got the next formal version of the product coming out, which is Inktank Ceph Enterprise 1.2, and the headline features there, as Kamesh mentioned, are erasure coding and cache tiering. So just to go into a little bit of detail around those. The cache tier, for those of you running RBD or wanting to run RBD with an OpenStack deployment, will allow you to get better performance on your block devices, which I'll go into in a second.
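Before getting into the cache tier details, here is a rough sketch of the copy-on-write flow just described, using the Python rados and rbd bindings. The pool name, image names, and sizes are purely illustrative.

```python
# Sketch of copy-on-write cloning: one protected "gold master" snapshot,
# many near-instant clones that only consume space as each VM diverges.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('images')          # illustrative pool name

r = rbd.RBD()
r.create(ioctx, 'golden-image', 10 * 1024 ** 3,          # 10 GiB master image
         old_format=False, features=rbd.RBD_FEATURE_LAYERING)

img = rbd.Image(ioctx, 'golden-image')
img.create_snap('base')                        # snapshot the master
img.protect_snap('base')                       # clones require a protected snapshot
img.close()

# Each VM disk is a copy-on-write clone of the same snapshot.
for i in range(3):
    r.clone(ioctx, 'golden-image', 'base', ioctx, 'vm-disk-%d' % i,
            features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```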
So the cache pool is a transparent layer which you put in front of the base, or backing, pool. Typically the cache pool will sit on SSDs, or even on a higher-capacity network segment, and it's transparent to the end user, but they should suddenly see a performance boost. And you can configure the cache in two modes. There's a traditional write-back mode, where all the reads and writes go to the cache initially, and the cache flushes the data down to the backing pool based on tunable parameters: time, utilization, and so on and so forth. So that's a great way of just ensuring that the hot data stays in the right place and the cold data goes to the backing store. There's a secondary mode as well, which is read-only. So if you're doing archiving, or you're writing stuff which you're not expecting to be read, this is a great way of doing it: the data just gets written to the backing store first of all, and only the hot data gets pulled up as and when it's read. So there are a bunch of tunables you can use there to customize it, and for OpenStack we think this will help: if you're trying to run, say, MySQL or other such things in a VM, this is a great way to boost the performance.

The other feature we're very proud to get out there early, compared to some other projects which are working on it, is the erasure coding. This has actually been a huge project; it spanned a couple of releases. We started it sometime early last year, and the cache pool and the erasure coding are actually designed to work together to a certain extent. So erasure coding, for those not familiar, is an alternative way of doing data integrity. The traditional data integrity is replicas: you just keep multiple copies of your cat image, or whatever it happens to be, at 2x or 3x worth of storage. Whereas with erasure coding, you chop up the image and you put some parity bits on it, so it's kind of similar to RAID. And so typically you'll only have to use 1.5x or 1.6x worth of storage to get the same SLA in the case of nodes going down or chassis going down. And again, with the erasure coding you can set some tunables here. You can set what are called K and M parameters: basically how much parity you want to keep. The more parity you keep, the better your resiliency is, but it comes with some performance hit. When we release the formal product we'll have some basic benchmarking which will allow you to see what the gap is from replicas to erasure coding, based on sequential and random reads and writes and different image sizes. So hopefully that'll give some people a guide as to the kind of performance penalty that you pay for erasure coding. But obviously it comes with a huge cost advantage: if you've got large amounts of data, you use significantly less space for the same data integrity.

So just to round off the story about where we're going: that September on the slide is wrong and should actually say Q4, of course. And some of it is still potentially subject to change, but early indications are that nothing is going to change right now. As Kamesh mentioned, we're still going to be working on the file system. There is some overlap with Gluster, but they actually probably fulfill different use cases, and architecturally they lend themselves to different use cases pretty well, actually. And HDFS is certainly one area we've always wanted to get to first, and we're going to continue to do some work around getting CephFS in as a formal replacement.
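Going back to the two headline features for a moment: here is roughly what wiring an erasure-coded backing pool behind a write-back cache pool looks like with the ceph CLI in the Firefly-era releases. The pool names, placement group counts, k/m values, and flush ratios are illustrative, not recommendations.

```bash
# Erasure-coded base pool: k=4 data chunks plus m=2 parity chunks,
# so roughly 1.5x raw space while tolerating the loss of any two chunks.
ceph osd erasure-code-profile set ec-profile k=4 m=2
ceph osd pool create cold-pool 128 128 erasure ec-profile

# Replicated cache pool (ideally on SSD OSDs) layered transparently in front.
ceph osd pool create hot-pool 128
ceph osd tier add cold-pool hot-pool
ceph osd tier cache-mode hot-pool writeback
ceph osd tier set-overlay cold-pool hot-pool

# The cache needs a hit set to track hot objects; flushing and eviction
# are driven by tunables such as these.
ceph osd pool set hot-pool hit_set_type bloom
ceph osd pool set hot-pool cache_target_dirty_ratio 0.4
ceph osd pool set hot-pool cache_target_full_ratio 0.8
```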
And then on the RBD side, we're going to be doing some mirroring work, which allows you to stream out a copy of a block device to a secondary site. It's basically real-time streaming with point-in-time consistency, which we're really hoping will open up more of the database use case. Combine that with the cache tier, and also the fact that the RBD kernel module is now going to be compatible with the kernel that ships in RHEL 7, which will be GA'd hopefully next month, I guess, and suddenly this is going to really position Ceph to handle latency-sensitive applications, particularly databases. We're working with some of the vendors in the database world to optimize and tune all of this, and hopefully come out with some cool reference architectures which will show you what you can do. And as I mentioned, Hadoop will probably be one of the early use cases we'll try to focus on for the file system. This is obviously subject to ongoing conversations with Red Hat in terms of resources, but we would really like to get CephFS to GA by the end of the year, if not the beginning of next year. And then I think at that point we'll probably call it Ceph 1.0 and say that the project has reached all of its original design goals. And the last slide here is just to let you know, if you are running Ceph with OpenStack or thinking about it, we're running virtual training classes: pick your time zone, sign up, and learn how to use Ceph with OpenStack. Thank you.

Thanks, Neil. So we'll reserve questions for Neil until the end of this presentation. That's all great stuff. Ceph, as you can see, has a lot of different use cases, and there's lots of great stuff coming in the roadmap. The question now, I'm sure, many of you are asking yourselves is: how do I consume this stuff? How do I actually implement a Ceph cluster in my environment? What kind of hardware and networking things should I consider and worry about? So I'm going to talk a little bit about that. I mentioned earlier we work very closely with Red Hat and Inktank; now it's, of course, all Red Hat. We do a lot of co-engineering work with them on a daily basis. And we have this thing called a reference architecture. Now, reference architecture means different things to different people. To us it is essentially: what do you need in terms of servers, networks, storage, disks, racks? How do you set it all up and how do you get it up and running? So it's not deployment; it's really all about the architecture around it. Let me tell you a few things that I'm going to cover. It's all about implementing your Ceph cluster. So what are the things you need to consider? Many of the things I'll talk about today are things you need to consider, questions you need to ask yourself, which I'm sure many of you will, and I just want to get the thinking process going here. We are in the middle of defining a reference architecture in conjunction with Red Hat as we speak, and we will have a reference architecture documented in the upcoming release, which I'll talk about in a second. So where should you target Ceph? What are you trying to replace with it? What kind of use cases are you looking at? I'll also give a few examples of the kinds of things we're thinking about within Dell in terms of reference configs, and then end with a customer case study, the University of Alabama, which I mentioned earlier. So what kinds of questions should you be asking yourself? Ceph is great, it's wonderful.
There are 200 people in the room, probably interested in going off and trying to implement it on your own. These are standard questions, right? What are your business requirements? What are you trying to achieve with Ceph that you are not able to with other storage products, and from a pricing sense, you know, versus NAS devices or object stores, why Ceph? I'm sure you have budget considerations in mind. And as the case study that I mentioned earlier talks about, a lot of this is organizational commitment. It's a new technology; it's a scale-out technology. You're probably looking at avoiding lock-in through open source technologies and industry standards. You probably have enterprise IT use cases versus cloud application use cases, and I'll talk a bit about that in a second. You may be looking at spiky data usage: maybe you have end-of-the-month processing and all of a sudden you need gigabytes of extra capacity. Is that a spiky usage pattern? So those are the kinds of business considerations.

Again, on the sizing side, what are you starting off with? Is it 100 terabytes, half a petabyte? What's your initial size and where do you see this growing? Is it going to grow into a five-petabyte cluster? Is it going to be in a single data center or multiple data centers? Those are some of the things to think about. And the most important thing is workload. What workload do you want to put on this? What kind of IOPS and throughput requirements do you have? Are you going to have a streaming video application running on this? Is it going to be a Hadoop analytics workload? Or is it just dev/test? Think through those things, because every one of those use cases has implications for the kind of architecture you'll need to put together on the back end. And finally, what kind of data will you be storing on this? Is it going to be ephemeral data? Maybe you're just running a lot of web apps that don't need persistent data on the back end: you spin up a VM, it uses some data, you're done with it, the data goes away. That's ephemeral. Is that the kind of workload you have? Or do you have some database that you want to run on this? That's persistent data. Is it an object store? Do you want to store a large amount of video files and photos and email conversations and unstructured data? Then that's an object store. Or is it a traditional block store? These are all the kinds of things you have to think about as you plan your Ceph implementation.

I'll give you a couple of guidelines from our conversations, because this is kind of how it maps out and the way we are looking at it. Think of traditional IT as one sort of approach, where you have your traditional SANs: EMC, NetApp, Compellent, EqualLogic. Maybe you're looking at replacing those because of cost reasons, or maybe you want a Ceph implementation running in parallel with them; both are possibilities. The good news with Cinder, and I'm sure many of you probably know this, is you can have multiple back ends. You can have Compellent, you can have NetApp, you can have EMC, and you can have Ceph, and you can multi-tier between them. So there's that possibility too. And then there's the sort of sweet spot for Ceph, where you can use it as a back end for virtualization and private clouds in the traditional IT environment. Or you're looking at a cloud app, massive scale, scale-out, and you want to use an object store. Like I said, video files, photos, massive scale.
You're generating a lot of those kinds of data and you want an object store, sort of an equivalent to a Swift cluster; Ceph is another alternative for you. And then of course you have the use case that Neil was talking about, which is a block store for OpenStack. So those are the kinds of things you have to think about when you're looking for targets for a Ceph implementation. You can see here there are two dimensions. One is capacity: are you looking at large-capacity, cold-storage, archival types of data? Or are you looking at performance? These two decisions will lead to very different back-end infrastructures, for example whether you want to use SSDs or cache tiering and things like that. So keep those considerations in mind.

So, some of the things you have to think about. Redundancy, resilience, and replication are going to be a key decision point for you, right? It's obviously a trade-off between cost and reliability. You can have 2x, 3x, or 4x redundancy, where effectively Ceph is keeping multiple copies of your data in your cluster. So in effect you're using more storage. Of course you'll get reliability, but at a cost. Whether you want 2x or 3x depends on your workload. What happens if your data goes away? Are your customers going to come screaming at you? Are you going to lose a million dollars? Those are the considerations for that. Now, Neil mentioned the CRUSH algorithm. This is the cool part about Ceph; it's the intelligence behind a Ceph cluster. It determines where to put the data automatically, right? It does it automatically for you, so you don't have to worry about it. But then you have to design the failure zones. Lots of things can fail. OSDs map to the disks, and a disk can fail, which happens all the time. If you're using SSDs for journaling, those can fail. Your node itself can go down; an entire node with a lot of disks hanging off of it can go down. Your entire rack can go away. Or if you're replicating across multiple data centers, you might want availability zones. So these are all the different considerations for failure domains, and that determines how you define your CRUSH configuration (there's a rough sketch of what such a rule looks like below). So those are the things to think of. And then from a storage pool standpoint, if you want higher performance, you want SSD pools with multi-tiering, for example. You can have performance pools and capacity pools. Again, going back to that decision: do you want performance or capacity? You design your pools accordingly. And then there are these things called monitor nodes. The monitor nodes are the ones that are actually keeping this CRUSH map intact. They are the ones that watch what's happening in the cluster, updating the map as things go bad. One of the nodes goes down; what happens? The CRUSH algorithm kicks in, the data gets copied onto other nodes, and you still have 3x redundancy. It all happens automatically, which is what makes Ceph great. But then you still have to design for failure of the monitor nodes, right? What if a monitor node goes away? So you need to design them in such a way that you keep them across those failure zones; you might want your monitor nodes in different racks in case a rack goes away. So those are the considerations around replication. Finally, what happens when a node goes down or a disk goes down, and your replacement scenario is: I'm going to take out this disk? In the meantime, Ceph is doing its thing; it's doing its 3x replication.
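Here is the rough sketch of a failure-zone rule promised a moment ago, before picking the replacement scenario back up. This is roughly what a rack-level rule looks like in a decompiled CRUSH map; the rule name, ruleset number, and bucket names are all illustrative.

```
# Illustrative rule from a decompiled CRUSH map: place each replica
# under a different rack, so losing a whole rack loses at most one copy.
rule replicated_by_rack {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type rack
	step emit
}
```

In practice you can often get the same effect without hand-editing the map, with something like `ceph osd crush rule create-simple by-rack default rack` and then assigning that rule to a pool; either way, the rule is what tells CRUSH which failure domain to spread replicas across.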
Now you're going to put in a new disk as a replacement for that failed disk, or you're replacing the node when a whole node goes down. During that time, there's lower redundancy, or there could be performance impacts, because what Ceph is doing is copying a lot of data around, so there's a lot of chatter on the network. Again, we don't have all the answers, but we're working with Red Hat to come up with optimum configurations that will help with some of these considerations.

So, servers. At the end of the day, Ceph is running on servers, on commodity servers, right? Servers with a lot of disks in them, whether they are SSDs or just spinning media. So there are some guidelines. I'm not going to go into too much detail; you'll have these slides and we can certainly chat offline here, but there are certain guidelines around how many SSDs you need, how much RAM you need per OSD, how many gigahertz. There are guidelines around SSDs versus spinning disks: one to five, I believe, right? For every five spinning disks, you need one SSD for journaling. So there are some good guidelines, and that's what we'll be using to define and design our reference architectures (there's a quick back-of-the-envelope sketch of these rules of thumb below). Erasure coding, as Neil said, will increase your usable capacity for certain use cases, but it comes at the expense of additional compute load, which means you now have to have higher-capacity servers; there's more compute being used there. And then there are these things called JBOD expanders, and Dell has some really cool products in this area: a whole bunch of disks in a chassis, right? We actually have a product, which I'll talk about in a second, called the MD3060. It's got 60 disks in one single chassis, and nowadays we have four-terabyte drives, so you're talking about 240 terabytes, like a quarter of a petabyte, in a single box. You can hook that up to one of our R720xd servers, but then you have to be careful; there are some trade-offs here. Because if you add too many of these JBODs, you're probably oversubscribing your SAS lanes, since ultimately everything is going over SAS. And of course, there could be extra latency, so it depends on what your workload is. For the monitor nodes, you need an odd number, because there's a quorum algorithm, and it requires an odd number of nodes to reach quorum. They can be hosted on your storage nodes, where you're actually keeping your data, or they can be on dedicated nodes. It all depends on the stability of your cluster and how many storage nodes you have. So again, those are considerations for your monitor nodes. And now if you're looking at distributed sites, multiple sites in different places, the University of Alabama was a perfect example. They started off with a cluster in Birmingham, and they wanted a backup in Huntsville, which is 100 miles away. They had a dedicated backbone, like a high-speed WAN link, but they had issues with latency and things like that. So you have to design these RADOS Gateway nodes for large object store deployments. There's this cool new feature in, I believe, the new version, Firefly: federated gateways, which will ensure that these things are all replicated and there's consistency across multiple sites, which is all great stuff. So again, the point I'm trying to make here is that there are a lot of decisions to make.
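To pull a few of those rules of thumb together, here is the quick back-of-the-envelope sizing sketch referred to above. The drive count and size just reuse the example figures from the talk (60 four-terabyte drives, one journal SSD per five spinning disks), and the erasure coding profile is an assumption for illustration, not a recommendation.

```python
# Back-of-the-envelope sizing sketch using the rules of thumb from the talk.
spinning_disks = 60          # e.g. one 60-bay JBOD chassis
disk_tb        = 4.0         # 4 TB drives
raw_tb         = spinning_disks * disk_tb

journal_ssds   = -(-spinning_disks // 5)   # ceil(60 / 5) = 12 journal SSDs

usable_3x      = raw_tb / 3.0              # 3x replication keeps 3 full copies
k, m           = 4, 2                      # illustrative erasure coding profile
usable_ec      = raw_tb * k / (k + m)      # ~1.5x overhead instead of 3x

print("raw: %.0f TB, journal SSDs: %d" % (raw_tb, journal_ssds))
print("usable @ 3x replication: %.0f TB" % usable_3x)
print("usable @ EC %d+%d: %.0f TB" % (k, m, usable_ec))
```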
So: business decisions, technology decisions, hardware decisions; these are all the considerations you have to keep in mind. And the good news is we are helping: we, working with Red Hat and Inktank, are making those decisions easier for you with a reference architecture, and that's what you will see a few weeks from now.

Oh, I forgot: networking. The big, big issue with all of this, right? All of our customers that we speak to bring up networking as the biggest issue. Guess why? Because there's a separate networking team and a separate security team, and then they go off and build this thing, and afterwards the security team comes along and says, what were you thinking? What's wrong with this? So there's a whole issue, and the idea is for you to get those teams involved up front, because there is an impact on your networking infrastructure: how you're hooking this up to your core network, redundancy requirements for the network, whether you want to use multiple switches, multiple traffic lanes between your servers and your switches, whether you want dedicated client networks. Again, the point is lots of decisions. Do you want to use VLANs or dedicated switches? Do you want one gig, 10 gig, 40 gig? If you're doing streaming video, if you're a telco like a Verizon or an AT&T, and you're generating tons and tons of video traffic, and some big event happens somewhere and drives a lot of traffic to your site, all of a sudden you need 40-gig links. So you have to think through those workload requirements I mentioned earlier. There are lots of different design requirements around networking: multi-rack connectivity, core fabric connectivity, WAN connectivity, because you may have multiple data centers. And like I said, the big difference between Swift and Ceph is that Swift is what's called eventually consistent. That means when you write your data to your cluster, it's not written everywhere immediately; it takes some time, it's eventually consistent. So if you read back your data immediately, within the next second after you write, you may not get the same data; you might get the previous data, which is old. That's called eventually consistent. Ceph is not eventually consistent; it's strongly consistent. When you write it, you know it's there. So there's a big difference in the design philosophy between Ceph and Swift. And that's the thing to keep in mind when you're doing multiple data centers, because if you have latency on your WAN, which is typically what happens, then when you write to your Ceph cluster, with one node sitting in Huntsville, another node sitting in Birmingham, and a low-speed WAN between the two, it takes time for the objects to get written to your disks. Now you're waiting for this to come back, because the write doesn't complete until the data is written. So those are the kinds of things you have to keep in mind. Again, networking considerations: lots of things to think about.

Well, like I said, the good news is we are making it very, very simple and consumable. We just announced, back at Red Hat Summit last month, this OpenStack bundle, as we call it. It's an easily consumable piece of hardware, software, services, and support that's a rapid on-ramp to OpenStack. It scales up, it's modular, it's a single point of contact from Dell. We work with Inktank, we work with Red Hat, but you get a solution that works out of the box.
We have a reference architecture that takes all the complexity out of it. Of course, you're welcome, if you have specific requirements, to come and talk to us; we'd be happy to work through the requirements and reference architectures with you, but this is sort of for the bulk user, right? Somebody that has a dev/test environment, wants it easy to set up, wants to get going quickly: speed, time to value. We've heard about speed so much today and in the last two keynotes; it's all about speed. This is all about time to value. We'll get an OpenStack solution up and running very quickly, and in the upcoming release, which is coming in the early-summer timeframe, we will have Ceph integrated into this as well. So you'll have Ceph and OpenStack all integrated in one single bundle. That comes with professional services and with Dell ProSupport. I just wanted to throw that out there so you're aware of it and can start using it.

So here are some example Ceph configurations on Dell servers. I'm not going to go into the details, but as I said earlier, there are two different ways of thinking about it: performance and capacity. If you're looking at performance, you do need SSDs for journaling, and you can see the different sizes. This is one single server, by the way. It's called the PowerEdge R720xd, which is a main workhorse of the enterprise; it's highly used, it's used across the board in many, many different enterprises, and it's the one that we have chosen for Ceph. So those are the different sizes you get, depending on whether it's performance or capacity. And as I was mentioning earlier, look at these MD-series things. This is still work in progress, I just want to let you know. The MD3060 is a JBOD chassis with 60 four-terabyte drives in it. It hangs off of your R720xd server and gives you tremendous capacity. And we are working on the reference architecture to take into account all those decisions that I just walked through, so that you don't have to worry about those things. This will come through in the reference architecture document that we'll be publishing soon, and that'll give you an idea of the specific details around how these things are configured, including networking. So this is just to give you an idea of how things are progressing.

So what are we doing to enable this? As I already mentioned, we're working very closely with Red Hat and Inktank. Our goal is to bring enterprise-grade storage solutions that are consumable by enterprise IT folks, for those use cases that I just mentioned. A lot of things are happening there; we're co-engineering, we are working together. We're actually now an extension of the Red Hat development team, and we're working every week with Inktank, working through what these configurations should look like, what the preconfigured bundles should look like, and what's the best way to get this up and running. And it's a very storage-focused release. The 1.0 that we released at Red Hat Summit was OpenStack, a balanced compute and storage configuration. The upcoming release in the summer timeframe is all about storage. So we're going to have Compellent, we're going to have EqualLogic, our Dell storage solutions, along with Ceph, and we also have Red Hat Storage. So there are a lot of different solutions available to you. As Neil mentioned, this is all certified against RHEL. And it comes with professional services, support, and training, which is important. There's the Inktank University, and there's also the Red Hat training that comes bundled with all of this stuff.
So if you want to learn more about it, or get certified on Red Hat or on Inktank, there are lots of options available to you. A lot of good stuff is happening. Deployment services are part of it, so you don't have to worry about how to bring this all up: I've got these servers, I've got the software, I've got RHEL OSP, I've got RHEL 7, all of these different pieces. We will come and deploy it for you, both hardware and software. Because even the hardware is not that easy: you have to set up your racks, rack and stack, cable everything, make sure all your servers are properly configured, JBOD options, RAID options, what have you, networking, switch configuration. Not easy. So Dell will come in and get that up and running, and we'll work with Red Hat to get the rest of the stack up and running for you. Effectively, in a very, very short time, you'll have the entire OpenStack and Ceph cluster up and running in your environment. So that's what we've got going here.

Very quickly, I think we're just about at time here. UAB is a great case study; in fact, we had the customer up on stage in an earlier session. I just want to give you, very quickly, their pain point and what they started off with. They have 900 researchers. This is a research institution that does a lot of cancer and genomics research; they get millions and millions of dollars from the federal government to do this work. They have 900 researchers generating tons of data, and they use HPC, high-performance clusters. And guess what happened? They put the data in all these different places: laptops, USB drives, local servers, HPC clusters. Data was all over the place. This was a huge problem for them, because transferring the data back and forth was a problem, security was a problem because they didn't know where the data was, and there were compliance issues. So they basically said, hey, we want a single, consolidated, centrally managed data solution that is secure, that's compliant, that we can work with. And that's when we got engaged with UAB and actually built a Ceph solution for them. So they now have a centralized storage solution. At the end of the day, the big callout here is that they did their own economic study on this, and it turns out to be 41 cents per gigabyte per month, which is actually pretty darn good; it's coming very close to your public cloud storage number. So some really good stuff there. There's actually a UAB case study white paper out there that you can search for and download, with lots of interesting information. This slide covers the specific details, and I'm going to skip through all this and open it up for questions, since we're just about running out of time. Questions for Neil or myself? Yeah, please.

So the question is about the phasing of the erasure coding feature set within the commercial product. I'm not sure what phasing you've seen on the website, but no, there's no phasing; the erasure coding is done, and the full feature set is available in ICE. You can show me which web page you looked at afterwards, but erasure coding is ready to go when the product's released in a couple of weeks. So the next question is: when are we going to see Ceilometer support within Ceph?
So that's actually one of the questions which came up in the Ceph developers' session a couple of hours ago. I don't have a date or a time for that, but there's been sufficient interest expressed, and now that we have access to Red Hat OpenStack developers, I'm hoping to see it put on the roadmap relatively soon. So no dates, but the interest has been registered. We'll take other questions, and we can have an offline conversation as well. Other questions?

"I have a question about the CRUSH algorithm. Does it take locality of data into account while placing the blocks? So if the VM is running on one host, will the data be close to that host?" So the way that the block device works is very different to, say, Gluster. We don't keep the block device on a single host; it's actually striped across the entire cluster, so the concept of locality is slightly orthogonal. You can run your virtual machine on the same node as a storage node, but typically it's not going to give you much advantage. There are some data locality tweaks you can make; you can set the primary affinity to keep data closer to the VM, within a rack or within a network segment, but that sort of converged infrastructure is not really the recommended deployment model. You can do it, but it takes a lot of customization, so it's not something we recommend without discussion with us. Yeah, other questions?

Yeah, good question on an active-active kind of system. "Say you have two clusters in two different geographical locations, and you have both of them in an active-active configuration. They're essentially using the same data set: one is writing to it, one is immediately reading from it. What I've heard you suggesting is not to go ahead with Swift, but to go ahead with Ceph. Is that what I caught?" Well, we have to be specific about the use case. If you're talking about the object storage... "Yeah, so suppose I'm writing a financial transaction, and at the same time someone's reading the ledger on the other side, but it's active-active, right? It's the same data." Yeah, so we work slightly differently to Swift here. Within the RADOS Gateway, which handles the S3 and Swift protocols, you can federate RGWs, and effectively they work in a master-slave configuration. So there's a concept of a primary site where the data gets written, and that's slaved out, or copied out, to the remote sites. The remote sites, you can't write to them; they're only there for reads. So you can do read affinity, but you can't do write affinity. Okay. Yeah, we are out of time, but if you want to talk to either of us, we can have a chat offline. Come straight up here. Thank you.