Hello, I'm Edward Genouyak. This is Jeff Scott. We're with Charter Communications. VMware was nice enough to give up their slot and let us present a case study that we've done running high-performance workloads using a combination of OpenStack, PernixData FVP, and VMware VIO. What we wanted to do was replicate the success that we've had with PernixData on our VMware platform, but replicate it in our OpenStack platform. What you'll see from Jeff is the before-and-after storage analytics of running a Cassandra cluster deployed using Heat and Chef on VMware Integrated OpenStack with VMware NSX and PernixData.

Some of the challenges that we face in our industry are not unique to us: a flexible self-service environment to support DevOps, but with rapid and consistent application deployment into our production environments. We run a multitude of data centers across our geographic footprint, and we have to have a consistent method for deploying these applications regardless of where they're going to land. One of the unique things that we struggle with is having a host of application vendors that insist their application needs local SSD-like performance. This starts to break down our model of trying to have a consistent substrate on which to present all of our applications in our production environments. We also struggle with the unknown workload. In this era of cloud, the infrastructure team rarely knows what that workload is going to be, yet we still wanted to design a consistent substrate that would host any application regardless of what the workload turned out to be.

So some of the goals and outcomes of this endeavor were to create one-click application deployment, in this case with a Cassandra cluster. We wanted to satisfy network security concerns using micro-segmentation, which, as you'll see, is inherent in the design. We also wanted to characterize cloud workloads with analytics throughout the entire stack; that is, we wanted a consistent operational tool set that our operations group was already familiar with and that would allow them to characterize these workloads throughout the entire stack. Along with that, we wanted to decrease storage latency and increase storage IOPS, but we also wanted to increase CPU efficiency. And since we wanted to reuse our existing substrates, we believe this lends itself to saving us a lot of money.

For the sample use case, we chose Cassandra. Why did we want to use Cassandra? A lot of the video application workloads that we have require Cassandra backends, but the way they're presented to us from the vendor community is that we had to have an isolated Cassandra deployment for each and every application out there, which ended up driving up costs. We wanted to consolidate that and present a single solution that they could all start to take advantage of. It's also an excellent application that benefits from automated, recurring deployments, as you'll see with our combination of Heat and Chef. Ultimately, what we wanted to do was take on what we thought was the holy grail of virtualization: if we could virtualize Cassandra in a production environment, give it the IOPS it needed, and reduce latency, we feel we can virtualize virtually anything. We also liked using this as our first one out there because there were a lot of open source benchmarking tools already available that we could use to characterize the workloads across each of the different test cases.
So with that, I'm going to turn it over to Jeff Scott, and he's going to go over the particulars of the solution that we came up with. Thank you.

So as Edward said, we brought together a number of different products throughout the VMware ecosystem. It starts, obviously, with VMware Integrated OpenStack; this is a Kilo-based OpenStack. VMware on vSphere is easy for us because we're already a large VMware shop, so we can leverage our in-house experience. But more importantly, we bring in that whole ecosystem of partners to complete this solution end to end. That includes VMware NSX for the SDN layer, and PernixData Architect and PernixData FVP, which you'll see some of here in a second; that's the storage analytics and the optimization and acceleration engine we have running at the hypervisor. Then Heat and Chef: most people are obviously familiar with Heat, OpenStack's orchestration framework. We have a Heat template that deploys a three-VM Cassandra cluster, configures all of our VMware DRS rules, the security groups, the logical networks and the virtual routers, and even assigns the floating IP for the Cassandra cluster. Heat then calls out to Chef to execute a Cassandra recipe that does the install and configuration of Cassandra. So it's bringing all these products together that creates this complete solution.

Here's a high-level architectural overview. Our environment for VIO consists of three clusters: a management cluster, an Edge cluster, and a Nova compute cluster. The management cluster hosts vCenter, the NSX controllers, the OpenStack management components, our PernixData management, and a number of other things, all the management stuff. The Edge cluster is where all the NSX Edge gateways get spun up, and the Nova compute cluster is where we're running the actual guest workloads. In this example, we have three Cassandra clusters spun up in the Nova compute cluster plus the benchmarking VM. And this environment actually wasn't dedicated just to this; there are a lot of other VMs running. For the network architecture overview, I'll just gloss over this a little bit, but there are a lot of different networks that make up the VIO network architecture. We have the external network, where the actual external IP or VIP gets created for the Cassandra cluster, the transport network, and the API access network. We have a dedicated network for the PernixData FVP cache synchronization, and obviously the management network as well as NFS. There are three clusters, and each cluster has its own dedicated set of datastores.

So the first thing was just spinning up the Cassandra cluster itself, walking through the GUI environment in Horizon, standard Horizon stuff: launch the stack. We already have a URL for all of our Heat templates, so we simply copy and paste the URL into the deployment. You get the launch-stack screen, again standard stuff, and fill out a number of different variables for the Cassandra cluster. Heat will then go and create the actual VM nodes, create the virtual networks, create the virtual routers, create the security groups, assign the floating IP to the cluster, and then call Chef, which does the actual Cassandra deployment. This end to end takes roughly three minutes, give or take, and you've got a running Cassandra database. When you're done, this is a view of the network topology from Horizon as well. In this example we've run the Heat template three times, so we have three separate Cassandra clusters.
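As an aside, the same stack can be launched against the Heat API rather than through Horizon. Below is a minimal sketch using python-heatclient; the endpoint, credentials, template URL, and parameter names are hypothetical placeholders, not the actual values from this deployment.

```python
# Launching the Cassandra Heat stack programmatically instead of via Horizon.
# All endpoints, credentials, URLs, and parameter names below are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1.session import Session
from heatclient import client as heat_client

auth = v3.Password(
    auth_url="https://vio.example.local:5000/v3",   # hypothetical VIO Keystone endpoint
    username="demo",
    password="secret",
    project_name="cassandra-tenant",
    user_domain_name="Default",
    project_domain_name="Default",
)
heat = heat_client.Client("1", session=Session(auth=auth))

# Heat creates the VMs, logical networks, virtual router, security groups,
# and floating IP, then hands off to Chef for the Cassandra install,
# as described in the talk.
stack = heat.stacks.create(
    stack_name="cassandra-cluster-01",
    template_url="http://templates.example.local/cassandra-cluster.yaml",  # hypothetical
    parameters={
        "node_count": 3,           # hypothetical parameter names
        "flavor": "m1.large",
        "key_name": "ops-key",
        "external_network": "ext-net",
    },
)
print(stack)  # returns the new stack's id; poll heat.stacks.get(...) for status
```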
Down in the bottom left of that topology view, you can also see where our test VM sits in this tenant. And then, getting back to that ecosystem model, this is a screenshot from vRealize Operations: we can see the same topology in our operations tools. That speaks to what Edward was saying about operationalizing this platform. The same tools we use to operate and manage our VMware vSphere environment are the tools we're using to operate and manage our OpenStack environment. You can see, in this particular case at the tenant level, that it's generating an alert; this one says the tenant is approaching a quota. We can even drill down into individual VMs; here we're getting a disk space or memory alarm from OpenStack presented to us in vRealize Operations. Same thing with NSX, the networking layer: in vRealize Operations we can see NSX end-to-end health between two VMs, including the physical network between those two.

So our test framework, our test setup, is pretty standard: Kilo-based OpenStack running on ESXi 5.5. We're primarily a Cisco UCS shop, with NetApp FAS storage, and, this is the interesting part, it actually has a flash pool with both SSD and SAS disk; we'll talk about that in a minute. NFS datastores with PernixData FVP and Architect, Cassandra, and the Yahoo! Cloud Serving Benchmark (YCSB) tool, which is what we used to run all the benchmarks. We ran three different test scenarios: a baseline with no IO acceleration, SSD-based acceleration, and RAM-based acceleration. In the SSD case, we had five nodes in the compute cluster, and each of those has a pair of 200 GB SSDs in RAID 0. In the RAM case, we allocated 32 GB of RAM from each host to the acceleration cluster.

We focused on two primary workloads within the benchmark tool. The benchmark tool has a bunch of pre-canned, pre-built test workload sets, and we focused on what are called Workload B and Workload D. These are two workloads that are particularly good at stressing the cache layer. There are obviously lots of other use cases, but we wanted to focus on what would stress the cache the most. Workload B is a read-mostly workload, on the order of 95 to 98% reads; it simulates photo tagging, where adding a tag is a write but most operations are reading tags. Workload D is a read-latest workload. This is always hard for me to say because it's a little weird, but it reads the most recent writes: it does a write and then reads that write until another write comes along, and then it reads that one. So it's reading the latest writes. This is actually very similar to some of our authentication systems, where people come in and authenticate and that data gets read frequently right afterward.

On the IO characteristics: Workload B shows mixed block sizes. You can see in the graph in the bottom right that it's mostly 8K blocks, but we've got roughly 10 to 13% at 4K and sizes up to 64K. So various block sizes, heavy on the reads, about 98% reads on average. The Workload D characteristics are very similar: mixed block sizes, very heavy on the reads with a few writes. Again, it does a write and then starts reading those writes.
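For reference, here is a rough sketch of how benchmark runs like these could be driven from the test VM with YCSB. The binding name, host addresses, paths, thread counts, and record/operation counts are placeholders that vary by YCSB version and environment; they are not the exact settings used in the tests above.

```python
# Driving YCSB Workload B (read-mostly) and Workload D (read-latest)
# against the Cassandra cluster. Paths, hosts, and counts are placeholders.
import subprocess

YCSB = "/opt/ycsb/bin/ycsb"
BINDING = "cassandra-cql"                  # older YCSB releases call this cassandra2-cql
HOSTS = "10.0.0.11,10.0.0.12,10.0.0.13"    # hypothetical Cassandra node addresses

def ycsb(phase, workload, extra=()):
    """Run one YCSB phase ('load' or 'run') for the given core workload file."""
    cmd = [YCSB, phase, BINDING,
           "-P", f"/opt/ycsb/workloads/{workload}",
           "-p", f"hosts={HOSTS}",
           "-threads", "64",
           *extra]
    subprocess.run(cmd, check=True)

# workloadb: ~95% reads / 5% updates; workloadd: ~95% reads of the latest inserts.
for workload in ("workloadb", "workloadd"):
    ycsb("load", workload, extra=("-p", "recordcount=1000000"))
    ycsb("run", workload, extra=("-p", "operationcount=10000000"))
```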
So, the baseline with no IO acceleration: we ran the test and got back 1.4 milliseconds average latency and 1,500 IOPS. On the surface that sounds fantastic; 1.4 milliseconds of latency to my storage is actually really good, really fast storage. Part of what's going on here is that this is a lab environment we're running these tests in, and it's a very lightly loaded storage array, so there's not a whole lot else going on. And going back to what I was talking about before, these NFS datastores are actually a combination of SSD and SAS, so we have SSD performance at the array. These are the metrics we're getting without any of the host-based caching.

So it looks good on the surface, but then we start digging in. We look at the Cassandra VMs themselves, and this is actually a performance graph out of vCenter. We see that, even though the workload is trying to drive the VMs to their maximum CPU capacity, we're only getting about 10% CPU usage on each of the VMs. So something is holding the CPUs back from doing work. We drill down further, and this is a graph out of PernixData Architect; again, speaking to that earlier point, we can use all the same tools we use today to do our analysis here. We see that latency to the storage varies quite a bit based on block size: on the 32K blocks and up, latency quadrupled or more.

I don't know what's going on with my graph here. Wonderful, these are like the best graphs, too. Let me see if I can get it out of slideshow mode and see if it will... I apologize, technical difficulties, you've got to love it; this displayed fine earlier. All right, well, we'll get to some of these other graphs, but sorry, these aren't displaying here. These two graphs on the left are our update response time and our read response time; lower is better. You can see that on the baseline we had, what is that, about 270 milliseconds response time, and obviously with the local cache that dropped dramatically. Same thing on the update time: a dramatic decrease for both the SSD and RAM acceleration. The graph you can't see on the right would show our overall throughput numbers, and this is where it really starts to come together: we had over a 66X improvement with SSD cache and over an 81X improvement with RAM caching. Workload D performance is another really informative graph, and it looks very similar: we had about a 2.5X increase with SSD and a 4X increase with RAM.

And this is kind of the before and after. We went from 1.4 milliseconds read latency down to 0.05 milliseconds latency. So even though we had very high-performance storage on the back end, we can still take advantage of the storage acceleration layer at the hypervisor, and that's thanks to being able to run the ecosystem of VMware partners and bring in the same tools we have in our normal environment while layering OpenStack on top. IOPS went from 1,500 to 1,800. And this is where it gets really interesting: because we removed that IO bottleneck, running that same workload again, CPU utilization of the Cassandra VMs, the three-node Cassandra cluster, went from 10% up to 97%. So more work was getting done, and we were able to leverage those CPUs much more efficiently. Now you can go in and drill down and figure out, hey, do I need to add more RAM or more CPU, whatever it happens to need.
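As a quick sanity check, the improvement factors implied by the quoted before-and-after figures work out roughly as follows (a back-of-the-envelope calculation, not additional measured data):

```python
# Rough improvement factors implied by the quoted before/after numbers.
baseline_read_latency_ms, accelerated_read_latency_ms = 1.4, 0.05
baseline_cpu_util_pct, accelerated_cpu_util_pct = 10, 97

print(round(baseline_read_latency_ms / accelerated_read_latency_ms, 1))  # 28.0 -> ~28x lower read latency
print(round(accelerated_cpu_util_pct / baseline_cpu_util_pct, 1))        # 9.7  -> ~9.7x more CPU put to work
```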
So in summary, we completed our objectives. We created a fully automated deployment model for Cassandra using the Heat and Chef combination on top of OpenStack and made it a one-click operation. We greatly reduced cost by not deploying three bare-metal hosts for every instance of Cassandra, as in this example, and that compounds when we talk about multiple applications and multiple data centers. And we gained more insightful analytics: we increased our IOPS, reduced our latency, and improved our utilization, all while using our same operations tools to monitor and report on our environment, leveraging PernixData and vSphere underneath OpenStack. That's it. Anything to add? Any questions? No questions? Awesome. All right, thank you very much. Thank you.