Hi there, good afternoon. Thanks very much for coming to watch our complement to this week's announcement of our set of storage products. Today I'm going to do a presentation that just introduces those products; it's the first time we've talked about them in public. We'll also invite up some people to talk about additional technologies and use cases for the products. But at the very beginning of this talk, I wanted to start by discussing why storage is relevant, and it's a story of, to put it mildly, unreasonable expectations.

This is a picture from the announcement at the Vatican in 2013 of the Argentinian Pope, who is a revolutionary and somebody I admire very much. The reason I bring it up is that it's a great example of why we have a data issue today, why storage is such a hot problem. You had, I think, 18,000 people at the Vatican watching this, each of them with a handheld, probably recording a gigabyte a second... sorry, a gigabyte per minute of data. So you've got something like 18 terabytes per minute being generated by that crowd. And this is just one example. I was looking at a bioinformatics talk where they're generating basically 18 petabytes per month of data and need to get that stashed somewhere, and they're trying to understand the best way to get from where they are today to something that will scale over years of the data they're generating.

So we live today with this data challenge, a modern data conundrum: basically, people are generating data too quickly. You've got billions of people engaged in global-scale creation, either explicit or implicit: either they're generating data because they want to, taking pictures of their newborns or videos of family gatherings, or they're increasingly being logged and captured in the transactions they're doing. Some of this data is ephemeral, but most of it is not, which means that over time it's going to accrue, and that will be a challenge.

Combined with this, there's a really interesting transition. When you had a hard drive attached to your computer, even though you paid for it, you knew that hard drive could fail. Your expectation was that you were going to put some data on it, but you had to keep backups yourself. And because none of us kept backups, we always knew we were being a little bit naughty, and that when the drive failed, we were going to lose data. Well, it's interesting: you put photos on Facebook or Dropbox on a free tier, and your expectation is that they have the highest durability possible, that you'll never lose your data. So it's very interesting. I used to buy equipment knowing it would fail, but now you're throwing data into the cloud and you expect the person handling it on the other side, in this case for free, to keep it as carefully as you should have been keeping your own backups.

Another reason why we have outrageous storage requirements is that analytics is now a primary element of business, so you really want to be logging everything that your business is doing in order to be able to look back for historical patterns and to compute on that data to ask questions that you don't know how to ask today. Finally, versioning and change control, which means that you don't just have one snapshot of the data being generated; you're actually tracking multiple copies of it as it evolves. These things together basically make storage very complicated.
Look at the typical storage requirements. As an end user, your expectations are unlimited capacity: I want to be able to store data and never get back a disk-full error message. I want instantaneous reads and writes, and because we've been taught to expect it, we want infinite durability and availability: it should never go down. And you want interfaces for every use case. We were recently talking to a technology company that has a very large data footprint, and they want to move to a single system, but they basically want to consume object, block, SMB, CIFS and NFS on the same cluster, which is a very challenging requirement, as anybody who has tried doing something like that before will know.

So how can we deliver against this utopia? Basically, the answer is not the way that we used to do it, not with the monolithic appliance. You can't do this with a NAS or a SAN, and you won't be able to do it in the future. If you have a NAS or SAN that addresses your needs today, it won't address them in the next five years, because you're going to have more data than you can put in there. What's the answer? We've heard a lot about this, and it's not a new concept: software-defined storage. But I think this is the year where we'll see software-defined storage really break out and become something which is production-grade and deployable.

Software-defined storage basically breaks up the traditional SAN and NAS concept, taking advantage of something which only exists now: cheap hardware that you can run a storage server on. It used to be that to do any sort of sizable deployment, you had to buy special hardware; the HPC guys have been doing it for a long time with InfiniBand and Fibre Channel. Nowadays, you can run a real storage cluster on Ethernet using commodity parts that you can buy from any vendor, not just a high-end vendor but anybody, Dell, HP, Quanta, kit that you can put together into a petabyte-size storage array. The other thing which is a requirement for this to be possible is a different type of software architecture: a horizontally scaling software design, which lets you have multiple nodes in a cluster participating in delivering a unified service. And finally, software-defined storage is interesting because people are now looking at the potential and asking, how can we consume that? Each of the different software-defined storage providers has focused on a subset of the possible use cases, and so each of them is optimized for a different type of storage interface and for different usage patterns.

In general, if you look architecturally at software-defined storage, the main change from standard scale-up storage is that it assumes the failure domain is the entire node. Whereas before you said a drive can fail, now you say an entire node can go down, but I still have full service available. And you can add more nodes to increase capacity, improve durability, and provide additional functionality: in the case of a scale-out platform like Ceph, you can add an object gateway, which lets you write objects to Ceph alongside consuming the block interface.
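To make the node-as-failure-domain idea concrete, here is a minimal, purely illustrative sketch of replica placement that spreads copies of an object across distinct nodes and racks. The cluster layout, node names and hashing scheme are all hypothetical; real systems such as Ceph (with CRUSH) or Swift (with its ring) use far more sophisticated placement maps, but the goal is the same: losing one node, or one rack, should never take out every copy.

```python
import hashlib

# Toy illustration of "the failure domain is the node": replicas of an
# object are deterministically spread across distinct nodes (and racks),
# so a whole-node or whole-rack failure never loses every copy.
# Hypothetical cluster layout, invented for this example.
CLUSTER = [
    {"node": "node-1", "rack": "rack-a"},
    {"node": "node-2", "rack": "rack-a"},
    {"node": "node-3", "rack": "rack-b"},
    {"node": "node-4", "rack": "rack-b"},
    {"node": "node-5", "rack": "rack-c"},
]

def place(object_key: str, replicas: int = 3):
    """Pick `replicas` nodes for an object, preferring distinct racks."""
    # Rank nodes by a hash of (object key, node name): deterministic, so
    # every client computes the same placement without a central lookup.
    ranked = sorted(
        CLUSTER,
        key=lambda n: hashlib.md5((object_key + n["node"]).encode()).hexdigest(),
    )
    chosen, racks_used = [], set()
    for n in ranked:                      # first pass: one replica per rack
        if n["rack"] not in racks_used:
            chosen.append(n)
            racks_used.add(n["rack"])
        if len(chosen) == replicas:
            return chosen
    for n in ranked:                      # fall back if there aren't enough racks
        if n not in chosen:
            chosen.append(n)
        if len(chosen) == replicas:
            break
    return chosen

print(place("photos/2013/vatican.jpg"))
```

Because placement is a pure function of the object key and the cluster map, every client that knows the map computes the same answer, which is roughly why these systems can avoid a central metadata lookup on the data path.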
Most of these systems have some shared concepts between them. There are the storage servers, which are basically the pieces of code that run closest to the drives themselves; they manage writing to the disk and also talk to each other to make sure that the durability guarantees are maintained. Then there are API servers in general, which provide higher-level interfaces; these are sometimes called proxies or gateways, and they're the things that provide object interfaces, or iSCSI, or even ATA over Ethernet capabilities. And then you've got monitors and controllers that live alongside those, which sometimes do scrubbing and data validation, and sometimes streaming replication to another site. So you've got some peripheral services there. This is a diagram of how Ceph works, and the Ceph client here is basically what exports the high-level interface. Fundamentally, the piece I'm talking about, the object store, is called the OSD in the Ceph case, and it's what talks to the drives themselves. You've got monitors alongside, and a metadata server, which tracks metadata across your entire cluster.

What are the typical interfaces through which people consume software-defined storage, or any type of storage service? Block is the fundamental one. The standard interface you use to provide block is iSCSI; ATA over Ethernet is another option. Block is interesting: you tend not to assemble it from multiple nodes at once, you tend to consume block from a single machine. You may consume it from multiple machines for a failover use case, but you're not really expected to be writing to the same block device from two different independent machines.

Shared file systems are obviously the most traditional and most used example here. NFS and Samba, that is SMB and CIFS, are the way those are normally consumed. Shared file systems are very difficult to get right in scale-out, because it's not so much that offering up NFS and SMB/CIFS is difficult; it's maintaining POSIX guarantees that's hard. The idea that you have locking and atomic transactions happening at the file system level is very hard to do in a distributed system. That just makes shared file systems very difficult, and that's the reason why, if you go and look at what the people doing software-defined or scale-out storage today provide, shared file systems is normally the last piece that's added, or it's the piece which is still somewhat experimental.

And finally, object stores. Object stores are a great trick that we invented in storage to make the end user or the developer think differently about how they're storing data. By default, the end user or the application developer assumes that they have a disk to write to, right? Supporting that assumption is very difficult, especially if you're assuming shared file systems, because you get all these further assumptions: I can write to disk, and other nodes that are part of the cluster can see the same changes that I can. Object stores are different: you basically say, I'm generating a blob of data, I'm going to give it to the object store, and I get back a key in return. The reason this is an interesting trick is that by making the application developer think differently about how they're storing data, you can then really optimize for that use case. And storing blobs, because of the generation of media, of lots of documents and so on, has increasingly become the way that people program applications.
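As a rough sketch of that blob-in, key-back contract, here is a toy object store in a few lines of Python. It is only meant to show the shape of the interface being described: real object stores such as S3 or Swift speak HTTP, let the caller choose the key, and replicate blobs across nodes; the content-derived key used here is just a convenient way to keep the example self-contained.

```python
import hashlib

class ToyObjectStore:
    """Minimal sketch of the object-store contract: hand over a blob, get
    back a key; present the key, get the blob back. Not how S3 or Swift
    are implemented, just the shape of the interface."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()   # derive a key from the content
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

store = ToyObjectStore()
key = store.put(b"a build artifact, a photo, any opaque blob")
assert store.get(key) == b"a build artifact, a photo, any opaque blob"
print(key)
```

Because the store only ever sees opaque blobs and keys, it is free to place, replicate and repair data however it likes, which is exactly the optimization freedom the trick buys.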
When I started at Canonical 11 years ago, one of the first things we worked on was basically a homegrown object store to back Launchpad, which is a development environment that we use and that OpenStack has adopted. We wrote it ourselves; it's called the Librarian inside Canonical. It's basically a piece of software into which you put all the builds that we're doing on the Ubuntu side, and you get back IDs. We have a separate job that then takes those builds and publishes them to the archive; archive.ubuntu.com is published from an in-house object store. Nowadays we'd be running that on something like Swift, but we didn't have that at the time and we had to invent one ourselves.

Anyway, the important point about however you consume software-defined storage is that there are many trade-offs inherent in the design. So some people are going to say, I'm the best for this specific use case: I'm the best object store, I'm the best block store out there. Some of them will say, actually, I can handle multiple use cases, but that is a trade-off. There's no perfect answer. You can't just say, I'm going to deploy one technology and it will handle low latency, I can run an Oracle database backed by this, and at the same time I can dump objects being stored by cashier point-of-sale systems that are doing atomic transactions on the backend. There is no system that provides you perfect latency guarantees and high availability across multiple sites all at once; you have to choose what you want. Proxied versus direct access is another trade-off, and the way these systems are implemented matters. So if you're trying to decide which object, block or file system technology you're using in software-defined storage, be aware of how it's designed, because that actually makes a difference: whether the placement maps are basically unified across all the nodes, or whether a node can decide placement only for the data stored on its own disks, like a local placement map. Those things all make a difference in how the hashing is done.

This week we announced, through the press, and we were talking to analysts about it earlier on, our product family, which is called Ubuntu Advantage Storage. The idea here is that you can build on Ubuntu with any of the supported technologies and we will support that 24/7, together with Canonical technical services. So you can choose what technology you want, and we will help you both deploy and manage it and provide you with support when you need it. I'm going to go over each of these technologies with just a slide talking about the key aspects of each of them, and then I'm happy to take questions; and for the technologies where we have partners here, I'll invite someone up to talk about them, since they'll know much more about it than I do.

So Ceph is the first offering that we're producing; I've listed these alphabetically, so Ceph is the first one that we have announced. It's a converged system: it offers block, object and experimental file storage. It's publicly known to scale to dozens of petabytes at sites like CERN. The thing which is special about Ceph is that it has a strong consistency model, which means that the data is consistent no matter where in the cluster you're asking for it. And that of course comes with its own trade-offs. We're providing integrated dashboards, so you can basically look at what you have in your Ceph cluster, whether the nodes are healthy, how much data is there and what the growth pattern is. We're providing Ubuntu Ceph with pricing that is based exclusively on the content stored. So if you're familiar with how you pay for AWS, it's similar to AWS S3 in that you only pay for the data you're putting in. You're not paying for how it's being stored in terms of durability or high availability. And it's better than AWS because you're not paying for the network traffic like Amazon charges you. It's priced at 2.2 cents per gigabyte per month, so it's very aggressive starting pricing. We'd love to see people deploy this at scale, because we believe this is a vehicle for getting scale-out storage out there, and it needs to have aggressive economics to go with it. Just saying I'm going to do scale-out storage while keeping the same business model in place, with a very high per-node cost, doesn't work. So you don't have to buy any support for the node itself: if all you're running on the node is storage, you start paying at 2.2 cents per gigabyte per month and we go from there.
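As a back-of-the-envelope illustration of what that capacity-based pricing means, the snippet below works through a hypothetical deployment; the 500 TB figure is invented for the example and is not from the talk.

```python
# Rough illustration of capacity-based pricing: 2.2 cents per GB per month,
# with no per-node licence and no network-traffic charges.
price_per_gb_month = 0.022           # USD, the figure quoted in the talk
stored_tb = 500                      # hypothetical example workload
stored_gb = stored_tb * 1000         # decimal TB to GB
monthly_cost = stored_gb * price_per_gb_month
print(f"{stored_tb} TB stored costs about ${monthly_cost:,.0f} per month")  # ~$11,000
```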
We have been working with Ceph for a very long time; it's one of the primary software-defined storage solutions that has been deployed at our customers. I wanted to invite up one of our customers and partners, Steve Eastman from Best Buy, just to talk a little bit about their experience. Thanks very much.

Yeah, no problem, I'll do a cameo for Kiko here. So if you guys go back a couple of years, I spoke about Ceph at a keynote here and in a breakout session two years ago. We actually went live, I was looking back because I couldn't remember how far back, but we went live on Bobtail for our first cloud in August of 2012. So a long time back, and we've learned a lot. We've been supporting it ourselves, engineering it. We moved to a really much larger scale-out cloud recently on Quanta gear, and as we move to that, the scale is much larger and our workloads become production; that's where we leverage the partnership with Ubuntu, or Canonical I should say, for support. And so yeah, it's good. And we even design around our failure zones: that's one thing about Ceph, you can do the CRUSH map, and we spread our data across four racks minimum. So we write one object and, well, I say four racks, but there are a couple of different pools. You can have a pool where you write one object and then it's copied across two other racks, or you write one object and it's copied across three other racks. So we have different levels of service in the pools. A lot of the same stuff that CERN spoke about. I learned a lot from CERN, actually. With our engineering team, it seems like every time we have a question, how should we tune this, how should we tune that, we go and find a SlideShare from CERN. They're a little bit ahead of us there. Exactly, yeah. So CERN tends to solve it ahead of us, which is good. But anyway, yeah, good deal. And I'll be around for questions.

That's great, thanks very much. So it's fantastic to see a tier-one retailer like Best Buy adopting scale-out technology and Ubuntu; I'm really proud to be part of that story. The next one I want to talk about is NexentaEdge, which was just announced this week, and I want to invite up Michael to talk about it.

Thanks, Kiko. So I'm Mike Letschin, I'm the field CTO for Nexenta. Like you said, this is definitely going to be the newest one you've seen, because we went GA on Monday. So it's something that's new, but it's actually been in development for about three years. This is a brand new scale-out object system that we came up with. It is Swift and S3 as well as iSCSI compatible, and it just sits as a service on top of any of your standard Linux distributions, so building it right into the OpenStack system was just ideal for it.
A few of the key benefits: it's got inline deduplication across the entire cluster, so you're going to use a lot less of that storage; you're not needing nearly as much capacity as you did in the past. To go along with that, we've got compression, and a lot of the enterprise functionality we had before. For those of you who knew Nexenta in the past, we were very ZFS-based, and this is a new system that uses that DNA, that heritage we knew for data integrity, and drives forward with it. So you have enterprise snapshots built into it, and then of course dynamic data placement, which allows us to really reduce things like hotspots, so you don't have to worry about one node causing a lot more problems when you have the kind of node failure we talked about. So like I said, it's brand new. We're actually demoing it in the booth, so swing on by, we're going to be there for a little while; I'm sure between me and Kiko, we'd love to show it to you. Thanks, Kiko.

Thanks very much. So as Michael was saying, I think the one thing which is very cool about NexentaEdge is that it builds on knowledge that was basically developed at Sun through OpenSolaris. Although it doesn't use ZFS underneath, the same concepts are there, the same ideas behind ZFS, and it's actually an amazing system. It's Russian engineering and Russian architecture being put to work on a really interesting problem of functionality and availability. The way the architecture works is, I think, actually quite unique, and we're very proud to be offering that on top of Ubuntu as well.

The next one I want to talk about is Ubuntu Swift. Now, Ubuntu Swift is our offering based on the native OpenStack object store. It's known to scale to hundreds of petabytes at places like Rackspace. The difference with Ubuntu Swift, and Swift itself as a technology, is that it was built primarily as an object store, and so it fundamentally is an object store. The model is eventual consistency, which affords very different performance guarantees, because you don't have to ensure that all the nodes are updated or synchronized at the same time. And because of that, it affords native multi-site replication: the model is that you are eventually consistent, all the nodes will eventually give you back the same data, and you can expect to always get a response back. And we're also offering Ubuntu Swift at metered pricing, like we do for Ubuntu Ceph.
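To give a feel for what that eventual-consistency model means, here is a small, purely illustrative sketch: a write is accepted even if some replicas are unreachable, reads return the newest version they can find, and stale replicas get quietly repaired afterwards. The class and function names are invented for the example, and this is not how Swift itself is implemented (Swift uses its ring, proxy servers and replicator and auditor processes for this), but the trade-off is the one described above: you always get an answer, and all replicas converge on the same data over time.

```python
import time

# Toy sketch of eventual consistency: writes succeed even when a replica is
# down, reads take the newest timestamp, stale replicas are lazily repaired.
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}          # key -> (timestamp, value)
        self.reachable = True   # simulate a node or site being down

def write(replicas, key, value):
    ts = time.time()
    acked = [r for r in replicas if r.reachable]
    for r in acked:
        r.data[key] = (ts, value)
    return len(acked) > 0       # succeeds even if some replicas missed the write

def read(replicas, key):
    candidates = [r.data[key] for r in replicas if r.reachable and key in r.data]
    if not candidates:
        return None
    ts, value = max(candidates)             # newest write wins
    for r in replicas:                      # lazy repair of stale replicas
        if r.reachable and r.data.get(key, (0, None))[0] < ts:
            r.data[key] = (ts, value)
    return value

nodes = [Replica("a"), Replica("b"), Replica("c")]
nodes[2].reachable = False                  # one replica is offline during the write
write(nodes, "photo-123", b"v1")
nodes[2].reachable = True                   # it comes back without a copy
print(read(nodes, "photo-123"))             # newest value wins and repairs node c
```

Roughly speaking, a strongly consistent system like Ceph makes the opposite choice: it waits for the replicas to agree before acknowledging the write, which is why any reader anywhere sees the same data, at the cost of waiting for that agreement.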
The last technology that we included here is SwiftStack, and SwiftStack is a very interesting product. I wasn't familiar with it when we started on this project of building this out, so I wanted to call up Chris to talk about it. Thanks very much.

Thanks. Yeah, an interesting bit of history: I have worked with Mike, so I know Mike very well, and prior to that I was a ZFS engineer, so I've got a little bit of background there; I've been at SwiftStack for a bit now. Real quick, to add on to what Kiko said: OpenStack Swift, if you're familiar with the OpenStack community, has been a part of it from the time that Rackspace initially developed it. It's been a founding project in OpenStack and it's been around as long as OpenStack has. SwiftStack was founded by a couple of folks who have been around for a while: Joe Arnold, and John Dickinson, who's PTL for Swift, works at SwiftStack as well. Really, the value that SwiftStack was all about, and really what we still provide, is taking Swift, which is well-proven and arguably more widely deployed than any other object storage, and making it incredibly simple to deploy, to manage and to monitor. Kiko mentioned the genomics folks: we've had Fred Hutch, the cancer research center, and HudsonAlpha here doing a couple of different talks. Brandon Cruz, who was here from HudsonAlpha, said yesterday that they can roll in a four-petabyte rack, deploy it, and have that thing up and ready to run in minutes, literally, with the SwiftStack UI. So, as Kiko has on the slide there, we have a gateway. The native Swift API, a RESTful API, is going to be the interface into Swift, clearly, and that's true for SwiftStack as well. Just so it's clear, we deploy exactly the same bits as you would get from OpenStack Swift, but we have an out-of-band management and monitoring controller that sits kind of to the side, if you will. Integration with LDAP and Active Directory, your application integration work, and all of the monitoring, Nagios, Zabbix, whatever that looks like, that you would expect in an enterprise-class deployment of Swift, which otherwise takes some significant skill to roll your own at that scale. So I think that was most of what I was going to say. Anything I missed?

No, and you should stick around, because there will be questions for you. Yeah, I will stick around for some questions. Thanks, Kiko. All right, thanks very much, Chris.

All right, so this is the content that I had; I mostly want to leave time to answer questions. Scale-out storage is the next revolution. We've been busy disrupting compute across the data center; people are now looking at OpenStack as something which they understand and are familiar with, and you use Nova as a replacement for proprietary VM managers. Storage will be the next revolution: people are going to look at how you bring scale-out economics, commodity economics, to replace traditional legacy storage systems. Storage is not a straightforward problem; it's never been a straightforward problem. I worked on Lustre about 15 years ago, I think, and at that time I was very aware of how much tuning and work had to go into a very large cluster running Lustre. We've come a long way in abstracting that complexity away and making for simpler deployment and simpler management, but the truth is, there will always be important trade-offs involved depending on how you consume the storage, and different technologies will have different sweet spots. You can't use the same tool for everything and expect to get the same performance. The underlying point is that Ubuntu brings it all together, fully supported by us, so you can count on us to deliver any technology that we're offering. That's basically it. I'd like to open up for questions, if people have questions around how the offering works or how the technology works. Sure, go ahead.

So, SSDs, or key-value drives; sorry, I didn't repeat the question. Basically, all of these were built in an era where solid-state drives are already a reality, low-cost SSDs, and so many of them come with guidelines for deploying some of the services on them; for instance, you tend to store metadata on the SSDs. In general, you could do SSD-only clusters and expect to get, well, the performance that you get from SSDs on a single machine, but across the cluster.
I think as SSDs become cheaper and cheaper, you'll see HDDs acting basically as backing storage for them, or even as cold storage, but the hot parts of the workload will stay on the SSDs, and I think that's what Best Buy is aiming to do in their deployment. Do you want to talk a little bit about that, Steve?

Yeah. So much like CERN, we've got SSDs for the journals and then HDDs for the data; that's our current scale-out. We're also working with and testing some pools that are all SSD, but they're not performing quite as well as... so we're losing some percentage, and we're still working on that; we were meeting with other people here during the sessions yesterday talking about that. What would you use them for? Yeah, again, we would have different pools for the higher IOPS, so we would be giving tenants and VMs thousands of IOPS versus a few hundred IOPS. I think that was in line with what I saw from CERN yesterday, I think it was yesterday, where they have kind of a general-purpose IO tier that they give all their tenants, and that's kind of what we're doing right now. We have general-purpose IO, but we'd like to provide services like you see with Amazon EBS, where you've got thousands of IOPS per VM, and to be able to scale that out, to do it at scale and at economics that make sense, like the kind of economics you showed earlier. Thanks very much, Steve.

Yes? No, and I think actually, credit to the enterprise storage vendors, who have actually stepped up and changed the way their offering works. We're actually seeing lots of companies, like EMC for instance, seeing scale-out storage as a path forward for growth. The truth is this: scale-up is limited fundamentally by the hardware. You can't have a single node that will be 100% available and will grow indefinitely; you reach very hard limits, and so everyone will have to re-architect. I think it's actually great to see that traditional vendors have been very early about picking up on this; they haven't been waiting for the SAN or NAS revenue to die out, they're saying, we're going to be ahead of it. This means getting much more into software than they have been, and I think that's really welcome. I think we need to have options for the end user. It's great.

The question is what management package we are using for Ceph. So are you asking about the dashboard or how it's deployed? For the dashboard, we're using ceph-dash, which is independently developed, and we're looking at other options as well. We actually looked at Calamari as something that we could offer, but the reality is that it's composed of many moving parts, and it was difficult for us to provide a supported experience in a short amount of time. The way that we do our delivery in general means that we select a set of components that we are sure we can deliver. This is the reason why, for instance, we're not saying that we support shared file on Ceph: we want to make sure that if customers deploy this, we'll be happy to go out into the field and ensure that it runs. So we're providing a standalone dashboard that's been developed independently, and we've put our own engineering into it as well.
All the deployment is done through our tooling, so through the same facilities that we provide for OpenStack itself, which are Juju and MAAS, you can deploy at scale, scale it up afterwards, add additional nodes, and add gateways, without having to worry about configuration or keeping it all consistent across the nodes.

Yep, that's right. All of this runs on Ubuntu as the underlying OS. So as we're working with you to deploy this, you'll have Ubuntu deployed, and then you'll select what technology you want on top of that. In this situation, Ubuntu actually takes second stage: the truth is that what you're consuming is a storage service, and that is the higher-level product that you're deploying. But Ubuntu is the underlying OS, and you get the same Ubuntu guarantees. We provide five years of support for each of our LTS releases, and you can count on us for 24/7 technical support through Ubuntu Advantage.

Who here actually has live software-defined storage deployments that are working, or where you're doing the research? What do you have? So, ScaleIO; that's actually quite interesting, right, because of the announcements here this week around OpenStack. Right, and what's your experience been with EMC? ScaleIO is block only, right? Yeah, that's great. And in fact, we are in conversations with lots of people in the ecosystem; our interest is to be able to provide different technologies exactly for that reason. So you'll see additional announcements coming out from us of other partnerships where we're basically delivering the software and supporting it, like we're doing for these four. Any other questions? All right, that's it then. Thanks very much; thanks very much, Steve.