All right, hello, everyone. Can you hear me OK? All right, welcome. Thanks for coming. My name is Sage Weil. I work for Red Hat. And today I'm going to be talking about Ceph, now and later: our vision for an open, unified cloud storage system. Just to give you a bit of an outline of where we're going: I'm going to start by talking a bit about why we developed Ceph, why it's open, and what unified storage means to us, then talk a little about what kind of hardware you can deploy Ceph on and its relationship with the OpenStack community software. I'm also going to take a minute to talk about why people don't use Ceph, because I think that's a useful retrospective exercise. And then we'll get to the fun stuff and talk about what's coming down the pike: roadmap highlights and community updates.

So the current release of Ceph is Jewel. It came out in the spring. We actually started working on Ceph over 10 years ago. It was originally designed as a distributed scale-out file system. But one of the very frustrating things over the past five years, whenever we talked about Ceph, was that we had a stable object and block interface, but we didn't quite have that stable file system yet. That was the last thing to mature. And so we'd always have to show slides like this that described how the RADOS Gateway and the RADOS Block Device are awesome, but CephFS is only nearly awesome. So the big milestone with Jewel is that we can now declare that Ceph is fully awesome. The file system is fully stable and ready for production. So we're very happy about that.

Ceph gives you this unified storage platform. You have the RADOS Gateway, which gives you S3- and Swift-compatible object storage with object versioning, multi-site federation, and replication. The block interface, RBD, gives you a virtual block device with snapshots, copy-on-write clones, and multi-site replication across clusters for disaster recovery. And for the file interface, we have CephFS, a distributed POSIX file system that gives you scale-out metadata, coherent client caches, and snapshots on any directory. So all of this in one. All of this is built on top of the underlying RADOS platform, which is a software-only distributed storage system that's self-healing and self-managing, based on intelligent storage nodes that figure out how to distribute your data across racks and racks of storage devices. So in a nutshell, that's what Ceph is. I think most of you probably already know that, since you're here at OpenStack.

But I want to step back for a minute and talk a bit about what Ceph is about, what motivates the design of Ceph and the development activities that happen. First and foremost, Ceph is distributed storage, of course. But it's designed from the get-go such that all components scale horizontally. So Ceph is really about scale, about cloud scale. It's designed to have no single points of failure. It's a software-only solution; we don't rely on specialized hardware, and in that sense it's hardware agnostic. You can deploy it on commodity components of your choosing. We provide object, block, and file interfaces in a single cluster, so it's unified. And whenever possible, we make the system self-managing, because when you're operating a system at scale, things are going to go wrong, and you can't have operators going in and having to intervene whenever there's a small issue. But last but not least, Ceph is open source. And that actually is one of the most important features of Ceph, I believe.
So I want to take a moment to talk about why Ceph is open source and why that's important. Open source, of course, is important because you avoid vendor lock-in. You can get support for Ceph from Red Hat, from SUSE, from a half dozen other companies. You also avoid hardware lock-in, because you can choose your software solution and then buy hardware from whatever vendor you choose, whoever gives you the best price or the best reliability and performance. All of this has the effect of lowering the total cost of ownership of your storage. These are sort of the beer benefits of open source, as in free as in beer. But there are also the freedom benefits. Because the system is open source, you have transparency: you can actually go look at the source code and see what it's doing, and verify it's actually doing what we say it's doing, instead of just taking the vendor's word for it. You have the option of self-supporting the system: if you want to develop the expertise in-house, you don't need anybody to support it; you can just go fix the bugs yourself and be off to the races. But most importantly, because it's open, you have the ability to add, extend, fix, and improve the system, which is really what open source communities are all about.

Just a moment to talk about what unified storage means to us. There are a couple of key advantages to having a unified storage system. The first is that you get simplicity of deployment: you can deploy a single Ceph cluster and run object, block, and file all against that same storage platform in your infrastructure. And as a result, you get efficient utilization of storage, so you don't have to do capacity planning for your block separately from your file and your object. And finally, you also get simplicity of management: you have a single set of skills that you have to train your operators on and develop all your tooling around to manage your entire storage infrastructure.

At least, that's the unified storage story. I think the more experienced operators in this room might question just how true and valuable this really is. And so I want to take a moment, and you'll forgive me the political imagery, to call a bit of BS on some of the rhetoric here. The first two points, simplicity of deployment and efficient utilization of space, I would call half true. For small deployments, when you're setting up a small cloud, it's absolutely valuable that you can set up a single Ceph cluster and point everything at it. You can have your object and your block and your file all mixed together, and it all just works and you don't have to worry about it. It's very, very easy. But once you start to scale to cloud scale, as everybody in this room obviously is planning on doing, then it usually becomes important to optimize for the type of hardware that you're deploying workloads on. So for block workloads you might be using flash, and for object workloads you might be using spinning disk, and so forth. And so when you're operating at scale, you're not necessarily running all of your object and your block mixed together. But that last benefit is still, I would say, completely true: if you're deploying the same software across your entire infrastructure, even if it's in separate clusters or separate hardware pools, you still have that same set of management skills you have to develop and a single set of software you can deploy across your infrastructure.
And this is actually valuable on the development side as well, because when we develop new functionality in RADOS, which all of these interfaces sit on top of, like erasure coding or compression and so on, we can take advantage of it for all those different use cases without having to reimplement the same sorts of capabilities three times. So you have a choice. Many operators will build specialized clusters that are tailored for a specific use case, especially when they start to scale very large. But you don't have to do that. Ceph is very flexible in that you can also define RADOS pools within the same Ceph cluster that allocate storage to specific storage devices; there's a quick sketch of what that looks like below. So it's designed to be a very flexible architecture.

And in particular, it's designed to be hardware agnostic. It's a software-defined solution in the true sense of the word. If you have high performance workloads, you can run Ceph on flash appliances and it's going to go really fast. We did a number of reference architectures recently at Red Hat, I should say, with Samsung, SanDisk, and Intel, although I don't have a pretty picture for Intel here. Samsung has this great 2U box that's just packed with NVMe drives; they got 700-some thousand IOPS on 150 terabytes. SanDisk has a similar solution that's much more dense, not quite as high performance, but it's designed more for capacity-optimized storage, and it's much more cost compelling. So if you want to run Ceph on flash, go for it, and it's going to go really fast. And these companies have done a lot of work to optimize Ceph to make it perform well in those environments.

If you want to run Ceph on SSDs and hard drives, you can do that as well. This is what most people do. You can go out and buy hardware from pretty much any vendor on the planet and put Ceph on top of it. You can also find reference architectures from all the major players where they've pre-tuned Ceph, so you can find out how it's going to perform before you buy. You can even go to somebody like Fujitsu and get a turnkey appliance that's rack scale and ready to go, with Ceph pre-installed; you just plug it in. Or you can run something like Open Compute hardware from somebody like Penguin and be off to the races as well. So you have a lot of flexibility here.

And finally, you can even run Ceph on an actual hard disk. This is sort of bleeding edge here, but Western Digital Labs has a prototype hard drive based on their 8 terabyte helium platform where they essentially added an extra ARM processor onto the PCB of the hard disk that runs Debian Linux, and we run the Ceph OSD actually on the hard disk. They swap out the SATA interface for dual Ethernet, and in the chassis you swap out the SATA backplane for an Ethernet backplane. So you can imagine just racks and racks of hard drives plugged directly into the network, running Ceph on the disks themselves. This is a prototype; their next generation is going to move to ARM64, and they're working through all the issues around building a completely new product. But the other hard drive manufacturers are looking at similar designs. So it's very exciting to see this full breadth, all the way from hard disks to really high end storage systems. You can run Ceph on all of them. Which really underscores one of the key advantages of having genuinely open source software: it leads to fast and open innovation.
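To make that pool-per-device-type idea concrete, here is a minimal sketch in Python using the python-rados bindings plus the ceph CLI. Everything named here (the `fast-ssd` pool, CRUSH ruleset 1 for SSD hosts, the object name) is a hypothetical example; it assumes a running cluster with an admin keyring and a CRUSH rule for your SSD hosts already defined in the CRUSH map:

```python
# Hypothetical sketch: carve out a pool that lands on SSD-backed OSDs
# (via a pre-existing CRUSH rule), then talk to it directly with librados.
import subprocess
import rados

# Create the pool and point it at the SSD CRUSH rule.
# "fast-ssd" and ruleset id 1 are made-up example values.
subprocess.check_call(["ceph", "osd", "pool", "create", "fast-ssd", "128"])
subprocess.check_call(["ceph", "osd", "pool", "set", "fast-ssd",
                       "crush_ruleset", "1"])

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    ioctx = cluster.open_ioctx("fast-ssd")
    ioctx.write_full("hello", b"this object lives on the SSD pool")
    print(ioctx.read("hello"))
    ioctx.close()
finally:
    cluster.shutdown()
```

The same cluster can hold other pools whose CRUSH rules point at spinning disks, which is how you mix workload tiers without building separate clusters.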
So obviously, open source software enables innovation in software, because you can just clone the code and go hack on it. But it also enables hardware innovation, because ODMs and OEMs can just get the code and start hacking on it to support their platforms, without having to have some cumbersome business relationship with a proprietary software firm to get access to the code and deal with all the licensing and NDAs and all that stuff. None of that friction is in place for open platforms, which makes them sort of ideal for pushing computing into the future.

A good example of this is persistent memory. Persistent memory is coming: there's 3D XPoint from Intel and Micron, and we actually have NVDIMMs today, although they're still pretty expensive. These things are going to be really fast, somewhat more dense, and very high endurance. They will be expensive, but most of all they promise to be very disruptive, as they turn all the assumptions we've built our storage designs on over the last three decades on their heads, by giving us a completely new persistent storage medium to design around. So Intel recognized Ceph as a key platform to experiment with here, and they went and developed something called PMStore, a prototype back end for the OSD that's designed to store data directly on 3D XPoint memory using their NVML library. It hasn't moved past the prototype stage, but I think it's a good example of how innovation around these new hardware technologies can happen very efficiently in the open source space. And that's really what we're here to do.

So in the Ceph community, our goal is to create an ecosystem around Ceph that allows Ceph to become analogous to the Linux of distributed storage. And we mean that in a couple of different senses. First, open source and open development are critical. We also want to build a collaborative environment where we have lots of different organizations contributing development effort to improve Ceph, build upon it, and make it better. And finally, we want to build a general purpose platform. In the same way that you can run Linux on embedded devices, on your phone, and also on big iron in the data center, we see Ceph as a general purpose storage platform that can run on hard disks, on commodity servers, and on high end flash-optimized platforms as well. Which brings this back around to OpenStack.

Ceph and OpenStack get along very well. There are lots and lots of integrations. On the object side, the RADOS Gateway talks to Keystone to do authentication and provides an S3 and Swift API. It doesn't so much talk to Swift as let you use it in place of Swift, because it speaks that interface. On the block side, we have drivers for Cinder, of course, and Glance for managing all your images, and Nova knows how to start up your KVM instances so that they're backed by virtual disks stored in Ceph. It all works very seamlessly, and that's why lots of people in OpenStack like us. But I think the most exciting piece to call out here is that there's a new Manila driver that lets you orchestrate file volumes stored in CephFS that can be plumbed through to your virtual machines. That's new in, I think it was Mitaka, but it's new and very exciting. But Cinder is really where Ceph and OpenStack have really shined; there's a sketch below of the copy-on-write clone flow that sits underneath that integration.
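As a rough illustration of what that block integration does under the hood: a base image is uploaded once, snapshotted, and then cloned copy-on-write into per-VM disks. Here's a minimal sketch with the python-rbd bindings; the pool and image names are hypothetical, and this is not the actual Cinder/Nova driver code, just the same flow:

```python
# Hypothetical sketch of the copy-on-write clone flow the OpenStack
# drivers rely on: import a base image once, then clone it cheaply per VM.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("images")  # hypothetical pool name

# Create a 10 GiB base image (Glance would import the real image bits).
rbd.RBD().create(ioctx, "base-image", 10 * 1024**3, old_format=False,
                 features=rbd.RBD_FEATURE_LAYERING)  # layering enables clones

with rbd.Image(ioctx, "base-image") as img:
    img.create_snap("golden")
    img.protect_snap("golden")  # clones require a protected snapshot

# Clone the snapshot into a per-VM disk (roughly what Nova/Cinder do).
rbd.RBD().clone(ioctx, "base-image", "golden", ioctx, "vm-disk-1",
                features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```

The clone shares all unmodified data with the parent snapshot, which is why booting a hundred VMs from one image is fast and cheap on space.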
If you look at the user surveys over the last two and a half or three and a half years, Ceph has consistently been shown to be adopted by roughly half, or a little more than half, of the OpenStack deployments out there. It's really rivaled only by LVM, which is ephemeral, non-reliable storage on the local disk. That makes sense; it's addressing a different use case anyway, while Ceph RBD is reliable storage in the cluster. I bring this up not to pat ourselves on the back, although I might do that too, but mostly to point out that there are still a whole lot of other storage systems that people use with OpenStack, and I think a useful exercise for us in the community is to look at why people are still choosing a lot of these other storage platforms for their clouds when Ceph is supposed to be the be-all end-all and be so great. So let's ask the question: why are people not choosing Ceph? I think there are a couple of important reasons that we need to pay close attention to.

The easiest one, and the easiest one to write off, is simply inertia. People have been buying proprietary appliances for decades, and it takes a while to change that habit. Even if you had a perfect storage system, it would take time to change those buying habits and actually have people adopt it. That's not particularly illuminating, but we'll just get it out of the way.

Another reason is performance. I think Ceph has developed a reputation for not being as fast as other storage systems. Partly this is due to history: Ceph has gotten a lot faster over the past several years, and people have been using Ceph with OpenStack for a long time, so I think some of those opinions are outdated. But there's also a kernel of truth here. The back end that the Ceph OSDs use to write data to local storage is definitely aging, and there's a lot we can do to improve that. I'll talk more about that later.

Another reason is functionality. Sometimes there are just things that Ceph doesn't do that other storage systems do. The big one in that category for me is quality of service. We don't have QoS in Ceph yet. Other systems do, and sometimes you need that.

An important one, obviously, is stability. I think one of the reasons why Ceph and OpenStack work so well together is that OpenStack was kind of growing up at the same time Ceph was. As we were working out all the kinks and making the system stable, so was OpenStack, and the people deploying OpenStack had a high tolerance for that sort of failure in Ceph at the same time, so it ended up working out pretty well. I think we've done pretty well for ourselves, but really you shouldn't take my word on stability. You should talk to other operators and listen to all the other talks where people describe their Ceph experiences.

Which brings me to my last point. Listening to other people talk about Ceph at events like this, it strikes me that a lot of the issues people have with Ceph aren't necessarily that it's doing something wrong or that it's broken, but just that it's really hard to use. Distributed storage is complicated, and we do a lot to try to hide that complexity, but we by no means hide all of it. Ceph, compared to a lot of other technologies, is just very difficult. And more than anything, I think that's one of the key areas where we as a community need to do better.
And that's where you as a user or operator community can help us identify what we can do better. Which brings me to the more interesting part of the talk: what are we doing about all this? Where are we going from here? The great thing about Ceph is that it's software, and software can be upgraded. We have a regular release cadence. Ceph does named releases every six months, in the spring and the fall. The spring releases are LTS releases, which means we do regular backports of bug fixes so you can run them for multiple years without having to upgrade. Our current release is Jewel. We're just coming up on the Kraken release, which is going to be out in the next couple of months, and Luminous is going to come out in the spring of next year.

So let's talk a little bit about what's coming, what's new in Kraken. It's going to be terrible and wonderful and scary and all that good stuff. It's actually going to be the most fun release of Ceph to announce, because we finally get to release the Kraken. But besides that, the big headline feature in Kraken is going to be BlueStore. Currently the OSDs use something called FileStore to write all their data to the local disk: they write objects as files in an XFS file system that sits on the local disk. BlueStore cuts out that entire layer and writes the data directly to a block device. We use an embedded key-value database, currently RocksDB, although we could swap something else in later, for the metadata, but all the data goes straight to the disk. BlueStore can combine hard disks and SSDs, and we can even use NVRAM or persistent memory for some of the journaling functionality. So it's targeted at all the current generation technology that's out there.

It has a few key headline features. The first is that we're going to have full data checksums across everything that's written to disk, which means that whenever we read something off disk, we're always going to verify the checksum before we return it to the rest of the system. This is going to be huge. It also features inline compression, if you enable it, driven by policy on client hints or pool settings or whatever you define. We can use zlib or Snappy, which is going to reduce the amount of storage you have to buy, hopefully, for certain workloads. But the biggest thing is that BlueStore is roughly twice as fast as FileStore. And that's true for SSDs and hard disks, for large IO and small IO, give or take. It's much, much faster. We get better parallelism and efficiency on fast devices. We eliminate the double writes from when we used to do full data journaling on a journal device; we don't do that anymore. It also performs very well even when you're using a very small journal, so you might use SSDs to accelerate those metadata updates, and those journals can be small, on the order of hundreds of megabytes instead of gigabytes like they are right now. Lots of people are working on this; it's been a real pleasure to work with people from SanDisk and Mirantis and ZTE on developing this new feature. And it's doing quite well.

The biggest question in everyone's mind, though, is probably: when can I have it? The current master is doing quite well. We've nearly finalized the disk format, and in Kraken we're going to have a stable code base that hopefully won't crash, and a stable disk format. It's almost certainly going to be flagged as experimental, because it's brand new code and you don't want to go putting your production data on it just yet.
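As an aside on the checksum point: here's a toy Python illustration of the verify-on-read idea. This is not Ceph code (BlueStore keeps real checksums such as crc32c in its metadata database), just the concept:

```python
# Toy model of checksum-on-read: every block is stored alongside a
# checksum, and a read that fails verification raises an error instead
# of silently handing back corrupt data.
import zlib

store = {}  # block_id -> (payload, checksum)

def write_block(block_id, payload):
    store[block_id] = (payload, zlib.crc32(payload))

def read_block(block_id):
    payload, expected = store[block_id]
    if zlib.crc32(payload) != expected:  # bit rot, bad cable, firmware bug...
        raise IOError("checksum mismatch; refusing to return corrupt data")
    return payload

write_block("obj1", b"hello world")
assert read_block("obj1") == b"hello world"
```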
But we do want as many people as possible to try it out, on maybe your dev and test environments or your performance testing environments. The goal, though, is that for Luminous, the next stable release, we'll have a fully stable version that's ready for broad adoption. We hope to make it the default instead of FileStore, but that really depends on how the next six months go. We need to make sure that it's stable and that we trust it, because your data integrity is definitely more important than getting a feature out the door as quickly as possible. A big question that also comes up is: how do you migrate from FileStore if you're already using it? It's really pretty simple. You take the existing FileStore OSDs, either evacuate them or not, and just kill them; reprovision the same storage devices as BlueStore; and let the regular Ceph recovery take over.

The other new item coming in Kraken is AsyncMessenger. It's actually been in the code tree for a while as experimental, but it's now the default. This is a reimplementation of the network layer in Ceph. It features a fixed size thread pool, so you don't get a lot of the thread thrashing that you have with the legacy implementation, and it behaves much better with tcmalloc. One of the nice things about AsyncMessenger, though, is that it's a new software architecture that abstracts out the transport layer, the actual part that sends data over the wire. So we have the normal implementation that uses sockets and TCP, but we also have two experimental back ends, motivated by the fact that when you do profiling on Ceph on high end storage devices, you see that a lot of time is spent in TCP, reading and writing data over the wire; those are the two big peaks you see in the flame graph on this slide. One back end is based on DPDK. This is an Intel library that lets you move the network driver and the TCP stack out of the kernel into user space, with a different threading and memory model, which gives you very, very low latency network IO. That's a pretty cool prototype. The other one is based on RDMA, so you send data over the IB verbs interface on RDMA hardware. We keep the TCP connection there for the control path, mostly just because that let them put the prototype together in less than a month and make it work; they just wanted to do a proof of concept. This is all contributed by the XSKY folks. They're doing good work. So there's some interesting stuff on the network side coming down the wire. Honestly, the network really isn't the tallest pole in the tent as far as performance goes, so it's fine to let all that play out; we're mostly focused on getting the storage stack to perform better.

Moving forward to Luminous, which is coming out in the spring: the first big thing coming in Luminous is that multi-MDS, running multiple metadata servers in CephFS, is finally going to be completely stable. So you can have scale-out metadata for CephFS, and that fills in the last missing piece of the full scale-out story: object, block, and file. So that's exciting. The other big thing in Luminous is going to be erasure code overwrite support. RADOS pools have supported erasure coding for a long time, but the current implementation only lets you append to erasure-coded objects. It turns out that's simple to implement, and it's completely sufficient for RADOS Gateway workloads, where you're using the S3 protocol to dump whole objects into and out of the cluster. But it doesn't work for RBD and for CephFS, where you have to modify existing objects. So EC overwrites will enable RBD and CephFS to directly consume erasure-coded pools; the sketch below shows roughly what that will look like.
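A minimal sketch of the expected shape of that, once overwrite support lands. The pool names are hypothetical, and the `allow_ec_overwrites` flag shown is the Luminous-era interface, so treat this as illustrative rather than settled:

```python
# Hypothetical sketch: an erasure-coded pool serving as the data pool
# for an RBD image, with overwrites explicitly enabled on the EC pool.
import subprocess

# Create an EC pool (128 PGs) and allow overwrites on it.
subprocess.check_call(["ceph", "osd", "pool", "create",
                       "ec-data", "128", "128", "erasure"])
subprocess.check_call(["ceph", "osd", "pool", "set",
                       "ec-data", "allow_ec_overwrites", "true"])

# RBD still keeps image metadata (header, omap) in a replicated pool;
# only the bulk data goes to the EC pool.
subprocess.check_call(["ceph", "osd", "pool", "create", "rbd-meta", "64"])
subprocess.check_call(["rbd", "create", "--size", "10G",
                       "--data-pool", "ec-data", "rbd-meta/myimage"])
```

The win is on cost: 3x replication stores every byte three times, while, for example, a k=8, m=3 erasure code stores it 1.375 times, which is where numbers like 1.3x or 1.4x come from.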
It turns out this is really hard. The implementation requires a two-phase commit to avoid the EC equivalent of the RAID write hole. It's also complicated to avoid an inefficient full-stripe update when you do a small 4K write somewhere in the middle of a stripe. And the implementation relies on an efficient internal primitive that lets you move data between internal objects in order to do the roll-forward and roll-back efficiently, and that's only done efficiently in BlueStore; when we were trying to do this on POSIX, it was very difficult to do with a file system. So it's hard, but it's going to be huge. It's going to have a tremendous impact on your TCO, going from 3x replication down to something more like 1.3x or 1.4x. And we believe this is really gonna make RBD great again. Sorry.

Other new stuff in Luminous: we have a new daemon in Ceph called ceph-mgr, the Ceph Manager. This is motivated by the fact that the Ceph monitors currently do a lot of work for everybody in the cluster, and it turns out that a lot of what they're doing really isn't necessary. They're spending most of their time storing and aggregating stats about all the placement groups, so you can do things like ceph df, and that really isn't critical for the functioning of the cluster; it has the effect of limiting the scalability of the monitor cluster and, in turn, the overall Ceph cluster. So ceph-mgr moves these non-critical metrics out into a new daemon that's much more efficient. It's going to enable us to stream things off to InfluxDB or Graphite or whatever else for all your pretty graphs. It's also a good place to do efficient integrations with external modules, even modules written in Python, so it lowers the barrier to entry for adding intelligence to Ceph. It's a good host for integrations like API endpoints: we've taken the existing Calamari API and just plopped it right into ceph-mgr with almost no modifications. It's going to enable some coming features like ceph-top and rbd-top, which will let you identify which objects are getting the most IO and which clients are doing the most IO in the system, so you have better introspection and instrumentation. It's also going to be a good place to implement high level management features and policy, for example slowly weighting up new OSDs, or identifying OSDs that are flapping. All those sorts of policy-based decisions, where you want to automate management of the system, could be put in ceph-mgr if you want.

The other big thing that we're working on right now is quality of service. I mentioned this is one of the big functionality gaps we currently have. The goal here is to be able to set policy for reserved or minimum IOPS for a particular client or workload, and also to specify proportional sharing of the excess capacity. Initially we're going to do this based just on the type of IO, so you can distinguish client IO, background scrub IO, and recovery, and do better performance isolation there. Then we can extend that to QoS policies associated with different pools in a RADOS cluster, and eventually we'll extend it all the way out to the client, so you can say that this particular VM gets this many guaranteed IOPS across the system and everyone else fights over the scraps. The implementation is going to be based on the mClock paper from OSDI '10, which features an IO scheduler and a distributed enforcement mechanism that maps very cleanly onto RADOS. The basic idea, and you can see it in the admittedly ugly graph I took from the paper, is that if you have one client, it's going to consume all the IOPS in the system. If a second client comes in with a higher proportional priority, it's going to take the lion's share of the IOPS, but that first client is still given a guaranteed minimum number of IOPS, so it keeps a minimum level of performance; the toy scheduler below sketches the tagging scheme that makes that work.
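For the curious, here's a much-simplified toy of mClock's tagging idea (reservations served first, then weight-proportional sharing of what's left). Real mClock also has limit tags and distributed tag synchronization, so this is a sketch of the flavor, not Ceph's actual scheduler; all names and numbers are made up:

```python
# Toy mClock-style scheduler: each client gets a reservation (min IOPS)
# and a weight (share of whatever is left). Requests are stamped with
# tags; overdue reservation tags are served first, otherwise we fall
# back to weight-proportional order.
class Client:
    def __init__(self, name, reservation_iops, weight):
        self.name = name
        self.res_interval = 1.0 / reservation_iops  # spacing of reserved slots
        self.prop_interval = 1.0 / weight           # spacing of share slots
        self.res_tag = 0.0
        self.prop_tag = 0.0

    def stamp(self, now):
        # Tags advance by the per-client interval but never lag behind
        # "now", so idle clients don't bank credit (a key mClock property).
        self.res_tag = max(self.res_tag + self.res_interval, now)
        self.prop_tag = max(self.prop_tag + self.prop_interval, now)

def schedule(clients, now):
    # Serve the most overdue reservation first; if none are due,
    # serve by proportional tag (i.e., by weight).
    due = [c for c in clients if c.res_tag <= now]
    pool = due if due else clients
    key = (lambda c: c.res_tag) if due else (lambda c: c.prop_tag)
    winner = min(pool, key=key)
    winner.stamp(now)
    return winner.name

clients = [Client("vm-a", reservation_iops=100, weight=1),
           Client("vm-b", reservation_iops=500, weight=4)]
served = [schedule(clients, now=t * 0.001) for t in range(10)]
print(served)  # vm-b dominates, but vm-a still gets its reserved slots
```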
One of the other things we hear about, particularly with RBD, is concern around latency. Obviously, when you move to a shared storage system, you're introducing latency, because your writes are going over Ethernet and getting replicated multiple times, and you have to wait for the acknowledgment to come back. That's a fact of life. In contrast, if you're writing to a local SSD in your hypervisor, you get very good performance and very low latency. That much is expected. The problem is that if you use local storage devices and that SSD fails, or your client host fails, you lose all the data. Obviously that's not what we want. So naively, what you would like is some sort of write-back cache, where you write to your local device and asynchronously flush things out to the cluster. The problem is that typical write-back systems are unordered, so if you do that and then lose your client cache, the copy stored in the cluster is going to have out-of-order writes, and it's going to leave the data in an inconsistent state that looks corrupt from the file system's or application's perspective. So plain write-back caching doesn't quite solve the problem.

What we're looking to do in the RADOS Block Device is create an ordered, persistent client write-back cache, where we're careful about the order in which we write things back from the client cache to the cluster, so that even if you lose the client host or the client cache SSD, the version of the image stored in the cluster is in a fully crash-consistent state. It might be stale, but it's crash consistent. So you get low latency writes to the local SSD, you get that persistent cache, you get fast reads from the cache, and you have this ordered write-back that gives you a point-in-time consistent RBD image. This fills out the spectrum. Currently you have to choose between a local SSD, with low latency but a single point of failure and no data if you lose the SSD, and full Ceph RBD replication, with higher latency but no single point of failure and perfect durability. This gives you a middle point, where you have a single point of failure only for the most recent writes, the last few seconds, and if you crash, you get a stale but fully crash-consistent copy. We think this is actually going to be useful for a broad class of applications that don't need that perfect durability.

But one of the biggest things that we try to keep in mind on the team is that in the future, despite this being a cloud conference where we all talk about VMs, most data is going to be stored in object stores.
All those cat pictures and videos and whatever else are going to be stored in object stores, not block devices or file systems. So the RADOS Gateway is a key strategic component of the Ceph system, and there's lots of stuff we're working on to enable all that: things like erasure coding, multi-site federation is big, and tiering features are going to be really big. But one of the cool new things that's coming, it's actually been prototyped for Kraken but it'll be stable in Luminous, is RADOS Gateway metadata indexing. This is actually grafted onto the new multi-site federation feature. With RADOS Gateway multi-site, you have multiple Ceph clusters and multiple RADOS Gateways for each cluster; each of those is defined as a zone, and you have the RADOS Gateways talking to each other to do asynchronous replication between sites. We've extended this with a plugin that looks to the system roughly like a replication target, except that instead of replicating all the data, it watches the same logs and so forth but replicates only the metadata, pushing it into Elasticsearch, where you can query based on file type or object attributes or whatever else you put in there. And in fact, although I show it here as a separate cluster, it turns out these zones don't actually have to map to a cluster; a zone is just a set of pools in an existing cluster. So you can have an extra RADOS Gateway instance in an existing cluster that's just tasked with feeding all this data into Elasticsearch, where you can do your queries. This is something people have been asking about for a long time, so it's pretty exciting.

And I'll wind up here by talking a little bit about the development activity and process. We have a growing development community: the number of contributors has been increasing roughly linearly for several years now, and the amount of code we're able to produce and the features we're able to implement have been increasing as well, thanks to the effort of a lot of different organizations. These are the top contributors for the current Kraken release; lots of organizations besides Red Hat. And I think the cool thing here is not just the number of organizations but the breadth of the types of organizations. You see the usual suspects, the cloud service providers and OpenStack companies. You also see a lot of cloud operators, people like DreamHost, Tencent, DigitalOcean. You see telecoms. You see OEMs who are building hardware and enabling Ceph to make use of that hardware. You see storage companies, in fact some very old, traditional storage companies, seeing Ceph as something they need to pay attention to.

So I have just a few takeaways from this talk. The first thing I really want to drive home is that nobody should have to use a proprietary storage system out of necessity. There's no reason why there shouldn't be an open solution that provides everything you need; we're here to fix that. And furthermore, the best storage solution should be an open source solution, and that's part of our goal. Ceph has been growing up, but we're certainly not done yet. There's still a lot of work to do on performance, scalability, features, and ease of use. But we are highly motivated, and we're very interested in working with everyone here to get there. So how can you help?
As an operator, first and foremost, you can file bugs. I've heard about all kinds of issues here in the ops session, and elsewhere in the hallways, problems people have hit with Ceph that I hadn't actually heard about before. I admit I don't religiously follow the bug tracker, but I think a lot of these issues simply haven't been reported. So communicating upstream to the development community can be very helpful. There are also features that are very hard to use, and contributing documentation is a very low barrier way to help the ecosystem. You can blog about your experiences, what works well and what doesn't. And if you do all these things, you build a relationship with the core team that helps you become involved, helps us understand what pain points you have, and lets you influence the future direction of the storage system.

As developers, there's even more you can do. You can go and fix those bugs that you reported; we love that. You can help us design new functionality and actually go and implement it, and that's very important. It's equally important to help integrate Ceph with other platforms. We've done a lot of work with OpenStack, obviously, and Ceph and OpenStack work very well together, but there are a lot of other emerging infrastructure systems we can work with, like Kubernetes and whatever else. You can help make Ceph easier to use; this is really one of the biggest challenges that we have as a community. And you can participate in the monthly developer meetings that we have. They're all on video chat, we have EMEA- and APAC-friendly time slots, and we highly encourage you all to get involved. And that's all I have. Thank you very much. And I think I have time for questions. Yes.

Yeah. So the question is about third party software using librados directly, instead of RGW, RBD, or CephFS. We're seeing that more and more often, where we have people building integrations directly with librados instead of the higher level interfaces. And we love it. I think it's great. Librados is a much lower level interface, but it gives you a lot of power that the S3-type object storage interfaces don't. So we've seen time series databases. We've seen archival systems. We've seen databases. Recently we did an integration with RocksDB, so you can run RocksDB directly on librados, which means you could run MySQL on RocksDB on librados as a database-as-a-service thing if you wanted to. I'd love to see somebody go implement that.
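Since that question comes up a lot, here's what "using librados directly" looks like in practice: a minimal sketch with the python-rados bindings showing byte-addressable IO and per-object metadata, the kind of thing an S3-style interface doesn't give you. The pool and object names are made up, and the C, C++, Java, and Go bindings follow the same shape:

```python
# Hypothetical librados sketch: partial overwrites at arbitrary offsets
# and per-object extended attributes, straight against RADOS.
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("app-data")  # hypothetical pool

ioctx.write_full("timeseries-chunk-0001", b"\x00" * 4096)  # create/overwrite
ioctx.write("timeseries-chunk-0001", b"sample", 2048)      # partial write @ offset 2048
ioctx.set_xattr("timeseries-chunk-0001", "schema", b"v2")  # per-object metadata

print(ioctx.read("timeseries-chunk-0001", 6, 2048))        # read 6 bytes @ 2048
print(ioctx.get_xattr("timeseries-chunk-0001", "schema"))

ioctx.close()
cluster.shutdown()
```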
Yeah. Other questions? Yes.

So the question is about limits of scalability: five petabytes, 15 petabytes. The system is designed to scale infinitely, although of course that's never really true. In practice, the limitations on Ceph's scalability are really around the number of OSDs in the system, not the number of bytes, so if you have big devices you can store a lot more bytes. The biggest Ceph clusters I've been directly involved with building are test clusters we did at CERN, and I think we were in the neighborhood of 4,000 or 5,000 OSDs; they had earlier done one that was more like 8,000. We've just done a whole bunch of work in Kraken reducing the map sizes that were limiting our scalability, so we should be able to go much past that, but we haven't had a chance to retest. It's hard to test big clusters when you don't have big clusters; I wish we could afford one, but we don't. But eventually I see us going to 100 petabytes and well beyond that, honestly; it's just hard to actually build those. The biggest one I've personally used was around 5,000 OSDs, if I remember correctly. But I've heard of ones that are bigger; I don't know how they went. Yeah, other questions? All right, thank you very much.