My name is Marcel Hergaarden. I'm a marketing manager, or evangelist, for the Data Services BU, and from that perspective, today we're going to brief you about data resilience for OpenShift, and that is with Red Hat OpenShift Container Storage. First of all, my best wishes for the new year to everybody. I hope it will be a fantastic year for all of us.

For now, we will have a look at what's new with OpenShift Container Storage 4.6. So we'll have a short introduction into Red Hat Data Services, what that is about, and then how that relates to OpenShift Container Storage. We will dive into the release highlights of version 4.6, what's specific about this release, what the theme is. We'll dive into some other new features that come with this release, and we'll have a glimpse at the roadmap. So this is the agenda for today.

First of all, let's have a look at data services. What are data services about? We offer data resilience for OpenShift with Container Storage 4.6, and the whole goal is to make data accessible to applications across the hybrid cloud, unlocking its power in new and impactful ways. And this enables innovation without limitation. So the messaging here is actually simplified access, so easy to access. A consistent experience, no matter where you run: on a public cloud, on-prem, virtualized, or on bare metal, it's all the same consistent experience. It doesn't matter where you run. And it scales dynamically, so the system is very dynamic and can adjust to your needs.

With data services, we have a different mindset. First of all, let's have a look at the traditional static approach, as it always was in the earlier days. There was always a focus on improving efficiency, to get things better. It was more or less an infrastructure view, so we always looked at these things from the infrastructure side. Poor performance at scale: when the system needed to grow and scale out, there were performance penalties, because there were limitations with scaling, and you hit walls at some point, and you needed to be flexible with these things, and performance issues applied. It's disconnected, so it's not always online: it's a specific setup that sits somewhere in the data center, and it's not per se connected to the internet. So therefore it's manual, monolithic, and rigid, actually.

And if we look at the new approach, the data services approach, there's a focus on innovation, and we don't look at it from an infrastructure view, but from the application side. So it's application oriented: it needs to have benefits for the application, and it needs to be highly scalable, because if you need to grow, then your data services and storage also need to be able to grow. Always on, so always available, and automated, on demand, and flexible. You don't want to wait on an administrator to fix things for you if you need some adjustment; it's entirely automated.

So what does that mean for developers and data scientists? First, let's have a look at the traditional approach, and this is a comparison with a library, where you actually go for books. You must go there, you must visit that library, and there's a limited range of content on offer. So not everything is per se there; what they have is what's available. You can only check out a few items each time: when you come there, you can only take a few books with you, you cannot take the entire collection, and there's a line to check content out.
So you need to get in line to check out, and there's limited usage: you can keep those books for a limited period of time, and then you need to return them, or you must revisit the library to renew your books.

Let's make the same comparison, but now from a data services approach. With data services, you have access to your data from anywhere, no matter where you are. There's a wide range of rich content on offer; it's not limited to just a certain collection of content. And you can have simultaneous access to all the content: multiple people, multiple users, can access the same content, so it's not limited to one user having access at a certain moment in time. And self-service, so there is no need for manual supervision by some administrator who needs to fix things for you. There's unlimited usage, and access can be granted indefinitely. As you see, there is a lot of benefit here, but that also means a different approach. And this is typically the difference between the traditional static approach and the data services approach.

Well, let's have a look at Red Hat Data Services. Where do they fit? We have the Red Hat OpenShift Container Platform, which sits on an infrastructure; that can be a public, private, or edge environment that OpenShift Container Platform runs on. It consumes whatever storage is available in that infrastructure platform. Then on top of OpenShift, that's where Red Hat Data Services sits, and that serves the applications. We'll go into some more detail there. If we compare this to the old-school OSI model that was used in the networking world, then you can see that the underlying storage which is available in the infrastructure is the physical storage, the so-called layer one, the physical layer. Red Hat Data Services is higher up in the stack, and that's actually the presentation layer, because it doesn't serve raw storage; it actually provides storage services. And that is what Red Hat Data Services does for you. So you're not selling a storage product as such, but you're selling storage services, data services, based on storage. So if you are in a situation where you already have a storage system or some storage environment, that doesn't mean that you cannot use data services, because that sits higher up in the stack, so you can still leverage Red Hat Data Services, and this presentation will explain why that makes sense.

Red Hat OpenShift Container Storage consumes storage to provide higher-level data services, and this is very important to realize, because it's not a storage product as such, although it has storage functionality built in; it always consumes existing block storage. Well, that existing block storage can be Amazon AWS block storage, it can be Azure or Google Cloud Platform, but it could also be storage sourced from VMware, via vSAN or a Fibre Channel SAN, or maybe local drives, direct-attached drives, and it can even be a local drive in a bare metal system. So that's the storage layer. On top of that, that's where Red Hat OpenShift sits, and on top of that is where Red Hat Data Services resides, which uses Red Hat OpenShift Container Storage version 4.6. With that, you will get uniform Kubernetes RWO and RWX storage classes. Multiple storage classes are being sourced from that single storage below the infrastructure line. I hope this makes sense: we add a lot of functionality just on top of storage. So your existing storage is most likely block storage. To make this concrete, a small sketch of what consuming those storage classes looks like follows below.
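As a minimal illustration of those RWO and RWX storage classes, here is a sketch of two persistent volume claims. The storage class names are the usual defaults that an OCS installation creates and are assumptions here; check `oc get storageclass` on your own cluster, because your names may differ.

```yaml
# Two PVCs against the storage classes a default OCS install provides.
# Class names are the common defaults and may differ in your environment.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-block
spec:
  accessModes:
    - ReadWriteOnce                  # RWO: mounted by one node at a time (Ceph RBD)
  storageClassName: ocs-storagecluster-ceph-rbd
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-shared
spec:
  accessModes:
    - ReadWriteMany                  # RWX: shared across pods and nodes (CephFS)
  storageClassName: ocs-storagecluster-cephfs
  resources:
    requests:
      storage: 10Gi
```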
On top of that block storage, there is Red Hat Data Services with OCS 4.6, OpenShift Container Storage 4.6, and that serves you with file storage, block storage, and object storage, S3 compatible. So there is a lot more on top, and that's what we call Red Hat Data Services.

And this is a kind of new vision, because with data services, we divide this into three main pillars, actually. We have data at rest, and data at rest is data in databases, data warehouses, data lakes, and so on. So that's data that's being stored in a database for use; it could be MySQL, CrunchyDB, DB2 Warehouse, for instance. That's data at rest. The second one is data in motion, and with data in motion we mean streaming data and messaging data: Kafka, AMQ Streams, serverless, Knative, so functions, serverless functions. That's data in motion. And then finally we have data in action, and data in action is where data is actually being processed: data analytics, data intelligence, AI/ML, Starburst, Jupyter, Jenkins, TensorFlow, those kinds of things. That's what we call data in action. So data at rest is databases, data in motion is streaming and messaging, and data in action is where data is actually being processed. These are the three main data services that we actually recognize with OpenShift Container Storage in Red Hat Data Services.

So speaking about Red Hat OpenShift Container Storage, what does it actually provide? What does it offer? It offers you persistent storage for containers in all flavors: block storage, file storage, object storage. It has fully integrated management from within Red Hat OpenShift, so there is no separate console and no separate way of managing Red Hat OpenShift Container Storage; it's all built into Red Hat OpenShift, there's complete integration. And it offers storage provisioning, like I said, for all types of data, block, file, and object, and it's all OpenShift operator based, so it's a very easy setup. It's no difficult process to set it up, and everyone can do this, actually.

So to summarize: Red Hat OpenShift Container Storage runs on OpenShift, it integrates with OpenShift, it serves any type of data service, and it runs on any cloud: public cloud, private cloud, on-prem, virtualized, bare metal, whatever. It's a consistent experience, regardless of the underlying infrastructure, and it's all used and managed through OpenShift, so no separate consoles, no difficult things, easy to install and easy to work with.

The key use cases that we see with our customers are things like cloud-native apps, CI/CD, and repositories, where this can really be helpful, but also structured data, like databases and data warehouses, and we have a very nice example of IBM DB2 Warehouse, where OCS has actually been used with really great results. Another use case that we see often is big data, data analytics, and AI/ML, where Red Hat OpenShift Container Storage can also serve as data lake storage, where you can gather all your data, and multiple applications can access the same data without having to move data around, or you can have multiple data sets available within the same system. Since that data lake use case builds on the S3-compatible object storage, a small sketch of how an application requests a bucket follows below.
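As a hedged illustration of the S3-compatible object storage mentioned above, here is a minimal ObjectBucketClaim sketch. The storage class name is the usual default for the multi-cloud object gateway in an OCS install; treat the names as assumptions and verify them on your own cluster.

```yaml
# Request an S3-compatible bucket from the multi-cloud object gateway.
# OCS answers the claim with a Secret and ConfigMap (named after the
# claim) holding the S3 endpoint and credentials for the application.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: analytics-data-lake
spec:
  generateBucketName: analytics-data-lake      # prefix for the generated bucket name
  storageClassName: openshift-storage.noobaa.io
```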
So now let's go into the details, the release highlights of Red Hat OpenShift Container Storage version 4.6, and the theme of this release is actually data resilience for OpenShift. Well, what does that mean? We offer data resilience for applications that run in OpenShift. And how does that work? I created this little word cloud with a lot of words and terms that you might hear or have seen somewhere, and let's go into a little more detail here.

The new data resilience features with OpenShift Container Storage 4.6 are all built on open standards. It uses the Container Storage Interface, also known as CSI, that comes with OpenShift. With that, we can actually create CSI-based snapshots on the storage, and we can make use of OADP, and OADP is OpenShift APIs for Data Protection. We will go into more detail on this later in the presentation, but just so you know, we use open standards here, no proprietary stuff.

And what we deliver is container-aware data protection. That means application backup at the persistent volume level, so protection at the persistent volume level, but that's not really special. What is special here is that we also do cluster protection at the namespace level. Just creating data protection, a backup, from your persistent volume is nice, but that doesn't mean you have the entire capability to restore everything, because you need to know the correlation between the volume and the container pod, but also the metadata around it, where it makes sense to the container: the settings, the specifics of the cluster. So what I'm trying to say is that a persistent-volume-level backup alone may not be enough. If you need to do a disaster recovery, you also need the other information to make the container run again, to have everything at hand, and you can even restore across multiple OpenShift versions. It will also work alongside existing data protection solutions. So it can use your existing backup solution, where OCS orchestrates the container backup by working through the Container Storage Interface. There is no need to replace your existing backup software, so you can reuse what you already have in house.

So let's have a little look at what this looks like. In the traditional world, the traditional model, we had a server that ran applications, that had state, and that was connected to some external storage; that's the live situation. Then, if you wanted data protection on that, we created a backup. For the server itself, for the operating environment, we mostly used a server image, so you could restore an image to get the operating environment back. But then you needed to have the state, and the state is the installed applications, the settings of those applications, those kinds of things that need to be restored from a backup. Third, you wanted to back up the storage, because the storage actually held the application data. So where the server held the operating system, the applications, and the state, the data of the application would be in the storage. Those things were considered in the backup altogether, and that would enable you to restore the entire system.

Then there was a shift to the container platform. In this situation, there was no server state anymore, no operating system backup, because a container platform can run anywhere. It was simple to rebuild, to just build a new container platform, and the containers run from an immutable image that came from a registry.
So as long as you had access to the registry, you could rebuild the platform and run the container once again, so there is no need for an operating system backup, for a state backup in that regard, of the applications. The containers, of course, had state, and this state was captured in a persistent volume. That persistent volume lived on external storage, and that storage also needed to be backed up to make it available again in the case of disaster recovery. And then you need to be aware of the correlation again between the container and the persistent volume, because if you just restore persistent volumes, you never know which container belonged to which volume. So that's also something to take into consideration. So that was kind of the difference.

And now, with OCS 4.6 and data services, this changes: now we can save the state of the entire platform. That means we can save the state of the container platform, the specific settings that are relevant for the container platform, and we also save the storage platform metadata. That's all in the backup. And this means that you can now do an entire restore to another cluster, for instance, or even to another OpenShift version, because you have everything that's needed to recover, to bring the application back online again, including all the namespace details.

And to make it even more interesting, we can do this with OADP, which provides APIs to back up and restore OpenShift cluster resources: YAML files, the specific files for a namespace, internal images, so the images that are needed to actually start the application, but also the persistent volume data. All of this is realized within this data protection mechanism. That's done by OADP, which currently works with IBM Spectrum Protect Plus as a partner application that has this enabled. So what does this mean, actually? We work with several partners, of course; there are others we work with. For instance, we are also working with Kasten and Trilio, and there may be other applications in the future that will work with this and support it.

So this is a flow chart. On the left side, we have the container platform and the storage. Your data protection application with OADP enabled, and we'll go into more detail on this later, so don't worry if it's a little bit much, will reach out to your container platform cluster by leveraging the OADP API, and it will take a backup of the containers, the storage, the state, the system information, the namespace information, and all that information will end up in a complete data protection set on the other side. With that backup, we can restore to the same cluster or to another cluster. We can reproduce the same information very easily, without difficult interaction.

And if you want to see this happening live, my colleague Annette Clouet created a nice video that actually shows how this process works in real life. On this slide there is a link, and I will of course share it with you afterwards. If you go to that link, you will end up in a video that shows in detail how to do this, how it works, and what it delivers. I hope that makes sense; this is just to give you a glance of what's possible here.

So with OADP, you can use what you already have. It works with your existing data protection application, and it orchestrates container backup with OpenShift Container Storage. You don't have to worry about snapshots and so on.
It's all abstracted away by OADP, which integrates with your backup application. It provides application consistency versus crash consistency: with just a snapshot, you would have crash consistency, and with OADP combined with your backup application, that's where you can have application consistency. This also results in a reduced need for training, because you can use what you already have; you don't need to use some other backup application or some other data protection application to realize the same thing. You can just use whatever is already in the house. And it therefore doesn't require additional monitoring, because it's already integrated: it uses OpenShift, and it uses whatever you already have.

So what is OADP? It's the OpenShift APIs for Data Protection. It's a set of APIs, and it's a Red Hat supported community operator, which provides APIs for backup, restore, scheduling, backup storage locations, and snapshot locations. This is all done by OADP. And as I said, it's a community operator, and this one is supported by Red Hat for ISVs, so backup vendors and backup partners can actually use this and get Red Hat support on it. So you're fully supported here with this mechanism.

Let's have a closer look into how this works. With Red Hat OpenShift, you get CSI, the Container Storage Interface, which is the standard mechanism for accessing container storage. Once we have Red Hat OpenShift running, we can do an operator-based installation of Red Hat OpenShift Container Storage, which has support for CSI, snapshots, and clones. This is a really simple installation: click, install, and then a little bit of configuration. It's not that complex, so it's easy to get it all up and running. And once it runs, you get a very nice marriage of the two. That means you have a single interface to work with, through CSI. You're working with the CSI interface within OpenShift, and that attaches to Red Hat OpenShift Container Storage and makes it possible to create snapshots and clones without having to go into all the details, without having to go under the hood and run Ceph commands and so on. It's all abstracted away, and it all runs through CSI. So this is to show you that you have fully integrated management within the Red Hat OpenShift UI, and if you use your backup application with OADP, it makes it very easy to work with this combination.

So with an OADP-enabled backup application, and at this moment we have IBM Spectrum Protect Plus, which can actually do this, it really works very simply. It just addresses the CSI interface, and it works with the OADP APIs to make the whole thing run without having to go through difficult details. And then your question may be: hey, that's nice, but I don't have IBM Spectrum Protect Plus, what can I do with my other application? Well, you can use it with any other backup application. At this moment, you can use pre- and post-scripts, and those are scripts that you can run in the backup application. Pre means that before the backup starts, you can do some interaction with CSI, so you could request the storage to create a snapshot and then mount the snapshot to the backup application and create the backup; and the post-script runs afterwards and can, for instance, remove the snapshot after it was backed up. So OADP abstracts all this complexity away. To give you a feel for what that abstraction looks like, a small sketch of a backup request follows below.
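As a hedged sketch of what such a backup request looks like: OADP bundles Velero under the hood, so a namespace-level backup is expressed roughly like the manifest below. The namespace and storage location names are assumptions for illustration; the exact resources depend on your OADP version and configuration, so consult the OADP documentation.

```yaml
# A namespace-level backup request as OADP (via Velero) models it.
# Everything in the 'myapp' namespace is captured: the YAML resources,
# plus the persistent volume data through CSI snapshots.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: myapp-backup
  namespace: oadp-operator          # assumed OADP install namespace
spec:
  includedNamespaces:
    - myapp                         # the application namespace to protect
  snapshotVolumes: true             # snapshot the persistent volumes too
  storageLocation: default          # assumed backup storage location name
```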
If you have a backup application that can use OADP, great; if you have another backup application that is not yet OADP enabled, then you can use pre- and post-scripts and address the CSI interface on OpenShift. Either way, you have a nice solution to back up your containers, but also the other relevant information for your cluster, so that you are able to restore.

So let's have a look at the data resilience service level objectives. With the current 4.6 release, we have support for backup solution enablement. That means protection against logical failures: things like user errors, accidental deletion, bad actors, application software bugs, hijacking software, malicious software, viruses. You can restore data that was captured in your backup. It's a restore to a previous point-in-time copy, and it's based on a snapshot, and a snapshot can be local or remote. So this is supported right now. In future releases, and that's on the way later this year, we will also have a regional disaster recovery solution, which is actually protection against data center disaster. That means asynchronous replication to another site, which protects you against a data center disaster: think of power grid failures, geographic-scale events, natural disasters, those kinds of things. Then you have another copy available in another data center. And later on, we will have a Metro HA/DR solution: protection against hardware failures, hardware components, system or rack level failures, et cetera. This is all to come in future releases; it's planned for the next release and onwards. So for now, we have full support for the backup solution, and in the future you will also have replication, asynchronous and synchronous, planned for the next releases of OCS.

This all makes sense in terms of RTO and RPO. RTO is the recovery time objective: the time it takes to get back into operation after a failure, so the application downtime. And then there's RPO, the recovery point objective: how close to the moment of the disaster your last recoverable copy is, in other words, how much recent data you can afford to lose. The further these solutions move to the right on that scale, the smaller the gap between the point of disaster and the point you can recover to. So for now: logical failures, support for backup applications; future releases, coming soon: asynchronous and synchronous replication. That is on the roadmap for the next release.

Then we have some other new features to walk through. So this was the main theme: data protection, data resilience for OpenShift applications, by offering the capability to create CSI-based snapshots. Without having to go through all the difficult details, you can just address the CSI interface, and that takes care of all the complexity and makes sure the snapshot or clone is realized for you. But that's not the only thing that comes with OCS 4.6. There are some other features. OCS snapshot and clone, which we already spoke about. What I didn't tell you is that it's also possible to do the same from the OpenShift UI: within the OpenShift UI, you can also create snapshots. So even if you don't have an external backup application that does this, whether through OADP or through scripting, you can also do it yourself in the OpenShift UI. A minimal sketch of such a snapshot request, as it looks through the CSI interface, follows below.
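For reference, here is a hedged sketch of the CSI snapshot request that the UI, OADP, or a pre-script ultimately creates. The API version reflects the snapshot beta API that was current around OpenShift 4.6, and the snapshot class name assumes a default OCS install; both are worth verifying on your cluster.

```yaml
# Ask CSI to snapshot an existing PVC. A pre-script could simply
# `oc apply` a manifest like this before the backup runs, and delete
# the VolumeSnapshot again in the post-script.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: app-data-block-snap
spec:
  volumeSnapshotClassName: ocs-storagecluster-rbdplugin-snapclass
  source:
    persistentVolumeClaimName: app-data-block   # the PVC to snapshot
```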
Then another thing is encryption at rest for the entire cluster, and this one uses Ceph encryption; Ceph is under the hood for this, and it applies to the entire cluster. This is good for when you have failed drives or something like that: no one can actually access that data, because it's encrypted. In a little more detail: encryption at rest for the entire cluster means that all your storage devices use a self-generated encryption key and are encrypted at the OSD level. This protects against disk theft and allows safe RMA of failed local storage devices. Your data is protected; no one can do anything with it without having access to the key.

Then, multi-cloud object gateway namespaces. That's a whole mouthful, and I will go into a little more detail over the next few slides; the same for multiple storage classes. So we now have a capability for multiple storage classes to enable new functionality, local storage operator announcements, and improved bare metal deployment.

So, a little more detail on multi-cloud object gateway namespaces. This is also something that needs a bit of explanation. This is about object storage, of course, and a namespace bucket is actually a proxy bucket: it does not contain any objects itself, it's just a pass-through. It connects to existing buckets, which may reside on Amazon AWS or Azure Blob, and it provides you with an aggregated view of all these object resources. So you can have multiple locations where you maintain object buckets, and this gives you an aggregated view across all those buckets that live in different locations. And you can configure a default write target. That means that anything you write goes to one specific location, but everything you read can come from multiple locations. This is very handy if you want to do a migration from one platform to another, where you still have access to the existing data that lives on your old platform while you're already working on the new platform, or the other way around. In this example it's public cloud, but it could of course also be another data center, a private data center, or whatever. So this is what multi-cloud object gateway namespaces do: a federated view across multiple bucket resources, writes go to a specific location that you configure up front, and the purpose of this is data migration and things like that.

Multiple storage classes are here to introduce new capabilities, and an example of such a new feature could be replication count two. Maybe you ask yourself, what's this all about? By default, we have replication count three, so everything is written three times into OpenShift Container Storage. With replication two, we can reduce that to two copies, and this is specifically handy if you're working with applications that already have their own internal replication mechanism. Certain applications already have internal replication, and if you then store their data on OCS, it still gets replicated three times, so you get replication on top of replication. To reduce that burden, we can lower this down to replication count two. There will always be a need for some replication, because replication one means none: if anything goes wrong, then you're doomed and your data is lost. So replication two is the minimum that we absolutely need, but it's better than replication three if you have an application that already has its own internal replication. I hope this makes sense. And then compression is another feature that you might want to enable on a certain storage class; the compression gets enabled on a storage class, hence the need for multiple storage classes. So you can have your default storage class with replication three, a separate storage class with replication two that you can assign your applications to, or a separate one for data compression, which leads to data reduction in the end. So that's about the multiple storage classes; a sketch of what such an extra storage class could look like follows below.
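As a hedged sketch, under the assumption of a default OCS install in the openshift-storage namespace: a replica-two pool with compression enabled, plus a storage class on top of it. All names are illustrative, and the exact parameters can vary by release, so treat this as an outline rather than a recipe.

```yaml
# A second Ceph block pool with 2-way replication and compression,
# for apps that already replicate internally.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replica2-compressed-pool        # illustrative name
  namespace: openshift-storage
spec:
  replicated:
    size: 2                             # two copies instead of the default three
  compressionMode: aggressive           # compress data on write
---
# A storage class that provisions volumes from that pool (abbreviated;
# a real class also needs the CSI secret parameters for your cluster).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ocs-replica2-compressed
provisioner: openshift-storage.rbd.csi.ceph.com
parameters:
  clusterID: openshift-storage
  pool: replica2-compressed-pool
  imageFeatures: layering
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
```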
The container snapshot and clone feature, we have already been through more or less, so I'll leave it at that. You can use the API directly; you don't specifically need a backup application, but if you have one, you can easily interact with it, because it's all open.

Another announcement is the local storage operator. With this, it becomes very easy to create an automated listing of candidate disk devices. Suppose you're on a virtualization platform or on a cloud platform where you want to use SSDs attached directly to your instance, or you use bare metal with multiple disks in the chassis; then you get an automated listing of whatever disk devices are available, and you can make them easily usable for deployment with OCS, OpenShift Container Storage. Fully automated discovery, without having to go through disk discovery steps outside OpenShift, you know, going to the command line to see whatever devices you have. This is all abstracted now in the local storage operator, so it makes the life of an admin somewhat nicer and easier again. Then there is an improved attached-devices deployment, which is a result of that improvement in the local storage operator. A sketch of what such a discovery request looks like follows below.
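As a hedged sketch of that automated discovery, assuming the operator is installed in the usual openshift-local-storage namespace: a LocalVolumeDiscovery resource that asks the operator to inventory candidate disks on the storage nodes. The node selector label is the one commonly used for OCS storage nodes and is an assumption here.

```yaml
# Ask the local storage operator to discover candidate disk devices on
# every node labeled as an OCS storage node. The results show up as
# LocalVolumeDiscoveryResult objects, one per node.
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: cluster.ocs.openshift.io/openshift-storage
            operator: Exists
```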
And this brings us to the tech preview features. Tech preview features are features where you get early access to things that are being developed at Red Hat. They're not specifically meant for use in production; if you have a specific need for production use, you can always reach out to us, and we can see if we can work with you on an exception basis, where possible. But this is tech preview: you can test it, but it's not by definition suitable for production use.

One of these features is autoscaling of object gateway endpoints, and this is all handled by OpenShift functionality. So you can scale your S3 endpoints to improve performance. There is also support for OpenShift Container Platform compact mode for edge situations, a reduced platform footprint for specific situations. Let's go through them quickly, one by one. The multi-cloud object gateway autoscaling scales the S3 endpoints for performance; this is done by OpenShift autoscaling, which creates multiple endpoints where needed, so that you get better performance. Compact mode is where you run everything, OpenShift and OpenShift Container Storage all together, on a three-node cluster. Just three nodes. This is meant for proof-of-concept demonstrations, or for edge use cases, or specific use cases where you don't need that much in hardware resources, so you can use compact mode, which runs the entire system on three nodes. That goes for the OpenShift platform, but also for OpenShift Container Storage and everything within. Only three nodes are required here, for a smaller footprint.

Another thing in tech preview is support for expanding your OpenShift storage capacity across multiple back-end storage classes. Well, what does this mean? We have customers that use VMware infrastructure, and these customers often deal with datastores of four terabytes; as a kind of common practice in VMware, they use datastores of four terabytes. Now we can use multiple of those as a back-end for OpenShift Container Storage. So this is also something in tech preview that makes the life of a VMware OpenShift user somewhat easier. The reduced footprint, once again, can run on nodes that have 8 CPUs and 24 GB of RAM; again, for the edge and smaller use cases, proof of concepts, this is where you can work with limited hardware resources. Finally, support for IBM Power and Z platforms: we can now run OpenShift and Red Hat OpenShift Container Storage on these platforms, and this is tech preview with release 4.6. For the people who are interested in this, it's now possible to run this combination on these platforms.

Then we have reached the point of the roadmap. 4.6 is where we are right now; it was released on December 17th, as a kind of pre-Christmas present, and the full official marketing launch will happen soon. It's planned for January 19th, so expect more to come; you just had a sneak preview of what's coming along and what the highlights are. With 4.6, the main themes are the CSI snapshots, data resilience for OpenShift, the backup partner-based solutions, and the three-node cluster. Those are the main highlights here. 4.7 will have geo-replication, asynchronous replication, synchronous replication, and stretch cluster, and that's to be expected in future releases.

For today, this has been my story, my presentation. I hope you enjoyed it and that you got a bit of insight into what's new in 4.6. If you want more information, you can visit our website, www.redhat.com. My name is Marcel Hergaarden, and my email address is below. Feel free to reach out if you want to know more, if you have questions, or if there's something specific you want me to address offline, and if there's anything I can answer right now, I'm happy to. Thank you.