From our studios in the heart of Silicon Valley, Palo Alto, California, this is a CUBE Conversation.

Hi, I'm Peter Burris, and welcome to another CUBE Conversation from our studios in beautiful Palo Alto, California. As we do with every CUBE Conversation, we want to talk about an important topic with smart people who can provide some good clues and guidance as to how the industry's going to move forward, and we're going to do that today too. Specifically, we're going to talk about the enormous amount of interest in Kubernetes as a technology for making possible the whole microservices approach to application development. One of the challenges is that Kubernetes has been specifically built to be stateless, which means that it's not necessarily aware of its underlying data. Now, that is okay for certain classes of application, but the typical enterprise does want to ensure that its data can remain stateful, that it has the level of protection required, et cetera. Which creates a new need within the industry: how do we marry stateful storage capabilities with Kubernetes? And to have that conversation, we've got some great guests here. Partha Seetala is the co-founder and CTO of robin.io, and Radhesh Menon is the CMO of robin.io. Partha, Radhesh, welcome to theCUBE.

Great to be here. Good to be here. Thank you.

All right, so Radhesh, why don't we start with you? Why don't you give us a quick update on robin.io?

Sure. robin.io, as you were alluding to, is addressing a super important problem in front of us, which is that you've got cloud-native technologies, especially containers and Kubernetes, becoming the default way in which enterprises are choosing to innovate. But at the same time, there's a whole swath of applications, architected as recently as five years ago, which all need to get the same benefits of agility, portability, and efficiency from cloud-native technologies.
Robin helps bridge that, and I hope to talk more about that.

Excellent. So Partha, let's turn to you and talk about this problem, this impedance mismatch between applications that require some stateful assurance about the data and Kubernetes, which tends to be stateless. How does that impact the way applications get built and deployed?

Sure. You mentioned that Kubernetes is a platform that started, or originated, for stateless workloads, and people have adopted it; it's the fastest-growing open-source project, we know about that. But when you look at a stateless workload, it actually depends on state from somewhere. It's basically computing something, right? It's computing on state that's coming either over the network, or on state that's stored, whether inside a big data lake or inside a database. Now, if you look at the problem itself, developers have gotten used to the agility benefits that Kubernetes has to offer, the infrastructure-as-code kind of constructs that it offers. However, the agility is not complete if you do not also bring the stateful workloads into the Kubernetes fold. So as an example, think about somebody who's trying to build an entire pipeline, across ingest, process, serve, and visualize. Now, if you're saying that, in order to put this entire stack or entire pipeline together, I still have to do something non-agile by going outside Kubernetes and then marry that with something inside Kubernetes, that's not true agility, right? So more and more we are seeing developers and DevOps teams saying, I want to have the entire stack developed and deployed on an agile platform like Kubernetes. And of course that comes with a bunch of challenges that need to be addressed, and I'm hoping we talk about that today.
Well, as you said, the state has to be maintained somewhere. The state may be maintained somewhere up in the cloud, but there are going to be circumstances where, because of data locality issues, a desire for local control, latency considerations, or any number of other issues, you want to be able to locate state in the cluster, close to the Kubernetes workloads. Is that really what we're talking about here?

See, that's one aspect of it, which is essentially the performance and maybe even governance reasons why you want to co-locate stateful and stateless, right? But the other reason, as I was saying, is that if you want to deploy a stack, a stack is comprised of many components, stateless as well as stateful. And you're talking about the birth of an entire application that a developer is going to push onto this platform, right? So there, it's not just about data locality and all that; it's also about enabling the entire stack to be deployed in one shot.

So you just want a simpler, more manageable stack. Exactly.

All right, so what's the solution? What do people have to do to get those higher-performance, stateful applications on Kubernetes clusters that have some degree of data locality concern, or to sustain that dream of increasingly simple stacks? What has to happen differently?

Sure, and there are two aspects to this. The first one I would say is that the platform that is going to offer this on top of Kubernetes has to guarantee the persistence needs, whether in terms of reliability or in terms of performance SLAs, right? It has to guarantee those. So you have to get those onto the platform first. But beyond that, as Radhesh was saying, there are many, many data platforms, data applications, and workloads that predate both Docker and Kubernetes. Now, if you don't bring them into the fold, you really are not solving the real business challenges that people have today, right?
So beyond just providing a persistency layer to Kubernetes pods, you need to have a way to take complex platforms such as MongoDB, Cassandra, Elasticsearch, Oracle RAC, Cloudera, these kinds of workloads, and bring them onto a platform that is architected for microservices, which is Kubernetes, right? Because these workloads are not designed as microservices. So how do you marry them onto a platform such as Kubernetes that is designed as a microservices platform? You've got to solve that, and that is exactly what Robin has done. We have taken an approach where you can take complex workloads and data platforms and make them run on a microservices platform like Kubernetes, starting with the storage subsystem, which is one of our core strengths.

So I could conceivably imagine an Oracle database being rendered as a container inside a Kubernetes cluster, positioned as a service, and orchestrated by that Kubernetes instance.

If I could jump in, you don't have to imagine; we have customers in production where they have Oracle RAC as a service offered on Robin, right? Now, one thing I want to contextualize is that our roots are in solving this hard problem of applications that haven't been designed for containers: containerizing them and being able to manage them gracefully in Kubernetes. I just gave the example of Oracle RAC as a service. We also have customers with, let's say, multiple petabytes of data, with Hadoop as a service powering big, large enterprises as well. Now, from that lineage, what we're also offering addresses the set of customers who have already picked Kubernetes, right? It might be OpenShift, it might be PKS, it might be GKE. To those customers, we also have an offering called Robin Storage, which brings powerful data management capabilities.
So there's the platform offering, which is Kubernetes plus storage plus networking plus application bundles for some of the demanding workloads that we just talked about. And then Robin Storage is a new offering which can add the magic of advanced data management capabilities to any Kubernetes that you have.

Well, let's talk about that just for one second. When I think of data management capabilities, I'm thinking not just of, you know, I/O being written back and forth between some media and some application. I'm thinking in terms of data protection and security. So give us a sense of the scope of the services that are part of this solution you're talking about.

Yeah, I'll start, and Partha, you can chime in as well. The first context you need to have is that all these data management capabilities are in the context of hybrid cloud being the norm, right? Nine out of 10 customers are looking at implementing on-prem together with public cloud. So in that context, any of the capabilities we are talking about, being able to take snapshots, move a snapshot to the cloud to serve as a backup, or clone and rehydrate applications, all need to operate in a hybrid cloud context. That's number one. The second thing is, rather than just solving the storage-level problem of taking snapshots, being able to bring application and data together is a big game changer. And Partha, can you add a little bit more on the app-plus-data construct?

Absolutely. If you look at the data services that Radhesh talked about, snapshots and clones and backups, those constructs have existed in the storage industry for almost three decades now. There's nothing new about that, right? But if you look at applying them to workloads that are running in Kubernetes, you've got to up-level that.
Because when you look at a storage-level snapshot, it is still a volume- or LUN-level snapshot. But what a developer or a DevOps team needs is the ability to take an entire workload, let's say a MongoDB cluster, and say, I want to snapshot the entire cluster. I want to keep different states, even if the topology of the application is changing, correct? And that is something that Robin has innovated on. Because we recognized, and I come from a storage background, I was a distinguished engineer at Veritas and was fortunate to build many data platforms there, we recognized that just leaving it at storage does not deliver the promise of agility that Kubernetes offers. So you've got to up-level it to applications. And for the very first time, we are introducing concepts where you go to a MongoDB cluster and say, I want to snapshot this cluster. We understand the topology that this cluster has: how many shards, how many pods are offering these services, the underlying volumes, and all of that forms the snapshot. That's an application-level snapshot. The benefit of an application-level snapshot is that if another developer wants to clone it and run queries on it, they don't have to go talk to a storage admin and say, give me clones of these volumes. They just say, clone this MongoDB cluster. And within minutes, you have an up-and-running MongoDB cluster, fully functional, and you can start running queries. Live data. Exactly. The other thing is what Radhesh talked about: portability. You have these snapshots, you take periodic snapshots. So let's say you run out of capacity in your on-premises data center and you would like to burst into a different cloud, say run a clone in GKE, because that's where the capacity is.
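The application-level snapshot idea described here can be sketched in a few lines of Python. This is an illustrative model only: the class and field names are invented, and this is not Robin's actual API. The point it shows is the contrast between a volume-level snapshot and a snapshot that carries the application's topology along with its data, so a clone can be rehydrated elsewhere in one step.

```python
from copy import deepcopy
from dataclasses import dataclass, field

# Hypothetical model for illustration -- not Robin's real interface.

@dataclass
class Volume:
    name: str
    data: bytes = b""

    def snapshot(self):
        # A storage-level snapshot: one volume, no knowledge of topology.
        return Volume(self.name, self.data)

@dataclass
class AppSnapshot:
    topology: dict   # shard -> pod names, captured at snapshot time
    volumes: list    # point-in-time copies of every underlying volume

@dataclass
class MongoCluster:
    shards: dict = field(default_factory=dict)  # shard -> list of (pod, Volume)

    def snapshot(self) -> AppSnapshot:
        # Application-level: capture the whole topology plus all volumes together.
        topo = {s: [pod for pod, _ in members] for s, members in self.shards.items()}
        vols = [vol.snapshot() for members in self.shards.values() for _, vol in members]
        return AppSnapshot(deepcopy(topo), vols)

def clone(snap: AppSnapshot, target: str) -> MongoCluster:
    # Rehydrate a fully functional cluster (e.g. in another cloud) from one
    # snapshot: topology, metadata, and data travel together.
    cluster = MongoCluster()
    vol_iter = iter(snap.volumes)
    for shard, pods in snap.topology.items():
        cluster.shards[shard] = [(f"{target}/{pod}", next(vol_iter)) for pod in pods]
    return cluster
```

With this model, a developer asks for a clone of the cluster, not clones of individual volumes: `clone(src.snapshot(), "gke")` stands up a copy with the same shard layout and data, which is the up-leveling being described.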
Our snapshots, and the way we have implemented and architected this, allow you to port an entire application along with its topology, metadata, and data, so that you can stand up a fully functional, ready-to-use, let's say, MongoDB cluster in GKE in the cloud.

Now, you talked about GKE, Google Kubernetes Engine, on GCP, Google Cloud Platform. Obviously, when you think about Kubernetes, that's kind of the mother ship when you come right down to it. How do your platform and GKE and GCP work together?

So the first thing is that we have a partnership, led by an engineering-to-engineering engagement that Partha is front-ending, around a standard set of APIs whereby the advanced data management capabilities we are talking about can be brought into the Kubernetes world itself, and of course GKE has the implementation footprint. So that's one area we've been collaborating on. The second is that from a Google perspective, the preferred storage for running enterprise workloads, stateful workloads, the data-intensive workloads we're talking about, is Robin Storage. And we definitely are pretty excited by the fact that, through rigorous technical evaluation after rigorous technical evaluation, Google has chosen Robin Storage as the preferred storage for these demanding workloads. So from both these standpoints, moving the state of the art of what it means to provide data management capabilities to Kubernetes, and providing a solution that works today for customers embracing GKE, both on-prem and in the cloud, to bring stateful workloads, we are working with Google, and we're pretty excited about that. Partha, do you want to add further color on the engineering partnership?

Yeah, absolutely. As Radhesh mentioned, we are the preferred storage solution for that. Now, let's rewind a little bit. There are about 25 or 30 different storage vendors providing storage for Kubernetes, right? So what is so special?
I mean, there's something special that led us to this point, right? We took a fundamentally different approach when we solved this problem for GKE, or for Kubernetes. We could have started with one of several open-source storage solutions out there and built on top of it. There are companies that take Btrfs, for example, and build on that. There are companies that take Ceph and build on that, right? We fundamentally said: listen, if you want to elevate the experience from storage up to applications, the example I gave earlier of taking a snapshot of a MongoDB cluster, migrating it, and all that, and if your storage stack is unaware of the application, which means the storage stack is unaware of the topology of the application, can you really do application-consistent snapshots? You can't; all you can do is snapshot individual volumes, correct? Now, if the storage stack is not aware of the application topology, can you actually do application-level quality of service? And if you can't do that, can you really guarantee noisy-neighbor elimination? You have to do all those things, right? If you really want to run data platforms, those are the core things you need to do. And so we took the view that it will not cut it if you build your storage stack on top of Btrfs, for example, or on Ceph. So we took a ground-up approach and said: if you were to build a storage stack that is cloud native, Kubernetes native, what would that look like? And how would the primitives be exposed so that it can elevate the entire experience to applications? So architecturally, we are very superior compared to the other players out there, and the proof is that we got picked. Now, that's one aspect. The other aspect is that the approach we have taken to expose these primitives around snapshots, backup, app portability, and all that was very clean and very pragmatic, right?
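The application-level quality-of-service point can be illustrated with a toy sketch. The names and the budget scheme below are invented for illustration and are not Robin's implementation; the idea shown is that a storage layer which knows which application owns each volume can enforce a per-application I/O budget, capping a noisy neighbor across all of its volumes instead of throttling volumes one by one.

```python
from collections import defaultdict

# Toy sketch of application-aware I/O admission control. An application-
# unaware stack can only limit individual volumes; knowing the
# volume -> application mapping lets one noisy application be capped
# before it starves the others.

class AppAwareThrottler:
    def __init__(self, app_budgets):
        self.app_budgets = dict(app_budgets)  # app -> IOPS allowed per interval
        self.used = defaultdict(int)          # app -> IOPS consumed this interval
        self.volume_app = {}                  # volume -> owning application

    def register(self, volume, app):
        self.volume_app[volume] = app

    def admit(self, volume, iops=1):
        # Admission is charged against the application's budget, summed
        # across all of that application's volumes.
        app = self.volume_app[volume]
        if self.used[app] + iops > self.app_budgets[app]:
            return False                      # noisy neighbor capped
        self.used[app] += iops
        return True

    def tick(self):
        self.used.clear()                     # budgets refill each interval
```

The key design point is that `admit` charges the owning application, not the volume: a MongoDB cluster hammering ten volumes still consumes one shared budget, so an Elasticsearch cluster on the same storage keeps its guaranteed share.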
It works with both born-in-the-cloud workloads as well as the prior workloads, right? And because of that, we are also collaborating with the Google engineers to come up with a set of APIs that we are planning to standardize around Kubernetes, so that you could have a very standard set of APIs through which you can trigger these data management calls. So that's the other thing, like Radhesh talked about, an engineering-to-engineering collaboration to create a standardized set of APIs. It's based on the experience we have had, because we have field deployments, like Radhesh talked about, of Oracle RAC, and field deployments where people are running multiple petabytes of storage in a single Kubernetes Robin cluster, right? So all that learning and experience is contributed toward this joint engineering effort to create the standardized data management APIs.

So Robin.io has delivered a piece of technology for handling stateful Kubernetes clusters that has been validated by Google, that can be used now, and that is the basis for further engineering work to move this more into the mainstream in the future. Very exciting stuff. Partha, Radhesh, thanks very much for being here on theCUBE. Thank you.

And once again, I want to thank Partha Seetala, who is the co-founder and CTO of robin.io, and Radhesh Menon, who is the CMO of robin.io. Once again, I'm Peter Burris. Thanks very much for watching this CUBE Conversation. Until next time.