Welcome back to theCUBE's coverage of Couchbase Connect Online, where the theme of the event is Modernize Now. Yes, let's talk about that. With me is Ravi Mayuram, who's the Senior Vice President of Engineering and the CTO at Couchbase. Ravi, welcome, great to see you.
Thank you so much, I'm so glad to be here with you.
I want to ask you what the new requirements are around modern applications. I've seen some of your comments: you've got to be flexible, distributed, multimodal, mobile, edge. Those are all the very cool sort of buzzwords, smart applications. What does that all mean, and how do you put that into a product and make it real?
Yeah, I think what has basically happened is that so far it's been a transition of sorts, and now we have come to a tipping point. The tipping point has been more because of COVID, which has pushed us to a world where we are living in a sort of occasionally connected manner, where our digital interactions precede our physical interactions in one sense. So it's a world where we do a lot more stuff touchless, in a digital manner, as opposed to making more specific human contact. That has really been the accelerant to this Modernize Now theme. In this process, what has happened is that all the databases and all the data infrastructure that we have built historically are very centralized. They used to be in mainframes, from where they came to your own data centers, where we used to run hundreds of servers, to where they are going now: the computing model has changed to consumption-based computing, which is all cloud oriented. They are all still centralized, but where our engagement with the data happens is at the edge, at your point of convenience, at your point of consumption, not where the data is actually sitting.
So this has led to all those buzzwords, as you said: oh, we need a distributed data infrastructure, where is the edge? But it basically comes down to the fact that the data needs to be where you are engaging with it. That means whether you are doing it on your mobile phone, or doing something while you're traveling, whether you're in a subway, in a plane, or on a ship, wherever, the data needs to come to you and be available, as opposed to you going every time to the data, which is sitting centrally in some place. That is the fundamental shift in how the modern architecture needs to think when it comes to digital transformation and transitioning old applications to the modern infrastructure, because that's what's going to define your customer experiences and your personalized experiences. Otherwise people are basically waiting for that circle of death that we all know, and blaming the networks and other pieces. The problem is actually that the data is not where you are engaging with it; it has got to be fetched from, you know, seven seas away. That is the problem we're basically solving with this modernization of the data infrastructure.
I love this conversation, and I love the fact that there's a technical person who can educate us on this, because data by its very nature is distributed. It's always been distributed, but distributed database has always been incredibly challenging, whether it was a global sysplex or eventual consistency, and getting recovery for a distributed architecture has been extremely difficult. You know, I hate that it's a terrible term, "lots of ways to skin a cat," but you've been the visionary behind this notion of optionality, how to solve technical problems in different ways. So how do you solve that problem of a super rock-solid database that can handle distributed data?
Yes, so there are two issues that you alluded to over there.
First is the optionality piece of it, which is that the same data requires different types of processing on it. It's almost like fractional distillation. It is like your crude flowing through the system: you start all the way from petrol and you can end up with Vaseline and rayon on the other end. The raw material, that's our data in one sense. So far we've never treated the data that way; that's part of the problem. It has always been very purpose built and cast first, so you basically have to recast it every time you want to look at the data. The first thing that we have done is make data that fluid. So when you have the data, you can first look at it to perform, let's say, a simple operation that we call a key-value store kind of operation. Given my ID, give me my password, that kind of scenario. You know, there are customers of ours who have billions of user IDs in their management, so things get slower. How do you make it fast and easily available? Login should not take more than five milliseconds. This is a class of problems that we solve. On that same data, without you ever having to cast it to a different database, you can now do solid queries, classic SQL queries, which is our next magic. We are a NoSQL database, but we have fully functional SQL. SQL has been the language that has talked to data for 40-odd years, successfully. Every other database has come and tried to implement its own query language, but they've all failed; only SQL has stood the test of time for 40-odd years. Why? Because there is solid mathematics behind it. It's called relational calculus. What that helps you do is basically look at the data in any combination, any which way you look at it, and out will come the data in a format that you can consume. That's the guarantee it sort of gives you in one sense.
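The idea above, one set of documents served both by key-value lookup and by a declarative query, can be sketched in plain Python. This is only an illustration under stated assumptions: the store is an in-memory dict, and the names `users`, `kv_get`, and `query` are hypothetical, not Couchbase's API.

```python
# Toy in-memory document store. All names here are illustrative
# assumptions, not a real database API.
users = {
    "user::1001": {"name": "Asha", "city": "Boston", "logins": 42},
    "user::1002": {"name": "Ben", "city": "San Francisco", "logins": 7},
    "user::1003": {"name": "Chen", "city": "Boston", "logins": 19},
}


def kv_get(store, key):
    """Key-value access: a single hash lookup by document ID."""
    return store.get(key)


def query(store, where, select):
    """Query-style access: filter documents with a predicate, then project fields."""
    return [
        {field: doc.get(field) for field in select}
        for doc in store.values()
        if where(doc)
    ]


# Key-value: "given my ID, give me my record".
one = kv_get(users, "user::1002")

# Query: roughly SELECT name, logins FROM users WHERE city = 'Boston'.
boston = query(
    users,
    where=lambda doc: doc["city"] == "Boston",
    select=["name", "logins"],
)
```

Both access paths read the same documents; nothing is copied or recast between them, which is the point of the optionality argument.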
And because of that, you can now do some really complex, what in database science we call predicate logic, on top of that. That gives you the ability to do the classic relational type of queries, the SELECT * FROM ... WHERE kind of stuff, which, because it's at an English level, becomes easy to use. So with the same data, you didn't have to go move it to another database, do your transformation of the data and all that stuff. Same data, you do this now. That's where the optionality comes in. Now you can do another piece of logic on top of this, which we call search. This is built on the concept of an inverted index and TF-IDF, the classic Google in very simple terms, Google-style tokenized search. You can do that on the same data without ever having to move the data to a different format. Then on top of it you can do what is known as eventing, your own custom logic, which we do in a programming language called JavaScript. And finally analytics, which is your ability to query the operational data in a different way: ad hoc querying. What were my sales of this widget, year over year, in the first week of December? That's a very complex question to ask, and it takes a lot of different types of processing. So that's optionality: different types of processing on the same data, without you having to go to five different systems, without you having to recast the data in five different ways with five different sets of application logic. You put them in one place. Now to your second question. This has got to be distributed and made available in multi-cloud, in your data center, all the way to the edge, which is the operational side of the database management system. That's where the distributed platform we have built enables us to get the data to where you need it to be. In a classic way we call it CDN-ing the data, as in a content delivery network, which so far has moved static content to the edges.
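The inverted index and TF-IDF search mentioned above can be sketched in a few lines of Python. This is a minimal, assumed formulation (raw term counts for tf, unsmoothed log(N / df) for idf, whitespace tokenization); real search engines use more elaborate analysis and scoring, and the sample documents are made up.

```python
import math
from collections import Counter, defaultdict

# Made-up sample documents for illustration.
docs = {
    "d1": "red widget on sale",
    "d2": "blue widget",
    "d3": "red paint",
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_index(documents):
    """Inverted index: term -> {doc_id: count of the term in that document}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query_text):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query_text):
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = build_index(docs)
results = search(index, len(docs), "red widget")
```

Here `d1` ranks first because it matches both query terms, while `d2` and `d3` each match only one. The index is built from the documents where they sit, rather than by exporting them to a separate search system.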
Now we can actually move the data dynamically. Now imagine the richness of applications you can develop.
And on the first part of the answer to my question, are you saying you could do this without a schema, with no schema on write, and then you can apply those techniques?
Fantastic question, yes. That's the brilliance of this database. Classically, databases have always demanded that you first define a schema before you can write a single byte of data. Couchbase is one of the rare databases, I for one don't know any other one, but there could be, let's give the benefit of the doubt. It's a database which writes data first and then late binds to schema, as we call it. It's a schema-on-read thing. Because there is no schema, it is just a JSON document sitting inside, and JSON is the lingua franca of the web, as you very well know by now. So it's just JSON that we manage. You can do key-value lookups of the JSON. You can do full query capability like a classic relational database; we even have cost-based optimizers and other sophisticated pieces of technology behind it. You can do searching on it using the full text analysis pipeline. You can do ad hoc querying on the analytics side, and you can write your own custom logic on it using our eventing capabilities. That's what it allows, because we keep the data in its native form of JSON. It's not a data structure or a data schema imposed by a database. It is how the data is produced. On top of it we bring five different types of logic. The philosophy is bringing logic to data, as opposed to moving data to logic. Moving data to logic is what we have been doing for the last 40 years, because we developed various database systems and data processing systems at various points in our history. We had key-value stores, we had relational systems, we had search systems, we had analytical systems, we had queuing systems, all these systems.
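The write-first, bind-schema-late idea described above can be illustrated with a small Python sketch. It is an assumption-laden toy: documents are stored as raw JSON strings with no declared schema, and a hypothetical `read_orders` function applies the "order" shape only when the data is read.

```python
import json

# Documents are stored as produced; they need not share the same fields.
# The sample data is made up for illustration.
raw_documents = [
    '{"type": "order", "id": 1, "total": 25.0}',
    '{"type": "order", "id": 2, "total": "40.5", "coupon": "SPRING"}',
    '{"type": "customer", "id": 7, "name": "Asha"}',
]


def read_orders(raw):
    """Apply the 'order' shape at read time; skip documents of other types."""
    orders = []
    for text in raw:
        doc = json.loads(text)
        if doc.get("type") != "order":
            continue
        orders.append({
            "id": int(doc["id"]),
            "total": float(doc["total"]),  # coerce on read, not on write
            "coupon": doc.get("coupon"),   # optional field, None if absent
        })
    return orders


orders = read_orders(raw_documents)
total_sales = sum(order["total"] for order in orders)
```

Nothing was validated or rejected at write time; the second order even stores its total as a string. The reader decides what shape it needs, which is the sense of "schema on read" here.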
If you want to use any one of them, our answer has always been: just move the data to that system. Versus, we are saying, do not move the data. As we get bigger and bigger in data, just moving this data is going to be a humongous problem. If you have to move petabytes of data for this, it's not going to fly. Instead, bring the logic to the data, so you can now apply different types of logic to the data. I think that's what, in one sense, the optionality piece of it is.
Thank you for that. But as you know, there are plenty of schemaless data stores; they're called data swamps. I mean, that's what they became, right? So this is some interesting magic that you're applying here.
Yes. I mean, the one problem with the data swamps, as you call them, is that they were a little too open ended, because the data format itself could change. Then everything became, again, data recasting, because certain types of processing required you to have the data in a certain schema at the end of the day. So in that world there were a lot of gaps; it proliferated but did not really, how do you say, keep the promise it was meant to. That's why it was a swamp: it was fundamentally not managing the data. The data was sitting in some file system and then you were doing something else. This is a classic database, where the data is managed, and you create different types of indexes to manage it. You distribute the index, you distribute the data, and you have, like we were discussing, ACID semantics on top of it. When you put all these things together it's a tough proposition, but we have solved some really tough computer science problems to bring this to bear, to bring this to the market.
So you predicted the trend around multimodal and converged databases. You kind of led Couchbase through that.
I always ask this question because it's clearly a trend in the industry, and it definitely makes sense from a simplification standpoint, so that I don't have to keep switching databases. The flip side of that, though, Ravi, and I wonder if you could give me your opinion on this, is the right tool for the right job. I often say, isn't that the Swiss Army knife approach, where you have little teeny scissors and a knife that's not that sharp? How do you respond to that?
A great one. I use another analogy to tackle that one: have you ever accused the smartphone of being a Swiss Army knife?
No, no.
Nobody does that, because it's actually 40 functions in one, which is what a smartphone becomes. You never call your iPhone or your Android phone a Swiss Army knife, and here's the reason: you can use that same device in its full capacity. That's what optionality is. It's not like your good old one, where there's a keyboard hiding half the screen and you can do everything only through that keyboard, without touch and stuff like that. The whole device is available to you to do one type of processing when you want it. When you're done with that, it can do another completely different type of processing. In one moment, it could be your TomTom, telling you all the directions. The next, it's your PDA. Third, it's a fantastic phone. Fourth, it's a beautiful camera which can do your f-stop management and give you a nice SLR-quality picture. And the next moment it's the video camera. People are shooting movies with this thing in Hollywood these days, for God's sake. So it gives you the full power of what you want to do, when you want it.
And if you just thought that the iPhone, or any smartphone, is a great device because you can do five things in one or 50 things in one, at a certain level you miss the point, because what that device really enabled is not just that these five things in one place become easy to consume and easy to operate. It actually started the app-based economy. That's the brilliance of bringing so many things into one place. In the morning, you know, I get an alert saying that today you've got to leave home at 8:15 for your nine o'clock meeting, and the next day it might say 8:45 is good enough, because it knows where the phone is sitting, the geo position of it. It knows from my calendar where the meeting is actually happening. It can do a traffic calculation because it's got my map and all the routes, and then it's got this notification system which eventually pops up on my phone to say, hey, you've got to leave at this time. Five different systems have to come together, and they can because the data is in one place. Without that, you couldn't even do this simple function in a predictable manner, in a manner that's useful to you. So I believe a database which gives you this optionality of doing multiple types of data processing on the same set of data will allow you to build a class of products which you have so far been struggling to build, because half the time you're running sideline to sideline just integrating data from one system to the other.
So I love the analogy with the smartphone. I want to continue it, double-click on it. So I use this camera. When my kid had a game, I used to bring the big camera, the 35 millimeter. I don't use that anymore, no way, but my wife does. She still uses the DSLR. So is there a similar analogy here? And by the way, the camera shop in my town went out of business, you know?
So is that fair? In other words, those specialized databases, there's still a place for them, but they're getting squeezed.
Absolutely, absolutely. Great analogy and a great extension to the question. That's the contrarian side of it, in one sense: hey, if everything can just be done in one, do you have a need for the other things? You gave a camera example, where it's sort of a slippery slope. Let me give you another one, which actually illustrates the point better. Just because I listen to HACMA music on the iPhone doesn't stop me from having my full digital receiver and my Harman Kardon speakers at home, because they produce a kind of sound and immersive experience that this tinny little speaker was never in its lifetime intended to produce, right? It's the convenience, yes. It's the convenience of convergence, that I can put my earphones on and listen to all the great music. Yes, it's 90% there or 80% there; it depends on your audiophile-ness, I mean, your experience. The super specialized ones do not go away. There are places where the specialized use cases will demand that a separate system exist. But even there, there has got to be very close, how do you say, close binding or late binding. I should be able to stream that song from my phone to that receiver so I can hear it from those speakers. You can say, oh, there's a digital divide between these two things, no, no, I can only play CDs on that one. That's not how it's going to work. Going forward, this is the connected world, right? If I'm listening to a song in my car and then I step out of the car and walk into my living room, that same song should continue to play on my living room speakers. Then it's a connected world, because it knows my preference and what I'm doing. All of that will happen only because of this data flowing between all these systems.
I love that example too.
When I was a kid, we used to go to Tweeter Etc. and we used to play around. We'd take all the big four-foot speakers. Those stores are out of business too. You need your subwoofer too. Yeah, absolutely. And now we just plug in the Sonos. So, is the debate between relational and non-relational databases over, Ravi?
I believe so, because what had happened was that relational systems were the norm. They ruled the roost, if you will, for the last 40-odd years. Then came this NoSQL movement, which was almost a rebellion from the relational world we all inhabited, because it was very restrictive. It had the schema definition and the schema evolution, as we call it, all those things. They required a committee. They required your DBA and your data architect, and you had to call them just to add one column, stuff like that. And the world had moved on. This was a world of blogs and tweets and mash-ups, a different generation of digital behavior. There are digital-native people now who are operating in these. The consumer-facing applications were living in this world, and yet the enterprise ones were still living on the other side of the divide. So out came this solution to say that we don't need SQL. Actually, the problem was never SQL. NoSQL was, you know, a best approximation, a good marketing name, but from a technologist's perspective the problem was never the query language. No, SQL was not the problem. The problems were the schema limitations and the inability of these systems to scale.
The relational systems were built like airplanes. If San Francisco to Boston is a flight route so popular that you want to add 50 more seats to it, the only way you can do that is to go back to Boeing and ask them to get you from a 737 to a 777 or whatever it is, and they'll stick you with a billion-dollar bill, and you'll have to somehow pay for that by, you know, either flying more people or raising the rates or whatever you have to do. These are called vertically scaling systems. So relational systems are vertically scaling. They are expensive. Versus what we have done in this modern world is make the system horizontally scaling, which is more like a train going from San Francisco to Boston: you need 50 more people, be my guest, I'll add one more coach to it, one more car to it. And the better part, the way we have done this here, and we have super specialized on that, is that if this route actually requires three dining cars and only 10 sleeper cars or whatever, then you just pick those and attach them. On the next route, you can choose: I need only one dining car, that's good enough. So the way you scale the train can also be customized based on the route: longer route, more dining capability; shorter route, not as much dining capability. You can attach the kind of coaches you need. We call this multi-dimensional scaling. Not only do we scale horizontally, we can scale to different types of workloads by adding different types of coaches, right? So that's the beauty of this architecture. Now, why is that architecture important? It's that where we land eventually is the ability to do operational and analytical in the same place. This is another thing which hasn't happened in the past, because you would say, I cannot run this analytical query because then my operational workload will suffer, then my front end will slow down, and millions of customers are impacted.
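The train-car picture of multi-dimensional scaling above can be modeled as a toy Python sketch. Everything here is a hypothetical illustration: the `Node` and `Cluster` classes and the service names `data`, `query`, and `analytics` are assumptions chosen for the example, not a real cluster-management API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    services: frozenset  # workload types this node serves (assumed names)


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)

    def add_node(self, name, services):
        """Horizontal scaling: attach one more 'coach' with chosen services."""
        self.nodes.append(Node(name, frozenset(services)))

    def capacity(self):
        """Count how many nodes serve each workload type."""
        counts = Counter()
        for node in self.nodes:
            counts.update(node.services)
        return counts


cluster = Cluster()
cluster.add_node("n1", {"data"})
cluster.add_node("n2", {"data"})
cluster.add_node("n3", {"query"})
cluster.add_node("n4", {"analytics"})

# Query load grows: add a query-only node. The data and analytics
# nodes are untouched, so each workload type scales on its own.
cluster.add_node("n5", {"query"})
```

Because analytics runs on its own nodes in this model, a long analytical job does not compete for the resources that serve operational traffic, which is the separation the speaker goes on to describe.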
That problem we have solved: on the same data you can do analytical queries and operational queries, because they're separated by these cars, right? As in, we fence the resources so that one doesn't impede the other. So you can, at the same time, have a microsecond, 10-million-ops-per-second key-value or query workload happening, and yet run this analytical query which will take a couple of minutes, one not impeding the other. So that's, in one sense, part of the problem that we have solved here. On the relational versus NoSQL portion of it, these are the kinds of problems we had to solve. We solved those, and then we put back the same query language on top. Why? It's like Tesla in one sense, right? Underneath the surface is where all the stuff that had to change was changed: the gasoline, the internal combustion engine, the gas usage. These were the issues we really wanted to solve. So solve that, change the engine out. You don't need to change the steering wheel or the gas pedal or the paddle shifters or whatever else you need over there for your gear shifters. Those need to remain in the same place; otherwise people won't drive it, otherwise it will not even look like a car to people. So even when you feed people the most advanced technology, it's got to be accessible to them in a manner that people can consume. Only in software do we forget this first design principle, and we go and say, well, I've got a car here, you've got to blow harder to go fast and lean back for it to apply the brake. That's simply how we design software. Instead, we should be designing it in the manner that is easiest for our audience, which is developers, to consume. They've been using SQL for 40 years, or 30 years. So we give them the steering wheel and the gas pedal and the gear shifter back by putting SQL back on.
Underneath the surface, we have completely solved the relational limitations of schema as well as scalability. So in that way, and by bringing back the classic ACID capabilities, which is what we counted on relational systems for, and being able to do that with the SQL programming language, we call it multi-statement SQL transactions, so to say, which is the classic way all the enterprise software was built. By putting that back, now I can say that the debate between relational and non-relational is over, because this has truly extended the database to solve the problems that the relational systems had to grow up to solve in modern times, rather than getting pedantic about whether we have NoSQL or SQL or NewSQL or any of that sort of jargon-oriented debate. These are the debates of computer science that they've actually endeavored to solve, and they have solved them with the latest release, 7.0, which we released a few months ago.
Right, right, last July. Ravi, we've got to leave it there. I love the examples and the analogies. I can't wait to be face-to-face with you. I want to hang with you at the cocktail party, because I've learned so much. Really appreciate your time. Thanks for coming to theCUBE.
Fantastic. Thanks for the time and the opportunity, and, I mean, very insightful questions. Really appreciate it.
Thank you. Okay, this is Dave Vellante. We're covering Couchbase Connect Online. Keep it right there for more great content on theCUBE.