Hello, and welcome back to the DockerCon 2021 virtual coverage. I'm John Furrier, host of theCUBE here in Palo Alto, with a remote interview with a great guest, CUBE alumni Jim Walker, VP of product marketing at Cockroach Labs. Jim, great to see you remotely coming into theCUBE. Normally we're in person; soon we'll be back in real life. Great to see you. Great to see you as well, John. I miss you. I miss seeing you live and in person. So this will have to do, I guess, right? The first multi-cloud event in New York City that you guys had was, I think, one of the last events that was going on towards the end of the year before the pandemic hit. So a lot's happened with Cockroach Labs over the past few years: accelerated growth, funding, amazing stuff. Here at DockerCon, containerization of the world, containers everywhere and all places, you know, hybrid, pure cloud, edge everywhere. Give us the update. What's going on with Cockroach Labs, and then we'll get into what's going on at DockerCon. Yeah, you know, Cockroach Labs, this has been a pretty fun ride. I mean, I think about two and a half years now and John, it's been phenomenal as the world kind of wakes up to distributed systems and the containerization of everything. I'm happy we're at DockerCon talking about containerization because, you know, I think it has radically changed the way we think about software, but more importantly, it's starting to take hold. I think a lot of people would say, oh, it's already taken hold. But if you start to think about, you know, these kind of modern applications that are depending on data, what does containerization mean for the database? Well, Cockroach has got a pretty good story, you know? I mean, gosh, before Escape, I think the last time I talked to you, I was at CoreOS and, you know, we were playing the whole Kubernetes game, and I remember Alex Polvi talking about GIFEE, Google infrastructure for everyone, or for everyone else, I should say. 
And I think that's what, we've seen that kind of happen with the infrastructure layer, but I think that last layer of infrastructure is the database. Like I really feel like the database is that dividing line between the business logic and infrastructure. And it's really exciting to see, you know, just massive huge customers come to Cockroach to rethink what the database means in cloud, right? What does the database mean when we move to distributed systems and that sort of thing? And so, you know, momentum has been building here. We are, you know, upwards of, oh gosh, over 300 paying customers now, you know, thousands of, you know, Cockroach customers in the wild out there, but you know, we're seeing this huge, massive attraction to Cockroach cloud which is a great name. Come on, John, you got to say, right? Like, you know, so, and you know, our database as a service, so getting that out there and seeing the uptake there has just been, it's been phenomenal over the past couple of years. Yeah, and you got to love the Cockroach name. I love it, you know, survives nuclear war in winter, all that good stuff, as they say. But really the reality is is that it's kind of an interesting play on words because one of the trends that we've been talking about, I mean, you and I have been talking about this for years and with our CUBE coverage around Amazon web services early on was very clear about a decade ago that there wasn't going to be one database to rule the world. There going to be many, many databases. And as you started getting into these cloud native deployments at scale, you know, use your database of choice was the developer ethos. Just whatever it takes to get the job done. Now you start integrating this in a horizontally scalable way with the cloud. You have now new kinds of scale, cloud scale. And it kind of changes the game on the always on availability question which is how do I get high availability? How do I keep things running? 
And that is the number one developer challenge, whether it's infrastructure as code, whether it's security, shifting left. It all comes down to making sure it stays running at scale and secure. Tell me about that. Absolutely, and it's interesting. It's been, like I said, this journey and this arc towards distributed systems and truly the delivery of what people want in the cloud. It's been a long arc and it's been a long journey. And I think we're getting to the point where people, you know, are starting to kind of bake resilience and scale into their applications. And I think that's kind of this modern approach. You know, look, we're taking legacy databases today. People kind of lift and shift, move them into the cloud, try to run them there, but they just aren't built for that infrastructure. Like, there's a fundamentally different approach and infrastructure when you talk about cloud. It's one of the reasons why, you know, John, early on, your conversations with the AWS team and what they did. It's like, yeah, how do we give resilient and ubiquitous and always on, you know, scalable kind of infrastructure to people? Well, that's great for those layers, but when you start to get into the software that's running on these things, it isn't lift and shift and it's not even move and improve. You can't just take a legacy system and change one piece of it to make it kind of take advantage of, you know, the scale and the resilience and the ubiquity of the cloud, because there are very, very explicit challenges. You know, for us, it's about re-architect and rebuild. Let's tear the database down and let's rethink it and build from the ground up to be cloud native. And I think the technologies that have done that, that have kind of been built from scratch to be cloud native, are the ones that, I believe, you know, three years from now, that's what we're going to be talking about. 
I mean, this comes back to, again, you know, the genesis of what we did, which is Google Cloud Spanner. It's the Spanner white paper. And, you know, what Google did, they didn't use an existing database, because they needed a transactional relational database. They hired a bunch of really incredible engineers, right? And they've got, like, Jeff Dean and Sanjay Ghemawat over there, designing and doing all these cool things, and they built. And I think that's what we're seeing. And I think that's, to me, the exciting part about data in the cloud as we move forward. Yeah, and I think the Google infrastructure for everyone, I think that's the same mindset for Amazon: I want all the scale, but I don't want to do it over 10 years, I want to do it now, which I love. I want to get back to this thing, but I want to ask you specifically about this definition of containerization of the database. I've heard that kicked around. Love the concept. I kind of understand what it means, but I want you to define it for us. What does it mean when someone says containerizing the database? Yeah, I mean, simply, put the database in a container and run it. And I think that's like maybe step one. That's kind of lift and shift. Let's put it in a container and run it somewhere. And that's not that hard to do. I think I could do that. I mean, I haven't coded in a long time, but I think I could figure that out. It's when you start to actually have, you know, multiple instances of a container, right? And that's where things get really, really tricky. Now we're talking about true distributed systems. We're talking about how do you coordinate data? How do you balance data across multiple instances of a database, right? How do you actually have failover so that, you know, if one node goes down, a bunch of them are still available? How do you, you know, guarantee transactional consistency? 
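The failover question raised here ("if one node goes down, a bunch of them are still available") can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a majority-quorum rule of the kind consensus protocols such as Raft rely on; the function names are hypothetical and this is not taken from CockroachDB's code.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still exists."""
    return replicas - quorum_size(replicas)


def can_commit(replicas: int, reachable: int) -> bool:
    """A write is accepted only if a majority of replicas can acknowledge it."""
    return reachable >= quorum_size(replicas)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, "replicas: quorum", quorum_size(n),
              "survives", tolerated_failures(n), "failure(s)")
```

Under this rule, two nodes that cannot reach a majority cannot both accept conflicting writes to the same account, which is one way the "which transaction wins" problem is avoided.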
You can't just have, like, you know, four instances of a database, all with the same information in them, John, without any sort of coordination, right? Like you hit one node and you hit another one, the same account, you know, which transaction wins? And so the concepts in distributed systems, there's this thing called the CAP theorem: consistency, availability, and partition tolerance. And actually understanding how these things work, especially for data in distributed systems, to make sure that it's going to be, you know, consistent and available, and that you're going to scale. Those things are not simple to solve. And again, it comes back to this: I don't think you can do it with a legacy database. You kind of have to re-architect. And it comes down to where data is stored. It comes down to how it's replicated. It comes down to, really, ultimately, where it's physically located. I think when you deploy a database, you think about the logical model, right? You think about tables and normalization and referential integrity. The physical location is extremely important as we kind of move to containerized and distributed systems, especially around data. Well, you guys are here at DockerCon 2021, Cockroach Labs, good success. Love the architectural flexibility that you guys offer. And again, bringing that scale. And like you mentioned, it's an awesome value proposition, especially if people want to just program the infrastructure. What's going on with DockerCon specifically? A lot of talk about developer productivity, a lot of talk about collaboration and trust with containers, big story around security. What's your angle here at DockerCon this year? What's the big reveal? What's the discussion? What's the top conversation? Yeah, I mean, look, we are a containerized database and we are an incredibly great choice for developers. 
You know, for us, it's, look, there's certain developer communities that are important on this planet, John. And this is one of them, right? This is, you know, I don't know a developer who doesn't have that little whale up in their, you know, in their status bar, right? And, you know, for us, it's, look, you know me, man, I believe in this tech and I believe that this is something that's going to greatly simplify our lives over the next two to three to 10 to 15 years. And for us, it's about awareness. And I think once people see Cockroach, they're like, oh my God, how did I ever even think differently? And so for us, it's kind of moving in that direction. But, you know, ultimately our vision, where we want to be, is we want to abstract the database to a SQL API in the cloud. We want to make it so simple that I just have this REST interface. There's endpoints all over the planet. And as a developer, I never have to worry about scale. I never have to worry about DR, right? It's always going to be on. And most importantly, I don't have to worry about low latency access to data no matter where I'm at on the planet, right? I can give every user this kind of, you know, sub 50 millisecond access to data or sub 20 millisecond access to data. And that is the true delivery of the cloud, right? Like, I think that's what the developer wants out of the cloud. They want to code against the service, like, man, it's got to be consumption based and secure and, you know, I don't want to have to pay for stuff I'm not using, and all those things. And so, you know, for us, that's what we're building to and, you know, interacting in this environment is critical for us because I think that's where our audience is. 
I want to get your thoughts on, you guys do have success with a couple of different personas and developers out there, groups, you know, classic developers, software developers, which is this show is at DockerCon full of developers, KubeCon, a lot of operators, cool and some devs, but mostly cloud native operations. Here's the developer shop. So you guys got to hit the developers, which really care about building fast and building the scale and last with security. Architects you had success with, which is the classic, you know, cloud architecture, which now distributed computing, we get that. But the third area, I would call the kind of the role that both the architects and the developers had to take on, which is being the DevOps person or then becomes the SRE in the group, right? So most startups have the DevOps team, developers, they do DevOps natively and within every role. So they're the same people provisioning. But as you get larger in an enterprise, the DevOps role, whether it's in a team or group, takes on this SRE, Site Reliability Engineer. You know, this is a new dynamic that brings engineering and coding together. It's like, not so much an ops person, it's much more of like an engineering developer. Why is that role so important? And we're seeing more of it in dev teams, right? Seeing an SRE person or a DevOps person inside teams, not a department. Yeah. Do you agree? Yeah, I mean, we employ an army of SREs that manage and maintain our, you know, Cockroach cloud, which is, you know, Cockroach DB as a service, right? How do you deliver kind of a world-class experience for somebody to adopt a, you know, managed service database such as ours, right? And so, you know, for us, yeah, I mean, SREs are extremely important. We have personal kind of, you know, an opinion on this. 
But more importantly, I think, you know, if you look at Cockroach and the architecture of what we built, you know, I think Kelsey Hightower at one point said, I'm gonna probably mess this up, but there was a tweet that he wrote, something like, you know, CockroachDB is to Spanner as Kubernetes is to Borg. And if you think about that, I mean, that's exactly what this is. And we built a database that was actually amenable to the SRE, right? This is exactly what they want. They want it to scale up and down. They want it to just survive things. They want to be able to script this thing and basically script the world. That's how they want to manage and maintain. And so for us, you know, I think our initial audience was definitely architects and operators. And it's the KubeCon crowd. And they're like, wow, this is cool. This is architected just like Kubernetes. In fact, like etcd, which is a key piece of Kubernetes. Well, we contribute back up to etcd, our Raft implementation. So there's a lot of the same tech here. What we've realized though, John, with databases is interesting. You know, the architect is choosing a database sometimes, but more often than not, a developer is choosing that database. And it's like, they go out, they find a database, they just start building and that's what happens. You know, for us, we made a very critical decision early on. You know, this database is wire compatible with Postgres, and it speaks SQL syntax, which, if you look at some of the other solutions that are trying to do these things, those things are really difficult to do again. So it was a critical decision to make sure that it's amenable, so that now we can build the ORMs and all the tools that people would use and expect out of Postgres from a developer point of view, but let's simplify and automate and give the right kind of platform that the SREs need as well. 
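In practice, Postgres wire compatibility means an application reaches the database with an ordinary PostgreSQL connection URL. The following is a minimal sketch, assuming a hypothetical helper named `cockroach_dsn`; the host, user, and database names are made up for illustration, and 26257 is CockroachDB's default SQL port.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def cockroach_dsn(user: str, host: str, database: str,
                  port: int = 26257, sslmode: str = "verify-full",
                  password: Optional[str] = None) -> str:
    """Build a PostgreSQL-style connection URL for a Postgres-compatible server."""
    auth = quote(user, safe="")
    if password:
        # Percent-encode the password so characters like '@' or ':' stay unambiguous.
        auth += ":" + quote(password, safe="")
    query = urlencode({"sslmode": sslmode})
    return f"postgresql://{auth}@{host}:{port}/{database}?{query}"


if __name__ == "__main__":
    # Hypothetical local, insecure development node.
    print(cockroach_dsn("app", "localhost", "bank", sslmode="disable"))
```

A URL of this shape can be handed to a standard Postgres driver or an ORM built on one, which is the point being made about tools that developers already expect from Postgres.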
And so for us, the last year and a half is really about, how do we actually build the right tooling for the developer crowd too? And we've really pushed really far in that world as well. Yeah, talk about the aspect of the scale of, say, a startup, for instance, because you made this great example, Borg to Kubernetes, because Borg was Google's internal Kubernetes-like thing. So you guys have Spanner, which everyone knows is a great product that Google had. You guys have almost the commercial version of that for the world. I mean, some people will say, and I was just going to challenge you on this and want to get your thoughts: you know, I'm not Google, I'll never be Google. I don't need that scale. So how do you address that point? Because some people might dismiss the notion of using it. How do you respond to that? Yeah, John, we get this all the time. Like, I'm not global. My application is not global. I don't need this. I don't need a tank, right? I just need to walk down the road, you know what I mean? And so, the funny thing is, even if you're in a single region and you're building a simple application, does it need to be always on? Does it need to be available? Can it survive the failure of a server or a rack or an AZ? It doesn't have to survive the failure of a region, but I tell you what, if you're successful, you're going to want to start actually deploying this thing across multiple regions, so you can survive a backhoe hitting a cable and the entire East Coast going out, right? And so with Cockroach, it's really easy to do that. It's four little SQL commands and I have a database that's going to span all those regions, right? And I think that's important. But more importantly, think about scale. When a developer wants to scale, typically it's like, okay, I'm going to spin up Postgres and I'm going to keep increasing my instance size. 
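The interview does not say which "four little SQL commands" are meant. As an illustration only, the sketch below assumes they resemble CockroachDB's multi-region `ALTER DATABASE` statements (set a primary region, add further regions, ask for region-failure survival); the helper name, database name, and region names are hypothetical.

```python
from typing import List


def multi_region_statements(database: str, primary: str,
                            others: List[str]) -> List[str]:
    """Return SQL statements that spread a database across several regions.

    Assumption: the target cluster supports CockroachDB's multi-region
    ALTER DATABASE syntax; this helper only builds strings and runs nothing.
    """
    stmts = [f'ALTER DATABASE {database} SET PRIMARY REGION "{primary}";']
    for region in others:
        stmts.append(f'ALTER DATABASE {database} ADD REGION "{region}";')
    # Surviving the loss of a whole region needs at least three regions.
    if len(others) >= 2:
        stmts.append(f"ALTER DATABASE {database} SURVIVE REGION FAILURE;")
    return stmts


if __name__ == "__main__":
    for stmt in multi_region_statements("bank", "us-east1",
                                        ["us-west1", "europe-west1"]):
        print(stmt)
```

With one primary region and two additional regions this yields four statements, which happens to match the count mentioned, though that correspondence is a guess rather than something the speaker confirms.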
So I'm going to scale vertically until I run out of room, and then I'm going to have to start sharding this database. And when you start doing that, it adds this kind of application complexity that nobody really wants to deal with. And so forget it, just let the database deal with all that. So, we find this thing extremely useful for the single developer in a very small application. But the beautiful thing is, if you want to go global, great, just keep adding nodes. Like, when that application does take off and it's the next breakthrough thing, this database is going to grow with you. So it's good enough to kind of start small, but it'll scale fast. It'll go global if you want it to. You have that option, I guess, right? I mean, why wouldn't you want optionality on this at all? So clearly a good point. Let me ask you a question. Take me through a use case where, with Cockroach, some scenario develops nicely. You can point to the visibility of the use case for the developer and then kind of how it played out. And then compare and contrast that to a scenario that doesn't go well. Where it plays out well, for an example. And then if they didn't deploy it, they got hung up and went sideways. Yeah, you know, Cockroach was built for transactional workloads. That's what we are. Like, we are optimized for the speed of light and consistent transactions. That's what we do and we do it very well. At least I think so, right? But I think, you know, my favorite customer of all of ours is DoorDash. And about a year ago, DoorDash came to us and said, look, we have a transactional database that can't handle the write volume that we're getting. It falls over, you know, and they had significant challenges. And if you think about DoorDash and DoorDash's business, they're looking at an IPO in the summer and going through these, you can't have any issues, you know? So the system's gotta be up and running, right? 
And so for them, it was like, you know, we need something that's reliable. We need something that's not gonna come down. We need something that's gonna scale and handle bursts and these sort of things. And their business is big. Their business is not just, let me, you know, deliver food all the time. It's deliver anything. Like, be that intermediary between, you know, a good and somebody's front door. That's what DoorDash wants to be. And, you know, for us, yeah, their transactions and that back-end transactional system are built on Cockroach. And that's one year ago; they needed to get experience. And once they did, they started to see that this was very, very valuable in lots of different workloads they had. So anywhere there's any sort of transactional workload, be it metadata, be it any sort of inventory or transaction stuff that we see in companies, that's where people are coming to us. And it's these traditional relational workloads that have been wrapped up in these, you know, transactional relational databases that weren't built for the cloud. So I think what you're seeing is, that's the other shoe to drop. We've seen this happen, you know, you're watching Databricks, you're watching Snowflake kind of do this whole data cloud on the analytical side, John. That's been around for a long time and there's that move to the cloud. That same thing that happened for OLAP has got to happen for OLTP. Where we don't do well is when somebody thinks that we're an analytic database. It's not what we were built for, right? We're optimized for transactions. And I think you're going to continue to see these two sides of the world, especially in cloud, especially because of the way that our global systems are going to work. You don't want to do analytics across multiple regions, it doesn't make sense, right? And so that's why you're going to see the continued two markets, OLAP and OLTP, going on. 
And we're just, we're square in the OLTP side of the world. Yeah, talk about the transaction processing side of it when you start to change the distributed architecture that goes from core, core on-premises, to edge. Edge being intelligent edge, industrial edge, whatever, you're going to have more action happening. And you're seeing Kubernetes already kind of talking about this, and with the containers you've got. So you've got kind of two dynamics. How does that change the nature of, and the level of, the volume of transactions? Well, it's interesting, John. I mean, if you look at something like Kubernetes, it's still really difficult to do multi-region or multi-cloud Kubernetes, right? This is one of those things where, you know, you start to move Kubernetes to the edge, you're still kind of managing all these different things. And I think it's not the volumes, it's the operational nightmare of that. For us, let's federate at the data layer. Like, I could deploy Cockroach across multiple Kubernetes clusters today and you're going to have one single logical database running across those. In fact, you can deploy Cockroach today on top of three public cloud providers. I can have nodes in AWS. I could have nodes in GCP. I could have nodes running on VMs in my data center. Any one of those nodes can service requests and it's going to look like a single logical database. Now that to me, you know, when we talked about multi-cloud a year and a half ago or whatever that was, John, you know, that's an actual multi-cloud application, delivering data so that you don't have to actually deal with that in your application layer, right? You can do that down in the guts of the database itself. And so I think it's going to be interesting in the way that these things get consumed and the way that we think about where data lives and where our compute lives. I think that's part of what you're thinking about too. 
Yeah, so let me, while I got you here, one of the things on my mind, I think people want to maybe get clarification on this real quick while you're here: take a minute to explain the difference between CockroachDB and Cockroach Cloud. Okay, they're different products. You know, you brought them both up. What's the difference for the developers watching? What's the difference between the two and when do I need to know the difference between the two? So, you know, to me, they're really one, because Cockroach Cloud is CockroachDB as a service. You know, it's our offering that makes it, you know, a world-class, easy to consume experience of working with CockroachDB where, you know, we take on all the hardware, we take on the SRE role, we make sure it's up and running, right? You're going to get a connection string and code against it. And I think, you know, that side of our world is really all about this kind of highly evolved database and delivering that as a service, and you can actually use it as CockroachDB. What I think is really interesting, John, is the next generation of what we're building, the serverless version of our database where, you know, this is just an API in the cloud. You know, we're gonna have one instance of Cockroach with a multi-tenant, you know, database in there and any developer can actually spin up on that. And to me, you know, that gets to be a really interesting world. When the world turns serverless and we have, you know, we're running our compute in Lambda and we're doing all these great things, right? Or we're using Cloud Run in Google, right? What's the corresponding database to actually deal with that? And that to me is a fundamentally different database, because what is scale in the serverless world? It's autonomous, right? What's scale in the current Cockroach world? Well, you kind of keep adding nodes to it, you manage, you deal with that, right? You know, what does resilience mean in a serverless world? 
It's just, yeah, it's just there all the time. What's important is latency when you get to kind of serverless, like where are these things deployed? And I think to me, the interesting part of the two sides of our world is what we're doing with serverless and how we actually expose the core value of CockroachDB in that way. Yeah, and I think that's one of the things that is the nirvana or the holy grail of infrastructure as code: making it, I won't say irrelevant, but invisible. If you're really dealing with a database thing, hey, I'm just scaling and coding and the database stuff is just working with compute, just whatever, that's serverless. And you know, you mentioned Lambda. That's the action, because not having to decide what the database is, just having it happen, is more productivity for the developers. So that kind of circles back to the whole productivity message for the developers. So I totally get that. I think that's a great vision. The question I have for you, Jim, is the big story here is developer simplicity. How are you guys making it easier to just deploy? John, it's just an extension of the last part of the conversation. I don't want a developer to ever have to worry about a database. Like, that's what Spencer and Peter and Ben have in their vision. It's, how do I make the database so simple? It's a SQL API in the cloud. Like, it's a REST interface. I code against it. I run queries against it. I never have to worry about scaling the thing. I never have to worry about creating active-passive and primaries and secondaries and all the DevOps side of it, all this operations stuff. It's just kind of done in the background, dude. And if we can build it, and it's actually there now, where we have it in beta, you know, what's the role of the cost-based optimizer that we've had in databases in this new world? How are you actually ensuring data is located close to users? 
And we're automating that so that when John's in Australia doing a show, his data is going to follow him there. So he has fast access to that, right? And that's the kind of stuff that we're talking about, the next generation of infrastructure, John. We're not building for today. Like, Cockroach Labs is not building for 2021. Sure, do we have something that's great? We're building something that's '22 and '23 and '24, right? Like, what do we need to be as, you know, an extremely productive set of engineers? And that's what we think about all day. How do we make data easy for the developer? Well, Jim, great to have you on, VP of product marketing at Cockroach Labs. We've known each other for a long time. I got to ask you while I got you here, final question: you know, you and I have chatted about the many waves of open source and the computer industry. What's your take on where we are now? And obviously you're looking at it from the Cockroach Labs perspective, which is large scale, distributed computing, kind of you're on the new side of history, the right side of history, cloud native. Where are we right now? Compare and contrast, for the folks watching who are trying to understand the importance of where we are in the industry. Where are we and what's your take? Yeah, John, I feel fortunate to be in a company such as this one and, you know, the past couple that I've been around. I feel like we are in the middle of a transformation and it's just like the early days of this next generation. And I think we're seeing it in a lot of ways in the infrastructure for sure, but we're starting to see it creep up into the application layer. And for me, it is so incredibly exciting to see. Remember when cloud was this thing that people were like, oh boy, maybe I'll do it? Now it's like, anything that's new is going to be on cloud, right? Like, we don't even think twice about it. 
And the coming nature of cloud native and these technologies that are coming are going to be really interesting. I think the other piece that's really interesting, John, is the changing role of open source in this whole game. Because I think of open source as code, consumption and community, right? I think about those. And then there's license, of course. I think a lot of people were always wrapped up around licensing. Consumption has changed. John, man, back when we were talking Hadoop, consumption was like, oh, it's free. I get this thing. I could just download and use it. Well, consumption over the past three years, everybody wants everything as a service. And so we're ready to pay. For us, how do we bring free back to the service? And that's what we're doing. Like, I am so incredibly excited to go through this, bringing back free beer to open source. I think that's going to be great. Because if I can give you a database free up to five gig or 10 gig, man, and it's available all over the planet, it's fully featured. That's bringing our community and our code, which is all open source, and this consumption model back. And I'm super excited about that. Yeah, free beer. Who doesn't like free beer? Of course, developers love free beer. And a great t-shirt, too. That's soft. Make sure you get the soft t-shirt. You just don't want a free puppy. You know what I mean? Remember, it was just like, that sounds painful. Well, Jim, great to see you remotely. Can't wait to see you in person at the next event. We've got the fall window coming up. We'll see some events. I think KubeCon in LA is going to be in person. re:Invent at AWS for sure will be in person. We know that for a fact, we'll be there. So we'll see you in person. And congratulations on the work at Cockroach Labs. Thanks, John. Great to see you again. All right, this is CUBE coverage of DockerCon 2021. I'm John Furrier, your host of theCUBE. Thanks for watching.