Everyone, welcome to this special CUBE conversation. I'm John Furrier, your host of theCUBE. We're here in Palo Alto, California, and I'm here with a very special guest joining remotely from Seattle into the CUBE studios: the leader at AWS, Amazon Web Services, the vice president of database, analytics and machine learning. Swami, great to see you, CUBE alumni. You recently took over the database business at AWS as its leader. Congratulations, and thanks for coming on theCUBE.

My pleasure to be here, John. Very excited to talk to you.

Yeah, we've had many conversations on theCUBE, and also in person and online, around all the major trends. You've had your hand in all the action, going back to your days when you were in school, learning and writing papers. And 10 years ago, Amazon Web Services launched AWS DynamoDB, the fast, flexible NoSQL database that everyone loves today, which has inspired a generation of what I would call distributed, cloud-scale databases with single-digit millisecond performance at scale. And again, the keyword is scale. And again, this is 10 years ago. So it seems like yesterday, but you guys are celebrating it. And your name was on the original paper with CTO Werner Vogels. You're a celebrity. Congratulations.

Thank you. Not sure about the celebrity part, but I'm very excited that at least I played a hand in building such an amazing technology that has enabled so many amazing customers along the way as well.

So, a little trivia on the paper: you were an intern at Amazon, obviously getting your PhD, and since then you've risen through the ranks, been involved in a lot of products over the years, and now lead machine learning and AI, which is changing the game at the industry level. But I've got to ask you, getting back to the story here, a lot of customers have built amazing things on top of DynamoDB, not to mention lots of other AWS and Amazon tech riding on it. Can you share some of the highlights that came out of the original paper?
And with some examples, because I think this is a point in time, 10 years ago, where the kick-up of cloud scale really begins, not just for developers building startups; you really start to see the scale rise.

Yeah, actually, as you probably know from what you've read, to explain the genesis of DynamoDB itself, I have to explain the genesis of how Amazon got into building the original Dynamo, right? This was during the time when I joined Werner's team as an intern, and Amazon was one of the pioneers in pushing the boundary of scale. Year over year, our Q4 holiday season tends to be really, really big, for all the right reasons: we all want our holiday shopping done during that time, and you want to be able to scale your website, orders, fulfillment centers, all of them, at that time. This was around 2005. At that time, when people thought of a database, they thought of a single database server that runs on a box and has certain characteristics in terms of scale and availability and whatnot, and it's usually relational. Then, when we had a major disruption during Q4, that's when we asked ourselves a question: why are we actually using a relational database for some of these things when they really don't need the data model complexity of a relational database? Normally, I would say, most companies would respond to an intern or a few engineers early in their career by saying, what the hell are you suggesting? Just go away. But Amazon, being a company that enables builders to build what they want, actually let us start re-imagining what a database at our scale could look like. And that led to Dynamo. After we launched Dynamo, we migrated some of the Amazon.com services from a traditional relational database to it. And then I moved on to build some parts of our storage service and then our managed relational database service.
I explicitly remember, in one of our customer advisory boards, which is a set of some of our leading customers who give us feedback on roadmap and other things, Don MacAskill, who's the CEO and Chief Geek of SmugMug and Flickr, looking literally at me, I was standing in the corner, and saying: you all built Dynamo, so why do I need to keep sharding my MySQL database and resharding it as I'm scaling? And this was the time when the state of the art in most databases was to shard your relational database and constantly reshard, and when most websites were starting to experience the kind of scale we consider normal now. During those times, most companies used to have a single relational database backend and scale that way. That conversation resonated internally with a lot of AWS leaders and with me: hey, what does a cloud database, re-imagined without the hampering of SQL, look like? And that led us to start building DynamoDB, which was a key-value database at the time. Now we support the document model too, but it delivers single-digit millisecond latency at any scale. Imagine that.

Yeah, well, I think about that time, 10 years ago, when you were having this conversation, and I know SmugMug, he's a total geek, and he'd be glad to point that out. He also had Netflix as a customer too; I'd like to hear how that's evolved. But I think back at the time, if you look back then, I've got to ask you: most people, we've talked about this before, know that no one database rules the world. That's now standard; people don't expect one database. Back then it was a one-database kind of mindset. And then you had the big data movement happening with Hadoop, and you had the object store developing. So you were circling around that area. What was it like then? Take us through that, because there was obvious visibility that, hey, let's just store this.
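The resharding pain described here is exactly what Dynamo-style partitioning was designed to avoid. Below is a minimal, hypothetical sketch of consistent hashing, the partitioning scheme from the original Dynamo paper; the class and names are illustrative only, since DynamoDB manages partitioning internally and exposes nothing like this:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring in the spirit of the Dynamo paper.
    Purely illustrative, not anything the DynamoDB service exposes."""

    def __init__(self, nodes, vnodes=100):
        # Each node owns many points ("virtual nodes") on the ring,
        # which evens out the key distribution across nodes.
        self.ring = []  # sorted list of (position, node)
        for node in nodes:
            for i in range(vnodes):
                bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # A key belongs to the first node clockwise from its hash
        # position; wrap around at the end of the ring.
        idx = bisect.bisect(self.ring, (self._hash(key), "")) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("customer#42"))  # consistently maps to one node
```

The payoff is that adding a fourth node moves only roughly a quarter of the keys, whereas naive modulo sharding (`hash(key) % num_shards`) remaps almost every key, which is the constant-resharding pain MacAskill was complaining about.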
Now you see data lakes, and that's all happening. But back then, the object store was kind of new.

Yeah, that's a great question. One of the things I realized early on, especially when I was working with Werner, even using Amazon.com itself as an example, is that the access patterns of various applications in Amazon, let alone among AWS customers, tend to be very varied. Some of them really just needed an object store. Some of them needed a relational database. Some of them really wanted a key-value store with fast latency. Some of them really needed a durable cache. But it so happens that when you have a giant hammer, everything looks like a nail, which is essentially the story at the time. Everyone kept using the same database irrespective of what the problem was, because nobody had thought about what else we could build that would be better. So this led us, and I literally remember writing a paper with Werner internally that was widely read in Amazon, to explain the menu of workloads that exist, and how we go about solving for each of them so that customers can grow and innovate faster. And this led to the genesis of not only building RDS and Aurora and so forth, but also Dynamo and various other non-relational databases, let alone storage access patterns and whatnot. This was one of the big revelations we had: there is not a single database that is going to meet customer needs as the diversity of workloads on the internet grows. And this was a key pivotal moment, because with cloud, applications can now scale way more instantly than before. Building an application for the Super Bowl is way easier than before. That means everybody is pushing the boundaries of what scale means, and they are expecting more from their applications. That's when you need technologies like DynamoDB. And that's exactly what DynamoDB set out to do.
And since then, we have continued to innovate on behalf of our customers in the purpose-built database story as well. This concept has resonated well across the board; you can see the database industry has also embraced this methodology.

It's natural that you also evolved into the machine learning side of it, because data is a big part of that. And you see, back then, you're bringing back flashes for me of the data conversations: the data movement was just beginning. So the idea that you could have diversity in access methods, that the kind of database was a use case driven by the application, not so much the database saying, this is how you have to work. The script was flipped. It changed from infrastructure dictating to the applications what to do; now the applications go to the infrastructure saying, give me what I want. I want to access something here in an object store, something there in NoSQL. That became the genesis of infrastructure as code at a global level. And so your paper kind of set the wave, the influence, for this NoSQL, big data movement. It's created tons of value. MongoDB might have been influenced by this; other people have been influenced. Can you share some stories of how people adopted the concepts of DynamoDB, how that changed the industry, and how that helped the industry evolve?

I mean, first of all, with Dynamo we were fortunate to share our experience of building a Dynamo-style data store, with a non-relational API, and to show some of the experiences we went through in building such a system. And we set out early on that it should not be just a design paper, but something where we share our experiences.
So even now, when I talk to my friends and colleagues at various other companies, one thing they always tell me is that they appreciated the openness with which we shared some of the examples: the lessons we learned optimizing for percentile latencies, what some of the scalability challenges were and how we solved them, and some of the techniques around things like sloppy quorum and various other stuff. We invented a lot of terms along the way too, but people really appreciated several of our findings and us talking about them. And since then, so many other innovations have happened, within AWS but also across academia and industry in this space. Databases have been going through what I call a period of renaissance. If you look at the journey Raju and I started on the database front, we started with the premise: if you were to build a database where cloud is the new normal, and this is again in 2008 when we asked ourselves that question, what would we build? That led us to start building things like DynamoDB, RDS and Aurora, let alone re-imagining data warehouses with Redshift, and several other databases like Timestream for time-series workloads or Neptune for graph and whatnot. The moment you start asking that question and working backwards from customers, you start being able to innovate accordingly. And this has resonated really well: more than 100,000 AWS customers have chosen DynamoDB for mobile, web, gaming, ad tech and IoT. Many of these are fast-growing businesses such as Lyft, Airbnb and Redfin, as well as enterprises like Samsung, Toyota, Capital One and so forth. So these are really meaningful workloads, let alone Amazon.com, which runs on it too.

You have an internal customer; it's always good to have that inside customer.
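To illustrate the quorum technique mentioned above: in the Dynamo paper, reads and writes each contact a subset of the N replicas, and choosing R + W > N makes the two subsets overlap, so every read sees at least one up-to-date replica (sloppy quorum relaxes this with temporary stand-in replicas and hinted handoff). A minimal, hypothetical sketch, not DynamoDB's actual implementation:

```python
import random

def quorum_read(replicas, r):
    """Sample R replicas and return the value carried by the highest
    version; lagging replicas would be reconciled later (read repair).
    Replicas are modeled as (version, value) pairs for illustration."""
    sampled = random.sample(replicas, r)
    return max(sampled, key=lambda rep: rep[0])[1]

# Three replicas (N=3); a W=2 write reached two of them, one lags.
replicas = [(2, "new"), (2, "new"), (1, "old")]

# With R=2 and R + W > N, every read quorum overlaps the write quorum,
# so every read observes the latest acknowledged write.
assert all(quorum_read(replicas, 2) == "new" for _ in range(100))
print("quorum reads always returned the latest value")
```

With R=1 instead, a read could land on the lagging replica alone and return the stale value, which is the eventual-consistency trade-off the paper discusses.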
You know, I find this a really profound use case, because you're talking in Amazonian terms, so I'll translate for the audience: working backwards from the customer, which is the customer obsession you guys have. So here's what's going on, the way I see it. You've got the DynamoDB paper, you and Werner and the team, and Werner has a great video on his blog post that goes into the talk he gave around that time, which is fun to watch if you look back. But you have a radical enabler here that's disrupting and changing things: S3, RDS, Aurora, these are game-changing concepts inside the landscape of AWS. At the same time, you're working backwards from the customers. The question I have for you, as a leader and as a builder: how did you balance working backwards from the customer while bringing something brand new and radical, at that time, to the market?

This is one of the hardest things we as leaders need to balance. Many times, when we work backwards from customers, the literal way to translate it is to do exactly what customers are asking for, which is right nine out of 10 times. But one out of 10 times you've got to read between the lines on what they are asking, because many times customers may not articulate what they need in the right way. They might say, I wish my horse carriage went faster, but they're not going to tell you they need a car. You need to be able to translate and read between the lines. We put that under the bucket of innovating on behalf of customers. And that is exactly the kind of mantra we had when we were thinking about concepts like DynamoDB. Because at that time, almost everybody, if I had asked, would just have said: I wish a relational database could scale from 100 gigabytes to one terabyte, or take up to two million transactions a second, and still be cheap.
But in reality, relational databases, the way they were engineered at that time, were not going to meet those scale needs. So this is where we had to read between the lines on what the key must-have needs from customers were, then work backwards and innovate on their behalf, asking which workloads would be enabled by this, and so forth. That's what led us to launch the initial version of DynamoDB with single-digit millisecond latency and seamless scale. At that time, databases didn't have the elasticity to go from 10 requests a second to 100,000 or one million requests a second and then scale right back down within an hour. That was not possible. And we enabled that, and that was a pretty big game changer that brought the elasticity of the cloud to the database world.

Yeah, and not to nerd out on this, but it enables a lot of other cool scale concepts: queuing, storage, it all kind of hangs together with this database piece that you guys are solving. And again, props to you guys and the team, congratulations. I have to ask, more generally, how has your thinking changed since the paper? Obviously you've got more experience under your belt. You don't have the gray hairs yet, but we'll see those come in soon. But you've got a lot more experience, you're running teams, you're launching a lot of products. How has your thinking changed, and the industry, since the paper? What's happening now? What's the big evolution? What are the new things now that fall under innovating on behalf of the customer? What's between the lines now? How do you see this happening?
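That elasticity has since been automated: DynamoDB auto scaling uses target tracking, adjusting provisioned throughput so that consumed capacity stays near a target utilization (and on-demand tables skip provisioning entirely). A toy sketch of the target-tracking arithmetic follows; the function name, parameters and limits here are illustrative assumptions, not the actual service API:

```python
def desired_capacity(consumed_units, target_percent=70,
                     min_cap=5, max_cap=40_000):
    """Target tracking in one step: provision enough capacity that the
    observed consumption sits at the target percentage of it, clamped
    to the table's configured floor and ceiling. Names and defaults
    are illustrative, not real service parameters."""
    desired = -(-consumed_units * 100 // target_percent)  # ceiling division
    return max(min_cap, min(max_cap, desired))

print(desired_capacity(70))       # 100: 70 units consumed at a 70% target
print(desired_capacity(1))        # 5: clamped to the table minimum
print(desired_capacity(700_000))  # 40000: clamped to the table maximum
```

Run periodically against observed traffic, this is the shape of the feedback loop that lets a table ride from 10 requests a second to a spike and back down without manual re-provisioning.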
I mean, since launching Dynamo, I've had the opportunity to work on various problems in the big data space, on some of the analytics services you might be aware of, all the way from Redshift to QuickSight, having built systems that enable customers to store, process, query and analyze data. One of the realizations I had, around 2015 or 2016, I can't remember exactly, was that machine learning was hitting a critical point where it was ready for scaled adoption, because cloud had basically enabled limitless compute and limitless storage, which were the factors that had been holding machine learning technology back. I realized we had a unique opportunity to bring machine learning to everybody, not just folks with a PhD in machine learning. That's when I moved on from the database and analytics areas to start on machine learning, which is an adjacent area because machine learning is powered by data, and started building capabilities like SageMaker, an end-to-end ML platform to build, train and deploy ML models, which has become the leading enterprise ML platform, and then also a bunch of our AI services. The reason I'm giving all this historical context is that one of the biggest realizations I had early on, in 2016, is that machine learning is one of the most disruptive technologies we will encounter in our generation, right after cloud. I think these two are the most amazing combination; together they have revolutionized how we build applications and how we reason about them. Now, the second thing is that, at the end of the day, when you look at the end-to-end journey, it is not just about one database or one data warehouse or one data lake product or even one ML platform.
It is about the end-to-end journey, where a customer is storing their order database, and then building a data lake that has customer history and order history, and they want to be able to personalize the viewer experience, or forecast what products to stock in their fulfillment centers. For that, all these things need to work end to end, and that view is one of the big things that has struck me over the past five years. I've been on this journey, in addition to building these ML building blocks, to connect the dots so that customers can execute on this modern, end-to-end data strategy, as I call it. It goes beyond a single database technology or data warehouse or ML technology; it is putting all of these together end to end so that customers don't end up spending six months connecting the dots, which has been the state of the art for the past couple of years. We are bringing it down to a matter of weeks and days now.

Yeah, the speed is incredible. Swami, thank you so much for spending the time with us here on theCUBE.

My pleasure. Thanks again, John. Thanks for having me.