 Okay, we're back. This is Dave Vellante with Jeff Kelly. We're here at the MongoDB Days conference at the Marriott Marquis in New York City. We have been going all day. Miles Ward is here. He is a tech athlete, as we like to say on theCUBE. We love the sports analogy. He's a Solutions Architect with AWS. Great conversation off camera, and welcome to theCUBE, excited to have you on. Thanks, I appreciate you being here. Yeah, so, well, let's start with why you're here at MongoDB Days. Sure, Amazon Web Services listens to customers. And that's not just a bunch of BS marketing, is it? You guys really believe that, you know? Bezos says it, Andy Jassy says it, and when we talk to you guys, you always try to really emphasize that piece of it, but so anyway. I mean, I can tell you, I sat in NASA facilities at JPL in Pasadena while we powered the live streaming of the landing of the Mars rover. And helping that kind of company take data from Mars and store it in S3, that's the line. It goes from there to there. If that's not changing the world, I don't know what is. If you didn't watch the re:Invent keynotes, go back and check them out. Just Google re:Invent keynotes and you'll see that Mars rover keynote was fantastic. These guys were pumped up. You know, the big celebration was awesome. That's the thing about Amazon. You guys have the deepest customers. You've got the proof points. But anyway, so you started to tell us why you're here at the Mongo event. So in this sort of ecosystem of change, we're watching a whole new set of technology tools that break the standards, that have altered the baseline of what you can squeeze, from a performance standpoint, out of hardware. And MongoDB, as really, we think, the kind of leader in the NoSQL open source space, is doing things that other technologies are not able to do and making it so that businesses can entertain use cases that were just impossible. 
So what kinds of things can Mongo do that maybe some of the others don't excel at or are struggling to do? Sure, as a document store, that's different than, say, a key-value store or a graph database or more of a MapReduce system like Hadoop. Document stores speak the same language as applications. They interact using exactly the same data structures that your software and, frankly, that your websites use. So as a developer that has to learn, I think the new soupe du jour is the full stack. You've got to be great all the way across from the front end to the back end. Being able to interact with a data storage system that speaks to application developers, that's actually storing data the way applications use data, is a game changer. So we've heard the term polymorphic a few times today, and shout out to David Scott, who's the former CEO of 3PAR, a company that got acquired by HP. He coined the term polymorphic storage and everybody laughed at him. He said it's going to stick, and interestingly enough we've been hearing it more and more today. Okay, so let's talk a little about your role at Amazon. I mean, Amazon has just been exploding. We've been tracking it. You guys started the cloud. You're bringing the cloud now to more and more places, really driving hard after the enterprise customers, we notice, obviously, but talk about your role at AWS. Sure. Amazon, I think it's important to recognize, built this series of products that add value from my mom's blog of pictures of her goats, which costs one penny every nine months because that's when the S3 charges round up to a penny, to Shell Oil, the biggest, you know, giant, giant business deploying millions of dollars of infrastructure, right? We add value and differentiated benefit for businesses all across that continuum. So my role as a solutions architect is trying to figure out how our tools work across that broad continuum. I go work with startups. I work with our partners. 
I work with very, very technical businesses and with businesses that don't want to be very technical, all trying to get them to learn the best practices for extracting value from the cloud. So, you know, as I say, we've seen Amazon attack, I use the term attack, the enterprise, you know, with a vengeance, and really going after customers and trying to understand what they need. I actually wrote a piece early this year called the Amazon gorilla attacks the enterprise, and somebody pointed out to me, you know, they're the gorilla, they're the large incumbent, but for the first time ever in the history of the industry, they're also the disruptor, the cheetah. So that's interesting. I tried to think of another example and I couldn't in this business. How are enterprise customer requirements different, and how is that, you know, causing you to change your architecture and the way that you guys respond? Sure, it's a great question. You know, we hear from enterprises on the one hand that they want to earn all of the same value that those little startups have earned. They want business agility. They want lower time to market. They want lower operational cost. They want no capital cost. They all operate in the same kind of constrained business environment that, you know, mom and pop shops or garage startups or what have you are operating in. So part of it is about learning from these nimbler, younger, smaller businesses and making sure that enterprises can get the same value that they did. Another part of it that we're learning from enterprises is that, you know, I think Amazon.com is a massive business. We've learned a lot from it in trying to make sure that we met its needs. We've had these other marquee customers that have done sort of incredible things on top of us, but the big enterprises, they have truly gigantic workloads, you know, petabytes and petabytes and petabytes of storage, you know, giant computational processing loads. 
And so we're continuing to invest, you know, in frankly that part of the market that has the biggest need, we think, for disruption, right? I think of Redshift as one system for, you know, scale-out MPP data analysis. You know, that fits a different market niche than, say, MongoDB does, but, you know, it's a kind of product that's really a love letter to your average enterprise, with incredible cost savings for that workload. So from the enterprise standpoint, I've always said, well, not always, but recently I've been saying, we're sort of entering a new phase. You guys started it all in 2006 and it was like, hmm, that's interesting. And then when the economy went south, a lot of people said we've got to get to variable expenses as fast as possible. It's like, here, take my problem. And then when the economy started upticking again, you started to see lines of business spend. So you had this so-called shadow IT phenomenon emerge. And a lot of CIOs were saying, no Amazon, not on my watch. And they'd say, we don't use any AWS, we don't use any public cloud, and you go, yeah you do, I was just down talking to, you know, the marketing department and they're all over this thing. Really? And so that sort of woke IT up, I think, and now what we're seeing is, I think, in many instances the CIO saying, okay, here's another arrow in the quiver. We're going to try to embrace this thing as opposed to fight it like we did distributed computing, because we all know how that ended. So I wonder, do you sort of agree with that? Sort of those phases, those scenarios, and what are you seeing in the marketplace in terms of adoption? Two different things. So one, if you look at those sort of different, you know, kind of phases, beneath that is this undercurrent of change in the status quo of what it means to be successful in IT, right? 
Inside of those years is the kind of birth of this idea of DevOps, of businesses that have decided that the infrastructure needs to be so nimble that it is now its own kind of software product, that you'll be building automation that builds your business. So that's infrastructure as code. Yeah, yeah, yeah, my data center lives in my source control, right? And so there's the bring your own data center movement, where now marketing and these other departments can build on their own because they have developers of their own, and developers are all it takes to get to a place where you have a massive, amazing data center when you're using the cloud. That transition, I think, has changed the status quo enough that enterprises are, in our experience, moving from "Cloud? Interesting, that'll be a six-month proof of concept. Maybe we'll examine it for a couple of our workloads" to "We've closed data centers 17, 19, and 22. We have 90 days to evacuate. Let us know how you move everything." That's the kind of stuff where, as a solutions architect, I go in and say, all right, let's try to figure out how we make that work. So that speaks to just much deeper business integration as opposed to my problem for less. Yes, yeah, once you realize that the problem that the business has is not its ability to pay the costs of the infrastructure that it delivers, but its ability to move work through the pipeline more quickly, its ability to execute against business goals more nimbly. When that's the inhibitor to growth, then you start to invest in any way that you can in things that give you more flexibility. And AWS gives great flexibility. You talk about DevOps. You guys were early on in terms of DevOps adoption, the whole hyperscale movement. You think of Amazon as, again, an early adopter. And over time those concepts are seeping into the enterprise. Open source is another one. I wonder if you could talk about Amazon and open source a little bit. 
I mean, MongoDB is open source, and you guys host a lot of open source products. What's your role in the open source community? So it's interesting. Mongo, as a successful open source product, I think is another great example of places where businesses earn value from the open source ecosystem. That's a part of the way that culture of code works. One of the great examples of that is the S3N extensions for Hadoop. That's kind of a weird technical thing to say, but we've made it so that all MapReduce users everywhere can interact with S3 storage systems, and other storage systems that expose the S3 API, to interact with storage remotely at an efficiency that is totally impossible given the normal bring-the-code-to-the-data sort of MapReduce model. So that's the process, where we're taking tools that everybody finds useful and figuring out how to extend them, maybe to make them scale better, maybe to make them more cloud ready, maybe to make them easier for folks to use. Another good example is Boto, an open source Python SDK for AWS. The Boto system does an amazing job of making it so that a smaller number of developers can manage a bigger amount of AWS because it's such a straightforward CLI. So that's another open source project where we're working together with folks from Eucalyptus, folks from other sort of teams that are working in the cloud environment, to collaboratively build great software. So talk about Elastic MapReduce, Redshift, I know Glacier is another one. Again, you listen to guys like Andy Jassy talk. These are some of the fastest growing services in the history of Amazon. And he talks about the flywheel, the flywheel effect, the idea being that you've got this sort of self-fulfilling motivation or prophecy where the more function you add, the more ecosystem partners you get, the more customers you get, the faster you can spin that flywheel, and innovation comes out faster. 
Miles, I want you to talk about, from an architect's perspective, how do you architect a platform that doesn't break under the weight of all that function and all that speed? How much further into the future can you take this? And how do you do that? Sure, well, it's one of the most basic sort of amazing things. You know, when the data center is something that you manage with screwdrivers and Ethernet cables, the maximum scaling point gets to a place where so many people have to be trained to do the right thing at the right time that it becomes kind of a personnel management issue. You have to really build a functioning mechanical business as much as an intelligence business. With AWS, when you've built a system on top of a programmatic infrastructure, that code, it's durable. And that sort of durability of innovation is this incredible boon where, you know, if Bill leaves the company but he leaves his code behind, Bill's innovation stays. And so we're also watching a lot of creative development where people are taking, you know, for example, I've built a great Amazon Machine Image that runs a more efficient version of Hadoop. And then I build a cluster of Hadoop using CloudFormation, and then I script sets of clusters using Boto to ensure that I can deploy what I want. And I package all of that as a Boto library and use Boto to call sets of libraries, right? Or you do that with Chef, or you do it with Puppet or these other kinds of systems. As people build these packages that work, they open source those. We're starting to see open source data centers, because the only definition of the data center is the control software that manages it. So we're starting to see businesses that are saying, it's not enough just to open source the software package that runs on one computer. I want to open source the design of my whole data center. Let's talk a little bit about Obama for America. You were very much involved in that project. 
As an architect. Obama for America raised a lot of dough for not a huge investment, relatively speaking. Take us through sort of what your role was there and tell us a little bit about that. Sure, we started about a year before the campaign, met with the CTO and the technical leaders, and decided to volunteer for the remainder of the campaign and serve the same kind of role as a solutions architect: help them design storage infrastructure, high performance web infrastructure, their application deployment process, their dev and test and staging methodologies, a lot of their high-scale data analysis tools. Those systems got used to build 200 distinct applications, served millions and millions of concurrent users on election day, and had 100% uptime on election day. It did what it was supposed to do. It was pretty amazing. And from the business value standpoint, the ROI equation, it was put in a million on one end and a billion comes out. Yeah, and the fundraising component is important, but if you look at how much those software tools were used to coordinate the volunteers, to coordinate legal representation for disenfranchised voters, to do all of the work around canvassing and call voting, the real measure is very binary. Like, you either win or you don't win, right? They put in a million, out comes the president. Yeah, that's a different thing. So can we dig into Mongo on AWS? Help us understand how that compares to deploying Mongo in your own data center. What do you offer that makes that a more compelling offering, in your opinion? Sure, so I'm the author of the MongoDB on EC2 white paper. So you're the man to ask. I wrote the best practices for this. But it's really convenient, even though that's a 25-page paper that you and I could read, and we can sit in front of your computer and you can deploy a great cluster. It'll take you about half an hour. And we've reduced that to software, right? 
So there's no reason that, once we've described what the best practice is, we can't capture that as code. So we use a system from AWS called CloudFormation. CloudFormation allows you to deploy, as a single atomic command, the entire infrastructure that you want. So rather than, let's get the CD out and put it in the drive, and then we've got to turn the computer on and then we've got to configure it a bunch, you hit the play button and then a replicated, sharded MongoDB fleet pops up. So that's a pretty compelling case. That's pretty easy. You press the button and then there you have it. And if you need it to be 10 times as big, you put a zero at the end and you hit enter. And then you've built a cluster 10 times as big. So what are you seeing customers really doing with Mongo on AWS? What are some of the killer use cases you're seeing? Sure, again, if you look at a document store as different from these other kinds of NoSQL strategies, as different from a sort of relational database strategy, one of the big things, just sort of right off the top, is data use cases that are just too big or too fast for your average relational database system, right? You need inserts under a certain number of milliseconds. You need to be able to do millions of those inserts. That becomes a multimillion-dollar problem in a relational database environment. It becomes a multi-thousand-dollar problem in a NoSQL kind of environment. So part of this is just about picking tools that are cost efficient. Another dynamic of that is we're seeing a lot of businesses build more and more sophisticated geospatial applications. There are very rich geospatial systems inside of Mongo, rich data types for that, which differentiate it from other NoSQL systems. And also a lot of very interesting work in these sort of end-to-end deployments across the software stack, where you're writing what the applications need to read directly to the database. 
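To make the "put a zero at the end" remark concrete, here is a minimal Python sketch of the idea, not the template from the white paper. It only generates a CloudFormation-style JSON document in which the fleet size is a single parameter; it does not call AWS and does not configure MongoDB itself. The function name `build_template`, the logical IDs, the tags, the instance type, and the placeholder image ID are all illustrative assumptions.

```python
import json


def build_template(shards, replicas_per_shard=3,
                   instance_type="m1.large", image_id="ami-EXAMPLE"):
    """Return a CloudFormation-style template (as a dict) describing one
    EC2 instance per replica set member. All names here are hypothetical."""
    resources = {}
    for shard in range(shards):
        for member in range(replicas_per_shard):
            # Logical IDs in a template must be alphanumeric.
            logical_id = f"Shard{shard}Member{member}"
            resources[logical_id] = {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": image_id,          # placeholder, not a real AMI
                    "InstanceType": instance_type,
                    "Tags": [
                        {"Key": "Role", "Value": "mongod"},
                        {"Key": "Shard", "Value": str(shard)},
                    ],
                },
            }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative sharded, replicated fleet layout",
        "Resources": resources,
    }


# One shard, then "put a zero at the end" for ten shards.
small = build_template(shards=1)
large = build_template(shards=10)
print(len(small["Resources"]), len(large["Resources"]))

# The dict serializes to the JSON document a deployment tool would consume.
template_body = json.dumps(large, indent=2)
```

The point of the sketch is that the description of the fleet is data produced by code, so growing the cluster is a one-character change to an argument rather than a new round of manual setup.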
So let's take kind of a step back for a little bit broader picture. So we were at, I don't know if Dave mentioned earlier, we were at the GE event earlier this week. They announced their kind of industrial internet platform that they're building with Pivotal. I was on a panel with AWS among others, and AWS is a partner of GE in that effort. So the idea is that all this industrial grid equipment is just creating more and more data, and it's being outfitted with sensor technology, et cetera. So clearly data volumes are going to continue to grow, both from more of the commercial internet, if you will, social media and other things, but also now the industrial internet, where we're seeing a lot of data growth. How is that impacting AWS, or how does AWS approach that on a high level? I mean, what's your strategy to deal with those growing data volumes, and not just the growing data volumes, but especially industrial internet situations where the goal really is to orchestrate processes through analytics across a large ecosystem? How are you going to support that? So there are a couple of different parts to the answer to that. First off, every time that I've ever seen someone describe the internet of things or machine-to-machine technologies or industrial internet, there are a bunch of labels for this stuff, sensor-driven analytics, invariably they draw some flow chart that has a whole bunch of little dots down at the bottom, and then they all have this little arrow that goes up to this thing that's shaped like a cloud. We built a cloud, it's pretty powerful. We have the big one. So that seems like a natural spot for us to play in that role. We built a system that's distributed globally today. So no matter where in the globe you're deploying these kinds of industrial systems or sensor-based systems, we have systems that are close enough that they can deliver the kind of analytical value that you want. 
We've also invested quite a lot in what we think are some of the most efficient to use, some of the highest scale, big data analysis and storage systems, right? So some of the big work goes into kind of extracting value as the data comes in. Some of the work goes into archival for later retrieval and offline analysis. So we've built things like DynamoDB that can catch the data in two milliseconds and process away at it, all the way over to Glacier, where we can store it for a penny a gigabyte a month at the same durability as S3. So you have all of this rich ecosystem of choices depending on the skills of your developers and the specific needs of the data set that you bring in, whether it's tiny little bits or video or what have you, and a whole range of tools to address that. And the video and that kind of video-related content and image content, I mean, that's a huge source of massive data that's really just going to continue to grow. When Netflix peaks in the late afternoon, they're a third of the internet in the United States. Wow. So that's not small. And we don't think Netflix is anywhere near the last or the biggest or the most voluminous user of compute, storage, database, or network infrastructure. They're a great example because they've been a partner with us for a long time. But I think you're going to see businesses like GE and like Shell and these others do things many orders of magnitude bigger than we do. When you project out where we're going to be even five years from now, it's just mind-boggling. We were at O'Reilly Velocity earlier this week, and one of the things with developers, I mean, everybody's talking about mobile, obviously, it's like mobile, mobile, mobile, mobile. One of the comments was, two years from now, we won't even be talking about mobile. Everything's going to be mobile. But the challenge that people brought up was, it's like two steps forward, one step back. 
We go into developing mobile apps and all of a sudden things get slower, they get more complicated. So I wonder if you could talk about that a little bit, what you're seeing in your customer base, how you're dealing with some of those performance issues generally and maybe even specifically with Mongo. I know you're doing some stuff with SSD and performance tuning, maybe talk about that a little bit. I mean, this device has roughly six times the compute performance that was required to land the space shuttle, right? I mean, we've come a long way, they're a lot better. And the downside of that is that expectations have gone crazy, right? I want to have a map of the entire planet at incredible fidelity. I want to have all the music I've ever heard. I want to store every picture and show them to my buddies, right? So those are super high-scale data uses, in some ways more data than you even want to get to with your laptop, because it is so useful in situ. It has to come from somewhere centralized. And many of the application developers that we talk to, you know, they're good at building captivating interfaces, they're good at building user experiences, but they need a lot of support around making sure that that little cloud that goes in the middle of the drawing with all the zillions of phones, that it scales, that it grows, that it does all the things that it's supposed to do, when it's supposed to do them, at a price that makes sense for their business. So one of the biggest things we've seen that's interesting in the sort of mobile space is mobile performance, right? It's not enough just to have the big cloud in the middle. We built this system called CloudFront that shoves to the edge of our networks, closer to the cell carriers, closer to the individual mobile devices. 
The video and picture and audio content that's so popular on them, so that when you do things like streaming audio or streaming video or this kind of streaming webcast, that's the sort of thing that CloudFront could accelerate for mobile users. Frankly, with the push of another button, right? You just sort of turn that feature on. It's not that tough. All right, Miles, well, listen, thanks very much for coming by theCUBE. As we say in Boston, you're wicked smart. We loved having you on here, and we hope to have you back sometime. Maybe see you at re:Invent and... Perfect, looking forward to it. We're really excited about it. Thank you for taking the time. You're welcome. Keep it right there, buddy. We'll be right back. Jeff Kelly and I, this is theCUBE. We're at the MongoDB event. We're live at the Marriott Marquis. We'll be right back.