We're in Las Vegas for day three of our exclusive wall-to-wall coverage. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host, Dave Vellante, co-founder of wikibon.org. And we have on theCUBE Dr. Matt Wood, general manager of data science for AWS. Welcome to theCUBE. Thank you so much. It's great to be here. So, doctor, I've got to ask you, can you summarize the announcements for us here? I mean, let's put the package here. Let's try to put the portfolio of innovation on the table. It's pretty large, but let's lay it out. All right, let's lay it out. So yesterday, Andy made a couple of announcements in his keynote. He talked a little bit about how customers could improve their compliance and governance standpoint using a new service we have called AWS CloudTrail. That basically allows customers to collect logs of changes to their infrastructure configuration. So when you create a security group or remove a security group, any of those sorts of changes get captured in an audit log. They get stored in Amazon S3, and then you can plug those into your standard reporting around other audit trails inside your organization. So that's CloudTrail. We also announced, you might have heard of it, Amazon WorkSpaces, a new virtual desktop done the Amazon way. We also launched Amazon AppStream, a method of running resource-intensive applications up on AWS and delivering them back down to any type of device, be it a laptop or a mobile device, without having to worry about whether the physical hardware could support the graphically intensive nature. So that was day one. Do you want to start there? Let's go to day two and then we'll go unpack them all. All right, day two. So this morning, Werner Vogels, CTO of Amazon.com, came on stage. He made a couple of announcements.
He announced that we were going to bring the PostgreSQL database to the Amazon Relational Database Service, which got a big cheer. A lot of customers have been asking about that for a long time, so it's great to be able to make that available. And within about two minutes of making the announcement, it was actually available to launch on the console and customers were using it. We also announced some new multi-region features. Customers have asked us to be able to move some of their data around more easily for business continuity and disaster recovery scenarios. So we added Redshift snapshot copy, and we also announced the availability in the next couple of months of cross-region read replicas for MySQL on RDS. This allows you to basically have a relational database that keeps copies of the data available, asynchronously replicated across multiple regions. So it allows you to set up hot and warm standbys of your application running across multiple regions. A great boost for reducing recovery time and recovery point. Or even a pilot light, I heard this morning. Yeah, there's a pilot light setup as well, that's right. So you've got to keep that pilot light warm. So we did that. We also announced some new instance types on EC2. So we announced the new I2 instance. This is specifically designed for very high performance, IO-intensive applications. So NoSQL applications and data warehousing applications. The 8xlarge instance type can drive 350,000 IOPS for 4K random reads and about 320,000 IOPS for 4K random writes. So this is a very, very high performance beast in terms of IO. And we announced the C3 instance as well. The C3 instance type is specifically designed for computationally intensive workloads. So it has the latest Intel Xeon E5 Ivy Bridge processor. It also has some smart networking technology which allows you to reduce the latency on the network, which is very important for tightly coupled, high performance computing applications.
And then of course, we wrapped up the show with Amazon Kinesis, a new managed service for real-time streaming data analytics. I think I did that almost in one breath. That was awesome. All right, good. So let's break it down. So you've got day one enterprise, day two under the hood, day three connecting all the dots at the top of the stack, pulling it all together. All the goodness of Amazon, certainly everyone's been familiar with, but what really is key is you guys are really moving the ball down the field, yard by yard, first down, first and 10, move the chains, to use the football analogy. But Kinesis is the big pass play. That allows data to stream in, which opens up a lot of new possibilities. So I want to get your take on Kinesis. Sure. We were saying earlier, the analysis is this closes the loop, because now I can put any stuff into Redshift very quickly and start doing more iteration through data and then reroute that back into my development. So that's cool. I mean, that makes the integrated stack really, really hum. Now that's disruptive. So what's your take on that? How do you see that? What are customers doing and what use cases have you seen in action? Sure, so we kind of look at data in terms of a timeline or a lifecycle. So think about it like this: customers are generating data all the time, whether it's on social networks or sensor networks, and then they need a place to collect and store that information. Once you've stored it, you need to be able to ask questions of it, so you need to be able to compute against it. And once you've computed and asked questions of that data, you typically need a way to be able to collaborate and share the results of that information. So if you rewind three to five years, the cost of data generation was sufficiently high that it was the rate-limiting step in that lifecycle.
Today, the cost of data generation is plummeting, whether you're dealing with social networks with hundreds of millions of customers, whether you're dealing with genomic sequencing, or you're dealing with building out sensor networks, putting that pressure sensor on the end of the drill that's drilling for oil. These all drastically reduce the cost of generating data, which makes the economics more favorable, so more data is going to be generated. That puts tremendous pressure on the infrastructure required to collect, compute, and collaborate around the data that's stored. So what Amazon Kinesis does is it allows an entirely new class of application to be built and developed without the complexity of having to manage either batch processing, which typically can't keep up at scale, or very, very high throughput data streams. And what it means is you can capture that. Kinesis will store the data for 24 hours in a very reliable fashion in an entirely managed environment. So customers don't have to worry about provisioning the storage and the servers and all the rest of it. They just set the amount of throughput that their application needs, and then they can start streaming data in straight away. So you're immediately persisting the data, right? Absolutely, so as soon as it comes in, it's reliably persisted across data centers, across availability zones, under the hood. And we store it there for 24 hours. And what that means is you can ask Kinesis to do a window analysis. So you can say, give me all the data for this particular sensor in the last five minutes, and I want to compare that with the five minutes from an hour ago. So you can move this window around and do some time series analysis based on the data that's flowing through. Where we expect to see these sorts of use cases really shine is with operational logs.
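The window comparison described above (the last five minutes for one sensor against the same five minutes an hour earlier, over data kept for a rolling 24 hours) can be sketched in plain Python. This is only an in-memory illustration of the idea and not the Kinesis API: the names `Record`, `RollingStream`, and `compare_windows` are hypothetical, and the sketch assumes records arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass
from statistics import mean

RETENTION_SECONDS = 24 * 60 * 60  # rolling 24-hour retention, as described in the interview


@dataclass(frozen=True)
class Record:
    timestamp: float  # seconds since an arbitrary starting point
    sensor_id: str
    value: float


class RollingStream:
    """Hypothetical in-memory stand-in for a stream with rolling retention."""

    def __init__(self, retention=RETENTION_SECONDS):
        self.retention = retention
        self.records = deque()

    def put(self, record):
        # Assumption: records arrive in timestamp order, so the oldest is on the left.
        self.records.append(record)
        cutoff = record.timestamp - self.retention
        while self.records and self.records[0].timestamp < cutoff:
            self.records.popleft()  # drop anything older than the retention period

    def window(self, sensor_id, start, end):
        # Values for one sensor with start <= timestamp < end.
        return [r.value for r in self.records
                if r.sensor_id == sensor_id and start <= r.timestamp < end]


def compare_windows(stream, sensor_id, now, width=300, offset=3600):
    """Mean of the latest window minus the mean of the same-width window `offset` seconds earlier."""
    recent = stream.window(sensor_id, now - width, now)
    earlier = stream.window(sensor_id, now - offset - width, now - offset)
    if not recent or not earlier:
        return None  # not enough data in one of the windows
    return mean(recent) - mean(earlier)


if __name__ == "__main__":
    stream = RollingStream()
    # One made-up reading per minute for two hours; the value steps up after the first hour.
    for t in range(0, 7200, 60):
        stream.put(Record(t, "pressure-1", 10.0 if t < 3600 else 15.0))
    print(compare_windows(stream, "pressure-1", now=7200))
```

Moving `now`, `width`, or `offset` slides the window around, which is the kind of time series comparison the interview describes. A real consumer would read from the managed stream instead of an in-memory deque.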
So these are customers that have large scale operations, that have lots of instances, lots of operational metrics at their application level, at their database level. And they want to be able to collect all of that and immediately respond to any changes in their operational status. So whether a query is running slow or a particular application server is misbehaving, you want to be able to quickly identify that and respond to it in kind. Okay, 24 hours. What happens after 24 hours? It's a rolling process. If you want to persist that data, you have to be able to take it and put it, well, you can put it into S3. S3 or Glacier. And then from S3 you can persist it in Glacier. So the option for the customer is to move that window if they want to. Yeah, I mean, with Kinesis, you can think of it as a big water hose. And that big water hose is elastic, so you can increase the diameter to put more water through it. And you can attach sensors to the water. And the hose doesn't care how much water it's handling. And you can add as many of those sensors to the hose pipe as you like. So you can measure the pH and the temperature and the pressure. That's what Kinesis is. You can create these Kinesis applications which are continually sensing the stream of data that's flowing through the pipe. And you can create ways for developers to look at what's going on in their applications and their environment. That's one of the main ah-has right now. Phase one is, hey, I can see stuff better. And you've got Redshift to do some querying against it. What have you guys done with the product? Share, just anecdotally, prior to the announcement, the goodness of it. What's the vibe like? What's the sentiment? Are you guys falling out of your chairs? What are some of the things going on internally, just to give a taste for the folks out there? Yeah, sure. So we're extremely excited about Amazon Kinesis. We think that it enables organizations of any size.
So whether you're a young startup with a bold new idea that doesn't want to be constrained by its infrastructure, or you're a large engineering organization that's starting to deploy large numbers of sensors, like GE, you want a hassle-free, reliable way of collecting and analyzing that data and responding to it in kind. So this has been extremely challenging from an infrastructure perspective, but also from a software and a managed service perspective. And so I think the real value of Kinesis is making those types of advanced next-generation analytics available to everyone. So I want to ask you to just change gears a little bit. Talk about you personally. Dave and I love big data. We were at the first original Hadoop World. Now all the Hadoop Summits. We totally love the data science. We built our own crowd spot app platform. It's got 75 million, 76 million people, adding a million people a day into it. All real-time, all on Amazon. We're super excited. But I've got to ask you, what is your role? I mean, you're GM of data science. What does that mean and what do you do every day? Sure, that's a good question. So I've been at Amazon about five years, and at Amazon you don't so much change roles. You just amass responsibility. You take on more responsibility, yeah. So I've ended up being at the intersection of many things. I spend a lot of time with customers, helping them get as much value as possible from the data that they have available inside their organization, so that they can get real actionable information from it. So I spend a bunch of time on that. I also spend a lot of time on launches, so my team runs all launches for the platform. So all the launches that we worked on today, my team were involved in bringing those out and helping customers understand them. So I do that. I've got a background in life sciences.
So I also spend a lot of time with our life sciences customers, whether that's in the public sector, working with the NIH, or customers like Illumina, building out these very large genetic sequencing instruments that are just sensors, basically. Everyone always talks about data scientists. It's the rage. But we were talking at our event, Big Data NYC, a couple of weeks ago, that there are about 200,000 data scientists in the world today, give or take a few, depending on how you define it. But over 2 million data analysts. So in the data world, you're going to see all the action come from knowledge workers. So how are you making it easier? What's the vision? How do you see data? I'm sure there'll be more data science. We grow on data science for sure. Well, you know, math guys write in Python, but you're going to start to see general purpose tooling around data science for analysts. What's your vision on that? Yeah, so I think the most important role for a data scientist is that of interpretation. So being able to take the business requirements from one side of the organization and interpret them, and know enough about the business that you can make an informed interpretation of those requirements, and then pass those on to the technical development and analyst teams to go off and implement those requirements. So it takes a strong understanding and a strong, clear vision of where the organization is going and what the challenges are, but also a strong, deep technical focus in understanding the statistical models and where you might use machine learning and all those sorts of things. And the flip side of that is being able to take the results of those experiments, so the statistical results, the statistical models, then piping those back into the rest of the organization and being able to make informed judgment calls about what is actually a valuable result from the data analytics that you're running.
What's the biggest thing that surprised you here at this event, something you didn't expect to happen? Overwhelming crowds, someone grabbing you in the hallway and buying you a beer, a customer saying it's a great implementation, the success of one of the launches? What surprised you this week? I think it's just the scale of the event. I mean, this is only the second year we've run it. We've got nearly 9,000 people in attendance. And because I've been at AWS a reasonably long time, five of those seven and a half years, just seeing the growth of the organization and the leaps and bounds that the organization is going through is fantastic. But more so, it's the strength of the customer stories and the diversity of the customers. I mean, we work daily with folks like Pinterest, and you can walk out of a meeting with Pinterest and walk into a meeting with NASA. And that diversity can only be delivered with this kind of utility platform. That's what I find most exciting. We're getting the hook here, so I've got to ask you for the final word on this segment. The car's leaving Las Vegas from re:Invent. What's the bumper sticker on the car? We've been asking everybody all week. What does the bumper sticker say? I think it would say, what would Kinesis do? I just want customers to start thinking: if you don't have to worry about the infrastructure and you don't have to worry about the complexity, what new applications can you build with the data that you're collecting, or want to collect, that Kinesis enables? Intelligence, creativity, new things, new use cases. This is our exclusive coverage. We'll be right back with our show wrap-up right after this short break.