theCUBE is a live mobile studio. We're back here live in Las Vegas. This is SiliconANGLE and Wikibon's theCUBE, our program where we go out to the events and extract the signal from the noise. We are going to break down the analysis here of Amazon Web Services' slew of announcements. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host, Dave Vellante, and a special appearance by Wikibon analyst Jeff Kelly, who is the number one big data analyst on the planet. Of course, we're biased. Jeff, welcome back. Yesterday you were out scouring the landscape. We had too many men earlier on. So we really want to get your perspective. We know you were out there talking to a lot of the folks, talking to the developers. A lot of stuff going on that smells and walks and quacks like big data here. I mean, a lot of enablement. Obviously the cloud platform is enabling a huge tsunami of new applications. DevOps is obviously the big focus. What's your take?

Yeah, a lot of big data related announcements this morning. You saw things both on the analytics side, the more sexy side of big data, but also around the whole concept of making big data, and data management in a big data context, more enterprise grade. So you saw things around reliability, supporting multiple zones, backup and disaster recovery around RDS and some of the other data platforms that AWS offers. And of course you saw Kinesis, which was I think probably the biggest announcement. It got my attention the most. Kinesis is a new service from AWS, a managed service all around stream processing of data. So you can basically power real-time intelligent applications. So those are kind of the two things that I heard this morning. Again, Kinesis, the analytics, that's kind of the sexy part of big data. The stuff around reliability and disaster recovery, keeping data consistent, is not quite as sexy, but very important in terms of making big data enterprise grade.

Now I was talking to David Floyer earlier this morning.
He said that based on his information, Amazon developed Kinesis mainly using its own technology. So they're not buying in technology. This is really their first entry into the space. And Jeff, you and I have talked about this a little bit: the whole notion of real time in this new unstructured world, in the Hadoop world. We've done a number of case studies. The one that obviously we talk about a lot is Tapad, which is an ad serving application. And essentially what they're doing is merging their analytical data with some of their transaction data and serving up ads in near real time as the customer is trolling around the website. So the company behind that tech is Aerospike. They use SSDs. Others are doing that as well. We've had a number of folks on theCUBE. This Kinesis feels like it's related to that. Maybe not directly head to head, but talk about that a little bit.

Right, well, so as you said, Kinesis was developed internally by AWS. Their problem was they had all this metrics data, basically how people were using the AWS platform, and they wanted to monitor and analyze that data in real time. So they started to develop what is now Kinesis. And they realized, well, this is something that could be applicable to a lot of use cases in our customer base. So they went ahead and made this a full-fledged managed service running on AWS. So how does it relate to some of the trends we're seeing in the larger big data world? What struck me initially hearing about Kinesis was that it's very much related to the industrial internet use cases around automating operations using real-time analytics. Essentially creating intelligent operational applications, so that, for example, a wind turbine can self-correct based on real-time data that's coming from both the machine itself and maybe outside sources like weather data and other things.
Essentially automating these decisions in real time, because you can't have a person making those kind of operational-level decisions in real time. So that's kind of where I see this starting to fit in. In terms of how it actually compares to some of the other technologies out there, you mentioned Aerospike and Tapad. I think ad tech is probably the easiest example for people to understand. You log on to a site, and within milliseconds you get an ad that's tailored to you. There's a lot of analytics going on in the background analyzing who you are, your profile. There are advertisers that then bid on whether they want to show you an ad. The winner has to decide from their inventory which ad to show you based on your profile and your potential to buy. All that analytics happens in under a second behind the scenes, and it's obviously automated. There's no person back there doing that kind of analysis; that's just not feasible. So that's kind of the easiest example to understand, but apply that type of workflow to any number of industries in the industrial internet space, like healthcare, transportation, energy, and you can see the potential value of really automating these systems, creating almost like intelligent operational ecosystems around all these different industries that impact everybody. Now in terms of where Kinesis fits in, it's unclear to me if it's truly a streaming engine. The example, and the way Werner described it in the keynote today, talked about data streaming in and then persisting it into different databases such as DynamoDB or others to then do an analysis on it. So to me it didn't quite reach the level of a true streaming engine like something you would see from IBM, Streams.

InfoSphere Streams, HStreaming, which was recently acquired.

Exactly, so we still need some more details to determine whether it really fits the streaming category, the streaming definition.
And then the other thing that I was a little disappointed with in the keynote was that the example was a social media application.

Yeah, they took the Twitter firehose.

They took the Twitter firehose, with "which is my favorite planet," and it ended up being Mars, but it was very much sort of trite. It was a little trite, but it was a person. It was an end user doing some analytics, and it was real time in the sense that the data was coming in within milliseconds, but it was an analytic use case where a user was trying to gain some insight. It was not automating an action, which to me is where the real value of the streaming capabilities comes in.

I agree. I was more impressed two years ago with Larry Ellison's demo of Twitter, but that wasn't that impressive.

Right, well, I think it wasn't the greatest use case to pick to show, because, you know, Twitter, social media, people say great, it's kind of neat, but is it really going to change the world that I can now analyze Twitter in real time as opposed to 10 minutes old? And again, for me the real value of big data and the industrial internet is when you start to build applications that are infused with analytics, that are automating operational decisions that a person just can't make. And when you start doing that you start creating, as I said, intelligent ecosystems where you can run much more efficient operations in healthcare, energy, the energy grid for example. It leads to lower prices for consumers, it leads to better service for consumers. So to me those are really the potentially valuable use cases, and it's not clear to me that Kinesis is able to tap into those operational type applications.

But isn't ad tech an example of that operational application? You're essentially making a decision in near real time without really human involvement, right?
That's the best example and one of the only real-world examples that's out there right now. It needs to be applied more to, as I said, these other use cases, and that's why we're seeing GE build their cloud, and some of the technology they're working on with Pivotal, and AWS as a matter of fact, to do that.

Well, what about high velocity trading, for example? I mean, is this not an application for that? They said financial services; I presume they're talking about trading apps.

Yeah, absolutely, that would be a use case for sure. I think one of the challenges, you know, there have been complex event processing engines, but that was generally one source of data; it was not a distributed system. So that's where Streams comes in, for example; that's a distributed streaming environment. So yeah, that's an example. I don't know if I'd put that into the industrial internet use case and some of these societally beneficial use cases. You could argue, I mean, certainly it's wealth creation and that benefits people. But you know, where I think it really starts to make an impact on people's lives is when you're talking about the energy that you consume at your home, and in healthcare, when you go to the doctor and the people you need are actually there, the doctors, the specialists, the beds are ready, the machines are ready, it's optimized, it's shorter visits, it's more efficient, et cetera.

Jeff, what are you hearing about the uptake on EMR, Elastic MapReduce?

It's coming around. It's hard to judge specifically how much of the AWS business is around EMR, but from what I'm hearing from customers, they like it a lot. It actually abstracts away a lot of the complexity of running your own Hadoop cluster, which, I mean, if you take a step back, is pretty much one of the main benefits of the cloud generally. So people like it for that reason. I think one challenge is data movement; getting all that data into Amazon can be a challenge.
I think the other thing is building applications on top of it to actually do more with EMR and Hadoop. But it's one of those areas where Amazon's focused on it. I spoke with some of the product managers of EMR; they're focused on being completely Apache compatible, they try to trail just behind the Apache releases, and you can also run other Hadoop distributions on AWS if you want. You don't get the-

So you can run Hortonworks on there?

You can run Hortonworks, you can run Cloudera, you can run MapR. I think the difference is that when you do that, as a customer you're still dealing with some of the complexity of running your own cluster; you've still got to manage that process. Whereas with EMR, if you use the full EMR capabilities, they have the ability to abstract away that kind of thing, so you're not, for instance, notified when a node goes down. Amazon takes care of that; you don't even have to know when that happens. Whereas if you're using other Hadoop distributions on top of AWS, that's still something you have to be aware of and take into consideration.

All right, Jeff, well, thanks for coming on theCUBE, helping us unpack Kinesis and a little bit on Elastic MapReduce, so hopefully we'll get you on again later in the day.

Absolutely, thanks.

We'll be right back with our next guest after this short break. This is exclusive coverage from SiliconANGLE and Wikibon. Go to crowdchat.net/reinvent if you want to join the conversation. We're documenting the conversation here in theCUBE. We'll be right back after this short break.