Okay, we're back live in Silicon Valley in Santa Clara, California. This is Strata Conference, O'Reilly Media. This is siliconangle.com's coverage of Strata Conference. This is theCUBE, our flagship program. We go out to the events, extract the signal from the noise, and talk to the smartest people we can find: entrepreneurs, developers, CEOs, startups. Whoever can bring that signal, we will try to bring that to you with great questions. I'm John Furrier, the founder of SiliconANGLE, and I'm joined by my co-host. I'm Dave Vellante of Wikibon.org, and we have an interesting guest right now from Intel. Boyd Davis is the VP of Marketing, and Intel made big news today, announcing a Hadoop distro based on a 2.x platform. They made the announcements in San Francisco. Boyd made the trip down here to be on theCUBE at Strata and join the show. So, Boyd, thanks very much for coming on. Hey, it's great to be here. One exit up from Intel, so we're close to corporate headquarters. Close, close to the mothership, yeah. Obviously, Intel is no stranger to innovation. Going back, and Dave and I can show the gray hairs, Intel made the computer industry by abstracting away complexity. They made their bones on Moore's Law and making things faster, simpler and lower cost. And now the big data world is exploding, where storage and compute are kind of blurring, but at the end of the day, software's driving everything. So you guys, although a chip company, have a lot of software. So, one, give us some background on Intel coming into the distribution of Hadoop. Obviously, it's a big data market. It ranges from analytics today to applications in the future, and certainly infrastructure will be affected. So tell us why and how you got here and then what you guys announced. You know, it's actually pretty close to our heritage. Intel has had a long history of participating in software projects. 
And historically, what we do is we engage with software providers and enable the software, contribute to open source directly, and really try to make sure that the software out there takes advantage of the innovations in the underlying platform. And that's exactly what we're doing with the Hadoop framework. We believe that there's a huge opportunity to advance the Hadoop framework in performance and security and reliability. And the only difference this time is instead of just enabling, we think it's important for us to bring a commercial offering to the marketplace because we can provide some stability to the Hadoop ecosystem. You know, you have two different kinds of players: some great venture startups that are innovating, and then some traditional big companies that are integrating Hadoop into their products. And we're offering that middle ground of establishing Hadoop as a horizontal layer and creating a foundation for innovation, and then making sure that customers know that there will be an open source Hadoop that's available for a long time, so they can really remove the uncertainty and get those big data projects rolling on top of Intel's distribution. So we were talking off camera, David Floyer and I. We talk all the time about mobile, cloud and social. And we always talk about market share and consolidation. So you hint at stability, you're kind of telegraphing a little bit that, you know, customers want a stable partner platform. Are we hinting at some consolidation? I mean, if Cloudera gets bought out by Oracle, that changes the game. Look at MySQL; things like that could happen. You just never know, but you guys are out there saying we're going to be a stable force. Well yeah, I'll certainly let you guys do the speculation on how this industry plays out. Our partners tell us... Or Cloudera gets bought by Intel, no. Our partners, we had partners like SAP and SAS today and Cisco and Savvis. 
And one of the reasons they tell us that they want to work with us is, one, the technology. You know, we really do know the underlying compute capabilities, memory and non-volatile memory in particular, and we're leaders in 10 gig Ethernet. That's one. But they also want to work with us because they know we're committed to this horizontal platform for innovation. We're really not seeking to go where everybody knows all the value is, which is up in the stack where the vertical applications are. That's where we want to enable our partners to succeed. We're really creating a platform. Much like the Xeon platform in the data center is a platform for our customers to bring great infrastructure to market, this is an ingredient platform that allows people to bring big data analytics to market as a solution. So that stability and that notion that we're going to be committed to open source, keeping this as a horizontal layer, and that we're going to have the best underlying technology, that's why people seem to be migrating toward our distribution. Boyd, tell us where you see the market, specifically around abstracting away the complexities, because Intel made their bones on that, by abstracting away some of the complexity and giving people just a simple look into a chip. And you guys have done that before. What do you guys look at as the key complexity you're going to abstract away to provide that reliable, hardened, or whatever you want to call it, platform? Well, our focus on driving the Hadoop framework forward is really in three areas. One is interactivity and responsiveness. Today, rightfully so, in most Hadoop implementations you assume that you run the job and forget it for a while until it gives you an answer. And we'd like to bring that response time down so that there's a much more rapid response. The other thing we're hearing from customers, particularly in the enterprise, is security. 
We're bringing encryption into the file system and cell-level protection into HBase. And we actually launched, just yesterday, Project Rhino to have a holistic open source view of hardening the Hadoop framework and all the projects for a consistent security, authentication and encryption approach. So that security is really significant. And then finally, reliability and enterprise quality in the stability of the platform. We've got long experience in the enterprise. So ease of use is important. And I didn't answer your question directly, because we are doing some work around ease of deployment with our manager, which is not part of the open source framework. The manager is a separate product. But within the open source framework, it's that security, performance and reliability that we're bringing to the table. And bringing Hadoop to the silicon level obviously is a key for performance and presumably security as well. Absolutely, absolutely. One of the specific things we've done is take advantage of new instructions that we built into Xeon, actually, just a couple of years ago, that accelerate encryption up to 20 times. Now the reality is when you're talking about a 20 times improvement in performance, you really move away from having it be a responsiveness thing to having it be a possible-or-not thing. Because if you had that kind of performance overhead from encryption without that acceleration, people just couldn't afford to use it. So we're absolutely looking at this from the ground up. We built solid state memory technology into the Hadoop framework. We're taking advantage of advances in 10 gig Ethernet and, out in time, the future direction of fabrics. So there's a lot of this, not so much building Hadoop into the silicon, but building the silicon into Hadoop. So now you mentioned cell-level security in HBase, and I can't help but think of the Accumulo project when I think of that. 
So essentially, with cell-level security, you're talking about a more granular level of security for an open source database like HBase, which is widely distributed. Are customers clamoring for that? Are they at that point where lack of security is an inhibitor right now? Or are you... Absolutely. I think it's one of the major needs, and we see it in kind of two areas. One is those obvious organizations that have sensitive data, whether it be financial services institutions or healthcare institutions, that really want to take advantage of big data. But the other big driver is multi-tenancy. The industry really wants to drive Hadoop as a service. And as soon as you get into a multi-tenant environment, the first question enterprise customers want to ask is, how secure is my data when I know that there are other tenants on it? So, you know, we're looking forward to working with the community around Project Rhino because security takes a comprehensive approach. We've done some work around cell-level protections in HBase, but Rhino really looks at security, encryption and authentication across the full range of Hadoop projects. We've submitted that to the community now and are looking forward to getting feedback from people that are working on Accumulo and other projects. Yeah, more of a holistic approach you're advocating there. Now, your point about multi-tenancy is interesting, and I noted you said Savvis was at the announcement. And obviously the services component is huge in this marketplace. We've sized it, and services is probably almost half of the opportunity. You mentioned the applications as the value there as well. So what do you see with Hadoop as a service? You've seen Amazon attacking the enterprise. You've got partners in your ecosystem. What do you see in that whole landscape? Well, I think it's a way to accelerate the use cases for big data. So you have a couple of different models. 
One is customers that aren't sure exactly what they want to go do. So it's much easier to do evaluations and prototyping in the cloud and to consume Hadoop as a service. Another one, and Savvis mentioned this this morning, is when you're already working with a service provider for your infrastructure, that's where your data is. So you really want to start to get value from that data. That service provider is the right place to deploy your big data infrastructure. And then the third class of customers are the ones who really know what they're doing and have their data accumulated in their own environments. And that's where we're seeing people deploy their own solutions. But in both of the first two scenarios, enabling a broad base of service providers is really, really critical. And we're excited to have Savvis announce their collaboration with us today. We're in conversations with a lot of other service providers, because all of the service providers are looking for that security, enterprise-class reliability and performance. So we have a good fit for the service provider role. I just want to clarify the role. So you run a software group inside of Intel. Absolutely. And so I'm still a little unclear on the objective here. Is this an accelerant to the marketplace or are you actually trying to drive a large business out of this initiative? Well, you know, it's interesting. I believe that our initiative in the Hadoop framework, and the work we're doing around the Lustre parallel file system and some of the other software products we have, will ultimately have a bigger impact on Intel in terms of driving accelerated growth for the data center in our Xeon business. But I also believe quite strongly that the only way for us to have that impact is to participate in the market with a commercial offering where customers have to vote on the technology with their pocketbooks. 
You know, we will continue to do enabling, and we have a broad base of ecosystem partners that we're enabling, but in certain cases, particularly where the technology is very close to the underlying platforms, it's important for us to drive advances with an actual product. So we are absolutely planning to build a software and services business around both packaged applications and open source, typically lower-level system software that's closer to our heritage, but we're going to build a business around it. And ultimately, by building a business around it, we think we can influence growth in the market by accelerating these key architectural transitions. If we can make big data more accessible, more useful, more safe and secure, people are going to buy more solutions more quickly. So we plan to benefit both ways. Yeah, and obviously security is a major thing, but also performance, which you mentioned with cell-level security. That's one of the dings on cell-level security, the performance. But I want to ask you about some topics that have come up with respect to the data warehouse market that we heard from some of the vendors. Moving data around the network and transforming data sets is very expensive in terms of performance costs. So throughput and latency are really big deals in the database world right now. So that's one question: I want to ask your thoughts on that. And then two, what use cases beyond normal, obvious data analytics are you guys seeing? Because Intel has a diverse set of industry solutions, from data center to Internet of Things, to mobile, to everything. You know, I think one of the key reasons why we want to drive performance into the Hadoop framework is because ideally organizations can deploy their data into a single environment and then get value out of it in a lot of different ways. 
You know, we have examples of telecommunications customers in China who are storing their call data records, billions and billions of records. And the first step is to make the data available to their customers online in a simple lookup. But then all of a sudden, once that data is there, they can start to do analysis and figure out what's the most popular smartphone or what's the most effective data plan. So it's that versatility of the data that we find to be very important. And it's one of the reasons why we want to make this horizontal layer kind of a multi-use layer for many organizations. So that's our approach. Now by the way, you know, we're still engaged with some of the large data management vendors, and I can guarantee you there are places for solutions like SAP's HANA and Oracle's Exadata and, you know, Vertica and Greenplum and Netezza and the long list. We're by no means at a level where the Hadoop framework is capable of taking on all those tasks. But we can certainly drive it forward to take care of more of them. Now, what was your second question? Relative to some of the use cases around data: for example, data centers that are flooded with big data, with Splunk on one end and kind of a niche there. And then you've got all kinds of other data exhaust, as Dave and I talk about, out there from the Internet of Things, probes, everything that's throwing off data. Well, you know, I think relative to the Hadoop ecosystem and those usage models, most of the audience here at a conference like Strata is very familiar with the telecommunications, healthcare, public service and public safety, marketing and fraud detection use cases. You know, honestly, the list goes on, but all of us in a community like this at Strata have a pretty good feel for those. 
Maybe one distinction for Intel is we are really focused also, in our big data strategy beyond Hadoop, on the Internet of Things and really the distributed intelligent data that exists at the edge of the network. We've created a framework called the Intelligent Systems Framework that really starts to focus on the connectivity, manageability and security of data coming from those sensor networks. And one of the things you'd like to do is make sure that you can go to the edge and filter out the noise, and you have to be really clear on what's noise and what is data. And then also make sure that the data coming from the edge of the network has integrity, so the analysis that comes out of it is worthwhile. So we think we're just at a starting stage. Today, those solutions are very bespoke, kind of customized solutions for every kind of data, every kind of sensor, every kind of edge device. We'd like to move to a mode with the Intelligent Systems Framework where there's some commonality and extensibility, so that, particularly as you mix data of different types, you have some commonality in the data structures, in the security profiles and so forth. So that focus on the Internet of Things and the Intelligent Systems Framework I think is a little bit unique to us, because we really like to drive consistency and scalability into those edge solutions. Yeah, and you know, we're hearing that too, because in this market at Strata we also cover incumbent industries like data warehouse and business intelligence, very low-hanging fruit. Obvious, you know, industries that have been around, where the sunk investments are, you know, Teradata and whatnot. They're all doing that, Greenplum. But beyond that, you mentioned the telecom example. I mean, this is really kind of how we see the new world. So there's the old way and the new way, always transitioning. The old way never dies. It kind of hangs around like mainframes. We've seen that, right? 
So the new way, I want to ask you a question about this new way. So the telecom company or other examples, they don't fit into a classic data warehousing model because you're now mixing data sets, as you mentioned. How do you see that world evolving? I mean, is it simply transparent in the future? Are you guys looking to make that seem, you know, transparent to the user, because now you don't have to transform data? I've got a structured data set over here. I don't want to have to retrofit a data warehouse to a use case, say the telecom records, and say, hey, I've got fraud detection issues. I could do lookups, okay, maybe that's a data warehouse app. But now I want to use that same data to do fraud detection or anything real time. Well, I think, you know, the origin of a lot of the newer big data tools, things like Hadoop and even Lustre, came out of either the high performance computing world, which is dealing with computational density and capacity at scale, or the big cloud services world, which is dealing with, you know, computational scale of a level that most organizations haven't faced yet. And they've come at these with new breakthrough tools, in many cases open source. And yet those tools are not, you know, going to fit all use cases without support and nurturing. So our point of view is, for one, make sure that we drive for standardization and integration wherever possible. You know, we're a company whose business is built on standards and interoperability. We wouldn't exist without it. And then also to take something like Hadoop as a foundational technology and extend it. And think about how you capture more and more use cases with the same framework, to have that framework be more and more versatile. And we're going to be on a multi-year path, a decade-long path at least, to get to the point where you can really start to standardize on a set of tools. But I think today we're a long way from that. 
And certainly, you know, companies really have to think about their data strategies to mix the old with the new. So I've got to ask you the Intel Moore's Law question, because obviously the history is the culture; you know, Moore's Law is inherent. So when you look at this new business opportunity that you're forging, obviously it's a little bit of, you know, a mashup between software and analytics and big data and storage and, you know, the hardware side, acceleration with silicon, et cetera, whether silicon goes to Hadoop or vice versa. So what are the internal Intel Moore's Law conference room conversations like when you have to try to peg a number to performance increases? Do you say, you know, we want analytics to double every six months? I mean, how do you guys talk about Moore's Law in the context of your market opportunity? Well, across the company, and you asked this of me as a representative of Intel, it depends entirely on the usage case and the user experience. You know, we've made enormous strides in our technology for smartphones in the last several years, where the focus is really on getting that rich user experience with a long battery life for a phone, and getting the very best connectivity and the very best developer ecosystem. So performance and user experience for a cell phone user is radically different than it is for a data center user. In the data center we have seen a shift. In the old days, we'd optimize around database transactions, TPC-C. You know, that's what you built your performance gains around. And then over time, we recognized that things like Java performance, floating point performance, you know, throughput performance for HPC, mattered. And now of course we're building around energy, Hadoop benchmarks, power consumption, you know, things like the SPECpower benchmark, which we have actually led. But wall clock time. 
I mean, you said today in your announcement that one terabyte of data could be analyzed within seven minutes versus, say, four hours previously. Well, wall clock matters, right? Yeah, performance is our heritage. I think that's one of the things that will remain true for Intel: performance is a heritage. But when you talk about how we use Moore's Law, which is doubling the capacity of the transistors every 18 to 24 months, and we see the next several generations just like we always have, the way we use that transistor budget is all about user experience. And the user experience varies depending on the nature of the user. And increasingly we've gone beyond just thinking about that budget for only performance, which is still a part of our heritage, and started to think about how you use it for graphics. How do you use it for better battery life for a given application? How do you use it to secure identity or privacy? So we're really focused as a company now on building our silicon technology, from the ground up, with the user experience in mind. And that's what's put us in a position... Where data is at the center of the value proposition? Where computing is at the center. Okay. And how many devices compute? Pretty much everything computes, and that becomes the landscape that we're targeting. And as we think about user experiences, it gives us a chance to maybe deliver more of the solution. Well, I want to come back to security because I think it's so vital here. One of your former colleagues, or I guess current colleagues, Pat Gelsinger, has been on theCUBE a number of times. And we've asked him before, is security a do-over? And he flat out said, remember, John, yeah, it is. And many in our community have said the way to solve this problem is the way Intel is solving the problem, really from the ground up, a bottom-up approach. And when you talk about the Internet of Things, I guess I'd just make an observation. 
I mean, I think of something like the Stuxnet virus, a virus going after a programmable controller. And essentially you've got a big task ahead of you to protect the fundamental infrastructure of nations. Yeah, and we take that burden of responsibility quite seriously. Security is one of the pillars for Intel. And you saw our desire to take this to another level with the acquisition of McAfee a few years ago. We do believe that the only way to stay one step ahead of the bad guys, and let's face it, we're never going to be secure, we're only going to be safer, the only way to stay one step ahead of the bad guys, in our mind, is to have an integrated solution of hardware and software working together so that you can really focus on the threats. And I think we've seen some good fruits of the McAfee and Intel collaboration on the device side. We're actually working much more closely with McAfee now on data center security topics. And we're looking at how we make sure we extend that security out to the edge, to those intelligent devices that may in fact represent the greatest threat surface of all. Certainly the most consequential when you take a look at where those devices are deployed. So I think customers can trust that we are going to stay at the forefront of integrating software into silicon. It's not impossible to hack silicon, but it's a lot harder. And that combination of hardware and software, we think, will ultimately create safer computing. Well, and it's a virtuous circle too, because security brings performance requirements, and that presents reliability challenges as well. So you're hitting on all the marks here. Absolutely. Well, we love the edge of the network. We're really covering it, we love it, and we think it's going to be dynamic and diverse. I mean, the edge is not like it was before. It's multiple edge points, different environments. Big data is going to be key to analyzing that in real time, whether it's for security prevention or user experience. 
Well, I think distributing intelligence out to the edge is an area that Intel is passionate about driving. We really don't want to just view all of those sensors and edge devices as dumb devices that kind of transmit data foolishly. You want that data secured, you want the connectivity standards to be consistent, and you want to have those devices be managed. And ideally you'd have the intelligence out there at the edge, so that the data that you do have to gather and put in your precious Hadoop cluster, or your other big data environment in a data center or in the cloud, is as valuable as possible. Well, Boyd, my final question is more of an ending question, a setup for our next interview when you come on theCUBE, because we'd love to have Intel on theCUBE. You guys have a great track record of innovation and really make the market in terms of growing and helping everyone else grow in the industry. Pauline is in your group, right? Yeah, if you knew that. Yeah, we love Pauline, she's great content on theCUBE. My question is, what are your to-do items? Now you've got the launch out of the way, under your belt. What's your to-do list? What are you going to do in the next six to 12 months? What's the key task that you want to knock down? What do you want to nail down to the wall? What are your objectives for the next couple of months, next six months to a year? Well, you know, I'll take you as asking me that personally, because my scope is data center software at Intel. We're talking a lot about Hadoop, and with the Hadoop framework we have an enormous number of challenges: to drive our next releases, support the partner innovation engagements that we talked about today, drive that business forward, and get to a point where we're really helping the community go forward. But in addition to that, we're looking to make the Lustre parallel file system a lot more accessible to a lot more HPC users. 
That's a technology that Intel acquired a year ago, and we have an excellent opportunity there. You know, we have a product called the Expressway Service Gateway that allows people to expose APIs from their legacy environments. And that concept of exposing APIs from organizations to developers is a hot topic all on its own. So I don't know if you guys have a conversation coming up around API management. We're participating there, and we're participating in data center optimizations around better power efficiency, you know, more policy-based power management, better workload placement for the underlying performance, and more efficient use of non-volatile memory. So our agenda from a software perspective spans all of those. And in the next year, you know, we hope to deliver solutions to the marketplace that can really help customers modernize their infrastructure and get value out of their data. Awesome. Well, we do love developers. We love data as a development asset for people, and also cloud and convergence, with flash and all these new modern architectures. The game's changing, so we'd love to keep in touch. This is theCUBE, with Boyd Davis, Vice President at Intel. Going into the Hadoop market, all in. Intel is always all in, and they're going to expand out to do what Intel does best, abstracting the complexity away from the users and developers for Internet of Things, data center and Hadoop. This is theCUBE. We'll be right back with our next guest after this break.