theCUBE at Hadoop Summit 2014 is brought to you by anchor sponsor Hortonworks. We do Hadoop. And headline sponsor WANdisco. We make Hadoop invincible.
Okay, welcome back everyone, live here in Silicon Valley. This is theCUBE, our flagship program where we go out to the events and extract the signal from the noise. I'm John Furrier, founder of SiliconANGLE, joined by my co-host here, Jeff Kelly, big data analyst at Wikibon, industry-leading analysts, and our next guest is end user Demetra with Centrica. Welcome to theCUBE.
Thank you.
So we love to have people who actually deploy the technology. You're one of those. So tell us, what do you think about the ecosystem here? What's your take on Hadoop Summit, the vibe, and what things are you seeing that attract you from a technology perspective?
I think it's growing. That's the first thing I would like to say. I've not been here in San Jose before, or at the Hadoop Summit in the US before, but I certainly was at Amsterdam. I was actually speaking as one of the panelists there. It's just getting bigger and bigger in a span of three months. So the vibe is really positive. We're using Hadoop and we have great expectations of it. And obviously it's great to see that the team here at Hortonworks is really putting a lot into the core itself, which directly helps us as end users, as customers.
Yeah, well, tell us a little bit about Centrica for those members of our audience who aren't familiar with you.
Sure, Centrica is actually a multi-billion dollar company. It's a global company. I actually work for British Gas, which is one of the largest business units of Centrica. In the UK, we are in the energy utilities market as well as the home services market, and a number of others, so we cover a broad range of things, including the insurance market. So quite a vibrant and quite a dynamic company.
So tell us a little bit about how you're using Hadoop at British Gas?
Yeah, so we started the journey, I think, just over 12 months ago. We started by trying to solve our data problems within the organization, with the advent of smart metering, as well as the fact that we wanted to reduce a lot of IT costs. So that's when we started the journey, looking at the strategy around how we completely transform our data landscape with Hadoop. But we've come a long way. We've moved on from just the strategy and the architecture definition to a proof of concept, and now we're getting close to production.
So you mentioned smart metering, and of course that has huge implications for a company like yours, I would imagine.
It does, yeah.
If it's similar to here in the US, where with traditional metering somebody goes out maybe once a month and reads the meter, and now with a smart meter it's every 30 minutes, or whatever the interval might be, that it's sending data home. Is it mainly a data volume issue that you're looking at, or...?
It's not so much just the smart metering concept here at British Gas. It's the connection that it has with the rest of the estate. That's what drives the complexity. As in, everybody is now trying to understand what benefits we can get out of this data if we try and join it with the rest of the estate. What insights can we drive? Using those insights has a direct implication for the way we run our business, for the way we...
Can you care of it momentarily? Sorry.
So yeah, it's mainly about the connections, and it's mainly about the complexity, and it's also about our existing systems. How can we actually run them better, run them efficiently? It's a host of things, really.
So it's about using that data to run your internal operations, to understand how the...
Internal operations, yeah.
How we can better manage our resources, how we can efficiently manage our processes, as well as how we can think about the future, the new applications that we can generate with this data.
Well, it's interesting, because we've recently conducted a survey. One of the big use cases was just that, optimizing internal operations, which is a huge area where companies can find a lot of efficiency, unlock a lot of savings, and potentially route those savings to new analytic projects to find new insights. So walk us through a little bit: you mentioned modernizing your infrastructure. Take us from where you were, to where you are, to where you'd like to be, if you could walk us through the journey a little bit.
Yeah, I mean, where we are, really, is pretty mature in terms of what we're trying to achieve from a Hadoop perspective, from modernizing the data fabric, or the data operating system that we keep talking about. That's where we are. Where we'd like to be is that data-driven organization where we can proactively identify opportunities to make the best of our investments. That's where we would like to be, and I think we're in pretty good shape right now. Where we were, I wouldn't even like to think about at this point in time, but it was pretty much the same kind of problems that large enterprises face with respect to legacy applications and complexity of the architecture. That's where we started from, and we are on a journey to clean that up, to modernize our foundations before we venture out into running new business applications on it.
And what's the culture like at British Gas relative to making data-driven decisions? Has data always been, whether it's small data or big data, part of the culture of British Gas? Or is that something you've got to work on instilling in employees?
I think we've got to do more work on that.
But data has always been an integral part of the way we make our decisions. It's obviously improving every single day with the advent of big data, and not just big data, but with the new opportunities that energy markets are facing, not only in terms of cost reduction, but also in terms of revenue generation, looking for new opportunities, making life simpler and better for our customers. So it covers a wide range of things. And all of that, if you look at it, can only be made possible with data and the right utilization of data. So I'd say what you do with your data is never enough. It just has to be a continuous, iterative process. And that's where I would like to see British Gas.
Has security been a big force in your organization? Obviously, nobody says they don't care about security, but when you look at Hadoop and what's going on there, security is the big power move, first by Hortonworks, and now Cloudera following suit. So, you know, that's a huge issue. And it's a tell-tale sign that, guys, that's table stakes. Is that true?
Yeah, I think it's a great move by both companies, and especially for enterprises like us, security is the core, it's the foundation really. So we're quite happy that those acquisitions have taken place, and we're looking forward to how that integrates with the product, with the core, and then how we can benefit from the opportunities.
So I was talking earlier, we had the CTO of WANdisco on. I said, lay out the differences between Cloudera and Hortonworks. If a friend asked you, what's the difference between Cloudera and Hortonworks, how would you answer that question?
I wouldn't comment on that, because I haven't really used Cloudera. So I don't know how to comment on that, but if you ask me without the...
Well, Cloudera, a lot of sales reps, so that's, you know, obvious, everyone talks about that there.
Penn and Dave, a lot of guys on the street, this field organization. Hortonworks is more open source focused.
Yes, yeah, I mean, that's almost right in your face, that differentiation between the two companies. But if you ask me as just an observer of the community, I'd agree with that statement really. It's the open source, and it's the number of times you can actually innovate, and the number of people you can engage, which is really the differentiating factor.
So, as John has alluded to, Hortonworks is very much focused on a lot of their reseller arrangements with SAP, Teradata, et cetera. And I imagine at British Gas, as you mentioned, you've got a lot of legacy applications. I imagine the environment there is a heterogeneous one.
Very heterogeneous, indeed.
Does Hortonworks' approach make it easier to integrate with some of your other technologies? What are some of the other technologies you're using? I'm just curious to get an understanding of the relationship between Hadoop and some of the other data management tools and technologies you use.
It's very important to us, the integration factor. As you rightly pointed out, our environment is really heterogeneous. I've got almost all the names that you mentioned, plus some more. And when we started off on this journey, it was very important for us to seek a product, an operating system, which is what we're talking about here, that can connect and integrate as an open service with all of these ecosystem partners. And I'd like to say we're pretty pleased with the way it has worked out so far. There's obviously a long way to go with some of our vendors in terms of how they can push queries down natively into the platform so that we can better utilize the potential of the platform. But I'm glad to say that, the way it has panned out, it's a very collaborative community.
And they stick to their word in the way they work with each other, whether it's Microsoft, Teradata, or SAP.
And, you know, another finding in our survey recently was that a vast majority, I think over 70% of the end users, employed some type of outside consultants. Was that something you did? Or do you guys have the internal expertise to architect the system? Or did you bring in some outside professional services to help you architect and deploy some of these big data technologies?
I think it was a mix of all of that. We were very certain from the very beginning that we would like to develop in-house capabilities, because we see the potential, not only in the market, but in the product. And it is important for us as a company to gain the skills that we need in order to exploit the platform in the best possible way. So the approach that we took was that the architecture, the strategy, was pretty much our baby. The vision was ours. And then in terms of implementing it, we brought in expertise from companies like Hortonworks and some other infrastructure companies, and we integrated them into the in-house team. We also gave our team members an opportunity to learn. Let's say, we gave them Raspberry Pis to start with, just to play with. And then laptops. And then it was a gradual upgrade for them as well. But they enjoyed learning with it as much as we did.
So a question on our CrowdChat from the audience here is basically asking, how do you guys deploy Hadoop? And the question that followed up from the audience is, how do you guys hook Hadoop into other big data tools?
So it's pretty much the integration point that I covered a minute ago. In terms of deploying Hadoop, we do have a disaster recovery approach as well as a high availability approach to Hadoop. So we have deployed that across data centers.
And the way we integrate with most of our existing ecosystem partners is pretty much by using the connectors that they provide us. And where we can't get to that, we allow them to natively try and push the data out of Hadoop into those ecosystem partners.
Okay, well, final question for you is, what is the future for Hadoop in your mind? How do you see it evolving over the next five years?
Whenever somebody asks me that question, the first thing that comes to me is that there's no straight answer to that. The opportunities are huge. I think it's what we do with it. That's where the future lies.
I think Jeff Kelly said 51 billion, and there are people who pooh-poohed that number. Was it 51?
Yeah, just over 50 billion dollars. But that's from a revenue perspective. I mean, as big as that number is, we think it's going to be practitioners like British Gas that drive the most value. And with the value that you drive, and with companies like yours really using Hadoop, I mean, the size of that market is going to be much bigger.
Internet of Things is a big part of it. There's energy: you think about energy, you think about exploration, you think about all kinds of things involving energy, transport of energy, all this. That's a big data challenge from the Internet of Things standpoint. What do you think about that market? Is it just early adopters now? Is there a lot of thinking? Is there delivery around it? Where would you peg the evolution of that industry? Early days?
I think it's early days, yes. It's fair to say it's early days. Things are going to pan out across a very, very wide distribution of use cases like the ones you just mentioned. But yeah, I would categorize that as early days. I think we're strong thinkers as well, and we're learning from other industries as fast as we can.
And I think it's a great time, with this technology kicking off and us being in the early stages as early adopters. We've made the right choice in picking up the latest state-of-the-art technology to solve our big data problems.
Well, thanks for coming on theCUBE, D. We really appreciate hearing from a practitioner, someone out in the trenches, buying and deploying and using the technologies. Obviously we're bullish on big data, and we think this is just the tip of the iceberg. Cost of ownership, value chain integration, new devices. It's just amazing, it's a fun time. This is theCUBE, and of course we're documenting it, breaking it down. We'll be right back with our next guest after this short break.