Live from New York City, it's theCUBE at Big Data NYC 2014. Brought to you by headline sponsor WANdisco, with support from EMC, MarkLogic and Teradata. With hosts Dave Vellante and Jeff Kelly. Welcome back to New York City, everybody. Jeff Kelly and I are here all day today, wall-to-wall coverage, as well as tomorrow. Also, at four o'clock today at the Hilton Times Square, we have our capital markets event. Jeff Kelly is going to be introducing some new data to the marketplace. And then of course we have a great panel: Abhi Mehta, Amy O'Connor, Peter Goldmacher, former Cowen analyst, and Jeff Kelly. I'll be moderating. We'll be broadcasting that live on SiliconANGLE.tv, so definitely check that out. And then of course we're celebrating five years of theCUBE at Hadoop World. Really pleased and honored that we've been able to cover Hadoop for that long. And theCUBE has just been an amazing way for us to extract signal from the big data noise. Gustavo "Tavo" de Leon is here. He is with Cognizant and works in their big data practice. Tavo, welcome to theCUBE. It's great to have you. Well, thank you for having me here. We're big followers of theCUBE. And it's really exciting to be here in front of you guys. Yeah, we were saying, I think you're the first Cognizant guest, certainly that I've ever interviewed. We've interviewed a lot of guests on theCUBE, but I think you're the first from Cognizant. So I mean, I'm familiar with Cognizant, but for those in the audience that aren't, I mean, obviously, world-class services organization, maybe talk a little bit about the company. Sure. We're a Fortune 500 IT services and consulting company with customers around the world. And our big data practice is primarily focused out of our enterprise information management group. And it's been a really exciting ride these last three years or so that we've really focused on that and made it what we call one of our H3 priorities. It's our future.
And we see it as our future for our customers as well, where big data technologies are going to take the enterprise. Well, as you may know, we were the first firm, and Jeff Kelly really led this research, to size the big data market, and we pegged the market at 50 billion by 2017. We're kind of sticking to that forecast. You know, you could say it's a large area. Guys like Chris Lynch say, oh, that's way undercounted. That's fine. But the most interesting part of that is we tried to peel back, okay, who is really driving this business? Where is the revenue? Where's the value, both from the supply side and the demand side? And the one thing that we realized is that a huge proportion of the activity is around services. And the reason is this stuff is hard for customers. It's complicated. They don't have the skill sets. It's buzzwords and terms, you know, Sqoop and Hive and jive and all this stuff that they've never heard of before. And now it's YARN and Spark and it's just a sea of acronyms. So services plays a critical role. I often say it's where the rubber meets the road in terms of getting value. So I presume you agree with that. You've seen that. But how are you helping customers squint through and navigate that complexity? Right. Well, we're at an era right now where the enterprise is ready for big data. I mean, you know, in 2013 we had a lot of science experiments going on, and you can hear that from all the vendors that you've talked with. But now we're seeing that the enterprise is much more ready to accept it, and we're seeing two paths that they're typically following. The first one is big data technologies as an infrastructure replacement, supplement or some type of cost savings mechanism. And this is a really critical item for many IT shops because they are working on, you know, razor-edge budgets where they really want to make an impact.
And things like Hadoop, and being able to use that as an infrastructure component, have been critical for many of our clients and many of our verticals in how they're able to accelerate it. And that's where they start to get that learning and understanding, as we work with it and as we propagate that knowledge with them about how these components can work together. Even something as simple as taking, you know, data archiving and putting it on Hadoop. You know, you don't just land the data over there. There's a lot more that you need to do with it, and you don't need all the components from the vast Hadoop ecosystem to do it. So that's really the first path we're seeing: IT coming in on the infrastructure side for cost savings. So I wonder if we could talk a little bit more about that. Sure. So you mentioned an example, archiving, long-term retention. Yeah. I'd love to sort of unpack that a little bit. So where are they applying this? Is it storage costs, compute costs, software costs, all of the above? And what are they replacing, changing? Where is the cost savings coming from? Sure, sure. So there's a couple things going on. In many industries, there are more regulatory controls that are being put in place and need to be implemented quickly, especially in banking, financial services, health care, et cetera. And other industries are catching on to that whole idea as well, and what they're finding out when they go back and do the surveys and do a risk study is that their systems simply aren't being archived. So there's that whole component that, you know, there's a need for archiving. Then they're looking at where they've had things on tape traditionally or on slow disk or on SANs, et cetera, and how they can get that onto a high-performance platform that's very cost-effective and make it so that they can have an accessible archive that can later evolve into possibly something else.
So it's bringing that data together, as we're really seeing them do. What they're looking at is, you know, their current computing power is sometimes very focused, especially in the EDW space, and they want that focused power to be working on the analytics side. But oftentimes there are ETL jobs going on inside the EDW, and moving those off along with the data archiving is also part of the cost-saving process as well. There are huge efficiencies that come with this. Interesting. So that was one area. What's the other area? I'm guessing it's going to be not just saving money but making money. Sure, sure. You know, big data technology for many of our customers is the competitive advantage. They see that there's a business case, there's a use case, there's something that they need to drive out to have a competitive advantage. And it's either because they're forward-looking in that way or they're now reacting to their industry or to other vertical industries and what they're able to do, and they're seeing the nimbleness of it. But this, in our world, is very different in that we have found that if it's not business use case driven on what we call the big data analytics side, the return on investment is going to be hard to find or hard to prove at the end. And so getting that business case right is a lot of what we do up front with our clients: refining that, and then quickly getting into an agile mode to start working on the discovery of the data and working towards the analytic solution, and then seeing what the real value equation is and proving that out versus either the early assumptions for it or the hopes and aspirations for it as well.
So I'm guessing that the companies that are most able to do that are the ones where the lead is from the business side, where they come to you with a business problem, a business challenge, rather than an IT-led directive, which certainly makes sense, I guess, on the first part. You want to save costs on your infrastructure. When you've got the amount of data going skyward and the budget's staying flat, something's got to give. But on the kind of making money side, on the big data analytics side, who are you seeing driving the projects? So that's a really interesting question, and sometimes it's different by vertical industry. Banking and financial services has always been ahead in that type of thinking, where the business is coming to IT saying can we do this, can we do that, or we have to do this or we have to do that based on the change in the opportunity and the risk that's coming up. But we're also seeing this coming out of the chief marketing officers and the new chief data officer positions that are coming out. This is starting to drive this use case mentality. We're starting to see more of it coming out of a combined IT and business approach, where the IT group is very business astute and they really understand the data and they're starting to look at it and say, you know, we've had questions around for years that we haven't been able to answer that we can now answer, why don't we look and ask them more? And in those environments where we see the largest maturity curve, where people are really ramped up, big data is being operated basically as a shared service within the corporation across different lines of business. And that can mean everything from just basic hardware and software and infrastructure type things to full-blown use case development, business case development, getting those approved and then propagating that out, and the project management behind it and all of that.
Well, that requires, of course, as you mentioned, the business and IT being aligned. What we're finding in our research is that one of the biggest reasons that organizations aren't getting ROI is precisely because they are not aligned. IT has one idea about what constitutes success and that's usually, okay, I've got my Hadoop cluster stood up, you know, the lights on the disk drives are blinking green, everything's great, and the business side says, well, no, we need to know, you know, who's our most profitable customer, what's next to increase their profitability, and other, you know, business cases. They want the insights from the data. So they're two different, potentially two different ways to measure or judge success. How do you go about actually making that alignment happen? Again, it's easier to do when it's business use case driven because you can take that throughout the corporation and in a sense explain it, shop it, move it forward and even give it the energy to grow and come to fruition. When we have it on the IT infrastructure side, again, it's still use case driven, but it's driven in the sense of we're headed towards the cost savings or, let's say, an important improvement in SLAs that might come from using faster, better, smarter technology in that way. But really, for us, in the groups that are leading this, the industry leaders in different verticals, IT and the business are coming together. And the way I like to talk about it is that, you know, we've had this old dictum in big data for a long time about schema on write versus schema on read. And that's been about a very physical thing that happens, you know, in an application. But think about it up at a thinking level, at an innovation level. So with schema on write, it's very much defined, and everything around it is defined, from security to data governance to realizing the value. On schema on read, what do we do now? Because now it's the queries that are going into the data.
How do you govern that? Where's the security that comes with that? How does the business case get developed in that way? And how are traditional enterprise organizations dealing with this, okay, I'm going to do the write, and don't worry about the schema. Just put it in the data lake and we'll figure it out later. How are they dealing with that? So it tends to be different groups that have that attitude within the corporation. So the more analytical groups, or groups that have the broader analytical view of the corporation, are saying, yeah, just get it out there and I'll figure it out. And oftentimes they can when you come down to it. It's not quite as dangerous as it sounds. And oftentimes this is not some fountain you're opening up for the public to drink. It's more of a very particular group that's going to get high value out of it. And they can look at the data in raw form oftentimes and tell you what it is. They don't need a lot of metadata oftentimes to move forward with their analytics because they know what their needs are. They can simply interrogate the data. There are a number of tools that help them do that. And they move forward and they can say, all right, this is important to my modeling. This is important to my machine learning. This is important to my advanced analytics. And they can go through the data and pull that out. And they can also make some of the judgments about data quality and also data suitability, whether it's suitable for their model or not. So that idea of writing it out there, and it's just kind of schemaless and you can go after it, it's there, it's possible, and people are utilizing it well. But it's a very particular group with a very particular attitude that's really seeing the benefit of that. So a lot of times organizations, the IT side anyway, will define themselves: I'm an Oracle shop or I'm an IBM shop or I'm a Microsoft shop. The database sort of defines them. And the database was sort of the stable world for a while.
And all of a sudden, I don't know how many vendors there are out there, database vendors, there's got to be dozens, right? And it's exploded onto the scene. How does that fit with sort of the existing model of IT? And what are you seeing there? So the fit is, it's been cumbersome. It's been awkward with our clients in some cases, especially the ones that aren't on the leading cutting edge. But we try to help them through it and they're very anxious to get through it as well. Awkward in what sense? In the sense that there's an evangelist inside the company who's a champion for sort of the new way of doing things, or awkward in that they don't know what to do with it and don't want to say so? No, it's just awkward in that they know they're moving from one paradigm to another. It's well accepted, it's going to happen. It's just, what are all the steps to getting there, because it is such a different one. It's not like adopting a new programming language. We all know how to program in one way, shape, or form. We're providing very different paradigms here. And it's those paradigms that build a sense of awkwardness, or a curiosity where people say, oh, how can I now innovate off of this? And those people really are the leaders. So you've got the internal transformation and of course, let's take it a step further. How is that impacting the industry heavyweights out there that have dominated the data management landscape for so long from a technology perspective? You're seeing companies like Teradata try to adapt; they're making acquisitions and they're partnering with companies like Hortonworks. But you have a good perspective, I think, because you're in the EIM business, you've seen kind of the more traditional data warehouse world, MDM world, and now we're evolving into this next paradigm, as you say. How do you think the vendors that supply those technologies, those heavyweights, Teradata, Oracle, some say SAP, IBM, how are they adapting? Can they adapt?
Right, right. So the first thing is, most of those people you mentioned are strategic partners. I imagine they are. And so, you know, I want to be... Technology agnostic. Technology agnostic, as we say in systems integration as well. But among our partners and the people that are in their areas and fields, there are some early adopters that are saying we need to adapt to this type of paradigm, and you can see it across different technology suites that come out. First it starts with just a connector to Hadoop, and then there's an integration with the processing moving over to Hadoop and being able to do it with MapReduce and off the HDFS file system. So we see the tool vendors, you know, the people that we work with, adding those capabilities into their already very important and promising stacks. And it's not that one's going to replace the other. They're side by side now, and people are realizing that that can be a vision that's productive and cost effective, and it brings, you know, innovation and real value to the corporation. All right, Tavo, we've got to leave it there. Thanks very much for coming to theCUBE. Great discussion. I really appreciate your time. I'm thrilled to be here and thank you for having me. You know, we're all, like I said, big fans of theCUBE and Wikibon and we appreciate all your efforts and the information you bring us. Thank you. All right, keep it right there. We'll be back with our next guest right after this. This is theCUBE. We're live in New York City. Big Data NYC. We'll be right back.