Live from Midtown Manhattan, theCUBE's live coverage of Big Data NYC, a SiliconANGLE Wikibon production made possible by Hortonworks, we do Hadoop, and WANdisco, Hadoop made invincible. And now your co-hosts, John Furrier and Dave Vellante. Okay, we're back here live in New York City. We're here for Big Data NYC. I'm John Furrier, the founder of SiliconANGLE. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. We're here putting on Big Data NYC, covering Hadoop World and Strata Conference, all the action happening in New York City. A lot of announcements here at Big Data NYC. I'm joined by my co-host Dave Vellante of wikibon.org, and we're having a live crowd chat at crowdchat.net slash strataconf. Go there for a spam-free chat, putting all the conversations online as well as the videos. We're joined here by Ankur Gupta with Metascale. Welcome to theCUBE. Thank you so much. So, tell us what you guys do, because it's an interesting story. Just give a quick overview of Metascale, how it came about, and its relationship to Sears and the whole deal. Sure. Metascale is a big data company, a wholly owned subsidiary of Sears Holdings Corporation, the large retailer that you know about. That's the holding company that owns Sears, Kmart, Lands' End, Kenmore, Craftsman, and those iconic brands in the U.S. We were born out of Sears after Sears almost perfected the art of using big data technologies in a large enterprise. We started on this big data journey about four years ago when, as the data size on our side was growing, we needed technologies to better process and analyze that data. And as a legacy company, we are as large an IT shop as you can imagine, with every technology that you can think about. So our existing EDW could process the data, but it wasn't cost-effective for us. And that's when we started looking at Hadoop as an alternative.
Four years ago, there were not many enterprises using Hadoop other than the large e-commerce companies, the Facebooks and Googles and Yahoos of the world. So it wasn't easy for us. We had to really take a step back and learn the technology. We had to build resources, we had to build capabilities and expertise. It took us some time, actually. We had to answer a lot of questions, and the solutions were not available to us immediately. So we took some time, but then we built those resources, built those capabilities. Fast forward four years down the line, we now have a number of use cases and we are a heavy user of Hadoop and open-source big data technologies. So now, as other enterprises are thinking about Hadoop and adopting the technology, they're looking for a partner who can provide them that neutral, platform-agnostic advice. And they started coming to see us and asking those questions: how did you guys go about it? How did you find your use cases? How did you decide on your Hadoop roadmap? So we saw a commercial opportunity and we formed Metascale. So Metascale is a wholly owned subsidiary of Sears Holdings, again. We provide an end-to-end solution for getting companies started on their big data journey and accelerating that journey, so they don't have to go through the pain points that we did four years ago. So you came over from the Sears side, is that right? So when Eddie Lampert picked up Sears, it was a depressed asset. Eddie Lampert is obviously one of the best, arguably the best, investors on the planet, and he picked up a depressed asset that had really been beaten down by the likes of Walmart, who was driving data back in, whenever it was, the 80s and 90s, you know, the old beer and diapers story, the original EDW, which we now look back at as a dinosaur. But nonetheless, is it true that Sears really decided that it would begin to compete and resurrect itself largely with a data-driven agenda? Is that what happened?
I mean, in addition to all the other blocking and tackling that you had to do. So how critical was data viewed as part of the Sears turnaround? So if you look at our strategy and how we are moving forward in today's market, we are becoming more and more a digital company. But most importantly, we want to become a member-focused company. So there's our Shop Your Way Rewards program, that's our loyalty program. We want to make sure that customers who are coming to us are rewarded for shopping with Sears. We want to have personal engagement with them. We want to understand how they shop with us, what makes sense for them. So we want to have that relationship with our customer. And in that digital engagement with our customer, data is, I think, the underlying key factor. So on the business side, it makes a lot of sense. But you have to have strong underlying IT capability to be able to manage the data and understand your customer. And I think that's the genesis for us, the underlying technology that lets us connect and become that integrated retailer. So no matter how our customers shop, whether it's online, in-store, mobile, phone, whatever method they choose, we want to be able to connect with them. We want to be able to understand how and what they're shopping, and what they would shop for based on their past shopping behavior. So having that digital engagement, becoming that integrated retailer, requires a lot of innovation internally. And that's where I think data is a key aspect of this overall framework, the overall technology roadmap and journey on our side. The business model is innovative. So essentially, if I understand it, you had this capability and said, okay, let's spin it out and help others, not just help ourselves. It's a wholly owned subsidiary. It's still 100% owned by Sears. Yeah, so not exactly spun out, but okay, yeah. So you created a wholly owned subsidiary that's quasi-autonomous.
I mean, presumably you're autonomous. I mean, you're probably not micromanaged by a bunch of Sears guys, right? You've got your own agenda and P&L and so forth, right? So talk about how you're helping customers. What are they asking you? What are they coming to you for? And how are you applying the techniques that you learned at Sears to the broader market? The interesting part is how Hadoop is becoming more and more mainstream. If you see the number of vendors that are out there, it's just growing. The market is getting full of companies that can provide you that technology, that solution. And I think how we differentiate ourselves compared to all the others is that we bring in that enterprise experience, first of all. So taking a step back, Hadoop is a new technology. It has not been there for 20 or 25 years. So there's a huge education piece involved in implementing Hadoop. And it's a fairly complex technology. It's nothing like what companies have been used to over the last 20, 25 years of using relational databases. It essentially challenged that thinking by being able to manage and process large amounts of data quickly. So when other enterprises, especially large enterprises, are thinking about their Hadoop journey, they want a partner who can give them a view of the overall landscape, of what may or may not work for them. And that's where we come in. We bring in this enterprise implementation experience of Hadoop. And no matter what your existing EDW, your enterprise data warehouse, may be, you could be a heavy Teradata shop, an IBM shop, or for that matter any other technology that you may have implemented on your side, we can bring in our experience of how Hadoop will fit into your existing landscape. And we can help you identify use cases. And most importantly, we are a vendor-neutral, platform-agnostic company. So we provide a neutral view.
Working with us, you get that enterprise experience, but you do not get locked in to a vendor solution. And you are essentially talking to experts who have done it on their side. So as much as we are a vendor, we are also a heavy user of our own technology and solutions. So is that what you're finding, that the implementations you're helping customers with are largely adapting Hadoop to their existing infrastructure? Or are you finding increasingly that people want to start with sort of a green field and create maybe a sandbox, and then evolve it into a capability and a platform that maybe isn't constrained by the existing EDW? Can you sort of compare and contrast those different approaches? Sure. There are companies that are in different stages of using Hadoop. There are a lot of large enterprises we talk to that have heard about Hadoop but do not know much about it. In fact, there are companies that, when they do an RFP, come to us and say, can you guys help us build the RFP questions? It's a great position to be in, and it's very exciting when you think about it. So there are various enterprises at different levels or different stages of using Hadoop. Take those that are in the initial stages, that are thinking about how big data will help them. And Wikibon, you guys publish a lot on this. The three reasons you guys published for why companies fail in implementing Hadoop are that they do not have proper talent, they do not implement it correctly, but most importantly, they do not have use cases. And that's where... They're guessing. They're guessing. They're hoping, let's do something quickly because there's so much hype about it, and hopefully it'll work out. And what we suggest is, take your time. We suggest finding a couple of use cases that may make sense for you immediately, and we help companies build those use cases. We have built numerous ones on our side and we have seen those working.
So depending on what stage a company is in, we help them accordingly. For companies that are in the initial stage, we help them build use cases and we help them do a proof of concept. So we build a small Hadoop cluster for them and help them see how Hadoop will really help them, right? So for those companies, it's essentially doing due diligence based on what they may have in their existing ecosystem and building that roadmap of how Hadoop or big data technologies will be able to help them. So it includes managing big data talent, it includes setting up the infrastructure, it includes data governance as well. So there are multiple aspects to it, essentially. And then there are companies who may be a big Hadoop shop today because they jumped on the bandwagon. Their IT department said, oh sure, we can do it, it's open source, we can put up a cluster of a couple of nodes and do it for you. And now, a couple of months down the line, as we talk to them, they say, can you guys come in and see if we are using Hadoop efficiently? Because we are not seeing a lot of benefits out of it. But where I'm seeing a lot of momentum is in companies that really are in the early, early phases and are looking for us to give them that neutral view, the overall landscape of how Hadoop will help them short-term and long-term. So okay, I get the neutral piece. You guys are independent and sort of technology agnostic, even though you know a lot about technology. I want to get back to the use case. So it sounds like with a lot of engagements you start by helping the customer understand where they should actually apply this, where they're going to get the most bang for the buck, where the risks are maybe low or appropriate. So there's an upfront consulting piece, right? You guys do that.
When I think about that, I think of the big global integrators like an Accenture, or a Deloitte, or a PwC, or an IBM. They have deep, deep domain expertise in virtually every country and every region around the world. Now, in retail, you guys must have a lot of domain expertise. But how do you compete outside of retail? Are you predominantly in retail? That's got to be kind of weird, because you're competitors of a lot of the customers you're helping. Or do you consult more in a sort of horizontal, cross-industry approach? I wonder if you could talk about that a little bit. Sure. Our solutions are big data solutions. They are industry agnostic, they are location agnostic, essentially. We can apply our solutions, or the work we have done, to any industry for that matter. And actually, that is where we are today. We have clients in pharmaceuticals, we have clients in healthcare, we have clients in the financial domain. But are the use cases industry specific? The use cases may be industry specific, but underneath the use cases, the technology implementation is similar. I mean, you have to customize it to the customer, and especially if you're talking about a healthcare customer, there are various legal guidelines. You want to make sure those are met and whatnot, but the underlying technology is not that different, I think. I mean, we have done it for several different domains and we have not had difficulty that way. So the use case is a big gap, because people, as we talked about, are experimenting and hoping, and hope is for children, I always say. How do you de-risk that piece of it? Do you maybe work with other consultants, or are you just good at probing the customers? We take our time. We always say to our customers, do not just do it because you've heard so much about it, because you think it may work out. We say take your time, and we actually help them build use cases.
So we have a pretty detailed methodology that we take our customers through. We ask them questions, send them questionnaires and whatnot, to find out which use cases they can start with immediately. And then as we learn more about their ecosystem and how governance takes place within the organization, we help build more and more on top of that. So obviously, it's not that on the very first day we'll come in with a list of 20 use cases that'll work for you, but we'll find some low-hanging fruit immediately. So you may have some batch jobs that are taking 20, 30 hours to run. Your storage cost may be going high on your side. Those are use cases that are applicable across industries. Companies may start with those, and then they can become more and more industry specific. So healthcare companies will think, now that I know Hadoop can keep all my data and I've set up different layers of data access and whatnot, let me put some analysis on top of it. And that analysis could be specific to their industry. So healthcare guys may be looking at DNA analysis and creation data and whatnot, and a financial or insurance company may be looking at hazard and catastrophe data and whatnot. So there are some obvious ones that I'm sure are starting to repeat over and over and over. So can you maybe give us some examples of results that you've seen? Absolutely. So I'll share some of the results with you. We've seen results both in saving dollars on the bottom line and in being able to make business decisions quickly. For one client of ours, one of their pricing jobs was initially taking over 20, 25 hours to run on their mainframe systems because the data size was so huge. What we did was carve out the data into an HDFS table on Hadoop, move the job from the mainframe to Hadoop, and complete it over 10 times faster.
So one, it resulted in a huge saving for our client, because now you're not paying those mainframe processing costs, or MIPS usage, as it is known. But you're also able to make business decisions quickly, because now you're not waiting a day for that job to complete, and if that job errors out, that means you pretty much lose a day. So we have numerous use cases like that, where it's a combination of immediate dollar savings and also being able to make better business decisions that help you build better customer relationships and generate more revenue ahead. So you always hear a lot of the vendors say, we're gonna make Hadoop enterprise ready, that's gonna be our focus. You, in a way, help make Hadoop ready for the enterprise. It is ready for the enterprise, actually. So yeah, that's my question. Is Hadoop ready for the enterprise? It is. I mean, as we have used it for the last several years, we think it is ready. And take Hadoop as it comes from Apache. Hadoop is an ecosystem of technologies. It is not a piece of software or a technology that you can just download and hope it'll work. It's an ecosystem of technologies with various pieces like Pig, Pig Latin, or Hive, or Sqoop, and ZooKeeper. So you need to package it into what may make sense for you. So when you say ready for the enterprise, what do you mean by that? Because some are saying it's not ready. There are a lot of availability issues, failover, compliance, and you mentioned data governance. What are some of the things that are ready? Is all that ready right now, in your mind? Yeah, I think so, and there are companies that have done it for you. I'm not sure how much in depth I can speak for other companies, but Cloudera's distribution of Hadoop is enterprise ready. And we use and promote the open source version because we think there's so much development happening in this area that more and more will be seen moving forward.
So we suggest companies go for the open source, but companies like Cloudera and Hortonworks have already done the packaging for you. So that version is enterprise ready. Now, you could customize it depending on how you want to give access to it, the data layers, and what kind of security you want to implement on top of it, but it is there today. Of course, as YARN and Hadoop 2 come out, how does that extend the use cases that you'll be able to attack, in your view, and in what kind of timeframe can we expect to actually see those things begin to generate business value? We think that companies have already started adopting or embracing those changes slowly. However, with continuous development in this area, we're seeing Hadoop become more and more powerful, allowing you to be more real time now, which was a drawback for it, becoming more secure, and becoming more enterprise ready, as we just discussed. You should see use cases sometime in the near future, in the next couple of months. But at Metascale we have the philosophy of, you know, let's prove the value of it before just jumping on an upgrade for the sake of doing it. We take time to test the system internally, test the technology, and see what new use cases it is going to accomplish. What is it going to do that the previous version was unable to do? So I do not have a specific timeframe for you, I would say, but you should see some interesting things in the near future. Excellent. Well, thanks for coming on, Ankur, we really appreciate it. Great story. I mean, it's one of those things where you guys built the practice because it was your own use case. Final question: advice for folks out there on the use case, because that's a big thing. People want to see the use cases. This is the year where validation is kind of happening here at Hadoop World and Big Data NYC, so what's your advice for the folks out there who are touching some basic use cases? What should they do? What's your advice? I'll make it short and simple: hire Metascale.
Okay, absolutely, that's a good solution. Of course you're going to say hire you. No, absolutely. So we tell companies, just take your time. You have to find a trusted partner who can really provide you an overall view of what technologies are out there, and who can then help you find how Hadoop will fit into your ecosystem, your landscape. So we suggest that companies take time. They should have a combination of business users and technology folks, from both the business side and the technology side, come together and essentially do brainstorming. And as I said before, we have this innovative set of questionnaires and back-and-forth discussions with our clients, where we probe deeper with them and try to find these use cases for them. So our neutral advice would be, take your time, take a step back, and really think about how Hadoop will fit into your overall ecosystem, which part of the ecosystem, and what you are really trying to accomplish. So see the end result first, before you start your journey, and try to accomplish that end result. A lot of the time we get requests for proofs of concept where the success criteria are not very well defined. So it's very hard for us to say, even if we save you 50% of the processing time, whether that is the success you were looking for. And some clients may have unrealistic expectations that Hadoop will cut the time by 90% or 95%, but that may not happen. So it's important that you take your time, look at the end results, define your success criteria, and start with a few use cases. You may not have everything. In our world, it took us time to find use cases, and we are building more and more of them. We have more and more businesses within Sears that are coming to us and asking for help in this area. Our pricing team is a heavy user of these technologies, but now there are other businesses that are looking for help on this side. So that's the same thing we suggest to our clients. Find those businesses who can help you.
Final comment, I want to ask you quickly. What's your impression of the show? What's your big takeaway this year? Share with the folks out there what's going on here. What's the big takeaway this year at Big Data NYC and Hadoop World? There's a lot of buzz out there. There are a lot of new announcements coming out, and we see more and more companies that are interested in knowing more about Hadoop and big data. So it seems like the Hadoop adoption curve that Wikibon, you guys, predicted is becoming steeper and steeper, and that's exciting. It seems that Hadoop is here to stay, and it's going to grow, as we have seen on our side. Time to up the numbers. Data engines being native in Hadoop, on top of Hadoop, inside Hadoop, platforms, data platforms, that's all the rage. It's great commentary. Thank you very much. Ankur, we appreciate it. Metascale, if you have any needs, go talk to these guys. This is theCUBE. We're bringing you all the top conversations here inside theCUBE, live in New York City, right outside the Hilton where Hadoop World and Strata Conference are going on. This is Big Data NYC. Go to our crowd chat, crowdchat.net slash strataconf, and you'll see the conversations we're keeping track of. We'll be right back with our next guest after this short break.