Live from the San Jose Convention Center, extracting the signal from the noise, it's theCUBE, covering Hadoop Summit 2015, brought to you by headline sponsor Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity. Now your hosts, John Furrier and George Gilbert. Okay, welcome back everyone. We are live here in Silicon Valley at Hadoop Summit 2015. This is theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, with my co-host George Gilbert, our new big data asset at wikibon.com, and our next guest is Ashley Stirrup, CMO of Talend. Welcome to theCUBE. Hi, thanks for having me. So, day two is just getting started, day one was yesterday. A lot of enterprise, it's crossing the chasm, it's going mainstream. Pretty much everyone's saying, okay, Hadoop's going mainstream. So, what's your take on that, what are you guys doing, and what's the big news with you guys here? Yeah, well, we're seeing our Hadoop business accelerate dramatically. We finished last year growing 122%, and then in Q1 we grew the business 178%. So, I'm not sure I would call it having crossed the chasm yet. It's already a huge business, but I think it's got a ton of potential still ahead of it. We just talked about that in the intro. It kind of landed in a tough staging area, kind of like a golf ball drop. Still in play, still kicking butt, but a lot of debate on crossing the chasm. Why don't you think it's there fully? Is it because of the apps? You know, if you look at Hadoop, it provides tremendous value in terms of performance and cost savings, and yet it's still pretty hard to work with. There's a lot of growing up it still has to do, and that's where tools like Talend come in, providing a visual interface that lets you do drag-and-drop data integration instead of having to hand-code MapReduce, for example.
You know, we were speculating yesterday and again this morning on the intro: the industry wants to self-congratulate because it's not a fail. I mean, Hadoop is full steam ahead. So I think the chasm is crossed from an industry perspective, but the customers are a completely different story, and we heard that from Gartner, we were talking about it yesterday. How do they want to buy? Is it the ease-of-use message we're hearing? What is the state of customer consumption? I mean, it won't just be Hadoop, there are going to be other platforms. What's going on in the customer's mind, and how do they talk about this? What's the language they use? They don't say, give me some MapReduce, do they? Right, no, they're definitely not saying that. Or give me some Hadoop. What I hear customers saying is, how do I take advantage of Hadoop and how do I fit it into the rest of my IT ecosystem? So, how can I get data from all these different places, use Hadoop as a great place to analyze it and act on it in real time, but then push all that insight back into the cloud applications I'm using, or my on-prem systems, an SAP or something like that? And how do I make it actionable for the business? And so, what do you think the industry needs to do to cross the chasm? Is it just ease of use? Is it integration? Is it more software filling the white spaces? You know, for sure it's across the board. I mean, for us, what we're focused on is being able to act in real time. So not only can you do the high-volume, heavy analytics, but then how do you make it actionable? I mean, a great example. Oh, I was going to ask, how do you close that loop? Right, well, a great example of that is a company called Otto. They're one of the largest online e-commerce companies in the world, based out of Germany. And what they're seeing is, you know, 50 to 70% of their shopping carts get abandoned.
And they've realized that if they can predict who is going to abandon a cart and then pop an offer in real time, you know, free shipping or a coupon, something like that, they can have a dramatic impact on their business. Several hundred million more in sales by influencing that one event. And what I love is their CTO's comment: if you can't act on it in real time, all you can do is measure how much money you're losing. Yeah, I love that. And I love the tagline on your business, gain instant value from all your data. And I've got to ask you to take us through the mindset of a scenario. So I was talking to a Hadoop entrepreneur, and he said, we are domain experts in this vertical. They have been kicking butt, doing great, self-funded, making money. It's a business. It's growing and they're happy. They've got 80 employees, they're going to be like 300. They're on a great trajectory. They do seven-figure deals, and now they're going to do like $10 million-size deals, big deals. And I said, what's your challenge? He goes, here's my biggest challenge. We're app developers, we're writing software. So when we bring it to a customer, we're so excited, and when we get to the customer, they say, great, but you've got to support this systems management tool from the '90s that we have, or this and that. So what happens is he has to write more software, which is not in his wheelhouse. That's a real scenario that we see. I've seen that in multiple cases. So take us through that use case, because that becomes a challenge when you have an invested party, the software developer, building value using data sets that are specific to that domain expertise. Well, make sure you tell them about Talend next time you talk to them. Is that a use case for you guys?
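The Otto scenario above can be sketched as a simple scoring loop: a model flags sessions likely to be abandoned and pops an offer while the shopper is still there. The feature weights, threshold, and field names below are invented for illustration; they are not Otto's or Talend's actual pipeline.

```python
# Hedged sketch of real-time cart-abandonment intervention.
# The "model" is a crude hand-weighted linear score, not a trained one.

def abandon_score(session):
    """Higher score means the cart looks more likely to be abandoned."""
    score = 0.0
    score += 0.4 if session["idle_minutes"] > 5 else 0.0   # shopper gone quiet
    score += 0.3 if session["cart_value"] > 100 else 0.0   # big carts stall more
    score += 0.3 if session["pages_since_add"] > 3 else 0.0  # wandering away
    return score

def act_on_session(session, threshold=0.5):
    """If the session looks at risk, return an offer; otherwise do nothing."""
    if abandon_score(session) >= threshold:
        return {"session": session["id"], "offer": "free_shipping"}
    return None

at_risk = {"id": "s1", "idle_minutes": 7, "cart_value": 150, "pages_since_add": 1}
fine = {"id": "s2", "idle_minutes": 1, "cart_value": 30, "pages_since_add": 0}
print(act_on_session(at_risk))  # offer made while the session is live
print(act_on_session(fine))     # None: no intervention needed
```

The point of the sketch is the CTO's line above: the scoring has to run against the live session, not a nightly batch, or the only thing left to compute is the lost revenue.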
Absolutely, yeah, we have a big OEM business where we've got companies just like that that are pulling data together, using our tools to help pull data together, or letting their customers do it for them, depending on whether the customer wants to pay for that or wants to take it on themselves. But how do you square the circle of, I mean, pulling data together, sort of, you know, there are well-known alternatives. Yeah. And there's a variety of machinery underneath to accomplish that. Right. But to feed that in in real time, how does that part work? Yeah, so just to give you a little history of the company, we started off in data integration doing traditional ETL, then we moved to data quality, and then we added master data management, which is actually super important. I mean, a great example: one of our financial services customers wants to take Twitter data and combine it with their traditional customer data, so that if they see somebody tweeting in Florida and supposedly buying a flat screen in Chicago, they know that maybe they've got a fraud incident there. And that's a perfect example of where you need to be. And that's a hard problem to solve too, because not everyone uses their real name on Twitter. Sure. And then you do all kinds of cleansing. Yeah, but the key thing there is you've got all these new types of data sources that are exploding. But where you get the real value is when you combine that with your other customer information. And so getting that 360-degree view of a customer, which master data management gives you, that's a key piece. It's just a step in the journey. I think that's a huge deal, because here, talking about fraud detection, you also have mobile first-party data, whether it's in non-clickstream data, you get first-party data. I was actually texting from Vegas, making a withdrawal in Vegas, which happened to me a couple weeks ago at a CUBE event.
And then I get a fraud detection, my credit card gets turned off. Suspicious activity. The data's out there. I'm on my phone. I'm making transactions. Just because I flew out of San Jose an hour earlier, their lag is not real time. So they were working off my data in San Jose and didn't factor in the phone data. That's kind of an interesting scenario you're teasing out, right? And there are hundreds of those. And so being able to get that complete view of the customer, that's step two, call it, right? And then the next step is, okay, do I have an ESB-type technology, real-time integration, to be able to push that to whatever system is interacting with the customer right there in the moment? So I was kind of, not dissing the 360 view of the customer. I love the concept, but it's been overused, right? Omnichannel marketing is another one. We all know that's legitimately the new era we have to live in, but the 360 view of the customer has to be real-time. Every step you take, your context changes. So how does that work in your mind? Are you seeing that in your solutions, and what do customers do to get that? I mean, true 360. Right, well, no question. I mean, the first thing is just, how do you bring together all this siloed data? What did they buy in the store? What did they buy online? What customer service calls have you had? And how can you pull all that together to give insight to the person on the phone? Oh, I see that you bought these things here and those things there. Maybe this sweater would go well with that outfit, for example. But then the next step is getting predictive around it. So not just acting in real time, but being able to predict the next thing the customer's going to do. So I've got to ask you, what do you think the biggest driver of this growth is? I mean, we're talking about scenarios that are hitting mainstream: real-time analytics in the moment, Twitter data, master data. That's a great trend.
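The geo-mismatch idea running through both fraud examples above (Florida tweet versus Chicago purchase, Vegas phone versus San Jose card file) can be sketched as a join of two signals: where the customer's phone last was versus where the card swipe happens. Customer IDs, field names, and the city-equality rule are all illustrative assumptions.

```python
# Hedged sketch: flag a card transaction when the customer's latest phone
# signal puts them in a different city. City names stand in for real geodata;
# a production system would use coordinates, timestamps, and travel time.

phone_pings = {"cust42": "Las Vegas", "cust43": "Chicago"}  # latest known location

def is_suspicious(txn):
    """Suspicious if the phone's last known city disagrees with the swipe city."""
    last_seen = phone_pings.get(txn["customer"])
    return last_seen is not None and last_seen != txn["city"]

swipe = {"customer": "cust42", "city": "San Jose", "amount": 500}
print(is_suspicious(swipe))  # True: phone says Vegas, card says San Jose
```

The interesting part is which signal wins: in the anecdote above, the issuer trusted the stale San Jose profile over the live phone data, which is exactly the lag this kind of join is meant to close.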
That's happening, new sources are coming in. The Internet of Things probably fits beautifully into that. What are the big drivers? Is it mobile? Is it the cloud? What do you see fueling the business? The number one thing is just the explosion of data across all of those. It's clearly a disruptive time, and the people who can take advantage of that data to be faster, to be more intelligent, to offer new products and services, those are the ones that are going to win. How much of this do you go in with as a repeatable, productized solution, and how much has to be customized to the individual feeds of data that roll up? Well, for sure there are things like master data management where we've got a methodology and a process for pulling all that information together. But we really are providers of a Swiss Army knife, and frankly, at this stage of the industry's growth, every company we're working with is trying to do something innovative and different, so there isn't a standard, repeatable solution that we're putting out in the market saying, just go do it like this. Take us through a customer use case, because this is fascinating. This is really providing a hammer and nails right away for customers. You can start basic, and then as they grow, your toolbox grows. Connectors are huge. I mean, connectors are the top of the funnel if you're in marketing and other systems. So take us through a use case of a customer, categorically. Well, a really exciting one is GE. So what they're doing, again, is an example of combining data. They've got all their past purchase data for, let's take, wind turbines. So they know the products the customers purchased, and they're typically the ones providing the service and support on those.
And now they're starting to say, okay, how can I pull all the sensor data from those windmills together and be able to predict the most cost-effective time to send somebody out, and how do I make sure I can guarantee as little downtime as possible? Yeah, so they save money on maintenance. Instead of driving a truck out to ask, is the windmill working? That's right. Yeah, looks like it's working. And they're estimating they can save their customers $2 billion a year in terms of lost energy. And how do they buy from you guys? I mean, how do you interface with customers? SaaS? Is it a subscription model? OEM? Yeah. So take us through that. Yeah, so we typically charge per developer. That's one thing that really differentiates us from a lot of the other players out there that are either node-based or priced on the data being processed, which is extremely hard to predict, particularly in this world. We make it pretty simple. You typically know how many developers you've got on a project. It's a subscription model. If we don't deliver, you don't need to keep paying. So your developers or their developers? Their developers, yeah. Though sometimes a systems integrator might want to do that. Okay. Yeah, right. What about the unwashed masses who are still, you know, like COBOL programmers a couple of decades ago, steeped in traditional ETL and master data tools. Not naming any names. How's the methodology different? Because, you know, the words are similar. Yeah. But tell us how the machinery works in the world of big data that's different. Well, that's actually one of the things we think is a real strength: you don't need a Hadoop developer to start working with our technology. Basically, any data integration developer who's been working with Oracle or MySQL or what have you can use our tools and start working with Hadoop immediately.
But I was thinking of the pre-Hadoop class of data integration people. Oh, so you're saying the point is that you can serve them. That's right. And bring them along. That's right, exactly. Okay, and then. There's no question that one of the big barriers is just having skilled people. Yeah. So in the traditional ETL world, is the workflow the same with your tools? Or are there different steps, but you can still leverage the same skills? I would say they're largely the same. I mean, obviously when you're dealing with the kinds of data volumes we're talking about, there are certain things you want to do, like working with a sample first and making sure you've got your model validated before you start to scale it. So there are definitely some differences. For example, we've got a customer that's adding four terabytes of data a day to their Hadoop implementation. And with your strategy around data quality, you need to be pretty smart about it when you're adding that much data. Oh, because of the exposure of sources. Yes. Okay. Talk about Spark, and when that went live for you. Spark's a big driver for your business. We had InfoObjects on earlier, and for them Spark's changed the game. Certainly it's the eye candy, the shiny new toy for the customers, but the value proposition is significant. Right. Take us through that, and where it sits from a reliability standpoint, a delivery standpoint, and where customers are using it. Right, right. Well, we're really excited about it. That example I talked about with Otto, that was a Spark-based example. And we think it's going to increase performance three to five times by doing things in memory. As opposed to which process? As opposed to traditional MapReduce. MapReduce, okay, over MapReduce. That's right.
And then on top of that, you've got a whole bunch of new capabilities you can take advantage of, whether it's the machine learning or streaming for ingestion, which is going to allow you to build a whole series of different applications. I mean, go back to that whole analysis of the turbine data. If you're looking at streaming data in real time, that's a whole new level of responsiveness. I mean, our CrowdChat stuff is all real time. We store that in memory on Amazon, because the minute you put it to disk, it kills the whole app. I mean, there are new apps out there that need this functionality as table stakes. Yeah, I mean, look at GE. Before they started working with us, they were analyzing their data on a monthly basis. The next step was to get to weekly, then daily, and it changes the whole game when you can start doing things in real time. So share your thoughts on this, because I'm trying to piece the puzzle together on Spark. I've been a big fan of Spark from the beginning, how it emerged out of Berkeley and the whole Cloudera involvement, et cetera. But I'm getting mixed messages. I've seen people using Spark in their apps and development. People are deploying it. But some people are saying, oh, it's not ready for prime time. I mean, what's going on with Spark? Is it shipping? Is it usable? What are people doing with it? What's its status? Is it half-baked? Is it really ready? Yeah, that's a great question. So today we're in preview mode with Spark, letting people play with it but not recommending they go into production. But with our 6.0 version that's coming out in September, it'll be fully supported. And we've held off. Fully supported from a Talend perspective. From a Talend perspective, right. And remember, we're helping people take advantage of this technology. So from our perspective, that'll be the stamp that, yes, Spark is ready for prime time. Because you're supporting it.
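The streaming turbine analysis described above can be sketched without Spark itself: a plain generator stands in for the sensor stream, and a rolling window flags a turbine whose vibration trend crosses a maintenance threshold. The readings, window size, and threshold are invented; in a real deployment this logic would run inside a streaming engine like Spark Streaming rather than a Python loop.

```python
from collections import deque

# Hedged sketch: rolling-window alerting over a stream of turbine sensor
# readings. A generator stands in for the real ingestion stream.

def readings():
    """Fake vibration readings; values are made up for the sketch."""
    for value in [0.2, 0.3, 0.25, 0.8, 0.9, 1.1]:
        yield value

def maintenance_alerts(stream, window=3, threshold=0.7):
    """Yield the window average whenever the last `window` readings
    average above `threshold` (an assumed maintenance trigger)."""
    buf = deque(maxlen=window)
    for value in stream:
        buf.append(value)
        if len(buf) == window and sum(buf) / window > threshold:
            yield round(sum(buf) / window, 2)

print(list(maintenance_alerts(readings())))  # → [0.93]
```

Only the last window trips the alert, which is the monthly-to-real-time shift described above: the same average computed in a monthly batch would arrive long after the cheapest moment to roll the truck.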
That's right, because we're supporting it for them, for our customers, to go make use of it. We felt that before now, the APIs were changing a lot. It's developing fast. That's right, exactly. There's one other big difference besides in-memory versus MapReduce, which is all the different processing types, whether it's machine learning, or streaming, or, you know, graph processing, you name it. They're all working on that same engine, where in the Hadoop world, each one has to go to a different engine, spit out its results, and read them back in. Will there be a fundamental change in the kind of data integration and, you know, the solutions that you can provide? Definitely. And to me, there are two levels of that. One is what we're going to do leveraging machine learning to make our products better. So, you know, think about the number of steps somebody might have to go through to pull together data from three different sources. They would have to do all those steps themselves. Now you start to bake machine learning into our product, and suddenly we're auto-recommending: oh, because you did step two, here are the next three, four, five, six steps. Do you want to just go ahead and do those? When you say next steps, do you mean to pull in resources? In the integration job, in the definition of that integration job. So, providing. For data sources. Data sources, for data integration. Data, okay, okay. So it's, how do I make the process of building an integration job so much more seamless? Because the app, you know, Talend in this case, is actually recommending: here's how you can go do that. Oh, I see what you're doing. Let me show you the next five things you should probably do. Do you want to use these? Yes, great. And, you know, click a box and you're done.
So as we end this segment, I want to get your thoughts on what's under the hood. Share with the folks out there what's under the hood at Talend, the software, the tech, what you guys have going on. Yeah, okay. Well, we're an open source provider of data integration applications. So it's data integration, an enterprise service bus for real time, and master data management. But big data is the area we've made a huge bet on in the last few years. And what really makes us stand out is the fact that everything you do with Talend is being done directly inside of Hadoop. So you're getting all the performance and scalability benefits of Hadoop. We're generating native code, whether it's Spark or MapReduce or Pig, to get the full advantage of what Hadoop brings to the table. And finally, for the folks watching out there, describe your ideal customer who would get the value proposition of Talend, so they might recognize themselves watching this. Right. You know, you have multiple sources, unification of the data. I mean, describe the problems you're solving and some of the challenges. Well, the first thing I'd say is we're seeing a whole range of different types of customers, because we make it really easy to start small and grow over time. So we see people doing small, simple things, like their first ETL offload project with Hadoop, all the way to people making much more strategic bets, where they want to become more data-driven and pull a whole wide variety of information together in real time to be able to act on it and solve new business problems. All right, Ashley Stirrup, CMO of Talend, here on theCUBE. Day two kicking off, wall-to-wall coverage here on theCUBE. We'll be right back with more after this short break.