Live from the San Jose Convention Center, extracting the signal from the noise, it's theCUBE, covering Hadoop Summit 2015. Brought to you by headline sponsor Hortonworks, and by EMC, Pivotal, IBM, Pentaho, Teradata, Syncsort, and by Attunity. Now your hosts, John Furrier and George Gilbert. Okay, welcome back everyone. We are live in Silicon Valley at the San Jose Convention Center for Hadoop World, Hadoop Summit 2015. I'm John Furrier, the founder of SiliconANGLE. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. Our next guest is Lawrence Schwartz, CMO of Attunity. Welcome back to theCUBE. You've been on every CUBE we've had. I kind of slipped and said Hadoop World 2015, but it's Hadoop Summit. It's the same ecosystem. Yeah, you've been seeing it evolve over the years. It's been interesting to be on here and talk about the growth at each one over time, but yeah, it's a pleasure to be back. So I want to get your perspective. We've had many chats over many CUBEs over the past five, six years; this is our sixth year with theCUBE. Right, right. What's your take? I mean, you have a historic perspective. We've been on the front lines, present at the creation of the Hadoop ecosystem. So what's your take? Are we high-fiving each other too early? We crossed the chasm, as people said. I mean, George and I are like, hey, the industry, we're high-fiving each other. Self-congratulatory, but customers aren't high-fiving us. Or are they? Yeah, well, I think you've seen a lot of articles recently. You've had Gartner talking about being at the peak, right? And just kind of crossing over that. So I think there's been a lot of hype. The nice thing is over the past few years, we've kind of seen the promise, right? Then you've kind of seen some adoption with people trying projects.
And what's been exciting for us, since we work with a lot of customers who have traditional data stores on-premise and are trying to figure out how to do it, is that we're actually starting to see the return on value. Now there's still definitely a large gap, right? The number of deployed Hadoop systems is in the thousands, versus the billions of dollars still going into data warehouses. So it's an active and noisy area, but we're actually starting to see some real value with customers. Yeah, George and I were talking, and Dave and I were talking at the last event, and we also went to the Informatica event. It's a whole master data thing. We were just talking to Talend about this. There's no doubt that there's going to be some shake-up, but this is never going to go away. The data warehousing and business intelligence stuff is classic. It's just such a massive market, but there are success costs involved in having this awesome data warehouse. The question is whether you can do it at a lower cost. So everyone wants to move portions of it off of the data warehouse into cloud or whatever environment. So that's where I see the check writing going on. Do you see the same thing? And what are you guys doing there? Yeah, and that's the place that we've been playing in. As a company over the past few years, even before you saw a lot of growth in the Hadoop market and the cloud market, our focus was how do you move between all the different existing data stores in an environment, whether it was Oracle or DB2 or SQL Server. And that carried over first into the cloud. So that's actually where we started to grow as a company, supporting Amazon, being a close partner on AWS, so people could figure out what portion, what type of analytics they might want to move from Oracle to Redshift. And then as Hadoop started to grow, it was the same question of how do I go into an operational environment and take advantage of Hadoop?
Because there are plenty of open source tools, things like Sqoop. People can get started and get trials going. But then when you want to keep it up, with consistency, real time, all these other requirements, you have to take another step. So we've been focused on helping with the movement of data. And what's been exciting for us is that just recently, in the last quarter, we expanded our portfolio and announced the acquisition of a company called Appfluent. And they help with the first part of the question, which is not just how do I move the data, but what data do I actually want to consider moving? I can look into the data warehouse now; there are plenty of legacy systems, and they're not going to go away. And how do I figure out the value of the data in there? Do I have data that really needs that high performance and lineage information or whatnot, that I really want to keep in there? And what information can I move to Hadoop to do kind of the longer-tail analytics? So that's been an exciting area because now it's not only a question of how do you move, but how do you figure out what to move? And that enables them to cross that gap and take advantage of the power. You guys have had great success. You were recently named a top big data company by Database Trends and Applications. So congratulations. Thank you. But I've got to ask you, there are a lot of tools out there. Yeah, yeah, sure. I mean tools as in, you know, tooling. Right, right. So there's a lot of noise. How do customers figure out what to do? You guys advise on technology as well. That's a huge issue. And is that the pretext to replatforming? This is a conversation we were having earlier. So, just the noise: how do you get the signal from the noise on the tools? Yeah. And then, you know, the replatforming trend. Yeah, it's interesting. You know, when you look at it, we kind of went from being in the data integration space, kind of focusing on that area.
And then we talked to customers. It's a broader story about data management, right? So you've got to step back and say, forget about what you're moving to and what the platform is; what do you want to do with the life cycle of your data, right? Because it's going to have some value up front. As time goes on, it might have less value. You've got to figure out where to move it. How do you optimize across different platforms? And that's the conversation they want to have, right, before you get into the conversation of, oh, am I going to use Hadoop? Am I going to use my existing stores? Am I going to use the cloud? So we've been able to architect a platform that cuts across all those different areas, right? So now you're having those broader conversations of, let us help you figure out what to move, then move it, and then manage it and get it ready for analytics. That sounds like the type of conversation that a trusted advisor would have. Exactly, yes. And so who are you talking to in terms of roles? And, just thinking in terms of numbers, we were hearing previously, just on CapEx, that data warehousing costs about $35,000 a terabyte and Hadoop around $1,700. But actually we've heard it's worse, that it's more like $100,000 a terabyte with operational costs capitalized, and Hadoop down to $1,000 a terabyte, which sounds like a bit of a stretch. Yeah, yeah. So you've got tools now and a relationship where you can advise. Right. So help us, tell us some of the scenarios. Exactly. You know, where you say, is it just the ETL offload stuff, or what else? Yeah, no, it's a great question. I think for us, as we've expanded our portfolio, we often have conversations across the spectrum. So when you think about their overall data management products, now you're having CIO-level discussions.
And they start thinking about all the factors, and I'll give you an example of how they come to us and how they think about it, right? Again, it often tends not to be a technology play. First and foremost, you've got the question of cost, right? So we've got the top people at an online travel agency, one of the biggest in the world, who came to us and said, look, I've got six petabytes plus, and growing, on my traditional data warehouses. Six petabytes. What are they spending on that? Yeah, well, that's the thing. I mean, they're spending incredible amounts of money and they're trying to figure out- At $100,000 a terabyte. It's unbelievable. Yeah, I mean, and this is one of the largest online travel agencies. It's a good business. It's a very profitable business. So the question is- Not good for the customer. Yeah, they're not going to move everything to Hadoop tomorrow, right? But they're trying to figure out what makes sense. How do I modernize the architecture, and where can I slot it in to do the longer-term analytics? And the first thing you have to do is go in there and figure out what's being used, how often, and where it's consuming resources. And we go in there. That's a profiling app, isn't it? It is. So what Appfluent did as an independent company, and it's now under the Attunity umbrella as Attunity Visibility, is it can sit there and look at the logs. You'll have to let it run for a while, or if a customer already has a lot of log information, you can pull that in and it can use that. And then, yeah, it's telling you, based on the queries and other things, what's being used. And I like to think of it as, we're going out there and trying to hunt down the zombie data and the vampire data, right? You've got all this zombie data, which is data that's sitting there and not really being used. It's taking up a lot of space, right? Very costly to keep in a data warehouse.
And at the same time, when you're talking about ETL processes, you might have a smaller subset of data that's just taking up a lot of ETL. It's being updated all the time, right? And it's sucking up the processing cores. It's kind of like a zombie sucking that out. It could be very active, it's just not strategic. Exactly. So you don't want to use your data warehouse for data that's just taking up space and taking up a lot of resources at the end of the day. So you can profile and figure out very quickly, with an assessment, where do I want to move this, right? And this company was able to save $6 million right off the bat by putting in the solution with Attunity Visibility, right? It's a nice step, right? And then they can, at the same time, think about how do you curb the growth, right? We have other companies we work with where they might have terabytes. They may have scores of terabytes, but they know it's growing and they want to cap that. So we work with them to figure out, okay, how do we cap the level that you have in, say, DB2, and then figure out what to offload into a modernized architecture? So Merv Adrian was on theCUBE yesterday, and he had released his survey. And talking about that, glass half empty or half full, that was the debate. And the survey wasn't painting a rosy picture of adoption. What's your take on that? I mean, I think Merv's right, but again, the sample size is a little bit too small. Is that representative of the broader global market? I mean, I think he's actually right on the survey results, given how he sampled it. So sample size aside, he might be right, but is that enough of a sample? I mean, what's representative of today's market? What's your take on that? Yeah, well, we're just one company, right? But we have thousands of customers worldwide using some form of an Attunity product.
And when we have conversations about data management and data integration and what companies want to do, you know, from talking to our sales guys about it, you can say that half of those conversations now involve something around Hadoop, what they're trying to do and what they're trying to move to Hadoop. So that prevalence is there. Again, you're starting with smaller numbers. There's a vast amount of legacy data systems out there in data warehouses. But you can say at least half or more are having those conversations. So- Those are still good numbers. They're fantastic numbers, right? Because the other half are people who just have typical management issues, or it might be a part of the system they don't need to move, or they have other issues to work on. So those are great numbers. And compare that to even two years ago, when as a company we were at the cusp of it and looking at it. We were just getting inquiries and we were asking, should we even go into this space? Right? And now here we are at half the conversations. That's a pretty amazing advancement. Let me ask one thing. Sometimes the data management vendor or data warehouse vendor wants to have tiered storage, you know, high performance, medium performance, sort of semi-archive. And in doing that, they're doing the profiling internally. And since they have visibility into how they're constructing the SQL queries, they have a certain level of fidelity in trying to come up with that mix. Right, right. I mean, you're obviously going beyond that, because they're not saying get rid of the ETL; that would be like getting rid of 40% of their workload. Right. But how do you position yourselves against that? Yeah. You know, what's the message to the C-level audience? Yeah, so there are existing tools on a lot of platforms which are very good.
And a lot of them are geared towards, A, just that particular platform, right? And the other thing is they're geared more towards operational statistics on the database, or rather the data warehouse, and how to manage it. And again, if you're the DBA, that's one thing. But we also provide visibility out to the business level, right? So let's say you're the internal IT department; you're trying to service operations, you're trying to service marketing, you're trying to service finance. The product can actually look at the data and do cuts of it based on the different businesses coming in, so you can see which users from different departments are coming in. So now you have somebody from finance saying, hey, I need another 30 terabytes and it's got to be the highest performance you have because I'm running these types of reports. Or you can associate a cost with the SLA. Exactly. We can do showback, we can do chargeback, right? Yeah. And then basically you can say, hey, if you're asking for this type of service, let's actually look at how you're using this service. You may think that you're using this and that you need this level of SLA, but in reality, you're not. Or maybe it's the opposite: you've asked for this little, but boy, you're consuming half of my resources. Let's have a conversation. So it's sort of like the delta between what they want and what they actually use. What the reality is, yeah. And it's funny when you talk about storage. I mean, I'd been at some of the big storage companies earlier in my career. And tiered storage is another thing storage vendors think about, how you tier it. In some ways it's a similar fashion. I mean, I think EMC's got a good message right now. EMC is saying, you know, store it in EMC. Everyone says, don't store your data in someone else's platform. They're saying, we're open, store it in our platform.
So that's kind of, even Merv made a comment about Teradata. No one should put all their data in one platform. But that's what Teradata wants, right? So it's like, everyone's saying the same thing. Don't store it in someone else's platform, but store it in ours. Yeah, well, I think the challenge is that large enterprises, especially these really big companies, right, Fortune 500 companies, they have a little bit of everything, right? They're going to have some Teradata, some EMC, they're going to have DB2. They always have some sort of mix across different departments, different areas. And so we work closely with all those partners, and in some of those areas it makes sense to keep it in one system. And then other times you need to have that heterogeneous view, and we can help with that as well. So here's the question I have for you. I'm a customer and I say, hey guys, here's my concern. I'm going to invest in a Hadoop project. We've done some POCs, my guys like it. Yeah, yeah. You know, pick a vendor, say Hortonworks. Sure. Or whatever. And I'm really worried about foreclosing future opportunities. Yeah. So I want to have a data architecture where I can store my data. Yeah. And I want to have the ability to say, hey, I want to move to the cloud and on-prem. And I want to have multiple players. I might want to have Cloudera Impala. I'll use Hortonworks here. I want to have the ability to go horizontally across platforms based on whatever my app needs are. Sure. So if I'm not successful, I get fired. Right. So I don't want to get fired. It's probably the other two. If I'm successful, I want to actually have growth. I don't want to have to pay more money and have headaches to replicate that data. What do you say to that? Yeah, well, part of it is it's not a single- Store it in just storage and have like a commodity storage layer?
No, no, I would say it tends to be a dynamic environment, a dynamic decision, right? It's not just that you move the data over. A lot of companies do this, or have the need, where you take historical information and move it over to a lower tier and then you're done with it, right? But I've also heard it described as a dynamic supply chain of data, right? You know, sometimes it's going here, sometimes it's going over there. And you've got to have the flexibility to move it around, right? You've got to have the flexibility to offload it. So for example, you talk about the cloud, right? We work with some of the major healthcare providers like Philips, which is also a close partner of Amazon, a close customer of Amazon. There was a focus on them at the re:Invent show. And they try to figure out, with Redshift and with Attunity, how do you take advantage of the capabilities to do analytics on demand, right? For some of their applications, they might need that some of the time. They might need to do some of their analytics on their on-premise systems. So I don't think they give up that flexibility, but they want to have the ability to choose as needed in those different areas. All right, so the outlook for the industry as we wrap up here. What's your take looking back four years? What do you think? Looking ahead four years? No, looking back four years and then looking ahead. Yeah, yeah. Where we've come from, and is the tide going out, with the next big wave being cloud or whatnot? Yeah, I think it's an irreversible trend in some ways, right? Even look at where people are developing their systems. Just from a technology standpoint, Hadoop is becoming like a data operating system, right? So more and more people are building solutions around that. So the trend is going in that direction. But also towards what John said.
Do you think that if we move more and more to the cloud, with Redshift on AWS as one example, that data operating system will be the data operating system for Azure, Google, and Amazon? Or will we see more heterogeneity? Yeah, that's a good question. Because, you know, if you look at what's the fastest growing Hadoop adoption platform, I think it's EMR, right? You know, in terms of just sheer growth. So I don't have an answer to that one. It's a good one. But there's that general growth and trend of what I call the democratization of getting to data, right? Which is, I want to take advantage of Hadoop; it lowers the cost, makes it easier to do things. It can run more things, create a bigger lake. I can go into the cloud very easily, right? I can set things up, take advantage of that, and do quick analytics. We just announced a solution for Mongo, right? If you're a developer, now you can get into it much more quickly without having the DBA background or being a SQL expert, right? So I think the bigger trend is the democratization: all these people in all these different departments have much greater access, lower costs, lower entry points, and it's easier to spin up than it ever was before. And that's what I think the excitement is. All right, Lawrence, we're running out of time. Got to wrap. Thanks for coming on theCUBE. Really appreciate seeing you. Attunity is doing great. You guys are awesome. Big data, congratulations on your recent awards. Sure, thank you. Thousands of customers. So, yeah, business is good. And it's a healthy ecosystem. I'm really excited. So, half the people want to do it. So, it's not bad. That's right. It's a pleasure to talk to you guys. All right. It's theCUBE. We'll be right back after this short break. Thanks.
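
The profiling approach described in the interview (find rarely used "zombie" data and ETL-heavy "vampire" data, then estimate what offloading would save) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how the Attunity or Appfluent products actually work: the `TableStats` fields, the thresholds, and the function names are all hypothetical, and the per-terabyte costs are the unverified CapEx figures quoted in the conversation.

```python
from dataclasses import dataclass

# Per-terabyte CapEx figures as quoted in the interview (unverified estimates).
WAREHOUSE_COST_PER_TB = 35_000
HADOOP_COST_PER_TB = 1_700


@dataclass
class TableStats:
    """Hypothetical per-table usage summary derived from warehouse logs."""
    name: str
    size_tb: float          # storage footprint in terabytes
    queries_per_month: int  # reads observed in the query log
    etl_share: float        # fraction of ETL processing consumed, 0..1


def classify(t, min_queries=10, min_size_tb=1.0, max_etl_share=0.10):
    """Label a table as 'zombie', 'vampire', or 'keep'.

    The thresholds are illustrative assumptions, not vendor defaults.
    """
    rarely_read = t.queries_per_month < min_queries
    if rarely_read and t.size_tb >= min_size_tb:
        return "zombie"   # large and rarely read: occupies costly space
    if rarely_read and t.etl_share > max_etl_share:
        return "vampire"  # rarely read but consumes a lot of ETL processing
    return "keep"


def offload_savings(tables):
    """Estimate CapEx saved by moving every non-'keep' table to Hadoop."""
    moved_tb = sum(t.size_tb for t in tables if classify(t) != "keep")
    return moved_tb * (WAREHOUSE_COST_PER_TB - HADOOP_COST_PER_TB)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        TableStats("clickstream_2009", 40.0, 2, 0.01),
        TableStats("session_rollup", 0.5, 3, 0.30),
        TableStats("bookings", 5.0, 900, 0.05),
    ]
    for t in sample:
        print(t.name, classify(t))
    print("estimated savings:", offload_savings(sample))
```

A real assessment would derive these statistics from query and ETL logs collected over a period of time, as described in the conversation, and would also attribute usage to departments to support showback or chargeback.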