Live from the Fairmont Hotel in San Jose, California, it's theCUBE at Big Data SV 2015. Okay, welcome back everyone. We are live here in Silicon Valley in San Jose for Big Data SV. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. I'm with my co-host Jeff Kelly, Big Data Analyst, and Lawrence Schwartz, VP of Marketing at Attunity. Welcome back. So again, thanks for coming back. Thanks for having me. Always a pleasure. It's a good sign when people keep coming back. Yeah, we saw you in Vegas. Your customer event, it was great to see you. I had a great dinner. Thanks for hosting us. Last event we were at. Great to see the success. So got to get your take. We have been pounding the pavement all day here on theCUBE and talking to folks. The big announcement, obviously the data platform, all the analytics stories are out there. This year, what's the bottom line? Lawrence, tell us what's happening. What's your perspective? A lot of action, a lot of formations, a lot of speculation about the growth and new waves coming in. What's the take? Well, I think from a technology perspective, the interesting take that I see and we see with our customers, especially from last year, is there's a movement of how do I do more with Hadoop in terms of getting this into, say, real time, right? That's a big movement. People are trying to figure out how they push the boundaries. You see some of the announcements from some of the vendors like MapR talking about this as well. You see the growing popularity of Spark. So that's coming up and we're surprised. We see it in our customers as well. They kind of start with some basic loading and getting started there for cost savings. 
We were talking about this earlier, but they've really moved on to more applications that look at how to keep up the data lake in real time and keep it fresh, how to pull in different data feeds and get a real-time view of their customers and their business. So that's a new angle and that pushes us into new boundaries and some new exciting possibilities. Well, I think you're absolutely right. We're definitely starting to see the conversation shift from, oh, I'm going to move a little bit of data out of my data warehouse and save some money and put it in Hadoop, to, what am I going to do with all this data? And when you talk about what you're going to do with all that data, it's analytics, and analytics increasingly has to be, if not real time, close to real time. It depends on the use case, but the idea of this long lag time between when you have a question and getting an answer to it just doesn't work in this world. That's right. And yeah, we've seen some interesting use cases with our implementation. We launched our Hadoop offering at the last Strata. And already we've seen some interesting adoption and people taking it up. And you see it for a couple of different things. One is people are really looking for a way to kind of do preprocessing and then kind of keep that golden copy in a data warehouse. So we work with one major healthcare provider and they do just that. They pull it in, they use Attunity, which is using data replication for getting it into Hadoop, doing a lot of processing there and then putting that golden copy for higher speed analytics into a traditional data warehouse. And then we see these real-time data feeds that we were talking about earlier. So we have another customer, a cable provider, one of the major players. And what they're doing is they've got 200 different data sources that they're pulling information from all across their enterprise. 
And they wanna get a much more real-time view, particularly on the financial transactions, of what's happening and whatnot. So they would in the past have pulled that into what's called the traditional data warehouse. Now they wanted a way, and they pulled in Attunity to do that, to get it into a platform where they could leverage Hadoop. They're actually using the Pivotal version with that. And that's a way to kind of get that more real-time feed that you could do in the past with a data warehouse and now you can do much more flexibly, easily with Hadoop. So we're seeing more of those trends and then we're seeing modernization showing up as well. So surprising to see, but we've got customers who are going from mainframe to MapR, or looking to do that, right? Which is a cost savings play. It's a question of how do I get what I've worked with before into a new way to look at the data. So those are the things that we're seeing a lot of. So you mentioned Pivotal. So gotta get your take on some of the big announcements around here, the Open Data Platform. Talk a little bit more about maybe that specifically, your take, but also just generally, how do you see this market evolving now? I mean, we're starting to see, in my opinion, some clear factions or alliances start to form. Which to me says, actually, the market's maturing. Talk about your viewpoint. How do you view some of the announcements we heard over the last couple of days and, more generally, where this market's going in terms of consolidation, potentially a lot of acquisitions that are probably going to happen within the next 12 to 18 months. 
Sure, now Attunity has always been a player where we've tried to be kind of the neutral Switzerland, if you will, in the data space, and that's been true with the databases out there, the data warehouses, all before Hadoop. Everything from Oracle to SQL Server to Teradata to Vertica, we work with all those players and we have some type of relationship, oftentimes very good partnerships. So as a company, from our angle, we've tried to maintain that for the Hadoop vendors as well. So when we first launched, we came out with support for Cloudera and Hortonworks. Actually, what we announced for this show is our official support for MapR as well as for HAWQ. So we are trying to enable people to work with whatever distribution makes sense for them and what they're doing. Now going open, that's always an interesting kind of change and shift. And what I've seen from other places, and I've come from prior to this the MySQL space, there were a lot of vendors around that and some of them were closed and some of them were more open and there was kind of a hybrid mix there. And in general, going and having a much more open platform tended to be good for adoption, right? It tended to get more people using it. It tended to get a higher comfort level with companies on whether they could adopt something and what they could do with it. Even if in reality sometimes it's still pretty much a lock-in if you really use one type of technology, if it's open there's still that perception of it. So I think it's a generally good direction. I think it'll help drive adoption, so we're excited to see that from our vantage point. We got a comment on theCUBE earlier from a guest that said ML is the new SQL, machine learning is the new query language. What's your take on that? Because that points to a direction of, okay, analytics and platforms. You have to have tooling and platforms kind of working together. 
Pure play tools are kind of being viewed as lower tier, while platforms are seen as too much of a land grab. So this new balance, a platform with integrated application tooling, seems to be a sweet spot. So machine learning highlights it. What's your take on all that? How is that a key area for figuring out all the analytics piece and working with the different distributions and so on? Yeah, no, it is a critical step that you have to figure out in that whole value system. And what we saw was, Attunity would take data from the source and kind of move it to a target like a data warehouse. And then that's kind of where we left it. And then there was a gap between taking transactional data and then getting it into a data mart or a third normal form or something that you could do analytics on. And that actually spurred one of the acquisitions we did at the end of last year, a company called BIReady, to do just that: take that transactional data, do some of the integrity checks, do some of the third normal form transformations, the data mart cleanup, and get it to that final step. So that is an important part of the value chain. We recognize that, and that's why we moved in that direction. I think that becomes important for not just data warehouses, but for Hadoop as well. So absolutely. And then when I look at all the stuff coming in on the machine side, right? And the other angle that we're seeing more of and I think a little bit of a dating is, how do you take some of that completely unstructured, varied format data, merge that in with Hadoop, right? And I think that's going to be a combination of Hadoop. I think that's going to be a combination of NoSQL. I think you'll see some more of that happening as well. Where are the customers seeing the most traction and where's the most confusion, do you see? I see all these events are always about the momentum points. Where are people on the edge, the bleeding edge, and the reality, and where's the mainstream? 
Certainly when you see big movements like Open Data Platform, IBM, a lot of the big companies are here, you know there's money on the table. So the question is, where are we on the timetable of mainstream to the bleeding edge and early adopters? Where in the spectrum are we? Yeah, yeah, I think if you look at that whole classic Moore chasm, where are we? I think it's on the early adopter side, right? Which, when I was talking to somebody else about that, they said, oh, so that means it's really early. But if you look at Moore's curve, it's not really early. That's about halfway there, right? Then you get the late adopters and then the flow after that. So it's hit that point, but it's amazing. There's still a big learning gap, right? Even the people that are very smart and savvy, and the people who might come to the show, they go back to their companies and they're still very well versed in traditional SQL languages and Oracle and SQL Server. And it's like a whole new language you have to learn just to begin with, to kind of figure out Hadoop. And so that knowledge is a challenge, but how do you close that gap? You close it by making it much easier to do, right? I think it's interesting, go back to NoSQL for a second, right? The way that Hadoop evolved versus, like, Mongo, right? Mongo kind of went for the very simple, dead easy to use solution, right? Right from the start, that made them very popular. Hadoop is very popular for the processing power, right? And flexibility, but always kind of lagging, at least in the beginning, on that gap of using it, right? It took a real PhD initially to kind of get it going. It's still pretty complex. And it's still pretty complex, but the tools are getting there, right? It's the tools that we work on, the data integration side. It's the people who work on, you know, all different formats to adapt SQL to it. It's getting, you know, the operating system in there. 
It's packaging those up. And, you know, at some point it's gonna get buried underneath everything, but you know, we're not there yet. Yeah, but I think, you know, you touched on a really important point. Some of the challenges of getting Hadoop to go mainstream, and some would say, well, it's already mainstream. I would argue that big data as a concept is mainstream. I think most enterprises understand the potential, but in terms of Hadoop as a technology, I think, you know, in terms of adoption, you see the Global 1000 definitely involved with Hadoop projects, some successful, some not. You've got your data-driven or data-born startups where, you know, it's built into their DNA. But I think there's this huge middle part of the market, the meaty part of the market, that's still on the sidelines, and not quite sure. And one thing that's gonna have to happen is tools like Attunity and others that take away some of this complexity and make it easier to consume. And then the other part of the equation is, you know, the security, the compliance, that kind of angle, which is also a data integration play. I mean, that's part of it, understanding the lineage of data. So that has to happen. And then ultimately, it's about doing something with all this data. And that's, you know, the area where, I think, there's definitely some confusion in that meaty part of the market. We're seeing some of the data warehouse vendors with, you know, basically flat revenue. And I think it's because, you know, to some extent, some money's going to the Hadoop space that's not gonna go to the data warehouse space, but a lot of it is just confusion, I think. They're not quite sure what steps to take. So what's your take on that? What are some of the things that need to happen to really accelerate the market? 
To move this beyond, again, those Global 1000, to, you know, the Global 10,000? Sure, sure. Well, I think one interesting aspect that we've seen in other areas, on the data warehouse side, is a very successful adoption of people using data warehouses, things like Redshift on Amazon, that they might not have used before, or might not have considered. They might have just tried to do this themselves or kind of do it on Oracle. And that's because, you know, Amazon really lowered the bar, made it very easy to kind of try, get started and get moving, right? So that's one way, I think, if you have further movement there, and some of the vendors have done that, right? Some of the providers like Amazon and Microsoft and others have really made it easier to kind of try out Hadoop on that platform. And it's not so much whether it's on the cloud or on-premise, wherever it is, it becomes much easier to go and start and try and experiment and play, right? So those types of sandboxes, right? That's kind of one key piece of it, right? And then, you know, I've seen that in some of the announcements, and there were a lot of announcements this week, but you can even see some of the vendors on the cloud side talking about the tools that they're integrating for doing analytics and making that easier to do, right? As part of their platforms. So that lowers the bar. So now it's easier to kind of get on there and try, and then it's easier to just try and experiment with the analytics, right? So it's all pieces of the stack: it's getting it up and running, it's integrating that and making that low-cost and easy to try with the cloud vendors. And then of course, you know, the data integration side is how do you kind of get the data over there easily? Because when you look at the data integration side, what people often try and start with is, hey, there's a great tool, Sqoop, out there, right? Part of the Apache framework. But again, very hard to use and get started with. 
And so all those pieces kind of have to come together. So talk a little bit more about Attunity. What kind of momentum are you seeing? You know, as a public company, we can see your numbers, but take us inside that a little bit more. What kind of momentum are you seeing with customers? Any trends in terms of, you know, maybe new types of customers you're attracting? What's new in Attunity's world? Sure, sure. It's been interesting for us in a couple of different areas. So one area is, you know, the capability of how do I take advantage of Hadoop? Or how do I take advantage of alternative data warehouses in places that people might not have considered or kind of looked at? So, you know, you go outside and a lot of people talk about the data here, but there's also the application layer on that. So we have products in the Gold Client, I'm sorry, in the SAP space, a product called Gold Client, which is, hey, a lot of companies, you know, many organizations are running SAP, right? So how do you tier that, right? That's one question that we see, which is, on one end there's HANA, right? On the next level is SAP, but then what do I do with some of that other data? I might wanna archive it and get it out, run on SQL Server, run on, you know, an HP platform, run on Hadoop. So we're kind of seeing some interest where people have these, you know, traditional, very capable systems, but are looking to kind of tier that, going all the way out to Hadoop. So that's kind of interesting. Another area where, you know, we've seen some changes, and this goes a little bit back to the real time, is how do you blend more seamlessly the real-time experience, or the real time with analytics, right? So I've got massive amounts of data that I'm storing. I've got these massive amounts of OLAP, you know, data. And I've got this transactional data, right? Or OLTP data. And so how do I blend that more seamlessly? 
It'd be nice to have, you know, the one box that did it all, right? But people are looking for creative ways to pull that together. So that's an area that we try to help people with. And then, you know, another area, which is kind of, again, surprising, where your customers lead you sometimes, is that people are looking at, you know, not just for Hadoop, but how do I keep up a real-time copy, for disaster recovery and other things, for just a typical data warehouse. And that turned out to be an interesting challenge for us to look at, because that's a hard problem, right? When you are a company that pulls data out of a database, right, you can just look at the log files. You can look for the changes, map those over, and then bring them over. And then we had customers saying, well, that's great. Could you do that for Teradata for us, right? And we're like, well, they don't have a log file that we can just go pull this from. So we've developed some new technologies that we announced as well: change data capture for a platform like Teradata or a data warehouse, where you have to now go and query the database and look at the changes and then pull those out and then bring those over and make that part of a continual process. And that's good, again, for disaster recovery, or if they wanna keep a copy of this, you know, archived in Hadoop. So again, doing more real-time aspects, doing more near-real-time aspects with the data stores that they have on one end, and at the other end, you know, how do I pull stuff out, like the SAP example, for more archival or other reasons. So two questions I'd like to ask you real quick. One is the metadata, which is obviously really important. And we were talking earlier about, will there ever be an Open Data Platform-like concept for metadata? And it's very difficult, I'm asking you to do that. And two, where are the bottlenecks in the big data world? 
If you had to point to a couple key areas that you see consistently in customers, where are the bottlenecks? So one, metadata, will there ever be some sort of consortium around metadata, or is that even doable? And two, where are the big bottlenecks? Where are the top three pressure points you see in bottlenecks? Yeah, yeah, I think the metadata is an interesting one, right? Because it's so varied and it's so context dependent. So that's where. And so critical, no one's just gonna change anything, right? Right, right, right. So it's a critical problem to figure out, but you're right, it's hard to kind of get a standard around it and get people to work around it. You know, that might be easier to do within different verticals, right? Different industries that have more common data formats, coming out of the sharing. You can maybe see more of that in healthcare, things like that, maybe starting in that direction. But it is an area that could use further work. And then the bottlenecks, you know, it's interesting. Being a technology company, my first thought would be, well, what's the technology about? And the first one's always the people, right? It's the training, it's the awareness, it's the skill sets. You know, even when we bring on people to our own company and they've worked for many years in the data warehouse space or database space, it's still kind of a big effort for the training and experience they have to put on top of that. You know, the other bottleneck, boy, oftentimes, you know, people see the value of a project in, say, Hadoop, right? And they want to get started, but then the bottleneck arises. They don't plan for it. They kind of see the cost savings, right? They see what they want to do, but then there's that three-month time period, right? Where it's time to get it up and get it going and get it started, right? 
And so that tends to be a bottleneck. You know, it's more of a time delay, if you will, in the process, and that's one that comes up. So those are the ones that I kind of see, you know, it's the people, it's the planning process. And yeah, I think those are the ones that are most prevalent from my viewpoint. The overall landscape of open source obviously is important, but the reality is that there's a lot of jockeying for position. I mean, is it noise? I mean, do you look at this whole Open Data Platform as just another consortium, or do you think it's got some legs? You know, it's still new, right? So it's hard to talk to it, but, you know, I think people are gonna kind of choose what they will. The nice thing is, at least in the Hadoop space, right, I think there's more commonality than differences, right? And that's not always the case in some other, you know, open source systems. So that's helpful. You know, how it shakes out, how it pans out, I don't know, but, you know, I think it's good for the users, right? It's good for people to make noise about how they do things differently. You know, how different it is, for a lot of customers it probably isn't that different. But if you have particular use cases and other things in different areas, there's some value to that flexibility. What's the one thing you think people aren't talking about that they should be talking about in the big data industry right now? The one thing that they aren't talking about, boy. What should be on the agenda that's not getting enough visibility in your mind? I mean, because everyone's always, obviously, a big data platform, data platforms, obviously that's news, and there's gamesmanship involved in that. But also it's a legit move by, you know, Pivotal, Hortonworks, IBM, among others. There's big players, there's CenturyLink, Verizon, big players. So that has to be talked about. 
And obviously the position of Cloudera, Hortonworks, MapR, et cetera, et cetera. But what isn't being talked about? What should we be focusing on that maybe is being missed in the noise? Well, one of the things that I thought I'd see more talk about, and maybe we'll see it, I mean, the show's only beginning, right, so maybe there'll be more of it. But I was surprised. When you look at where the data growth is and where people are making some predictions on big data, you know, the Internet of Things is one that I would have thought more people would have been talking about, talking about how their platforms play into that. Maybe people wanna focus more on the infrastructure piece, but that creates a lot of questions around the deployment at the edges and how people work together in that area. There's commonality and standards that need work over there. So there's a lot of questions, and when you look at the amount of data being generated by machines versus other sources, I would have expected to see more of that as a common theme. And maybe that will be, we've still got a few days to it, but that was one that would be interesting to see a bit more teeth into. Yeah, very good point, and I think that's really where the next wave of innovation is gonna come from. Building applications and tying in all that data coming off of machines, coming off of wearables, whatever the case might be, all that data coming off physical objects, there's a huge opportunity there and I think that's where you're gonna see a lot of the action. Whether it's this week or not, I'm not sure, but I think going forward, that's really where the opportunity lies. Again, moving beyond the, okay, we can store and process this data, we can do some cost savings, but now let's talk about actually leveraging all this new data and ways to monetize it. Sure. 
So I gotta ask you on predictions, since we're here talking about predictions, day one, what do you think's gonna unfold in the event? The Strata Conference, Hadoop World, the big mashup with O'Reilly Media, we're on the CrowdChat there, it seems, you know, this hasn't really unfolded yet. Tomorrow is really the big day. Right, right. What do you think's gonna happen? What's gonna be surprising? What's gonna be happening? What's gonna be obvious? Yeah, yeah. Well, you know, you see a lot of the pre-positioning, right, and vendors talking. I think the interesting points are gonna be when you see some of the Fortune 100 companies here talking about what they're doing and what their future plans are, where they're investing. I was surprised, you know, we were setting up our booth today and I look across and there's Target there, and I was like, wow, that's kind of interesting, right? I didn't expect that, right? So I'm very curious what they're there to talk about and what they're doing. Now, maybe it's recruiting, maybe it's other purposes, but once you get the actual users there, right, speaking and talking about the hiring needs that they have, then it gets out of all the, you know, the vendor, you know, common line. Urinary Olympics, Dave Vellante calls it. What's that? Dave Vellante calls it the Urinary Olympics. There we go, yes. So it's a classy way to put it. Yes, yeah. Good, but customers ultimately are trade show-like. When you start to see customers showing up, it becomes kind of like a trade show, not just an industry conference. Exactly, and seeing that, you know, just seeing them set up across from us, it's like, wow, this could be interesting, right? This could be a new change, right, for this year versus in the past. All right, Lawrence, we've got to break at that. Thanks for coming. I really appreciate it. Great to see you again. Congratulations on your success. 
Again, we're kicking off Big Data Week, Big Data SV, in conjunction with Strata Conference and Hadoop World. This is theCUBE. We'll be right back after this short break.