Live from Las Vegas, Nevada, extracting the signal from the noise. It's theCUBE covering Informatica World 2015, brought to you by Informatica World. Now, here's John Furrier and Jeff Frick. Okay, welcome back everyone. We are live in Las Vegas for Informatica World 2015. This is SiliconANGLE's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host, Jeff Frick. Our next guest is Mark Smith, CEO and chief researcher of Ventana Research. Welcome to theCUBE. Thanks, great to be here. All right, great to get the analysts on now. We can get down and dirty and talk about the real deal. Okay, we had Sohaib on, the CEO and chairman. You know, great messaging, top-line messaging hangs together. How could it not hang together? Data is hot. But the reality is, there's a lot going on, right? So, give us your take. You know, they're going private, retooling, they're doing okay, they're not hurting, but there are challenges. What are some of the key things that Informatica and others are challenged with right now in delivering customer value in today's market? Yeah, absolutely. I mean, if you look at organizations today, they're going through a massive change in IT, right? We've got application and data architectures that are changing radically. We've got business going out and still kind of being good renegades or bad renegades, renting applications, creating new data repositories. And then you've got the poor analysts that are trying to figure out how to get the data together, do some analytics, right? So, the environments are significantly different across organizations. The politics are still pretty heated in many places. 
So, Informatica gets quite challenged in trying to figure out how do you be nice to everybody at the same time and get them to move to the new world of how we interact with applications, cloud, data, and how their tools kind of, you know- Tell us about the politics. That's something we haven't talked about yet: how deep and how nested are the politics? Meaning, are the politics governance oriented? Is it just turf wars in IT? Is it people? Is this just normal politics? Is it all of the above? What's your- Well, I think the politics are sometimes a misunderstanding of each organization, right? So, you go into any company, right? You got sales, you got marketing, you got customer service, you got finance, HR. They run their own destiny. They're held accountable for their business processes. IT is a service center. So, when IT comes out and dictates, this is how you run your business with these tools and this approach, it doesn't necessarily always mesh, right? And so, in the last three to four years, cloud has actually taken off because businesses said, I'm going to run apps. I'm going to onboard. I'm going to use tools that actually my team can use, not your team, the service organization. Yeah, telling me what to use when I don't want to use it. Same thing in the BI and analytics space, right? IT said, no, it's my way or the highway, and the reality is it didn't work, right? So, the politics are healthy tension in many cases, but not necessarily a good understanding of each other's kind of timelines. Business moves a hell of a lot faster than IT, and IT is like, we'll get to you, be patient. Yeah, and this comes back down to the philosophy of how people dealt with the data warehouse and business intelligence. It's some fenced-off data set where people send requests out, truckloads of reports come back, slow lag, and it's just a resource. Now it has to move to the front lines, if you will. That's the shift that's happening. Would you agree that's the problem? 
Oh yeah, I mean, analysts who work out in those line-of-business areas, they're responsible for analytics, they're responsible for information. And over the last five, six years, our research has found the biggest time sink for analysts doing analytics is data. It's data preparation, right? Preparing data, reviewing data, right? Data preparation's not some new hot term. I mean, these analysts have been doing this stuff just the old-fashioned way. Sometimes Ctrl-C, Ctrl-V, and spreadsheets, or actually doing their own kind of little skunkworks projects. Now we're seeing that there are actually dedicated tools to help them that they can use without having to have IT over their shoulder or IT saying, I'll get to you next month. So what's state-of-the-art right now? So for customers who say, okay, we got a lot of wasted time, we got a lot of Ctrl-C and Ctrl-V and spreadsheets, all that's going on. I want to have accessible ingest of all my data everywhere. Is it pipelining, is it connectors? What are some of the things that are hot right now that are working for customers? Well, when you get on the business side, they start from the front end of what they're trying to accomplish, right? So they're trying to actually do some discovery, exploration of their data, they're trying to gain some better insights. It's basically faster and smarter. How can they become faster and smarter? And most companies have realized that just putting out more visualization is really a dead end. I mean, I'm not sure that you've been trained on interpreting scatter plots and bubble charts, but you start looking at these things, you go like, what the hell does that mean? Right? And- I had you say that all the time. I actually was. This is like a billion data points. It's like, what am I looking at? What does that mean? You know, you click here and you click here and it's a pretty picture. What am I supposed to do? 
The practical side of what is- Are you assuming the data's right, by the way? It's not old data. Well, yeah, I mean, how do you know when there's all these pretty pictures and what you're supposed to do? So the state of the art really is that the analysts are driving new needs for data in different forms and shapes, interaction data, behavioral data, and they're looking for faster ways to bring that data into their own repository, kind of like their own departmental operational data store, right? It's kind of an operational department store of data that these analysts actually use. Now, that's actually conflicting with IT data architectures, but guess what? The business has got money. They have the priority and now they have to figure out how to interconnect those systems, which actually comes back to Informatica, right? I mean, you got to interconnect all this data, so. And then you've got a whole other batch of public and other third-party sources that you weren't even kind of integrating into the process before. Yeah, there's a lot of talk about how you bring in Twitter data, weather data, I mean, all these things, they're not that easy, right? I mean, it's a pretty, you know, how do I align that to my territory for my products during this time period? I mean, how do you dynamically bring that data in and blend it? It's not an easy task. Yeah, absolutely. Let's talk about data lakes. When we were talking earlier, you know, that term has been kicked around. Your thoughts on data lake? Well, I think that we spent a lot of time in the last two years doing these silly analogies, lakes, streams, oceans. You know, we got to get back to how do we build the right kinds of information architecture? I mean, let's get back to, you know, information systems, right? 
What we're all trying to do is have a new conversation around information systems where we have certain kinds of data repositories that have to be interconnected, they have to be able to support our applications and our analytic systems. So I don't really waste time in talking with our clients or in our research around lakes, streams, pipelines, swamps, I heard earlier. Now we agreed to what we said, yeah. Yeah, you guys came up with, you know, talking about swamps. Only to play off how ridiculous data lake is, but that's okay. People want to have a visualization of what a concept... It starts a new conversation. But customers don't talk in data lake. No, they don't. That's not their language. They're not building data lake architectures, right? I mean, you go in and meet with data management professionals. They're building data architectures, enterprise architectures. They have to interoperate with the cloud and on-premise because they have to run massive production systems. So they have to look at what technologies can do that. So I got to get your take on this. And we talk about it on theCUBE all the time. Like, you know, Hadoop has, you know, had its own little category for a while. Hortonworks just announced Samarini State. They did better than the last quarter. You got Cloudera out there doing their thing with the intelligent data hub. But there just never seemed to be an industry called big data. I mean, it's not a sector. It seems to be more horizontally spread. What's your take on, is it a market? Is it just a feature of other segments? Is it a notion of a fabric? Does it move up and down the stack? I mean, as a researcher, how are you putting that puzzle together? And what are some of your findings? So I was pretty skeptical around big data at the beginning. But, you know, as things begin to evolve, it's a good framework to have a conversation around kind of the next-generation information management architectures. 
And it's a framework to have a conversation around where are our data sources, what are our integration strategies, what are our storage strategies, what are our computational strategies? How do we access and analyze that? So it really is a next-generation data warehouse conversation, let's be honest. But it does provide a framework of a higher-velocity compute that's necessary in companies to work with their architectures. But we don't just look at Hadoop. In the last five years of our research, we've been looking at big data and we've been asking organizations, what is the primary mode of your big data strategy, right? Is it a relational database system? Is it Hadoop? Is it in-memory? So we have some in-memory technology starting to gain some more traction. A lot of companies are using flat files, you know, or maybe using SAS Institute, you know, doing their kind of, you know, proprietary data structures. The reality is it's all of them. Okay, so the reality is companies have all of them. Okay, there's no such thing as one big data technology. So, you know, these are all evolving and companies have multiple strategies, placing different bets to see where they play out. I would totally agree with that. But has the rate of data generation with all this extra stuff, right? We hear 95% of all the world's data was generated in the last two years or whatever. You know, has the rate of that and kind of the economic pressures on systems that weren't necessarily designed to hold and process that much data economically, not so much feature-wise. Is that part of what's really changing the conversation? It does, because it's all about net new projects. It's not to say that people are going in and taking their old historical time-series-based data warehouses and ripping them out, right? A lot of those are important for reporting, you know, corporate reporting type of structures, right? 
But on net new projects, you're absolutely right, because those projects have higher demands for the data they need. And they need newer technologies that can support that, which is why you find Hadoop, in-memory kinds of architectures, but you find these kinds of things blending together, right? If you go look at Oracle, IBM, Teradata, guess what? These are hybrid machines now. They got a little Hadoop, they got some in-memory, they got some relational database. Oh, by the way, they got some flat files. So we're moving to- SQL on top. We've got some SQL on top, and it's all coming together. It's SQL on Hadoop, that's a big thing. Everyone's got it. It's kind of a big deal. It's like what's next, right? And so we're moving to more of a hybrid machine kind of approach, and whether that hybrid machine is virtualized, on-premise, or in the cloud is entirely up to the customer and how they want to manage the system. And that's a cool thing, and on that note, I want to ask you. I mean, we were just joking yesterday at the IBM event. I never heard a customer on theCUBE say, I want less compute, right? You get more threads coming, POWER8, all this other stuff going on on the compute side. You got Amazon, cloud. So compute's rocking and rolling. You're going to have supercomputers, flash mobbing them, whatever. What is that going to do to the data architectures? What's your vision there in your research? How are you seeing that? Because that's going to enable some new opportunities, right? So you're going to have some horsepower to throw at analytics, you're going to have some new stuff, maybe it's machine learning, in-memory with Spark, low latency, so more horsepower is coming to the table. And where is that horsepower actually being taken advantage of, right? So we're really moving to this distributed big data architecture, right? 
Because for some companies, the big data architecture might be compute on your mobile phone that could be staged up into a cloud instance up with Amazon, which then could actually be, you know, streaming the data off of some other, you know, repository somewhere else in the cloud or on-premise. So we're really, you know, what we see is all these big data distributed architectures actually changing how companies operate. And it's not easy because they're not used to doing that. It's been more of a centralized kind of approach. Now it's a very big distributed data architecture. That's a systems management problem. It is. It's a systems compute. It's an operating system mindset. So that's a new kind of talent, right? Yes, absolutely. And that is also where the impediments are: now how do we deal with data security, right? How do we mask data across the wire? How do we actually ensure the right access rights? Because now we have data being cached. It's not disappearing. It's hanging out there somewhere. And how secure is it, right? So- And who's touching it? And who's touching it? For what workloads, right? Yeah, big data security is a big issue that we really as an industry haven't got our heads around yet. But it is a big issue that people building these big data architectures know they got to get their arms around because if they don't fix it, guess what? They're looking for a new job. Now the good thing is, those jobs are hot jobs. They may actually get a raise for screwing up. Going somewhere else, get a raise while they screw up on site. Failing up, that's the Silicon Valley way, right? You see? I'm telling you. So that's not good for the industry. 
But it is important that we actually bring that to- But it's an opportunity right now from a security standpoint to like throw all the mistakes away, do it over and build it in, because now with no perimeter it's on the data guys to own it, right? That's what you're saying. Yeah, and we're streaming the data across the internet. And we know that sometimes the data could be safer in the cloud than it is on-premise, looking at some of our big corporate data failures in the United States over the last 12 months. But we are moving data all around us, right? It's streaming across the phone, you're accessing a report. That might have my information on your phone, right? So it's a big issue. Mark, we're getting the hook here. Thanks for coming on theCUBE. We really appreciate it. Quick plug for your research. What do you guys have coming up? Yeah. What's happening for you guys? Yeah, actually this month we'll be announcing our brand new data and analytics in the cloud research. We spent the last six months benchmarking over 300 companies, looking at what they're doing on data and analytics. And we just closed up our next-gen predictive analytics benchmark. We're going to look at kind of how things are going, how we move towards that data science and rocket science activity in companies. And a URL they can go to? Go to ventanaresearch.com. All right, Mark Smith, thanks for sharing your insights and color on the industry. This is Informatica World 2015, and this is theCUBE. Sharing the data with you, broadcasting live, we'll be right back after this short break.