Live from Las Vegas, Nevada, extracting the signal from the noise. It's theCUBE covering Informatica World 2015. Brought to you by Informatica World. Now, here's John Furrier and Jeff Frick. Okay, welcome back everyone. We are live in Las Vegas. This is theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. We are on the ground for Informatica World 2015. I'm John Furrier, with my co-host, Jeff Frick. Our next guest is John Myers, Managing Research Director at Enterprise Management Associates, EMA. Welcome back to theCUBE. Thank you very much. It's been a while. It's been a while, but great to see you. So, you're analyzing, you're looking at the data, you're looking at the vendors, you're looking at all the horses on the track. What's going on with Informatica? They're looking good. I mean, they're going to go private, so we kind of saw that announcement, which is not necessarily a bad thing. We were commenting on it in the intro. I mean, Dell's taking his company private. You know, it's private. I think it's a good move. You know, come out smoking after you go private. What's your take on Informatica? Well, on that piece, I think that, you know, there's kind of a Faustian bargain if you get into the public equity markets. And I like the direction that some companies are taking by going private. It allows them more freedom, if you will, to do some of the things that they want to do, without the pressure of, I'd say, every quarter having to meet a revenue target. And you've got the shorts out there. You've got stock arbitrage. You know, it's just like a treadmill. So until we see what the structure's going to be, I don't know where Informatica will be, but I think it'll provide them with the opportunity to do some things they may not have been able to do as a publicly traded, you know, quarterly reporting company. Let's talk about their business.
What do you think of their business prospects, their current business? Obviously, they're in a great business. They're a billion-dollar company, but data warehouse intelligence is transforming. It's not going away. It's like saying storage is going away. Data's not going away, it's only growing. So what's the health of the business, and your angle? You know, in terms of Informatica, I really like what they're doing from the perspective of taking something that might have been historically a plumbing business, if you will, more of a utility-based business, and they're raising it up a level to integrating more between business and IT. If you look at what they've been doing with Springbok, now Rev, you look at what they've been doing with the MDM suite, looking at what they're doing with Secure@Source. They're taking some of those analytical concepts that we've seen for quite some time and applying them to these domains of data management. You know, you saw it in the keynotes where they had that screen that showed here are the levels of data. Here's the information that's secured in your environment. Here's the stuff that's at risk. It's no longer down in the bowels of some report on an individual system. It's been raised up to a look at your entire organization and where you're at risk, whether it be PCI, PII, things of that nature. So I like that, that they're taking a... What are the customer needs? Because obviously the customers, I mean, I agree by the way, 100%, I think that's a great strategy. I mean, move it up where it's more relevant at an application and or level. But what are the drivers from the customer standpoint? What's driving them?
So when customers, you know, if they're trying to do security on all these new systems, you know, we've talked a bit about big data, cloud, this, that, you know, if you try to get into all those different data types, those different data platforms, and do whether it be master data management or security or whatnot, at the individual point level, as I like to say, stealing a line from A Few Good Men, you'll go blind on paperwork trying to do all those things. But when you raise it up a level, customers and their IT departments are now able to say, instead of focusing on the tactical implementations, I can look at it from a service perspective and keep it all in that fashion. So for overworked IT departments, organizations that need to make better use of their resources, whether it be head count or capital expenditures, I think this gives them a great opportunity to move up as opposed to being caught in the weeds of the implementation. One of the things that Sohaib talked about is the different types of analytics, from the traditional BI all the way up through Hadoop. I wonder if you could speak to having to deal in a world with all of those. I think you outlined six: data warehouse, analytics databases, in-memory databases, cloud analytics service, open source Hadoop, and agile BI. For the enterprise, they've got stuff they've had before, they've got stuff the boss will tell them they need to put in. How are they dealing with that? Well, I think that there's a couple of things at work there. One, at EMA, we don't necessarily look at analytics on Hadoop being different than analytics that we would have done on an RDBMS. You're still doing, whether it be predictive analytics, descriptive, things of that nature. You still have that. When you get into the realm of big data on Hadoop or NoSQL, you're starting to add a whole lot of detail into what you can do.
So previously, in what you might have called traditional analytics, we were doing aggregates. Now, we kind of have loosened that constraint and we can look at all of the individual transactions or interactions of a particular customer. And I think that what Informatica has talked about is saying, no matter where the data is coming from, make it available so that you can put it into a predictive modeling engine, put it into a particular model, things of that nature. And as we get into more and more of an iterative analytical approach, being able to see results much more quickly. So if I'm a data scientist or I'm somebody who's exploring as a senior business analyst, I don't want to have to go and say, all right, I need to learn how to do Hadoop and I need to learn how to do Mongo and I need to learn to do these things. I need to say, bring me the information I need for customer, for product, and for transactions, bring that into my model and then execute the model, and not worry about where it lives. And I think that was one of the things that Sohaib talked about: make it agnostic to the analyst or the data scientist and then enable them to do it much faster, much more iteratively. And the other thing that I think is interesting, we talk a lot about schema on write versus schema on read. And Sohaib talked about self-adjusting schema, self-adjusting schematic data mappings. Kind of a hybrid of the two. Pretty interesting concept. Well, I grew up in the telecommunications industry, and if you look at, say, a call detail record, whether it be voice, data, or text, there's a lot of information on that record. Some of it applies to billing, some of it applies to network, some of it applies to the marketing department, and each of those groups is going to want different components. Being able to say, let's not throw the whole thing at them, but give them the components that they need on an as-needed basis, makes a lot of sense.
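The call detail record point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Informatica's implementation: the field names, the department-to-field mapping, and the `project_record` helper are all hypothetical, chosen to show each group reading just the components it needs from one raw record.

```python
# Hypothetical call detail record (CDR); every field name is illustrative.
RAW_CDR = {
    "caller": "+15550100",
    "callee": "+15550199",
    "duration_sec": 182,
    "rate_plan": "prepaid-basic",
    "cell_tower_id": "LV-0421",
    "dropped": False,
    "handset_model": "X100",
}

# Assumed mapping: the fields each department wants to see.
DEPARTMENT_VIEWS = {
    "billing": ["caller", "duration_sec", "rate_plan"],
    "network": ["cell_tower_id", "dropped", "duration_sec"],
    "marketing": ["caller", "rate_plan", "handset_model"],
}


def project_record(record, department):
    """Return only the fields the given department asked for.

    Fields absent from the record are skipped instead of raising,
    so the view is applied when the record is read, not when it is stored.
    """
    wanted = DEPARTMENT_VIEWS[department]
    return {name: record[name] for name in wanted if name in record}


if __name__ == "__main__":
    for dept in DEPARTMENT_VIEWS:
        print(dept, project_record(RAW_CDR, dept))
```

The raw record is stored once and each department's view is a read-time projection, so adding a new consumer only means adding a new entry to the mapping.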
So I've got to ask you about best practices. So as customers are out there using some of the data as they move up from the bowels and the plumbing, what are some good use cases that you've seen with customers on the cutting edge? We talked about data lakes prior, just kicking this off, but you've got data lakes out there, what we call data oceans, on more of the fast, you know, multi-dimensional data sets out there where there's internet of things, real time. So you've got a lot of dynamics in the enterprise. What are some best practices that you've seen? Well, I think that one of the best practices that I've seen, that's been backed up by the research we've done over the last couple of years, is that organizations aren't just dealing with a single data set. That's why I say it doesn't really matter if it's on Hadoop or if it's on an RDBMS or it's on a Mongo or a Cassandra. You've got data and you need to be able to pull these things together. The next piece is understanding how to provide the context to the data. Our data warehouses have oftentimes gotten a bad reputation as being old, stodgy, kind of calcified, things of that nature, but they have some of the best sources of information on customers, products, and transactions that we have. What we need to do is provide the context for the information we now have from big data sources and make that work. Some organizations are saying, let's purify the transaction layer from Hadoop and move that to the data warehouse. Some organizations are saying, let's replicate our gold-plated data from the EDW in. Some of the best practices are about how you manage where that information is going to be and how you're going to use it. So I've got to ask the research question. You've got to be doing the research. You're talking to customers. We love the term engagement.
I mean, IBM uses the terms systems of engagement and systems of record, which is a great way to kind of get your mind around it: record is stored data, engagement is active data. So the data's changed. You've got passive data. You've got activity. You've got social data. What does your research tell you from a customer standpoint? What does engagement data mean? I mean, is it like us engaging on Twitter? Is it like retail? Is it omnichannel? I might take a slightly different approach to that. What we're seeing in our research is that organizations are doing workloads, whether they be analytical or operational, that relate to increasing revenues or lowering costs. So top-line revenue, bottom-line margin, those are the things that organizations are focused on. How those things engage, if I'm looking at, say, social data, if I can start to mine information that says these are the people using my products, how do we engage with them and engage with them better? That is where organizations are going to go, but it needs to be tied to revenues or costs. And then you mentioned earlier that context is really a big part of that. I mean, contextual relevance is really the key. Well, it's one thing to know all the clicks that came through your website or how many GPS locations a mobile app has passed by, et cetera. But if you don't understand that your store was sitting in front of somebody as they were using a mobile app as they went by, you lose the context of what you're trying to do. Or understand that these transactions were made by someone who's traditionally a bricks-and-mortar customer, and they're starting to move to our online group. How do we optimize the experience and how do we make it work? You know, if you think about it, it's interesting, this age of engagement. Productivity is the old way, engagement's the new way, because you bring in engagement data, activity. You get persona, first party, you get relevance, you get context, and other cool things.
And ultimately what you get to is the holy grail for the data business, which is that the entire enterprise is connected. So if everyone's connected, the thesis should be that there should be 100% accountability for everything. If you can measure everything. 100% for everything, that's big. If you can measure everything. But I don't think we can measure everything. We used to talk in terms of the 360 degree view of the customer inside of our firewall. And now we have social data, we have Facebook information or things of that nature that give us an even broader 360 degree view, or maybe deeper. But I don't think we can measure everything. I don't think we're quite there yet in terms of processing power and storage and things of that nature. But you raise an excellent point, that if we can measure all these components, we'll have great accountability. But I think there's also that ingenuity or that imagination: what are the things we should be measuring? Should we be measuring click-through? If you look at the evolution, say, for an online application, we started at what was purchased, then we saw what was abandoned in the shopping cart. Now we're looking at what people are looking at as they are putting items into or out of the shopping cart. And now we're looking at where they are doing those activities as they're looking at it, so showrooming at a physical store. So we keep making those leaps, whereas before someone wanted to say, why would I want to know the geographic location of that mobile app? And you go, well, if they're sitting in the middle of one of your competitor's stores, now you've raised the attention of the marketing guys. So geospatial, again, I've raised that over-the-top question, I mean, to kind of make the highlight: okay, if this is possible down the road, it changes the management styles of the executives. And you've got geospatial data, you've got virtual geospatial data.
So if that's the case, what are the management practices that are transformed? So it's a selection of the data. So what do I look at first? Do I look at the customer service data, look at the abandoned shopping cart, customer satisfaction, manufacturing? I don't know if we're changing the practices. I think that the patterns that successful businesses have used to disrupt industries and to find opportunities are still going to be what we're going to use. However, what we're going to see change are the physical implementations of those. We used to think about location as being a point of sale system associated with a purchase. Now we're thinking about location in terms of where the physical GPS position of the object is. So I think that those executives who are imaginative in what they want to do and how they want to achieve it are going to be the ones who are going to go, let me look deeper into the data or into more dimensions of the data. But every business is different. I mean, again, I talked about it, I came from a telco business. If you're selling data plans versus if you're selling something else on a prepaid, you have the same types of data. The question is how do you use it to meet your business plan? The other kind of marker on that journey that you've talked about a lot is predictive versus prescriptive. From your experience out in the field and the research you've done, who are the people that are furthest along on really the prescriptive, or have we still got a ways to go there? Well, I think that when we look at the analytics, there's a lot of education that needs to be done. I hate to say it, when I got my MBA training I learned how to do linear regressions on Minitab. I probably wouldn't be allowed in a statistics classroom these days, but we have a lot of management types that maybe still think in those terms. They've been presented with a Naive Bayes model and they go, well, tell me how it matches to the coefficients of a linear regression.
You're like, no, it doesn't do that. So we have to first let people understand what these models are doing, get them appreciative of where they are. They don't need to get down into the math, if you will. But I think we're seeing a lot of organizations that are asking that next question. Not who are my top 10 customers, but who will be my top 10 customers, or what are the attributes that my top 10 customers have, so that we can look for patterns that allow us to do that. Again, I see that being the more proactive organizations, organizations that are, say, in manufacturing or logistics, trying to project out where they're going to see their lines going. For people that are doing sales, how do they view what you might call the leading indicators of economic data, and how does that impact their business? So I think that's where they're going to go. I think we're still going to see a lot of descriptive, but in terms of predictive, people are growing into it and they're going to start using it once they understand it or break free of asking, is this a linear regression? It's a combination of operational discipline and mindset, right? That's what you're basically saying, they're getting their arms around how to use it and then how to think about it. EMA did a research study in the fall where we asked about data-driven organizations and how deep people want to drive that. One of the impediments wasn't how people use business intelligence, it was how they make decisions with analytics. So they're going beyond charts and graphs to, how do I utilize data from advanced analytical models to be better?
Talk about the horses on the track, the different vendors. You've got Informatica doing well, you've got other companies out there in this landscape we're in now. In your opinion and your research, what are customers looking for and what do the successful vendors need to look like? What do they need to do to be successful, what kinds of business models, what kinds of products, on the consumption side, pricing? We heard Debbie talk about cloud-friendly. So organizations are looking for time to not just heartbeat or implementation, but time to value. So organizations that can enable them to make it from not just spinning something up in the cloud, because anybody can swipe a credit card and get an environment up and running, but how long does it take for that environment to start providing value to your organization, whether it be analytics, operational data, or master metadata, things we've talked about. So they need to talk not in terms of time to provisioning but time to value. Organizations also need to understand how we mix the different data sources and the different data locations. You've talked about cloud, I've talked about Hadoop. They need to look at data as a resource, not, okay, we'll set you up with your on-premises structured data and we'll take care of that and we'll be really good. A frictionless resource too, and that would essentially move around a lot. Yes, and so if we spin up a new cloud environment, or a company that we acquired has salesforce.com instead of an in-house CRM system, they need to be able to say, how do we link these together, without saying, all right, get rid of the Salesforce implementation to go to this other one. They need to be mindful of mixing both on-prem and cloud implementations, and not just cloud but private and public cloud resources.
And the pattern recognition that we all like to go through, and successful startups, okay, because the startups are out there, right? So what are the hot startups and what do the disruptors look like? I mean, Informatica is potentially being disrupted and/or they're disrupting. Now they're going to go private, they'll be a disruptor, that's the hope, right? But underneath that you've got a lot of venture-backed startups. What's your take on the startup community? You know, of the startups that are out there, the ones that I'm most impressed with are the ones that are taking a much more business-oriented or analytical view towards data. So I've seen this in some of the data wrangling organizations, some of the people who are not taking an approach that, hey, we're going to give you a command line and you're going to be working with data at a very technical level. The ones that are successful are the ones that are raising it up and saying, we're going to give you a palette or a canvas, if you will. You bring the data sources on, you kind of show us how you want to do those types of things. You want to do data quality, we'll give you an automatic histogram that says these are the data attributes that are in there. One good example is if you expect it to be a gender column and you open it up and go, I have 57 different values in here, you might go, I have an issue with data quality.
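The automatic histogram idea can be illustrated with a short Python sketch. It is a minimal example under assumptions, not any vendor's tool: the `profile_column` helper, the sample values, and the threshold of three expected values are all hypothetical, and the right threshold depends on what the business considers legitimate.

```python
from collections import Counter


def profile_column(values, max_expected_distinct):
    """Count each distinct value and flag the column if there are too many.

    max_expected_distinct is an assumed, business-supplied threshold.
    """
    counts = Counter(values)
    return {
        "distinct": len(counts),
        "histogram": counts.most_common(),  # most frequent values first
        "suspect": len(counts) > max_expected_distinct,
    }


if __name__ == "__main__":
    # Made-up sample of a messy gender column.
    gender = ["F", "M", "f", "Male", "female", "M", "F", "unknown", "", "M"]
    report = profile_column(gender, max_expected_distinct=3)
    print("distinct values:", report["distinct"])
    for value, count in report["histogram"]:
        print(f"{value!r:>10} {'#' * count} ({count})")
    if report["suspect"]:
        print("more distinct values than expected; check data quality")
```

The profile only surfaces the distribution; deciding whether a value belongs is left to someone who knows the data, which is the point being made about business analysts.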
There are certain groups, I think Social Security has got up to 13 different identifications for gender, but they all have very specific reasons, so you might say, okay, this makes sense. But having to do a profile of a column on the command line, that's not where organizations are going. They're trying to say, let's get our business analysts involved, because they know what's inside of these data sources, and they go, yeah, that one doesn't make sense, or I shouldn't see values greater than 15 digits in a phone number, even for an international plan, things of that nature. So letting the people who have an idea of what the business value is do that, that's where the disruptive startups are going, and saying, let's give those tools to people. Creativity is the key in there, right? Definitely. People should not be locked in to some syntax or database structure. All right, John Myers, thanks for joining us on theCUBE for the analysis here, breaking it down with great insight on the marketplace, Informatica, and what the hot trends are. It's theCUBE, extracting the signal from the noise, sharing that with you. We'll be right back after this short break.