Okay, we're back here live inside theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE.com, and I'm joined by my co-host. I'm Dave Vellante, Wikibon.org. We're here with Anjul Bambri of IBM. She's the vice president of Big Data and Streams. Anjul, welcome back to theCUBE. We talked to you at IOD. Great to see you again. Happy to be here. IBM, number one in Big Data. Wikibon just released the Big Data report and IBM, not surprisingly, came out number one. A big chunk of that was, of course, services, but you've got a big portfolio and you're really attacking this space with a vengeance. We are excited about it. With Big Data, as you can see here at Strata, there's a lot of excitement and a lot of really good talks, and it's really nice to be a part of this. We've obviously been doing data management for a long time, and to expand to Big Data was just a natural thing for us. So we deal with small data and big data, and get you all the information that you need at your fingertips. So we made the comment at IOD that essentially you've taken this massive portfolio and put it together in a way that communicates to the world, and then tied it into the Big Data theme very nicely. So talk about the strategy and what's new since we saw you last. Give us a glimpse as to what you're trying to achieve with your customer base, whether it's by vertical or broadly. Sure. When we talked the last time, there was obviously a lot of experimentation going on in the Big Data space, where companies were really trying to understand what additional data sources they need to tap into, what the value is, and what insights they can gain from this data. And over the last, I would say, about 12 to 18 months, we are really starting to see things settle down, very broadly, around I would say five use cases.
I'm not saying that there are not more than this, but this is where we are seeing a good concentration happen. The first is all around data exploration, where enterprises are trying to just explore the data and see what else they should be bringing into their decision-making process. And in order to do that, they have to have a platform where they can bring in all kinds of data. Since they don't know what the value is in this data, it has to be a platform that's scalable, and it has to be a platform that is cost-effective for them to put all of this data in. They have to be able to hit the ground running; that is, as the data lands there, they have to be able to explore it and see what's of value, what's not, and what this data is telling them. So data exploration is what we are now seeing a lot of the enterprises get started with. And once they explore, a lot of the use cases are really centered around where the line of business wants to see how to understand their customers better. So it's all around customer outcomes, right? What do their customers think about their brand? What are they thinking about? What's the sentiment around the products that these companies are offering? What's the next best action? How do they service their customers better? So it's customer-centric outcomes. And in order to do that, companies need a 360-degree view of the customer. So that's sort of the second use case, where companies are trying to understand all aspects of their customer. And that can't be done with just the data that they have about their customers inside the enterprise. In a lot of cases, they have to bring in data from outside the enterprise.
And so in that sense, a third kind of use case that is starting to really emerge is all around extending the warehouse to not just be limited to structured data, but to bring data into this next generation of the warehouse, which is not about fixed schema, but about data which doesn't have a schema or data where the schema is evolving. So it's all around augmenting the warehouse or extending the warehouse with these new sources of data, but not in a way that the data has to be moved around. Data's going to live where it's going to live. You just have to be able to extract information and analyze this data from all of the sources and stitch it together, so that the information is not just coming from one silo, but really coming from all the different sources without moving the data. Then, of course, security is another area, both in terms of network security as well as cyber security. So that's certainly an area where there are lots of deployments happening. And then last but not least is operational analysis of machine data, log data, and sensor data, being able to analyze that really quickly. So, like I said, it's not limited to these, but from the use cases, you see things kind of converging on these five. Yeah, and you cited these five: data exploration, customer analysis, augmenting or extending the data warehouse, security, operational analysis, and then machine data. Actually, the fifth is operational analysis, like of machine data. Those are broad, sort of horizontal use cases. IBM has a real emphasis on industry as well. How's that going? Are there any industries that are particularly savvy or picking up this space, and your products and your services, more rapidly than others?
Yeah, I think we are starting to see that every industry has this big data challenge, and they see that gaining insights from all data really quickly is important. So when it comes to, say, Telco, they want to analyze call data records. The volume of call data records now is such that they could be dealing with 10 billion call data records every day. So the volume is really high. The records are coming at them really quickly, and there's a lot of information in there which, if they can get to it in a timely manner, and they are getting to it, lets them prevent customer churn, for example. So there's very quick adoption from the Telco standpoint. Same thing when you look at the government, right? Whether it's the US or outside, there is a huge amount of big data sitting in the government, which has to be analyzed for different purposes. Then again, when you take this to the retail industry, right? I mean, they have to understand their customers better than their competitors do. And so retail is another example. Then again, financial industry, insurance. I mean, it's everywhere. Let's just talk about commerce: e-commerce, or Smarter Commerce, or Smarter Planet. Smarter Commerce, I guess, would be about the retail side. So you've got predictive analytics, and you want customer satisfaction in retail. Personalization is a big thing. But you also have customer complaints and customer satisfaction. So there was a thing on NPR just this week about IBM and the Cheesecake Factory using big data analytics to identify and predict customer bliss. Can you comment on that? I'm assuming that's obviously big data stuff that you guys are working on. How do you guys look at that? How do you make that happen? Is it just all IBM systems? Is it different data sets? How do you bring in all the diversity of data? That's an excellent question.
So when you're looking at customers and next best action or customer sentiment, that's really not where you can just look at one silo of data. It becomes really important that you're looking at your customer data, which might be in your warehouses. It may be in your CRM systems. It may be in your master data management systems. So there's obviously a lot of that that's inside the enterprise. It may be sitting in your call centers. And then there is data which is outside the enterprise that has to be brought in and integrated. So you want to build richer customer profiles, and you have to do that in an ongoing manner, right? It's not sufficient to say that, based on the information that's in the enterprise, here is the customer profile. You have to constantly keep augmenting that as you learn more about the customer. It could be based on, you know... Is that a Watson-like product? I mean, I can see Watson. What would you like with your table? I mean, automation is a big part of that learning, right? The learning machines, right? So this is where capabilities like machine learning and text analytics become really important, because you're trying to put together all these different aspects of a customer, and you are able to teach the system how to become kind of smarter about, say, your customers, and to really predict what the next best action is for that customer, even understanding things like the personality or the psychology. A lot can be deciphered from just text data. So machines need to be programmed by humans, and maybe by themselves; some compiler technology, maybe in the future, could do things like that and be adaptive code. But we've been tossing out a term here on theCUBE yesterday called data as code. It kind of builds on the concept of infrastructure as code in the cloud, which is a cloud theme, where data is part of the development process.
So what's your point of view on the developer market? Because developers have to build these new apps, not just data warehousing and business intelligence, which is a known market. There are people who do that stuff, and that's great, but the world's about application development and doing things like trying to set up an infrastructure that can support new questions, the right questions, the right answers. Absolutely. So there's obviously a set of tools that we have brought to market to help the line of business explore this data, ask questions, and even figure out what questions to be asking. And some of our customers say that they want to build something like a Facebook for data, right? Where you want to bring in data from all kinds of sources and be able to see it in the right context. Search is another important use case of big data, right? Being able to have a search kind of paradigm and pull out information from all kinds of sources in the right context. And you can't expect the line of business to be learning programming languages to be doing this, right? This has to be easy. This has to be something where they can ask the question and they get the answer. So we have something called Data Explorer that allows you to do that. Now, when you look at another class of users here, like the data scientists, right, they are trying to do historical analysis and build better predictive models, because now there are more attributes that maybe they have to bring into the picture in the model that they're building. We have something called BigSheets that is an excellent paradigm for data scientists to really see what additional dimensions and attributes they should be adding to their models, and to test that against a large set of data and see, is their model a good fit, right?
And then there are, of course, things like SPSS, and R is a popular choice that a lot of people use for building these. Data scientists love that, right? Yeah, yeah. So we are certainly making it easy for people to do these analytics on petabytes of data, right? So it's not just about doing analytics on small data, but on big data. We did a tweet chat earlier this week. We participated in one with IBM, and the theme was big data management. One of the questions that came up was whether a lot of the principles in the enterprise around data management, you know, data quality and governance and the like, apply in big data. And I think the consensus was, yes, absolutely. It's just harder to apply them. It's different. You have to apply them in different ways. I wonder if you could comment on that. Absolutely, yep, yep. So when you look at all these different types of data that the enterprises are trying to bring in, especially social data, there's a lot of noise in that data. Not everything is relevant to a business, and being able to filter and really get what they care about is important. And as things go into production mode, enterprises have to be confident about the quality of the data and be able to look at the lineage of the data, you know, where it is coming from. And the same principles of data governance that apply to the structured world definitely apply to big data, right? So it's all about the metadata, the governance, the lineage, and of course, IBM's offerings in this space that do this for structured data are all being extended and enhanced for semi-structured and unstructured data as well. Excellent.
All right, Anjul, well, we're out of time, but we really appreciate you coming by theCUBE and sharing your insights with us, and we'll see you around the events, I'm sure, both virtually and live, so thank you again. Thank you. Keep it right there, we'll be right back with our next guest. This is SiliconANGLE's theCUBE, live from Strata Conference in Santa Clara, California. Thank you.