Live from the Hilton at Bonnet Creek, Orlando, Florida, extracting the signal from the noise, it's theCUBE, covering Vision 2015. Brought to you by IBM, and now your hosts, Dave Vellante and Jeff Frick.
Welcome back to Orlando everybody, this is Dave Vellante with Jeff Frick. We're here at IBM Vision, this is theCUBE. Check out ibmvisiongo.com, it's our interactive digital experience that we've created around this event. Michael Curry is here, he's the VP of Engineering for the analytics solutions at IBM. Michael, welcome to theCUBE.
Thanks, thanks for having me.
Another Boston boy, Bruins fan, we love it. From New York, but it's good to have you. So, well, let's get into it. So, we were just talking to your colleagues about the analytics business, performance management. You run the engineering side.
That's correct.
So, talk about your role, and we'll get into it further.
Yeah, so I run engineering for analytics solutions. That includes the financial performance management, the sales performance management, our risk and GRC platforms, and then some of our industry solution areas like predictive customer intelligence and predictive maintenance and quality. Some of the things like our Twitter data feeds and all kinds of stuff like that. So, a really interesting set of technologies that I preside over.
Well, you've got a lot of challenges. You've got a vast portfolio. You've got new technologies, you've got older technologies, you've got a big customer base, you've got this new Twitter thing going on. Where are the priorities? Is it integration? Is it new function? I mean, it's yes, yes, yes, and yes. So, how do you handle all that? Talk about the priorities first.
Yeah, it's interesting. You know, this space in general, analytics, is an area where we have a nice positioning. IBM is kind of a leader in this. Has been for a while. We've acquired some great companies in the space.
So, having the base that we have, both the install base and the technology base, is a fantastic position to be in. But it's changing very quickly, right? There's a lot of pressure. There's a lot of new vendors arising. There's new technologies arising. You saw the rise of the big data technologies over the last five, six, seven years. And as that's happened, it's put a lot of pressure on kind of your core platforms. And you have to be able to adjust to those changes. A lot of those changes are related to the cloud. So we are really rapidly moving to cloud-centric technologies. It's not so much just from the perspective that customers want to be on the cloud necessarily. A lot of the customer base does, and we think that will increase over time. But a lot of it also has to do with the fact that cloud-centric technologies allow you to move at a faster pace. So it's really about agility. It's about how quickly can you innovate? How quickly can you introduce new capabilities into your product lines? How quickly can you allow your customers to adjust to new capabilities and new paths that they could take with your technology? And how quickly can you incorporate new concepts into what you're doing? So the Twitter data, as an example: taking Twitter data, or taking weather data, which is another thing that we deal with, and starting to inject it to make our predictive models better, or inject it into how we do things around optimization. It's a very interesting space to see how all those things need to converge together very quickly. And cloud technology really does help you to do that.
So that was a great setup. So unfortunately we only have 20 minutes. But let me start actually with the Twitter data. So we had Nate Silver on three years ago and I asked Nate what he thought about the predictive capabilities of things like Twitter. And he said, now granted, this was three years ago, he said the data's not there. Now Nate is used to polling and doing statistical analysis.
He said the Twitter data is too raw. It's not there yet. The quality is not there yet. Has that changed?
Oh yeah. It's amazing how, you know, Twitter data like anything else has its quality issues. I mean, you're going to have things that will mislead you in Twitter data just like anything else. But it's such a big sample set. It's such a huge amount of data that you can really learn things. I'll give you an interesting story that kind of combines together both the Twitter and the weather data. I was at The Weather Company, which is one of our partners that we've done a joint agreement with, and we were looking at weather data and we were looking at Twitter data. And it was interesting that they were actually able to see the weather patterns on Twitter before they saw them on their weather sensors, because people would start tweeting about rain or some kind of weather event. And then all of a sudden they'd get their kind of delayed reads from their weather stations and sure enough, there's rain there. So it's really interesting to see what a great predictor Twitter is. And a lot of people think of Twitter data as simply a sentiment analyzer. And it's a very good sentiment analyzer, but it's very good at other things as well. It's really become the fastest way for people to convey information that we have in our society today. And it's shocking to think about all of the things you can do with that and how you can begin to combine that with other business data and really start to improve the way you make decisions, the things you can understand about your customer base, your products, the effectiveness of different kinds of activities that you do in the market.
So Michael, a lot of the data that you've historically dealt with is very structured, systems of record, so to speak. And all of a sudden you have this huge influx of things like Twitter data, weather data, so-called systems of engagement.
How has that technology shift affected your engineering resources and activities?
It's been a big focus for us, right? It's clearly an area, most of the world is unstructured data and there's an incredible amount of value in that unstructured information. And so when we look at it, and unstructured is one of those terms that drives people crazy because there's really no such thing as unstructured information. Everything has some sort of structure to it, right? But the idea is-
Non-rigid.
Yeah, but the idea is you need to parse through it and understand what it is. And I think a lot of our focus for Watson has really been on that cognitive understanding of human language, right? What is this actually saying? Interestingly, the agreement that we announced this morning with Deloitte around the regulatory compliance and controls system, that's really about parsing through unstructured information that comes in through these new regulations and being able to understand what does that mean? What kind of controls do I need to put in place around that? Things like that are incredibly valuable in terms of speeding up the recognition of what things mean to businesses. And since a vast majority of information that's created is created without thinking about putting it into some kind of a structured format, the faster you can react to that, the faster you can get the value out of it.
I thought Tom in the keynote from Deloitte, Tom Scampion, had a really succinct slide talking about the big changes in data, and obviously data powers the analytics. So you already talked about structured versus unstructured, but he had a couple of others. Data at rest versus data in motion. Very different way of thinking about data. And then inside data versus outside data. And really moving from a pure reliance on the inside data to the outside data.
Talk about these tectonic shifts and really the data sources and availability and the ways you manage them and then applying that into the analytics.
It was interesting to see that, because I saw that slide yesterday in some of the practice sessions and I was thinking about it. I'm like, wow, this really isn't just about compliance, which was what that particular session was about, but you can really apply that to everything. That is what's happening, right? The fact is that we used to think about data very differently. Not only did we think mostly about structured information that was kind of contained in these systems that we were managing, and everything else we sort of ignored, but we really just looked at what was in our four walls. And I think if you look back even 15, 20 years ago, there were some exceptions to this, but the main companies that were taking external data were financial services companies, because there was an incentive around it. They had the ability to kind of take in these feeds that were coming in from, at the time, the Telerates and the Reuters and all those kind of guys sending in that data, and they were using that data to make decisions. And I think what's happened now is the aperture for that has opened up so that almost every company is looking at that external data. So the fact that you need to now deal with structured and unstructured data, and data that really can come from anywhere, is really changing the emphasis that companies are putting on analytics, and it's creating new opportunities and allowing people to do things that you never could have thought of doing before. And that's what's really exciting about this space. It's just that there's so much potential, and every day I hear a new story that's just amazing about how people are using information like that to completely change the way they do things.
Everybody talks about sort of digitizing their business, the digital economy, the API economy. How does that affect your engineering priorities and how you apply resources?
Well, I mean, you know, most of our focus is really on that. It's really on that digitization and trying to create relevance out of information that might be circumstantial. It may be data exhaust even. It may be things that come out of the back end of some process that really is important to the business, but being able to make sense of that, find what's relevant about that and bring it to the business user is really what we focus on. And whether it's in the internet of things space or financial planning, there's really interesting data that is, in most cases, just not being used. A vast, vast majority of it's not being used at all. And the smartest companies and the companies that are outperforming their peers are figuring out how to use that data faster than their competition. Even in our space, you look at the companies that are doing really well in the technology sector. They're the ones that are figuring out how to use their own data and apply it in new ways. Either sell it or apply it to improve the way they do things, the way they sell, the way they market, the way they build products, all of that. It's a very interesting phenomenon to watch across all industries right now.
How about, you mentioned big data before, Hadoop, sort of this new thing. How about this concept that data is everywhere? Ship the five megabytes of code to a petabyte of data. How has that changed your work? Because people think of your world as largely centralized and kind of command and control.
Yeah, it's changed dramatically.
I mean, in fact, both the aspect of data coming from everywhere and just the economics of the amount of data that you have to deal with have made it pretty much impossible for people to stay in a completely on-prem kind of installation mode where they do everything and pull everything into a central location and try to process it locally. There are some exceptions to that, but for the most part, people are starting to really take advantage of cloud infrastructure for that. Hadoop taught us that you can create a scalable cloud infrastructure that can process data very economically and that you could take commodity hardware and do almost anything. Now, new technologies like Spark are really starting to bring that same sort of paradigm into real-time processing of information. So to us, those are the tools of the future that are helping us to help companies take advantage of this stuff. Hadoop, we've been on that train for a while now, and now we're really investing heavily in Spark. We see that as a very big differentiator for companies to be able to process that same kind of data in real time and use the same kind of programming paradigms that they use for Hadoop.
So kind of the in-memory analog to BLU Acceleration, is that right? Is that the way to think about it?
Yeah, yeah, and in some ways, they're very similar. Yeah, I mean, I think it's all in memory, it's distributed processing, and what's good about Spark is that if you're familiar with Hadoop, then you probably have a pretty good chance of being able to figure out Spark pretty quickly, and there's so much energy going into it right now that it supports different programming languages. Like I said, the model's pretty familiar and it allows you to transition from batch-oriented processing to streaming processing and do it at a speed you just can't achieve with anything else in the market today.
One of the challenges as a software engineer is taking a code base that has come about from acquisitions, a lot of organic development, rather large, and then accommodating these sort of modern technologies that we've been talking about. What are the challenges and how is IBM meeting those?
Well, it makes you change fast, right? I mean, this is the problem, right? You used to be able to say, okay, well we're going to be on RDBMS technology and 10 years later you're still on the same RDBMS technology. It doesn't happen that way anymore, right? The technology's changing so fast, and it's a great thing actually because it's created this innovation model that has moved technology faster in the last three years than it moved in the 10 years prior. But with that comes a challenge, because as you start to build out these systems, whether it's our own internal software or our customers implementing our software and other things, you have to be open to working with a broader ecosystem of technology bases. A lot of it comes from open source. Some of it comes from the vendor community, some of it's produced by your own customers. And so I think IBM has come to the realization over the last five years that it's not really all about us providing every end-to-end piece of the technology portfolio, right? It's really about us living in an ecosystem, and how do we create stuff that facilitates adoption of many different kinds of new technologies, that's flexible enough to change very quickly and that allows customers to adopt kind of innovative cutting-edge stuff at a faster rate. And that to me has been one of the big changes that we've seen, and we've had to really adjust to it internally. We update our products constantly, we embrace open source like you wouldn't believe. In fact, we're one of the largest contributors to open source across the board. I've put a lot of things in my own portfolio into the open source community.
So we're much more deeply ingrained into the developer community and into the open source community than we ever were in the first several years that I was here. So that's been an interesting and I think a refreshing change.
Well, and IBM's always had strong open source, you know, Eclipse, Linux.
Yep.
You kind of started that within sort of the enterprise company base.
Absolutely.
But architecture matters, right? So how do you deal with that complexity? Is it a layer you put in? Is it just that the software was originally designed to be flexible? I mean, going from a client-server era to an internet era to a cloud era, how do you deal with that? Do you just get lucky?
Yeah, it's definitely a challenge, right? I mean, as you move to this sort of cloud infrastructure, you know, moving to n-tier was one thing, because you kind of moved away from just having one central location to now maybe spreading this thing across commodity servers that may be located in many different locations, and you want to take advantage of the high availability and load balancing across all of that. And that became a challenge. With sort of the cloud infrastructure and what we see now, it introduces the ideas of multi-tenancy and, you know, different levels of scalability, the level of auto-scaling that you can build into things now for resiliency. All of that stuff is extraordinarily exciting, but it's also challenging because you have to now figure out how to retrofit that into everything you do. So it doesn't happen overnight. Look, you know, we've got a lot of technologies. We've got a long journey to go through to get all the way there, and where possible we're leapfrogging. We don't want to just say, okay, I'm going to take this thing that I've had for 20 years and somehow pretend it's going to be cloud, call it cloud and slap it into the cloud and try to make it work there.
You know, we may do that for some period of time while we transition, but then we look for the new technologies that can help us to move faster to that next state, to that cloud-centric state. And that allows us to begin to adopt technologies and pull them into the core at a faster pace and really not have to worry about dragging all of our legacy with us. But at the same time, you know, we do have a lot of customers that are in the old world. So one of our responsibilities as IBM is to help customers through that transition. We can't just leave a bunch of people and say, okay, we're going to go off to the new thing, see you later. We have to help them to transition with us. And that's that bridge.
So it's not cloud-washing, folks, it's called the bridge.
That's right, yeah, there you go, exactly.
What about, you mentioned multi-tenancy. There's a debate, well, not a debate, there's a discussion going on in the industry of multi-tenancy versus multi-instance. Some people say multi-tenancy is bad, others say it's good. You brought it up. What's your take on multi-tenancy? I'm a client, I'm nervous about noisy neighbors. I'm nervous about security. Of course, you have these discussions all the time with people. Convince me that I'm safe.
It's a simple cost equation, really. I mean, you're pretty safe these days. You certainly have noisy neighbor issues anywhere you go. And noisy neighbor issues can be overcome with really good auto-scaling, but generally speaking, you're going to have them no matter what technology base you're on if you're in a multi-tenant type environment. And this happens across anybody's cloud infrastructure and anything that you're running on. So noisy neighbor is probably one of those things that, yeah, okay, that one, I'll give you. From a security perspective, you know, there are security technologies in place now that are going to make it pretty safe out there.
Now, if you're dealing with stuff where, you know, you can't even have it running on anybody else's equipment because you have to have a guard standing over it or something all the time, then yeah, you're not going to be able to run in that type of environment. But you know, I think most of the market has gotten over most of the security concerns associated with multi-tenancy. So it really comes down to the noisy neighbor thing most of all. But you know, customers really don't want multi-tenancy. I mean, that's not what they want, but the vendors all need it because that gives you an economy that allows you to run this cloud infrastructure at a very low cost. And so customers like it because their costs are low, but they don't necessarily want to share the infrastructure. So we really provide the options around that. We allow people to isolate their environments. We allow people to have multi-tenant environments where we support them. We even allow people to run in a hybrid mode, because a lot of times if you do have a security concern where you really just cannot have this running in the cloud, for whatever reason, you may want to still have that stuff running on-prem. So you want to have the ability to run some stuff on-prem and some stuff in the cloud.
So you let the customer make that choice and say, you want to pay for it?
That's right. You want to pay for it. Yep, exactly. And that's part of our philosophy, to help customers in that journey, right? It's a journey that they're going to step through over years.
Michael, what kind of technologies are you looking at that excite you? You mentioned Spark.
Yep.
That's obviously something that's exciting.
Yeah.
You know, the real-time-ness of Spark. Other things that you're looking at that you could see coming into this?
Yeah, so we do, you know, it's interesting. There's a lot of open-source stuff now. We're doing a lot of work with Spark. We do a lot of work with Kafka.
Kafka is a project that's more in the messaging space. It came out of LinkedIn. Spark is an area where we're making a very heavy investment. We opened up a Spark center out in the San Francisco area. So there's a lot of open-source technologies out there. I think, you know, when I start to think about technology bases that I'm really interested in, I'm really interested right now in a lot of the stuff around building analytics and the communities that have built up around that. So we've been doing a lot of work around Python and R, doing a lot of investment around enabling our technologies to support R and Python models. We've been doing a lot of stuff around IPython notebooks, things like Jupyter and stuff like that, that allow people, data scientists really, to be able to work in a constrained environment where they can start to, you know, build out models, actually execute code, share and interchange these models, and then be able to have those ported over and run directly in a Spark cluster. I mean, those types of things are what I see as sort of the future for the data science world. And then of course, you know, a lot of what we're trying to do is translate that back into a set of capabilities that a non-data scientist can actually use. So take those same models, pull them out and plug them into something like Watson Analytics that allows a relatively non-technical person, you know, maybe a power user or a line of business person, to be able to run the same kinds of statistical analytics or optimization analytics in an environment that is very non-intimidating, and they can just play with their data, see the impact of those models and understand it in business language.
I love it. A head of engineering for analytics at the company that owns SPSS talking about investing in R.
Absolutely.
And we were talking earlier about how companies like IBM are learning to transition.
And one of those big, you know, learnings is don't try to protect the past from the future, embrace the future.
Yeah, I mean, we obviously firmly believe in SPSS. SPSS offers a different set of value propositions to many customers, but there's a huge community out there around R and Python as well. So our philosophy has become really about embracing those technologies. You know, bring the best of our capabilities around how enterprises want to consume that technology, which we're very good at, and help them to be able to ingest those open source technologies in a way that doesn't break them, right? I can't just say, let's say, I'm going to be open to every open source project out there, and then I have 47 different versions of three forks of some open source project that are running rampant all over my IT organization. It's too hard to do that. It's not governed. So we can help customers to put a management infrastructure around it, a security infrastructure around it, a scalability infrastructure around it, a reliability infrastructure around it. And that then allows them to take these technologies in and use them in innovative ways without having the risk that they would normally have in just opening the doors wide open.
All right, last question. Sort of break out the binoculars. What do you see in the midterm, and then telescope, long, long term? Where's this business going?
You know, it's interesting. I think it's actually on the same trajectory. So I think what I'm seeing happening is bringing power to the user. And this is something we've actually been on this path with for probably seven, eight years now, but starting to empower the individual business decision maker within companies, that is really starting to accelerate at a pace that I haven't seen in a long time. And it used to be about descriptive analytics and having nice reports and stuff like that.
Now it's really about bringing tools for exploration, as you can see with things like Watson Analytics. And I think where we'll find ourselves is, you know, maybe 10 years down the road, maybe shorter than that, you won't make a decision in business without a computer helping you to make that decision all the time. And it's kind of like, you know, you look at things like the glassware that has the integrated computers in it that are answering questions for you as you're working. I think there's going to be those types of interfaces that we're going to be dealing with all the time. And I guess it's kind of like Star Trek, where you're talking to the bridge computer, right? There's going to be computer-assisted decision making all the time, cognitive and predictive and prescriptive types of analytics built into every decision we make. And it's kind of scary in some ways, but in a lot of ways we're going to make better decisions, because we're going to have all the information we need digested and given to us in a way that allows us to make those decisions better. It's not going to remove the human from the decision-making process, but it's going to facilitate humans making better decisions.
Exciting future. I just thought of like 20 more questions but we don't have time. All right, well listen, thanks very much for coming on theCUBE, really appreciate it. Michael Curry, running engineering for the analytics side of the business for IBM. Great insights, really appreciate you coming on.
Yeah, thanks for having me, it's great.
Thank you. All right, keep it right there, everybody. Jeff Frick and I will be back. This is theCUBE. Check out ibmvisiongo.com. It's our interactive digital platform. This is theCUBE, IBM Vision, we'll be right back.