Live from Cambridge, Massachusetts, extracting the signal from the noise, it's The Cube, covering the MIT Chief Data Officer and Information Quality Symposium. Now your hosts, Dave Vellante and Paul Gillin.

This is The Cube, SiliconANGLE Wikibon's coverage of MIT IQ. We're pleased to have Bob Picciano here. Bob is the Senior Vice President at IBM and a Cube alum. Bob, great to see you again.

Great to see you, Dave.

So, coming off the keynote, there's a lot of energy. As we were saying, it's a focused conference with a lot of Chief Data Officers. We didn't get to see your keynote because we're doing Cube interviews. Take us through it: what were your key messages, and what was it like?

Well, first off, we've been involved with the Chief Data Officer community for some time, and it's been amazing to see the role transform. Part of my message was that they're in a catbird seat in terms of helping organizations really transform, and it's about innovation. So one of my messages was: don't be a curator. Don't just help solve the conflicts of who has the rights to data and how to integrate the data, or just focus on the governance that's necessary to have a high-quality, data-driven enterprise. Really focus on that transformation. Learn how to be a leader in transforming your industry in the market. Serve roles and professions in your business in a different way, using the advanced capabilities of machine learning and predictive analytics, which are being consumerized.
And really think deeply about how you're going to change that space of engagement for your business, whether that's direct to the client or citizen, or whether that is for the employee, so that they're more connected to the brand strategy and represent the brand in a way that really allows you to perform and outperform your peers. It's about really transforming the efficacy of their role, which is where advanced capabilities like cognitive computing come into play, because many roles were missed by previous generations of IT innovation because of the amount of unstructured data those roles depend upon. Think of lawyers and doctors: several really important roles in our economy did not derive the benefit from prior generations of IT transformation. And of course, I talked about this against the backdrop of what I see occurring, where the value of IT to the business is reaching an inflection point. In previous generations it was about codifying business process in the form of applications and industrializing the way businesses perform tasks. Now the inflection point is deriving insight from all of these different applications, all the disparate data, and data that exists outside the company firewall. The value of IT is really going to be delivered through the insight it provides on top of the internal and external information.

You made an interesting comment in your closing remarks. You said everybody talks about consumerization of IT, but it's really the consumerization of business, and that data role is really changing organizations, driving toward much more consumer-like experiences. At IBM, you guys do more technology than anybody, but you talk business outcomes.
You're a geek at heart, and I mean that in a nice way, and yet you're in a position at your company where you talk about business outcomes all the time. How is the consumerization of organizations changing? How is data changing the role of organizations from a consumerization standpoint?

It's a tremendous question, and what I see occurring is that the best organizations are really transforming to be data-driven, much more deeply data-driven. Christopher Mims from the Wall Street Journal wrote a nice piece in the spring of this year, and he commented that data is becoming the new middle manager, and that the better organizations were creating a transparency, but also a freedom to act and interact around the data, in a way that really empowered people with more confidence in decision-making. Christopher and I talked about how I see things like Watson Analytics helping fuel that transformation much faster than anyone will think is possible, because it allows people to interact with their data using natural language, so they can express their business problem or challenge in natural text. It applies machine learning to the laborious tasks of integrating disparate data and improving data quality, and it can create predictions on the data without the individual having to be a data scientist. Now, the consumerization element also comes in because we're putting that in the palm of your hand. Our partnership with leading companies like Apple allows us to transform all the different professions that can derive a benefit from that level of data insight in the palm of their hand, without really being sequenced into an IT decision-making chain.

What's the analytics strategy going forward? Is that really the engine underneath IBM analytics?
It's a critically important engine, because it's a whole new field of innovation that unlocks the potential for that IT value to help transform professions that haven't really been affected in a productive way by computing. So I'd say it is a very important element of it. We're still in the early stages of ramping and scaling that business, and we're doing that with an ecosystem approach, as well as a key approach of working directly with clients and with client partners, and then bringing Watson into specific industries, like Watson Health, to help look at how that industry is struggling with all of the personalized medicine and unstructured data. But still, 80% of the problem in being proficient in analytics relates to master data quality, entity extraction, and data integration. Applying advanced machine learning in those spaces might not necessarily be formally categorized as Watson, but it's still a very important driver. And as you know, in the second quarter our analytics business grew at 20%. Last year we exited at 17 billion, and in the first quarter we grew it over 20%, so we're growing about five to six times faster than the market, while already being the number one player, on a business of that scale.

A story that didn't seem to make it into most of the media coverage, I noticed. Why is that? I mean, why is that analytics story not getting out there?

Well, look, I think many times the story is told by people who focus on the top-line or retail-oriented story. If you look at IBM, we've been transforming by moving ourselves away from what were commodity-oriented businesses that occupied some component of our portfolio, seven billion, that generated five hundred million in losses on an annual basis.
We exited those businesses, so people see that as a top-line reduction in revenue, and now we're investing both the gains from that and our own restructuring efforts into our strategic imperative growth areas. Those represented about 25% of IBM last year, and our goal is to have them represent 40% of the IBM company in 2018. So we're well on that journey, and the growth we saw in cloud, at 70%, and in analytics are just some of the elements showing that the progress in those strategic imperative growth areas, where our investors really should focus their attention, is working.

Bob, you were talking about Watson and industry specificity, and I've got some questions around that. We've talked in the past about how you have libraries of analytics by industry, but analytics generally, and I guess machine learning specifically, tends to be highly customized. Are we getting to the point where we can see the light at the end of the tunnel, where you can start to package those capabilities into applications, maybe with industry specificity, and reduce some of that complexity for customers?
Without question, and in fact you're touching on the actual technique we use in our Watson Analytics offering on the cloud, where we identify those statistical patterns and implement them as a general pattern that can then be applied to an industry or to a particular data set. When you look at Watson, Watson really has evolved as a system of systems, around discovery, around engagement, around evaluating the data and allowing for interactions around dialogue. Those systems all use a wide array of different machine learning techniques that have been tailored for a particular industry or a particular data set. There are about 50 very specific machine learning algorithms that are exercised by Watson as it solves its problems. Most systems will use one or two machine learning techniques, and we're really using a wide array that are purpose-built for that industry or that domain. That also comes out in the way that we train Watson, based on ontological approaches and the vocabulary of the industry we want to serve. So as we look at where we're going to have the highest impact at this inflection point, it's important to put it all in the context of the industry and the domain, and to have a set of capabilities that really focuses in on, say, financial services and asset performance in financial services. There's also asset performance in an industrial or manufacturing sense, but that's a different asset: you're looking at the behavior of a machine versus maybe the behavior of an investor or an equity. So yes, different machine learning techniques, same words, but they need to be applied in a system that can be flexible and agile across both use cases.

So I would think that's fundamental and critical to IBM's strategy, because essentially open source is creating a slow-motion collapse of a lot of infrastructure software. To the extent that you can package solutions and
add value, you're going to make more money, you're going to drive more value, and you're going to get more repeat customers. Is that a fair assessment?

It is spot on, as you guys always are in terms of your insights. And it's really the underpinning of why we were, I think, aggressively and openly supporting Spark. As a platform for helping democratize more skills involvement in the journey of advanced analytics, and as a technology platform that supports the potential to do much more distributed, parallel machine learning, we saw great promise in it. In the same way that Java helped scale the notion of e-business and web development, we see that Spark has the opportunity to do the same thing around analytics, and not just in the analytics space but also operationally, because in an in-memory context it can do some things that, from an operational standpoint, really aren't plausible in a Hadoop context today. It's also why we donated the SystemML innovations, where machine learning can now be patterned and scaled across MapReduce architectures in a more effective way.

So I don't want to let go of Watson just yet. Would IBM ever conceivably open source Watson?

No, but what we are doing is having an open developer ecosystem. The aspects of the Watson intellectual property around how it learns and how it evolves are fairly important intellectual property, as you can imagine. But we are developing an open platform so that developers can utilize Spark, as well as our behavioral-based APIs and our Engagement Advisor APIs, to develop their own applications without having to pay us a commercial contract up front. Then, as they develop their application, as they introduce it to the market, as they succeed, that's where our commercial relationship bears a benefit for both them and for us.

You'll keep the barriers low,
but you wouldn't open source.

Absolutely. And I think a lot of people don't understand that Watson has been very innovative in terms of developing the ecosystem. There's a Bluemix Watson developer zone, and in that developer zone you can find highly sophisticated but easily consumable APIs that allow people to develop cognitive applications, and to develop applications on top of our open data platform, BigInsights, that utilize predictive analytics, advanced time series, and space-time boxes for the Internet of Things. There's a whole Internet of Things Foundation available on Bluemix that's also being utilized by device manufacturers to authenticate and register their devices into a trusted platform, so they can share information and data about the products and services those devices are embedded into. So Bluemix has really started to expand rapidly as an open innovation platform for developers, differentiated by analytics, differentiated by cognitive and predictive, and by the maturity of having business process support from the disciplines that IBM has developed over many years.

So we have a question from the crowd here. We're running a CrowdChat at crowdchat.net/mitiq. And Bob, I know you're super busy, all the CDOs want to talk to you, and I know you've got to run, so I want to get this in before you go. Somebody on CrowdChat is saying, ask about data silos in real time. There's a lag in the data depending on the freshness of the data and the access to the data, and the silos are a challenge there. How is the industry generally, and IBM specifically, addressing that?

So, a very, very important point, because especially as you start to look at the Internet of Things and at what's going to differentiate company and organizational performance, it's really going to be around fast, actionable insights, and those data silos really put a huge amount of
time lag into an organization's ability to act with confidence. It may be because they have to integrate a certain set of data with a trusted customer profile, and that is going to take an amount of time because it's done in a batch orientation today. It might mean that predictive analytics is going to be applied to try to prescribe the action based on what's occurring, and in many instances that only happens in a batch-oriented process today, or the data has to be moved to another database. And much of the data that's going to be generated at the edge of the network is going to lose its value within a few milliseconds; that's 60% of the data losing its value within a few milliseconds. So your viewer's got a great question. We see that there is an important element of a real-time analytics zone that's the convergence of complex event processing and advanced predictive analytics inside of an event stream. Spark helps with that as well, but it's more of a micro-batch capability. It's happening in an in-memory context, so it does it very, very quickly, but it doesn't have the level of sophistication of logic branching, Bloom filtering, and the things you might need to do as you integrate various sources of data to gain the insight, recurse and iterate around certain patterns, and then take action in an operational decision fabric. The other thing your viewer is pointing out is that the analytics have separated from the operational system. What does it really mean if somebody at a desktop gets an alert that something has to happen, and they have to run down the hall and plug it into the programmatic system of record for it to effect the value? Chaining those things together is that aspect of complex event processing with advanced analytics, and that's what the InfoSphere Streams project has been about. That project is actually converging to some extent with more
open Java mechanisms of developing in the Streams language, and also with the work that we're doing with Spark. But a great question.

So that's a vision of the systems of record actually extending to be incorporated directly into the insight, and being able to take actionable insight, not retiring them, not putting them in the old bucket.

Yep. So it's data silos and it's system silos, right? Systems of engagement, and the things that are doing the data collection, can't be far removed from the systems of record, where people are taking the trusted actions that represent the processes of the enterprise. So, great question.

This is a CDO conference, and there's a lot of talk about how the CDO role will develop vis-a-vis the CIO role. What is Bob Picciano's view on that?

Okay. As I said to the audience today, I think the CDOs are in the catbird seat. I know that initially many of them were told: these organizations don't work together, get them to work together; people think they own the data, you own the data, figure out how it works together. Part of my message, and I saw a lot of nodding heads, so I think people are really on this, is: don't be a curator, be an innovator. Be a disruptor. Think about not just how data and cloud and engagement are going to be transformative for business, but use that position of power. You're in that C-suite. You're helping paint that journey map of how your company's data is really going to be transformative for the company's performance. I also encouraged them to think partnerships. In many instances there is exogenous data, the data that exists outside of the company, whether it's sentiment, or climate, weather, or atmospheric quality, whether it's open data that comes from municipalities, whether it's data that relates to fraudulent activity that they need to bring
in from syndication, or media-related data. Incorporating that data with confidence with the rest of their data is going to be a core discipline. So I encouraged them to think about partnerships, and to partner with companies that can not just bring the data but can bring insights about that data, so it's much more consumable. And really to focus on finding a way to truly innovate for the industry, not just for their company, but to be a fellow, so to speak, in how they impact the industry from that position.

Well, you've got some experience there. You've done some weather partnerships, you've done a partnership with Twitter. And Box: when I first saw this, I thought cloud deal, but it was actually an analytics-driven deal. Can you explain that deal?

Yeah. First off, I think people were surprised, because if you were to step back and look at this from 50,000 feet, you might think at a macro level that Box and IBM were competing. In fact, when Aaron and I got together and first talked, we said, we're in the same companies, but we actually are in different places. Where they were incredibly powerful is in that interpersonal relationship of people with revisable-form data, editable content, who were really trying to work and collaborate and share. And we were doing work at the more enterprise level, on how that content needed to be linked together around an enterprise process that looked at case management, at defensible disposal, at self-service content generation. We said, look, what organizations really need to do is bring these worlds together. If somebody's doing something over here in an important collaborative pattern, we want to bring those things together, and, with the customer's consent of course, we want to open up the ability to help those companies analyze that data and glean insights from that data,
because the richer that content, the harder it is to analyze. It's not just the textual information, not just the spreadsheets, but images and video, and we have advanced analytics that can help Box improve their value proposition, because they would be able to help people really gain insights from all that adjoining content. So it has been a collaboration around analytics; around securing the information in a meaningful way; around process integration, really creating enterprise and industry-vertical-specific solutions around that process; around collaboration, with work that we're doing around IBM Verse and IBM Connections, so people have enterprise social integration between what's happening with Box and what's happening with their existing IBM collaboration systems; and of course one around cloud, where we're also helping them bridge that barrier to entry in international geographies where data sanctions and data regulations specify that the cloud data center has to be in their country and the data can't leave. IBM has the answer to that with the way that we're running our software infrastructure and globally dispersing our data centers.

There's no question there's a pattern to IBM's recent alliances: Apple, Facebook, Twitter, Box. Clearly the company is partnering with some of the hottest web-scale companies out there. What are the criteria? I mean, is there a bigger plan or a bigger message here?

Of course there is. Look, it's all about data, cloud and engagement. All of these companies have an important intersection with an interesting vector of data. Take Dave Kenny at The Weather Company, who did just a tremendous job. That's another partnership, where they're mapping the atmosphere in the same way that companies like Esri and Google mapped the earth. Now, the atmosphere is a lot harder to map, because it's constantly moving, it's a three-dimensional space, and there are many things that
you have to track in it: not just the weather, but different strata of air patterns and movements, air density, but also pollutants, pollen, all the atmospheric quality. And you have to do that on a global scale, continuously. So that's an important vector of information that people can use as a competitive advantage in real-time systems that have to do with power generation, or alerting around insurance and asset damage. Potentially, customer churn is greatly affected by weather patterns in the atmosphere. So there's an interesting data vector, which could be sentiment with Twitter or content with Box; it's cloud-centric, so doing it at web scale; and it really affects the way people feel they're being engaged by a brand as a consumer. If you understand my sentiment, and you can respond to my intuition about what I need next because you have a view of my behavior, then you're going to be my preferred supplier of products and services.

Bob, you've been very generous with your time, but I've got one last question before we let you go. It's been a big year for you guys. You have had waves: we saw you in January, when you were bringing transactions and analytics together with the Z announcement; in the spring you guys brought Watson Analytics to the BI space; you've kicked off the summer with a Spark initiative; and then you've got IBM Insight coming up in the fall. What can we expect? Show us a little leg.

You'll be the first to know. It's a little early to go with that, but clearly this is a fast-moving, very determined, very focused, very committed transformation for IBM. Yeah, it's been a year, but for me it seems like a week. Time is going by, and the scale of our engagement with our clients and our partners has been tremendous. So I'm grateful for all their involvement and for the great collaborations, but really grateful for the problems that our clients continue to bring to us, the most
ambitious, hardest challenges. So in the fall you might see the story told through more of our customers' eyes, and really the fruits of some of these innovations, as well as some new things that we're going to unveil as we get closer to the fall time.

Have you got a couple of blockbusters up your sleeve? Should we expect something from IBM? All right, Bob, listen, congratulations on the business momentum. Keep it up. We'll be watching.

Thanks, guys, very much. I appreciate you having me.

All right, everybody, we'll be back with our next guest right after this. This is The Cube. We're live in Cambridge, Mass. Right back.