Live from the Galvanize campus in San Francisco, it's theCUBE, covering the Apache Spark Maker Community Event, brought to you by IBM. Now, here are your hosts, John Walls and George Gilbert.

And welcome back to San Francisco here on theCUBE as we continue our coverage of the Apache Spark Maker Community Event. We're here sponsored by IBM, by the way, out on the campus of Galvanize. It's just about a mile from the Bay Bridge, I think. I'm with George Gilbert from Wikibon. George, good to be with you, sir, throughout the week.

Good to be here.

And we're joined by Derek Schoettle, who is the general manager of cloud data services and analytics platform at IBM Analytics. And Derek, thanks for being with us.

You bet, good morning.

We appreciate that. Good to see you.

Good to see you.

And I'm glad you're here. A long flight all the way from Boston, so it's good. Glad you made it.

I met you with sirens on a tarmac, which is always fun.

That's the usual service, right?

Yeah, it seems like it, yeah.

All right, first off, let's just paint the overall picture here. Obviously, IBM is all in when it comes to Spark, when it comes to the open source community. Why so? What is it about open source that you have found to be not only palatable, but something you want to embrace and really engage with?

Yeah, that's a good question. And I think if you look at where we are as a business, IBM, that is, there was a period of time, I think 10, 15 years ago, when open source, maybe at the operating system level, obviously with Linux, was a big introduction of change. It took time. And it took a while to get believers that you could look to open communities to introduce technology that you could then take to the enterprise, quote, unquote, and make it sturdy and scalable and secure enough that it could solve real problems. I think today, open source is a part of everyone's life at the enterprise. I think everyone's realized that it is a way in which you can move faster.
You can take advantage of innovation with less risk. You can work in communities where there are pools of talent and insight that perhaps you can't maintain and protect within your own business. And from our standpoint, open source is less about control points. It's more about how we can participate in something that's gonna help our clients and help the products that we're building, whether we're contributing them back to open source or keeping them as an IBM offering. It is at an inflection point. And I think Spark's a great example of that. I mean, to us, Spark is that analytics operating system. And what we're announcing with the Data Science Experience is really that first killer enterprise app on that analytics operating system. And I think if you rolled the tape back 10 years, you were having conversations about, what do we do with open source? Is it something to be fearful of? Is it something that we need to protect ourselves from? That's gone. I think the world we live in today, from IBM's perspective and certainly our team's perspective, is that we want to invest in it. We want to participate in it in an authentic way, where it's not just us sitting on the sidelines, but rather us participating day in, day out and making sure that it really is a project that is built to last, that it has real value. And more importantly, it's a foundation that folks can invest in and build on. It's a one plus one equals much more than two.

Like I think you just have to say at least the point that it's from open source, the value. The market tells you where the value is, right?

Absolutely.

That you're not out there with proprietary tools or proprietary solutions and taking a gamble. You've got the power or the force of a community almost steering everybody as to what's working, what's not.

That's right.
And I think in general, if you look at this market, enterprise software over the last five, 10 years, we are shifting from proprietary products and pipelines and things developed in isolation into a platform approach, where the cloud delivery model, the economics therein, and the availability of that, meaning network geography and, I'll say, consistent performance of that cloud delivery model, has really provided opportunities for partners, for IBM, for our clients to move far faster than they ever would have before. And I think as a result of that, we're moving towards platforms. And as a result of moving towards platforms, we've got to be able to embrace not just what sits within our own building, what we're building and investing in, coming out of research per se, but also these open source communities. We want to give choice with consistency. We want to be able to provide developers, data scientists, data engineers with all of the tools they need to take advantage of that cutting-edge, community-driven open project, and to run it alongside investments that we're making in analytic warehousing and visualization and machine learning. I mean, everything we're doing with Watson. That to us is the world we're living in, where it's a mosaic of technology. It's not one technology in isolation that you've got to lock into and be beholden to, right?

You said a couple of things that were really interesting in there, which is...

Only a couple? I thought there were like 10.

They were about 5%. Yeah, John, I said, yeah. Yeah, you're right. George, about a couple that he liked. I ran out of fingers. Yeah.

This is along the lines that we're shifting in value from products to platforms, that it's open source, and it's not just within the four walls of IBM. So there's platform value, in that there are sort of seamless sets of functionality that you can offer.
But if you're also including products that are now part of the platform within IBM, is the value that you're getting them to work together seamlessly for a developer, or for an administrator, or for both?

Yeah, fantastic question. So if you think about what data is doing, and by default, these different technologies, whether they're open or proprietary, it's changing the way we work, right? It's changing how we think about data in residency, because it used to be very rigid, right? You designed an enterprise data warehouse. It was a very strict and managed engagement. You'd have limited access. You'd have controls put around it. And that may not be disappearing in total, but what's emerged, I think, is a result of massive amounts of data that are created outside the four walls of the enterprise, as well as those created within the building. And then there's the realization that with a platform, with an integrated design approach from the beginning, those different personas, right, the data scientist, the data engineer, the citizen analyst, the business analyst, all of those folks should be able to interoperate with each other far more easily than they ever have before, right? So we launched something earlier this year called the Analytics Exchange, right? The idea is that you can create data artifacts and host them within your enterprise, on a platform that allows you to shop for data, right? So if you're a business analyst and you build a retail POS model, that shouldn't be in isolation. You should then be able to put it up in an environment where you can score it. You can share it among your peers. You can make it a standard within your business, so on and so forth. I think what we are going into and very much driving, and what we believe is the opportunity for us, is to turn data into a team sport, right? Not something that you do in isolation, but rather something that flows throughout your organization.
And I think we're at the beginnings of this. And I think open source is a big contributing factor. I think cloud deployment and the speed at which you can move is a big contributing factor. And then I think lastly, you know, if you look at what we've done with The Weather Company and what we've done with the Twitter partnership, what we're doing with Box and many others, with some more to be announced in the near future, data that sits outside the building is now an opportunity. It's not something that's, you know, a question mark as to what you can derive from it. And whether you call it dark data, whether you call it data that sits in the social ether, all of that can be ingested and integrated with your own enterprise data so you can take more informed action. Think about examples where weather can help you make decisions about what flight you choose, or whether you want to buy flight insurance or not. If weather can give you an accurate forecast the next time you're gonna fly down south and there's a hurricane coming through, it gives you a recommendation: hey, you might wanna buy flight insurance, right? Good example of that.

So let me see if I've got the sort of components or the ingredients mixed right. We used to think about software products, and we thought about the data associated with those products that was an implementation. So now you're sort of orchestrating the behavior of all those products into a platform, and the data to feed them through some sort of, just the way we might have had a catalog of the different software components, we have a catalog of the data feeds.

Right, because I think, and that's your value, to orchestrate that. You've never been at a point in time where you've had more of an abundance of choice. The choice you can make at the data layer is far more than there's ever been before, right? Proprietary, open, massive amounts of choice, right?
What we're trying to bring to what we have in market, what we're trying to emphasize for our clients, is that we wanna provide you that choice, but we wanna do so in a consistent, integrated model so that you can move data easily within your business, you can inject different data types when and if needed, and you can do so all the while in a secure, durable, enterprise way, right? And if you look at what we're doing around Spark and the Data Science Experience, that is all about allowing data scientists to come in, collaborate with one another, look at more data, share the data experiences they have as they curate and iterate on top of whatever theses they may be trying to validate, and then think about what's next. You then need to actually put it into operation. You need to take all of that iterative experience and then put it into an operational pipeline.

Just to make sure I've got it, it sounds like there's a third component to the platform: the catalog of data or analytic assets. And now you're talking about the Data Science Experience, which sounds like the endpoint suite for manipulating all that.

That Data Science Experience, you can think of it as all of the leading tools, so that if a data scientist is comfortable using Python, if they're comfortable using Scala, we're non-denominational, right? We're rather about creating a consistent look, feel, use, and durability so that those folks can come in and they can work on the same notebook together, right? The great example we have, and the analog, is that we're shifting into a team sport, right? It's no longer about operating on the data in isolation because of religion, right? Because of the tools you choose or because of the part of the business you operate in. It should be that you're given a platform to operate within, data flows within and across it, and you have the tools that you need as a professional, but not in isolation, not to say it's Java versus C++ or it's Scala versus Python.
It's rather, here's a platform, data's resident within it, and you can operate on it with the tools that you feel most comfortable with, and then share with your colleague and iterate thereafter. And I think, you know, the culture of data is changing the way we work, right? Data is going from a place where it was really rigid, static, providing reports, provided in isolation, to where now it's flowing freely throughout the organization. It's changing all the time. People want access to it in real time. They don't wanna wait. And to be able to do that, and to do it in a way that's gonna change your business, it's about tapping into those different personas, those different professionals that touch data at various stages of maturity, from someone who's building an application that generates data to the data analyst, the data engineer, the data scientist. All those people are working on that, you know, I'll say continuum of data, only to then reiterate it back through an application. So that is what we're bringing forward with the Data Science Experience, and that's what we're investing in going forward.

So this fundamental change in culture that you were describing, obviously at IBM, you're all in. But talking to your clients and bringing them along and getting them to, you know, nod their heads and say, we're all in, may be a different story. I mean, there's gonna be some education there too, I would think, and maybe a little bit of prodding.

Well, John, I've been at the company for over two and a half years, and I have mostly worked in and around early stage innovative technology companies, and a lot of my friends always ask me, why are you still at IBM? And the reason is, aside from great people and great technology, we have incredible clients, and they are very much in partnership with us.
I've been blown away by the number of times over the last six months we've gone out and talked about the evolution, meaning we're moving to the cloud, we're delivering on a hybrid message, we've got new innovative technologies we're delivering through Bluemix, so on and so forth. They want, and are very much invested in, us winning. And from my background, and where we are today and what's coming next, they're all in as well, right? I mean, this is about many companies. We sat down with the CIO of a large insurance company back in October. He said that in 2002, 2003, he was saying, open source, over my dead body; cloud, over my dead body, right? We're now sitting with him and we're re-architecting his platform in a cloud-first methodology.

And he's still alive.

Oh, he's still alive, yeah, he's still alive. He's very much in. But it's amazing, though. You see that transformation taking place, and I think the transformation that IBM's gone through and is going through time and time again, and I think we're coming out the other side right now, is a reflection of the market and our clients. All of our clients, all of the folks in the market, retail, automotive, oil and gas, all of them are going through a transformation where either their business models are being questioned and they're trying to become more efficient, or they're looking at different ways in which they can find more value in their employees, right, the people that provide value to their customers. I sat down with a credit company the other day. They have 6,000 people that sit at the front line and interact with their consumers. Today it takes them over two days to be able to give any kind of real-time value to those folks about what products they should be buying, how they should make different decisions about their financial investments in credit. They want to get to a place where they can do it in under a minute, under 30 seconds, so it can be a real-time exchange of information.
That's the journey we're taking them on. And that's the world we want to live in. I mean, that's where the culture of data is. We're getting out from under the kind of religious choices you're making about the repositories, and we're getting into what data is actually resident within my enterprise and how I can find a ton of value as a result of that.

And whether it's worth anything, too, at the same time, right? Because, you know, Derek, thank you for the time. We appreciate that. Pleasure to have you here on theCUBE. And we'll be back with more here from San Francisco right after this, here on theCUBE.