from the Galvanize Campus in San Francisco. It's theCUBE, covering the Apache Spark Maker Community Event, brought to you by IBM. Now, here are your hosts, John Walls and George Gilbert.
And good afternoon, everyone. I'm John Walls here on theCUBE. Thanks for joining us for Spark Week here in San Francisco. We're at the Galvanize Campus, our host for the day's events here for the IBM-sponsored event, the Apache Spark Maker Community Event, going on all day and into tonight. And with me is my partner in crime for the day, and actually for the week, George Gilbert with Wikibon. And George, good to see you.
Good to be with you, John.
Good to be together. Well, tell me, I mean, you've been a part of this, at least observing the Spark community, now for the better part of a couple of years, and you've seen this maturation process go on. What are you looking forward to delving into this week?
Well, it's funny when you call it a maturation process. I'm almost tempted to think of one of those cartoons where a little snowball starts heading downhill and it turns into this mass that's kind of creating destruction of everything in its path. Spark has, maybe it's the wrong word, metastasized rather quickly into a force that's reshaping data management and analytics much faster even than the folks behind it had thought. And I think this year, and some of the announcements we're gonna see this week, are a continuation of that. There are other things down the road that need to be fixed, but there's a lot that's going right now.
Maybe "expanded." How about that?
All right.
Instead of "metastasized." Yeah, with us to talk about what's going on here today at Galvanize is Jim Kobielus, who is the, I love your title by the way, that data science evangelism. I got a long, real title but that's what I was gonna name you. I like that, Jim. So he is the data science evangelist at IBM, and Jim, thank you for being with us here.
Tell us about, I know we have a great lineup that you put together of some IBM folks and also some partners with whom you do a lot of business, but who do we have coming on this week and today, and what are you looking forward to the most about this day as far as the IBM perspective is concerned?
Right, today what we're at here at Galvanize is the Apache Spark Maker Community Event, hosted by IBM and our partners. What we're doing is we're bringing together the open analytics community around the evolving open source stack, as it were, to enable data scientists to be more productive. The evolving stack includes Spark but also the R language, which is so important in terms of programming data science applications. So at this event, we have some announcements. We have partners such as H2O and Lightbend as well as RStudio here, and in fact several representatives from those companies will be on theCUBE following me. Right after I'm off, Derek Schoettle, our general manager in IBM Analytics for Cloud Data Services, will talk about IBM's deep and ongoing investment in all things Spark as well as R and the broader open source stack, including Hadoop, for data science, and really to enable teams of data scientists, data engineers, analysts and others to build data applications better and faster. So the Apache Spark Maker Community Event is all day today and into the night. We're gonna have the main stage sessions tonight starting at 6:30 Pacific time, and we invite everybody to tune into the live stream ongoing throughout the day, because we'll have important announcements and we'll have a panel discussion tonight. We have invited customers and partners to talk about the trends and what they're seeing, and going on into the evening we'll have a hall of innovation, with demonstrations of the power of Spark and R in terms of addressing business problems and enabling new kinds of disruptive solutions.
So following me, like I said, will be Derek Schoettle. Following him will be Ritika Gunnar, who is our vice president at IBM for offering management. She'll be discussing some of the announcements we're making this week, not only here at this event but also at the Spark Summit, which is going on right here in San Francisco several blocks away. And then we'll have Joel Horwitz, and Rob Thomas, our vice president for product development, to discuss various aspects of our strategy and the degree of adoption that we're seeing for Spark with our customers. So we've got a full slate of activities and experts and specialists to talk about the power of open data science.
Yeah, you were talking about announcements, and I don't want you to give away the store by any means. But I think just the very presence of IBM with this event, and then with the announcements that I know are coming down the pipeline, and with the serious investment that the company has put into developing the Spark community, it says a lot, does it not, about your confidence going forward in what you think is a very strategic position for you to be in.
Yeah, adoption of Spark is going through the roof. We're finding that with our customers all over the world and in all industries. And we've invested deeply in R&D on all things Spark. The Spark Technology Center is based here in San Francisco. We've got hundreds of developers and data scientists building disruptive applications. For our customers, we've been investing heavily in training through partnerships like Galvanize to train the next generation of application developers, who at their very heart need to master the techniques and the tools for data science: building and tuning machine learning and deep learning models and deploying them into applications of all sorts. So we're making heavy investments in all things training, because we see the future is all focused on statistical analysis, machine learning and so forth.
So with that said, I'd like to give it over to Derek Schoettle to discuss our overall strategy and what we're seeing.
Just let me give you a quick question.
Sure, I'll take one question.
One question. When you talk about data scientists and enabling the next gen, you know, of those on open source tools and platforms, how would you distinguish between developers of data science applications today and developers of traditional enterprise applications 10, 15, 20 years ago?
Well, yeah, we are now in the cognitive era, and what that refers to is the fact that more and more of the execution logic for your business applications doesn't need to be hard-programmed, hardwired into the code, but rather can be extracted from the data through the power of machine learning. So more of the execution logic is statistical and probabilistic. It's programmed in R and other languages, Python and so forth. So, you know, coding in Java and C++ and COBOL hasn't gone away, but it's been extended greatly in terms of the flexibility of what you can enable now in application development through statistical analysis, machine learning, predictive modeling, stream computing and so forth. It can be made more dynamic and adaptive than ever before, and learn from the data, from fresh feeds of all types of data: geospatial, social, environmental. We invested, of course. We acquired the digital assets of The Weather Company because environmental data is critically important for driving contextualized guidance to all manner of applications, including the next generation of vehicular devices, self-driving cars and whatnot, because we see deep learning, machine learning and all that as the heart and soul of the intelligence of this new generation.
That sets the stage, I think, for the rest of the week, actually, which is how to distinguish why we are entering this new era and how it's different from what came before for decades. So, this was a good one.
A hundred years ago, when IBM started, it was all hardwired, literally, into the hardware, and then development became more focused on software programming. And now development is really all of that, but really it's all machine learning, statistical modeling and data science in all of its manifold glory and sophistication. And that's really it.
We are excited to be here because it is a brave new world, and the steps that this community has already taken are impressive. But when you look at what's coming down the pipeline, you realize that the possibilities are endless. And so, Jim, thanks for the time.
Glad to be here. Look forward to talking to the rest of the IBM folks as well, and to the general session tonight. And I'll be tuning into your stream as well. I'm enjoying having you guys out here, like always.
Outstanding, great. A lot more coming up here from San Francisco on theCUBE right after this.