Live from New York, extracting the signal from the noise. It's theCUBE, covering RapidMiner Wisdom 2016, brought to you by RapidMiner. Now, your hosts, Dave Vellante and Jeff Frick.

Welcome back to New York City, everybody. We're here at RapidMiner Wisdom 16 in the heart of the Big Apple. This is theCUBE, and it's our pleasure to have Lars Borela here. He's the Chief Product Officer of RapidMiner. Lars, welcome to theCUBE.

Thanks, yeah, Dave, it's nice to be here, and I'm looking forward to talking a little bit about RapidMiner.

Version seven's announced.

Yes, yes, we just-

You guys are excited?

Very much so.

Give us the update.

Yeah, yeah, we just announced it today, and the product will be available in a few days for download. Version seven has a number of really interesting features that we've added to it. You know, all kinds of stuff from-

What are your favorites?

Yeah, my favorites are actually, if you look at the structure a little bit here, we think a lot about the users themselves, and we're really trying to address a broader audience than we have in the past. This field is so focused on the data scientists, but we're also seeing a budding group of users, the more advanced business analysts, or sometimes they're referred to as citizen data scientists. And we have really built this release to address those users. In fact, we've added three things in the product here that really target them. The first thing we've done is overhaul the look and feel of the product, make it look a little bit more modern, and also simplify the interface. We've taken a lot of the more advanced functions and put them in other places, so that the power is still there for the experienced and powerful data scientist or user, while also making it really simple for the starting user or the less technical type of user. So that was one of the key things, starting out with that.
And then the second thing we did is completely overhaul our learning experience and how you get started with the product. So we've introduced a new getting-started flow where people can learn the initial steps of how to get started with the product and use the basic mechanics of it. And then we've added a really nice set of tutorials. This is actually one of my favorite parts of this new release, because it goes through how to use the product. It starts out with some very basic elements of how to get your data, how to start to clean it up, and how to apply some of the modeling techniques. And then it progresses: you can learn more about those particular techniques and dig deeper into some of the more advanced functionality. And lastly, there are really three sets of these tutorials, and the last one digs into more of the validation, like how can you make sure that the models you're producing are good? And then also, how do you start to move them into deployment and actually use them within the business? So this piece will take a lot of new users up the ramp pretty quickly and get them started and productive with the product.

So a lot in there, major, major release. Talk about the architecture. Help us sort of conceptualize it, if you will, and where it's come from.

Oh, okay, sure. Here's that journey. Yeah, at the core it's really a desktop tool in a sense. We have what we call RapidMiner Studio, which is the productivity tool for everybody to use, or the workbench, you can think of it, where they can get their data, clean it up, develop models, and do their work there. So that's the base client of the software. And then we have a server piece too, where you can push down your processes and run them there on bigger data, schedule those types of jobs, run them on a regular basis, and so on. There you can also facilitate a good bit of collaboration: multiple users can use the server, share routines, share results, and so on.
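The validation step described in the tutorials, checking that the models you're producing are actually good, commonly boils down to k-fold cross-validation. RapidMiner exposes this visually; the sketch below illustrates the same idea in plain Python with a deliberately tiny nearest-centroid classifier. The classifier, the toy data, and the fold count are illustrative assumptions, not RapidMiner's implementation.

```python
import random
from statistics import mean

def fit_centroids(rows, labels):
    """Deliberately tiny 'model': one centroid per class."""
    model = {}
    for label in set(labels):
        pts = [r for r, l in zip(rows, labels) if l == label]
        model[label] = [mean(col) for col in zip(*pts)]
    return model

def predict(model, row):
    """Pick the class whose centroid is closest (squared Euclidean)."""
    return min(model, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(row, model[lab])))

def cross_validate(rows, labels, k=5, seed=0):
    """Hold out each of k folds in turn; report the mean accuracy."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([rows[i] for i in train],
                              [labels[i] for i in train])
        hits = sum(predict(model, rows[i]) == labels[i] for i in fold)
        accuracies.append(hits / len(fold))
    return mean(accuracies)

# Toy data: two well-separated clusters, so accuracy should be near 1.0.
random.seed(42)
rows = ([(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(40)]
        + [(random.gauss(3, 0.3), random.gauss(3, 0.3)) for _ in range(40)])
labels = ["stay"] * 40 + ["churn"] * 40
print(f"mean cross-validated accuracy: {cross_validate(rows, labels):.2f}")
```

The point of holding folds out is the one the tutorial makes: accuracy measured on data the model never saw during training is the number you can trust.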
And then lastly, that piece also allows you to deploy into other systems. So the server itself can integrate with business systems like Salesforce.com, and you can push in your results. Let's say you do a churn model and you want to score your customers on their propensity to churn or not. You can fully automate that and have RapidMiner Server write the resulting data into Salesforce.com on a regular basis. So now you have the client and the server technology, allowing clients here to model and build things up, then run on big data and deploy through the server.

And that client-server architecture has basically been there since day one. Is that right?

It has, yeah, it has been there since day one. Well, to be... yeah, the full story is really that we started out mainly with the client piece, actually. The server came a little bit later, as people needed to run bigger jobs, there were groups of people working, and really when we got into more of the automation and the deployment pieces, that's when that was added.

Okay.

And then just a year and a half or so ago, we also added a cloud component to this. So just like you can run the server internally to do all these processing things, we also let you use the cloud. So if you're a user with the client, or the Studio Professional as we call it, you can take the processes you've been building, upload them into the cloud, and run them there. So you get all the power of, let's say, Amazon there, and run big jobs on the data without necessarily having to procure and install a server on premise.

And it's pick your cloud? Or you have?

Actually, we run it all. We are the front end there. In fact, we run it on the Amazon cloud, and that's where these things get spawned up and run. But we put a front end to it all, so you really work straight from the client. It's very integrated into the product, actually.

So you're in the marketplace? Is that right?

Yeah, we are not actually in the marketplace.
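The churn deployment Lars sketches, scoring every customer on a schedule and writing the results back into the CRM, is at heart a batch-scoring loop. Here is a minimal pure-Python sketch of that loop. The logistic weights are hard-coded and `push_to_crm` just prints; both are hypothetical stand-ins for a trained model and a real Salesforce write-back, neither of which this code is.

```python
import math

# Hypothetical logistic churn model: in practice these weights would come
# from training, not be hard-coded as they are in this sketch.
WEIGHTS = {"months_since_last_order": 0.45,
           "support_tickets": 0.30,
           "tenure_years": -0.25}
BIAS = -1.0

def churn_propensity(customer):
    """Logistic score in [0, 1]: higher means more likely to churn."""
    z = BIAS + sum(WEIGHTS[f] * customer[f] for f in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def score_batch(customers, threshold=0.5):
    """Score every customer and flag the likely churners."""
    return [{"id": c["id"],
             "propensity": round(churn_propensity(c), 3),
             "flagged": churn_propensity(c) >= threshold}
            for c in customers]

def push_to_crm(scored):
    """Stand-in for the CRM write-back step (e.g. a Salesforce update)."""
    for row in scored:
        print(f"CRM update: {row['id']} "
              f"propensity={row['propensity']} flagged={row['flagged']}")

customers = [
    {"id": "C-001", "months_since_last_order": 6,
     "support_tickets": 4, "tenure_years": 1},
    {"id": "C-002", "months_since_last_order": 1,
     "support_tickets": 0, "tenure_years": 5},
]
push_to_crm(score_batch(customers))
```

Automating it "on a regular basis," as described, then just means running this batch on a scheduler instead of by hand.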
So we-

So it's bring your own.

Yeah, exactly, yeah. So we run our whole cloud infrastructure.

That's really mutual.

Exactly, yeah. We run it on their stuff.

So you provide a solution. It's kind of powered by Amazon.

Exactly, it's powered by Amazon in the background, and then we provide the front end to it all.

So your channel will essentially sell that solution that includes the AWS capability, all ready to go.

All ready to go. All you need to do is actually get the client, and from there, immediately when you create your thing, you can say, I want to run it in the cloud, and that integrates immediately with the cloud and can upload and run it there.

How do you expect version seven to... can you talk about... we hear a lot about the citizen data scientist.

Yeah, yeah, yeah.

Great marketing term. How do you expect... take us through how version seven will perpetuate that citizen data scientist.

Yeah, yeah, great question. So, just the way you can start out with the product now, it's so easy to get it in the first place. You come to our website, download it. We offer open source software as well, so people can start for free, right? So that will get them the software, and then immediately we take them through with these new features: hey, here are the first steps to get started, do the tutorial. And then, in fact, I didn't mention that yet, but we also have these templates. Let's say you're into market basket analysis, or you want to do some preventive maintenance. We already have pre-built example processes for that and how you would do it. So people can quickly pick those up as well. So if you're not a deep data scientist but a business analyst in some area, maybe you've used some data visualization tools for a while and a lot of Excel, and you want to get a little bit deeper into the data, right? You can pick it up now. And I think these new features will really lead the users well on their way to learn and pick up and try to apply this on their own.
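Market basket analysis, one of the templates mentioned above, boils down to association rules: which item pairs co-occur often enough (support) and predictively enough (confidence) to act on. Below is a compact pure-Python sketch of pair-wise rule mining; the basket data and thresholds are made up for illustration, and this is not the actual RapidMiner template.

```python
from itertools import combinations
from collections import Counter

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Find 'A -> B' rules among item pairs by support and confidence."""
    n = len(transactions)
    # How many baskets contain each single item, and each item pair.
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(frozenset(p) for t in transactions
                          for p in combinations(sorted(set(t)), 2))
    rules = []
    for pair, count in pair_counts.items():
        support = count / n  # fraction of all baskets containing the pair
        if support < min_support:
            continue
        for a, b in [tuple(pair), tuple(pair)[::-1]]:
            # Of the baskets containing a, how many also contain b?
            confidence = count / item_counts[a]
            if confidence >= min_confidence:
                rules.append((a, b, round(support, 2), round(confidence, 2)))
    return rules

baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]
for a, b, sup, conf in pair_rules(baskets):
    print(f"{a} -> {b}  support={sup} confidence={conf}")
```

A rule like "butter -> bread, confidence 1.0" is exactly the kind of result the template is meant to hand a business analyst without requiring any of this code.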
So what kind of feedback do you get from customers when you talk about that concept? And how well do you feel like version seven is going to nail that?

Yeah, yeah. We get a lot of good feedback, actually, from this. We have also performed a number of usability studies with users that have never experienced RapidMiner before, and we can see a clear improvement in how easy it is to absorb this new version compared to before. And also, in talking to customers, there's a great need to have more people be able to do this. So they're very receptive and interested in the possibility of having a broader number of people within their organization being able to do predictive analytics.

So do you find that there's a particular function or a particular vertical or a particular type of analysis that lends itself to kind of the starter data... the citizen data analysts? Where does it get started? Who are the guys that have some tips?

Actually, a lot of pull from the marketing area of companies. So marketing departments are sort of the ones that seem to be doing the most of this today. When we're out talking to customers, and where this interest is coming from very strongly, it's in this area. I think those departments have traditionally been underserved, or supported by some very advanced people, but they do have some talented analysts within them, and there's so much to be done there. There's, you know, a lot of great rewards to be reaped if you can do this. So that's a big pull we see, and I think that's a good place to start. We'll find those types of users there. Another area where we see it too, it's more general, is actually the BI user community, or the people that support BI systems. You have these experts or power users of Tableau or QlikTech, and many of them are looking for more advanced functionality.
They want to go deeper into the data, validate maybe some of the visual trends and things they're seeing in the data, but now, how do I know if it's really something, a trend I can trust, or so on? So they want to apply more of this type of statistics and modeling to figure those out. So that's a more general and broad audience, which is part of this citizen data scientist group, right? So I would say those two pockets are where we see a lot of interest and traction here.

How about the roadmap? It's sort of in there. What can we look forward to over the next 12, 18, 24 months?

Yeah, yeah, we have lots of nice stuff coming along too, actually. So we're going to continue on this ease-of-use path that we just started with seven. So definitely continue to improve the usability. The other main area is big data. So we've already done a lot of work with Hadoop systems and being able to run predictive analytics on, or in, Hadoop. And there we're doing some really cool stuff, actually, moving to a place where people can run a lot of the functionality we have in the product today right inside the Hadoop stack. So that will open up a lot of new use cases that were not possible before. There are lots of systems on top of Hadoop coming along, like Spark, for example; everybody's using that. That has a set of machine learning libraries and so on, but it's still a limited set of capabilities. What we will do now is allow you to push down all of RapidMiner into the Hadoop stack, and you can run all 1,500 operators that we support today. So that will open it up.

Just to follow up on that, I think I heard you say Spark.

Yeah.

Okay, but you said somewhat limited. And you see sort of two camps: those who have built up Hadoop infrastructure and have the skill sets to bring real time into that Hadoop infrastructure. And then there's the other camp, I'm simplifying obviously, that has never had that skill set, not so much, and looks at Spark as a way to simplify some of that.
Yeah, no, you're right, Dave. It's the people that have the technical skill set and can code and go deep in and do whatever they want to, almost, in those systems, right? But then you have a lot of people that don't have that skill set. They're still analysts, they're even PhDs in statistics, but they might not have had the time and effort to learn this stuff. So now they can use RapidMiner, with a completely graphical interface, to actually do this stuff, right? We will support all of this with the graphical interface. You can drag and drop all these operators together and actually push it down in there without any coding. So that's something we are putting some big money into building out over the next year. And then the third piece is around the cloud. So we've just discussed that a little bit, how we can push down processing into the cloud today. But what we want to do is also provide the interface and the modeling for the end users via the web browser and run everything in the cloud. So that is sort of the third leg of the investments that we're going to do.

All right, Lars. Simplifying all this complexity. Congratulations on getting version seven out, and thank you very much for coming on theCUBE.

Great, thanks. Nice being here.

Keep right there, everybody. We'll be back with our next guest right after this. This is theCUBE. We're live from RapidMiner Wisdom 16 in New York City. Right back.