...from San Jose, in the heart of Silicon Valley, it's theCUBE, covering Big Data SV 2016. Now your hosts, John Furrier and Peter Burris.

Okay, welcome back everyone. We are here live in Silicon Valley for SiliconANGLE's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. We are on the ground live at Big Data SV, which is our event that goes on in conjunction with Strata + Hadoop. We call it Big Data Week. All the action's happening here inside theCUBE, and I'm joined by my co-host, Peter Burris. Our next guest is Peter Schlampp, who's the Vice President of Product at Platfora. Welcome to theCUBE.

Thanks a lot for having me.

So Platfora has been on theCUBE before — Ben, the founder, talked about his original vision: to provide great analytics, BI at the speed of business. I think that was the slogan, or close to it. The value proposition was very simple: make data science for everybody, basically. Humanize it. What's the update on the product side? Do you guys feel good about that path? What's the latest on the product, and what news are you announcing right now?

Sure, so it's the same vision as before: democratization of big data, getting it out into the hands of the business people. We had an announcement here this week — Platfora 5.2 is available, and it's really just extending that same vision that Ben set forward four and a half years ago. Now, one of the things we've noticed over the years is that we've built a great product that allows you to go from raw data to answering business questions, but a lot of those questions are being answered by a user we would call a citizen data scientist — somebody that would be your average information analyst, but a little bit more technical, a little bit more willing to go out and use a new tool and ask different types of questions.
The reality is there are tens of thousands of business analysts in large organizations — even 100,000-plus in the very largest — who use tools like Excel or Tableau or other BI tools. What we wanted to do with this release is extend big data into those people's hands — the people who will say, I'm not going to give up Tableau until you pull it out of my cold, dead hands. We want to be able to put that data in.

Well, they have a lot of cold, dead hands, because their stock has been taking a beating lately on the market, and a lot of people are saying they're losing some share.

Yeah, but that's not necessarily a reflection of the visualization market. I think it's more that their growth has started to slow down.

Is that because people are starting to discover that there are some limitations to just slapping visualization on top of a very complex stack?

Well, first of all, I think Tableau's great. We're partners with them, and part of our release is actually about getting data into Tableau users' hands. They've done a great job of opening up this world of data discovery, where you can simply slice and dice, drag and drop things into fields, and get answers. And they haven't had a lot of competition there. There have been a couple of companies over the years that have come up and done great things, but all of a sudden companies like Microsoft, with Power BI — which has been a huge success for them, maybe even a surprise to some people — have come along, and the pricing models have changed a bit too. So I think Tableau is seeing that reality in the marketplace. That said, data discovery is, I think, the new Excel. It's the new way you're going to consume at least small data sets. Tableau's done a great job, but you're not answering the hard questions.
So, the new questions. One of the things we've learned over the years is that when companies try to do big data, they'd better have a use case, and that use case had better be about getting more value than simply asking a regular BI question of the existing data you have. So many people have been lulled, or drawn, into this idea that, hey, I'm going to be able to replace all of my existing infrastructure and it's going to be less expensive. The reality is it's more expensive, because you have to have more technical folks — Hadoop is still maturing, as we're seeing here at the show. So what do you want to answer? What are the hard business questions that you couldn't answer in the past? That's what Platfora does: we're all about answering the difficult business questions that are really going to drive your business.

Is the problem you solve for the customer on the front end? Obviously there's a do-it-yourself, self-service model — the SaaS players have done great at that — and big data discovery is what people want to innovate on. Or is the problem in the data behind the UI and UX? What problem do you solve for the customer when they say, hey, we get Platfora, it's a great solution, we're very happy — and you have those kinds of customers. What's the problem you're solving?

Yeah, well, I think the biggest thing — and it doesn't sound on its surface like the hardest technical problem, but trust me, it's a deep technical problem — is the workflow. How do you go from raw data in files, at petabyte scale, to visualization, and then allow the user who's doing the visualization to iterate on their question? In the past, that took three separate pieces of software and three separate teams: ETL, loading into a data warehouse, and then the visualization — building the reports.
How do you put all of that into a workflow that a single user can do? That's actually quite a challenge, and what Platfora does, in a single package, is allow that user to go back, redefine data sets, and redo the data preparation if they need to — adding value by doing things like event-series processing, understanding the behavior of events, even understanding users down to an individual level — and iteratively go through that cycle over and over again. So when companies come to us and say they're really happy, really successful, it's because we've unlocked an ability to go through this workflow that they never thought possible.

Let me give you one example. One of the largest banks in the world is our customer. Because of what happened in 2008, a new regulation required them to report their liquidity to a regulator in Europe. They had to close the books and report their liquidity every quarter — essentially, within 90 days they needed to report to this regulator. When they started, they realized the process was going to take them about nine months. They had 90 days. Platfora came in and allowed them to do it in that amount of time. The reason why is that with all these data sets coming together, they were able to iteratively ask questions and figure it out — okay, I've got my answer — and they were the only bank, out of all the banks in Europe reporting to this regulator, able to do it in that amount of time.

All right, so take me through the use case for the customer. Say I want to engage with Platfora — what does getting onto the platform look like? What are the connectors to the data sets? What are the requirements? Is it decoupled from the systems of record, or is there a requirement on Platfora's side to connect to the data? Share with us how you intersect with the data infrastructure.
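The loop Schlampp describes — ingest raw events, prepare them, aggregate, ask a question, then circle back and redefine the preparation — can be sketched in miniature. This is a hypothetical illustration in plain Python, not Platfora's actual API; the function and field names are invented for the liquidity-reporting example.

```python
from collections import defaultdict

# Toy "raw" event data: one row per transaction (the bank liquidity example).
RAW_EVENTS = [
    {"account": "A", "amount": 100.0, "currency": "EUR"},
    {"account": "A", "amount": -40.0, "currency": "EUR"},
    {"account": "B", "amount": 250.0, "currency": "EUR"},
    {"account": "B", "amount": -30.0, "currency": "USD"},
]

def prepare(events, currency=None):
    """Data-preparation step: filter/clean the raw events.

    Re-running this with different arguments models the analyst
    'going back and redefining the data set'."""
    return [e for e in events if currency is None or e["currency"] == currency]

def aggregate(events):
    """Build a small in-memory aggregate (net amount per account)."""
    totals = defaultdict(float)
    for e in events:
        totals[e["account"]] += e["amount"]
    return dict(totals)

# First pass: ask the question over all events.
answer_v1 = aggregate(prepare(RAW_EVENTS))
# Iterate: the same user redefines the preparation (EUR only) and re-asks,
# with no separate ETL team or warehouse rebuild in between.
answer_v2 = aggregate(prepare(RAW_EVENTS, currency="EUR"))
```

The point of the sketch is that preparation and aggregation live in one loop the analyst controls, rather than being split across three teams.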
Yeah, so most customers today come to us with a data lake — whether that's Cloudera, Hortonworks, MapR, Amazon S3, Google, or Azure — and they've put some data into it. From that point, we inspect the file system itself, or, if you've already described the data in a system like Hive or Impala or Redshift, we can connect to those. We basically go from that data directly to visualization. In the process, we build fast in-memory aggregates to let you access the data very quickly.

And is it Hadoop? Are there other data stores for the data lake? Is there a minimum requirement — a certified set of data pools or data lakes?

So it is Hadoop, but at the same time, when you're using it in a cloud environment, you can use it without Hadoop, in a sense.

So if I came to you with Amazon DynamoDB, for instance, could I do that?

With a connector. We do have a connector framework that allows you to do that, but essentially there is a requirement to have a data lake at the core.

I'm still trying to figure out what the hell a data lake is, but that's a whole other story.

I hate the term data lake, by the way. I like data ocean better — more dynamic, more currents, different temperatures.

We heard from Jerry Held that a data lake is actually just an operational data warehouse that you dump stuff into, which has some truth to it. So — and I think it's a great question, John — you guys talk about being end-to-end, but there are still opportunities to incorporate other elements in that end-to-end chain. Are you then the administrator of the entire pipeline, whether it's Hadoop or something else? Is what you're really providing administration services on the data as it's staged through, all the way to visualization?
Right, so we are managing the pipeline from raw data — not managing Hadoop and those pieces, but managing the pipeline from what you could consider undeveloped data. It's like a new field out there that's raw, and you can do something with it. Then there's the development of the data so people can consume it, all the way up to the visualization layer. And you mentioned extending to other pieces — that's the other part of the Platfora 5.2 news. In the middle of what we do, we make the data fast in these lenses. We're also announcing Lens Accelerated SQL, so that now any external tool can access the data in Platfora lenses as well. Whether that's Tableau, like we talked about before, Qlik, MicroStrategy — you name it — you'll be able to access those lenses and make petabyte-scale data accessible to those tools too.

So, as a user sitting down at a visualization tool — it sounds like the normal way it worked is that I would go to a data scientist or some other administrator and request a new churn of the data, a new build, or whatever else it might be. And you're saying that through the process of doing the visualization, you can actually kick off administrative events all the way through the pipeline that generate the new data that's required, so it shows up for the users. Did I basically get that right?

That's exactly right. All of that process that took tens of people in the past, carefully orchestrated through MapReduce and Spark jobs — we do all of that automatically.

So you take a huge operational burden off of them.

That's right.

What kind of client mix do you guys have? Is it mostly big companies? Small? What's the customer mix — and also the class of user, how sophisticated?

Yeah, that's a good question.
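To give a feel for the idea behind exposing lenses over SQL — serving a small precomputed aggregate to external BI tools instead of the raw petabyte-scale data — here is a toy sketch using SQLite as a stand-in for the SQL endpoint. The lens structure and table names are hypothetical; this is not Platfora's implementation.

```python
import sqlite3
from collections import Counter

# Raw clickstream events: far too fine-grained to ship to a BI tool at scale.
raw_clicks = [
    ("2016-03-30", "home"), ("2016-03-30", "product"),
    ("2016-03-30", "home"), ("2016-03-31", "checkout"),
]

# "Lens build": precompute the aggregate once, up front.
lens = Counter(raw_clicks)  # (day, page) -> click count

# Expose the lens through SQL so any external tool can query it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks_lens (day TEXT, page TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO clicks_lens VALUES (?, ?, ?)",
    [(day, page, n) for (day, page), n in lens.items()],
)

# An external tool's query hits the small aggregate, not the raw events.
rows = conn.execute(
    "SELECT day, SUM(clicks) FROM clicks_lens GROUP BY day ORDER BY day"
).fetchall()
```

The design point is that Tableau, Qlik, or MicroStrategy only ever see the compact `clicks_lens` table, while the heavy aggregation happened once, upstream.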
We have users across many verticals, but we tend to focus on large organizations and the most innovative organizations out there. In banking, we have four of the top five banks — HSBC, Citi, Bank of America, Bank of New York — those types of clients. We also have customers in the healthcare industry — UnitedHealth Group, for instance. In retail, Gap, and Sears is a big customer as well; we're actually telling their story this week, in a couple of minutes, on the show floor. And gaming, and so on. They tend to be large companies with a lot of event data about specific things. If it's a bank, it's transactions. If it's a retailer, it's clicks on the website, or advertisements that have reached those users. That's where we solve problems. In terms of users, we tend to focus on the citizen data scientist. I gave you a definition of that person before, but it's the curious information analyst who's willing to go out and ask the deeper questions of the data — not just, could I solve this with my existing BI tool and a data warehouse, but: what are the segments of customers that are about to buy? How do I answer that question? Because you can't really do that with traditional BI, but you can do it with Platfora.

Share a use case — some anecdotal example where a customer said, oh my God, Platfora, you guys saved my life, you guys were amazing. What are some of the love letters you've gotten, the kudos from customers? A specific example?

Sure. Well, I'll talk about Sears, since we're going to be telling that story on the floor — if anybody's out there on the floor, come hear more about it. They're a company that has transformed since the days when I used to get the Sears wish book at Christmas and pick out my toys.
Now they have thousands of vendors — actually more than that, tens of thousands of vendors — selling through their website. They need to enable information analysts within their business to understand the buying patterns of the products being sold on the site. It's not just their own products, so they need to hand this capability off to other people. How do you do customer segmentation at scale, across 400 or 500 analysts trying to do this? They love the product. They've been able to completely get rid of their old infrastructure and answer questions they were never able to answer in the past.

Peter, final word here on the show. What are you guys seeing out there this year? I know Platfora has been part of almost all the Strata + Hadoop shows since the beginning — even back when Ben was just starting Platfora. I remember the first Strata; theCUBE was there as well. What's your take on the show this year, for the folks who aren't here? What's the vibe, the undercurrent, the hallway conversation? Encapsulate what's going on for the folks who are watching.

You know, there continues to be a lot of energy this year — it actually feels like the biggest one yet. On the show floor and in the sessions, I see a lot of traffic and a lot of people interested in answering questions. Two things stand out to me. One is business value: it's not just about the technology, but what business value am I going to get out of whatever advancement we're talking about? You see that thread in every conversation. The second one — which I think is really interesting, and which is going to have the most profound impact on this show over the next three years — is the cloud.
And not just cloud in general, but the big three players — Google, Microsoft, and Amazon. What is their vision for Hadoop, or for the technology behind Hadoop, and how is that going to change this industry? Google, for the first time, has a very large presence here, and I think we've seen a lot of interest there. We've seen a lot of our customers on existing on-premises Hadoop infrastructure start to wonder: is there something happening over here? Is there a way for me to do what I'm doing today a lot more easily, without the headaches I've had with Hadoop? And then you're seeing the Hadoop vendors, Cloudera especially, begin to really focus on the cloud as well.

It's actually making Cloudera the right name — maybe they planned to use that name all along. The cloud era is upon us. And there are other clouds out there too: IBM heavily invested in Bluemix, Oracle going all in on the cloud, VMware trying to figure out its cloud play as well. You're going to have a lot of these environments where people have more and more data lakes, or data warehouses, whatever we call them.

So, congratulations to Platfora. Here on theCUBE, we'll be back with more live action in Silicon Valley for Big Data Week at Strata + Hadoop, plus our Big Data SV special event, here at the Fairmont. We'll be back with more coverage after this short break.