Live from Boston, Massachusetts. Extracting the signal from the noise. It's theCUBE, covering HP Big Data Conference 2015. Brought to you by HP Software. Now your hosts, John Furrier and Dave Vellante. Okay, welcome back everyone. We are live here in Boston, Massachusetts for a special presentation from theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. We're here at HP Big Data Conference. Hashtag HP Big Data 2015. I'm John Furrier with SiliconANGLE. My co-host is Dave Vellante with wikibon.com. Our next guest is Jack Gudenkauf, VP of Big Data at Playtika. He's built engineering teams and been involved in a lot of stuff going back to Microsoft, through to gaming, cutting edge, and he worked at Twitter. So you're very familiar with social data and gamification and unstructured data. Welcome to theCUBE. Thank you, thank you. So let's get right into it. So, obviously, if you look back at the developer market going back to when we were growing up in the 80s and 90s, it was very simple. Get a developer kit, you program stuff, you push it out to a server, you rack and stack. Now you've got cloud. The data tsunami is here. So a lot of apps are being successfully developed because of the data. Whether it's gaming, I mean certainly Twitter's been data driven. What's going on? And what's the bottom line right now? What's the most important thing happening in data relative to these new apps, these new platforms? That's a great question. I think data's playing a much bigger part than it ever did before. People are making business decisions using lots of data. Twitter, of course, social media. We had a pretty big Vertica cluster there, as you might imagine. In fact, they were using Vertica at Twitter before HP actually commercialized it. So they were early adopters. So did Zynga, so did Facebook, so did everybody else, right? Yeah, yeah. Columnar stores, very popular. It turns out it works pretty well for data warehousing.
And especially in the gaming industry. So we're the social casino leader in the gaming industry. We have World Series of Poker, Slotomania, Bingo Blitz, and a number of properties, about 10, and about 1,000 employees worldwide. And it's about the velocity and the variety of data, with the number of events in a game. We literally ship a new version of our game every week based on the data and the behavior of the users. So it plays a huge, huge part in our business. And you're from Santa Monica? Yeah, yeah. So you've got Riot Games down there, another gaming company, right? Yeah, they're right next door actually in the Santa Monica area. So you can hear everyone in the coffee shops talking about data and clusters. Gaming more than anything really needs the data. I mean, how do you know? I mean, you keep track of everything and everything has to be tracked, whether it's points, gains, this and that. I mean, it's stimulating, but it's the most demanding. Yeah. Low latency requirements. Yeah, that's one of the things that attracted me to it. I mean, of course, Twitter was volume. So that was another challenge, of course, just sifting through the amount of data, and of course Vertica helped us do analytics over it. In gaming, it's a lot of variety. We have a lot of different behaviors. But really, I think the coolest thing, honestly, is that the user is actually driving the experience. So if they're playing the game and they don't enjoy it, the data tells you that. Or if you roll out a new feature or functionality and they don't like it, or on social they're chatting about how much they don't like it, you have to change the game. And it's different from just building a game. It's not like Field of Dreams where you build it and hope they come. You build it, they use it. You look at the data, you do the analytics on the data, and then you change the game so that they're happy with the game. And it's very customer-driven.
And what about when you hit a blockbuster? I think it's like a reality TV show that's super hot. How can you use the data to keep it hot, to keep it interesting? Have you had some experiences along those lines? Yeah, around the gaming and such. Absolutely. We'll bring out a new game, for instance, and see how they like it. It's kind of interesting in our game. I often tell people, so one of our flagship games is Bingo. Actually, it's a social Bingo game. I kind of talk about it as Carmen Sandiego meets Bingo, and it's a game within a game. You're going down a path and you play the game and you can discover different areas and cities and things like that. So we keep it fresh by not just doing your traditional daubing, your bingo thing, your sharing. So it's Bingo Blitz. Yeah, Bingo Blitz, exactly. So you added the Blitz piece to conventional Bingo. That's right, yeah. And the social media, and so you can play all the time. We actually ultimately roll up to Caesars. We're separate from the real-money gaming, we have a, sorry. Separate, quote-unquote separate. Well, no, we're actually very separate because we're free-to-play gaming, and with the way gaming regulations work, we're very careful about that, to be perfectly honest. But we do have a tie with Caesars, and Caesars started the whole Total Rewards system. You play at Caesars and they give you points and you can use them for things like, whatever, eating and things like that. We have a similar thing in our games, and in a lot of our studios you can get points for the Total Rewards system and you can actually use them in Vegas. So we get a continuity: you go play slots at Caesars and then you leave, and what are you going to do? You can play online on your mobile device. So you took a lot of learnings from physical gaming, brought them to the digital world, and cross-connected them. Yeah, and I think the thing that's interesting is the amount of data that we generate.
I mean, imagine every slot spin, every payment, everything that we track to see if the user enjoys the experience. All of that's tracked through our data pipeline. In fact, what I talked about in one of the sessions I did was that there's the traditional ETL: you extract the data, you transform it, you shape it, and then you load it into a data warehouse like Vertica. What we built, and I'm coining a new industry term here, we're calling PSTL; you can call it Pistol. So it's parallelized streaming of data from, like, Kafka into Spark into a Hadoop cluster, and then we parallelize that and write that into Vertica. So at scale, with the volume of data that we deal with, you kind of have to change your ETL model. The good news is that Vertica can handle that volume of data very well and analytics can be run on the back of it. So what's your take on, you mentioned Spark. A lot of discussion, everybody's excited about Spark. We're excited about Spark. We have a lot of internal discussions. Well, there's a little hype. Spark's going to cream Hadoop. A lot of the vendors, the Hadoop guys, are like, whoa, slow down. A lot of people are saying, well, Spark, it's early days, it's a couple of years away. You're actually diving in hands-on. What's your take on this from a practitioner's perspective? Yeah, I have to support the systems that I build. So I'm very careful about what we do. We trust what you say, right? If you're saying it's a couple of years off, we believe you. Is it ready for prime time? Yeah, it's ready for prime time. You're using it today, what do you think of Spark? I love Spark. I mean, if you think of traditional MapReduce: I filter, I write to disk, I read, and I lather, rinse and repeat at large volumes of data, especially unstructured data. A horrible stack, we call it. Well, I mean, when you have a lot of jobs running, like we did at Twitter, it was a huge Hadoop stack.
It was reading and writing to disk all the time, as opposed to a model where you put it in the data warehouse, with Vertica doing massively parallel processing. The nice part about Spark is that it's sort of a better MapReduce model, and you can think of it as tables in memory across your entire cluster. You can do parallelism in processing the data. We use it for doing the transform and importing the data, and then we do massively parallel processing in Vertica, which is pretty new. I gave a session at the conference on that. We'll be talking about it a lot more and giving guidance, but I think Spark is great. I think the confusion is that people try to make tools and systems behave in ways they weren't designed for, right? Vertica does a great job as a relational model, doing analytics and joins, and it performs way better than any other solution because it's a columnar store. When you try to do a data warehouse like that in Hadoop, which wasn't designed for that, it's not going to do well. So stop trying to make it be something that it wasn't. People did try to do it, and they put Hive on it so they had SQL semantics, which is great as a language for consuming the data, but at the end of the day it was running MapReduce jobs. Spark has come along, and by the way, it's its fifth anniversary, so it's not as young as it used to be. We were at the Spark Summit, and JPL and a lot of large companies were there; Yahoo is using it in a pretty big way, so it's fairly robust. It's just, I think we need to be clear that it's not the replacement for the data warehouse. So I've got to ask you, since you brought up Spark. I've been all over this all day today because it's become clear from all my interviews on theCUBE that we've been trying to tease out Hadoop versus Spark, one's dying, one's winning, and trying to separate the hype. And so it's interesting: when things are said to be dying, they're being adopted. So when they're "dying," they're usually adopted, right?
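Editor's sketch: the contrast drawn above, between a MapReduce-style chain that writes to disk after every stage and a Spark-style chain that keeps intermediate data in memory, can be illustrated with a small example. It uses only the Python standard library, not Spark or Hadoop, and the event fields, the `keep_slots` filter, and the function names are all hypothetical, chosen only for illustration.

```python
import json
import os
import tempfile

# Hypothetical game events; the field names are illustrative only.
EVENTS = [
    {"user": "a", "game": "bingo", "coins": 5},
    {"user": "b", "game": "slots", "coins": -3},
    {"user": "a", "game": "slots", "coins": 7},
    {"user": "c", "game": "bingo", "coins": 0},
]

def keep_slots(event):
    """Filter step: keep only slot-machine events."""
    return event["game"] == "slots"

def to_row(event):
    """Transform step: shape an event into a (user, coins) row."""
    return (event["user"], event["coins"])

def disk_chained(events, workdir):
    """MapReduce-style: every stage writes to disk and the next reads it back."""
    stage1 = os.path.join(workdir, "stage1.json")
    with open(stage1, "w") as f:
        json.dump([e for e in events if keep_slots(e)], f)
    with open(stage1) as f:
        filtered = json.load(f)
    stage2 = os.path.join(workdir, "stage2.json")
    with open(stage2, "w") as f:
        json.dump([to_row(e) for e in filtered], f)
    with open(stage2) as f:
        # JSON turns tuples into lists, so convert them back.
        return [tuple(row) for row in json.load(f)]

def in_memory(events):
    """Spark-style: stages are chained lazily in memory; nothing touches disk."""
    filtered = (e for e in events if keep_slots(e))
    return [to_row(e) for e in filtered]

with tempfile.TemporaryDirectory() as workdir:
    disk_rows = disk_chained(EVENTS, workdir)
mem_rows = in_memory(EVENTS)
```

Both paths produce the same rows; the difference is that the first pays for two writes and two reads of intermediate data, which is the "read, write, read, write" cost described in the interview.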
So when people think something's dead, the hype is dead, but what we're finding is Hadoop is actually in production. We had Yellow Pages on earlier, and he's like, hey, you know, I'm in production. It better not be dead. So on the question of Hadoop as an ecosystem, maybe vendors will come and go, but for the most part, Hadoop seems to be solid. Oh, absolutely. I mean, Hadoop's going to be around for another generation. I think that... So is Hadoop dead? No, there's no short answer. I think that if you think of Hadoop as two things, there's the storing of data in HDFS, basically just think of it as writing a bunch of data on a bunch of disks. And then there's running the jobs to process it, the MapReduce jobs. The MapReduce jobs and the legacy data that's there will always be there for, like, large batch operations. Spark is just a niche. No, it's not a niche, I wouldn't call it a niche at all. No, I would say, in some ways, it took all the lessons learned and some of the things that were painful in a MapReduce world, where you always read something and write to disk, read, write, read, write, which is just not as performant, and tried to bring more of it into memory to do the processing. So I think they'll coexist, just like Vertica will coexist with Spark. So let me ask it differently. Does Hadoop need Spark more than Spark needs Hadoop? Does Hadoop need Spark? That's kind of a hard one to answer because, in some respects, I think Spark will be a better solution than some of the traditional Hadoop jobs that were run. Does Spark need Hadoop, meaning typically you'll have your Hadoop installation and now you can also run Spark because it exists? So in some ways, it needs Hadoop. It definitely needs it for the storage system and for where my data is located. So Spark definitely needs HDFS, a part of Hadoop, for sure. Does Hadoop need Spark?
I think in some ways it does, because they'll coexist: you have the big batch jobs that you already have, and now there are other scenarios and other things you can do with Spark. I think they kind of need each other. That's why we love talking to practitioners, right? Cut through the hype. I don't know if you heard Stonebraker the other day. Yeah, right. He's my idol. Number one on my bucket list was to meet him. And I told him about my talk, and he asked me, well, tell me what your architecture is. I told him about PSTL, the parallel streaming transformation loader, all the way through. And he looked at me and he said, you're doing the right thing. So I just ran away, because once he tells you that, you're on the right track. But it's amazing. And in his keynote, he threw everybody who has anything to do with marketing under the bus. Then he threw all the geeks in the audience under the bus, and everybody said, we love him. Yeah, that was great. He just tells it like it is. He makes you think. I love it; he did a talk one time called, everything that they taught you in school about relational database management systems is wrong. And he did the talk as an invited speaker to some grad students or something. So I can just imagine the look on the teacher's face, saying, wait a minute, I'm teaching these. Who invited this guy? He just says it like it is. Like, Hadoop is good for nothing. But what he means is it's not good for the things that people are trying to make it do. Once your tool is a hammer and you start to think everything looks like a nail, I think that's where things go wrong. But like Bill from YP, he was on a panel that we were on. He absolutely uses it in the right scenarios. And he's starting to use Spark as well for different use cases. Which I think, for in-memory distributed datasets and processing in memory, kind of think of it as in-memory tables and doing transforms on them and then getting the data into Vertica from there, is a great solution.
I think it helps Vertica a lot because it gets more data in. Yeah, and then they can perform and use their performance levels, and it's a coexistence. I totally agree. I think there's a lot of coexistence. We're moving away from this old mutually exclusive world: I have this, therefore I can't use that. We're seeing a lot of startups use Vertica, I mean venture-backed startups. It's not a zero-sum game, as Bill Clinton would say. Yeah, I mean when you have a Tier 1 VC investing real money in a startup, ex-Google, ex-Apple, that kind of pedigree, they know their stuff and they have the experience and the scar tissue. Again, they're building the future because they're living it. They're OEMing Vertica. That never would have happened 10 years ago. I agree. I think in our model, so we ingest data into Kafka. It's a great model. Kafka is very robust. It's being used by LinkedIn. It's been used by Twitter at scale. So as you're streaming in data in parallel, we use Spark into Hadoop. We don't use Hadoop to run MapReduce jobs, to be honest. We use Spark to ingest that data in parallel. So it's HDFS you're storing it on? Exactly. For long-term cold storage, it's going to be slower than Vertica, so we're going to put old stuff there. That's okay. We transform it in memory and in parallel, and then we write it into Vertica in parallel, which is kind of neat. We're sharing some new information about how to write into Vertica in parallel, and Vertica announced that they're going to make the hash available that makes all this happen. There's more on my GitHub wiki about it. But having the data warehouse and having that entire pipeline and being able to process the data, there's a total coexistence, from the queuing and the streaming, to doing stuff in memory for transformations, and then putting it into Vertica for analytics. And so, yeah, absolutely. What's happening to the analytics piece of that pipeline? A lot of the analytics is highly customized today.
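Editor's sketch: the pipeline described above (ingest from partitions in parallel, transform in memory, then write to the warehouse in parallel, with each row routed by a hash of its key) can be outlined roughly as follows. This is a standard-library simulation under stated assumptions, not the speaker's implementation and not the Kafka, Spark, or Vertica APIs. The `PARTITIONS` data, `NUM_NODES`, and the helper names are hypothetical, and `zlib.crc32` is only a placeholder for whatever segmentation hash Vertica actually uses.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 3  # hypothetical number of warehouse nodes

# Stand-in for the partitions of a message queue topic; contents are made up.
PARTITIONS = [
    ["u1,spin,10", "u2,spin,5"],
    ["u3,payment,100", "u1,spin,20"],
    ["u2,payment,50"],
]

def node_for(user_id):
    """Pick a target node from a hash of the segmentation key.
    crc32 is a placeholder, not the hash Vertica actually uses."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_NODES

def transform(partition):
    """Parse raw lines into rows and group them by target node, all in memory."""
    by_node = defaultdict(list)
    for line in partition:
        user, event, amount = line.split(",")
        by_node[node_for(user)].append((user, event, int(amount)))
    return by_node

def load(node_id, rows, warehouse):
    """Stand-in for a bulk load into a single node."""
    warehouse[node_id].extend(rows)
    return len(rows)

def run_pipeline(partitions):
    warehouse = {n: [] for n in range(NUM_NODES)}
    with ThreadPoolExecutor() as pool:
        # Transform every partition in parallel.
        grouped = list(pool.map(transform, partitions))
        # Merge per-partition batches so that each node has exactly one writer.
        batches = defaultdict(list)
        for by_node in grouped:
            for node_id, rows in by_node.items():
                batches[node_id].extend(rows)
        # Load each node's batch in parallel; writers never share a list.
        futures = [pool.submit(load, n, rows, warehouse)
                   for n, rows in batches.items()]
        loaded = sum(f.result() for f in futures)
    return warehouse, loaded

warehouse, loaded = run_pipeline(PARTITIONS)
```

The design point this tries to capture is that routing by the same hash the warehouse uses lets every writer send rows straight to the node that will own them, so the load step needs no reshuffling on the warehouse side.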
Colin talked about the ERP days, when it used to be highly customized and then became packaged. Will analytics and big data analytics go the same route? So, I think the field of analytics, analysis and business intelligence as we know it, going towards more of the data science, is literally just in its infancy, even though we've done so much with it. We look at analytics across the entire pipeline. So traditionally you're going to do some analytics in your data warehouse in Vertica, but if you have old data you want to merge with that, you're going to do some of it in our sort of middle tier, in this Spark and HDFS model, but also as it's streaming in real time. So imagine this: you'll do models and learning using all of your Vertica data, right? And all of your old data, user behavior, et cetera. And then what we want to do is use that to get as close to the user experience as possible. So as the data is coming in from the game and it's streaming, we can see what the behavior is, compare it to the models from machine learning, and then actually feed it back to the game as near real time as possible, so we can even shorten the loop from one week of turnaround to actually dynamically changing the game based on what they like. That's the field of predictive analytics, taken to the next stage. That's as close to real time as you can get in gaming. And that's what we're trying to do with the analytics: get more predictive analytics, more prescriptive analytics. That field is just exploding, and so is the amount of data it processes. And much tighter to the app. Absolutely, and it's all about the user and the user behavior, and that's where we're pushing the envelope in the game industry. And again, the feedback is critical for you guys because otherwise they leave the game, and you can see where they drop out too. It's like, at just one spot, people are dropping off.
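Editor's sketch: the feedback loop described above (learn a baseline from historical data, compare live streaming behavior against it, and flag where the game should change) might look roughly like this in miniature. The drop-off-rate "model", the `margin` threshold, and the sample data are all assumptions made for illustration; the interview does not say which models or metrics are actually used.

```python
from collections import Counter

def dropoff_rates(events):
    """events: iterable of (level, quit_flag) pairs; returns quit rate per level."""
    seen, quits = Counter(), Counter()
    for level, quit_flag in events:
        seen[level] += 1
        if quit_flag:
            quits[level] += 1
    return {level: quits[level] / seen[level] for level in seen}

def levels_to_adjust(baseline, live_events, margin=0.2):
    """Flag levels whose live quit rate exceeds the baseline by more than margin."""
    live = dropoff_rates(live_events)
    return sorted(
        level for level, rate in live.items()
        if rate - baseline.get(level, 0.0) > margin
    )

# Hypothetical historical data (the warehouse side) and live stream window.
HISTORY = [(1, False), (1, False), (1, True), (1, False),
           (2, False), (2, True), (2, False), (2, False)]
LIVE = [(1, False), (1, False), (2, True), (2, True), (2, False)]

baseline = dropoff_rates(HISTORY)          # learned from old data
flagged = levels_to_adjust(baseline, LIVE)  # compared against the stream
```

In this toy data, both levels have a 25% historical quit rate, while two of three live players quit at level 2, so only level 2 is flagged as the "one spot" where people are dropping off.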
Absolutely. In fact, there's a whole game economy around this because it's virtual currency. So imagine, like the U.S. economy, we have inflation and deflation. Imagine that you won all the money and then there's no money for you to be able to play with; it's game over for us as a company. So we have to manage that as well. Deflation going on, let's change the game. Exactly, we can't just print more currency. It's not like the U.S. government. I told my son, who is now 20: multiplayer gaming is really going to be the future of work. If you look at simulations in gaming, it truly does represent, I mean, it's a virtual space. It's not virtual reality like Oculus Rift or Second Life, but it's virtual. There are a lot of parallels, yeah. In fact, at the keynote, Poppy talked about the parallels between gaming and user behavior and training and life in general. And yeah, I find it fascinating. That's what brought me to gaming. Jack, thanks. Thanks so much for coming on. Really appreciate it. We'll give you the final word. Share with the folks out there about this event, because I mean, we've been here for three years, so we've been saying it, but in your own words, what's going on here? What's so special about this event? I was just telling Judy, who is one of the coordinators here. I was here last year, and I've been to a lot of events, like Spark Summit and many, many, many other events, mostly technical events, of course. Far and away, just being honest, this is the best event that I've ever been to. I came back again this year because I wanted to speak about the PSTL architecture, of course, and Vertica. There aren't other events like this where you can actually talk to developers. I mean, that alone is worth the price of admission.
You can literally talk to the Vertica developers, get the internal information that you just can't get anywhere else, and you can tell them your problems. We're working on things with so many other teams. Actually, the Vertica Spark connector, we're working with Vertica on that as well. So to be able to have this opportunity to actually share, and of course with like-minded people in the big data space. No heavy sales, no scripts, all authentic. Well, I'll tell you what, the thing I love as well is that I was on a panel, and HP and Vertica are perfectly fine with us just saying it like it is. These are the problems that we had in your product, and they're like, fair enough, we'll work on that. So it's just unfiltered, like Stonebraker. Yeah, that's why I love it, right? It goes back to the roots. I spent nine years at HP back in the 80s and 90s when Bill and Dave were there. I remember meeting them at my employee orientation. It's in their DNA; in fact, their product guide was like an engineering manual back in the day. They were a very engineering-focused company. Yeah, it shows. And to me, it's back to the roots for them, and I think this is a shining example of the HP way, Dave. So, really, congratulations to Vertica. So thanks for coming on theCUBE and sharing the insight on what's going on in gaming and the architecture. We love peeking under the hood, and certainly gaming's hot. Thanks so much. This is theCUBE, live in Boston. We'll be back after this short break.