Live from Boston, Massachusetts, it's theCUBE at the HP Vertica Big Data Conference 2014. Brought to you by HP. With your hosts, John Furrier and Dave Vellante.
Okay, welcome back. We're here live in Boston, Massachusetts for SiliconANGLE's theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the co-founder of SiliconANGLE. And I'm here with two distinguished engineers from Zynga: Yuko Yamazaki, general manager of analytics, head of analysts at Zynga, and Joanne Ho, engineering manager at Zynga. Welcome to theCUBE.
Thank you.
Thanks for stopping by. We do go out to the events. We get the live action. It's like the ESPN of tech, some say, but we go out and talk to all the tech athletes. And we love gaming because gaming is real time, with a loyal audience, and if something's wrong, you'll know about it, because you either get angry flame mail or you lose customers, right? So data is critical. So I've got to ask the first question: as head of analytics, what's your world like? What goes on? What's the strategy? Collect everything? Give us a quick overview of how you guys approach the data given the gaming market.
You know, our team, our analytics team, has been here for five years now. It's actually not as challenging as it sounds. We've got the schema, we've got the database, we've got end-to-end services, you know, visualization, experiments. Everything is already set up, and the way we set it up, we can launch as many games as the company wants and we don't have to change anything. So it's actually not that crazy difficult.
So horizontally scalable, but you can repurpose for new games, is that what you're saying? And scale's been no problem at all?
No problem at all, and there are no additional tables, schemas, or technology that we have to build for a new game.
That's fantastic. So okay, so we'll just get this on the record.
It's an easy job, and everything's working great, it's scaling beautifully. What are you doing in your free time, sailing in the bay? What's going on? You're in the trenches. What's the day-to-day like for you?
Well, I manage the Vertica database at Zynga, and I'm also in charge of the ETL process. So we do have a lot of data, day in, day out. One thing I want to echo from Yuko's comment earlier is that we actually standardize our taxonomy across all games. So for critical data, we have a standardized taxonomy as a first tier. And then we also let games log their own game-specific tracking into our Vertica database. So we load all the game activities and user activities into our Vertica database, and we can do analysis at a later time.
So standardization seems to be a big theme these days. So first question: what kind of data can you share, data volumes, with us? It sounds like, I mean, I kind of imagine it must be massive. What are some of the numbers?
On average, we load about 50 billion records per day.
Yeah. B, billion. Billion. Okay, billion. That's a lot. Okay, so we've got a lot of records. Petabytes, any kind of petabyte numbers? Is it not big data? Is it little data, or is it just transactional?
It's more like transactional data. Yeah. The entire database is about 500 terabytes compressed with about eight to nine.
Okay, so it's not hugely, you know, big volume, but a lot of activity.
Yeah.
Okay.
Yeah, so I think one of the biggest success stories for us is that we allow PMs, product managers, and engineers to decide what they want to analyze first. So that really creates the data culture, and they start thinking about what it is that we need to log for us to be able to analyze and move the business value. So we don't just capture everything that happens and then decide. We let them decide first, and then collect what makes sense.
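The two-tier taxonomy described above (a standardized first tier shared by every game, plus game-specific tracking) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Zynga's actual schema: the field names in TIER_ONE_FIELDS, the Event class, and make_event are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical tier-one fields that every game must log (illustrative names only).
TIER_ONE_FIELDS = {
    "user_id": str,
    "game_id": str,
    "event_type": str,
    "timestamp": int,
}


@dataclass
class Event:
    tier_one: Dict[str, Any]                                       # standardized across games
    game_specific: Dict[str, Any] = field(default_factory=dict)   # free-form per game


def make_event(raw: Dict[str, Any]) -> Event:
    """Split a raw record into its standardized and game-specific parts."""
    tier_one: Dict[str, Any] = {}
    for name, expected_type in TIER_ONE_FIELDS.items():
        if name not in raw:
            raise ValueError(f"missing tier-one field: {name}")
        if not isinstance(raw[name], expected_type):
            raise TypeError(f"{name} must be of type {expected_type.__name__}")
        tier_one[name] = raw[name]
    # Anything outside the shared taxonomy is kept as game-specific tracking.
    game_specific = {k: v for k, v in raw.items() if k not in TIER_ONE_FIELDS}
    return Event(tier_one, game_specific)
```

Because the first tier is identical for every game, a new game would only add keys to the game-specific part, which matches the point that launching a game requires no new tables or schemas.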
So they're like the data wranglers. They set the agenda for the data strategy, because they're thinking with the product in mind: user experience, feedback, interaction, all of the above. So they kind of look at the data from that perspective.
Yeah. So at the design stage, they start to think about what data we need to collect, what we need to analyze, and so forth.
So the taxonomy sounds like a big innovation. When did that come about? How did that happen? Was it just you saying, hey, we're pulling our hair out, we're hacking our way to success, we have to do something? Was it a redesign? Was it more thinking about the future? How did you get to the standard taxonomy? Give some color, some detail around that whole decision.
So the base, what we call tier-one standardization of the taxonomy, was developed a long time ago, and that data has been collected from every single game. But as we notice success with a particular metric that a product team has logged, one that makes sense to standardize, we continue to improve it.
What's the biggest challenge looking back now? You're smiling, you're feeling good, you're happy, because not a lot of people have a standardized taxonomy. You guys are really ahead of the curve. What was the biggest challenge you found getting here? Because there's a lot of pressure, the games are significant, there are a lot of things going on, user accounts and whatnot, and you want to stay successful. So what was the biggest challenge?
For me, I think, as I mentioned, we allow anyone to see the data and we allow anyone to do anything with the data. That sometimes creates a challenge: an experiment not being set up correctly, or a particular PM saying this is the result of this experiment when that actually wasn't the right way to read it, for example. So it does mean the data that's been measured is not always perfect.
What's the philosophy on the data? Because I asked the VP of Engineering at Vertica, what should we think about first, the data or the database? In the old days when we were doing database work, it was: you've got to get the database, and once we decide on the database, then we can move to the next step. Now it's a different issue. Unstructured data is a big part of it. Schemas are great to have in place, but you've got to deal with the structured and unstructured data. How do you guys deal with that?
So I think first we need to look at our use cases, understanding the business requirements first. Then we design the data that we want to track, and then we can decide on the taxonomy, like the schema, later on. But most of the Zynga data that we log is pretty much in a structured format.
So talk about the team. How big is the team? Where are you guys located? San Francisco, obviously, the headquarters is in SF. Anywhere else, other places? Do you guys have your core team there? How big is it? Can you share?
Yeah, I think most of the employees are in headquarters.
In San Francisco. What's the culture like? Are they data nerds? Are they geeks? Are they getting in the data? I mean, it's a data conference, we're all data geeks. What's the culture like? Do people have big data hackathons? Zynga is a really dynamic company. So what's it like there?
Definitely a very, very data-driven culture. Even product managers are very technical. They understand data, they understand how to set up experiments, how to make personalized games using data. So yeah, it's definitely a data geek culture.
So I have to ask you ladies something, because this is really something we talked about yesterday: growth hacking. Well, we talked about the product CEO kind of role, the role of product manager, really important. Growth hacking has become a big part of social, online stuff. Data plays a big role in growth hacking.
To do it right, growth hacking is a dangerous game, because if you growth hack and you make users angry, it actually backfires. So there have been a lot of companies that have done it wrong and right. Some have done it extremely well, elegantly. Some have failed. A/B testing seems to be weird. So how do you look at the data aspect to do growth hacking right? Meaning growth hacking to get more users in a way that's elegant and relevant. What's your take on that in analytics? Do you guys have a formula? Do you have an opinion on this? No?
So something that we developed in the past is personalization models to enhance the user experience. One of the really good models that we have developed is about how to increase installs of new games. We look at the current user's playing pattern, at about 10 metrics of the current user, for example the user's engagement or payment pattern.
So you can see the fatigue pattern. You can see users that are on the brink of maybe being kind of bored, who maybe need an incentive.
We also look at your friends' playing patterns, so that we can predict how likely you are to install that game, and introduce a game that you will be interested in.
Yeah, so social graph stuff around the games, and overlays on the affinity data between certain features as well?
Yeah. So we're very focused on user satisfaction and how to enhance your experience while playing our games.
Well, I think you guys are awesome. Thanks for coming on theCUBE. I want to ask you one last question. But Zynga is a great company. We've been following the success, the rise. I was there when Mark Pincus was, this is Nome Dex, it's like 2000 and something, he had just started doing the Texas Hold'em, and I heard him talking about it, he was like riffing on it, and then it became a huge success. Congratulations. But you had a lot of chances to do some things from a clean sheet of paper.
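The install-prediction idea mentioned above (combine a user's own playing metrics with what their friends are playing) might be sketched like this. It is only an illustration of the general shape of such a score: the logistic form, the weights, the metric names, and the install_likelihood function are all assumptions, not the model Zynga built.

```python
import math
from typing import Dict, List


def install_likelihood(
    user_metrics: Dict[str, float],   # e.g. engagement or payment features (hypothetical names)
    friend_installed: List[bool],     # whether each friend already plays the candidate game
    weights: Dict[str, float],        # hypothetical per-metric weights
    bias: float = 0.0,
    friend_weight: float = 1.0,
) -> float:
    """Return a score in (0, 1) using a simple logistic combination."""
    # Weighted sum over the user's own metrics; metrics without a weight contribute nothing.
    score = bias + sum(weights.get(name, 0.0) * value for name, value in user_metrics.items())
    if friend_installed:
        # Fraction of friends who already installed the game.
        score += friend_weight * (sum(friend_installed) / len(friend_installed))
    return 1.0 / (1.0 + math.exp(-score))
```

In this sketch, a candidate game that more of a user's friends already play gets a higher score, so it would be the one recommended first. A real model would learn the weights from historical install data rather than take them as inputs.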
So I want to ask you a final question. Each of you, share something that we don't know about Zynga that surprised you with the data, something that happened with a game that was data driven, something positive that was really an aha moment for you guys. Share an experience that you had.
Yeah, so we've launched a lot of games in the last five years, and using the data we have definitely noticed that there is a pattern from one game to another. It's not like something only happens in one particular game, and there are a lot of things that we can repeat as a success. So our central analytics has been a success with that. I think there are a lot of learnings that you can carry from one game to another, and we continue to improve.
Joanne, anything that surprised you, that you didn't expect to come out of the data around the games?
Well, I think one thing that I'm very proud of for my company is that we standardized how to log our data into one centralized place, which is very easy to manage, and we can also increase adoption across the company, which is a really good thing.
And also position yourself for deploying new games without rebuilding, tooling up.
Because we have a centralized place for storing the data, users can access the same data. They can even look at other games' data to cross-train each other, and a new game can just learn from existing successful games as well.
You guys are awesome. You can come on theCUBE anytime. We have our Palo Alto office. We have CUBE conversations anytime you guys want to explore more. Love to drill down. Joanne, thank you very much for coming on theCUBE. This is theCUBE. We're talking about analytics with the leaders at Zynga who have created a great model: standardized taxonomy, rebuilding capabilities with new games, and awesome analytics. Thanks for joining us. This is theCUBE. We'll be right back live in Boston after this short break.