Okay, we're back here live in Boston, Massachusetts, at the HP Vertica Users Conference. This is SiliconANGLE and Wikibon's theCUBE, our flagship program where we go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE, and I'm joined by my co-host. Hi everybody, I'm Dave Vellante of wikibon.org. Peter Fishman is here. He's the director of analytics at Yammer, obviously part of Microsoft now. Pete, welcome. Thank you very much, good to see you. So Yammer, you guys had the acquisition. You were a startup, a home run, a great company. Enterprise 2.0, the social enterprise, whatever people call it these days, but obviously an important part of collaboration is connecting social data to the way people work. So I've got to ask you: that's a big data problem, and it's a platform. But now you guys work for Microsoft, and you're still kind of separate. Give us the update on Yammer. Where are you right now in terms of Microsoft? Are you integrated in? Are you still being left alone? Obviously we'd love for Yammer to have as big an impact on people's working lives as possible. Microsoft has an incredible reach, and working with all the teams from Office gives us a tremendous opportunity. We think social is going to be a big part of the story going forward, and in terms of enterprise social, Yammer is a tremendous offering. You know, we were talking in our intro this morning about how Vertica was an acquisition for HP, and things change. You get the wind at your back, a little more muscle, a huge install base to work with. I'm sure it's probably changed your game significantly. And with respect to Vertica and HP, there's also been a great relationship between Microsoft and HP. You guys have still been on the front end of a cutting-edge market, and that is social collaboration, social networks within distinct user groups. It's a different channel. It's not a clean sheet of paper; you've got different legacy architectures to deal with. There's a lot of talk about different data sources, so you must have a lot of experience with that. So tell us, what's the state of that market right now? Some say it's stalled; some say it just hasn't broken out yet. What's your view? You know, when you're talking about an integration, you're talking about a lot of different challenges, data among them. When you're in control of your entire pipeline, you're able to make the decisions that make the most sense for you. And when you're talking about the scale of a startup, you're projecting forward how big you're going to be, but you're not necessarily there yet, whereas the scale of Microsoft is just unthinkable on some level. There are a lot of data sources out there that you have to roll into any enterprise, right? You still have challenges on the data side. Sure. We as an analytics team want to touch the entire business. We want to go horizontally across the entire business, not just affect the product, not just instrument our software well and then use those pieces of data, but also pull in things like data from your CRM platform (Dynamics, for us) and data from your marketing platform.
So a number of different data challenges come up as you start to take on different data sources, and of course you get a lot of the real value once you start combining those different sources. So talk about the big data strategy for you guys on the analytics side. On analytics, you have Tableau here, and it's gotten really crowded; they went public, so visualization has become important. And we had Spil Games on earlier talking about their use cases, how they use data to create the user experience. You guys kind of have to do a little bit of both, and all of the above. What is the philosophy, and what are some of the challenges you're innovating around right now? Well, a lot of our philosophy is around cheapness, in the sense that we want to deliver insights cheaply in terms of the actual economics of it, but also, simply put, we don't go too deep into a lot of our models or visualizations; we use whatever gets you the answer in the simplest fashion. So we have a lot of really high-firepower brains on the team, and we ask them to deliver insights in basically the simplest way possible, not to show off their brains by being smart, but by doing simple things like counting. We have a lot of homebrew in our front end. We've built our own front-end tools, which have become a big part of how some groups at Microsoft are doing their analysis, so it's been really exciting for us to extend our tool chain. Our tool chain is all about our workflow, about how a centralized analytics team can deliver insights. We've built a nice, clever front end that allows us to very trivially distribute the SQL queries that our analysts write, in a parameterizable way, so that all of the end users can use them, get insights, and improve the business. So I've got to ask you, Dave and I were talking, we were looking at the list of interviews we're going to do this week and at your LinkedIn profile, and you have a variety of experience. You did an internship with the Eagles. Obviously we're Patriots fans, so there's a little bit of an Eagles-Patriots thing going on today. But you've got a PhD, you're a stats guy, right? So you had to like Billy Beane coming into the keynote today, right? That big data, Moneyball thing. What's your take on that? Yeah, it's obviously very exciting to be at the same conference as Billy Beane, a pioneer in really using data effectively within sports. After I finished my PhD, I worked in the front office of the Eagles for the GM, working on big data in sports, well, the funny part is that we would have called that big data back then. When I did my dissertation, if you had a data set with hundreds of thousands of observations, that was big data. Now, if I don't see that nearly every second, there's something weird going on. So with this being a conference about data, you know, back in the day it was very much in certain silos, but obviously the sports silo has erupted. I did an internship with Philadelphia where we were looking very closely at all the prospects and using all of their physical attributes to build models around how likely we thought they were to succeed.
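As a rough illustration of the parameterized query distribution Fishman describes, here is a minimal sketch: an analyst authors the SQL once, and end users only supply parameters. The table, columns, and report are hypothetical, and it uses SQLite from the Python standard library purely so the example runs anywhere; Yammer's actual front end and Vertica schema are not described in the interview.

```python
# Minimal sketch of an analyst-authored, parameterizable report template.
# Table and column names are hypothetical; SQLite stands in for the warehouse.
import sqlite3

# Analyst-authored template; end users never edit the SQL itself.
DAILY_ACTIVE_USERS = """
    SELECT date(event_time) AS day, COUNT(DISTINCT user_id) AS active_users
    FROM events
    WHERE event_type = :event_type
      AND event_time >= :start_date
    GROUP BY day
    ORDER BY day
"""

def run_report(conn, params):
    """Execute the shared template with user-supplied parameters."""
    return conn.execute(DAILY_ACTIVE_USERS, params).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INT, event_type TEXT, event_time TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [(1, "post", "2013-08-01 10:00"), (2, "post", "2013-08-01 11:00"),
         (1, "post", "2013-08-02 09:00")],
    )
    # An end user just picks the parameters; the SQL stays centralized.
    print(run_report(conn, {"event_type": "post", "start_date": "2013-08-01"}))
```

The design point is that the SQL stays centralized with the analytics team while the parameter choices, and therefore the insights, are pushed out to everyone else.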
And stats is huge now, with fantasy on the consumer side, and with Moneyball being the domino that kicked off the new method of everyone using data as a competitive advantage. Obviously in sports that's pretty straightforward, but business now has the same issue, right? So how do you thread that? Sports is the obvious example: talent acquisition, performance on the field. You can really make that case for the business as well, in terms of customer acquisition, personnel, and also business performance, the value proposition, right? So how do you look at that transition? Well, absolutely. That's what we're all here for. We're all here to get a competitive advantage, and we think data is really critical to that. I think back on the Yammer founders, David Sacks and Adam Pisoni, and they had this vision that they wanted to develop B2B software just like consumer software. And in consumer development you see a lot of data-driven development: figure out what your customers are actually doing, don't hypothesize about it. Figure out what it is they're using about your product, what's effective about your product, which features are working and not working. We've been in Boston for the past few days and we've got Billy Beane speaking, which gives me some liberty to use a baseball analogy. It's sort of well known in software development that hall-of-fame product managers are like hall-of-fame baseball players, in that they're getting it right about a third of the time. So what that means is that you can marginally improve on that if you can essentially spend fewer cycles on what's not working. Pretty much every feature that we release at Yammer is split tested, A/B tested, and obviously we keep the ones that prove out to be valuable. That has the direct impact that we know what we release works, but there are also indirect impacts: it creates a feedback loop to your product managers and engineers, letting them know what is effective, and, the big heroic action, it deters us from spending a lot of cycles on things that don't prove out to be valuable. I often make the analogy to a poker player. In poker, the amateurs and novices remember the hands where they won big pots, maybe through luck or maybe through supposed skill, whereas the pros brag about the hands they were able to get away from, the hands where they didn't lose too much money when they were potentially stuck. Taking that type of philosophy into your product development is going to give you, I think, a real competitive advantage over the long run. So you talk about A/B testing. Do you look at ways in which you can actually increase the performance of your A/B testing and predict what's going to work, to the point of not doing things that are going to fail? Do you try to do that, or do you say it's so easy to A/B test that you can just iterate very quickly? Yeah, so I think our job is to make A/B testing as cheap as possible, again, getting back to the theme of making distributed analytics very, very cheap.
So that means, obviously, the first part is that anyone can do it, in the sense that we want to build tools that enable anybody at the company to assess how effective a feature is. That's a value statement more than it is a cheapness statement. But yeah, if it's expensive, nobody's going to adopt it and you're not going to get any value. Sure, exactly. And then part of that is you don't need a team of statisticians to tell you whether something is statistically significant. You just need to know what the actual p-value is for your test, and then the person who owns that product decision can act with confidence, or be data-driven there. And what gives you confidence? I asked this question earlier, but I'd like to hear your comment, Peter. What gives you confidence that data will give a competitive advantage that's sustainable? We're hoping to have Billy Beane on; I want to ask him, do you have a sustainable competitive advantage with Moneyball now that everybody else is doing it? I think so. Certainly I could imagine a world where the competitive advantages become more and more challenging to find in baseball, in that it may have been very easy to identify where the mispricings were on the offensive side of the ball, because that's easily identifiable, whereas the defensive side of the ball is maybe a much more challenging data problem. So you might have to leave the green fields and go into somewhat more difficult pastures. For us, how do we know that it's working? It's always weird to ask the measurer how you measure yourself, but I think the answer is a couple of things. One, the services are in demand. We're big believers in revealed preference: people actually want to use your tools, want to take advantage of the things that you put your energy and effort into and believe are having an effect. You can notice the lifts associated with the tests you were part of, but that's not really a proof statement, because there's no alternative universe where you get to see it play out both ways. So it's very hard to actually measure one's impact. But there are some domains, like marketing, where you can actually look at your customer acquisition costs and see how different optimizations, and investing more in a certain type, have meaningful impacts on your bottom line. So talk a little bit about how you use Vertica. Paint a picture of your environment and your architecture. Sure, we've been Vertica customers for a long time, and the story I've been telling at this conference is that when we were a startup, we started out on a half-terabyte license, and then they asked me to predict how much we would need in a year. I had a low, a medium, a high, and then a ridiculously high estimate. And of course my estimates were terrible, and we ended up an order of magnitude above my ridiculously high estimate. What you find with data is that it's very addictive. If you're able to start small and get a customer base that really likes it, they're just going to say, well, can you also track this? Can you also track that? Vertica's incredible speed has allowed us to do a lot of ad hoc querying and do it in very, very short iteration cycles, and that's where you're able to derive that value from your sharp data analysts and data scientists.
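Fishman's point about p-values can be made concrete with a small example. This is a minimal sketch of a standard two-sided, two-proportion z-test on an A/B split, with made-up counts; it illustrates the general statistics behind such a check, not Yammer's actual tooling.

```python
# Two-sided, two-proportion z-test for an A/B split; the counts are made up.
from math import sqrt, erfc

def ab_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)                 # pooled rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))   # standard error
    z = (p_b - p_a) / se                                     # z statistic
    return erfc(abs(z) / sqrt(2))                            # two-sided p-value

if __name__ == "__main__":
    # Control: 480 of 10,000 users engaged; variant: 540 of 10,000.
    p = ab_test_p_value(480, 10_000, 540, 10_000)
    print(f"p-value = {p:.4f}")  # the feature owner compares this to a threshold like 0.05
```

With numbers like these, the person who owns the product decision just compares the printed p-value against a chosen threshold and keeps or kills the variant, no statisticians required.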
Okay, so maybe you can add a little color on what your environment looks like? Sure. We've got 16 nodes, and we have, again, data flowing in from a number of different sources across the company. We actually have a nice platform where our data analysts and data scientists act as data architects, in the sense that they're close to the customer and the analysis, so they know exactly what columns are going to be needed and what the new columns are, and effectively they're writing their own ETLs and figuring out which data is the useful data we need to keep moving the business forward. What's the biggest analytic surprise you've learned in the enterprise? Some have been critical of the adoption. You have Jive, you have Yammer, you guys are out there pioneering, and now it seems to be swinging back. Some have criticized the growth. I've been saying it's too hard, it's unstructured, a lot of different structured and unstructured data; it's the same argument Enterprise Search had back in the day. Turning around, growing, still great. What are the biggest surprises you've seen? Well, the data. Certainly Yammer has still been growing. Obviously, when you look at the distribution of who's posting, that crazy right-tail distribution, we're a little bit different from a lot of the social networks that have that 1-9-90 rule. We actually have a much broader base of contributors, which makes a lot of sense in the work context, where you're trying to surface your experts and have a number of different people contribute to the conversation. One of the things that's also been really surprising for us is just how impactful a business leader can be on a network. One of the things we find relates quite closely to a network's health is how engaged the leadership is. What's nice about Yammer is that we have the bottom-up growth, the growth that happens through inviting colleagues and through that virality. But what you'll see is that when people are having meaningful voluntary interactions with the leadership, one, you're able to surface people's skills in incredible ways and identify who can be that next generation of leaders, but in addition, it's great to see that engagement across the whole company, empowering everybody, working openly. Those are things we take a lot of pride in. Final question for you, then we'll break. What needs to get worked on? Obviously it's still early in this game. Now mobile's going crazy, and cloud, social, and big data are kind of bundled in, which makes sense, I mean, social data, social interactions are data, but still: implied data, real data, explosive data. What are the key areas that need a lot of work on the analytics side? Is it the visualization, is it more back ends, is it the science, is it machine learning, is it some of the voice stuff? Yeah, I still see a lot of opportunity in just the basics of counts. Again, we have people who really have the ability to do deep modeling work, deep visualization work, who have a passion for it, and that's what's sexy when you see it. But in reality, we're still a plumbing shop. We're a shop that is trying to find clean data, to capture data in a replicable, auditable, clean way. And I still find that. As Dave and I talk about, it's like fishing: you want the clean fish and the good fish, right? You want the good data.
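For flavor, here is a hypothetical sketch of the kind of analyst-owned ETL step Fishman describes, where the analyst who needs a new column adds it and loads the data that fills it. The table, column, file, and connection details are all invented; it assumes the open-source vertica-python driver and a reachable Vertica cluster, neither of which the interview specifies.

```python
# Hypothetical analyst-owned ETL step: add a column, then load a day's export.
# Table, column, file, and connection details are invented; assumes the
# open-source vertica-python driver and a reachable Vertica cluster.
import vertica_python

conn_info = {
    "host": "vertica.example.internal",  # placeholder connection details
    "port": 5433,
    "user": "analytics",
    "password": "********",
    "database": "warehouse",
}

ADD_COLUMN = "ALTER TABLE events ADD COLUMN client_type VARCHAR(32)"

LOAD_DAY = (
    "COPY events (user_id, event_type, event_time, client_type) "
    "FROM STDIN DELIMITER ',' ABORT ON ERROR"
)

def run_etl(csv_path):
    conn = vertica_python.connect(**conn_info)
    try:
        cur = conn.cursor()
        cur.execute(ADD_COLUMN)          # the analyst decides the schema change...
        with open(csv_path, "rb") as f:  # ...and owns loading the data behind it
            cur.copy(LOAD_DAY, f)
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    run_etl("/data/exports/events_2013-08-06.csv")
```

The point is not the specific driver; it is that the person closest to the customer and the analysis owns both the schema change and the load, rather than handing the work to a separate data engineering queue.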
Data quality has always been a challenge, and it continues to be. I think there's a line about how novices talk about analysis, but this is really about strategically devoting the right cycles to building that clean pipeline. There are a lot of different approaches. Peter, thanks for coming on. A great company, and a great, great exit for the startup, selling to Microsoft, which is now reorganized. It's Ballmer's last stand; he's going to go full throttle, getting everything put into its right, organized place. Not a lot of silos anymore at Microsoft, so good luck with everything there. Obviously we're bullish on Skype and Xbox and what you guys are doing, and on Microsoft moving stuff into the cloud. If you fix the edge and make that software work, you've got Azure, and you guys have a lot of opportunities. So congratulations on that, and thanks for sharing your perspective. We love data, and data quality is key. This is theCUBE, sharing data with you, and that's good data from the data analytics man himself. Peter, thank you for joining theCUBE. We'll be right back with our next guest after this short break. Pleasure, thanks guys.