Hi everybody, we're back. This is Dave Vellante. theCUBE goes out to the shows, we extract the signal from the noise. Nate Silver is here. Nate, we've been saying that since 2010. Honestly, I'm going to rip you off.

Really? I'm going to trademark that, I think, actually.

Oh, you have that trademarked? No, no, no. Okay. So, anyway, welcome to theCUBE, a man who needs no introduction, but in case you don't know Nate, he's a very famous author, founder of 538.com, a statistician, an influential individual, a predictor of a lot of things, including presidential elections. Great to have you here.

Great to be here.

So, we listened to your keynote this morning, and we asked some of our audience earlier to tweet in: what would you ask Nate Silver? Of course, we got the predictable: how are the Red Sox going to do this year, who's going to be in the World Series, are we going to attack Syria, will the Fed ease or tighten, and of course, since we're down here, who would you vote for. They all want to know. A lot of these questions you can't answer because it's too far out, but anyway, again, welcome to theCUBE.

So, I want to start by picking up on some of the themes in your keynote. You're here at the Tableau conference, and obviously it's all about data. One of your basic premises was that people will misinterpret data, that they'll use data to serve their own biases. You've been a controversial figure, and a lot of people have accused you of bias. How do you feel about that, as a statistician, somebody who loves data?

I think everyone has bias in the sense that we all have one relatively narrow perspective compared to the big set of problems that we're all trying to analyze or solve or understand together.
You know, I do think some of this actually comes down to not just bias but personal morality and ethics, really. It seems weird to talk about it that way, but there are a lot of people involved in the political world who are operating to manipulate public opinion, and who don't really place a lot of value on the truth, right? And I consider that kind of immoral. But people like that don't really understand that someone else might act morally by just trying to discover the way the objective world is, trying to use science and research to uncover things. And so I think it's hard for people, because if they were in your shoes, they would try to manipulate the forecast, they would cheat and put their finger on the scale. They assume that anyone else would do the same thing, because they don't know any better.

Yeah. Well, you've made some incredibly accurate predictions in the face of others who clearly had bias and who, you know, mispredicted. So how did you feel when you got those attacks? Were you flabbergasted? Were you pissed? Were you hurt? All of the above?

Oh, you know, you get used to it; there's a lot of bullshit, right? You're not too surprised. I guess it surprised me how much people who you know are pretty intelligent are willing to fool themselves, and how specious the arguments were. By the way, people are always constructing arguments for outcomes they happen to be rooting for, right? It's one thing if you said, well, I'm a Republican, but boy, I think Obama's going to crush Romney in the Electoral College, or vice versa. But you should apply an extra layer of scrutiny when you have a view that diverges from the consensus, or from what the markets are saying. And by the way, there were betting markets. You could have bet on the outcome of the election with bookies in the UK and other countries, right? And they had forecasts similar to ours.
People were actually putting their money where their mouths were, and they agreed that Obama was, if not a lock, a pretty heavy favorite throughout most of the last two months of the election.

Well, I wanted to ask you about prediction markets, because as you probably know, the betting public are actually very efficient handicappers, right? The two-to-one shot is going to beat the three-to-one, which is going to beat the four-to-one, more often than not. What are your thoughts on prediction markets? You just alluded to betting markets; are they a good indicator?

Well, they're a lot better than the punditry, right? With prediction markets, you have a couple of issues. Number one is, do you have enough liquidity and volume in the markets for them to be optimal? I think the answer right now is maybe not exactly. These Intrade-type markets (Intrade has since been shut down) in fact had pretty light trading volume. You might have had people who stood to gain or lose thousands of dollars, whereas in quote-unquote real markets, the stakes are several orders of magnitude higher. If you look at what happened to, for example, prices of common stocks the day after the election last year: oil and gas stocks lost billions of dollars of market capitalization after Romney lost. Conversely, some green-tech stocks, or certain types of health care stocks that benefit from Obamacare going into effect, gained hundreds of millions or billions of dollars in market capitalization. So real investors have to price in these political risks anyway. I would love to see fully legal betting markets in the US, where people can bet proper sums of money, where you have a lot of real capital going in and people can hedge their economic risk a little bit more. But, you know, they're very good. It's very hard to beat markets. They're not flawless.
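The "efficient handicappers" point can be made concrete: bookmaker odds encode probabilities. Here is a minimal sketch, assuming simple fractional "N-to-1 against" odds; the numbers are illustrative, not actual market prices from any election.

```python
# Sketch of how bookmaker odds map to probabilities, per the "efficient
# handicappers" point above. The fractional-odds formula and overround
# normalization are standard; the odds values here are illustrative.

def implied_probability(odds_against: float) -> float:
    """N-to-1 against implies a win chance of 1 / (N + 1)."""
    return 1.0 / (odds_against + 1.0)

def normalize(probs):
    """Raw bookmaker prices sum to more than 1 (the bookie's margin,
    or 'overround'); rescaling yields the market's implied chances."""
    total = sum(probs)
    return [p / total for p in probs]

# A two-to-one shot should beat a three-to-one shot, and so on:
raw = [implied_probability(n) for n in (2, 3, 4)]
print(raw)  # roughly [0.33, 0.25, 0.20]
print(normalize(raw))
```

If the public is an efficient handicapper, these implied probabilities roughly match the long-run win frequencies at each price, which is exactly the property being claimed for betting markets.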
And there's a whole chapter in the book about how, you know, the minute you assume that markets are clairvoyant and perfect, that's when they start to fail, ironically enough. But they're very good, very tough to beat. And they certainly provide a reality check, in terms of giving people real incentives to actually make a bet on their beliefs. When people have financial incentives to be accurate, a lot of the bullshit goes away; a bet is a tax on bullshit, is one way some economists put it.

So I've got to ask you, I infer that you're still a baseball fan, right? A Detroit fan?

I'm a Tigers fan, you know, there's my bias.

Do you remember the Bird? You're too young to remember.

A little too young. I was born in '78. The '84 Kirk Gibson and Alan Trammell teams are kind of my earliest memories.

So you definitely don't remember Mickey Lolich. I used to be a big Detroit fan as well. Anyway, when Moneyball came out (we were just at the Vertica conference, and we saw Billy Beane there), I said, Billy Beane's out of his mind for releasing all these secrets. And you alluded in your talk today to how other teams, like the Rays and the Red Sox, have started to adopt those techniques. At the same time, I feel like culturally, to use another one of the vectors in your Venn diagram, Oakland has done a better job of it, and others are still culturally pushing back. Even the Red Sox themselves, it could be argued, went out and violated the principles, which of course the Oakland A's can't do because they don't have the budget. What's your take on Moneyball? Is the strategy he put forth sustainable, or is it all going to be a level playing field eventually?

Well, I mean, the strategy in terms of, oh, find guys that take a lot of walks, right? Everyone realizes that now. It's a fairly basic conclusion.
And it was kind of a sign of how far behind the market was, how many biases there were in it, that "use OBP instead of batting average" was an edge. But that arbitrage was five or ten years ago now.

You know, it doesn't necessarily put butts in the seats, right? I mean, if they win, I guess it does, but even the Red Sox are winning and nobody goes to the games anymore.

The Red Sox? There are tons of empty seats, even for Yankees games.

Well, I mean, they're also charging $200 a ticket or something. It's another matter if you can get a ticket for $20 or $30. But, you know, first of all, the most demonstrable connection in baseball is that if your team is in pennant races or wins the World Series, that produces multi-million-dollar increases in ticket sales and TV contracts down the road. So one thing is looking at the financial side, like modeling the marginal impact of a win, but also modeling: if you do sign a free agent, how much does that signaling effect matter for season ticket sales? So you could do some more high-finance stuff in baseball, beyond the low-hanging fruit. I mean, almost every team now has a statistical analyst on their payroll, and increasingly the distinctions aren't even as relevant anymore, right? Someone who's versed in analytics is also listening to what the scouts are saying. You have organizations that aren't making these distinctions between stat heads and scouts at all. They all get along. It's all about finding better, more responsible ways to analyze data. And baseball has the advantage of a very clear way of measuring success: do you win? That's the bottom line. Or do you make money, or both? You can isolate a guy's marginal contribution. I mean, you know, I'm in the process now of hiring a bunch of writers and editors and developers for 538, right?
So someone has a column and they do really well. How much of that is on the writer versus the editor, versus the brand of the site, versus the guy at ESPN who promoted it, or whatever else, right? That's hard to say. But in baseball, everyone kind of takes their turn. It's very easy to measure each player's marginal contribution.

So a balance, an equilibrium, is potentially achieved. But again, from your talk this morning, volume of data doesn't trump modeling, right? You need both, and you need culture.

You need volume of data, you need high-quality data, and you need a culture that actually has the right incentives aligned, where you really do want to find a way to build a better product and make more money, right? And it'll seem like, oh, how difficult should it be for a company to want to make more money and build better products? But in large organizations, you have a lot of people who are thinking very short-term, or only about their own P&L and not how the company as a whole is doing, or who have hang-ups or personality conflicts or whatever else. So a lot of success in business, and certainly when it comes to the use of analytics, is just stripping away the things that get in the way of understanding and distract you. It's not waving some magic wand and having some formula that uncovers all the secrets in the world. It's more like, if you can strip away the noise, then you're going to have a much clearer understanding of what's really there.

Nate, again, thanks so much for joining us. I kind of want to expand on that a little bit. When people think of Nate Silver, sometimes they think Nate Silver, analytics, big data. But actually, some of your positions take issue with some of the core notions of big data, really around the importance of causality versus correlation.
So, we had Kenneth Cukier from The Economist here at the Strata conference; he wrote a book about big data a while back. In that book, they talk a lot about how causality really doesn't matter anymore: if you know that your customers are going to buy more products based on this data set or this correlation, it doesn't really matter why; you just try to exploit it. But in your book, and in the keynote today, you talked about how hypothesis testing, coming in with some questions, and actually looking for that causality is also important. So what is your opinion of all this hype around big data? You mentioned volume is important, but it's not the only thing.

I mean, ultimately I'm kind of an empiricist about everything, right? So if it's true that merely finding a lot of correlations in very high-volume data sets will improve productivity, then how come we've had such slow economic growth over the past ten years? Where's the tangible increase in patent growth, or in other measures of progress? And obviously there's a lot of noise in that data set as well. But that's partly why, both in the presentation today and in the book, I opened with the history, saying, let's really look at the history of technology. It's a fascinating, understudied field, the link between technology and progress and growth. But it doesn't always go as planned. I certainly don't think we've seen any kind of paradigm shift in technological or economic productivity in the world today. I mean, the other thing to remember, too, is that technology is always growing and developing, and if you have roughly 3% economic growth per year, that's a lot of growth, right? It's not straight-line growth, it's exponential growth, and 3% exponential growth compounding over many years is a lot.
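The compounding arithmetic behind that last point can be checked directly. A quick sketch, using the 3% rate quoted in the interview; the time horizons are illustrative.

```python
# The arithmetic behind "3% exponential growth compounding over many
# years is a lot." The 3% rate is the one quoted in the interview;
# the horizons chosen below are illustrative.

def compound(rate: float, years: int) -> float:
    """Total growth factor after `years` of steady annual growth."""
    return (1.0 + rate) ** years

for years in (10, 25, 50, 100):
    print(years, round(compound(0.03, years), 2))

# By the "rule of 72" approximation, 3% growth doubles output roughly
# every 72 / 3 = 24 years, so a century of it multiplies output ~19x.
```

That is the contrast Silver is drawing: steady exponential growth is already enormous, so any claimed "game changer" has to beat that baseline.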
So you're always going to have new technologies developing. What I'm suspicious of is when people say this one technology is a game changer relative to the whole history of civilization up until now. And again, with a lot of technologies, if you look at economic models where you have different factors of productivity, it's not usually an additive relationship, it's a multiplicative relationship. So if you have a lot of data but people who aren't very good at analyzing it, or you have a lot of data but it's unstructured and unscrutinized, you're not going to get particularly good results, by and large.

So I also want to talk a little bit about the cultural issue of adopting analytics and becoming a data-driven organization. You talk a lot about how what you really do is try to predict the probabilities of something happening; you're not necessarily predicting what's going to happen. And you talked in your keynote today about acknowledging where you're not 100% sure, acknowledging that this is our best estimate based on the data. But of course in business, a lot of importance is put on projecting that you know what you're talking about: you be confident, you go in, this is going to happen. And sometimes that can actually move markets and move decision-making. How do you balance that in a business environment, where you want to be realistic but you also want to put forth a confident persona?

Well, first of all, I think the answer is that you have to take a long time to build the narrative correctly and get back to first principles. And so at 538, it's kind of a case where you have a dialogue with the readers of the site every day, right?
But it's not something you can solve in one conversation. If you come in to a boss, you know, one you haven't talked to before, and you have to present some PowerPoint, and you're like, actually, this initiative has a 57% chance of succeeding and the baseline is 50%, and it's really good because the upside is high, right? That's going to be tricky if you don't have a good and open dialogue. Another barrier to success, by the way, is that none of this big data stuff is going to be a solution for companies that have poor corporate cultures, where you have problems communicating ideas, where you don't have everyone on the same page. You need buy-in from all throughout the organization, which means you need senior-level people who understand the value of analytics, and you also need analysts and junior-level people who understand what business problems the company is trying to solve and what the organizational goals are. So how do you communicate it? It's tricky, you know. Maybe if you can't communicate it, then you find another firm, or go trade stocks and short that company, if you're not running afoul of insider-trading rules of various kinds. You know, the one thing that seems to work better is depicting things visually. People intuitively grasp uncertainty if you portray it to them in a graphic environment, especially with interactive graphics, more than they might from numbers on a page. One thing we're thinking about doing with the new 538 at ESPN, where we're hiring a lot of designers and developers, is, in cases where there is uncertainty, letting you press a button, kind of like a slot machine, and simulate the outcome many times. Then it'll make sense to people, right? They do that already for the NCAA tournament and the NFL playoffs, and that can help.

So, Nate, my partner John Furrier, who's normally the co-host of this show, just tweeted me asking about crowd spotting.
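The "slot machine" idea described above can be sketched in a few lines: instead of just stating a probability, simulate the outcome many times so the uncertainty becomes tangible. The 57% figure comes from the PowerPoint example in the text; everything else here is illustrative.

```python
# Sketch of the interactive "slot machine" idea: rather than printing
# "57% chance of success," let the reader simulate the outcome many
# times. The 57% comes from the example above; the rest is illustrative.
import random

def pull(p_success: float, rng: random.Random) -> bool:
    """One press of the button: did the initiative succeed this time?"""
    return rng.random() < p_success

def simulate_many(p_success: float, n: int, seed: int = 42) -> float:
    """Fraction of n simulated pulls that come up as successes."""
    rng = random.Random(seed)
    return sum(pull(p_success, rng) for _ in range(n)) / n

# Over many pulls the observed frequency converges on the stated chance:
print(simulate_many(0.57, 100_000))
```

A single pull gives the visceral experience (sometimes the 57% initiative fails); running many pulls shows why the number is a frequency, not a promise.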
He's got this notion that there's all this exhaust out there, the social exhaust, the social data. Do you see the potential to use that exhaust thrown off by the connected consumer to actually make predictions?

I'm, I guess, probably mildly pessimistic about this, for the reason that a lot of this data is very new, and so we don't really have a way to calibrate a model based on it. So you can look and say, well, let's say on Twitter during the Republican primaries in 2016 that, oh, Paul Ryan is getting five times as much favorable Twitter sentiment as Rick Santorum, or whatever, among Republicans. But what does that mean? To put something in a model, you have to have enough history, generally, that you can translate X into Y by means of some function or some formula, and a lot of this data is so new that you don't have enough history to do that. The other thing, too, is that the demographics of who is using social media are changing a lot. Right now you come to a conference like this and everyone has all their different accounts, but we're not quite there yet in terms of the broader population. You have a lot of thought leaders, a lot of young, smart, urban tech geeks, and they're not necessarily as representative of the population as a whole. That will change over time, and the data will become more valuable. But if you're calibrating expectations based on the way Twitter or Facebook were used in 2013, expecting that to be reliable, when you want a high degree of precision, three years from now, or even six months from now, is I think a little optimistic.

Yeah, so on sentiment, we would agree with that. Sentiment is this concept of how many people are talking about something, thumbs up, thumbs down. But to the extent that you can get metadata and make it more stable, longer term you would see potential there.
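The calibration problem Silver describes, translating a signal X into an outcome Y via a function fit on history, can be sketched with a toy nearest-bucket estimator. The history, the sentiment ratios, and the helper name are all invented purely for illustration; the point is only that with little or no comparable history, there is no basis for a forecast.

```python
# Sketch of the calibration problem: to translate a signal X (say, a
# Twitter sentiment ratio) into an outcome probability Y, you need a
# history of (signal, outcome) pairs. All data here are invented.

def bucket_calibrate(history, signal, width=1.0):
    """Estimate P(outcome) as the historical success rate among past
    cases whose signal fell within `width` of the new signal."""
    nearby = [won for s, won in history if abs(s - signal) <= width]
    if not nearby:
        return None  # no comparable history: no basis for a forecast
    return sum(nearby) / len(nearby)

# Hypothetical past races: (favorable-sentiment ratio, won the race)
history = [(5.0, 1), (4.5, 1), (1.2, 0), (0.8, 0), (4.8, 0), (1.1, 1)]

print(bucket_calibrate(history, 5.0))   # estimate from 3 nearby cases
print(bucket_calibrate(history, 20.0))  # None: signal is out of sample
```

With six observations the estimates are already shaky, and a signal outside the observed range returns nothing at all; that is the "not enough history" problem in miniature.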
But I mean, there are environments where the terrain is shifting so fast that by the time you make the forecast you'd be interested in, things have already changed enough that it's hard to make a good forecast, right? And I think one of the fundamental themes here, one of my critiques of some of the more optimistic interpretations of big data, is that fundamentally most people want a shortcut, right? Most people are fairly lazy.

Yeah, I'd love to save labor. Give me the answer. What's the hot stock?

Right? And so I'm worried whenever people talk about lazy interpretations of the data or of information. Whenever people say, oh, this is going to solve my problems, I don't have to work very hard, it's not usually true. Even in sports, even with steroids, performance-enhancing drugs: the guys who really get the benefit from steroids have to work their butts off, right? And then you have a synergy, which helps. There are very few free tickets in life; they ought to be gobbled up in competitive environments. So bigger data sets and faster data sets are going to be very powerful for people who have the right expertise and the right partners, but it's not going to let anyone quit their job and go sip mai tais on the beach.

So, Nate, what are you working on these days as it relates to data? What's exciting you?

With the move to ESPN, I'm thinking more about working with them on sports-type projects, which, having mostly covered politics for the past four or five years, I have a lot of pent-up ideas about. So, looking at things in basketball, for example: you have a team of five players, and solving the problem of who takes the shot. When is a guy taking a good shot? When is the shot clock running out?
When is a guy stealing a better opportunity from one of his teammates? Those are questions we want to look at. You know, we have the World Cup next summer, so soccer is an interest of mine. We worked with ESPN in 2010 on something called the Soccer Power Index, so we're continuing to improve that and roll it out. Obviously baseball is very analytics-rich as well, but my near-term focus might be on some of these sports projects.

Yeah, so I have to ask you a follow-up on the soccer question. Is that at an individual level, a team level, or both?

So, one problem you have with the national teams, the Italian national team or the Brazilian or U.S. team, is that they shift their personnel a lot. They'll use certain guys for unimportant friendly matches, for training matches, who won't actually be playing in Brazil next year. So the Soccer Power Index system we developed for ESPN actually looks at the rosters and tries to make inferences about who's the A team, so to speak, and how much quality improvement you have with them versus the guys who are only playing in the marginal, unimportant games.

Okay, so you're able to mix and match teams and sort of predict on the fly.

Yeah, and we also use data from club league play to make inferences about how the national teams will come together. But soccer is a case where we're going into a period where we have a lot more data than we used to. Basically, you used to have just goals and bookings, meaning yellow cards and red cards, and now a lot more data is being collected on how guys are moving throughout the field, how many passes there are, how much territory they're covering, tackles, and everything else. So that's becoming a lot smarter.

All right, Nate, I know you've got to go. I really appreciate the time. Thanks for coming on theCUBE. It was a pleasure to meet you.

Great, thank you guys.

All right, keep it right there, everybody. We'll be back with our next guest. This is Dave Vellante with Jeff Kelly.
We're live at the Tableau User Conference. This is theCUBE.