 Hi everybody, we're back. This is Dave Vellante with Jeff Kelly of Wikibon.org. This is SiliconANGLE Wikibon's theCUBE where we go out to the events, we extract the signal from the noise, we bring you the best guests that are at these events. We grab their perspectives, they share with us their knowledge and their practical experiences. We're here at the Tableau Customer Conference. We've been doing wall-to-wall coverage for the last two days. Jeff and I have been talking to customers and practitioners and executives and pundits. Jonathan Schwabish is here. He's a principal analyst for the long-term modeling group within the Congressional Budget Office, the CBO. Jonathan, welcome to theCUBE. Thanks for having me. So, tell us about your role at the CBO. Why don't we start with the CBO? A lot of people might not really understand what the CBO does, why don't you start there? Sure, yeah, so the CBO is a relatively small agency in the federal government. We're about 230 people. We provide objective, non-partisan analysis to Congress and those issues can range from immigration to Social Security, budget issues, long-term budget issues, basically anything you can think of that the Congress would be interested in working on. So you heard Nate Silver's talk, we were talking off camera, and he was talking about the bias in the data. So how do you remain objective and non-partisan in this town with all that pressure? Yeah, so that's a hard job, and we have a lot of analysts who've been working a long time on these various issues, and what we try to do is construct the most objective and unbiased models that we can. We have a lot of outside people who review our work, a lot of internal people that go through line by line, code by code to look at all of the results that we're producing and all of the work that we're writing up to make sure that we're providing an unbiased and balanced approach to the issues that we have to talk about. 
So now what specifically do you work on in terms of long-term modeling? What are you modeling for the long-term? Yeah, so I have a sort of varied portfolio. So the long-term part, I help run a Social Security long-run microsimulation model. So what we do is we simulate a population going forward in time, how people claim their benefits, what sort of benefits they get, what's their behavior over their lifetime, how do they work, when do they get married, when do they have kids, and so forth. I also work on issues like food stamps and immigration and inequality, and over the last few years, I've been moving into the fields of data visualization and presentation techniques. Okay, so you're a pretty young guy, but I see a couple of gray hairs in there. So what's changed, you're an economist by trade, right? By trade, right. I should also mention that to our audience. What's changed in, say, the last 10 years in terms of how you have used data and your profession has used data? Well, I think one of the things you see in the economics profession is people are not quite at the point where they know how to communicate their results. So the economics profession is very much following a fairly standard equation of creating a static graphic, embedding it in a report, and surrounding it with a bunch of text. And all of the developments we've seen over the last three, four, five years in data visualization and data management and open source software languages, all that is enabling us to communicate in different ways. And what I'm seeing in the sort of traditional academic economics profession is that people aren't really using those various tools. So what we're doing at CBO is trying to use a lot of those tools, we're using data visualization to create more effective, more compelling graphics and infographics and other sorts of tools that will help us communicate our work to the Congress in new and different ways. 
So Nate Silver tries to, in his book, explain why some things don't always work out as we expect, and your profession gets a bad rap sometimes. And I have to say, I remember 2008, it was probably the spring, there was an economist, we were at the MIT CFO forum and he was definitive about the relationship between housing and the economy and that housing was not going to tank. And it wasn't a probability, it was definitive and I lost all faith in economists, but I've since gotten over that. So, I mean, as somebody in that field, you must experience that, you've got a spectrum of sort of quality of forecasts and it's hard, forecasts are hard, right? Projections are very hard to do, yeah. So, first of all, what makes them so hard? Does data help with that problem and how does visualization fit in? Yeah, so it's a very hard problem, as you mentioned, and most of my work is in the long term, so you can imagine doing a projection 10 years into the future, that's difficult enough, and then doing a projection 75 years into the future, which is even more difficult, but it's important to do for programs like Social Security and Medicare, where there's this intergenerational transfer of wealth that's going on, and so modeling that behavior is an important aspect of that. What we've been doing, and I think where data visualization has been helping, both on the analytic side and on the presentation side, is using a variety of different tools out there that are helping us to sift through the data and understand the correlations and relationships between different variables. So, tools like Stata and SAS, Tableau, add-ons to Excel, R, some of the JavaScript programs that are out there are helping us understand our data in different ways, and basically our job is to try to bring as much data as possible to bear to help us understand the correlations as they exist now and to do a better job thinking about how they might change in the future. 
So, you just mentioned something about understanding correlations. We've had this kind of ongoing conversation on theCUBE the last couple of days about the correlation versus causation debate, and the big data crowd, for lack of a better term, will say, well, causation doesn't really matter anymore. We've got enough data, we can correlate things. It doesn't necessarily matter why these things are happening; if we can determine that two things are interrelated, we don't necessarily have to know why, we just need to know that they are and that they impact one another. Where do you fall in that equation? Is causation just as important as correlation, or in this big data world, from an economist's perspective, is it more about correlation, and you leave the causation question to more theoretical people and focus on the correlation? What's your view? Well, so I always find the big data discussion rather funny because I've been doing this for 10 or 12 years now, and for economists, big data has always meant the census, the Current Population Survey, big surveys from the BLS, big counts from the Census Bureau and the Department of Commerce and those sorts of things. So, big data is, I think people sort of think about it now as a different sort of animal than economists have generally used. But I think there's certainly a difference in how I think about correlations versus causation, especially as it comes to things that we're working on. So, I mentioned I do some work on food stamps. When we think about correlations between, say, food stamps and the economy, the unemployment rate, how many people are working, that's sort of a correlation. But then you look at how policy might affect certain policies and that's sort of a causation. We have to tease out how those things differ and how they might change over time. 
So, there's a real tension there and really there's a lot of modeling that goes into it and a lot of data that's brought to bear to sort of try to tease out the answers to those questions. And we were talking a little bit beforehand about how, as you have said, economists really don't do a great job of communicating their insights or their arguments. It sounds like you really understand the data but you don't necessarily do a great job, as a profession, of communicating that. Is it to one another, to the outside world? And why is it? I think you also mentioned, it's really not taught in PhD programs how to visualize data and make your argument in a way that it's communicated openly and clearly. Why do you think that is and how are tools like data visualization helping you improve on that? Yeah, I think at least when I went through school there was no training on writing or presentation or data visualization, three things that are important to communicate your work either to colleagues or to the public or to other folks. And that may be changing now, but I think what economists tend to do is they do their analytics and they may use data visualization tools in those analytics. So they may use Stata or SAS or SPSS or MATLAB or even Tableau to create graphs and visualizations to help them understand their work and their data. And then they simply take that graph and they put it in the report without thinking about how to communicate the results more effectively to their audience. So I may be working in a data set and I understand the model, I understand the data, I understand my correlations and I create a scatterplot to understand that. And then I put that into my report, but the color and the font and the annotations that I should use in presentation mode are much different than the ones that I use in my analytics mode. So it's an evolution. I think you're seeing more and more economists out there blogging more actively. 
Folks like Justin Wolfers and Donald Marron, these folks are sort of more out there talking more, blogging more about the issues of the day and the work that they're doing and they're using data visualization to help communicate their work to the public in ways that really haven't been done in the past. And as economists as a profession get better at that, what do you think the ultimate impact will be on the economy, on the way maybe policy makers make decisions? It seems to me it would certainly increase the impact that economists have if they can actually communicate better, specifically to policy makers. You know, I was at a session earlier today with someone from the Census Bureau talking about how they're trying to change their data visualization efforts. And one thing he was talking about was how socializing analysts within the bureau is helping them improve their visualizations. And I think that holds for the economics profession, policy folks, budget folks in general. As people socialize and talk more about what they want to visualize, how to do it, the tools that will help them do that, as all of that increases, both the analytics of issues will improve and how people present their work will improve and that hopefully will lead to better work and better communication of that work to the public and to the press and to policy makers and to academics and all the other sorts of stakeholders out there who need to learn about these different issues. So Jonathan, what can you share with us in terms of what the data is telling you? We're not going to get into the politics of it all, but what is your modeling showing? What are you sharing with the public at this point in time and what's the data say? Well, so we released, in the spring, our 10-year budget outlook and the budget outlook is sort of mixed depending on the politics. 
We're doing our long-term budget outlook right now where we'll look about 25, 35 years into the future. So it really sort of depends on the modeling, but what we're trying to do when we release those reports is to show the results in different ways, to use visualizations and more effective ways to communicate to the Congress so that they can be better informed to construct the policies that they need to help the country. Okay, so when you release those reports, you say, look, there's no definitive answer. If this happens, this is our best estimate of what things will look like. If this happens, if the economy grows more, it's going to look like this. And so you make certain assumptions, right? So what's the high-level message to Congress? I mean, what are you essentially telling them about the outlook? Well, so our challenge is really to tell them about the economy in ways that they can communicate to their constituents and to their colleagues and to their staffs. So what we've been trying to do is to take the message that we've been talking about and communicate it in different ways. So we've been moving from our standard sort of PDF reports that I'm sure everybody has seen from a hundred different government agencies and we're creating static infographics. We're creating smaller cards that we've been handing out, what we're calling CBO snapshots. We're getting more into social media. CBO just started its Twitter feed yesterday. So we're moving into that area and trying to communicate our message in different ways so that the Congress can affect policy in the ways that they sort of see fit. So what kinds of things will we see when you guys start tweeting? I mean, what kinds of messages are you trying to get out there? So basically the Twitter feed, at least at the moment, will be a way to communicate, to talk about our reports that are coming out. 
We have two Twitter feeds, one that will be on general reports that come out and another one that'll be on cost estimates. So we have sort of reports that talk about bigger issues and we also have cost estimates: when members of Congress ask us to score a particular policy, we'll write a document up. And so at the moment, the Twitter feeds will just be announcing those reports and we're hoping that'll help us reach a broader audience and more folks. And so your focus is on, you said, Social Security, right, I understood that correctly? So basically when we see the pie charts forecast out, your slice of the pie is Social Security. So you're modeling all that out, trying to get that as accurate as possible. Right, except that I don't do pie charts. Yeah. Dave, we are at a Tableau conference, pie charts. Okay, well, I've seen pie charts before, but okay. Maybe those are from the senators and the congressmen that are trying to simplify things too much. So okay, so we all know Medicare in particular, Social Security, there's a big consumption of those dollars going out, going forward. So you guys model that out. What is the data telling us? And what does it say based on your best estimates? Well, in our current estimates, the trust fund for Social Security will go to zero sometime in the 2030s or 2040s. I don't remember the exact number off the top of my head. There are two trust funds for Social Security. There's one for the old age side, which is your retirement benefits. And then there's disability insurance for those who are disabled and can't work. And for disability, that trust fund will reach zero in 2016, I believe was our latest estimate. So those programs are looking to hit some financial walls and it will be up to members of Congress to determine the best approach to address those fiscal issues going forward. And you guys don't recommend those, obviously. You just say, this is what the data says. 
The data doesn't lie, right? This is what the data says. This is what the model says. This is what we believe is our best estimate. And of course, one of the advantages of our model going forward into the long term is that, because it's a microsimulation model and we follow cohorts of people over time, we're able to build confidence intervals around specific estimates. So we can say with some probability, getting back to Nate Silver here earlier, with some probability the trust fund will be at $0 in this particular year. So that's the sort of thing that our tools and our modeling can bring to bear to the debate. So what is your opinion on the potential of new sources of data, relatively new sources of data, to impact the kind of modeling and forecasting that economists do? I'm talking about, we were talking about Twitter a moment ago. Social media data, you know, famously, using Google data, you can predict, you know, flu outbreaks before the CDC can. Is there opportunity there for economists to start to leverage more social data in their forecasts or are they already doing it? What's your view of that? Yeah, I think there's a space for that. I'm not sure yet how the social data or big data or whatever you want to call it will impact projections or forecasting models. But certainly those types of data can help folks in behavioral economics who are really interested in how people's behavior changes in response to some stimulus. I think those sorts of folks, they are looking at just a wealth of data that they can use to improve their modeling. And that sort of helps our understanding of human behavior in an economics context. And if we can better understand how people respond to different stimuli in an economics context, then we can use that research to help our projections and forecasts, be it in government or in individual corporations or in the business sector, the financial sector. 
Any sector that relies on human behavior, those sorts of research advancements will help folks do a better job looking ahead. And you know, help me understand, to the extent that you can characterize it, what is the nature of economists in terms of adopting new tools to do their job, in terms of adopting new data sets like we were talking about, and actually starting to maybe use new techniques like advanced data visualization and other things? Is this going to be a challenge for your profession or are economists the type that are always looking for new tools and new ways to do their job better? Or is it a mixture? I think we're a pretty staid, formal group set in our old traditional ways. As I mentioned, the talk earlier about the Census Bureau and their challenges, I think they are working on changing the culture at the Bureau to promote data visualization as a means to communicate their work. You know, I think over time, you're going to see economists use more and more of these tools, but it's going to be a slow ride. You know, if you go to an economics conference, if you have the unfortunate experience of going to an economics conference, the presentations are nothing like the ones you see at a conference like Tableau or South by Southwest or VisWeek or any of those sorts of conferences where people understand how to present their work to an audience who may not be familiar with their work. At an economics conference, it's bullet points and text on every slide. That sort of communication skill has not really found its way to the economics profession. And I think once the data visualization side starts, you'll see presentations improve. So there's a lot of work to be done, but economists generally have great programming skills. And so I think once people see these tools, they'll be ready to adopt them in that particular field, and it'll be easy. 
So Jonathan, the last question, advice for young people interested in economics who want to get into the profession, what would you advise them? So I think if you're interested in this field, you need a very strong math and statistics background, but I think the other thing to be aware of is presentation techniques, how to give a good talk, how to write well, how to communicate, and programming languages. And I think the programming languages that economists have used to this point, Stata, SAS, Fortran, C++, those sorts of programs, I think adding to that toolbox things like JavaScript, HTML, programs like Tableau, TileMill, all of these different things that are helping us boil down our data, understand it better, those sorts of tools and programming languages will help you be a better modeler and eventually be a better economist. Excellent. Jonathan Schwabish, thanks very much for coming on theCUBE. Pleasure to meet you. All right, keep it right there, everybody. We'll be right back with our next guest. Elissa Fink is coming up. She's the CMO of Tableau, one of the people largely responsible for this event. We're going to talk to her about what's happening here, what to expect next year. Keep it right there, we'll be right back. This is Dave Vellante with Jeff Kelly, and this is theCUBE.