Okay, we're back live at Strata. We are in the afternoon program here at theCUBE, SiliconANGLE.tv's flagship telecast, where we go out to the top events in tech and explore and get the signal from the noise, extract that and share that with you. We're all about knowledge and pushing it out to the social streams. Thank you for watching. I'm here with Jeff Kelly, who's subbing in for Dave Vellante. He went to take a bio break because, you know what, we go all day long. I take a break, Dave takes a break, but you know what? We go as much as we can to get as much content as possible. And again, extract the signal from the noise and do independent quality analysis. And we're joined here by Virginia Carlson, who's from Chicago, the Metropolitan Chicago Information Center. You guys are a nonprofit and you do a lot of work with data. So first, welcome to theCUBE.
Thank you, thank you.
And let's talk about data. So tell us your impressions of Strata and what's going on in the data world from your angle.
Well, this is my second time at Strata. I was here last year and, as I think a lot of people felt last year, it's so good to be together with a tribe where you can sit down and almost immediately get into a conversation about sort of universe versus sampling and everyone at the table understands what you're talking about. So it's fabulous to be here and to be meeting so many people from so many perspectives.
And the tribe is growing too, right?
And the tribe is growing, and in the mainstream we're seeing articles in the New York Times, Wall Street Journal talking about big data.
We publish on Forbes at SiliconANGLE. I'm a contributor on Forbes.com. They have a big data-driven section. O'Reilly also contributes there. Great group of people, but now the tribe is growing. Data is now a force. It's a business model. What's happening? I mean, what's old and what's new?
Because there's all this talk about data warehousing, business intelligence, same story, new wine, different bottle, whatever metaphor you want to use. But we're seeing new trends like predictive analytics and real time, or whatever that means on whatever parsed definition. So what's your angle on this? Where are we?
I'll talk a little bit about where we sit. MCIC, the Metropolitan Chicago Information Center, sits as a sort of funnel between big data and historical big data and the common good, public good organizations on the ground that would have needed those data. They still need those data. For example, anything from a local American Indian healthcare center that needs to understand where to open a new clinic to a larger philanthropic organization like the MacArthur Foundation that wants to know whether or not its local community efforts are making a difference. So we try to do what we call the data intermediary piece: curate the data, analyze it, visualize it, give them the findings.
You know, the data, the tribe, I love the tribe analogy because that was obviously Mark Pincus's failed startup with my friend Paul Martino, and now he's doing Zynga, a big data company. But the tribe and this data world that we're living in, Jeff Hammerbacher from Cloudera, when he was on theCUBE, talked about the Hadoop world, because he was a baseball star. He said data scientists are like gym rats. They're out there. They do it because they love it. So, you know, in the early days in this open data market, you've got to go find data sources. So tell us from your perspective, because you have to go out and scour sources, find sources, because that's the drug. You need sources of data. So tell us, what's it like out there? How did you find sources? Are they rolling in now? Are there intermediaries? Are you brokering the data? Are you a data broker? How does someone get sources? And what scrappiness do you need? What street smarts do you need?
The historical perspective, you know: we were founded 22 years ago to do 3,000-household survey projects every year because there wasn't enough data to do local planning and policy development. That went away in 2002 as more administrative and operational data became available from governments and as there were worse and worse response rates from surveys, basically. So, turning to open data and sources, at that point, 2002, you had to FOIA most of it. Now there's the big open government movement, and now, you know, sort of folks selling their souls, if you will, selling their private data to Facebook, Twitter, LinkedIn, and all the rest of that. So, from our perspective, big data is, I want to say a double-edged sword, but it may be a single-edged sword, in that as more and more data are collected by private sector companies, there is less available data for social service organizations that need public data to do public planning.
So, are you saying there's data hoarding going on? Are people hoarding the data? Do they want to just control it? There should be a show called Data Hoarders. That's actually a good cable show. We should run that. Data hoarders, data hoarders. Facebook, you're hoarding data.
You know, I mean, that's their business model. They need to monetize. What they're doing is selling you back your own data in the form of services and advertisements, and that's how they're making money. But what it's doing is it's sort of choking.
Choking, there's a good word.
Other sources of data that people might use that might be more public, less confidential, and that drives things up for us. And the Open Gov movement in particular is all about getting operational and administrative data from the federal and local governments. And what I mean by that is how do we fix potholes and who are our lobbyists and all the rest of that?
And what's being ignored is the federal statistical system, the Bureau of Labor Statistics and the Census Bureau, who have historically collected the data that local planning organizations need. Now, as...
Are they making that available, or is it just locked in?
It's just, it's the budgets being cut. As the federal government is moving towards open data initiatives, and folks that are here are really working hard to get transparency from government, the federal government in particular is sort of saying, okay, that's where we're going. We're not going to be doing our statistical collections.
So what's the trend line then? Is it positive, negative? I mean, is it going in the right direction? I'm trying to understand how the government is working here, because, again, self-confessed, I'm not a big guru on what's going on in Gov 2.0, whatever they call it these days. But to me, data should be available. It should be readily available. Are we seeing positive trends or negative trends relative to the government? Sounds negative to me.
Well, there's different kinds of government data. And I think that difference has not been recognized by the Open Gov movement itself. So it's positive on the one hand and negative on the other. It's positive for Open Gov people, certainly, you know. From a visibility standpoint, rhetoric standpoint, USAspending, and all the rest of that.
Golf clap, as they say, but you need the data, right? I mean, if...
You need the data. There's calls for the data. We certainly have lots more at the municipal level as well as the federal level than we did 10 years ago. But what the Gov 2.0 folks don't understand is that there's been historical, statistical, what the Census Director calls design data, collected for hundreds of years, that's been used for local planning.
And the more that the Open Gov community pushes for transparency in administrative and operational data, the less the government is doing on the statistical side. And that's sort of, it's a geeky difference, but it's really important to the folks that we work with, because if we don't have census data that tell us who is poor at the census tract level, having access to who's lobbying the federal government is not doing us any good.
So how does someone get involved? The average person who cares about this, because there are a lot of people who do care. There's a whole new generation of Americans, quite frankly, and non-Americans around the world, the global economy, who don't want to have a data, with big data we can actually instrument society. So that's a good thing. So we talked about that on theCUBE last night. So how does someone get involved? A listener out there, a reader, a watcher, a viewer of ours, how do they get involved?
Well, I think, you know, Drew, I'm sorry, Jake Porway and I are having a session tomorrow about this, and the conversation we'd like to start, and I would like to start, is: okay, if so much data is being made private, confidential, folks are giving their data to Facebook, LinkedIn, Twitter, all the sort of big three, how can those data be used for social good? Target, big story in the New York Times. What if we could get that same Target profile and help a federally qualified healthcare center target that woman and say, okay, here's where you need to go to get your prenatal care? Or, for example, what if we could get Google search results on folks in different census tracts and what they're looking for as their vision is fading, and use that to help the centers for, you know, the Guild for the Blind figure out what sorts of vision impairments are going on?
So my column, the conversation that needs to happen, is: with privacy and confidentiality, how can we get the private sector data into the hands of local people trying to work on common good problems?
Well, let's explore that a little bit. What responsibility do you think companies like Facebook have, and how do we get to that point? How do we get companies like that to share that data, to make it available? What role do they have to play? What would your message be to them if they're watching right now?
Well, my dream would be that they would see this as a philanthropic opportunity. There are other ways to do that. A number of them have sort of a philanthropic arm that will lend out their data scientists for problems, but I'd like to suggest to them that their data is just as valuable as the skills that data scientists have. And we should begin a conversation around how and under what conditions privacy and confidentiality can be preserved at the same time that they start thinking about sort of letting the data free. I mean, if data wants to be free, as they say, let's use it for public good.
I don't, yeah. Virginia, thanks for coming inside theCUBE. We personally care about this societal benefit. Dave and I were talking last night about how society can benefit from big data. The stuff that you're doing in your work is phenomenal. It's exactly the kind of use case that the, I'll call them commercial vendors, don't necessarily talk about; they're not in that business of actually helping human beings. But in the healthcare example, and/or doing planning around making society a better place, big data can completely streamline and create so much more operational efficiency around stuff that's already existing, that data. So I personally believe in what you're doing. Thank you for sharing with us. Keep in touch with us. Let us know how we can get ahold of you, because we want to promote your work and conversations.
Good and bad.
Well, we'd love to hear them all.
Gov 2.0, to me, I think that's more of a rah-rah political thing, but I want to see use cases like you're talking about, where data is really applied to help people, because the government's spending money to help people.
Right, right.
That's like the job, right? So, you know, if you can monitor that and instrument it in real time, that's a good thing. So I'm for it. I don't know if that's a libertarian view or whatever. I don't know. I just like it. Thank you very much.
Thank you.
We'll be right back, and we would not be able to bring this great content if it wasn't for our ad-supported partners, Cloudera, MapR, Digital Reasoning, 1010data. Thank you very much to those vendors. Folks, watch the ads, because those are the guys who make it possible for us to bring this great content to you. Virginia's doing some great work. Thank you very much and we'll be right back.