Latanya Sweeney, she runs the Data Privacy Lab at Harvard. Ethan Zuckerman runs the Center for Civic Media at the MIT Media Lab. I don't think they need more introduction than that. And they're going to do a duet. So come on down, Ethan and Latanya. Hey. So just before anything else, man, I love Susan Crawford. Can we just get another round of applause? Susan is one of the most optimistic people I know. But what I love about Susan's optimism is that it has a vision of possible futures we could choose with technology. And I think for Latanya and me, we're always looking at these critical questions of, what decisions are we making about these systems? What are we choosing to embrace? What are we choosing to reject? That's very much what we wanted to think about. But it's just amazing to have someone put up such a... Ethan's trying to nicely say, we're the downers. No. And so in some ways, I wanted to start with that with you, because Latanya, you scare me. Not as a person, but as a researcher. A lot of the things that I've been most scared about over the last 10, 20 years of the internet have come out of your research. I learned about your work for the first time when you found my governor's anonymized medical files and basically said, yeah, medical anonymity? Forget about that. Doesn't really happen. So given the chance to share a stage with you, I wanted to ask, what were you scared about 10 years ago? What are you scared about now? So the governor's medical record was 20 years ago. OK, all right, all right. 20 years ago, maybe, 1997. But 10 years ago, in 2007, the thing that I was working on then was a simple idea: that you could type SSN vitae into a Google search bar and you would see people's social security numbers. Let me give you an example. And that's because people would put their resumes online, and there was a habit of putting their date of birth and their social security number as part of that resume. So I built a little... So it's not that I'm a downer for the sake of just depressing people or making them feel creepy. This is real stuff. Well, it's also because I'm a computer scientist by training. And the goal is, how do you get to Susan's real world if you don't look at what the unforeseen consequences are and address them? And so this clearly was an unforeseen consequence. And with the rising amount of identity theft and credit card fraud, I wrote an AI program that would go around finding these resumes online, figuring out if it could get an email address for the same person, sending them an email, and encouraging them to take the information down. What happens when you send someone that email? Well, when it first started, people accused us of stealing their identity. Sure. And they would threaten to sue me: why was I stealing their social security number? Things like that. But after the program had run for about a year, and then about two years, media from around the country started picking up on it. By 2007, you really couldn't find them online at all. So this doesn't happen anymore. This is all set. You fixed it. You wrote a program. This never happens anymore. And we've solved this particular bug with identity theft. In 2007. But in 2017, as my students pointed out to me in the spring, they're back. So that one didn't work so well. So is that what we're worried about now in 2017? Are we worried that we didn't learn these lessons 10 years ago? We're still releasing data. We're putting it all over the place. Personally identifiable information keeps showing up.
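To make the Identity Angel idea concrete, here is a minimal sketch of the kind of scan such a program might perform, assuming documents have already been crawled; the patterns and the function name are illustrative assumptions, not Sweeney's actual code.

```python
import re

# Flag documents (e.g. online resumes) that expose an SSN next to a birth date.
# Both patterns are simplified illustrations of the kind of matching involved.
SSN_RE = re.compile(r"\b(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")
DOB_RE = re.compile(r"\b(?:0?[1-9]|1[0-2])/(?:0?[1-9]|[12]\d|3[01])/(?:19|20)\d{2}\b")

def flags_exposed_pii(doc_text: str) -> bool:
    """True if the document appears to expose both an SSN and a date of birth,
    the combination that made posted resumes so useful to identity thieves."""
    return bool(SSN_RE.search(doc_text)) and bool(DOB_RE.search(doc_text))

# A resume like this would be flagged for a takedown email.
print(flags_exposed_pii("Jane Doe, born 04/12/1980, SSN 123-45-6789"))  # True
```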
Is that our contemporary worry? Well, it certainly is an ongoing problem. Also in 2007, Facebook was moving beyond campuses and coming into the real world. And one of the things that it did is it required date of birth and hometown. And we had learned that you could actually predict digits of a person's social security number from those two fields. And all of our attempts to get Facebook to stop making that public had failed. And so in 2007, one of the things we were really interested in is the fact that for anyone born from 1987 to 2011, social security numbers were given out at birth, and they were issued sequentially within your state. And so, in fact, you can predict a person's social security number from their date of birth and hometown. So you would think that that would be the thing that would really... Seem like a vulnerability. Yeah, it would seem like a vulnerability. So we did get the Social Security Administration to stop doing that issuance, so newborns don't have it. But anyone here who was born in that time period, yours are predictable, and we can predict it if you like.
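A hedged sketch of why that prediction works under the pre-2011 issuance scheme just described: before randomization, an SSN had the form AAA-GG-SSSS, with the area number tied to the state and the group and serial numbers advancing roughly sequentially over time. The area-number table below is a hypothetical stand-in, not the real Social Security Administration allocation list.

```python
# Hypothetical stand-in for the SSA's published area-number allocations.
AREA_NUMBERS_BY_STATE = {
    "NH": [1, 2, 3],           # illustrative values only
    "MA": list(range(10, 35)),
}

def candidate_prefixes(state: str) -> list[str]:
    """Enumerate the 3-digit area numbers consistent with a hometown's state.
    A real prediction would then rank group numbers by the known issuance
    schedule around the target's date of birth, since Enumeration at Birth
    (common after 1987) ties issuance date to birth date."""
    return [f"{a:03d}" for a in AREA_NUMBERS_BY_STATE.get(state, [])]

print(candidate_prefixes("NH"))  # ['001', '002', '003']
```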
But the Knight Foundation gave us a grant to figure out, at another, deeper level, what are the flows of personal information? Where is the information, really, and where might it all come from? And so it was 2016, an election year. We were interested in the kind of information that you find on election websites. Thirty-six states had websites where, if you were a voter, you could go and change your information. But how do those websites know you're you? They know you're you because you're the one who provides the name, address, and date of birth, or name, date of birth, and zip code, or your social security number, or something like that. And so we were interested in where all the places might be that we could get that information. And we found no shortage of them, for as little as $500 in some places, making it very easy to impersonate voters online. Not to mention the fact that Facebook reminds me every day who's had a birthday, usually how old they are, giving me the birth date and the year. And if I know something about the hometown, I'm in a pretty good place to guess it at that point. Yeah. And it wasn't very expensive to automate changing the voter information on those 36 websites. For around $10,000, you could change 1% of all the voter registrations nationwide. So you're saying that Donald Trump is right, and that, in fact, millions and millions of people may be voting illegally, using this scheme to go in and change the information, and that you're capable of doing this for just a few thousand dollars. We're saying that for $10,000, you can do the opposite. That is, instead of adding additional votes, this is a way to disenfranchise millions of people, preventing their votes from counting. But let's say no one would ever want to prevent anyone from voting. Well, it's kind of an interesting system, too, because if you change a person's address on their voter registration, they go to the polling place that they've been going to for years. Only now, they're not on the polling register. So they're told, you can't vote here. They start yelling and screaming, so they're given a provisional ballot, except provisional ballots don't count in most states. So it's kind of a pacifier that, at the same time, can really have a dramatic impact in terms of undercounting votes. So it's the opposite of the voter fraud conversation that we typically hear about, new people voting. It's more like a new kind of vote suppression. So simply by having this much personally identifiable information available through various data brokers, who tend to be lightly regulated, here's an attack that, for almost no money, makes it possible to functionally prevent large numbers of people from voting. And it can be geographically targeted, which suggests that you could also make it demographically or psychographically targeted. Exactly. That was our finding. Cool. I'm worried about that. Well, I think when I look over that arc from 2007, and the work with Identity Angel, that program, coming forward, what we see is this notion that these problems haven't really been addressed, and they're just getting worse. And instead of there being just financial harm, now you're talking about other kinds of institutions failing, and so forth. But we just tend to ignore it, and it just seems to keep on happening. It would be nice if there were some way to get control of some of this. So Maciej Cegłowski is someone who's been writing really passionately about what he thinks is wrong with the advertising-supported web. And one of the analogies that he started putting out is that every company says, we're going to collect your data, it's going to be an amazing asset, we're going to make a ton of money off of it. He suggests that you think about your data as toxic waste: here's something that you have as a necessary byproduct of what you're doing, but it's your job to dispose of it as safely and carefully as possible before it essentially melts down and destroys your business. Who's right? Are these businesses that are making enormous amounts of money trying to figure out how to broker data handling this the right way, or should we be looking at this as something that, frankly, is pretty terrifying for a lot of people to touch? Well, I'm definitely on the terrifying side. And the reason for that is really quite simple. The social contract made by companies like Google, and then on to Facebook, this idea of data for service, is a model that says your data is worth nothing until they figure out a way to monetize it, and you get this service in exchange. Except for one problem, and that is: you ever try to get data out of Google or Facebook? They don't give it to anyone. It was free when it came in the door, but good luck trying to get it out of there. Well, suddenly they made it valuable. Right. Now all of a sudden it's incredibly valuable. And this social contract is also changing with the internet of things. Now I don't get free service. I buy the device. The device still takes all the data. And so that social contract is changing even further, where I'm now paying, and they still get to keep my data, and I don't get a copy of it. And so this underscores the value of the data. The work that I just talked about also underscores the vulnerability that it leaves us all in. And right now, the keepers of the data, the ones who are making the most money on it, have an incentive to keep data free, which is why things like, oh, sorry, Ethan, can we go back one? Yeah. It doesn't go backwards. Yeah, well, anyway, one of the things that you saw was that you could get a database for $500 that had the social security numbers of all Americans. That's pretty amazing.
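As a rough back-of-envelope check on the two price tags in this exchange; the population figures here are outside assumptions for illustration, not numbers from the talk.

```python
# "$10,000 changes 1% of registrations": assuming roughly 150 million
# registered US voters, that is about 1.5 million records.
registered_voters = 150_000_000
records_changed = 0.01 * registered_voters
print(10_000 / records_changed)   # ~$0.0067 per altered registration

# "$500 for everyone's SSN": assuming roughly 320 million Americans.
print(500 / 320_000_000)          # ~$0.0000016 per social security number
```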
So that's right, exactly. But when a company takes it and monetizes it, it becomes an additional resource for businesses, and there are many data analytics companies where their products, well, let me say it differently: we are the product. Yeah. Our data really is the product. So is there good news at the end of this? Do you have the solution the way you had 10 years ago, a script that was able to go out and help people realize how dangerous this vitae behavior was? Are you going to let everyone who uses the internet right now know how dangerous it is to use any number of services that are grabbing PII? Where do we go with this? Yes. No, I don't have an answer. I don't have that answer. The program worked for a while, but when we stopped running it, the resumes came back and the vulnerabilities came back, and the number of new sources is much larger. It seems like there has to be a larger answer. And maybe part of it is finding ways for people to get control of their own data, offering platforms that give people more control. So that's a direction that we've been looking in. And would that be like decentralized social networks? Is that going in the direction of something like Diaspora? Is that a brokerage model, like folks like Doc Searls have been trying: I have my data, I can put it out, and you might get access to it if you give me certain privileges? How do we do that? Well, I don't have any answers. Check back with me, not in 10 years, but maybe in two. Okay, okay. We keep meaning to have good reasons to get together for dinner. Let's change the subject. What are you working on? So maybe I'm more hopeful than you are today. I was thinking about 10 years ago as well, and what I was working on 10 years ago was this question of who gets to speak online. I'd done a bunch of research with Hal Roberts, John Palfrey, Jillian York, and a couple of other folks over at the Berkman Center, now the Berkman Klein Center, around censorship. And I was really interested in this idea that the internet, this global open space, was getting shut down in Iran, in China. We did a whole bunch of work on, could we get around this censorship? Could we use virtual private networks? Could we use these different anti-censorship tools? To me, it really seemed like the problem 10 years ago was, are we all going to get this opportunity to speak? And the other project I was working on was the first project that the Knight Foundation supported with me, which was Global Voices. And so that's a network that's now 12 years out in the world. I'm going to be off to Colombo, Sri Lanka, to meet 400 volunteers in a couple of weeks. We should tell people a little bit about what it is. So Global Voices is basically a community of people who monitor citizen media all over the world and then share their perspectives on what conversations are taking place. So rather than me looking at Pakistan as an American Christian and going, Pakistan, isn't that where terror comes from? I can have Pakistani authors saying, actually, there's this really cool conversation in Lahore right now about this new art gallery opening, and here's how we're talking about women in education, and these are the conversations happening.
So I was hugely enthusiastic about this idea that these tools were going to keep getting better: once everybody had smartphones, once everybody had connectivity, everyone was going to become a content creator. We were all going to become publishers. Susan, I know I'm not allowed to say content creator, but we were all going to be putting information out in the world, and I have to say... But don't we all do that through Facebook? So we do, and here's the interesting thing. Once we get to the point where billions of us are producing information, sometimes it's a Facebook post, sometimes it's Instagram, sometimes even a like or a share is information of a sort. So we're all publishers right now, and there's one big problem, which is that no one has changed the supply of attention. And so instead of just having professional publishers, newspapers, people with business models, competing for my attention, my friends are competing for my attention, students are competing for my attention, someone who came up with the latest, greatest cute cat video is also competing for my attention. And at this point, I think the problem that I am most terrified about is really simple. It's filtering. How do we decide what information is worth paying attention to? Oh, but Facebook will do that for you. You've noticed that. You've noticed that. So look... So they solved that problem. So here's the really interesting thing about this. First of all, if you're a regular Facebook user and you haven't done this, try this experiment. There is a button on Facebook's news feed: you can switch between the most recent ordering and the default news feed ordering. And it's pretty incredible. I have less than 15% overlap between what Facebook shows me and what the people I follow are actually telling me. And the reason for this is that Facebook is very concerned about me. Facebook really wants to give me only information that I'm going to care about. And so they look for certain things. They look for friends who posted something that's getting a lot of comments or a lot of reactions. They make sure I've seen that; even if it's four or five days old, they really want to make sure that I get it. If I haven't liked someone or looked at their pictures for a while, they may just fall out of my feed.
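A toy sketch of the ranking behavior just described, where heavy engagement can outweigh a post's age and authors you stop interacting with fade out. The scoring formula and weights are invented for illustration; Facebook's actual model is proprietary.

```python
import math
import time

def rank_feed(posts, affinity, now=None):
    """Order posts so that high-engagement items resurface even when days
    old, while authors with low affinity (which decays as you stop liking
    their posts) drop toward the bottom. All weights are invented."""
    now = now if now is not None else time.time()
    def score(post):
        engagement = post["comments"] + post["reactions"]
        age_days = (now - post["timestamp"]) / 86400.0
        recency = 1.0 / (1.0 + age_days)       # old posts fade...
        boost = 1.0 + math.log1p(engagement)   # ...unless engagement compensates
        return affinity.get(post["author"], 0.05) * boost * recency
    return sorted(posts, key=score, reverse=True)
```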
Eli Pariser started calling this the filter bubble, making the case that Facebook takes this tendency we have to pay attention to people who are a lot like us, homophily, and strengthens it. But the truth is, there's a whole lot of other problems that come down to being filtering problems. Fake news is a filtering problem: if we decide we want Facebook to identify and pick our news out and give us only the true stuff, we've asked Facebook to do more filtering for us. Fake news also, by the way, only really ends up being a problem because Facebook filters for stuff that's highly viral, and so it makes it up the chain. A lot of what we deal with, with toxic abuse and harassment online, is a failure of platforms like Twitter, and also Facebook, to give us filters that we have access to and that we can control. Wait a second. Don't you want to know all the viral stuff? I don't get it. So the viral stuff that has a whole bunch of people telling me to kill myself because I work for George Soros and the Open Society Foundations, I can probably do without that most days. Although I have to say, the nice thing about being a man online is that my worst days online are generally what outspoken women call Monday. So there is a certain gender privilege there. But yeah, it'd be really nice if people had the ability to come in and say, I am being abused here; help me block people who are making it impossible for me to use this service. So, I mean, for me, what I'm really interested in... So what do we do? Go Facebook? Well, so here's the thing, right? Facebook, as well as all these other networks, could make it possible for us to filter. And we know that they can. And the reason we know that they can is that the advertising industry is the most amazing filtering mechanism anyone has ever seen, right? So I'm guessing a lot of people in the audience here read J.D. Vance, right? So we're all concerned now that we're not paying attention to Appalachia, all these unheard voices out there. I can go on Facebook and say, I want to target 20-to-35-year-old white men from Appalachia who voted for Trump, and I can start sending ads to them. What I can't do is say, I'd like to read these people's feeds and find out what they're talking about. I'd like the chance to become friends with them. I'd like to follow them. I'd like to listen to them. We can filter when we're willing to pay for it, but we don't have good tools for filtering what we hear and whom we hear from. So we built something recently. Yeah, yeah, can I pull up the video on this? Oh, let's find out. We launched this thing earlier today. It's called Gobo. We just very quickly went past the fact that this is a research project and that you have to sign an IRB form to play with it. But what Gobo does is it basically says... Was that a click-through agreement? No, well, it is; it's Facebook's click-through agreement. Okay. So to use this, you link your Facebook, you link your Twitter to it, and we pull in your feeds, and we suddenly give you this ability to start filtering them. And you'll see at the very top that a few posts are already filtered out. You can see, associated with each post, it says, what am I seeing as... Wait, I saw one that said filtered by gender. Well, that turns out to be because right now the gender slider is set very, very far towards having a lot of women's voices. So you can essentially say, I hear too much from men online; let's move that slider over. We actually added a mute all men button. I think that may actually be my favorite button on the internet right now. There's also a brands button. If you'd like, you can shut off the brands. But there are also sliders like virality and seriousness. And what we're doing here is we're using really bad machine learning. These are really crappy, off-the-shelf algorithms. We could work much harder and make them much better. But this isn't a product. This is a provocation. And the provocation is to basically say, why is it that we don't get to do this, Latanya? Why is it that we have entered into this contract with these companies that basically say, we have your best interests at heart, we're going to do things for you, and then they close the box? We don't get to see how Facebook is adjusting those levers. Why don't I have the opportunity, why don't I have the right, to come in and start playing with those levers myself, and decide whether one day maybe I want my feed to be really silly, or another day I want it to be really angry? Why can't I do that? Well, wait, is this a filter of the filter, or a filter of the unfiltered? So this is a great question. With Twitter, we're able to get a pretty unfiltered view of it.
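Here is a minimal sketch of the slider mechanic being described, assuming each post already carries scores from off-the-shelf classifiers, as Gobo's posts do; the score names and the threshold scheme are hypothetical, not Gobo's actual code.

```python
def filter_feed(posts, caps):
    """Keep posts whose classifier scores stay under the user's slider caps,
    and report why the rest were hidden, Gobo-style ("filtered by gender").
    `posts` are dicts with a "scores" map of values in [0, 1]; `caps` maps
    a score name to the maximum the user will tolerate."""
    kept, hidden = [], []
    for post in posts:
        reasons = [name for name, cap in caps.items()
                   if post["scores"].get(name, 0.0) > cap]
        if reasons:
            hidden.append((post, reasons))   # e.g. (post, ["rudeness"])
        else:
            kept.append(post)
    return kept, hidden

# Crank politeness up and mute brands entirely (hypothetical settings).
kept, hidden = filter_feed(
    [{"scores": {"rudeness": 0.9}}, {"scores": {"rudeness": 0.1}}],
    caps={"rudeness": 0.3, "brand": 0.0},
)
```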
And then we do what we would call subtractive filtering. So one of the best filters that I find for Twitter is rudeness. You can basically crank it up so that you only get people at a certain level of politeness. That turns out to be very helpful. We also filter in. One of the things that Gobo asks you is what news you normally read, and then, based on a set of publications, if you decide to expand your political point of view, we'll end up handing you articles from different sources and filtering them in. One of the real problems is that we can't filter Facebook at this point, because Facebook doesn't like to let you play with their tools unless you're within their environment. So from Facebook we get pages. We get the public pages that you like, and we can filter those. But we don't, at this point, get your friends' posts. And to do that, we may have to break a couple of rules. So we're thinking about that. Between you and me, I don't want anyone else to know that we're... I don't want MIT to get sued. No, no, no. The worst thing in the world would be having Facebook come after an academic institution asking the question of why we can't get access to our own social graph and the posts from our own friends. That would be a terrible case to have to argue. So, yeah. No, but let me turn this back to you, Latanya. I mean, how do we get from the things that scare us and piss us off to the solutions? This isn't a solution, but at least it's a provocation. It's a way of opening the question about whether we have the right to shape our own information. How do we do this around privacy? How do we do this around personally identifiable information? So a lot of the work that I've done is this kind of showing of unforeseen consequences, giving them air. Sometimes it is disruptive, and we have seen a lot of businesses get better at what they do because they've been shamed into it. Yeah. And maybe Gobo might do that. But also, sometimes when you do things like this and put it out there in their face, we've got lots of evidence and examples where you then ignite a lot of really brilliant people to work on that problem. Data privacy has been one of those areas. When we first started out, we were re-identifying your governor's medical record, and then you end up with differential privacy, which is a stronger form of privacy today. It ignites a lot of smart thinkers. Spend just a moment on differential privacy. Differential privacy, for a lot of people in this room, is like one of the most interesting ideas that you haven't heard of. And given that you've been working on that for a while, give us a quick... Well, you know, there are many ways. So back in 1997, showing how vulnerable data was started the idea of, well, then how do you fix this? How do you share data? Can you make some guarantees of anonymity? And so, computer scientists over that time, I started with the first model, and people have come up with other models. Today, the operative model is differential privacy, which is simply the idea of making sure that no individual's outlier information shows up in the results. I have less than a minute, so that's my nutshell. Part of the way that I tell people about it is that if I have full access to a data set, if I can keep querying it over and over and over again, I'm always going to be able to de-anonymize it.
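A minimal sketch of the Laplace mechanism, the textbook construction behind differential privacy and the reason repeated querying becomes a limited resource; this is a generic illustration, not code from the Data Privacy Lab.

```python
import numpy as np

def dp_count(rows, predicate, epsilon, rng=None):
    """Answer a counting query with epsilon-differential privacy. Adding or
    removing any one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon makes the output distribution
    nearly identical with or without any individual's record."""
    rng = rng or np.random.default_rng()
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + rng.laplace(0.0, 1.0 / epsilon)

# Repeated queries compose: k answers at epsilon each spend about k * epsilon
# of privacy budget, the formal version of a limited number of bites at the apple.
```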
If I have a limited number of bites at the apple, and particularly if I'm looking mostly at data that sits right in the main body of the corpus, and anything else is hidden by noise that's been added, there are ways around that de-anonymization problem. But my point in this, and the reason I wanted to end on this, is that we are not just grumpy, miserable grouches. We are super paranoid people who enjoy looking at where these technologies are taking us, and looking at the unintended consequences, so that we can find ways to push back against them. Now, what I found so inspiring about your work, starting 20 years ago, was the idea that you were out there identifying these problems and saying, these are solvable, we can do something about this. Because that next step, that's the one that I think is so important. And our students over the last 20 years have really demonstrated that, on everything from algorithmic fairness to how you detect discrimination in pricing and so forth. We have a long history of students actually doing that, so that in the end, everyone can have the vision that Susan Crawford talks about, technology without the harms. But this work of finding these unforeseen consequences, airing them out, dealing with them, has to be done. So props to the Knight Foundation for making bets on grumpy academics. We'll make things better. Thank you so much. Thank you so much, Susan. Thank you. That was great. Thank you. Thank you.