Thank you. That was a wonderful introduction. It's been a pleasure to be a part of the fellows' community this year, and really an honor to count some of the people who've inspired me the most as my peers here now. So I'm really thankful to be here. If you're tweeting during the talk, please use the Berkman hashtag, and my Twitter handle is @smwat. This talk is really a reflection on a lot of the work I've been thinking about for a long time here as a fellow, but it's also a proposal for future work, so I'm really interested in feedback from the brain trust in the room and those watching on the web. My main idea here is that we need more stories that ground data in personal, everyday experience. We need personal data stories that make data uses intelligible and their impacts personal. So my way of building that case today is to talk a little bit about my own personal encounters with data very recently, why I think understanding data matters now, and how personal data stories can help us understand data. Then I'll walk through a few exemplary stories that I think really encapsulate the kind of story I'm talking about, and then I have a pitch and a call for stories for you. So first I want to start off by talking about what I do and do not know about myself as other entities see me through my data. Facebook's advertising engine seems to think that I really like cheese boards. Even when they aren't selling cheese, like in the Stella Artois example, they are throwing this image up at me more than occasionally. So I can try to figure out what's going on here. Maybe I said the words "cheese board" and it's generating these perfectly targeted ads. Or it could be that image recognition is taking this cover photo that I put up on my profile page and realizing that, yes, she really does like cheese boards. But I can't tell if Facebook thinks I'm demographically bougie or if it really knows that I'm obsessed with cheese.
So when I look at AboutTheData, which is the data broker Acxiom's consumer interface into the profile information they've collected from behavioral targeting, I can see that Acxiom thinks I am a truck owner and that I have an intention to purchase a vehicle soon, neither of which is true. I'm assuming this has to be based on my father's previous truck ownership, and he owned a truck maybe in the 1990s, so it's probably even coming from DMV records. And for some reason my address had been associated with my parents' address, because I've been moving around a lot and that's where a lot of my credit cards and things were forwarding for a long time. So I can try to parse why Acxiom would think that I am interested in trucks. But I still don't know what Acxiom thinks it means that I like trucks or am interested in trucks. I can't tell if Acxiom thinks I'm a "Truckin' & Stylin'" or an "Outward Bound" consumer, two of the many consumer segmentation profiles that might link to this truck data point. Acxiom shows us this inferred demographic information, but it doesn't tell us how somebody else might want to use it, like a marketer, or perhaps an insurance company or a loan underwriter. Then, when I started to worry about the traces of my connections to friends from my time abroad in the UK and in China, I realized that I can use Facebook Graph Search to query how many people in my network I know in China, the ones who happen to be on Facebook and show up in my buddy list, in the way that the PRISM documents have described the concept of foreignness. But I have no confidence that I don't meet the threshold for confidence-based citizenship, as it were. I don't know what it means to be a person on a buddy list associated with a foreign power, nor do I know whether my use of a VPN would contribute to the score. So my algorithmically determined citizenship is completely opaque to me.
So these are just some examples of the personal encounters I've been having in my daily life, from the very trivial and commercial, like the cheese boards, to something quite consequential, my shifting sense of my own citizenship. And these concerns all point to a certain kind of asymmetry: what's going on behind the scenes is obscured, and when I try to begin to understand these interactions, I am blocked. So I think the real crux of the problem is that we don't understand the causal relationship between our data and its uses in the world. Joanne McNeil has described this as reading the algorithmic tea leaves, which I really like. It's a dark art. We don't understand the how and the why of data's uses, let alone what our data forecasts about us. I like to think of it as a kind of uncanny valley of personalization that we're in right now. When we try to understand the creepy ads that follow us around or are strangely personal, we can't figure out whether it's just coarse demographics or hyper-targeted machine learning that generates the ads we see and leaves us with that sense of the uncanny. So this gets to the larger point: all this data is making our behaviors and our habits and our interests more legible to firms and governments. And as consumers, we haven't yet developed the critical literacies to understand what our data is saying about us and, more importantly, how it's shaping our experience. The other day, a medical professional said to me, when I was talking about my upcoming talk, "I have nothing to hide. If they profile so that a terrorist doesn't blow up the plane that I'm taking to Disney with my kids, I'm okay with that." Now, I started thinking about it, and I realized he was talking about only one use, the one use he thought was justified: that the data was being used to stop a terrorist.
But I couldn't help but think that he might be going to Disney, and he would find the MagicBands they're beta testing right now, where they're tracking basically all of your activity throughout Disney. It's your pass in the door at your hotel, it's the pass to go on the rides, it is your quantified Disney experience. So, when we say we have nothing to hide, I'm trying to understand: how can we know what we have to hide from if we don't know how data is being used? Right now, data is a big black box. It's hard to develop opinions and feelings about what we think should happen with data when most of it is happening behind the scenes. It's obscured and opaque. The flows of data and its uses are hidden. When I started worrying about personal data a while ago, writing about it from the CIO's perspective, I thought that we had an awareness problem in the public. People didn't understand that by using free services, we were paying for them with our data, as it were. I think we've moved far beyond that, and Snowden has heightened awareness even further. So right now we're in a moment where we're primed to have a discussion about how we want our data environment to look, yet we have only scratched the surface of how our data is actually being used. I think this is a particularly important moment because we're moving from a time when data existed about our browsing habits and our mobile experience to a time when more of the physical world is being tracked and measured and becoming data. Our cities, our cars, our homes, our bodies are all extending our data profile. Anything with a sensor becomes fodder for this larger socio-technical system that we're building. We're also transitioning from a time when we intentionally searched for the things we wanted, and search interfaces clearly delineated paid advertisements, to an interface that anticipates our needs and gives us small bits of information at a time, as in the early iterations of Google Now.
Our choice architectures fall away as interfaces become more embedded and anticipatory. So we're learning to live with data as more of our domestic life becomes subject to digital scrutiny, but the ways we interpret and influence the uses of data are about to shift dramatically. So my proposal is this: we need stories that make data uses more intelligible and their impacts more personal. We need new tools for thinking about data's role in our everyday lives. We need stories to be relatable. We need to go beyond the I-have-nothing-to-hide mentality to illustrate the ways our environments are shaped and influence us. We need stories that bring data back from the big data scale down to a human scale. I think in order to have better conversations about our evolving norms and feelings about appropriate uses of data, we need to make the uses of data more legible. That's the only way we'll be able to hold governments and corporations accountable for their data practices. So now I want to walk through a couple of canonical personal data stories that work at opening the black box and making the personal effects of data legible. By now you have all heard about this example: The New York Times profiled Target's algorithms, which looked at purchasing patterns to identify early pregnancy indicators. It also included a story about how the pregnancy coupons reached one family in particular. The father of the household brought them back to Target, inquiring as to why they would ever send pregnancy-related coupons to his teenage daughter, only to find out that she was in fact pregnant.
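The Times story describes a score built from shifts in purchasing patterns. As a purely hypothetical sketch (not Target's actual model; the product categories, weights, and threshold below are all invented for illustration), that kind of purchase-pattern scoring might work like this:

```python
# Hypothetical sketch of purchase-pattern scoring, in the spirit of the
# "pregnancy prediction" score described in the New York Times story.
# The categories, weights, and threshold are invented, not Target's.

PREGNANCY_SIGNAL_WEIGHTS = {
    "unscented_lotion": 1.5,
    "large_bag_cotton_balls": 1.2,
    "calcium_supplement": 1.0,
    "zinc_supplement": 1.0,
    "scent_free_soap": 0.8,
}

def pregnancy_score(purchases):
    """Sum the weights of signal categories present in a purchase history."""
    return sum(PREGNANCY_SIGNAL_WEIGHTS.get(item, 0.0) for item in purchases)

def likely_pregnant(purchases, threshold=3.0):
    """Flag a shopper whose combined signal crosses the (invented) threshold."""
    return pregnancy_score(purchases) >= threshold
```

The point of the sketch is how little it takes: a handful of weighted categories over a purchase log is enough to trigger a coupon mailing, which is exactly why such inferences can surface before the person has told anyone.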
Now this story has become canonical in part because it does a lot to educate us about what was actually going on behind the scenes and how they were determining second-trimester pregnancy, but it also tells us the impacts of a practice, and it makes them concrete by dealing with the social impacts on this particular family. More recently, Mike Seay (I'm not sure I know how to pronounce his name) received a direct-mail envelope from OfficeMax that included "Daughter Killed in Car Crash" in the address. This failure exposed just how egregious market segmentations from data brokers could actually be; it exposed the kinds of lists that data brokers are keeping on us and the sorts of information they think is relevant. But how might that information be used, and more importantly, how should it be used? Clearly this was a failed use, but how might it otherwise be being used? This story connected the personal effect, an insensitive reminder of the loss of a child and its traumatic circumstances, to the system behind it: it implicated OfficeMax for its use, as well as the data broker for its database classification. We began to understand how something like this could happen, and it now stands as an example of this kind of failure. And this is actually my story, which I wrote in The Atlantic and which is linked on the event page. I had deliberately chosen not to update my Facebook status when Nick and I got engaged; very intentionally, I didn't want it to show up in the database. But then one day Facebook asked me how well I knew him and displayed an ad for a custom engagement ring right next to that question. It turned out, when we asked Facebook what was going on, that it was a coincidence: a service-enhancing survey to improve the relevance of my news feed happened to match up with a demographically determined ad.
These two pieces were run by different algorithms, but the coincidence didn't lessen the effect of feeling as though Facebook had intruded on my personal life. Even after talking with Facebook to confirm what was going on, and that it was a fluke, I still had no answer as to what factors went into the algorithm that asked about Nick, as opposed to any of my other friends, as a person of interest. Was it the sheer number of images we were tagged in together, or our increasingly overlapping networks? I have no idea. I still don't know whether I was getting the engagement ring ad just because I was a female between the ages of 18 and 35 without a relationship status, or because a more complex series of behaviors across the site alerted Facebook that it seemed like Nick and I were getting more serious. My Facebook story showed that even though the ad and the user survey were coincidentally displayed together, their effect on me was not incidental. So what is it about these personal data stories? They detail the effects of data and algorithms on our everyday lives. They aren't about data breaches, where we have no idea whether we are affected or whether we should be worried. Data stories explain what's going on behind the scenes. They give us more information about how these black boxes are actually working, but they also give us a framework and a vocabulary to begin to interrogate these data environments. They expose the logic of the engineers building these systems, their data science practices, and the reasons for their interventions, and they detail the consequences of design decisions and power structures. Data stories are also concrete. They happen to real people. They are not obscured behind big data rhetoric. They are grounded in individual experience. They give us a sense of what it means to be a digital person today, and they describe how the dynamics of our roles as consumers, citizens, and individuals are changing.
I first became really interested in personal stories about data through my research in the Quantified Self community. I found that individuals were using numbers as storytelling devices. The show-and-tell format is quite literally narrative using data. These data stories are full of thick description and leave room for discussion about the individual, their feelings, their interpretations, and their sense of self. Like the stories in Quantified Self show-and-tell presentations, the personal data stories I'm interested in are about identifying personal meaning, or the effects on the individual, through understanding the uses of data. But most importantly, I think these personal data stories have the potential to restore the subjectivity of individuals to an otherwise objective medium of data. Personal data stories are really hard to tell. This is actually a Reddit comment (I know, I should not read them) in response to my Atlantic article, and it indicates the trouble of telling personal stories and the subtlety of talking about privacy from the database rather than privacy from people. It's not just the internet trolls that make personal data stories challenging to tell, though. Data stories are hard to discover. Individuals aren't necessarily primed yet to be critical of these patterns and the strange things that happen when there is a coincidence, or a fluke, or a change in the design that exposes something interesting. These rifts reveal the seams of the system, but they're hard to see. Personal data stories are also anecdotes. Sometimes the effects are technically repeatable, but often they're not. And they are exceptional, and so by big data standards statistically insignificant. Data stories also need resources: you need to reverse engineer what is going on, or you need the skills to sandbox, to build out hypothetical digital profiles to compare and contrast outcomes.
Or you need the journalistic clout to get a response from Facebook to figure out whether what you see is intentional or not. And in that sense these stories can be taken out of the voice of the individual affected and end up appropriated by journalists. It's also challenging to tell these stories with any nuance. There is always a risk of sensationalizing the concerns, and the Target story is certainly an example of that. There is a delicate balance between highlighting these exceptional cases and grounding them in the effects on our everyday lives. Personal data stories also risk the personal privacy of the individuals involved by heightening their profile and their plight. There is also a danger of personal attacks on these stories (i.e., Reddit). But these stories are all the more compelling if they come from real customers, real consumers. If we can answer the questions they have, we can get at the core normative questions of a conscientious but not necessarily technically savvy individual. Data stories will inform future design choices and policy decisions. They will serve to educate the public and representatives about the stakes at play, and where individuals are still not sufficiently protected, we'll start to see where the regulatory holes are. I want to see more data stories because I think they can change the nature of the conversation we can have right now; they can even level the playing field between all interested parties and ground digital practices in human-scale effects. Personal data stories will help us uncover the politics, epistemologies, economies, and ecologies of the socio-technical system for which data is becoming the primary substrate.
I think this idea for personal data stories fits into a much larger emerging suite of tools and practices that expose the seams of data uses and the algorithmic design of our built environment, so I just wanted to talk about how it fits alongside two other related sets of activities. Lots of people are creating technical interventions, building tools to make data more legible. This tool, Immersion, takes your Gmail metadata and exposes it, which allows people to actually comprehend what is embedded in their metadata and what the meaning in their metadata is. And this is Ben Grosser's Facebook Demetricator, which I actually saw at Theorizing the Web this past weekend. It's a browser plugin that hides Facebook's quantifications: likes, friend counts, timestamps. He calls it critical software, meant to reveal how Facebook structures its use, and possibly addiction, with quantification. And Ben actually talked about how exposing or removing the numbers changes people's behaviors: they don't want to be the first one to like something, or they don't want to be jumping on the bandwagon. And there's another class of interventions which are personal, very personal, but more performative and somewhat privileged. Janet Vertesi presented this past weekend, also at Theorizing the Web, on her infrastructure inversion project. She hid her pregnancy from the internet by paying in cash, browsing maternity websites with Tor, and asking her family members and friends not to write about her pregnancy, even in private Facebook messages. It's a really compelling story and you should all watch it; I can send around the link. And in her recent book, Julia Angwin takes extreme measures to prevent tracking and protect her privacy. She used a Faraday cage for her mobile phone, and she even created a fake identity, Ida Tarbell, to separate out her commercial activity online.
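To make the Immersion example above a bit more concrete: a contact graph can be built from To/From headers alone, with no message bodies at all. Here is a minimal sketch of that kind of metadata analysis; the sample messages are invented illustration data, not anything Immersion itself ships:

```python
from collections import Counter
from itertools import combinations

# Minimal sketch of Immersion-style metadata analysis: build a contact
# co-occurrence graph from To/From headers alone. No message bodies are
# read. The sample "messages" below are invented header data.

messages = [
    {"from": "me@example.com", "to": ["alice@example.com", "bob@example.com"]},
    {"from": "alice@example.com", "to": ["me@example.com"]},
    {"from": "me@example.com", "to": ["alice@example.com"]},
]

edge_weights = Counter()
for msg in messages:
    people = sorted({msg["from"], *msg["to"]})
    for pair in combinations(people, 2):
        edge_weights[pair] += 1  # each co-occurrence strengthens the tie

# The strongest edge already reveals who I talk to most: pure metadata.
strongest = edge_weights.most_common(1)[0]
```

Even three invented messages surface the strongest tie; at the scale of a real inbox, the same counting reveals your closest contacts and how they cluster, which is exactly the kind of meaning a tool like Immersion exposes from metadata.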
So I really like these examples, but they're as much performance as they are experiment. They are performance pieces to demonstrate the futility of perfect privacy as a goal. And in that sense they don't depict the realities of everyday life, except in the ways that privacy protection hampers life. In contrast, my goal in talking about personal stories from average consumers is to help ground the trade-offs and better inform practical decisions in everyday life. My interest in telling these personal data stories is grounded in a larger vein of technological criticism. In much the same way that cultural and film critics discuss what's important and interesting about our artifacts, technology critics could uncover both the artistic and cultural importance of technologies as media, as well as the power dynamics inherent in technologies as political artifacts. Technology criticism should explore our relationships to firms and governments, as individuals and as societies. So I'm advocating for technology criticism with an anthropological flavor. To that end, I have a pitch for you today. I want to build a column for telling personal data stories. It would look something like The Haggler or The Consumerist, but for data and algorithms specifically. I think there needs to be a platform to tell these stories with some regularity and some consistency. The format would be similar: an investigation into a particular case to solve a personal problem while exposing the larger systemic issue at hand. The column would be a means to surface these stories, explain them for an individual, describe their case and its impact on that person, and reveal what's going on for the rest of us. Data stories will also serve to develop our attention, so that we notice and scrutinize when we come across something strange over the course of our digital lives.
So I think about this in terms of maybe a regular column in a popular publication, largely for a lay audience rather than a technical audience. At the very least, I think it could be a single-purpose website to collect and share data stories, so I'm absolutely open to suggestions and alternatives. And I wanted to finish out with this; I couldn't not include a picture of a cat for an internet talk. But I want to make a call for personal stories. I want your feedback; I want to hear your thoughts. This is completely a work in progress, and I want to start getting it off the ground, so I need help to solicit some of these consumer stories. So: do you have questions and personal encounters with data that you want to share? And I want to open it up to the room, too. Do you have screen captures of weird things that have happened? What are some of the compelling examples that have really changed the way you think about your relationship to data and its uses? So with that, I just want to open up the floor for questions. Thank you.

A lot of these systems seem to depend on knowing who the people are and filling a huge database about them. Are there any systems that can operate without knowing who the people are, just sort of anonymously reading their choices and making guesses about what they are doing on the internet, and trying to target using that?

I guess the distinction, I think, is that there are plenty of tools claiming that everything's double-hashed and all protected, so that you're not really knowing an individual, but it depends on how you define knowing an individual. Do you define that by a name, or do you define it by a series of behaviors? To me, the line between identity and activity is a blurry one. Does that make sense?

I see your point.
Their objective would not be to take a person's name and build up a history behind it, but just to understand an identity and target accordingly. Is there any system that has that as an objective, instead of actually finding out who the person is?

I'm sure there are technical structures that would allow for that, but I'm struggling to come up with examples off the top of my head.

Web searches work like that: the ads that you see depend on your past history and your query results. So Google does not always try to identify that it's John or whoever, but they keep a profile of your past searches and past queries. That's how you see those ads. They don't need to know it's you; they care that I searched for a plane ticket, and then they start showing me ads for it.

That's the line between browser cookies and being logged in: you can use Google and it will pull your browser cookies without your being logged in, but when you log in, it's a different experience. And the systems in places like this are more of a heuristic, organic process; they look at the large scale, all your activity on the entire internet.

Not a personal story, but I have a friend online who basically said, well, I appreciated seeing these ads for this thing when I was searching for it, but now that I've bought it, I don't want to see any more of those ads. Is there some way to tell it that I've actually bought this thing?

I would love to see that. I think right now it's back to this uncanny valley point: for example, a loyalty card does not necessarily sync up with your browser cookies. The one company to which you are very loyal does not know that you purchased something in person, or hasn't resolved the ad network with the credit card network. David?

At the beginning you spoke about wanting to look inside the data black box.
I was wondering: when you talked to people at Facebook and other places, did they say, well, the black box is just too complicated, even the people running the algorithms don't really know what happens? There are thousands of parameters, and they may say, well, we don't know; maybe no human can hold all that complexity in their head.

Yeah, and I think that's one limitation of even advocating that we just need to understand how this works better. It's a kind of knowability problem. So I think there is some extent to which having more concrete examples of the effects at least helps us understand how this thing is being used, and I mean that in the most basic sense: is this marketing data being used for insurance underwriting purposes? I think it gets back to this problem, which is that all the data begins in one place, but its uses are infinite, and we haven't really put limits on what appropriate uses are. That gets more toward the regulatory discussion, where we say that, no, marketing data will not be used for insurance underwriting, et cetera. But we can't really talk about that until we know it is being used that way.

So, Sara, this is really interesting. One of the things, and this is just to throw it out as a suggestion, is that I know there are tools out there that let you easily compare legal documents, like the privacy policy of, say, Facebook, because they do change quite a bit. And the other thing is that this is such a moving target; these stories, which I feel would be really valuable, might of course shift in the future with the policy and practice of these big companies. But what I'm picturing is some way in which you could have the privacy policy of Facebook, say, and then as you read through it as a reader, you could have links to your stories, or an annotation where it said, like, for this
policy, which sounds really hard to read, all that legalese, here's a related story that illustrates it. So you'd have these stories as little gems along the way through what is an explainer for the legal documents of these big data collectives. I don't know, I would think that would be a fun, well, useful way to make it actually concrete, and a way to archive the stories. You'd have to make sure you dated them, like, on July 2013, right.

Right. The flip version of that is that when Facebook introduced Graph Search, they made a personal walkthrough. It wasn't a video; it was actually you walking through your own Facebook profile: here, let me teach you how to search for friends. Obviously they were encouraging you to learn the service, and they had their copy spin on it, but to apply the same logic to the privacy policy is great.

Yeah. Your idea for a column on this is fabulous.

Thank you.

Oh my gosh, it's a great idea. Let me just ask you a couple of questions about your vision of that, the intention, what it could accomplish. At the very beginning you talked about that place of discovery, that if it's free, you're the product, and you said we're past that. I would suspect that if you were doing a lay column in a magazine or a newspaper, the audience is not past that, and the function of the column would be consciousness raising; it would be a revelation to most readers that this stuff is going on. So it's not yet building critical literacy, but it could be leading toward it, so those would be layered. And then I also want to ask you about the black box thing, because the critical literacies may be impossible to achieve by definition. The black box is actually designed to be black, and so you can take little pinpricks, maybe, and get a tiny little peephole. So, for your project, if you start
with consciousness raising, and that creates capabilities for critical literacies, but the critical literacies are right now impossible to achieve, because the thing that you want to be literate about is by design illegible, then you're into politics. So I was just wondering if you had thought about that, if you were looking at it in a multi-layered way: you start with consciousness raising, but do you also have thoughts about how you mobilize politics with that consciousness? Because without the politics you can never have a literacy. That's my point.

Yeah, no, I completely take your point. My instinct in trying to unbox the black box is that these systems are intended to be black, but they are not perfectly so. All of these examples are cases where the seams expose themselves, and that's where we can lift the envelope a little bit, right?

Exactly.

And so we just have to know where to look for those clues, I think. Even that is the consciousness raising, or a skill set: to say, as a consumer, huh, that's weird, and not just think, that's really weird, and go on with your day, but send it off somewhere to try to figure out what the heck is going on. So I think there is a step toward the column even being a way to teach people to notice that fluke moment.

I agree with that. I would say, too, that you're not going to get the public policy debate if you don't have the critical literacy and the demand; it's a layering. I was just curious whether you had projected yourself down that path.

And I think it does sort of fit in: there has been a lot of activity in discussing data broker practices right now, so it's not a non-issue politically. But I think it's still really hard to understand the edges or the contours of what we're talking about, because we still don't know where the contours are.

But even if you try to regulate brokers, they'll just go offshore, so what does it matter, right? So
because there is no border on the internet. That's why the NSA hits us at home and abroad: there's no dividing line, right? So it's kind of a futile battle.

Well, to that end, I should give Acxiom credit, because they are on the leading edge in trying to get ahead of this, for obvious pre-regulatory reasons; they are allergic to that discussion. So they're starting to say things like, well, we need a better privacy policy within the data broker industry, or a privacy-protecting concept in general. Scott Howe of Acxiom has already come out and said that the data broker industry should just declare that it will not use this data for insurance purposes or loan financing. But it's one thing to say that; previously they have said things to the effect of, well, we don't even know what our customers do with our data. So there's no enforcement potential there.

My only thing is that I keep waiting for a story to break about some really weird stalker who gets very savvy with this kind of data. California just brought out laws against revenge porn, and there, all these stories of personal horror finally motivated lawmakers to at least try to do something.

What is your take, sorry, if I may ask you a question, on this MIT thing? I'm sure you know about it: MIT is getting ready to make the content of people's email in their MIT student accounts the property of the students. And there's Omlet, which is this social media app where you own the data. So I keep wondering about when people say that free is actually expensive, and whether you should pay a little bit of money for something you spend all of your time using and pour your whole life into. Even with HTTPS, and I went on to explore this, basically everything you use that's digital is not actually secure, ever, because every measure has a countermeasure; it's an endless arms race. I
At some point, I personally would rather pay. Like, I love watching the BBC, so I wish I could pay for a BBC TV license instead of using a VPN to hack into it. So I think at some point we're all going to just sort of wake up and say, you know, I'm spending $100 or $150 or $200 a month on digital services; why don't I spend another $50 and own my digital life? Well, it could be. My son is eight, so he would have a chance at some privacy in his digital life.

But do we want to tie, like, financials to privacy rights?

What do you mean? I'm sorry, I don't understand your question.

I just mean, for people who can't afford to buy privacy technology, it should be subsidized, guaranteed by the government as part of your basic freedoms. And we have to advance sort of personal norms to include that, to have a functioning socio-technical experience where data is the substrate, which is a line of yours that I love and I'll quote.

So I think I would go further and say... so I actually like to talk about this stuff not in terms of privacy and privacy rights, but just in terms of, like, manipulation and kind of predatory uses of data. And I think that goes beyond just a vague sense of, well, should you know or not know about that. It's going further and saying, how does the use of this data foreclose my ability as a human being to make choices, or to understand my world? It's a subtle distinction I'm still trying to tease out.

Jacques Attali spoke at Harvard last Monday, and he talked about a service that a friend of his runs in New York, based on a book that he wrote in '77 about music, where they can actually determine your credit history based on the kinds of music you listen to and where you live. And so they just mined 500 million data points in music, and they've nailed everybody just based on music preferences.

I like the basic approach you're taking, and I think the idea of coming up with data stories, personal stories, makes a lot of sense. Can I just add a little complexity to this? What you have is very tidy; there are some complications. One is generational differences, generational attitudes toward authority. There's the irony of the boomers, who never trusted anyone over 30, then bringing up a generation whose parents are their best friends. They have very different views of authority figures, and therefore of who manages their information. The other important issue is just the basic economics of the net. For some very strange reason we ended up with free, and there is no such thing as free. So someone had to pay, and it's the advertisers who pay and who will continue to pay. It's too late to ask people to pay; there have been attempts, but they don't succeed. So it's really changing the attitudes of the advertisers that will change the behaviors. If they see an economic benefit in acting differently, they will.

One of the other problems with all of this, which I get inklings of in some of your stories, is the attitude toward big data. Big data I sort of bristle at; I guess I'm old enough to resist some new terms. It just evokes the idea of the electronic brain, that original computer fantasy. Just because it's big data doesn't mean there are smart people using it. Absolutely. So it doesn't make it any smarter, and believe me, as you know, the computers themselves don't have any intelligence.

But I'd like to bring up a different direction that I think the advertisers will follow, which will change their behavior. One of the things that I do is work on community building, and community building will be one of the major buzzwords for online marketers. And the big difference in community building is that instead of the oppositional us-versus-them, consumer versus company selling things, or the government, you have companies realizing that if they have consumers meet each other online and reveal data to each other that's of interest to them, then that can still benefit the advertiser. And we do a lot of that in
politics, where we put people together who have the same attitudes, and you're starting to see that move into retail and other directions, where people who buy shoes together, people who discuss clothes together, build communities, and they want to reveal that data. So I think that will create a natural shift. So I think we have to sort of wait for, or help, the market to shift, rather than try to impose a new way of looking at it.

Thank you. Peter, did you have something?

Yeah. Sarah, my story would be that a year ago I decided that I didn't want Google knowing what I was doing, so I started using DuckDuckGo. And after about a month I gave it up, because the results were just clearly not as good. Google knew what I was searching for, and the results were more effective. And I sort of would like the idea, if Disney could figure out how to make Walt Disney World someplace I would ever want to go again, and make it more efficient. So there's a certain amount where the data brokers can make our lives more efficient, more focused, speak more to us. And if there are bad data brokers, when they're giving you a cheese board when you don't want it, that's a sign that they're not doing a very good job, and they're going to go out of business. And so eventually the algorithms need to get better and better. So what's your response to the idea that there's not much to worry about here?

Right. No, I totally take your point. I think my instinct is to go to the idea that all of what you just listed are appropriate uses of that data, and we know generally what the tradeoff is. Logging into Google is going to give you better results; that is a clear benefit and tradeoff, and you have decided that that is a tradeoff you are willing to make. I think there are a lot of things that are far more hidden, like all these data broker examples, where you don't see why somebody would want to know that your daughter was killed in a car crash. And so, I think, to me the distinction is a lot about this: we see pretty clearly what the good uses are right now, but we haven't gotten a lot of visibility into the more nefarious uses, and so we need to tackle those.

You think that's a nefarious use? If I were trying to sell something to someone, I think it would be very helpful to know that that person's daughter died in a car crash, and so I would know that there may be certain kinds of advertisements that would not have a negative response at that point. So yes, it's unfortunate it showed up that way, but it still seems like it could be valuable data to have.

But also, if you could figure out that somebody had an addictive personality and then you started sending ads for Oreos or Twinkies to them, maybe that would be a problem.

Right. Well, so the example, following up on that, that I've been thinking about lately is: okay, you have your internet-of-things fridge, which has learned your eating behaviors and preferences, and even though theoretically you shouldn't be eating a lot of Oreos, it just keeps ordering them for you automatically when you start to run out. And there's this kind of line between, like, the libertarian-paternalistic internet of things and the just plain exploitative internet of things. So I feel like it lies somewhere in between those: what is good for us, and to what ends, and to what purposes? The other example I really like is: what if that information is used for targeting predatory loans, acting as a proxy for the credit scores that low-income populations just don't have, and that data ends up being essentially a proxy for discrimination based on race and things like that? So it's not race, but it's, well, you use Twitter a lot and you have a phone package that is month-to-month, and there you go.

Ten more minutes. Kendra?
All of the stories you talked about had this sort of very personal, very intimate quality to them. It was so shocking because it was a pregnancy, or it was a family death, or it was something about your relationship. And I'm wondering if you think that those are the types of stories that are most effective at getting at some of the big data problems, or if there are other stories that potentially cover some of the same ground. Can you talk a little bit more about how the sort of valence of the story changes people's perceptions?

Yeah, thank you for pointing that out, because it is a good thread throughout them. I think there is some instinct that it is very personal; it makes it as personal as it can possibly be. Like, your family, your body, your sense of self is really tied up in all these things. I guess I'm trying to think of other examples that are a little less heavy.

Yeah. One is, a dad busted his daughter having a New Year's Eve party when the rest of the family was out of town, by noticing on the smart meter an unusually high use of electricity. Right. And number two is, there's been a lot of this in New York in particular: there was a particular ring of slumlords who were busted by careful analysis of the communications of their companies and how these shell companies were all tied into one another. And they were able to take down slumlords who were making terrible places for people to live, and that was a wonderful thing that happened.

I have occasionally read, though I don't really know how much is true, claims that some companies online might just have differential pricing for different people, depending on who it thinks they are: that you and I might go to the same site and it might charge you $66 or $77. Has that really happened? And if so...

Well, so the good example of that is the Orbitz story, which was that they were not giving differential pricing, but displaying different orders of results based on whether or not you were using a Mac or a PC. Macs got a lot higher first choices. And this is back to the, like, choice architecture...

It's owned by Google now, isn't it? Is that true? Yeah. Does that help it, too? I don't know.

Yeah, I don't know. Is this a widespread practice? It seems like certainly that's something that would get people's attention, if seemingly generic things are being charged at different prices to different people without their knowing it.

Right. So that's the kind of technical intervention that's really heavy lifting to do, and so those are harder things to tell as personal stories, but I think they're absolutely very interesting and revealing. So I think there are a fair number of people experimenting with this by, like, sandboxing, creating profiles of different people, and doing, you know, programmatic cross-search queries and things like that. And that is beyond the scope of what I am capable of, but I think it's fantastic, and I want to see more of it.

So I wanted to return to your notion of the uncanny valley of personalization, which I find really fascinating. And I know that historically, at least, people who build technologies want to avoid the uncanny, for example robots that look too scary and creepy; we don't want that. And I was wondering if this is something big data has engaged with: the possibility of being uncanny and therefore off-putting, for example with the dead daughter example, right? I mean, the cheese example to me is even more uncanny, right? It's like, why do they know how much I like cheese when I'm not even talking about it?

Yeah. I'm not sure if it's really being talked about within, like, the advertising community, or among people who are trying to just get ever more perfect information on people. I guess I'm not sure what the result would be, right? Like, would that mean that you make it just personalized enough, but not too personalized, so that it doesn't jump into the valley? Or is it this spot where we just don't even know how personalized it is, and so we can't tell, as users, whether it's getting there? I don't know. I guess this is the question, right? And, like, where our threshold lies changes as things get more and more smooth, I guess.

Any questions? I just finally figured out why Facebook keeps putting up men's magazines, particularly from Germany and South Africa. I have a German surname, and I changed my status to single, just as a joke with some friends, to see what would happen. And I think they just thought I must start reading men's magazines, and they kept coming up in my feed, and I kept saying, I don't want to see this.

Yeah, I really like those activities where people are, like, pretending, tweaking certain things. So, like, I have a friend who says she's male on Facebook just to see what happens.

What if you switched? I'm listed as male; what if I changed to female?

Exactly. Would you get Cosmo ads? I don't know. It's a good experiment, though. This is what I love, this, like... how can you transgress, right? It's this: how do you find ways to tweak your experience and be aware of and interested in following what happens? But it's still tea-leaf reading, right? Like, what does this really mean?

Sam, just really quickly. I'm trying to think of the next steps here, because my question is: who do you think, ideally, should be creating these stories? Because you've got companies that own the data that could potentially create them, but they're probably not so incentivized to. You could make stories about yourself, but it's very difficult to actually gather enough information; like you said, most of it's kind of these incidental, accidental things. Or should there be people making tools to empower people to make stories about themselves? What do you see as the best approach?

Right. I think "all of them" is probably the answer. I think, even in the
example of companies doing this, I think there's a lot of work to be done just to be transparent about what you're doing, and to understand how users are interacting with your data in a more holistic way. Just saying, well, here are the stats, our app is used by 80% teenage girls, that's not enough to understand how they're using it or what they think about the data that's being exchanged there. Right, yeah, there is no story there. And so it's kind of more an advocacy for helping everyone learn how to ask ethnographically informed questions, which is a larger goal. But even in that sense, I think journalists can start asking these questions. Though I would argue for more engaging of the voice of the individual, because oftentimes these stories just get so sensationalized: "this happened to one person, and you'll never guess what happened next." So does that start to answer your question? But I also think the other piece is having technologists help uncover these things, and that's an ongoing effort. I think I would love to start pulling together resources, all those tools that can help consumers play around with this and kind of experiment, too.

Yeah, I agree. I think it's the pairing of technologists with people who think they maybe have a nugget of something that is evidence of an algorithm. It's really about connecting those people to be able to start telling the stories together, because that's really the challenge here: these are technical problems, but they're also very much personal, social stories to be told.

Thinking it through, I noticed one other thing, which is, you know, the consciousness-raising aspect of it, which is an aspect of a lot of this discussion; it's really exciting. And then I began to think about another potential side of it, which is possibly more disturbing, which is habituation. So I was thinking about, like, those columns that started with Miss Manners. So when Miss Manners started, it was: there's a framework of rules, and could you please inform me what the correct rule is? Right? So that was "how do I fit in and be correct"; that was what Miss Manners was. And then, as the process of individualization progressed through the 20th century, on the meta level, it turned into Dear Abby, which was: well, I'm in a situation and there are no rules, so I really have to make it up, and could you help me figure out how to make it up? You with me? So there's that shift. But then Dear Abby, which started off, you know, like Descartes, "wow, I'm inventing something," ended up being habituated, you know, because it became just routine. So, given the political context of this, it would be really sad if these data stories just became habituated, just became routine: like, oh yeah, that happened to you? Well, look at this, that happened to me. So I just wanted to put that out there, because I'm inferring from your work that you would not be aiming for habituation; you would be aiming for something much more critical. And so then the question arises, and I don't know what the answer is, but: how do you move forward with this idea so as to raise the probability that it has that critical impact, and lower the probability that it leads to routinization?

I mean, I think one piece of it is just integrating the more kind of power-dynamic story into it, so addressing both the interests of the individual and the interests of the, you know, data broker, or whoever the kind of target is. And I think it's hard to tell those stories with an even keel, to uncover everyone's interests, but at least there's a step toward: well, you could do one thing, but, you know, their instinct about regulating this is such-and-such; and then make the argument that, you know, consumers still need to be protected, and so on. So I feel like there's room for a dialogue, but I don't know. It also probably depends a lot on where
the column ends up, and, you know, all of these... Well, you choose, right?

So I have a theory on where this is going to go. I think people are going to want to own their data, so they'll pay to own it, but they'll still want to be able to participate in information sharing with companies and so on. So an intermediary will arise that you will then transact with, to represent you anonymously with all these other entities, so that you can participate in community building while still maintaining the privacy of who you are.

Right, this is the kind of data-locker model, right? Or, like, data third parties... sorry, personal data stores that are acting on your behalf. But you still have to trust them.

So, yeah. Currently Google is incredibly careful about protecting my data from advertisers, because that data is what Google sells to advertisers. Advertisers have a terrible time doing anything other than, like, sending requests. The black box works both ways right now. Google is the data store, and we generally trust it, in the way that we trust it.

Yeah, our behavior indicates that we trust it. But who's the "we"? I think it depends on what you define as a vector of trust. I mean, just because I trust Google to give me good search results doesn't mean that I necessarily trust what else they're going to do with the data of my search results. Or there could just be an unexpected breach, somebody gets that data out of Google and it gets harvested on the net, and then we're in a whole new world.

Yeah, Maggie? I was going to channel a question from Earhart, and I also have something that goes with my own answer that maybe I'll share after you. Okay, thank you. But his question is: could your theory of change include stories where people benefited from personal data collection?

Oh, yes. So, Quantified Self, right. So I didn't talk extensively about that, but that's kind of where some of this is coming from: a very much personally empowered interaction with data collection and data uses and data outcomes. And so that's where I'm coming at this from: a particular group of people who are slightly more technically savvy and have good outcomes.

I just wanted to follow up on your idea for columns. In Quantified Self we're actually working on a project about trying to share more people's personal stories, so we should definitely talk.

Great. So I think this is a great idea, because we're already trying to do something very similar. Right.

Yeah, I just cannot get around a different question... can't get around the fact that in the recent history of market research there is extreme deception. Deception, as in, like, fraud? This has been going on for a long time. You go back to the '90s: television surveys and, I mean, telephone research were complete deception. You're telling people at that time that you want to talk about product development, and you're in fact collecting discovery, exchanging discovery between, you know, legal departments at large companies. You have field research at shopping malls that has nothing to do with product development; you have focus groups that have nothing to do with product development. But people have this illusion that that's what it's about, when it's not, right?

Well, and that still gets back to: what is the original intended use of the data, and what is the potential use of the data? The data mining was going on at the same time. In the '90s, you know, what we took to be product development market research, where you pick up the phone and we talk about Colgate toothpaste or whatever, was becoming something that was not what people thought it was. Right. I mean, I worked for these companies downtown; why would you talk only to people who are over 70 if you were trying to sell them toothpaste? Right.

So I think we have time for one last question. All right. So it seems like the stories, as you collect them, are going to separate into two groups: one is stupid, the other is evil. And the stupid is the sending out of the email with "daughter deceased"; that's just stupid; clearly there was no intent to do that. Evil is what isn't being described yet, but the offshoot of Orbitz charging different amounts is sort of a digital Jim Crow. I could decide you're never going to buy my product, so I'm not going to offer it to you, because you're not the right demographic, and all of a sudden you have people shut out without even realizing it, and that starts verging on evil. So I'm wondering if you have thoughts on which side you're going to be on; not which you're in favor of, but which you emphasize. You're going to have to do some selection on the stories that come in. So do you champion fighting the evil, or do you just point out how stupid the advertisers are?

I mean, I think the more the merrier, in a certain sense. I think it all gives us more insight into what is happening.

It will help on the other side as well. There are some examples of manipulating apps. Like, there is an app called Waze; it was a very popular example. Some people hacked into it and started creating fake location information, so the app starts thinking that there is some kind of congestion and starts directing other people away, so you get the highway to yourself, for example. So I think the data mining, the evil part, works the other way as well. And I, for example, because of this Mac thing, I use a virtual machine on my, like, Windows machine. So when I want to buy something, I use Windows, basically, so I just avoid those kinds of Mac effects. So I think the data manipulation could happen on the other side as well. Like Google bombing; that's an example from the users, right?

Right, right, exactly. I have lots of feelings on Google, but we won't get into that. Well, thank you, everyone, so much for joining me today. If you have any ideas or want to make connections, please do email me, or get in touch on Twitter, or however you prefer. Thank you.