Hey everyone, I'm really, really excited to be here today for a lot of different reasons. But probably the biggest one is that I always joke that my parents have no idea what I do in life. And I thought I had already overwhelmed them, but then I told them I was going to be speaking here. And they were like, you're speaking at a place that has exclamation marks in the name. I don't know how to pronounce this. So thank you all for helping me raise the bar on having my parents not understand anything I do, ever. Oh, my name is Mimi. Before really getting started, I want to start with a story. Specifically, it's a story that I wrote when I was working at Quartz, the publication. And what it talks about is how, for all the vast amounts of location-based data that Google has, there are still some places that are impossible for the company to map. And I want to show this because we have satellite and aerial imagery that we can capture for these places. But when we try to look at the actual geodata, they show up as blank. So here are a couple of examples. This is Makoko. This is a floating lake community. Is the video playing, can we tell? Now it is. Okay. It's a floating lake community in, yeah, there we go. Cool. So you can see it. Well, now I ruined it, but you can see. The left side is the satellite imagery, but then you look on the right side, that's the geodata. There's nothing there. And there are about 100,000 people who live here. Having been here, I could say that I think there are actually more than 100,000 people. But you can see that it shows up as blank space. So here's another example. This is in Chad, the Lake Chad drainage basin area.
So there are a lot of different areas around here, and a lot of people who are doing humanitarian work have talked about how there's no data here and how this is a gigantic problem, because the area is home to almost 2 million people. Here's another one. This is in the Gobi Desert. And I think you're probably getting the idea of how this works now. So once again, this is an area that's really remote. You can see there's virtually nothing on the map. And then our final example is Morro dos Prazeres. If there are any Brazilians, I apologize, I know I pronounced that wrong. It's a favela in the heart of Rio. And once again, we kind of have the same story, which is that you can see in the satellite and aerial imagery how much is there. And then as soon as we start to look at the geodata, we see nothing. So my question is, what is actually happening here? As many of you might have already started to realize, this isn't a story about Google making places invisible or trying to erase people. In fact, Google has done a lot of work to try to map difficult areas. And in Brazil, in particular, Google has done a ton of work trying to map those areas, working with local nonprofits there. Even still, only about 2% of favelas are mapped. We also know this isn't just a problem with Google, because maybe you're familiar with OpenStreetMap, which is sort of the open source version of creating mapping data. And if you log on to OpenStreetMap, you will see pretty much the same story with all of these places. So what are we seeing? I would say that we're witnessing a tension of sorts, sort of an in-between space that is illegible to the systems of data collection that we have. This is a story of that tension and the odd things that result from it. In a sense, what we are seeing are the implications of a machine readable world, and specifically what happens to the spaces that resist that.
Because that is what all those places that I showed you do. The nature of how they're constructed makes them illegible for our systems of collection. And this is really what all of my work is about. It's about these interesting moments of tension that are inevitably tied to the fact that we live in this time of mass data collection like never before. All of you know this: the very world around us, and we as people, are increasingly being rendered machine readable. And as we are being, and continue to be, transformed into data objects, there are these interesting things that happen and assumptions that get folded into those systems. And we all know this academically, but in a way that feels so normal that we don't really have to consider it too much. But then spaces like these force us to acknowledge it. And not just acknowledge it, but consider those strange effects. So for instance, the way that I found out about this story that I wrote on these places was through an article, in fact this article. This is by a Brazilian academic, Ronaldo Lemos. And in the article he talks about how when Pokémon Go first came out, nobody in the favelas could play. The reason for this is that Niantic, the company that makes Pokémon Go, was using Google's data for the underlying geodata for the game. And so as a result, nobody in the favelas could actually play Pokémon Go. And in fact, there are way more serious consequences. We could talk about consequences that have to do with how people in the favelas even get mail in the first place, because people don't have mailboxes. Or what it means to not have access to a lot of other state services, electricity and so on. But this was the one that really started this whole conversation in Brazil. Because it forced them to think about it in this different way, in a way that felt very personal and related to them.
So hello, as Nabil said so nicely, I'm an artist, but I have had a lot of other titles in my life. Here are some of them. I don't really care, I usually just say artist. And the work that I do takes various forms. So sometimes I'm a programmer, sometimes I make websites, do net art, sometimes I create physical objects. Sometimes I do performances, often I write. All of these are different forms, but I'm trying to do the same thing in everything, which is to poke into the implications of this very quantified society, or at least this more quantified society than what we've ever seen before. And I'm not the first person to think about this, in some ways. I don't know if anybody knows this fellow over here. This is Neil Postman, who is a writer, theorist, various things. He wrote some kind of controversial stuff, but we're not gonna talk about that. We're gonna talk about the other things. He has this really wonderful piece. Whenever I teach, I tend to assign my students to read this piece. And it's called, dang, I can't remember. It's like five thoughts on technological change, something like that. And they're all really good, but the third one is particularly relevant. In it, he says that embedded in every technology, there is a powerful idea. And he says, in fact, there's usually more than one powerful idea. And these ideas are so powerful that they are actually fundamental to the technology, and they're so fundamental that they become taken for granted. So, for instance, if you live in an oral culture, the kind of idea that's wrapped into that is this prioritization of memory. The ability to remember lots and lots of things. And that's why, in historical times, we have all these stories about these bards who could sing for hours and hours and remember so much.
Then in cultures that are built upon writing, there's another idea that is packaged into it, which is all of a sudden a prioritization of structure and logical organization. In a world like what we have now, where we are rarely ever offline and where computers have spread from our desks, to our pockets, to our clothing, and our household objects, the question is, what is packaged in that? And the answer is data. Data becomes valued above all. In a world of computers, everything takes the form of data, and everything has to take the form of data. And it becomes very important for everything, or at least the things we care about, to take the form of data. But this data doesn't just appear. It's not like it just comes out of the sky, the cloud. I mean, it does. But you know what I mean, it doesn't just appear out of the sky. And that's something that I think is really important for all of us who are programmers or data scientists or engineers, or really any of us in this room, any of us who work with data. I think it becomes really easy for us to just not consider the collection process behind it, because we're so interested in what we can do with it and what it will do for us. But if we go back and we think about that act of collection, there are so many fundamental and significant ideas that are wrapped up into it. So much has to be taken for granted. And so a lot of the work that I have done has tried to poke at this process, trying to look at these implications in different ways. Sometimes I do that in kind of small ways. So this is a piece I did actually while I was at RC, the Recurse Center. Yeah, I never presented it. I don't think y'all who were there with me actually ever saw me talk about this, but surprise. So this is a piece that I did while I was there. It's called Pulse. And what I did was I got a nice heart pulse sensor and then a receiver microcontroller.
And I hooked it up to myself and then set up a system so that you could see my pulse as I was talking to you. And so what I did was I would project it on the wall behind me. But that was just making it. The actual performance was that I projected my pulse behind me in a room full of 300 people. So as people were interacting with me, they were getting this weird other information about me that they couldn't look away from. And I know for a fact they would not look away from it. And so of course the thing that I was trying to poke into was, what are you actually understanding and what does it mean? Like, you have this information, but what does it say, what does it tell you? And what does it mean to be pushed into this strange intimate relationship with me? And what I was trying to really comment on was this fetishization of the data itself without the context behind it, or without knowing really what it represented. And in a way, the thing that weirdly elevated this project is that about five minutes after I put on the sensor, I got this text message from somebody who I had been dating, breaking up with me. I know, I know. We can talk about the text message aspect of it, but we won't. But we could. Anyway, so I got this text message and all of a sudden the stakes were just raised. Like, absolutely. I was like, oh my God, heart on your sleeve, this is heart on the wall. Like, everybody can see this. And all of a sudden I was just in this different state, and I remember telling my friend afterwards what had happened, and she was like, wow, that's crazy, you hid that so well. And it was so interesting, because in fact I had not hid it at all. I had projected my pulse on the wall. And then of course it really reinforced the same point: it didn't matter. If nobody knew, then it didn't make sense to them.
The data was just data, and they would read into it whatever they wanted. The sensor was very sensitive, so my pulse would always rise as soon as I thought about that text message over the course of the night. But who knows, maybe the people who I was interacting with thought it was them, you know? I'm okay with that. But that was one of the ideas I was trying to get into: this, like I said, fetishization of the data as the object without worrying about anything around it. But a lot of the larger projects I've done have been more about peeling back the very process of data collection and trying to get at what is in that and what that represents. And so this is a project I did a year or two before that, called Pathways. And Pathways was when I did this Fulbright-National Geographic Digital Storytelling Fellowship. I like to show this photo because it's proof that there was one time in my life where I dressed as a business professional. It's for my parents. Mom. And so for this project I went to the UK and I basically took on the role of a data collection company. So I worked with four different groups of Londoners. I worked with a group of coworkers, a family, two people in a long distance relationship, and a group of roommates. And what I did was I collected their data, and then I visualized it, and I made this website for National Geographic about it. That was sort of a data storytelling website, but really, like I said, a project about collection wrapped into that. And so to do that I took a lot of their geolocation data, and I made these maps. This is just a static image, but on the actual website these are animated, so they play out over time. So you can see where everybody is moving at different points, and each person got a different color.
And so I did that, but then for some of them I also collected their metadata. So this is for the long distance relationship. I collected it, then analyzed it, and then visualized it. And so you can see this is how often they're messaging each other. You can see that one of them is messaging more than the other. That's the US person who's more talkative. And then we have here the platforms that they were using to message each other. And you can see there's one week where there are no messages. I'm guessing y'all know why. Yes, exactly. This is how you know. You can always tell the audience by how people guess. Some groups are like, they broke up. No, of course, they were actually together at that time. So this is another of the maps I made. This was for a family. All the groups that I worked with had something that was going on with them. So with the family, I actually collected their data in the two weeks before their first child was born and then the first two weeks of the child's life. So it was sort of this data around the birth of the child. And you can see that in a lot of ways. The mother is the blue line, and you virtually don't see her move. If you watch the actual animation, she barely moves. The first time she moves is when she goes to the hospital. And of course there are things you can tell. You can always guess where the baby is, even though, fortunately, I wasn't tracking the baby. Like, the baby didn't have a cell phone yet. So I wasn't collecting the baby's data. There were so many things I was trying to get at with this project that I thought were really interesting. One of them, of course, was pushing back on this individual response to talking about data collection. You know, there's this curious fact, and actually all of you know this, that as individuals we don't really matter, but as aggregates we're very important.
And so it felt really, really important to do this with groups of people and to be working with them as a group, so that it wasn't just about one person. But really the big thing that I was trying to get across was this idea that data collection is a relationship, and it's often an invisible relationship. What I mean by that is that, again, it's not just that the data appears. There's always an entity that decides that it's going to collect something, and then there's another entity that is the collected, that is the object of collection. And so there is this relationship inherent between the two, but because of the way that we do things, that relationship is kind of abstracted. And so when you're on your phone and you're sending data up, or allowing something to be collected, or scrolling through Instagram and doing all your likes on Facebook, whatever, that's invisible. You're not thinking about that relationship, but it is happening in the background. And so what I wanted to do for the project was put that in the foreground and make it so that people couldn't avoid that. They had to think about it, and they had to look at me, and they would physically give me their data. We would meet in person so that they could see what I was taking and see how I was only collecting metadata and so on. And so I wanted to make that so clear that they couldn't not think about it. You can see this is another one of the maps. This was the roommate map. I used different ways of collecting the data, which is why you can see the maps are kind of different. This one was using Moves. I think it was before Facebook owned it. And then this was using OpenPaths, which is a more open source version. It's, like, the constant struggle, you can see. The Moves version is way better but is, like, proprietary.
And then the open source version is the right one, but it just didn't have as high fidelity for the tracking, which was great, because it was good that we could talk about those things. With the roommates, it was their last month of living together that I collected their data for, and so I was there when they said goodbye, and there was all this wrapped up in it. But what I started to realize is that once you make that relationship of data collection apparent, you all of a sudden have to deal with so many more things that you didn't before. These are things like trust and ideas of possession: who owns it? Whose hands is that information in? Who does it belong to? And now, in this moment, we are thinking about those things, because we're in this post-Cambridge Analytica world. But in 2014 we certainly were not, at least not in a mainstream kind of way. And the other thing that this process made me think of constantly was, of course, what was I unable to collect? And this question of what I was unable to collect, I really couldn't get off my mind. And so after I came back from London I went to, I think, Data & Society, and started working on a project that I'm still working on, and have just sort of been working on for a long time over the years, and it's called Missing Data Sets. And this project, you can see it, it's on GitHub. As part of my practice, I really like using GitHub for things it's not meant for, so I put a lot of writing on GitHub, and I put a lot of research and a lot of thoughts, because I love this idea that people can push and pull at my thoughts and correct me when I put something that's wrong, and it can be this collaborative thing and also suggest a sort of in-process nature. So the first thing I did with the Missing Data Sets project is I put it up on GitHub and I wrote about it.
And for me, the way that I define missing data sets is as blank spots in otherwise data-saturated spaces. So you'll have these places where there's loads of data around, and then you'll have this curious blank spot where nothing exists. And so what I started doing was collecting some of these things. This is an old list. Some of these are not actually even accurate anymore, but these are things like the number of Latinx people in Florida's prisons. That data is not collected, so we don't actually know that. Or public access to the financial data that Wall Street's models run off of, or the number of Tor relays that are run by the NSA. But again, I said I'm not interested in fetishizing the data. I'm far more interested in what is behind it and what it represents. And so what this project really has been about, in some ways, is the patterns of absence in data sets. And I started to realize that there are always the same reasons why data will be missing. So the first reason is this, which is that those who have the resources to collect data will often lack the incentives. So for instance, we might wanna know how many Stingrays, which are the devices police departments can use to replicate cell phone towers so they can get more data, local and state police departments are using, but we don't have access to that information. And so those of us who might have the incentive to know don't have the resources to be able to collect it. And when it comes to missing data, this is, I think, one of the biggest reasons by far: this mismatch in who has the resources and who has the incentives. The very first missing data set that I started working on, that really got me down this path, was around civilians killed by the police, which no longer is a missing data set.
It's kind of switched, but when you look at how much effort it took to collect that data, and how many different groups have had to do so much on-the-ground work, compared to what it would look like if that were collected by law enforcement officers, you can see there's this mismatch between resources and incentives. Here's the second reason why data will be missing, and that's that the act of collection will sometimes take more than it actually gives, and so there's a burden to the act of collection. This is what I mean about centering the process back into this. You see this a lot with, for example, sexual harassment data. Things are missing because the act of coming forward is just so much work. It takes so much that it doesn't seem like it's always worth it. The third reason is that there are just data that resist metrification, and these are the examples that I was showing you earlier, these examples of places on the map. There are spaces that actually just resist this need to be quantified, and my favorite other example of this is cash. I find cash to be so, so interesting, because in the same way that credit cards by their very nature generate data, cash just swallows it up, eats it up, and that is terrible for a world where the ability to track things, and the ability to create data and make sense of data, is actually how people make money. Cash is just such a problem in a world like this, and so that's what you start to see. A lot of places are going cashless. We don't know how much American currency is outside of our borders. There's just no way to track it.
It's this ongoing issue that federal economists talk about all the time, where they're like, there's no way to know this. And then the fourth reason is that sometimes there are benefits to non-existence. This one is kind of tough, because in a way, with every single missing data set, it always benefits someone to have it not exist. But what I'm speaking about specifically in this reason is that often there are moments where it benefits the people who are situationally disadvantaged. So typically when I say it benefits someone to not have data exist, it benefits the person who has power in the situation. But you have these moments where that flips, and so people who don't have power in the situation will actively use these obfuscation techniques, or do things because they can have a benefit by not having something exist. This I've seen a lot in regards to undocumented people in cities, and I've done work with people where actually the whole goal is, they're like, we don't wanna be collected, we do not wanna appear in these databases, out of reasons related to safety. And this is also, things you've written kind of get at this exact idea. So I told you I work in a lot of different media and in a lot of different forms, and so for this piece one of the things that I did was actually working with people who had issues around missing data. So I don't know if we have any Broadway fans in here. You don't have to raise your hand, don't worry, I won't call you out, but I will see you. Game recognizes game. So Broadway fans, I don't know if you know this, but there's no data on the ethnicity of performers on Broadway stages, and the reason why there's no data is that they say that it's difficult to collect and it's not very useful.
However, there is a lot of data collected on the ethnicity of audience members for Broadway, so it's kind of interesting. And this group of Asian American performers started to realize that there was no data on this, and they were feeling like they were not represented as well on Broadway compared to some of their counterparts from different ethnic groups. And so they embarked on this mission to try to collect this data, and they spent five years doing it. I joined them sort of towards the end of that process and helped them really think about the collection process and analyzing it, and also tried to help them get this out and talk about what was really happening in there. And so what I ended up doing was writing a piece on it, and for the piece I did a lot of work and created a bunch of visualizations, but really there's only one that matters. There's one that shows everything that we found, which is this one. So Asian Americans are the light blue color, and you can see they're only really represented in one show, The King and I, which takes place in what is modern-day Thailand and has white main characters. This is a bit of a tangent, because I spent a lot of time looking at Broadway data. Something I realized that is really interesting, talking about inclusion and diversity, is that the way that unfolds on Broadway is through shows that are composed of one primary racial group. So you'll have an Asian show like The King and I, or a black show like Holler If Ya Hear Me, which is a Tupac musical, and The Lion King is another example of that. And so it's actually very, very rare to see shows that mix various different racial or ethnic groups in the same show, with the exception of Hamilton. So it's kind of interesting, but we did this. This shows exactly what they were trying to represent, and in a way it was sort of this neat project, in the sense that this data was missing.
They felt like there was a problem. They knew the data was missing. They worked together to collect it. We did all that, and then we showed that what they felt was actually right, and then we were able to have all these meetings with various people in the theater industry and talk about this, and talking to some of them now, they say things have improved. But sometimes things aren't that easy, and I have other applications of this project that are meant to try to highlight some of the more tense moments of this. So also, this is a piece that I've made. I'm an artist, so I sometimes create pieces, and this is a piece that's been exhibited in various places. It's called the Library of Missing Data Sets, and it is a physical filing cabinet, and in it are all these folders, and they each have the title of a missing data set. And I change it depending on where it's being exhibited, so that the data sets are different and feel like they're related to the area. I really like this piece, because whenever it's exhibited you can see people kind of go through it, and they look in, and they try to wonder what's inside, and then they realize that the folders are empty, because the data is missing, and that's the point. And you get to kind of watch that happen with them, and it translates this topic so that, for people who maybe aren't sitting and thinking and working with this every day like a lot of us might be, it feels kind of intuitive in a different way. And what I like also is that it creates this way to visualize the things that you really can't see. And I told you, I'm not interested in just fetishizing the data. I'm interested in what that means, and what it means to constantly have all this missing data, and the idea that we will always have missing data, and what that tells us about the ideas that we're drawing from the systems that we use.
And so you could say that in that piece, as in many of the other ones, I am trying to make tangible the implications of a machine readable world. But now I'm thinking about other things. Whenever I would speak about missing data sets, I would always use this sign. I like these brackets so much that I've actually created a piece that's based on them. I'm working on it right now. It's this neon piece. They form these big brackets, and it's sort of this interactive piece that hangs on a wall. You approach it like you approach any other piece, but when you do, depending on who's approaching, it categorizes you. It does some work, and it decides who it will categorize. What I like about that is that, in a way, it represents the idea that's packaged into our world, which is about this act of categorization and classification and the need to group things. These brackets speak of that need, the need to make things into data, that process almost before the collection process, this idea of creating the categories that you have to put the world in, and that that's how you make sense of the world. And that's fine, but in social spaces it has really interesting ramifications. So this is a fantastic book that really gets at this idea in a lot of ways, and it talks about these difficulties that happen when our classification systems have to interact with social spaces. One of the examples it uses is South Africa's apartheid system, in which classifying people's race was extremely important. People needed to be identified so they could decide what parts of the town they would live in. And it talks about all of the challenges that resulted from that, because there are things that are difficult to classify, things like race. And so you have these stories of a family where one mixed person marries somebody who is black, and then they have a child, and then what is that child? And at the point when that child
becomes 18, they have to move to a place that is where they're supposed to live. And then you see all of these things about us pushing back against these categories. And so a lot of the work that I'm doing now, and I'm totally in progress with, has to do with this idea of classification and categorization and what the results of that are. The way that I've been thinking about this is a new piece, which of course is on GitHub, because I told you I like to put things there, talking about this idea of algorithmic violence. This is built on the existing idea of structural violence, which talks about how social structures and institutions can harm people by preventing them from meeting their fundamental needs. And so what I'm thinking about is how algorithmic violence is layered on top of that, how we have these algorithmic systems that can do the same thing. And I think it's so crucial to say that these systems are layered on top of these existing social inequalities, so that we don't, again, fetishize the technology and the data and the algorithms without considering the world that they're operating within. And so in this I talk about all these various examples of algorithmic violence. You can read it. I talk about things that range from as high stakes as predictive policing, so sending out police based on areas where you predict that crime is going to happen, which actually is already being done. And of course the question is, what do you do when the data for that is incorrect, which it inevitably is? So things around that. But I also talk about things like when a company changes their algorithm for sorting your information, and what that results in, this reminder of the lack of control you have in these systems. So I'm working on a new suite of pieces around this, and I really wanted to have one done so I could show you more about that, but unfortunately life has made things difficult, so that's not the case. But if you are interested,
feel free, you can follow on GitHub, where I'm putting all the updates for that. And in the end I come back to these pieces, the ones that I showed you in the beginning, which is really just this article. The reason I like to start with that, the reason I like to show them, is because they get at these subtle embedded ideas about these abstract systems, but in a way that feels, like I said, intuitive, and gets at what it feels like. And that is the thing that I'm really interested in doing, teasing that apart: the feeling of these systems and how they affect us and how they impact us. So if you're interested in that, please feel free to get in touch, and thank you for your time.