So, welcome to this event, which is part of the BetaNYC Open Data Week. We're really excited to have everyone with us on a Friday, which is actually a beautiful day here in New York. So, I'd like to welcome you to this fabulous discussion and workshop. We're actually going to be doing some small breakout rooms. We are going to be talking about the ethics of data archiving with a focus on hyper-local data. We're going to start off with a brief presentation from two of our speakers, and then we'll break into our smaller group discussions. We're going to hear about some examples, such as the New York City Neighborhood Resources Guide and BetaNYC data for shoppers with health vulnerabilities. And we're going to, as a group, create new datasets that we'd like to see and think about how information patterns of our daily lives should be preserved. So, we really encourage everyone who's participating today to think of a specific story or example of hyper-local digital information sharing, because we want this to be an open and collaborative event. So, with that, I'd like to introduce our speakers. Our main speaker is Dr. Anne Washington. You can follow her on Twitter at @dataPolicyProf. She is the Assistant Professor of Data Policy in Applied Statistics, Social Science, and Humanities at New York University's Steinhardt School. Then we have Matt Bui, I hope I pronounced that correctly. Dr. Bui is a Provost's Postdoctoral Fellow at New York University as well, and he's also a Faculty Fellow for the NYU Alliance for Public Interest Technology. And finally, Jonathan Chin, who is the Interim Chief Technology Officer at the New York City Food Policy Center. And I am Karen Bannon, and I am with the Public Interest Technology Program here at New America. And just an aside, Dr.
Washington is actually part of the Public Interest Technology University Network, which is part of New America, and it is our program to help raise up the next generation of public interest technologists. And Dr. Washington is a grantee for the program as well. And so with that, I'd like to turn the presentation over to Dr. Washington so you can start.

Thank you for that generous introduction from both Angela and Karen. I'm excited about this event today, and I'm so glad all of you are able to be here. I'm trained as a computer scientist, but was honored to be a part of the library and archive profession for many years. Historians and archivists need to know the daily details of life in order to write about the past. And I became aware that much of what was happening in 2021 was all through technology. And I began to wonder, how would we keep track of the COVID-19 pandemic in the future? And that was the inspiration for this event. So what do I mean by hyper-local? We talk about local government being the mayor, the city council, maybe your borough president or the community board. Still, you might not have ever met those people. Hyper-local are the faces you know. They're the people you say hi to. They're the people that you collaborate with on a regular basis to work together, sometimes almost hand to hand. And that's what we're going to think about together today. David, can you share the slides at this point? So as we're moving forward into what we're saving and what we're not saving, I wanted us to really focus on the hyper-local and think about how what we're saving and keeping is an ethical issue, so that we know everyone's experiences in the future. Next slide, please. And this gets me into talking about a really, really interesting art piece that was actually exhibited in New York City right around Christmas, I think it was 2017.
And Mimi Onuoha, who is a professor at NYU, exhibited this piece. It was called the Library of Missing Datasets, where she started to articulate what it is that we don't know. Very often when we talk about big data, particularly open data, we can go to town about everything we've saved and all the information that we have. But we don't have a lot of other things. You can see some of the things listed here: locations of illegal apartments in Washington, DC, all extinct languages, number of nuclear weapons. So we start to think through what's missing. And what I'm hoping that we can do today is find out some of the hyper-local datasets that might be missing that are out there. David, is there a next slide? Yeah. So I just wanted to give you a view of what this actually looked like. She created an actual file case with empty file folders listing what's not there. And what I'm hoping we can do today, together, through the workshop aspect, is start to fill in those pieces of what was there. In the introduction for this event, I had asked people to think through what they were doing. So hyper-local means what was going on in your apartment complex, in your neighborhood, centered around a local store. Maybe your building, maybe even just a floor in your apartment complex. What were the spreadsheets? What were the shared documents that you created? And 2020 had lots of major events where we came together to find out new types of data. One was the civil and social unrest and the protests after the death of George Floyd. And we also had things that were happening during the pandemic. So in both cases, digital material arose in order to understand how we were coming together. So David, you can stop sharing slides now. Thank you. At this point, I'm going to introduce one of the postdocs for the Alliance for Public Interest Technology at NYU, Dr.
Matt Bui, who is going to give us a little bit of a start on understanding this field of data justice and data empowerment and how he saw this play out in LA. Matt, do you want to share your slides and have the floor?

Thank you, Dr. Washington, for the generous introduction as well. Hi, everyone, I'm Dr. Matt Bui. I'm a current postdoc at the NYU Alliance for Public Interest Tech. And as we dive in and continue this discussion about open data and Open Data Week, how do we as open data professionals think about data justice? I'm going to provide a few short provocations to think through. As a brief primer, I came to this work as a nonprofit marketing strategist, and I went back to grad school because a lot of the struggle I was engaging with was that there was missing data, or the data weren't always speaking to the causes of the constituents. So underlying a lot of this data justice work, and the study that I'm previewing and going over in brief, are these questions: how does data serve marginalized communities? Who does data serve? How do we tell stories about data as well as with data, and think about those complicated ethical problems and ethical issues, especially pushing the needle towards justice and a redistribution of resources? So to begin, I'm thinking about redlining. In the course of my work, I've been thinking about redlining as a data-driven technology. So think about the FHA and the Home Owners' Loan Corporation, and how different data schemas rendered communities of color, Black, brown, as well as immigrant communities, as red or yellow in this map of Los Angeles. Whereas those on the periphery, the more affluent middle- and upper-class white communities, were rendered as blue or green, more worthy of investment and more worthy of mortgage approvals, based on different categorizations and different calculations. So underlying a lot of this work that I'm doing is thinking about the racial politics of data.
How is data used to render specific communities as invisible within resource allocation decisions, whereas they're hyper-visible within criminal justice decisions or criminal justice databases? So, thinking through that, how do we really think about how we tell stories about data? I actually go up against the "data is neutral" myth often. But also, how do we use data in novel strategies and ways to advocate for justice and advocate for such communities? The study through which I'm going to delve into data justice with you all is really about Los Angeles, but I think we can apply it to New York. There's a lot of great data justice work in New York, and nationally and globally. I'm thinking through the different ways that community-based organizations are engaging with data, but also disengaging from data. And how does this local, this hyper-local view inform our work as we think about how to partner with, work alongside, and learn from these types of organizations? As some background, the study was three years of fieldwork in Los Angeles, working and observing and just being in touch with a lot of these grassroots organizations. But come COVID time, instead of being able to meet with them, a lot of my research went online. So the study I'm going to be presenting is a content analysis of 70 posts from 11 different organizations in Los Angeles. I can give you more details about that later if you're curious, but I'm just going to briefly go through some of the findings. To begin, the findings touched on four categories of different engagements and disengagements with data. This first category, data use, is about how different organizations, to fill in missing gaps, did primary data collection. So they had to do interviews, oral histories, and surveys to collect missing data and address data gaps. Whereas the second bucket of strategies was data reuse.
So looking at the census data, looking at other available data sources to repurpose. In this strategy I'm also thinking about work in New York and LA such as the Anti-Eviction Mapping Project, where they cull LLC data to think about unlawful evictions and how the corporate organization of real estate is impacting marginalized tenants. The third bucket is data refusal, thinking more about an active orientation to the collection or application of data. So, whereas data use and reuse is really using and reusing, and like, oh, data's okay, data refusal actually calls attention to questions of what data is not useful for and what data should not be used for. So, in this work I'm thinking about Mijente's No Tech for ICE campaign, and about other ways to build public awareness about the politics and weaponization of data against communities of color. And then finally, this fourth bucket is just data production. I think for open data professionals and academics alike, there's always this production, right? You share the reports, you share the datasets. There's this ethos of wanting to address data gaps by sharing. So how are there novel production strategies, such as zines or infographics, to build more public awareness about issues and to engage with a broader public and not just those that are necessarily data literate? And so the next few slides are just a few quotes to extrapolate and deepen these buckets. This first quote is coming from Data for Black Lives. Yeshi Milner, the co-founder of Data for Black Lives, shares that for those of us living on the margins, data is protest, collecting and analyzing data is accountability, and data is collective action. So she's really calling attention to the ways that we can use data and reuse data to address knowledge gaps in that specific moment.
Yeshi Milner was calling attention to how COVID-19 data was lacking racial data. So therefore, there were knowledge gaps, which a year later we're still talking about, around how COVID-19 can directly impact communities, especially those that were most harmed and most at risk. Meanwhile, in this quote from another piece by Yeshi Milner, also from Data for Black Lives, she's talking about data refusal for public education and awareness. In that red box, you'll see the ways in which they're using this graphic to think about how data has inputs, but these data outputs are what specifically has been weaponized against the Black community. And so as we think about data refusal, we're thinking about risk ratios and credit scores, how those data inputs are all tied to a zip code, how that data input is reflective of long-standing structural inequalities, and how we need to question and be transparent about these data-driven decisions that are increasingly getting pushed. And then finally, I include this quote to think about data refusal and production from this group called Our Data Bodies, who are one of the leaders in data justice work. In their Digital Defense Playbook, they share that not only are they producing data to call attention to issues in their impacted communities, but they also want to shine a light on how communities have been confronting data-driven problems and forge an analysis from and within these allied struggles. I'm more than happy to go deeper, but I think the takeaway from this is pushing us all to think, as open data professionals: what tactics and tools do we have to usher in social justice through the use, reuse, and refusal of data? What tools do we need? And I look forward to continuing the conversation with you all. Thank you.

That was amazing, Matt. Thank you.
That was a very succinct presentation of a lot of really interesting and deep knowledge. I think we're going to move forward to Jonathan Chin, who is a former student of mine from my ethics of data science class. He is also, I believe, a graduate of NYU; the class was part of the NYU data science for social impact degree. And Jonathan currently serves as the Interim Chief Technology Officer at the New York City Food Policy Center. So John, you can correct me if I got your degree wrong, but at least I had to give a shout-out for the program I teach in. Share your slides, and John has the floor.

Okay. I am actually in a different program, but it intersects with a lot of other stuff that's happening at NYU. I'm really happy to have been there. So my slides are visible? Yes, no? Yeah, perfect. Okay, awesome. Thank you all for joining us today. I want to talk about this sort of mammoth undertaking that myself and the New York City Food Policy Center have taken on since the start of the pandemic. We call it the New York City Neighborhood Food Resource Guides. In a nutshell, it is a comprehensive list of food resources that is updated nightly. We designed it to make sure that New York City remained food secure during the pandemic. And the major way that we do that is we have the most comprehensive database of food pantries and soup kitchens. In addition to the kind of info that you would normally expect, like addresses and hours of operation, we went out of our way to gather crucial information like dietary accommodations, any access requirements like proof of residency, and any unique operational notes. We have a fairly lengthy list of supermarkets and grocery stores, with particular attention to their delivery options and the availability of fresh fruits and vegetables. Early on in the pandemic, when everybody was at home and sort of rushing to their local grocery stores, this kind of information was incredibly important.
We also have additional relevant resources. Our main target was food pantries, soup kitchens, and grocery stores, but we also try to provide access to things like homelessness services and WIC and SNAP sign-up assistance, because we recognize that these were also valuable tools for the people that we were serving. To step back a little bit and remember the landscape of New York City and how that drove the need for this project, I just want to go over some statistics about what has happened. An already critical food system was suddenly ruptured with the closure of everything. To give you some sense, since March 2020, the number of residents receiving food assistance has doubled to 2 million New Yorkers. And this is only the official food assistance programs that we know about. There's a lot of informal, unofficial food assistance that is going on. Coupled with that, one third of the food pantries and soup kitchens closed due to lack of resources, lack of volunteers, and a whole number of other issues. Unemployment skyrocketed from 3.4% to 16%. So suddenly a lot of New Yorkers who had never had to ask for food assistance were now trying to navigate that system. Additionally, mandated social distancing reduced food pantry volunteers and upturned our standard service protocols. Prior to the pandemic, there was a huge move to allow people to enter a food pantry, select the items that they wanted, and have more of a shopping experience. But with the public health protocols, we couldn't do that. And a lot of food pantries had to revert to sort of pre-packaged boxes. So all of their logistics just, like, turned on a dime. And I don't know if any of you remember or have experienced the grab-and-go meal program that was rolled out to initiate support. That initial rollout was incredibly chaotic.
Sites that were listed as giving out meals ended up not being able to give out meals, the rules and regulations about who could receive them were unclear, things like that. One of the major challenges that had arisen in the first couple of months was that the data about food assistance and these resources was incredibly fragmented. And when we stepped in to start analyzing this, we also found that a lot of it was out of date, which was a huge concern, because we searched for all this information the same way that your average New Yorker would. You know, we went on Google and we searched for food pantries in Crown Heights, and the information that we got was woefully inadequate. And so we wanted to improve that experience. And then, to also do a callback to what Dr. Bui had been talking about, there are probably some connections between the data quality of these food resources and the geographic location. There's probably some funky, weird stuff happening. So this being the state of New York City about a year ago, this is what had given rise to our project. And so what do we do? How do we curate our data? The first process was to do an initial one-shot scraping of the disparate datasets. And we had to do extensive data cleaning on this. Not many other organizations would have had the time or the manpower or the insight to be able to do that. After that, we have ongoing volunteer calls. We have a fleet of volunteers, ranging anywhere from 20 to 80 of them, from across the country. They manually go in and check all of our data points. We have phone numbers for every food pantry, every grocery store, and somebody calls up and says, you know, hey, are you still operating? Are you still open on Mondays? You know, is the data that we have on you still correct? And our target is to update each record at least once a week, so that when New Yorkers come to this database, they know that what they're getting is up to date.
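As a minimal sketch of the weekly re-verification target described above, the following Python lists which records a volunteer would need to call next. The field names (`name`, `phone`, `last_verified`), the sample records, and the seven-day threshold in `MAX_AGE` are assumptions made for illustration; the talk does not describe the Food Policy Center's actual schema or tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "at least once a week" is modeled as a 7-day maximum age.
MAX_AGE = timedelta(days=7)


@dataclass
class ResourceRecord:
    """Hypothetical food resource record; fields are illustrative only."""
    name: str
    phone: str
    last_verified: date


def records_needing_call(records, today):
    """Return records last verified more than MAX_AGE ago, stalest first."""
    stale = [r for r in records if today - r.last_verified > MAX_AGE]
    return sorted(stale, key=lambda r: r.last_verified)


if __name__ == "__main__":
    today = date(2021, 3, 12)
    sample = [
        ResourceRecord("Pantry A", "555-0100", date(2021, 3, 10)),
        ResourceRecord("Pantry B", "555-0101", date(2021, 3, 1)),
        ResourceRecord("Grocery C", "555-0102", date(2021, 2, 20)),
    ]
    for record in records_needing_call(sample, today):
        print(f"Call {record.name} at {record.phone} "
              f"(last verified {record.last_verified})")
```

Sorting the stalest records first is one possible design choice: it would let a volunteer pool of varying size work from the top of the list on any given day.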
And of course that manual verification of data is very time consuming. And because we're working with volunteers who don't necessarily have technical proficiencies, we had to create some automated solutions for managing this really wide dataset. So we have automated aggregation and conflict resolution of data from trusted food pantry data partners. We've partnered with a couple of the major players out there. We receive about 900 food pantry records from Feed NYC, about 600 from the Department of Sanitation, 300 from City Harvest, and 150 from another digital platform called Plentiful. And even among that, we have 182 records in our database that do not exist anywhere else except with us. So these numbers give you an idea of the scope of what we have, but not necessarily the complexity or the messiness of it. A lot of times we'll have Feed NYC say one thing about a food pantry, you know, it's open, and it's open on Mondays. But we might get conflicting data from City Harvest that says, you know, no, that food pantry closed two months ago. So it's really complex and complicated for us to make sense of it. And this is effectively our job. So you can imagine an average New Yorker having to navigate this sea on their own, and how frustrating and how demoralizing that experience is. And so hopefully our project is a way to alleviate a lot of that frustration. And when we were designing it, we had this principle in mind from the very start: we wanted to design our architecture to protect the most vulnerable New Yorkers. So we made a conscious decision to separate our data layer, our master repository, from our presentation layer and how we get that information out to people, in such a way that we can take the same data and put it onto a website, or send it to people via SMS or via a smartphone app, or we can print it out on flyers and hand those out, or send people an email.
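The separation of the data layer from the presentation layer described above can be sketched roughly as follows: one record type, and several independent renderers that each produce output for a different channel. The `Pantry` fields, the renderer names, and the 160-character SMS cap are assumptions for illustration, not the project's actual design.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Pantry:
    """Hypothetical record in the data layer; fields are illustrative."""
    name: str
    address: str
    hours: str
    requirements: str


def render_sms(p: Pantry) -> str:
    """Short plain text for a text message (capped at 160 characters)."""
    return f"{p.name}, {p.address}. Open {p.hours}."[:160]


def render_html(p: Pantry) -> str:
    """HTML list item for a website; values are escaped before insertion."""
    return (
        f"<li><strong>{escape(p.name)}</strong><br>"
        f"{escape(p.address)}<br>{escape(p.hours)}<br>"
        f"{escape(p.requirements)}</li>"
    )


def render_flyer(p: Pantry) -> str:
    """Multi-line plain text block suitable for a printed flyer."""
    return "\n".join([
        p.name.upper(),
        p.address,
        f"Hours: {p.hours}",
        f"Bring: {p.requirements}",
    ])


if __name__ == "__main__":
    pantry = Pantry(
        name="Example Pantry & Kitchen",
        address="123 Example St, Brooklyn",
        hours="Mon 10am-2pm",
        requirements="proof of residency",
    )
    # The same record feeds every channel; only the renderer changes.
    for render in (render_sms, render_html, render_flyer):
        print(render(pantry))
        print("---")
```

Because the renderers only read from the record, a new channel (for example, email) could be added in this sketch without touching the stored data.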
In New York City, one of the major concerns, and I think the pandemic has really highlighted this, is that there's a large technical divide, a large digital divide. Not everybody has high-speed internet, not everybody has a laptop, and a lot of New Yorkers access the internet primarily on their phone. And when thinking about the most vulnerable, we even partnered with the mayor's office to do a one-off printing of all of our food pantries on paper flyers, on, like, physical, actual media, so that we can hand them out in homeless shelters. Those people don't have regular access. They don't have the technology to do it. But we made sure that they fit into our plan, and that they were there from the get-go. In addition to that.

And John, we're just coming up on time, so I'm wondering if you can find some of the best tidbits so we can get to hear what everyone else is working on. Thank you.

No problem. So that's where we started and where we're at, and just an idea of where we're going in the future. We want to generalize this so that any other organization can take it up and run with it. We deployed a version of this statewide in Alabama in August 2020. And because we're sitting on a wealth of data, we're also looking at it for potential analysis to push for public policy changes to improve New Yorkers' food security and diet-related health. That's sort of on our horizon. I think that's it about myself. I think what's more important is a link to the guides themselves. I'll paste this into chat as well, so that you can see it, and if you know people who need to access it, you can share it with them. And then also my email address, if you ever want to talk or come into contact. This is how you can do it. And those are the juiciest bits. Thank you.

Thank you, John, for that. That's brilliant. And whenever you get a chance, you can unshare your slides. Well, these were just two stories to get us started so we can start to think together.