 This is super exciting and scary, and this has been such a fun conference so far. I want to thank John and the other organizers for bringing me. This is a really, really cool opportunity, and I'm really feeling grateful for it. So this conference is about data and community, and I was asked here to come speak because of the work I've been doing since late November, helping people come together to build a community around saving data. And as you've heard, I was one of the people who worked to build Data Refuge and who collaborated in supporting the many data rescue events that have happened around the country. That work has been profoundly and overwhelmingly powerful for me personally, and it has helped me think differently about data and communities. Don't worry, I'll get less nervous and more cheerful as we go forward. I've learned this about myself. That said, I still find this whole project really difficult to talk about. I don't feel like I've fully processed what all of this means or what I think should happen differently. I'm definitely not going to offer an answer to how we should save federal climate and environmental data. I don't know yet. I will say what I do know is that it is very hard, and that those people who are offering answers that seem pretty easy are probably not looking at the whole picture. And so this is work that we will have to undertake with many, many communities over time. So rather than talk about data and communities, what I'm going to spend some time talking about is contexts and institutions. The more I have thought and learned about data and communities, the more I want to talk about context together with data and the notion of communities within and across institutions. 
I'm going to talk about what Data Refuge has taught me about data (wait, that's metadata), stories, meaning and context, and about community: the beautiful, powerful potential of our civic and educational institutions, and the frustrating and powerful conservatism of our bureaucracies. This is not all going to flow perfectly and wrap up nicely. The world is messy, and we all love a clean CSV, myself very much included, but the world outside of the CSV is deeply complicated, and that's the world that I work in.

So first, a little about me. If you follow the news, you'll see that I am a rogue scientist, racing to save climate data from Trump. We've now counted something like 112 articles or more written in the press about this project, and this headline was our favorite. It came out right after our data rescue event at Penn, and we really wanted to change it to "Middle-management institutional librarians and humanists work to ensure long-term preservation of citable and contextual data and metadata from federal sources." You can see why I'm a librarian and not a headline writer. But I do think it's important to point out that I got into this work because I am a librarian, and my institution and my colleagues support this work because they believe that we should be doing this, that this is part of what we signed up for when we chose this as our career.

So I'm going to talk very briefly about what my actual job is, and then I'll go further into the work of Data Refuge and Data Rescue, and I will talk about distributed networks and data packets and that sort of thing. But first, what do I do in my day job? I head up the Digital Scholarship Department in my library, where our group focuses on collaborating with faculty and students on creating new kinds of scholarship. Margaret Janz, who's a key collaborator on Data Refuge, works on data management and curation in her day job.
We have two colleagues who work on open access publishing, and two people who are really talented (they're all really talented, I am very lucky), but these two are talented developers who also have advanced degrees, one PhD and one master's, in humanities subjects, and who work on text mining, corpus building, designing new kinds of scholarly interfaces, data sets, etc. And by the way, I'm currently hiring a mapping and geospatial specialist, if you know anybody.

We don't claim to be the only place on campus or in the world where people can do all these things. We are not the single source of help. The library has never been the single source of information. Instead we're there because there's a need and because people want the kind of help that we're providing, and, importantly, because our institution thinks it's worth hiring people whose job it is to increase the capacity for this kind of work.

There was a version of this talk where I just talked about what people's jobs were in the various institutions where this Data Refuge work happens. While I've been working so hard on building this volunteer community, I wanted to call attention to what people's jobs are. But I'm not gonna do that, because it turns out that would be a very boring talk. I do think it's important, as we think about the work that needs to be done, to think about whose job it is.

I've been a librarian for more than 15 years, and while people generally seem to like librarians, and the general public certainly trusts us on some things, there's often a kind of sadness that I hear from people in the software community for me, like, "Oh my gosh, it must be so hard for you. You're about to lose your job because the internet solved learning." I know. So I'll just say that has not been my experience. And while communities are in some cases defunding libraries, there are still about 120,000 libraries in the United States.
They are busy. People use them and value them. And sometimes they are not well supported, in the same way that we are failing to support so many of our civic institutions, but it isn't for lack of use. So, enough from the pulpit. This is a bad time to be lecturing.

Okay, so the Data Refuge project actually got its start on the Lower Schuylkill River. This is a map of Philadelphia. The big river that you see most obviously is the Delaware River. The river in the middle (I wish I had a pointer; oh well, I'm gonna use my arm) is the Schuylkill. Where it says Philadelphia, that narrow part is Center City, and right above it, where the green begins along the Schuylkill River, that's the art museum. You've probably seen pictures of the Schuylkill River at the art museum. You may, if you can conjure them up in your mind, picture a falls. There's a small falls. Lower than that, below that, all the way through Center City and all the way through the rest of the city, is the tidal Schuylkill. It is understudied. There is insufficient data.

You can see a lot of the banks of the Schuylkill River as you get closer to the airport, closer to where the Schuylkill meets the Delaware. On this map it's as if there's nothing there, as if the city of Philadelphia just decided not to do anything there. That's not true. That is the home of the largest oil refinery on the eastern seaboard. The largest oil refinery on the eastern seaboard is inside the city of Philadelphia, and it takes up both sides of the river. The river is actually public land. It is our public asset, like our public data. And yet there is very little access to the river. That little patch of green over on the west side, where it says whatever number that is, that's Bartram's Garden.
And we've partnered with Bartram's Garden in trying to create an interface to a database, or a platform, a data set, that would allow us to expose what we do know or should know or can know about the neighborhood and the river, including scientific data, including historical information, and including the stories of the people who live there. So there's this neighborhood (I can't use that arm), a neighborhood called Eastwick, that's really close to there. It's not important that you know exactly where it is. This is from the Eastwick Friends and Neighbors Association. This is from recently, I guess a couple of months ago. You probably can't read this, but I will just say that the environmental justice and health issues in this relatively poor community in Philadelphia are pretty outstanding.

And so the Program in Environmental Humanities, led by the director of that program, Bethany Wiggin, who is a collaborator and now a really good friend of mine, had been working on a public and participatory project. She co-directs the river research seminar with Pete DeCarlo, an atmospheric scientist who produces a ton of data (he used to work on federal data professionally for the federal government, and now he's a professor at Drexel), and Danielle Redden, who runs the public boating program. The three of them have a seminar to look together at what we can learn about this Lower Schuylkill River. So that's how I met them. They came to me and asked for, basically, software development: a database, a platform to share all these kinds of data. What they really wanted was to be able to make tours, so that you could physically walk through the landscape and see views of it that include science and that allow you to create new data. And we're gonna do that. We got distracted, but that's still happening.
So right after the election, the graduate students in the Program in Environmental Humanities got concerned about the potential loss of access to federal climate and environmental data. They decided to have an event on January 13th and 14th, they came to the library and asked us to help, and Data Refuge was born. It is a program of the PPEH Lab, the Penn Program in Environmental Humanities (the first P is for Penn), and it's a collaboration with the library to build Data Refuge.

We were not the only ones who were having these thoughts or concerns or who were making plans, and there have been many, many efforts that have started, contributed to, and helped support Data Rescue. I'll mention in particular the event that happened in Toronto in December, which was one of the sparks for the group EDGI to form, and they have collaborated with us in helping out with some events. We have collaborated in supporting events together.

But before all that happened, we landed on this goal: we wanted research-quality copies of federal environmental and climate data. So, first question: is the risk real? The answer is yes, absolutely, for a whole host of reasons, some of which I'll talk about. The answer became clear pretty quickly when we started calling around to every expert we could think of and every community we could think of: yes, the risk is real.

So what does research quality mean? As we had conversations with them, it wasn't enough for someone to say, "Hey, don't worry, the EPA took that data down, here is a copy." They needed to be able to cite that copy. They needed it to be research quality. They needed to be able to prove that this was the real data.
And that turns out to be really difficult to create research quality copies, citable copies that use the kinds of trust that academics have built up over honestly centuries of kind of institutional agreements to replicate that in some technical way is not trivial. We'll keep talking about that as we go on. Same with copies. Again, it turns out the goals begin to change. When this is what you want to do you end up needing to change the way that you approach copying. And then when we think about federal environmental and climate data to be clear we think really capaciously human beings are part of the environment and so data about people is also environmental data. And then we get to this issue of data. So as for data, I'll just point out what you all know, but that has been really driven home to me through this project, that in the minds of many, many people, members of a very broad public, data isn't a terribly clear term. Everything on your screen technically is equally data. In fact, everything on your screen is equally data and it's all structured or else it wouldn't be able to show up on your screen. And so this notion of kind of what we mean by data changes dramatically as we move across communities and it becomes hard to think and talk in a public way about saving data without acknowledging that what people have in their mind when you're talking about data really varies from person to person. So that's the scope of the problem and it was huge. It still is. It's also worth calling attention to the fact that many people have been working on parts of this problem for decades and have made tremendous progress. People have been working for weeks, months, and years and we can build on the work that's been done. But this is still before, I'm still in our story, before the data rescue event in Philly trying to figure out what are we, like we were aiming for before the inauguration, right? January 13th and 14th. 
So what's the, how do we slice and dice the world of federal information? So that's, I mean, we had frustrating meetings like, where's the list? There's no list. There's no list. So you can organize it by data format or size, right? So there are folks who are ready to and actually good at and have some pipelines for copying petabytes of data. So that's a size, right? So maybe they should just copy the petabytes wherever they are because they have a way to copy petabytes of data and like most, we're not going to do that at an event, right? And then there's things like videos, like maybe you just want a pipeline for all the videos everywhere or there's a group working on PDFs, right? There's format and size. There's also the agency, this is a very obvious one and what some people kind of thought of as the default, right? There's like, did you get all the data from the data that you want to do? And then there's the data, right? So there's a lot of algorithms that produce data. That's another way of slicing and dicing it. There's also this covered by federal records laws. So there are a whole bunch of laws about what you can and can't do with kinds of records that are produced by the federal government and so it might make sense to make lists based on the laws that apply to what you can and can't do. So how do you measure valuable and vulnerable first? To find this perfect intersection of the data that is most valuable and vulnerable and then get that really quick before it goes away was the thinking. And so how do you measure value and vulnerability? I can say like there isn't a perfect way but one of the things that the kind of framework that we landed on after conversations with many, with a lot of people who have different kind of information ecosystem were that there were basically these four kinds of vulnerabilities. And this is not a perfect system at this point. But legal vulnerabilities, right? 
What laws are in place to require the collection, safety, and redistribution of these data? What enforcement mechanisms back up these laws? There are a lot of laws that have no enforcement mechanisms, no funding to help make that possible, and no checking to see if it has happened. There's technical vulnerability, which I'm not going to talk about very much in this context, because you can probably call out 10,000 things about technical vulnerability. So that's a thing, right? Then political vulnerability. Do we have good reasons to expect that this data set is particularly targeted by hostile political forces? Do we have good reasons to expect that the agency and/or unit that produces and maintains the data is particularly targeted by hostile political forces, or that its funding is unstable? And the answer to that is yes, we do. There are a lot of things like that. And then, are the primary users associated with this data set the targets of hostile political forces? So these are all questions, issues that might make data more vulnerable. And then the uses and uniqueness of data is like a catch-all for a bunch of other things that make data vulnerable. Things like: it's just super unique. The census has a ton of rules and laws around it, but it's still kind of like, holy shit, it's the census.

So we did a survey. The Union of Concerned Scientists has been a really fantastic partner, and they sent this survey out in late December. Then we realized no one answers surveys on, like, December 28th. Even though we were working like crazy, that wasn't a great time to send a survey, so we sent it out again and got a bunch of responses. So, friends, there are the responses. And that did help a lot. It helped us pick out the things to start with. And then, the weekend before the event in Philly, we spun up a CKAN instance and set up some S3 buckets to store some data.
You know, this is with the understanding that we will need to think about this for the longer term, but we're talking about what we can do right now. And then we had an event. And the way it played out at the event is that the way we sliced and diced the data was really more by how it can be saved. So it was a combination of things, right? It was going by agency, and the folks at EDGI came up with this system for systematically going web page by web page through a particular program or agency. Basically the idea was, if the Internet Archive can get it, that's reasonable. You can cite the Internet Archive. So, good. No one wants to get their data from the Wayback Machine, I don't think. I haven't met anyone who's like, "That's good. My data's..." But, you know, solve problems one at a time, right? So the idea was, if it can go to the Wayback Machine, great.

There are a bunch of things, like query interfaces and research data sets, that the Wayback Machine doesn't really get, that the web crawlers don't get. Those were the focus of a separate kind of workflow. And so the event paths that we had at our event, and that went to lots of data rescue events in other places, had people seeding and sorting for the Wayback Machine and figuring out what things are uncrawlable and how we might harvest them; then a bunch of people scraping sites and trying to get inside query interfaces to get the data out; then checking and bagging and describing, so that we could have a little bit of control, a little bit of security, a little bit of chain of custody, only a tiny bit of each, but enough to help add some security and some trust in the data for scientists and researchers.

And then we also had the storytelling path, right? This is humanists and people with whom we built a community, and the notion that one of the ways of saving data is making sure people understand what it is. So the outreach is actually part of the saving process.
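To make the "checking and bagging" step a little more concrete, here is a minimal sketch of one way a harvested folder could be given a checksum manifest and re-checked later, in the spirit of a BagIt-style manifest. This is an illustration only, not the tooling used at the events: the function names, the file name `readings.csv`, and its contents are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are missing or whose checksum differs."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Bag a tiny hypothetical harvest, then simulate a silent change."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical harvested file, invented for this example.
        (root / "readings.csv").write_text("station,temp_c\nA,11.5\n")
        manifest = build_manifest(root)
        before = changed_files(root, manifest)
        # Overwrite the file to mimic an undetected edit or corruption.
        (root / "readings.csv").write_text("station,temp_c\nA,99.9\n")
        after = changed_files(root, manifest)
        return before, after
```

A manifest like this gives only the "tiny bit" of chain of custody described above: it shows that a copy has not changed since it was bagged, not that it matched the agency's original in the first place.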
That means building an understanding that, for instance, the Office of Sustainability in the City of Philadelphia relies on federal climate data to make recommendations about various road-building plans. And so the idea is to create stories, to really write out beautiful, clear stories: here's a federal data set, here is how it is used by a city agency, and here is a member of the community whose life is affected by that use. So that was the storytelling route. And then, of course, understanding that we were acting in this kind of bucket brigade, we wanted to make sure that there was space at the event for people to talk about the long trail. What should we do in the future? How can we make this more sustainable?

So from there, we had some spreadsheets, there became a workflow, and the workflow became a web app. Brendan, thanks. This all became a system, which was kind of codified, right? And that's awesome. And it also means that a lot of people were like, "Great, now we know how to do this." Like, we don't know how to do this, right? This isn't the way. It's a way that worked for a while, and I think we continue to learn. But this did, in a really exciting and awesome way, allow for something like 50 data rescue events to happen around the country, and for many thousands of people to find a place for themselves in this work and begin to understand what it means, and I think that's been really inspiring and amazing.

So let's do a little bit of dirty laundry, right? This is from Data Refuge, and the data set description says, "This is a catalog of data descriptions. It does not seem to contain any actual data, and although many of the data sources listed have a tab that says data set, nothing loads." So someone identified that this was uncrawlable. This is a community-based effort, right? This is volunteers.
So like this is, there's data in there that is mad valuable that I feel super proud of, but there's also, I don't know, I'm not putting blame on something, but this is like a, draws into question a little bit, like did we definitely need to go through this entire complicated process for this zip file. But to be clear, this zip file, unlike many, many of the data sources out there, is in data.gov. But even in data.gov, it's not quite clear what this is for. It says, the RCS is a super catalog of components, services, solutions, and technologies that facilitate search, discovery, and collaboration in order to promote quality and savings in software development through sharing and reuse. Thank you, EPA. So I mean to say only that the problem here is not just that like we did something wrong or that like volunteers can't do stuff. It's really that the, this system that we're working on isn't quite the way that we would want to do this, right? We were trying to make an attempt to say, okay, there's query interfaces, right? And we're going to pull the data out, and we're going to add whatever contextual web pages we need, and we're going to add a little bit of metadata, and then we're going to create this research data set. And that will be worth saving. And I, you know, I clearly thought that was the right thing to do. I'm not saying it isn't, but I think what we all know we need to do is move upstream, right? We need to move upstream in the data production process so that the data that's produced works a little bit better. So we, what we really want is for producers to be understanding themselves as making something more like a data package where, hey, thanks, Max. 
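As a rough illustration of what "something more like a data package" could look like, here is a minimal sketch that builds a small descriptor for one CSV file, loosely following the shape of the Frictionless Data "Data Package" descriptor (`name`, `sources`, `resources`, and a `schema` with `fields`). The dataset name, file path, URL, and field descriptions are all invented for the example; nothing here describes a real federal data set.

```python
import json


def describe_dataset(name, title, csv_path, fields, source_url):
    """Build a minimal descriptor dict for one CSV file.

    `fields` is a list of (name, type, description) tuples.
    """
    return {
        "name": name,
        "title": title,
        # Where the data came from, so a later reader can cite it.
        "sources": [{"title": "Original publisher", "path": source_url}],
        "resources": [
            {
                "name": name,
                "path": csv_path,
                "format": "csv",
                "schema": {
                    "fields": [
                        {"name": n, "type": t, "description": d}
                        for n, t, d in fields
                    ]
                },
            }
        ],
    }


# Hypothetical example values, for illustration only.
descriptor = describe_dataset(
    name="river-temperature",
    title="Hypothetical river temperature readings",
    csv_path="data/readings.csv",
    fields=[
        ("station", "string", "Identifier of the monitoring station"),
        ("temp_c", "number", "Water temperature in degrees Celsius"),
    ],
    source_url="https://example.org/river-temperature",
)

# The descriptor would normally be saved next to the data as JSON.
print(json.dumps(descriptor, indent=2))
```

The point of the sketch is the design choice rather than the specific keys: the description travels with the data, so whoever receives the file also receives at least some of the context a producer would otherwise carry in their head.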
So I want to give a shout-out, as we learned about this, to the Dat project and this notion of frictionless data. Especially in terms of the Dat project, but also in terms of my collaborations with John and Max and folks at data.gov, I think about the way they're approaching welcoming people who work in federal agencies, whose job it is to produce this data. The folks at data.gov are working slowly with them and understanding that they all have jobs, and there's room for us, for a set of volunteers, for a community, to help them make some better metadata. And I think that's an avenue that has some real potential, especially as a librarian.

But I do also want to talk a little bit about the multiple meanings of metadata. I never, ever thought I would say that. It's terrible. No one's like, "I want you to talk about the multiple meanings of metadata." But someone recently used the term paradata to me to describe the universe of things one would need to know about data in order to use it. It's more than metadata. And I have had a conversation recently about software preservation, and all of these point to the same thing: we all want data that has all of its meaning right there in itself, but that's not actually how humans communicate meaning. I spend a lot of time trying to make good-quality data and trying to use and reuse good-quality data. But when you give someone a data set, there's a whole bunch of assumptions about what they know and can do, how they can make use of it, and what they already understand about what is coming to them. And those are not just questions that can be answered by a data dictionary or a code book.
And so I will think about, I will use a quote from my friend Rachel Appel who keeps insisting that she's quoting other people, which I'm sure she is, which maybe you've heard this, but I love it. Metadata is a love letter to the future. So imagining that the metadata, the data that you create around your data is not just for another machine that can use your data and it's not even just for another person who probably has exactly the same set of assumptions that you have, but might also be for some future person or people or computer or person using computer that is, has a different set of expectations and what's the kind of set of stuff that you need. This is all really complicated and it requires a lot and I want to be clear it's not worth doing for all of our data. It's actually worth doing for only a very small part, the part that we want to see last, right, for a long time, which that's why there's like whole communities around doing that. But I think this notion, like when I've just done is kind of blow up, we need everything, we need the software, we need the stories in order to make the data useful, that just makes swimming upstream even harder, right? And so this is my terrifying, swimming upstream is dangerous. But I also think a big reason, so I think that this notion that there's only some stuff that we can do that for, but still we want data producers to think a little bit differently, to move a little bit in the direction of creating data that is reusable, right? And this bear and this picture is really our institutions, right? They kind of do slow things down in some cases. No offense to institutions. But, so federally, I work in an institution, libraries that have a whole ton of baggage and I work in and libraries, the libraries that I've worked in in my career are part of academic institutions that have a whole lot of baggage. So I work in an institution inside of an institution. 
They slow things down, they discourage brilliant people from doing new things, they value the wrong stuff, they reinforce racist, sexist, heteronormative, and ableist hierarchies. And they are also communities of people, people with jobs, who are paid to do things that we in this room probably mostly really value. I want to refer to a talk I saw by an archivist at Princeton named Jarrett Drake, who's really awesome; you should read everything he's ever written. He basically went through higher-ed institutions, read their mission statements, just read them out loud, and then talked about the ways that over their histories they have failed their communities, and about the power in holding our institutions accountable to the values that they espouse. And I have a lot of faith in that, because it's truly been my work. For 15 years I've been a librarian, I've worked in an institutional context, and I care a lot about seeing my library do things that are technically smart and also ethically good. The way that has worked in my life is by holding my library and my institution accountable to the values that they espouse, encouraging them and asking them, and making the trust that we have in our institutions an active trust.

And so, in the spirit of harkening back to missions and what our values are, this is the part where I thank Shari Laster. I was like, I don't want to talk about open data starting in 2008, I want to go back further, and so she gave me a history lesson. She's a government documents librarian, and she's really smart. She's in the audience, though I don't know if she's actually here. There she is. Thanks, Shari.
So this is a quote from James Madison, with whom I don't agree about a lot, but I love this: "A popular government, without popular information, or the means of acquiring it, is but a prologue to a farce or a tragedy, or perhaps both." I think we all can relate to that right this minute, right? "A people who mean to be their own governors must arm themselves with the power which knowledge gives." And so I think about framing the open data movement within this much longer insistence that a civic society must require its institutions to live up to a set of shared values and share what they know. It goes back further. And this is again tied to our institutions of cultural memory. We have long believed this is important, and we have long, as a society, funded institutions so that there are people whose job is not to create a product or to create profit for anyone, but instead to help society get better at learning and knowing things, because that's actually a shared responsibility. I know, I'm just preaching, whatever. I'm not going to stop. It keeps going. Okay.

So, a brief continuation of the history lesson. There's an organization within the federal government whose current mission is to provide free, ready, and permanent public access to federal government information, now and for future generations. Wouldn't that be awesome? I mean, there is such an organization, and my hope is that we can find ways to reinvigorate that spirit in our government. I know our government right now is a little bit tricky, and I'm comfortable giving up on parts of our government right now, but the fact is there are hundreds of thousands of people. Our government employs more people than our six top corporations.
So there are a lot of people who we can continue to believe in and who we can continue to support in and so this mission to provide free, ready and permanent public access to federal government information now and for future generations to provide government information when and where it is needed in order to create an informed citizenry and improved quality of life. This mission is achieved through so the program for those of you who haven't caught on is actually the federal depository library program which was started in 1895 to send copies of federal documents that were produced, federal information to libraries all over the country so that people could access to them and also so that when the federal government's priorities change which they always do the federal government doesn't want why would they keep providing access and this isn't the example that I keep using is when the Obama administration took office they likely took down information about abstinence only education. I am glad that they did that. That was weird, some weird thinking. That said it's sort of part of the process of changing governments. It doesn't mean that we as a culture should lose access to the fact that that was out there. The internet is a very presentist technology system. We don't pay a lot of attention. It's not beyond it doesn't do a very good job. It does an extraordinarily bad job of helping us to say here is a little piece of the present that I want to wrap up and give to the future because it's going to become past soon. That's the work that print publications did all the time and we just don't have a very good way of doing that and there is a loss in that and it's something I think we need to figure out. I don't know that libraries are the answer. I think they're certainly not the only answer. There's no way for that. It needs to be a community effort but it needs to be a community effort that actually believes this is a thing that we need to value and pay people to do. 
Okay, and this is just... I can't resist. This is Adelaide Hasse, who I just learned about but who is now my total hero. She was the first superintendent of documents created in the federal government. She created the way of organizing it. I'm just going to read a few quotes about her. This is when the FDLP was created. "Thousands of documents dating back many years had accumulated in various areas of the office." (This is right now.) "Additional publications clogged the rooms of the House and Senate. None of these miscellaneous collections were arranged in a systematic way. From the chaos, Crandall," that's the guy who hired her, "was expected to organize a sale stock, a depository library stock, and answer the many reference questions directed to his office by the general public. Crandall realized the difficulties he would face." In his first annual report he noted, and this is my favorite: "This seems a simple and easy solution of the document problem. That it is, however, not quite so simple as it seems may perhaps be inferred from the fact that it has not sooner been adopted." Right? "As a matter of fact, it involves an enormous amount of labor, and it needs to be skilled labor." Boom. "With these problems in mind, Crandall turned to Adelaide Hasse," and then she turns out to have been really awesome and badass, and no one liked her because she got in a lot of trouble. She went out too late at night. But by 1897 the documents library had grown from nothing to a well-organized, selected, important collection of 16,481 print documents and 2,597 maps. In its completeness... so that's the dream, right? Except multiply by, like, millions and millions. So my little side trip to the past is to say that I think one of the ways that we can think about data, among the many, is that they are... this is the approximately-equal-to sign, which I now use all the time.
I think you've seen it like four times in this presentation. Data and stories are actually deeply connected: they are both ways that people have of sharing something that they believe is special, transforming some part of the world, and handing it to someone else and saying, I think there's some meaning in this. And so, in sharing data, we can learn something from the way that we have shared stories. And with that, I'll just mention some of the directions that we see Data Refuge going in the future. So we are really psyched about events. I think they've been empowering and amazing, and if any of you feel like it, please, please get together with your community and organize an event. But I will also say that what we're thinking about, at least within the Data Refuge crew, is that we'd like to see events that are focused on this notion of designated communities. So identify a community that is meaningful to you and bring communities together to say, hey, what's the stuff that we want to do? Let's save that stuff. You should check and see if it's already been saved, but if you and your community can't find it, go ahead and save it. Right? It's okay to save multiple copies if your community needs it. So that's the kind of event that we're probably going to try to encourage. We also have a project called Three Stories in Our Town, which Bethany Wiggin got funding for from the library, that takes the office of sustainability for a city, a federal data set, and a person, makes a connection across those, and does that for a bunch of towns. So that's a project that we're going to continue to work on, and it's really exciting. And then the Mozilla Foundation, the Association of Research Libraries, and the Penn Libraries are co-sponsoring a meeting next week to talk about what we hope will be a new way of understanding the needs around the stuff that we might want to save and how we can approach it, called the Libraries+ Network.
The plus is because we need to build communities within our institutions and also across institutions. We have so much that we need to learn from the tech community, and we have so much that we can offer, and there isn't just one tech community and one library community. You all know, right? You're in the tech community, and yet you are all from a bunch of different communities, and so we need to build community across our institutions. We have to build networks, and we have to empower people within institutions, not just by yelling at them but by supporting them and encouraging them and helping them build the kind of institutions that we want to see. So here are the folks that we're bringing together. These are the institutions represented. There are, I think, 60 people. Unfortunately, the meeting is totally full. We didn't know it was going to be so popular. It's at the New America Foundation, and I hope that what will come out of it is a path forward that will be distributed. I don't believe that we can hoard everything.
I don't mean to suggest by this talk about institutions that I believe we need to have static repositories that keep stuff on a server. I think we need to think really expansively about the technologies that we use as we think about this distributed system, but I also don't want to lose sight of the power of the sort of trust that we have in our institutions. And then, for me, I hope the Libraries+ Network takes off. I'm super excited about it. I also think that that work is going to happen across a huge number of people. One of the things that we're going to do in my group is go back to our local community and think about building a platform that is not centralized, that's decentralized and open, but that helps researchers at Penn who create data, and also members of our community who need data, to make sure that we have safe, citable, reliable access to and long-term preservation of the data that we need for the future. So, going back to the Eastwick Friends and Neighbors and trying to build the platform that they want. Thank you.