Okay, great. So yeah, thanks everyone for joining us for the data rescue town hall. My name is Dawn, and I'm working with EDGI. This call has come out of ongoing community-building conversations that have been happening over the last few weeks, and we really wanted to have a town hall to accomplish a few goals. First, we wanted to review the accomplishments of all of the data rescue events that have been happening around the country, so the nearly 30 events that have happened to date. We also wanted to check in to identify the common individual motivations and themes across those events, and then the ongoing efforts that have happened since then. And we wanted to begin to strategize the next steps, building from this work.

So, like I had mentioned, I will be facilitating this and also trying to keep us to time. In the agenda we have some suggested times on the side, so I might be trying to bring the session back or make sure we move on to the next point. So sorry in advance. We have the agenda, where we invite you all to add your notes as we go through the call. There will be times where I prompt you, or if you have any thoughts, it'd be great to add them in the chat if you're not able to speak, because we'll have those for a record afterward, and we'll pull it together and do a little bit of synthesis so we have a strong record of everyone's thoughts on this call.

And then for introductions, I wanted to hand it off to someone from Data Refuge and someone from EDGI to speak a little bit about both organizations, and then we could go around and do a quick round of names as well. So Michelle, if you're able to speak a little bit about EDGI's background and how that connects to data rescues, and then Margaret, do the same for Data Refuge.

Okay, hi everyone. Can you hear me okay?
I'm in a very echoey, noisy space. So my name is Michelle Murphy, and I'm one of the steering committee members at EDGI. I think lots of you here are connected to EDGI or have bumped up against EDGI before. EDGI is a network of volunteers: many academics, legal people, scientists, and now also some people who are ex-members of state agencies. So a kind of motley crew, and we do a variety of kinds of work. There are three working groups in EDGI. One is doing interviews, policy work, policy commentary, and research on the transition of the agencies concerned with environment and climate in the United States. A second group is doing website monitoring, and that has an archiving component, because it is both doing a web crawl of environment and climate websites and doing version tracking. So it both archives versions of these web pages and, critically, takes note of the changes that are being made under the current administration. And then a third aspect is the web archiving and preservation side of things, where EDGI's role has been, I think, largely a support role. Many people here on this call are a part of that work, but I think EDGI sees itself as hoping to support diverse community efforts. It has been involved in helping to cultivate the tech communities that are building some of the tools used in the data rescue events, and has organized a few of the data rescue events. People in EDGI have been very excited about the question of, now that we have learned so much collectively around these different sites, what are some of the new questions? What are some of
the further ways that EDGI members can take part in community efforts around thinking about archiving data, but also about the politics of data in this moment, where the government is increasingly hostile to some data but wants to collect data on, let's say, immigrant crimes at the same time? So these are some of the questions we're thinking about, that critical side of thinking about the data in addition to the question of archiving. So that's some introduction to EDGI.

That was great. Thank you. And then Margaret, if you could also provide an introduction to Data Refuge and your ongoing work.

So Data Refuge is a joint project between the Penn Program in Environmental Humanities and the Penn Libraries. We started around the same time as EDGI, at the end of November or early December, thinking about this issue of backing up this vulnerable climate and environmental data. But our goals, aside from working with these events to try to create research-quality copies of this data, are also to educate folks about the general vulnerability of digital information and to advocate for it by telling stories that connect the data to the communities that use the data, so that we can really tell that story: this data isn't just used by scientists. It's actually used by people, and it affects people on a day-to-day basis, so it's really important beyond just science. And then, yeah, so I'm thinking about, I don't know.
I'm sorry, I'm just going to check the agenda and see if I'm supposed to continue talking about things, or if that's enough of a recap of who we are.

I think that's a good recap of who we are. Yeah, I think that's probably good for now, just so we leave a bit of time. Given the size of the call, I think we can actually do a round of quick introductions, if that works for everyone. So if you could just say your name and, if you've been involved in a data rescue event, which one, and then that'll make its way into the doc. I'll go through in the order I see people, if that's okay. And everyone who has already introduced themselves I won't cover, just in the interest of time. So the first person I see is Toly. Toly? Okay, so maybe, while we wait until that tech thing is sorted out, I'll move on to the next person I see, which is John.

Hello, can you hear me?

Yep.

Yeah, John Payers from Virginia Tech. We haven't had a data archiving or data rescue event to date. I did do an event to inform people about the issues. Pleasure to be here.

Great. Thanks for joining us. The next person I'm seeing is Kevin M.

Hey, Kevin McCulloch. I got involved after attending a data rescue event here in Washington, DC. I'm a developer. I've been helping out with the archivers app and some of our tooling.

Great, and then the next person I'm seeing is, I'm sorry, I might mispronounce this, Sangita.

Hi, I'm Sangita. I organized Data Rescue Chapel Hill. Basically, I just saw Michigan's event, and as a student at SILS, the iSchool here, I wanted to do the same thing. So that's how I got involved, and I'm now organizing a Data Rescue RTP for the Triangle area with the National Humanities Center, trying to expand to humanities or other agencies.

Great. Thanks for joining us. And then the next person, I'm going to go back to... Sorry, no, Matt, could you introduce yourself?

I am Matt. I work with EDGI.
I've done a lot of coordinating of tech stuff, although Dawn has been doing most of that work lately. And this semester's almost over, so I'm looking forward to figuring out what we do next.

Great. The next person I'm seeing is Abby.

Hi, I am Abby. I'm working on the website tracking team, and I'm also helping with some of the workflow stuff for events.

Great, the next person I'm seeing is ankle.

Hi, I organized Chicago's data rescue in March and, actually, another one, excuse me, on Earth Day at Northwestern University.

Great. Thanks. The next person I'm seeing is Mike.

Hi, I'm Mike Hucka. I'm a staff scientist at Caltech. I work mostly on software and standards. I attended the event in Philadelphia and then helped organize the event at UCLA in January. And I'm also getting involved in the archivers software development aspects.

Great, thanks for joining us. The next person I'm seeing is Justin.

Hey, I'm Justin Schell. I'm from the University of Michigan Library in Ann Arbor. I attended the Philadelphia event and then organized the event at the end of January, and since then have helped organize the Minnesota event. I've had my hands in lots of different pies for this, from the app to larger data curation stuff, and I think some of that I'll talk about later. That's me.

Great, and then the next person I'm seeing is Sara.

Hi, my name is Sara Wylie. I am on the steering committee for EDGI. I'm an assistant professor at Northeastern University, and I planned the recent Northeastern University event.

Great, and then the next person I'm seeing, I'm sorry, I might mispronounce this, is Kaylee.

I'm the science teacher.

Great, thanks for joining us. And then I'm going to circle back to Toly. Could you introduce yourself?

Sorry, I wasn't ignoring you. The internet cut out right exactly at that time. My name is Toly Rinberg. I work with EDGI.
I helped set up some of the primer systems and office codes that were used early on, and also helped plan the Harvard event with Andrew and Maya.

Yep, and I'm Andrew Bergman. I went to the Philadelphia event, and I guess New York and San Francisco, and our own in Boston. I'm one of the EDGI steering committee members, and I work on website monitoring as well, and some of the work that we do to monitor legislation and regulatory changes.

Sorry, I got weird about the muting there. Okay, awesome. So next we wanted to move into a bit of a recap of events, sort of a celebration of the accomplishments we have, and I was just going to provide a couple of high-level stats. I had reached out to people who've been involved across different areas, managing some of the tools that are used, to collect those, and they'll also be available to add some. Oh, did we miss anyone? Oh, sorry. So Kevin, I believe, already gave an intro, and Maya I was just going to skip because of the note-taking, but I'm sorry, you didn't introduce yourself. So do you want to introduce yourself?

Oh, well, then you should introduce yourself too.

Hi.
I'm Maya. I work with EDGI on the website monitoring, also helping to organize and coordinate these events, and a couple of other things, and I'm really excited that this town hall has come together today. So looking forward to the discussion.

Yeah, and then I had said that I'm Dawn, and I've been involved with EDGI, and I've been at a couple of events. I was at Ann Arbor and at Data Rescue NYC, the first one. Okay, so moving on to a recap of what we've accomplished so far. Through the roughly 30 events, we've added more than 55,000 unique seeds to the Internet Archive, some included in the End of Term harvest and, post End of Term harvest, some going into a regular crawl, and those are being transferred over. Of the seeding work that's been done at events, we've identified a large portion that we've been calling uncrawlable. That's not necessarily the right phrase, or rather that phrase has an internal meaning for us, but really we've identified pages that are difficult for the Internet Archive's crawler to capture: data sets, or collections of data, where we're interested in doing additional work to ensure they get preserved. There are some stats for that that have been compiled; I'm just dropping a link in the chat. Further, as part of the work at events, we've had a system for researching, harvesting, and then bagging and describing, to ensure that we have a chain of custody for some of those difficult-to-crawl data sets. There are currently 175 of those in the Data Refuge repository, covering 11 federal agencies. We have at this point around 1,500 currently in the pipeline at various stages that would be in addition to those data sets,
and I wasn't sure, Toly, if you had anything you wanted to add, because you've been involved in helping with managing the back end of the seeding and the primers. So if there are additional stats, maybe you want to provide them. And I was expecting Brendan to be here, so I'm just going to ping him and see if he's around.

Here. Yeah, thanks for that overview, Dawn. Just a quick note: I think most of the stats on the first page are quite self-explanatory. At some point we started sorting the data sets and information on websites based on many files, visualization, database, and FTP, and there's a breakdown there. And then we also have breakdowns by agency. Lower down, we figured one way to track progress is not just by looking at the number of total seeds. Because we put so much work into splitting up agencies by sub-primers, we feel that sub-primer lines are a good metric to use to understand the progress that we're making spanning these assets. So on the second page, there's the agency coverage progress. At the moment, given all of the sub-primers that have been written, we have something like 710 sub-primer lines that we want to complete. You can just think of them as sections of agency websites that we need to send people to screen through, and this is just a plot over time of how much progress we've made. It's actually really awesome to see this thing growing, and I hope it keeps growing and expands beyond just environment, climate, and energy. I think there are a lot of great directions we can move in.
I just want to add also that one metric that there's some reluctance to use is the number of total seeds nominated, because different events nominated in different ways: sometimes a lot of nominated seeds for a small set of web pages, and sometimes very few. But looking at the number of sub-primer lines is maybe a good way of understanding the scale of the work that's been done as a function of what we'd like to achieve.

Great. That's a really good point. And Justin just had a question that was dropped in the chat, that maybe Toly you could speak to, about the number of sites in the app, and also Brendan, now that he's here. But maybe Toly first, if you could speak to the number in the app.

Essentially, the analysis here was just from the nomination Chrome extension, so it doesn't know anything about what's in the archivers app. And this is something where I think the conversation will be to understand how to move those properly into the archivers app. I think we should think a little bit about that, because we don't want to overwhelm the archivers. There's something like 20,000 potential URLs, but maybe there are smart ways to condense that.

Great, awesome. And then I think that's the last point, and we have time for maybe one or two more questions before moving on, if anyone has one to drop in the chat. Yeah, so Toly answered: it's a de-duplicated list. Awesome. And so just a pause, if there are any more questions around this.

I guess on the subject of de-duplication, I just want to say it's worth noting, and Brendan can speak more to this, that it's de-duplicated in the most trivial sense. It's literally if two URLs are identical, which is very easy de-duplication.
It's not de-duplicated in the sense that sometimes two websites will be pointing to the exact same data sets but have different URLs, and sometimes they'll be pointing to similar data sets or subsets of one another. So that gets more complicated, and that's not done, but the new app has some features to be able to take care of some of that.

Okay, so yeah, maybe Brendan could just speak a little bit about some of the stuff going on on the app side, and then let's move into event feedback from organizers and take it from there.

Great. Apologies for being late to the call, everybody. I totally gapped on the timing. I'm assuming we're talking about app statistics. Is that right? Yeah, cool. So, as was referred to earlier, we've moved about 1,800-ish URLs through the app. It's really making good progress on that stuff, with a lot of content sent to Data Refuge. Pardon me if I'm repeating anything you've already mentioned. No? Okay, good. So we have that many, and then at least a hundred, maybe, data sets actually in Data Refuge now, a sizable number. What else is going on in the app? And we've mentioned the number of seeded URLs through the crawler.

175. That actually was off by five, I think. Maybe the question Michelle had, and sorry for putting you on the spot here, was a little bit about... I had said that we have about 1,500 in progress at various stages in the app, and that's been a part of the work focus. That's where the bagging and describing, in addition to some of the harvesting, is being managed. So maybe if you have anything more you want to say to that.

Totally.
Yeah, so we have a really good set moving through the app. We do have a good percentage of URLs that hit the app that are marked as actually crawlable. So it's worth noting that of the 1,800 URLs that we've worked with, at least 40% are identified as crawlable, which is great, because that means we've had somebody from the technical community review each one and make sure that it is in fact suitable for the IA. And what else? Speaking to this question in the chat: when we save the data, how is it retrieved by the scientists currently? We send everything to Data Refuge, with plans to do further integrations. But right now, in Data Refuge, it's there and available and in a great spot. It only hits Data Refuge once it's finished. So currently we have a high number of URLs sitting at some point in the process, waiting to move to the next phase, and we have that on the home page of archivers.space. You can actually see the live count of where everything is at any given moment. Hopefully that helps as a good starting point.

Yeah, I think that's a great starting point. So through those events, we had a couple of organizers that we asked to speak, just to help us move into feedback, and we wanted to have a broader conversation with everyone here who's been involved. So Justin, if you're able to give us a bit of a kickoff and describe what happened at your event: briefly when, where, who, what moments stood out to you the most, and if there's anything that you learned from other event organizers, it would be awesome to share.

Sure. Yeah, so I think it was similar to Sangita, who said, oh, I saw this event happen, then I wanted to do one. I saw the Washington Post article that talked about the Toronto event and was like, okay, we're going to do one. I have no idea how.
I have no idea how But so I and at the same time one of the directors in our library said we need to do something about this Because we're Michigan. We need to do it big and so that's been sort of parallel tracks going on with the Larger sort of data curation community and building repositories and things like that But for me it was running this this two-day event In the in the design lab that I run at the University of Michigan library and so not knowing how to run an event like this I headed out to Philly to to learn that And ended up helping out with a bunch of the the initial workflow Sort of establishment and and then use that model for the Ann Arbor event We were still on a spreadsheet back then the the grand old days of January with that ever fun spreadsheet But You events now have it so so much easier But we we pulled in about 350 people 300 people I can't remember over the two days on Friday and Saturday Got a well one actually one data set is still downloading from that event. It's an EPA Data set that's been that is a high priority one and this is what this is why it helps to have your University sys admins be your friends on the project And so from there we've hosted or worked with a couple civic tech nights We are also working to do an event in Detroit in a couple weeks tentatively scheduled for April 15th of Saturday But also going back to that first theme of Michigan You know Michigan is the initial home of Hathi trust the global or the the you know multi-billion page text repository And also one of the homes of deep in The digital preservation network and so and also has one of the originators of the data curation Community Jake Carlson there and so we've been working through these networks and building out more connections to people both within Data repository land as well as within sort of federal government and federal data centers who could a Help us identify What data is you know vulnerable and the different kinds of ways that? 
they would judge vulnerability versus how we would judge vulnerability. So for example, we ended up downloading about three terabytes of buoy data during the Ann Arbor event, and after talking to someone in the Research Data Alliance, who is the community development person for them, they said, oh yeah, that data is in like 15 different places. That's pretty good, because a lot of people use that. Now the larger question is, is it accessible in those 15 different places? Is it preserved, et cetera? And that transitions to the longer-term question of what kinds of infrastructure we want to build or adapt for these longer-term preservation things. At the same time, we were working with our faculty here to figure out what data sets they needed, as well as working with some community projects to incorporate things like a Zooniverse crowdsourcing element for some metadata transcription. And I think that's where a lot of my focus is, within citizen science: to work on developing community tasks, like some community site mapping that we did at the Minnesota event; looking at crowdsourcing perhaps the PageFreezer website monitoring work, which could be done as part of events; as well as, like I mentioned, this Zooniverse project that could be around metadata transcription of the two million or so PDFs that have been harvested by the End of Term crawls over the last eight years. So yeah, lots of work, lots of things happening, and lots of directions to go forward. That's what I've got to say about that so far. Is there anything else that I'm missing, Dawn?

I think that's great for now. What we were hoping is to have one more event organizer speak and then have a bit of an open conversation. Margaret, if you have a question, is it something that could be typed in the chat for now and then brought up later?
Or is it a...

I just want to give a shout-out to Justin, because without him and the work of Delphine and Rachel at Temple University, we would be so screwed. We love them with all of our hearts. We're so grateful for them.

Right back at you, Margaret, and everyone else.

Awesome. Yeah, so I feel like I probably should have said this at the start, but if questions come up, or themes that you notice in something that someone's saying, it'd be great to drop them in the chat so we can revisit that later and bring that forward. The next person: Patricia, I think you had said you would speak a little bit about your event. So again, briefly describe when, where, who, also a moment that stood out to you the most, and maybe whether you learned anything from other data rescue event organizers.

Yeah, sure. Let's see. So I actually have a presentation that I don't know how to share using this format, but I can just speak to that.

It's actually down on the bottom. There's a green button that's a share screen. I think you might be able to see it.

Okay, perfect. Yeah, I'll do that. Okay, can you guys see my screen now? Okay, great. So I just want to talk a bit about why I decided to do this. I think that in this current climate, it's really important that we preserve this knowledge, and the second step is to empower people by showing them how to use it for themselves. I can talk a little bit about that later. So just a few things. There are the learnings that we took from our event in March. There's some workflow stuff that I actually added to: I created some checklists and templates to help me personally, because this was my first time hosting this event.
I knew nothing about Data Refuge and doing all this stuff, so this helped me parse through all the data and make it make sense to me, and hopefully helped the people who attended my event, and there are maybe some opportunities for later. So the results: 35 people signed up and 24 people showed up. We didn't get as much done as I had hoped. We didn't harvest as much data. But, especially in our area, at least for the people that attended, they were like, this is my first time, I want to do this. So a lot of it was just, how do we do this thing? And after that people were like, when's the next one? We like doing this. So hopefully we'll get more done at the next event. I also have an event results spreadsheet that I can share with you all. Basically, because I wanted to make sure that everyone felt like they did something, or at least that I was able to relay back how much work we got done, I had everybody take notes on what they did. So that's that, and it was mostly positive feedback from volunteers, and again, they want to do it again, which is great. As an organizer, while I was organizing the event, it was a really comprehensive workflow, but it was a little bit difficult in parts. For me, as a new person coming in, it was different sources, reading from different places, which is fine. It's just that putting that all together might be a little bit challenging for some. It would have been nice if there was a little more clarity on a specific person to go to for help. And there are just all these other things. So, to help me and the volunteers,
I felt like, doing this event, it would take up a lot of time just setting up. So I put together this really simple document, just an attempt to condense the workflow into a very easy, simple checklist format. I sent this out to people before the event and had them do all of these things beforehand so we could get started right away, because I figured there was going to be a lot of stuff that they had to do, and I wanted them to actually do work while they were there. Some people printed this out, and it helped them figure out what we were doing. So there was a general checklist for what's common to everybody, and then for each role there are mini checklists of what they would have to do, like pre-install, pre-read, and all that stuff. I think organizing that helped some folks understand what they needed to do. And then even just logistics stuff...

Just in the interest of time, are there maybe a couple of key moments that stood out around this? I think this stuff is super awesome, and we want to make sure that we integrate this feedback. But are there themes, maybe, across what you noticed while organizing that we could speak to at a high level here, and then maybe dig into specifics later?

Yeah, yeah, sure.
So I think that people just weren't sure what to expect throughout their day. Overall it was like, well, we're going to do something, but we don't know exactly what that looks like. So maybe a little more clarity there. Even though we had a preset agenda, the roles were still kind of unclear, the paths were still kind of unclear, and maybe it was my fault for not understanding it fully at that time. So it was just a little more clarity on what they were doing, and also knowing how much they really made a difference. So this, I hope, validated a little bit of what they did. But because they did a lot of setting up for half of the day, I think they may have felt like, I didn't do as much. So just being able to know what they're getting into, and maybe an expectation of how much they would be doing, and having that result at the end, helps. So yeah.

That's awesome. And if you are able to share the link to your slides in the chat, that would be amazing, just for other event organizers. I kind of wanted to open it up now to...

Oh sure. I'm also going to share out a bunch of these templates and stuff too, which are totally available to anyone to use if you want. So I'll share that out as well.

That's amazing. Thank you. So we were hoping now, with those two in mind, for other event organizers to speak briefly to one of those or all of those questions, but in particular, if they see themes in common, it would be great to say. You can mention it in the chat, and it's great that we're taking advantage of that, but the channel is also open now. I'm going to stop sharing your screen. I think I can do this, but maybe you need to, actually, Patricia. I'm not sure.
No, I don't think I can, sorry. Okay, great. So if anyone else has something they would like to add from their experience, or if they're noticing a theme across those. I dropped a couple of my themes in the chat, and I see a few more that have been mentioned. Yeah, so I'm also seeing this issue of thinking about how to onboard people, and some of that templating stuff. I think it was cool that both of you had mentioned that you saw an article, or you saw this happening, and you were like, how do I get to do this where I am?

Can I share something?

Yeah.

I agree with everything she just said, because I think we spent so much time just trying to figure out the workflow and trying to get things started. Some of the other issues we had: in registration, I tried to send a survey out explaining the roles, and I think I need to work harder on that, if anyone has better ways of explaining this. Because I think a lot of people actually started coming to me like, I don't know what to sign up for, I don't understand the role. People weren't registering because they felt intimidated by the whole process. So I had to start to roll it back and say, if you can surf the web, there's something you can do; if you can write a story,
there's something you can do. So I felt like that was something that we're going to work more on for our next data rescue.

Yeah, I think that's the same exact thing here. There are a lot of non-technical people that want to contribute, and they really can. They can do survey work. So just making that less intimidating, like you said, is also one of our goals for our next data rescue.

One thing, and this is Justin, one thing that we did as a difference between the Michigan and the Minnesota events: I staffed an orientation table that was the second stop after someone registered. For each, whether it was a small group or a single person, I basically walked through the different roles and gave the level of technical skills needed, whether that was developer skills, or library and metadata skills, or research skills, or things like that. Those were all color-coded, and we had those colors on the tables. We also had guides stationed at each one who could get people up and running right away. But we also made very clear that if you're not comfortable with this role, you can drop into a different role very easily. The other thing that we tried to do, partially because at that time we were almost out of primers.
I think we actually finished the last one at that time, is we just had a more general archiving track that used the Internet Archive's Chrome extension. We were doing some research with the Internet Archive API to see what things weren't super well covered, and so we could say, okay, Department of Forestry, or whatever we chose for that day, and had people go through and systematically do that as a first step towards that community site-mapping goal. So it needed even less technical skill, in terms of not having to identify crawlable versus uncrawlable content. But I think that orientation table really helped get people comfortable with the roles and what they meant, and also how they fit into the larger workflow.

I'm glad you mentioned the color coding, because we're trying to do that for the next one. And also what we're trying to do this time is we've actually added host training for our next data rescue, because I think people really want to do more of this kind of thing with their organization, but they don't know how to get started. So I'll be walking folks through that as well.

Hi, this is Kaylee. So on the Eventbrite form we had a checkbox list where we had, okay, you have programming skills, you have back-end web skills, you have all of these skills, so we could get people thinking about it. And then we actually broke things down into, okay, so as a seeder (sorry, that's my cat in the background), so as a seeder or sorter, you need these types of skills; as somebody who's working in the app, here's the range of skills that you might need for all of this stuff. And that sort of worked. Of course, the seeders and sorters generally felt a bit less confident.
I said a little bit of that in the chat, but we were trying pretty hard to make sure that it was good. And at the beginning of the event we also ran through the entire workflow from beginning to end, so that the people who were there in the beginning got oriented to things, and then as stuff happened through the day people were able to come and talk to us if they had questions.

So maybe even just one theme I'm seeing on this call itself is that there's so much sharing across events, like what worked and what didn't, and it seems like it's pretty helpful for event organizers. Just thinking about some of this stuff, it'd be great if people could speak a little bit as well, or maybe share in the chat, about what motivated them to attend or host a data rescue. I think that would be really great to uncover a little bit more. And also, I think one of the other things, and we've seen a little bit of this, is maybe a thing that you wish you had known at the outset of planning, because I think it'd be great to understand if there's a way that we could bring that stuff in at the beginning of planning an event. And you don't have to share it in the chat, you can also just speak up. We have time.

There was one: onboarding. I think it was checkers and baggers. It was something with admins giving access to checkers, baggers, and describers. I think we were kind of spinning our wheels trying to figure out why they couldn't log in, and it was because, as an admin, I had to go to the Archivers app, select the person's username, and assign a role to them: okay, you're a checker.
You're a bagger, and all that stuff. And I didn't know that. I thought I read the workflow thoroughly, I mean, I went over that one, so I may have still missed it, but it was one of the things that we learned that day, after finally getting help. It was either Justin or Brendan that helped us out with that. And then there was also logging into Data Refuge or creating that account and all that stuff. It was just like, I don't know, we all missed it. So it would have been nice to figure that out in the beginning.

This is Sarah. I put together a Northeastern event. I ran into one interesting problem. I was very interested in finishing off the harvesting and bagging for data from EPA that's been added to the Archivers app, and Brendan really helped by letting us know where things were in the pipeline. But then the people from the Boston area who have been really leading the archiving work here were sort of hesitant to work on archiving that stuff, because they were focused on developing the Archivers 2.0 app. So I just ran into this problem of, well, I really want to get this stuff bagged, but from a technical perspective there was sort of a feeling of, well, we need to have a different process to bag before we can do that. So I guess that's one of the things that I'm wondering about: how do we make sure things that are in the pipeline are really making it to getting bagged, and how coherent is that process? It was really hard for me to know at all from the outside what to say or advise, so we ended up kind of having to drop that aspect.
Yeah, I think bagging is a particular skill, and so it has often been an issue for people who haven't, you know, on purpose enrolled that skill already in the event. So I think Brendan, though, can speak to that, the prospects for just moving things around with Archivers 2.0.

Can I say something about that before you jump to that, Brendan? This is Justin. So with the bagging, I think one thing that has gotten me caught up, and is slowing this process down, is the emphasis on this idea of "would this make sense to a scientist?" as a criterion for the bagging stage, and that sort of checking-bagging stage, which is pretty hard to do, in my opinion, at a single event, or if you don't have a specific kind of scientist or researcher or things like that. But if your role as a bagger is to validate the data that was harvested, you can go and use the Python library or the digital conservancy packaging tool and can validate that this will be the same six months from now when someone downloads it, and not have to worry about the sort of making-sense-to-a-scientist thing. That sets a really high bar, in my opinion, to declare this as sort of valid, and so I think that's where some of the bottleneck was. So if we can say that we've got this amount of data from this website, we are confident in what we received,
Let's bag that and make that available, while also knowing that there are folks working behind the scenes too on whether these are more, quote-unquote, authoritative copies of this data. And so embracing, as much as we can, the important but also, you know, limited nature of this: we can't offer the same assurances in the same way that a government agency who has this data set could, because we are not the federal government. But being able to still work through and bag that data and get it into CKAN, I think, is still the big push for me.

Thanks for that feedback, you two. Yeah, just in the interest of time, I think Brendan maybe could speak to some of the things about how some of this technical stuff could change in the future, maybe a little bit later. But I also wanted to bring in some of the event feedback that we've been receiving at the two community building calls we've had so far, from organizers who weren't able to be here today. I kind of made a mess of Post-it notes in front of me, so I'm going to try and bring these up and make sure I got them.

One of the big themes that came up, which I think I heard again today, was that each event is going to do something slightly different, and so maybe understanding that we can work with that and figuring out a way for them to customize it. Another thing that's closely tied to that is that events really spoke to the importance of anchoring it to local interests and skills. And I heard that a little bit today, about trying to provide a way for people to see how they could attend, because maybe there's something around the language, or identifying it as being very technical when it doesn't have to be. It seems like there's a big range of communities who are working at these events or who maybe get excited about this, and so this could be a way that they bridge across. I've heard a little bit today, but this came up in our last
call in particular: there's a lot of people who want to understand how the work they're doing at events fits in better with longer-term goals around this data and around what this community could be. And one thing I've heard a lot, in all the feedback that I've gone through and also in those calls, was that there's been a lot of enthusiasm from attendees, asking when's the next one going to be, or people who couldn't attend emailing and saying, oh, I missed this, what can I do now to help? And I can say from my personal experience on the Slacks that we're getting a lot of people joining, saying: there was an event in my town, I missed it, I saw it, I'm here now, what can I do? And so those are just some of the things that other people have been bringing up, but I don't know if people see any connections there, or something that they also want to speak to on that note. If not, I was going to start to move us a little bit more to thinking about some of the longer-term future stuff. But if anyone has a thought on current themes, or, I think we also mentioned a bunch of pain points, or needs that we need to address. I think some of those pain points in the chat, too, would be really helpful to have.

I think I'd like to just say one more thing about that, and that's visibility of the nominated URLs and things like that. That was one of the things that I heard at my event, too. There weren't as many seeders, and there were mostly harvesters, which was great. But those seeders, and also even some describers who did not have enough to do, were not sure which websites they should be checking. So they were asking me, is there a place where we know we can start looking? And, you know, we can just say that, hey, this is climate related.
So we're focusing on EPA, NOAA, and all that stuff. But which exact URLs or pages or parts of the site should we be looking at? So just that visibility would be great. I mean, of course, if we have more resources, we could use Archivers to recreate the site map of a specific agency website and then just have a very simple thing where it's a list of URLs, and it tells you when this particular URL was last nominated and when it was last saved, so that at least we have some sort of starting point.

Yeah, no, it's a great point for sure. I'm noticing some more stuff in the chat, which is awesome. So we're keeping right to our schedule, which is also great. I'm a little surprised. And I'm wondering if maybe we just want to give a couple of moments if people want to go grab a drink or something. I think some people said they might have to leave around now, before we move a little bit more into talking about local community efforts, thinking about people who've identified that they might want to host another event in their community, and then thinking about proposals for the future. So we'll just take a one-to-two-minute pause here, or let's say three minutes, and then we'll start up again. I'm going to pause the recording as well. And if people need to leave, this is the time you can also do so, but feel free to leave as needed.

Okay, so we're back from our break, and this is the second half of the town hall. We were hoping to really dig in a little bit more to the local community efforts, and I think just trying to connect a little bit and understand some of the motivations for what people are looking to do going forward, and then speak to some current emerging opportunities, and then understand what additional work,
Maybe, what proposals we want to put forward for additional work. So we kind of just covered a bit of event feedback, and I guess the first question, which would be really good to hear from event organizers, is: are they planning on hosting another event in their community? Maybe just drop that in the chat, just to get a sense of numbers, or if there has already been more than one event, maybe you could write that in too. And then while people are adding that, the question that I would love to put to you all as well is: what are the opportunities you see, and also the barriers, for this ongoing action? So in thinking about hosting another event, are there any opportunities you're seeing, or that you're going to take advantage of, or have you identified some barriers to an ongoing form of action?

I think one of the main barriers, at least that I found, is that people are interested, but they don't know how to start. There were at least a couple who went to my data rescue just to learn about the data rescue, so that they can do that in their area. So that's why we're doing the hosting segment, so we can get more people to learn about the process and then just do it on their own. For this time around there's three of us in the Chicago area trying to do this, and they're kind of looking at me like, hey, can you do this? And I'm like, no, I can't do everything. So hopefully they'll get more comfortable, and we'll get more and more people, even further west and further south and further north, to go do that. That would be really great.

Yeah, that's awesome.
I'm wondering if anyone else maybe has an example of an opportunity you're seeing about an upcoming event, if people have already organized one.

So this is Justin, with the Detroit event. Not being from Detroit myself, and also knowing the University of Michigan's more than somewhat checkered past and relationship to Detroit, I'm trying to be very careful and very community-partnership oriented in doing this event, and so reaching out to a lot of different groups who have been working on things, especially with housing data in Detroit. So thinking of this both as an opportunity to preserve this kind of data, but also having this function as a gathering for people who have this expertise, and to start developing the larger community infrastructure that may or may not be there in Detroit. I think that's part of figuring out what this event is for, as well as the seeding and the harvesting, which I think is a little bit easier to run as part of events, but also working with different libraries in the area to do the bagging and the metadata pieces. So being able to both bring together different groups that are already working in these areas, with local expertise, but also trying out some different community tasks like I mentioned before, whether that's metadata transcription or other forms of crowdsourcing, is something I'd really like to do in upcoming events.

Awesome. I'm wondering too, maybe speaking to that a little bit about partnerships, which I think is, yeah, what Brendan said, which is really cool. Thinking about it too, it seems like there's a spread across the types of activities which might align with what we've been doing so far. I'm wondering if other people have had that experience where they're seeing parallel efforts that they want to sync up with. I can speak to one that was brought up to me. I know some of
the people in Boston are working with Data for Democracy, which does a lot of parallel work, but it's not necessarily quite as focused on this sort of preservation of federal data sets; it's really about working with public data. So that came up as another parallel effort.

So I'm hooking up with the National Humanities Center. I've written about this in the chat already a little bit, but they want to, obviously, expand to do more agencies that are a little bit more humanities-centered. So we will still do environmental agency data, but we're going to send out surveys really soon. I'm in the Research Triangle, which is Raleigh, Durham, Chapel Hill, so we have a lot of universities here, and we also have a lot of industry here, tech industry and research and stuff. So we have a great community here, and we're trying to reach out. The National Humanities Center is very connected to all the different social science and that type of thing. So we're sending out a survey, based off of the survey you had, to all of them to get them involved, to say what data sets matter to them. And then a little bit of a hook is: hey, do you want your data to be rescued? Join us for our primer-writing event,
which we're trying to schedule during Endangered Data Week. So hopefully they come and then they're involved in the primer writing, and hopefully they'll come for the checking, you know, that kind of thing, to get the social scientists in there. And so that's what we've been trying to organize. We haven't finalized the dates, but it's going to be in June, for this larger humanities-based Data Rescue RTP, as we'll call it. And then we're hoping that some of the folks coming in there will train new people and create smaller ones, so that it can continue. There's Duke here, and NCSU, UNC, and other institutions can then start doing their own.

Another question I had was just wondering how people have organized remote things, because there's a lot of people who want to come in, especially around here, who were like, I can do this remotely, but I don't really want to come in for this. And so just knowing what modules are there for creating remote things.

And just one last thing I wanted to say about a roadblock. One of the problems I've had, being that we're a state institution, because I'm from UNC: some of the librarians were working, but they would do it on their own time. They couldn't volunteer library staff time, for fear of actually being fired for this being viewed as political. As you know, North Carolina has some issues right now, and they're worried about being fired for anything being interpreted as doing anything political during work hours. So that's all I want to say.

Those are great.
I saw that Toly wanted to add something that connected to your first point, in terms of partnerships or people to connect to, so let's stay on that theme for a second, and then maybe we can revisit the second point you raised. So Toly, maybe do you want to speak for a bit about your...

Yeah. So Andrew and I, a couple of weeks ago now, were down in DC for Sunshine Week, which is a week to celebrate open government, actually across the country, but I was in DC in particular. A lot of open government groups were there discussing what should be done during this transition and under the new administration. We were there and spoke and made a lot of connections, preliminarily, Andrew and I. For those of you that don't know, I'm moving to DC for a little while to explore the connections more, both across EDGI but also data archiving. So, very briefly, I think we'd be very happy to serve as a feedback channel with all those groups, in particular the Sunlight Foundation and the Project On Government Oversight. There are smaller groups like OpenTheGovernment. A lot of them are interested in the data archiving work and thinking about how to make that more visible and more well known. One thing that surprised us, but maybe shouldn't have, is that a lot of people have actually heard of the data rescue events, and people are like, oh, that's you guys, great. Everyone's really positive about it and wants to know where it's going. We even spoke with folks at the National Archives, and I think those are really good connections to have, thinking about direct channels from agencies to the Archivers app and things like that. Can we even circumvent some of these events and just get a pipeline of data going? And, you know, happy to be a voice over there and try to understand how to do that in the best possible way. I
don't... did I miss something, Andrew? You want to add to that?

Yeah, just on the pipeline. A lot of these organizations very specifically have, you know, actually a web page or their own internal thing that says, here's what's come down. And to be clear, the only thing that really has come down is this one animal welfare data set, although there's some vague misinformation about what's been coming down. But I think what's really important, and what a lot of people expressed interest in, is the work of data rescue and Data Refuge and EDGI becoming a very clear resource, and us really being able to directly inform them. So they weren't looking at us at all as a competitor, but as someone that they could really use as a resource, which is exactly how we present ourselves. So it would be useful, whether it's through an API or, even to start with, a spreadsheet, to have something that is directly piping into their website, so that they don't just have an ad hoc aggregation but can really systematically know what's changing.

Yeah, and just one more thing: there are actually a lot of people that have helped build the open data platforms under Obama, and they have thought a lot about these problems already. Data.gov is one instance of that, but there's a lot of other efforts.
And so I think we have a lot to learn from them.

Okay, so that seems like a big area, thinking about partnerships in many senses. I think there's been a lot of conversation about the remote stuff, and I think maybe we want to also check in on this verbally, because that seems like an important point. It seems like, simultaneously, there are some barriers there but also opportunities. I'm wondering if maybe Justin or Margaret wants to speak a little bit to the current state of remote stuff, because I feel like both of you have the best understanding of that. Would that be okay? And then we could connect that to opportunities that people have identified around it.

Sure. So, as I think I typed in here, and as Margaret mentioned, because of the emphasis on provenance and the reproducibility and the transparency of this, we really need people to understand the importance of that within the work before we let them have at it. I think part of this is also a security measure: we don't want to open this up to too wide of an audience that we don't know, for fear of losing things, et cetera, or sabotage. So I can see two ways to do this, or it could be both simultaneously. One would be to do a module or webinar that we host on a regular basis, weekly, that would cover this, but then also would have a checking component for that person's work, to make sure that they've followed our standards. Our standards are very clear on what kinds of things need to be followed, and we have people who go through and check that with the particular data. There's also the thing that was brought up earlier about having the verified/unverified section of the CKAN Data Refuge instance, where, you know, if this was done by someone who is not part of an
event or hasn't met this criteria, this is, you know, something that hasn't been checked by an official person. And there was one more that I was going to say. Oh, building this into the instructions for the app itself: you know, we need to see X, Y, or Z things, and having that structured sort of like it is now, but even more so, where they're checking boxes that, you know, it's this kind of format, this is going to be in this kind of file format, this is the kind of information we need you to say in the notes-from-harvesting field or the bagging-notes field. And giving more structure to those fields, rather than just leaving it up to that person who's doing it at that moment and hopefully having some guidance, but really structuring that without, hopefully, over-structuring it. Margaret, were there things you wanted to add?

So right now we get a lot of people who email us and want to help remotely. What we have now is, because the workflow does say to use this form if you want to describe or bag, to get the permissions, people have been filling that out. If they're at an event, then we really just do it there, because the admin at the event can give them those privileges. But we still want them to fill out the form, because we want to know who they are and we want to know their names for that whole chain of custody, knowing who touched this kind of thing. And then people who have gone to events and done the process and seen the whole workflow and understand it, we've been letting them continue to do that work outside of the event that they attended. And then people who want to harvest or research who have reached out, we'd let them do that, because that's the less critical piece of the validation part.

Great.
Yeah, so I feel like what both of you surfaced, and what was brought up in the chat, was that it connects to some of the needs we have in the current workflow, but there's also some opportunities, and it goes back to some of the stuff that Justin talked about, thinking about what citizen science approaches could be used around here. Just being mindful of time: on the needs front, we have some stuff we could think about for this, but there's also some opportunities that come about with in-person and remote both being approaches we could support. I'm wondering, from everything they've heard so far, do people maybe want to speak a little bit, because I think it'd be great to start to think about the future, in terms of what would it mean to succeed at this effort? What would success look like to us? What would inspire you to be more involved with the community? What are you hoping to gain or contribute as part of your work on this? And then have that help us understand which opportunities we should be focusing a little more on, or which needs are more critical to address immediately. And again, you can add into the chat or chime in.

So if I can say something that's been on my mind a little bit, related not exactly to what you just asked for, but I think in the right direction: a lot of this work has been done by volunteers, and it's hard to sustain that over the long term.
And so one thing to think about is how to support people better long-term: funding or, you know, paid positions or something that the community needs to keep this going. Because I know some people are already burnt out, or have said that they're feeling burnt out. And I can't believe that some of the ones among you who I know are still, like, ridiculously high energy. I don't know how you've maintained that level of energy over the last several months. But I think it's a problem that seems to be unsolved at the moment. So that's something to think about: how to keep this going long term, and what are the mechanisms to support and fund people, basically. I'm blanking on the right words for that, but I think you understand my point.

Yeah, no, I think it's a great point, and I would say I also see a big need to try and understand what sustaining this looks like, but also making sure that it's supporting people feeling like they're gaining from and contributing to something, and understanding what that shared something looks like. And so as we move forward, trying to make sure that we're building towards shared priorities and a shared vision. John, do you want to say something?

I did. How'd you know? That's pretty awesome. I think another part of this, and it kind of goes to what Mike was just saying about this being a volunteer effort, is the value proposition for all this hard work. So, you know, data archiving is hard. It takes a lot of time and effort, and what we all want to know is that that data is going to be used by someone in the future, that this is important work and we're preserving this data because it's going to be useful. And, you know, right now, I think for this effort it's hard to know, right?
We're certainly all very scared that the data is going to go away, and we're going to be really happy that there are other copies of it. But it hasn't happened. My understanding is that in a couple of instances this has happened, but it hasn't happened wholesale. And so understanding what the most vulnerable data sets are: like Justin mentioned that this movie data had 15 copies somewhere, and for some other data sets that have been focused on, maybe there are no copies. Some are backed up somewhere, and for other ones maybe that's not the case. So really understanding what the value is going to be is an important part of sustaining this over the long term. And yes, the budget is certainly a big concern. I didn't read the rest of what you said, just a bit. That's certainly enough for me.

I think that's actually a really significant point that you raised, John, and I've been thinking that way too a little bit: long-term, what do we want to accomplish? One of the things I sometimes say in my stupid media appearances is that this came to us as a crisis, but it also provides us with an opportunity to rethink how we organize the curation of scientific data as a society over the long term. I think Brendan really believes that and is thinking about what the technical infrastructure is with which we can replace the kinds of data cultivation or data curation infrastructure that we have now. And I would like to think that what we get out of these events is, to some extent, a movement to transform the sustainability of scientific data.

Yeah, I'd piggyback on that.
I think that this is a movement, but that's me, and I think that changing the relationship that people have with data is a really important aspect of this. I think that we think of data as something that smart people have in a corner, and it's not something that regular humans touch, whatever that means. And I think that's part of a cultural shift that we should be trying to think about motivating. Something that I have seen at every data rescue event I've had the pleasure of attending is people interacting with information that they never touched, or never had any chance to see, or never knew existed. I did not know about weather buoys in the middle of the Atlantic Ocean or what they did, or what volatile organic compounds were, any of these things. I've learned a bunch of stuff just as part of these events. And I think that's the beauty and the joy of a lot of people coming together and enjoying each other's company and really participating in cultural shifts of our attitudes towards this information. And so I couldn't agree more about the notion of upping the way that we interact with this data, and really knowing that use propagates the existence of this stuff.
And so if we approach it from that aspect, as Matt said, I'm deeply excited about that part of this process, and I think that the more of it we do, the better.
Yeah, one thing that's always really stuck out to me from the first data rescue in New York was that a majority of people came there because they said they wanted to bring their skills to contribute to something meaningful. They saw data rescue as a way to respond to this concern they had about the ability to access this data, and through that they got an opportunity to learn about this data. And so to me that's really been a great motivating moment to see. And I'm kind of assuming that some of you have similar stories about a surprising or really key point that you noticed about what you see people gaining from or contributing to this. Andrew?
Yeah, I just want to say, I think what you just said, Don, and what Brendan was saying are really important: that this will be part of building a community and a movement that will learn how to show data to everyone, and not make this just a little group of technocrats. In the short term, as you said, John, there hasn't been a lot of data that's gone down. But I think what has actually been really important, in addition to already building up such a community that's interested, is that between the archiving and the monitoring work to actually check what's gone down, there's already an enormous political cost to taking something down from the web, from the .gov domain, right now. We've heard that from journalists.
We've heard that directly from congressional staffers who we've spoken with, who have said, yeah, people talk about the risk of an agency doing that right now. And moreover, in our website monitoring work, we've actually had journalists call up an agency after just a paragraph was removed, in some cases an Obama-era paragraph. In one case it wasn't really that important, and that agency said, oh, that was a contractor who made that error, and then they put it back up an hour later, because they actually felt there was a big concern that they would be written up in the news. So this work has been, in some sense, I think part of why we haven't seen some little things removed. And I agree, I don't think that there was going to be this big purge, so it's not like I think we saved all data. But I do think that this is the beginning of those steps, and of integrating a community that cares, like y'all are saying, with a community that knows that they're a big part of the resistance to these changes actually happening. I think that's what we're serving to do right now. Hopefully in the long term we actually create all the things we want to and make data accessible as well, but in the short term there's still value that we're adding. I just wanted to add that.
So just in the interest of time, if maybe one or two more people who haven't spoken up have something, that'd be great. Otherwise I have maybe one more prompt before we start to think about proposals for how we actually do some of the stuff that we're all excited about doing. So yeah, if anyone hasn't chimed in so far thinking about these efforts, please feel free, either in the chat or here. And if not, maybe one point that has come up a couple of times in the chat, and previously in conversation around partnerships, is: what voices are missing from this conversation? And are there people who are doing parallel work?
Or doing exactly this work, that we haven't spoken to and that we should be? Are there people we should be trying to include in this conversation, or reach out to, to help push some of these conversations forward? I think we've gotten a lot of great suggestions just recently from Patricia; I know Justin has also made a couple. So I'm hearing a lot about citizen science. I think a lot of people are speaking to local groups in their city or region that they're also connecting to, which is great, to have it be local, but I'm wondering about a national scale as well. Yeah, so civic tech groups, as someone who goes to the civic tech group here, and that's what got me into this. Yes. So are there any missing voices, or people we should really think about reaching out to?
Maybe some additional types of data. We're emphasizing a lot of environmental data, but there is population data, like census data, which is also very politically charged. There is drug data, FDA drug testing, etc. I mean, there's a bunch of things like that that I haven't seen too much on the radar. A little bit through the primers that totally and Maya and Andrew have been putting together, if some of the agencies have touched on that. But I think there's a whole slew of additional things that are at risk, which may not actually be getting a lot of voice here or in other efforts, because we're not touching those people or that data.
That's a really, really great point. And then Patricia also mentioned in the chat that maybe we're not coordinating as much with groups who are doing the same thing, and so that's something we want to think about. Also fence line communities, totally. Union of Concerned Scientists. I want to give maybe just a couple more minutes if people have thoughts on this. I know one thing that's come up on our side.
Oh, sorry I'll just finish this lot and then you can go Andrew Is really making a concerted effort to reach out to the web archiving community and try and understand Like you know what we can can learn from existing efforts and sort of how the challenge that we have like we have There are some challenges that we're finding that I think are ones that web archiving has Maybe not dealt with as much previously. And so what the opportunities are to work together around that and Then Andrew, I just want to say yeah, I think that like there's so many groups You know that work with fence line communities And then there's groups of people in fence line communities that have you know created You know, we shouldn't just think of people working in fence line communities out there that they themselves organize I think what we probably need to do as a community here is try like a couple a couple of just pilot Sort of runs either going out to communities, you know sending actual people from our community out Or whatever it is, you know the union of concerned scientists public lab various other groups work with these Communities and I think it's like there's so many people that would love to like come get a demo and actually have their own data go into our pipeline and then Understand how to visualize it and understand how to describe metadata So I think it's it's very hard to think of it as like let's do this with all fence line communities But if we just start thinking about a few that we know then maybe it'll be more More practical Awesome. Yes. Yes, just in the interest of time We were kind of thinking at this point to shift into discussing a little bit the emerging opportunities that are happening right now and then maybe identify some Areas that we want to begin to work on to sort of move this stuff forward. 
I just wanted to reflect a little bit on the pain points and the needs that have been identified in this call, and also some of the other themes that were brought up before and just now. From what I'm hearing, there are a lot of needs around improving aspects of the event workflow. It seems like there's something around onboarding that we should really be thinking about here. I'm also hearing, sorry, I've been taking notes and pushing them over, so I'm going to do some side-looking. I think I'm also hearing some stuff about understanding what our shared goals are, and what shared goals could be with partners, and thinking about ways that there could be some communication or sharing across events. I know Sarah mentioned the module that got developed there. So there are things happening at these local events that could be of interest and value for other people, and we need to figure out a way that that can be clearly communicated. And then I'm also hearing this issue of broadening who feels like they could belong in this community. There's some sense in which people see it as a technical activity, but it doesn't have to be that, and it hasn't been that. And then I guess the other big thing I've heard was other types of data. So thinking about broadening not just the people who could be involved, but the types of data, or making people aware that this model may be applied to more than environmental data.
And so, moving more into proposals for what to do next, there are two emerging opportunities that are already ongoing: one, the Libraries+ Network from Data Refuge, and then an iteration on the web archiving process. Those are currently on the go, and so we're hoping to speak a little bit about them and then see the additional areas that we want to identify. So the
additional concerns that aren't met by those proposals, if they've come out in this call, and then what we can do to address those. So Margaret, are you able to speak to the Libraries+ Network now?
Oh great. Sorry. Yes, the Libraries+ Network kind of addresses a lot of the issues that a lot of people have been bringing up. What we've been thinking about, really since the beginning of these events, is this: as I said, the Data Refuge part of this effort is coming from the libraries, so we're thinking about it a lot from that perspective, and we've always felt that backing up federal information and making it accessible has always been the work of libraries. It's what we do. The Libraries+ Network is envisioning a reboot of the Federal Depository Library Program. If you're not familiar with that, the idea was that back when more things were printed, they were sent out to these federal depository libraries, so that there would be copies of them distributed across the country to improve access. But now, as we all know very well, most government information is born digital, and that stuff is not going to libraries. We've all just been letting it sit on these federal servers. Some of it's backed up, but not all of it, and it's not backed up in any systematic way. So we're working with libraries, the open data community, the government community, people within government agencies, all kinds of stakeholders that we've been talking to, to think about this problem and really identify what the smaller problems within that giant problem are, so that we can start thinking about a way to systematically and sustainably keep this data backed up. So that's the Libraries+ Network in a nutshell.
And can you kind of speak to, I'm sorry.
I said I had this as a note prompt and I failed. I was going to ask you and Brendan to speak to immediate next steps, or where people should go to find out more information and ways to be involved.
Yeah, so the big thing that the Libraries+ Network has coming up is this meeting that we have in May with many of these stakeholders that I just mentioned. That meeting has a cap on attendance and it's invite only, so that's already happening. We're trying to be as open as possible. You can go to libraries.network to get more information about what we're up to; we try to keep the blog up to date. Because this May meeting is becoming a pretty big deal, and we know a lot of people want to be there who can't be there, we are doing a number of different webinars to keep people updated, and we're also speaking about it at many different places. All those speaking engagements and webinars are now on the website, if you want to check out how you can hear more. I can't type and talk at the same time, but I'll put it in the chat.
Awesome. And so, given the time we have left, I was hoping we could do both proposals and then have a little bit of a conversation about that. If there are any specific questions, we could have them in the chat, and then see what other concerns we want to be immediately addressing. So Brendan, are you able to speak a little bit now to the iterating on web archiving?
And I know it's kind of been hinted at or mentioned previously in the call.
Totally, yeah. And if you guys don't mind, I'd love to just demo a really quick thing that hopefully speaks to some of what we've been up to. Basically, since we left the spreadsheet territory and built an app, we immediately recognized a whole bunch of things that could be improved. I think a lot of it was covered in this call today, actually far more clearly than it has been articulated before. It's amazing how much having an event organizer tell you face-to-face what could be improved is just invaluable. The biggest things we were noticing were that the technical hurdles are pretty high in the current app, and that we're missing an opportunity in online participation, and in really trying to drive a simpler ask from volunteers, so that it's more direct. In a nutshell: here's the thing that you can do right now that would help us greatly. And while obviously there will always be a number of things that we're doing, hopefully we can get everybody in the door on some of that stuff. So with that, we've been working on a proposal, and I'll just do a really quick demo of what some of this workflow could look like in the future. All of this is just a proposal. The working title is just Archivers 2.0, and the big ask is that we're really looking for feedback on this. We'd really like to develop this into something that the community finds deeply useful. So with that, I'll share my screen quickly here. Hopefully this works. Share screen. Cool. Everybody see that? Oh, you can't talk because you're all muted.
I hope everybody can see this. So the way this works in the future is we've changed a couple of things. A lot of what I'll show you is in addition to our current workflow, and has everything to do with attributing metadata to individual URLs. Volunteers would show up to a much simpler ask, like, hey, we need you to archive some data, and then we take you to this kind of screen, with these little progress bars of where we are on individual, I'm calling these sources, which are just the individual URLs from primers. So if we look at the actual primers list, we have now ingested the actual primers that everybody has been putting this incredible amount of work into building. And we can grab the information, so we can get acquainted with a resource and understand, okay, this HAPs website needs data. And so, getting back to a piece of content that needs metadata. This is in the background.
Oh, yes. And so it's worth noting that we're now automating a lot of this process, so that people don't actually download and upload things as much. Instead, what we're asking them to do is participate in describing, qualifying, and contextualizing this information. So if we click through to information here, we've got a piece of content we don't know anything about. A volunteer would hit download, they would look at the file that they're seeing, and then they'd actually have URLs that link to this. To the point from the Chicago chapter, these are versioned, crawl-specific URLs, and you can see snapshots of the URL and how it works. But then once we add metadata, this looks a little complicated, but it's actually derived directly from the data.gov spec that, as was mentioned earlier, we think is a great spec that's been developed by the United States government. This interoperates with CKAN instances, with, what else, with Dataverse instances. It works in a lot of different areas. And so it's been agreed that this format would be a great starting point. We've been in talks with some other projects that are also engaged in archiving, and they feel like, if we had this coverage of every file, we'd be really excited to see that. So an individual user can come in and submit metadata all they like. They can fill this out. Most importantly, once I do fill it out, it's actually attributed to me as a person, so I can see the metadata that I've added. And hopefully in the future we could use this to build a better reward system for when people actually write in and take the time to volunteer. So if you came to somebody's personal page and you saw this list of achievements and accolades for the metadata they've attributed, I think that'd be a good thing. But yeah, and so behind the scenes we have
that's kind of it for the demo. Oh, yeah, I should show very quickly that we're actually working across URLs, so we can search for, like, EPA and we get the actual website. We're crawling and traversing the actual pages, so we can go from a page and actually see the links to and from it. Currently this one has no links, so that was an awful example, but anyways. Or, how do I stop this share? Here we go. Yes, so that's just a quick demo. Basically, the hope is to make it so that somebody can show up online and just start contributing this metadata to a place where we could start to coordinate with other services, because we're now working across content as well as URLs. We could use this as a starting point and then distribute to as many potential partners as possible. The hope is to really up the value of a contribution that a volunteer makes by having it go to as many places as possible, and also to reduce the technical overhead that is needed for a user to get to contribution. And the hope with that is we would automate a lot of this with web crawlers, and then we'd be building a whole new area for very technically motivated users to write and submit scripts, instead of actual downloads and uploads, that we could then rerun in the future, which we think would be a much more robust method of archiving hard-to-reach data. And all this is to say we could then use that to distribute to lots of different places, which we think would be a really great source of authority. So, like, by pushing to Data Refuge: when a bag hits Data Refuge, you have a much higher degree of rigor, as Martin mentioned. But having the libraries, and anybody who is a data repository, verify and be able to grab this data. And they would
be able to individually see contributions based on users, and factor out any content that they thought wasn't valid or wasn't contributed by a trustworthy source. And so yeah, that's kind of where we're hoping to head. We've been building an API. We've been building a whole number of services to try and spec this out as a proposal. Obviously, this is a lot of new thinking, and the big hope is that we can start to understand this as a community and really get feedback going.
Anyways, one thing you didn't talk about was maybe slightly more technical, but I think it touches on some of the stuff that Justin was talking about, about verification. The use of cryptographic keys as a way to associate any work with an individual is, I think, meant to be a technical intervention that reduces the overhead for the provenance work, in part. It would also allow, for instance, something that entered the whole huge system not through us but through Climate Mirror to enter the process kind of late and be signed by some trusted individual, say at Penn. Say a trusted librarian at Penn signs and describes some large piece of data that was downloaded by Climate Mirror; then its authority is no less real than something that went through the process solely through us. So to me, that's an exciting thing: it's flexible, but also rigorous at the same time.
Yeah, and to that point, just to add to what Matt was saying, this opens up some possibilities for us.
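[Editor's sketch, not part of the call.] The signing idea described in this turn, where a trusted person vouches for a piece of data no matter who downloaded it, can be illustrated in a few lines of Python. Everything here is an assumption for illustration only: the function names `content_hash`, `attest`, and `verify`, the record fields, and the choice of SHA-256 are all hypothetical, and HMAC with a shared key is swapped in as a stand-in for the public-key signatures a real system would more likely use. It is not the proposed Archivers design.

```python
import hashlib
import hmac
import json


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest of the raw bytes (assumed hash; identifies the content itself)."""
    return hashlib.sha256(data).hexdigest()


def attest(data: bytes, description: str, signer: str, key: bytes) -> dict:
    """Build a record saying 'signer vouches for this content and this description'."""
    statement = {
        "content_hash": content_hash(data),
        "description": description,
        "signer": signer,
    }
    # Serialize deterministically so signer and verifier sign the same bytes.
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    # HMAC is a symmetric stand-in for a real public-key signature.
    statement["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return statement


def verify(data: bytes, attestation: dict, key: bytes) -> bool:
    """Check that the data matches the hash and the record was not altered."""
    statement = {k: v for k, v in attestation.items() if k != "signature"}
    if statement.get("content_hash") != content_hash(data):
        return False  # the bytes in hand are not the bytes that were vouched for
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, attestation.get("signature", ""))


if __name__ == "__main__":
    key = b"hypothetical-librarian-key"
    data = b"station,temp_c\nA1,14.2\n"  # made-up sample bytes
    record = attest(data, "Sample readings", "trusted-librarian", key)
    print(verify(data, record, key))         # True
    print(verify(data + b"x", record, key))  # False
```

The point of the sketch is that the check depends only on the bytes and the signer's key, not on which pipeline the data came through, which is the flexibility described above.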
We could do Mechanical Turk style triple-check requirements, where you can say, look, we need to have three people review this before it's even considered remotely finished. Also, you were seeing a lot of really crazy long strings of letters and numbers in that demo. Those are called hashes, which are digital fingerprints of the content itself. And so that's what, as Matt was mentioning, lets us do a lot of our deduplication work, and also allows us to verify at a computational level that we're all talking about the same data. On top of that, it opens up the doors to distributing copies of this data all around the world. We can keep duplicates all over the place and associate metadata with these long strings of letters and numbers, and that helps us couple things together later on and really know that we're all talking about the same thing. A lot of this has been based on a ton of feedback that we've been getting in talking with people who have been running events. I think being able to have a really solid online contribution arm of this is where we're hoping to go. So hopefully that helps. I'm happy to answer any questions as well.
Yeah, just in the interest of time, those questions might be something to push to, like, maybe a Slack channel after or ongoing. But just to understand: I saw you answer to Patricia that you're maybe looking for people to test it. I know the coming month is going to be a big month for understanding whether this system is connecting to all the needs that are coming out of the existing pain points identified around using the current Archivers app. So could you maybe speak just very briefly?
To, like, how people could become involved in that?
Yes, I'll drop a link to a GitHub repo of our proposed services, and we would love, love, love if people could read through that if they have the time. And if not, that issues queue is where we're really hoping to centralize some of the feedback around it. Is that good?
Yeah, I think that's good. So coming out of both of those opportunities, I'm wondering if we may want to just check in and see how that fits the proposals or thoughts that people have had earlier, and whether there's additional work that we want to identify here as worth doing to meet those identified needs. And Matt, I think you have a point.
Well, I think one point that John's point from earlier, that I responded to, also raises is whether data rescue is the most useful rubric for the work that we are trying to do moving forward. If we think that what we're doing is rebuilding the infrastructure for data, I'm not sure that's the same thing as rescuing data. And so there's maybe a broadening of the scope that isn't captured by that name. I don't know if people feel that I'm right about that.
My thought on that is that data rescue doesn't have to be limited in scope to just what's happening with the Archivers app and the primers. We've really been open to calling it a lot of things. A lot of events have done a lot of different kinds of workflows, or haven't done any archiving and just stuck to educational activities, or have focused on other areas.
So I think it's really it's not about saying like oh data rescue is it's over It's more about we need to redefine what it is Okay, so that's maybe a thing that's Is a big is a big point that we want to speak to is maybe Understanding or redefining what data rescue or like what events are going to be Because there are going to be future events So that to me seems like a big area where There is sort of maybe more work to be done I know coming out of today. We've heard a lot about sort of like some of the the learning opportunities sort of the Not only archiving, but maybe how people get to interact with data thinking about storytelling And a lot of this other sort of Other efforts that people see you being complimentary to what's already happening. So To me that seems like an area that we don't have something Like right now about so we maybe want to think more about how we can sort of Think through that and what the right structure would be One thing that I'm going to just pull forward from before that's also come up is partnerships So I think maybe thinking a little bit strategically about What those partnerships could be sort of thinking about Oh, sorry, I just like lost my I lost my train of thought because I looked at the chat But also like connecting those to the themes that kind of have come up out of events I think that there's more work there to think about like what a strategy around some of those partnerships could be And that's not maybe something that we can answer in like one day, but we want to be spend some time thinking through I mean also in addition just I think there is I think we're hearing and and then throughout organizer feedback. There are some needs About some of the current stuff that we have. 
So there has to be, I think, some level of working with that feedback to address some of the needs that organizers are giving us. And I've been keeping those notes throughout here, and they're in the notes from today's chat, but I wanted to see if people think those make sense. Are there additional concerns that aren't being addressed? Is there stuff that we've missed that we should be thinking about?
There's something on this topic. We're saying we don't need to necessarily just do data rescue, we don't need to just do archiving, and then there's the other side, which is that events can be about education. I think there's probably an intermediate, where we can still have something like a pipeline, but the pipeline has visualization tools and it focuses on data description, metadata description. It's still something like a pipeline where everyone feels like they're contributing to this overall cause, but it's not just to archive data, and it is in and of itself an educational piece. I'm saying that because I think concretely we could envision a future Archivers app, but it's not called the Archivers app, it's just called the data app or the community data app, whatever, that has an archiving component, but maybe once we're done archiving, or we don't value it as much anymore because we feel like it's saturated, it also does these other things. But it really is still a concrete app, because I guess my worry is that I do want to work towards something concrete that allows community building to continue happening in the same way that this app has facilitated it, but that doesn't just focus on the technical archiving aspect.
I think that's a great point.
Um, especially in relating that data to people In I mean if we do our jobs properly, we're going to have all of this data Um, all of this climate data all in one place essentially So we can leverage that data to create even more comprehensive Maps of like sea level rise or like have that all that data in one place that is a little very relatable Okay, so, um, just also thinking about uh time and like, you know a two hour call does not solve everything I I'm kind of hearing just to kind of circle back around. Um that uh, maybe we want to uh This is the model I got I feel like maybe a working group So like I think we want to like connect to Sort of having some people who want to think about the partnership strategy side Have some people speak to um kind of I guess the the vision goals Like vision goals, um To use mics vision goals, whatever, but like it's community building aspect and then also Yeah, and so that's a good point So like how do we do this as a way for people to maybe scaffold their own stuff out of it instead of Being prescriptive for sure. Uh, and then also I think maybe A way that we could uh at events address some of this existing feedback that we've received Um, so I'm wondering if that kind of sounds good to each other to people here like so maybe we want to have a couple Groups that in the in the coming weeks We'll kind of investigate this stuff and we could aim to have a Keep going with our weekly community building calls But have those more be almost like working group checking calls on these three areas And then come back and have like a town hall Again, but maybe like a shorter one. Just as like, uh, we've had some people go and sort of like investigate and try maybe propose Like a community document that answers like, uh, you know What what could be the vision and goals for this kind of work? 
And then like maybe a document about uh, here's a strategy for some of these new partnerships identifying people and then also, um You know have been doing some of the ongoing improvement to event plans Is that I'm seeing nods I think it would be really good to just do a bit of a check-in and we can just do it by chat and we can put a a thing in the The channels after so you don't have to commit today But could maybe we we check in and say who's interested in what so are there people interested in kind of that? community building vision goals Work and kind of exploring what that looks like here. And if so, maybe you could add like a plus one Or your name or say i'm interested in the chat or put your hands up or something So i'm seeing a few okay cool, uh And then um And this could be something that we'll reach out to you after and connect on I'm wondering if uh, there are people who are interested in more of the partnership thinking about You know communities that are missing strategic opportunities and ways that you could connect to current groups and how that those efforts that are already similar to You know the stuff that's been going on at these events I'm seeing some hands cool. We have interest And then maybe the last sort of area also being the events Like sort of addressing the needs that have come out of this call in the last couple weeks of community building and necessarily doing like a deep dig back through A lot of the feedback we've received and so um, maybe like, uh, I don't know what to call that. I'll call it a future of events So I'll have that maybe as a call. Are there people who are interested in that? Um, yeah, so it's a good point. Um Maybe my could you drop some names? 
My names are terrible in the chat. And Justin's saying "interested in all the things," like, I don't really see these being super separate from each other. So I'm going to propose as a working model that we keep doing the community building calls on Thursdays, but really frame them around checking in on each of those three areas. That's what the call is for. Of course, anyone can join; it's open at any point. But then we aim, over the arc of a month, to develop a stronger sense of each of those. So maybe have a deliverable, and we can spend the next two days identifying what that deliverable should be for each.

And then, if you're interested in a couple, I mean, I don't see there being a reason why you couldn't be involved in all of them. But really just being mindful of volunteer time and keeping this sustainable, it might be great to... yeah, so, plus a Slack channel for each working group, that'd be awesome. But it might be great to not commit to all of them, but aim to be a part of one, and then through those community building conversations every week get a way to check in and offer feedback and help push all of this forward. I'm seeing nods. I'm wondering how people feel about that. I'm seeing thumbs up, multiple thumbs up. I will take them. Poof, I got my thumbs too. Great.

Okay. Are there any parting thoughts that people have? Maybe just open the conversation now. Thank you all for going on this two-hour call with me and each other. But yeah, it's there. So I'll maybe be pinging some of you, but, I'm sorry,
I'll be pinging you just as a reminder about this. And so you don't have to commit to a working group today, but in the next couple of days think about if there's an area you want to be involved with. I mean, you came to a call like this, so I feel like you have lots of ideas. They were amazing, and could speak to each of those areas. And I think it'd be wonderful to have you help set some of those directions as we try to unpack and understand what this could be.

Okay, then I will also be saying goodbye and letting you all go roughly on time. So yeah, thanks everyone for joining, and I'll see you all in Slack. We'll be pinging just with a reminder about the groups, and then, for those interested, we'll have a short call on Thursday.

Awesome, thanks a lot, Dawn, everyone. Thank you, Dawn. Thanks everyone for coming. This is awesome. It's always so exciting to hear from organizers. Yes, let me talk to you guys. Have a great day.