So at this point, we're going to move on to an update on what the SNAC Cooperative has been doing over the last couple of years. I have been following this work, which continues to be fascinating and exciting and continues to grow new collaborations and new dimensions. I'm delighted that we have with us today Daniel Pitti and Joseph Glass. Daniel is well known to many of you for his work in many aspects of archives and special collections. Joseph is a little bit newer to all this and is the technical lead who's been moving along all of this SNAC technology over the last couple of years. Thank you both for joining us, and I will turn it over to you.
Since we're starting slightly late, I'll jump into this, and Joseph and I will attempt to go through these slides expeditiously and leave a few minutes at the end for questions. Okay, I'm trying to get it to advance and it doesn't seem to want to. There. Okay, so as Clifford introduced us, we can skip by who the presenters are. What I'm going to discuss is the SNAC community and some of the current editorial and content activities. I'm emphasizing the social aspects of the SNAC Cooperative, and I'll introduce a few of the technical projects that we're working on, but Joseph will go into those in a little bit more detail. For those of you who are perhaps not familiar with SNAC, it is a cooperative, as its name suggests. I'm getting things out of the way so I can actually read. It's a community of cultural heritage professionals out of libraries, archives, and museums, sharing the work of describing persons, organizations, and families. Most of these entities are historical, but quite a few of them are contemporary or recent. We describe those entities as well as the historical records in which they're documented, so that overall it is creating a vast social-document network. So it provides integrated access to distributed historical records.
In addition to this, it provides access to the social networks within which the people, organizations, and families exist or existed. I can get it to advance one time, and then I hit the same key and it doesn't. Hmm. It's hard to be expeditious when it does this. On the cooperative's social structure: there's an administrative team that consists of myself, Joseph, and staff at the National Archives. Joseph leads the technical development team, with assistance at the moment from Jason Jordan in UVA Library IT. We have an operations committee that oversees all of it; it consists of the administrative team as well as the working group chairs, and a key part of the cooperative is these working groups. Probably the most popular of the groups is Editorial Policy and Standards, and a lot of what I'll emphasize will come out of that group. There's Communications, and there's Technology. Research and Reference is a recent addition to the governance. This was to compensate for the fact that the majority of people who have been involved in SNAC come from the technical services side of things rather than the public services side, and we wanted to help balance that out. And finally, there is the SNAC School, whose director is Jerry Simmons at NARA. There's a bit more about that. There's now a large number of trainers, and what they train people to do is edit the descriptions of persons, corporate bodies, and families in SNAC, and also the resource descriptions. They offer a course once a month. We've now developed a new course, directed at reference professionals as well as researchers, on how to use SNAC for research. Just a quick overview: we began in 2010 as a research and development project, and we began immediately planning to turn this into a permanent resource. We began doing that in earnest in 2015, and we are in the final steps of doing that.
Current numbers: there are over 3.5 million descriptions of corporate bodies, persons, and families. You can see the breakdown there for each category. There are over 2 million descriptions of archival resources. Sometimes a resource is a collection, or it can be items, but more often than not it's collections, so the ultimate number of items represented by that 2.1 million is actually several orders of magnitude more than 2 million. In terms of the social-document links, there are 7 million links between the 3 million CPF entities, as we call them, and the resource descriptions. And then for the social relations among the CPF entities, there are close to 7.3 million. We currently have 57 member institutions, and we're averaging over 101,000 visitors per month, which has greatly increased over the last 18 months or so. An observation on the numbers: the number when we got done with the research and development build was actually larger than the numbers that I just gave you. The reason behind this, speaking of identity as we heard in the last session, is identity resolution: trying to identify people and pull together descriptions that were for the same person, corporate body, or family. This is a really, really difficult thing to do from an algorithmic point of view. And I emphasize false negatives, because once you wrongly merge two different entities into one, trying to separate them back out can be, well, a nightmare. But we have been doing a lot of merging of duplicates based on human review, and OpenRefine is really a great tool for this; more on that in a bit. So overall the number of descriptions and relations decreased. But, again referencing OpenRefine, we're about to resume ingesting large batches of new data, and in fact we have ingested some using the tool that we're developing.
In the community itself, there are the usual things you expect archivists and librarians to be devoted to. But one of the things that's really emerged, and a lot of it has to do with the times that we're living in and the broader context within which we're doing this work, is a really strong emphasis on ethically performing that work. There's also a recognition that historically our archives and libraries have tended to privilege the records, to put it bluntly, of white males at the expense of everyone else, and so there's a real keen interest in making the people, organizations, and families in SNAC representative of mankind as opposed to a segment of it. As I said, there's a real strong emphasis on ethical description, and one of the key things that we've spent a lot of time focusing on and discussing is demographic description and categorization of people. It is useful to scholars, who quite often are focused on particular demographic groups, but classifying people can also be abused, and historically has been. What we're attempting to do is find ways to do it ethically and thoughtfully, based on evidence, and a major emphasis in this is that ethical description needs to respect how people identify themselves. So, referring back to the editorial policy working group, there are a number of things that have come out of that group. There's the overall editorial ethos statement; this can all be found in the About SNAC, Editorial Policies section of the website. There's an ethos for caring editing, which really has to do with the editors respecting one another: not destroying the work of someone who knows more than you do, in particular in terms of self-representation. There's an overall statement on demographic classification. We're still wrestling with some of the demographic categories.
And there's a proposed policy and editorial guide for Indigenous entity descriptions in SNAC. Many of the projects, as I highlighted above, are about making what's in SNAC more representative. So under the editorial standards group we have a subgroup on enslaved description, working on best practices. This will also lead to some changes in the technical infrastructure of SNAC. Also coming soon is an Indigenous description subgroup. Again, self-representation, communities' representation of themselves, is a big focus of this. We had an Indigenous edit-a-thon in October of 2021 with over 70 editors working in SNAC creating new entities. Twenty of the editors came from Indigenous communities, many of them newly trained, and we're working on developing editorial training that is specifically aimed at Indigenous description editing. And then finally, in terms of representation, and Joseph will have more on this, we're developing something called light SNAC, which is form-based entry for small or under-resourced repositories to contribute data to SNAC with minimal training. This is really another major focus; the community feels very strongly about wanting to lower the bar and make sure anyone who wants to participate can. So, on the technical development side, these are some of the major things that we're working on. We have a web service for batch extraction of descriptions of corporate bodies, persons, and families from existing EAD descriptions. And we're well along in developing an OpenRefine SNAC plugin; Joseph will say a little more on this. I just wanted to point out that with the OpenRefine SNAC plugin you can get data in by other ways than just from EAD: if you have a database with data that's roughly compatible with what's in SNAC, you can map it in and use this tool. We're also developing a plugin for ArchivesSpace, and that's well along in development.
And again, there's light SNAC, which I referred to a moment ago. In addition, we've completely revamped what's called concept vocabulary management in SNAC. It is multilingual, and it leans a lot on outside authorities, but it also has the ability to curate different terms. In particular, for example, there are slavery-era demographic terms, in order to be able to classify who's enslaved and who's a slave owner and the like. We're also contemplating adding the National Museum of the American Indian ethnic vocabulary. Again, there's a lot of this going on in archives, libraries, and museums, in terms of doing reparative work: fixing what was, let's just call it, inappropriate or harmful description in the past. A lot of it has to do with these vocabularies, so this gives communities the ability to take control and address that. I'll leave it at that and let Joseph say more. I just want to conclude my remarks by saying that we're very much open to having new cooperative members. With the training that we have in place and the tools that we're developing, we're really in a good position to onboard new institutions. We also employ volunteer editors, and we have a fair number of those. We're always looking for ideas for projects, particularly those focused on improving representation, and we can help people develop these. So, with that said, let me now turn it over to Joseph, and he'll say a bit more.
Okay, thank you, Daniel. Are my slides visible?
Yes.
Okay. I'm going to go into some of the more technical aspects of the points that Daniel has mentioned: what we've been doing on SNAC, and how we've been trying to leverage some tools both to improve our data and to make it easier for editors and new institutions to onboard, so they can get into SNAC, improve its quality, and stay in sync when they make changes locally or on SNAC.
So, to give you an idea of what the SNAC editing interface usually handles: we're describing corporate bodies, persons, and families, their archival resources, biographies, subjects, occupations, places, and the relationships among the various people, organizations, and families. Here is a profile of William James on SNAC. It's very typical of what you might see: there's a biography, there's a list of various resources and where you might find them in different universities, and a profile picture from Wikimedia Commons. There are a lot of different fields here when you're editing, and it can take a long time for a single editor to completely describe one person. So, as SNAC has grown and gotten larger and had more and more editors, we've realized that there are some challenges at scale. Not the least of these is that manually editing records one by one is a very slow way to onboard a new institution. You might have thousands or tens of thousands of records, and doing that by hand just isn't really feasible. Furthermore, identity reconciliation is really hard. This person that I have: do they already exist in SNAC? There might be four or five or 20 people with the exact same name. How do I know I've got the right person? And once I've made changes on SNAC or inserted data, how do I keep that in sync with something I have locally, maybe in another system? We have a programmatic API that can do mass edits and batch ingests, but that requires some level of expertise. It's certainly not completely accessible to everyone who would like to use it. Just to give you an idea, this is a script. It's a relatively simple one, as far as they go, for uploading into SNAC, but it won't be legible or usable to a large portion of our users. So what are our solutions?
Well, first off, we have a SNAC OpenRefine extension, an ArchivesSpace plugin, a light SNAC section that's under development, and our vocabulary management system. OpenRefine, to start off with, if you're not familiar: it's a very powerful spreadsheet-like tool for data cleanup. It's sort of laid out like Excel, but has a lot of power-user features. It's pretty familiar to a lot of people in archives and libraries. It's used by Wikidata and various other organizations. So we're building on that familiarity and expertise, and then adding our own extension on top of OpenRefine that interacts with SNAC. I think it was Joan who mentioned the usefulness and the power you get from using an off-the-shelf solution: you have a community, you have lots of expertise, you have shared knowledge. Well, we're using an off-the-shelf tool, OpenRefine, and then just building on top of it a little bit to customize it to our needs. So, with that extension, users can do a great deal of data cleanup and alignment. They can catch typos, errors, and near duplicates before they ingest them into SNAC, so that we're cleaning up data and making sure we get good, high-quality data. With reconciliation, they can identify and match their local names and identifiers to identifiers in SNAC. And they can basically use the power of an API interface without having to be a programmer; they can just click around in the graphical user interface to make those changes. I'll show you a little bit of how that works. So here we have some records in OpenRefine that are ready to be inserted into SNAC. These are resource descriptions. There's a bit of a learning curve. It takes a little while to get used to it, but I can promise you it's much easier than that API page of code that we saw before. It's as intuitive as a spreadsheet: you can click on columns, you can move things around, you can facet, filter, and clean up data.
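As an illustration of the kind of near-duplicate detection mentioned above, here is a minimal sketch of a key-collision ("fingerprint") clustering pass over name strings. It is loosely modeled on the general idea behind spreadsheet-style cleanup tools and is not taken from the talk or from the SNAC extension; the function names and the sample names are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a normalized key so trivially different spellings collide."""
    # Decompose accented characters and drop the non-ASCII marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase and turn punctuation into spaces.
    cleaned = re.sub(r"[^\w\s]", " ", ascii_name.lower())
    # Sort and de-duplicate tokens so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster_near_duplicates(names):
    """Group names that share a fingerprint; keep groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data, not real SNAC records.
names = [
    "James, William",
    "William James",
    "james william.",
    "New York State Library",
    "New York State Library. Law Library",
]
clusters = cluster_near_duplicates(names)
for group in clusters:
    print(group)
```

A pass like this only proposes groups. Whether two strings in a group really refer to the same person or organization is still a human decision, which matches the emphasis in the talk on human review before merging.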
You can correct misspellings, like I mentioned. So a user can take a spreadsheet, either one that they generated from their own database or one that they got through SNAC by processing EAD, which is what that web service is for, bring that into OpenRefine, match it to the SNAC model, and upload it, potentially doing thousands or tens of thousands of records in a single batch instead of one record at a time. We've been really excited about this. It's still got some things to improve and new features we want to roll out, but we've already started using it to ingest new batches of information into SNAC. Now, before you do an ingest, you will have to go through that reconciliation process I mentioned. If you were to do it manually, you'd have to look at each one of your names and decide who in SNAC that name matches up to. So here I've got Wikipedia; you've probably seen a disambiguation page before. This is just for George Washington, just for relatively famous people who share that name. As you can see, there's a George Washington in baseball, an inventor, a trombonist. Which George Washington do I want? I might just have the name. It can be very difficult to go one by one and research to figure out who I'm talking about, not even getting into the George Washington the train, or the four George Washington Navy ships, or the George Washington University. It can get very difficult. Luckily, OpenRefine lets you search automatically and filter, and you let the algorithm decide if this is a likely match. You set a certain threshold for how closely the names have to match and how much you trust the system. And then you can go through by hand afterwards and select or reject its suggested reconciliations, the matches it's found. So here you can see the second row.
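The threshold idea described above can be sketched in a few lines. This is a minimal illustration using plain string similarity from Python's standard library, not the matching algorithm that OpenRefine or SNAC actually uses; the identifiers (`id-10` and so on), the threshold values, and the function names are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(local_name, candidates, auto_threshold=0.95, review_threshold=0.6):
    """Score candidates and separate a confident match from suggestions.

    candidates is a list of (identifier, name) pairs. A match is accepted
    automatically only when exactly one candidate clears auto_threshold;
    everything at or above review_threshold is left for a person to review.
    """
    scored = sorted(
        ((similarity(local_name, name), cid, name) for cid, name in candidates),
        key=lambda item: item[0],
        reverse=True,
    )
    suggestions = [(cid, name, score) for score, cid, name in scored
                   if score >= review_threshold]
    confident = [cid for cid, _, score in suggestions if score >= auto_threshold]
    match = confident[0] if len(confident) == 1 else None
    return {"match": match, "suggestions": suggestions}


# Hypothetical candidates; the identifiers are placeholders, not SNAC IDs.
candidates = [
    ("id-10", "George Washington"),
    ("id-11", "George Washington"),
    ("id-12", "George Washington University"),
    ("id-13", "Kansas State Penitentiary"),
]
result = reconcile("George Washington", candidates)
print(result["match"])
for cid, name, score in result["suggestions"]:
    print(cid, name, round(score, 2))
```

In this sketch two different candidates share exactly the same name, so nothing is matched automatically and both are left as suggestions. That is the situation where a name string alone cannot settle identity and someone has to select or reject by hand.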
We have New York State Library, and it's given us many different options in SNAC that might match this name, such as New York State Library, Law Library, and I can select one of them if I think it's the right one. If I'm not sure, I can hover over it and it'll give me this pop-up: United States National Archives and Records Administration, here are some existence dates, here's a brief biography. So you stay on one page. You don't have to go open 40 browser tabs; you can just stay here, hover over, click, select, and decide how to match up your data with the existing entries in SNAC. In the future, we hope to be able to include additional columns, such as existence dates or occupations or various other things. Right now we've started with name string matching. After OpenRefine, we also have the ArchivesSpace plugin. This is in development by Jason Jordan at the University of Virginia. It allows searching, finding, and pulling descriptions from SNAC into ArchivesSpace. And likewise, once you link them, you can pull or push descriptions from ArchivesSpace back into SNAC. Just to show you a little bit of what that looks like, here is a slide from Jason when he was searching for different names within his ArchivesSpace instance, querying SNAC for identifiers and names, linking them, and importing those back in. This will allow a lot of people to not redo work. They won't have to do it in SNAC and then come back into their own instance, or vice versa. They'll be able to do it in one place and sync up their updates across the two systems, saving a lot of work and time. Last but not least: OpenRefine and ArchivesSpace cover many institutions' use cases for SNAC, but not everybody is a trained archivist, and not everybody has access to these tools. So we want to lower the bar even further and make SNAC accessible to someone who's perhaps just a part-time volunteer without archival training. That's what our light SNAC program is targeted at.
It's a way to allow small, under-resourced repositories to send a volunteer in to log in and start editing and adding records while maintaining a high level of data quality. The way we do that is to create almost an installation wizard for record creation: a guided experience in which people answer yes/no questions, select options from a limited list of dropdowns, and are guided by the system to eventually create a complete and accurate description of archival resources, persons, families, or corporate bodies. We want to make it as easy as possible, to make it as foolproof as can reasonably be done, and to make it very easy to look up answers when you're not quite sure what sort of data should go in a field or how it should be formatted. There's a question mark right above the field; you click it, it shows you how you should enter the data, and it takes you right on to the next question. Once we have all this data in, we need a way to describe it in a standardized manner, which is where the concept vocabulary management system comes in. Daniel spoke about this; a little bit more in depth, it has hierarchical controlled vocabularies for subjects, activities, occupations, places, and in future names and relations, along with variant terms and multilingual support. In effect, we can lean on some very robust vocabularies that already exist, but still incorporate them into SNAC so that they can be used efficiently and be used to describe and search for records. I have to say many thanks to all the people who have helped us develop these plugins and extensions and who continue to test and offer feedback and information for us as we try to streamline and improve them. Thank you so much to the Harry Ransom Center, the Mark Twain Papers, the National Archives, the Library of Congress, the Smithsonian, and many other people who have contributed their time and expertise towards improving these tools and making SNAC more accessible to everyone.
And thanks so much for your time.
Thank you, Joseph. You were far more expeditious than you needed to be; it's only 3:17, so we have quite a bit of time here for a discussion of any sort. If anyone has questions or comments or observations, please jump in. While people are thinking: I was really impressed to see the amount of traffic you're running on the website now. To the extent you know where it's coming from, how much of that do you think is just scholars trying to discover archives with material that they might be interested in?
Take a pass at that, Joseph.
Well, we discussed privacy earlier. We try to respect the privacy of users on SNAC, so we gather high-level data on usage but can't track specifically who our users are. It's a large mix. We have a lot of people coming in from Google who are just searching for a poorly described or sparsely described person. They might be doing genealogy; they might be doing a research project. And they stumble across SNAC as the best place that has described this person or organization. I'd say there's certainly a percentage of archivists; we get feedback, comments, and requests from the archival community pretty regularly. And a large part of our traffic is just people searching for their own personal research, genealogy, or history.
To go a little bit further on that: from Google Analytics you get the traffic numbers, but you don't get user profiles. But we do have the ability to comment and a ticketing system, where we have teams of people responding to questions and observations. So you get at least an anecdotal sense of who's in SNAC, and it varies across a spectrum. In some cases, you would have to describe them as low-information users.
On the other end of the spectrum are really sophisticated users, and I would say that, in terms of the distribution, it's probably more towards pretty sophisticated to very sophisticated users. And then we get a lot of vanity requests: would you put me in SNAC? We say no. There's actually a fair amount of that. But some people, and I've heard this about the Online Archive of California too, mistake the page that they've landed on for the website of the corporate body. So we get the occasional request from someone wanting help who has a son in the Kansas State Penitentiary.
Interesting. I have heard some anecdotal reports from biographers who have found it very useful for identifying relevant correspondence that's held in various archives and special collections.
Yeah. Again, we haven't done a systematic study, but we have gotten a lot of positive feedback. And importantly, now with the new reference and research working group, many of these people have a lot of contact with researchers. We do know that a lot of reference staff use it; we just don't know the numbers.
Well, good news to see those figures going up, for sure. Questions for Daniel and for Joseph?
We have a neighborhood group that's looking to create the history of their neighborhood. Part of their proposal is that they want to create a research center and train the neighborhood members in research. I'm wondering, as they create their repository of materials about the neighborhood and the people in the neighborhood, whether SNAC might be a good linkage with their project or not.
I think we've had people approach us in terms of wanting to document communities, and I do think there's a lot of potential there. From a data and training point of view, we could most certainly accommodate them.
But what you often find is that such groups don't want to be part of something large and be lost in a large thing. They would like to have their own little space. So it might be kind of interesting to figure out if you could do both: have this huge social-document network that could also provide views into a community. I don't know. But there's a lot of activity along those lines.
Any other questions?
I could take requests for your favorite search in SNAC. You know, name me someone and we'll see if they're in there.
Let's do a last call for further questions for Daniel or for Joseph. Thank you, Daniel. If any of the institutions represented here are interested in joining the consortium, they should just get in touch with you?
Let me briefly show my screen. So if you go to the SNAC site, this is the opening screen, and it just gives you random images of people in it. In fact, you can see the skew towards white males immediately. In the About section, here's "Becoming a SNAC member," and here's a list of the current members. Not all of them are as active as others, but many of them are very active.
Great. Okay. With no further questions for Daniel and Joseph, I think that probably the thing to do is to give people an extra couple of minutes of break before we move on to the final two project briefings and then our invited session on the LS commission. So we will go on break until 3:45 Eastern Daylight Time. I'd like to just thank Daniel and Joseph one more time for a really, really comprehensive presentation. I've been following this project since its inception, and it's just amazing to me what it's turned into over the years, so thank you very much for updating us on this wonderful work. And I'll see everybody in about 17 minutes. Thanks, everybody.