Glad you could be with us today. This is the session on Sourcery and remote access to archives, part of the CNI Spring 2021 plenary day presentations. I'm Cliff Lynch, the director of the Coalition, and I will be moderating this session. We have a wonderful panel that I will introduce in just a moment. Let me cover a few mechanical things first. We are going to be doing this as a panel. We have a few slides at the beginning, and then we will unshare the slides and you can just look at people. There is a Q&A tool at the bottom of your screen. Please use that to pose questions at any point. The way we will run the panel is that I will have a few questions for the panelists, and then we will move over to take questions from the attendees after we get through those. There is a chat box. Please feel free to use that for comments, or you could also use it for questions. This is being closed captioned, so please feel free to turn that on if it is helpful to you. The session is being recorded and, as with pretty much all of the sessions at the spring virtual meeting this year, it will subsequently be available to the public. I think that is all the mechanical things I need to cover. Let me introduce our panelists, and then I will introduce the topic very briefly. We have with us today Barbara Rockenbach from Yale University, Greg Colati and Tom Scheinfeldt from the University of Connecticut, and Dan Cohen from Northeastern University.
The topic at hand is really remote access to archives. There is a wonderful application that was developed by the team at the University of Connecticut and Dan, which at this point is in its second incarnation and being scaled up. It served as a fantastic launching point for a series of conversations last fall about the broader issue of remote access to archives: how that could be achieved mechanically, organizationally, and technically; the pros and cons of doing so; and the extent to which this should be a priority. It was really just a fascinating range of policy and strategy questions for libraries, archives, and special collections, but also for communities of scholars that rely on these archives and that in some ways need to use them today rather differently than they have in the past. Dan and his colleagues were kind enough to let me participate in those conversations, which were among the most stimulating things I had going last fall, I think. Our purpose today is to introduce you a bit to the application, but the focus is not deeply on the application and much more on the broader questions that it raises and frames. I will just note that there is a pre-recorded video from the December 2020 CNI virtual meeting that introduces the application in a little more detail, and that might be useful for those of you who want to go a little deeper. With that, I am going to turn the conversation over to Tom, who will give you a quick overview of the application, and then we will pick up from there.
Thanks, Cliff, that's a great introduction. As you mentioned, this panel is taking place in the context of all of our work on Sourcery, which is a Mellon-funded, open source, not-for-profit application to facilitate access to non-digitized scholarly sources. But our purpose here really is not to focus on Sourcery, but rather on the opportunities and challenges of providing access remotely to archives and special collections, opportunities and challenges that our work on this particular application has surfaced and which the application must grapple with. So we don't want to spend too much time talking about Sourcery per se, but as it is the proximate cause of our asking these questions, we thought it would make sense at the outset of this panel to give you a brief introduction to the app and the need that it aims to fill. Sourcery is a very specific solution to a very specific problem that scholars face more often than archivists may realize, which is access to known archival sources: a source that you know is in the archive and that you can't get physical access to. It's a solution to a problem like this one.
So this is a screen grab from an H-Net thread. The scholar in question needs access to a source that he already knows sits at the National Archives in London, and the question is, how does he get it? He's not in London; I'm not sure where he is, but he's not in London. How does he get this source that he knows is there? Perhaps the most commonly chosen option, at least in the past, in better times, was to plan a research trip to London, to fly across the ocean despite all of the obvious disadvantages of time and planning and expense and carbon emissions and everything else. Barring that, the scholar might do as this scholar did and reach out to colleagues who might be nearer to the source than he is to obtain the document. Maybe there's a colleague at a university nearby who can send a graduate student to go take a picture of this document. But these kinds of informal networks, as I think we're seeing in this H-Net post, are really only open to scholars of the more privileged sort: the sort who have grad school buddies who now teach in London, who have built these professional networks through past attendance at elite institutions, and who have had the benefit of frequent conference travel, for instance. Students, early career academics, contingent faculty, and independent scholars may not have access to these kinds of informal research methods, and so you resort to posting a request out onto the internet for help. Now, another option, and what might at first seem like the most obvious and maybe the best one, is to contact the host repository itself to ask for the item to be digitized and sent directly. But not all archives offer these services, or the level of service that the researcher needs. You can see in this post that he contacted the archive and they said, sorry, we're not going to do that much
for you; they won't do the 20 to 40 pages that he's looking for. And even when institutions do offer these kinds of services, it can often be difficult for researchers to find information about placing a request on an archive's website. Some archives can provide access in a matter of hours or days for a nominal fee or at no cost. Others, viewing these one-off requests as adjunct to their core functions of assisting on-site researchers or doing mass digitization work, can take weeks and charge as much as hundreds of dollars per document, especially since these kinds of scanning services are often designed around producing high-quality images for publication (reproduction copies) as opposed to the simple reference copy that this scholar needs. And consider scholars working in dozens of archives, each of which has different systems, different services, different contact points, different contact mechanisms, and different policies. They may really want just one document from a given archive; they may not be interested in seeing the whole collection. That may come as a shock to some archivists, that sometimes you just want the one thing: even though the collections may be rich in other ways, the researcher wants, in a very transactional way, that one document. For those scholars, the costs and the hassle of doing all of that extra work to find out who, what, and where may be prohibitive, and they may ultimately just choose to do without access to that document. And that's where Sourcery comes in. Sourcery is designed to provide that scholar on H-Net with a shortcut, a simple, single interface for finding someone who might fill his request. It connects the researcher who needs access to a document to a much larger pool of potential helpers who have access to that document, much larger than he'll find on a single listserv or through his individual professional networks. It provides
the researcher with an easy way to compensate an anonymous colleague who agrees to fill the request. And the next time he needs something, probably from a different archive, he can simply repeat the process for a different item at a different repository. Now, while our initial use case and our initial brainstorm revolved around the case that we see in that H-Net request, in which a third party like a graduate student would provide the researcher with the requested materials, we've always understood that archivists themselves are ultimately best placed and best qualified to do this work. Indeed, in COVID times, they're the only ones able to do this work; they're the only ones able to access these documents. So while we had initially thought of an archivist version of Sourcery as a kind of second-order project, when COVID hit we flipped the script and decided to launch the enterprise version of Sourcery first. The enterprise version of Sourcery will allow an institution's own staff to fill requests placed through Sourcery. They'll have the app on their phone, and they can take a picture directly through the app. It'll let them set their own pricing (free, if they choose). It'll make it easy for them to collect small payments. And it'll let them decide whether they have the capacity to fill a request or whether they prefer to release that job to the crowd, to the grad student, to fill on their behalf. So our aim is to help repositories better serve remote patrons by streamlining the reference document provision workflow and to aggregate small-dollar transactions that can sometimes be more trouble than they're worth. Our long-term aim is that Sourcery, which like its stablemates Omeka, Zotero, and Tropy is operated by the Corporation for Digital Scholarship, will become a core piece of research technology infrastructure. As I
said at the start, Sourcery raises as many questions as it answers. Indeed, as academics ourselves, we build software, but we're doing it in pursuit of larger questions about the future of research in the digital age. We're at least as interested in the questions and provocations raised by the existence of something like Sourcery as we are in making it a successful app. These questions are many, and we're going to get into them in just a minute. For example: How does Sourcery work with existing individual use policies, policies that say you can only use this for your personal use? What about copyright? What does the gig economy entering the library world mean for librarians and archivists? How does this impact the visibility of archival labor? What about folder-level as opposed to item-level requests? Does it integrate with other library systems? What are the pricing models? And so on. We're asking all of these same questions, and we're excited to dig into them more here. So with that, I thank you for your time and your goodwill and your expertise. We're eager to listen and learn as much as you are from this conversation. And if you want to check out Sourcery, there's a beta version at that URL. All right, so I will stop sharing there and we will get into our discussion.
All right, well, thanks, Tom. As you can see, this does raise lots of questions. Maybe as a place to start, before we get into some of the more technical issues, I wonder if I could get you all to reflect a bit on how you think the mix will ultimately settle out between the number of requests serviced by the staff at an archive or special collection as opposed to players in a sort of scholarly gig economy.
I can speculate. I think the first thing to say is that it's going to vary very widely from institution to institution. For instance, I've spoken with colleagues at the Library of Congress, and they have
very little capacity to do this kind of work compared to the volume of requests that they have. They're more interested in farming these requests out to the army of professional researchers who are already sitting in the reading room at the Library of Congress and who already do this work, and in facilitating that by creating an easier connection between the remote researcher and those reading-room-based professional researchers. So they may choose to do most of the work through third parties. Others, who have capacity, may choose to do most of it themselves and only use the crowd when necessary. But I think it will vary widely, and what we're trying to do is give the institutions themselves that choice. When a request comes in through Sourcery for a document, our enterprise partners will have a sort of right-of-first-refusal period, a kind of grace period, in which they'll be able to determine, okay, do we have the bandwidth to fill this request or not? If they determine that, boy, we're just swamped this week, or two people are out sick this week, they can then just release it to the crowd and let someone there take it up. That's to help them balance resources more effectively.
I'll just jump in to say that while Yale is not yet an enterprise institution using Sourcery, we're talking next week, actually, to think more about this. But I'll say demand is high for this kind of resource. We have a lot of capacity at Yale and we have robust resources, but there's just such high demand and such high scholarly need for this that I think having multiple avenues is key to increasing access as broadly as we can. I think COVID has shown us that the need for digital images for research and for teaching is not a nice-to-have anymore; it's a must-have. And so I like these
efforts because they do provide the possibility of a broader way to get access to materials than we've had in the past.
And I'll jump in here to say that at Northeastern we have a slightly smaller archives and special collections than Yale, but we are very much aligned, I think, with what Barbara just said. In terms of the future, the enterprise version sounds great for us because it could streamline our workflows. It would allow us to pull boxes and files in some kind of sequence that makes more sense, as opposed to what's happening now, which is just de facto, on-the-ground researchers in the reading room with their smartphones, rapidly taking pictures, making requests, and so forth. But more broadly, and I think this is important for the context of what Tom just mentioned, we could in a sense aggregate demand for that digitization, because as Barbara noted, there is a lot of demand out there. In fact, I'm hopeful that in this discussion we can talk more about this sort of aggregation of work. If we could roll that up, I would love to hire another archivist or another digitization person based upon the fact that it isn't just happening in a scattershot way from H-Net requests, but that we could point to something and say, hey, there's actually a lot of demand out there. If folks knew that we have great riches in our archives and special collections, as the word got out more about that, we could actually expand our in-house work, which would be great. Just one other note on that: we're also trying to balance not just the current-day use of our archive but the future use of the archive. So with every scan that happens in that informal way, a reference scan where someone comes in, we don't really know about it; it doesn't end up in our digital repository, or it isn't done in a way where we could add metadata to it in the future. Maybe it starts as just a basic scan but gets metadata in the
future. Then I worry about future researchers who also lack that privilege, right? If we were to do it more in-house, we'd actually be able to aggregate more digitization not just for current researchers but for future researchers, and ensure that we're servicing them and their needs as well as the current ones. So it's a complex balancing act, I think, that this project raises.
You want to add anything, Greg?
I was going to add to what everyone else has said, in that no institution has enough staff to meet the demands of the questions. I've worked in places where I was the lone arranger, and we didn't get a lot of questions, but the number of questions we had was still more than I could handle. And the Library of Congress has a lot of people, but they have a lot more questions. I've never worked in a place where there was enough staff to manage all the questions. Having something like this would have been great in many places that I've worked, and we would have outsourced that to so many people, because there are a lot of people who know more about your collections than you do.
Let me ask a related question. I know that there are some archives and special collections at various institutions that are not open to the public; you have to somehow become a qualified user. There are various methods for that. Often, I think, at universities it's good enough to be a faculty member or a student at that university; if you're from some other university, you can get a letter of reference or otherwise qualify as having access. How does this play out in that economy? All of a sudden you've got these credentialed people who essentially can monetize their credential.
It's a great question. Well, I would hope ultimately, like the work we've done previously with Omeka and Zotero and other software projects, that this
would help to democratize access to some of those repositories and make it maybe not worth the repositories' while to keep the unwashed masses out. I think the other thing is that something like this might also put a different kind of pressure on those more closed archives and special collections. If this allows researchers to make better and more use of the more open collections, the more closed collections are going to see less use and fewer citations of their materials in scholarly books and articles, and that may put a kind of peer pressure on those organizations to open up a little. So, as I said, Sourcery in many ways is kind of a provocation, and in that respect I hope it's a provocation for openness.
Fascinating.
I'd second that, Tom. I think that could be a really nice product of Sourcery. But I also like how it leads to (you mentioned this in your opening as well) the notion that scholars need different kinds of images. We need the high-res preservation, archival, publication images, all of those types. But in our Yale special collections we're also digitizing on demand and for remote reference, and putting that in our digital library. This is yet another kind of image and source. I think there are questions already emerging in the chat around metadata that I'm sure we'll get to, but the bottom line is that understanding that different types of images and sources are useful in different contexts is, I think, a helpful aspect of this project.
Cliff, I would also widen your question a bit. There are archives that are open, but there are many people, including most of the general public, who would never think to go into an archive, who sort of see it as off limits. So I think another interesting thing that Sourcery brings up is the possibility of different kinds of uses of the archive. You might have journalists who would be more receptive to, say, grabbing a document. Or, I think in the best-case scenario: thanks to my staff in our archives and special collections, we have a program called Teaching with Archives that we run with the Boston Public Schools. Pre-COVID we brought students in, but then we digitized enough to actually do it virtually, to provide the kinds of documents that an academic researcher like myself, someone with a history PhD, might not necessarily go in for on a surgical strike, but that might be really relevant to their neighborhood or community history or culture, and that they would want to get. So I think it just opens up the possibility of what actually gets digitized. Barbara has an enormous collection, and I think professional researchers will probably use her special collections in a very different way than the general public would. So that widening is actually a real signature aspect of this: as Tom put it, a sort of democratization of access that I think could be very healthy for all of us.
Yeah. Why don't we turn a little bit to some of the technical and technology issues here. It seems that going down this path raises a lot of questions about everything from what quality of images you'll really want to be keeping, to what kind of metadata you want, to whether this is a push towards item-level rather than folder-level description, to what kind of management system we want around this to track the items that are being digitized and, you
know, where copies of the images are. I'll just throw that collection of questions open. I think you can use this as a point of departure for calling virtually every practice around the management of archives into some question, and I'd just like to get your reflections on that.
That's a really big question, Cliff. But honestly, what we have seen in a lot of the work I've been doing is that you have something of a luxury with paper or analog documents, in that you can manage them at a very high level: a box level, or even a collection level if it's small enough. But as soon as you digitize something, you're required to at least manage it at the item level. You at least have to know where that file is, and you may not know a lot about that file. But I think we can split the difference: if I've digitized a whole box of things, I could still have a metadata record that talks about the box but then leads me to all the items. If you came into my archives and asked for a box, you wouldn't ever expect that I would know every single thing in that box; you would expect that you need to look through that box. I'm not exactly sure how that relates to Sourcery, but I think those are the issues that we're dealing with in archival management and the shift from analog to digital.
One thing I'll chime in and say: there's a history to this, and there are really good reasons for it, but I think there's been a bias towards thinking that there can only be one scan of a document online, that there should be at most one scan of a document on the library's website, that you shouldn't have multiple copies of the same item on a website. And so if you're going to digitize something, you should digitize it at the very highest quality and mount it in whatever the official repository is, with high-level or detailed metadata
and description. That's been the bias. What that hasn't recognized, though, is that there are often multiple images of items taken. Researchers will come into the reading room and take their own cell phone images. An archivist, or an undergraduate assistant, might get a request from someone through an email address, take a picture on her phone, and email that picture directly to the requester. So there are these parallel digitization efforts going on already at institutions. One is the official preservation digitization workflow with all of its rigor, which is the way it should be. The other is this much more informal, much more impromptu, much more pragmatic one-off digitization program, where people are taking scans of things for use in the moment and really for reference. So I think what this app forces us to recognize is that there are already these two digitization workflows going on, and then the question becomes: do they need to be integrated, or can they exist in parallel? Our working assumption with Sourcery is that for the present they can exist in parallel, as in fact they already do. Farther into the future, is there a way to integrate these two streams? Maybe, probably, and that would be for the good. But for right now, I feel like it's better to recognize the reality of what's going on and make it more effective and more efficient for the researcher and for the archivist, rather than to hold up some ideal that has not yet come to pass and may never come to pass, a kind of perfect-as-the-enemy-of-the-good attitude.
And Tom, I just want to chime in really quickly, because I've been there. To go into range 85, shelf nine, takes me just as much effort to make a good copy as a bad copy; that was always the thinking. But
if we push off the actual copy making so that we don't have to do it, and all we have to do is get the box, then we don't care. It's no more effort: I'm just getting a box out, and I don't have to do all these things. So I think that lowers the bar of effort on the archive's part to participate in Sourcery, because I don't have the burden of creating and managing the object that gets created. If we can get archivists to think like that, I would definitely breathe a sigh of relief. You go, oh wait, that's something I don't have to worry about. Great.
Well, and it addresses a tension, Tom, that you talked about in a previous call when we discussed this: this inherent tension between our drive now for increased, broad access, as much as we can provide, and very rigorous descriptive and digitization standards. I think this has been a struggle for the archives profession, and even more so now, as we're really thinking about access, equity, and issues of the broadest possible reach. And I think about this often: an institution like Yale that has great resources has a huge responsibility to make these resources available. So that ability to think about this differently, to start to address that tension between access and our rigor but not lose that rigor, I think that's an important part. There's always going to be a place for that rigor, but to be able to fulfill that access mandate just seems incredibly important to me right now.
Cliff, you're muted.
Sorry, I didn't click hard enough there. There are some wonderful things coming in in the comments, and a little later, when we open this up for the audience, I'm going to maybe read one or two of these as introductions to questions. I want to go two places before we open up for questions from the audience. The first is that the combination of experiments like Sourcery and the pandemic is really
framing a question about whether we should be assigning much greater priority, in the design and arrangement of our services around archives and special collections, to supporting remote access. We've already slipped past certain barriers. It used to be that if you really wanted to talk to one of the archivists, you probably had to go to the archive to do that. Now we're cheerfully having consultations via Zoom, and reference interviews and things of that nature. I can imagine us slipping into a situation where you might actually go through the contents of a folder with somebody physically there while you're on Zoom or some similar app as they flip through the folder. Now, obviously you probably aren't going to want to tie up a professional archivist doing that in most cases, but one can imagine that the same kind of staff who are doing a lot of the front-line digitization work could easily do that sort of thing. What's your thought on this? Should we do that? Is it culturally ever going to be possible to do that? Is this the future for archives, or a major part of the future for archives?
So Cliff, I think this is really interesting in that it is about, in a sense, rebalancing the people you are serving. In the same way that Tom talked about there being a single gold-standard system for digitization of archives, with full metadata and so forth, there's also been a kind of gold standard of interacting with the archive, which I absolutely benefited from in my doctoral work. And it was great. Did I like going to archives and spending a long time and having meals with archivists and finding out about boxes that someone hadn't seen before? Sure, that was terrific. But I think it's a lot more equitable to say, you know, what if we were to think about
dividing up the time differently, and helping more remote people, or helping, again, the general public or people in the future? I think Barbara and I have the very difficult task of doing that kind of rebalancing and saying: what are we going to devote resources to, who are we devoting those resources for, and what does that mean for our staff and for their labor and their time, and what kinds of things are they doing? We certainly don't want to overload them with 100 online requests versus one in-person request just because we're helping more people and they're all remote. We still need to have some in-person things. But I think all of us (and this is why, again, Sourcery raises such interesting questions) will have to go through that exercise with our staff and with the community of folks who want to make use of our archives and special collections, to think about what that division of labor is. I think that's a really interesting question for, say, the next three to five years, and COVID has only accelerated our interest in trying to solve that equation.
Others want to comment on that?
I'll just say briefly, I think, Dan, that's very well put. Thinking about how we're focusing our time is going to be a challenge. We just posted our AUL for Special Collections and Director of the Beinecke position, and a part of that job description, and of that person's work, will be to have a community engagement team to really be thinking about those questions: how we ensure that the resources we have are reaching our local community and global community in new ways. I think it's just absolutely necessary for us to be thinking about that right now. So even in the posting of a new position, early on in my tenure, I really wanted to be thinking with the special collections team about how we articulate a community engagement aspect of the special collections work.
So, before transitioning us to questions from the
audience, I want to just read one comment that was sent to the panelists. It says: this is a great project; there's a kind of precedent that Mellon funded for many years at the Medici Archive in Florence. They developed software and policies under which the archives would digitize materials and the remote scholar would help to describe the materials, as a sort of quid pro quo, and they also developed a fellowship model for people who were heavily engaged with a specific archive. I wonder: many research archives in particular do have these sorts of fellowship arrangements, where scholars can apply to come and engage with an archive. I just had a conversation with the Getty, who have a similar program around some of their material. Are we going to be able to do those digitally? Can you imagine, maybe as a first step to scaling up certain kinds of engagement, having digital fellows?
So I'll chime in on the first part, about the idea of the Medici Archive model of posting things online and having researchers provide metadata. One thing we're working towards with Sourcery is to stitch it together with the other software packages supported by the Corporation for Digital Scholarship. So there would be a kind of workflow where researchers would request an item through Sourcery, an archivist at the institution would scan it, and that scan could be provided with some rough metadata and pushed to a reference archive on the institution's website, built in Omeka. It would be a kind of parallel collection to the main, official collection. The Folger, who we're working with on this next phase, is doing something like that already, where they have a reference image collection. So one of the interfaces
for Sourcery will be with Omeka, to create those kinds of collections. The other interface is between Sourcery and Tropy. Tropy is a desktop application, kind of like iPhoto for archival documents, where individual researchers can take all of the images that they've created while they were in person in the archive and organize them, and build idiosyncratic researcher metadata, but metadata nonetheless, into those collections. And there we're thinking about how researchers might be able to request additional items through Sourcery for that collection (maybe a scan that they took was blurry, or they missed an item in a series), add them to their collection, and describe them in the collection. That would then in turn be returned to the originating archive with the metadata and posted on the Omeka website. So there is a kind of interface that you can imagine where that kind of full-circle work can happen.

As for remote fellowships, I don't know. I mean, I think some of this could be done remotely. As Dan was describing, there's nothing like sitting in an archive for a week or a month and working with the materials directly. I don't think that mode of interaction, interaction with the documents but also interaction with the archivist's expertise, is ever going to go away for historians. But I think it can be supplemented, and you could imagine a parallel program of fellowships, maybe one that provides you with a week of special access, prioritized access, through Sourcery or through email or whatever, to an archivist's expertise. That I can imagine certainly taking place.

Tom, I want to add to that and say yes, I think you can do digital fellowships. And I know you love it, and I do too; I
became an archivist because I love the old stuff and I love to pick it up. But most people's need is for information and not for that experience, especially if you're not a scholar. So you look at Northeastern, which has a large collection of community-based archives that are used by community members who want to do something and want to show something, and they need that as information. And so that same experience, if we can get them to think that they don't need that much hand-holding, they just need access to the content. So that's, I think, the other side of this. I think we often, again, even in how we create, expect everybody is the serious researcher who's going to spend a week. Most people just want to answer a question, and if we can build them tools to answer those questions, then we're going to serve a lot of people.

I'm going to go to questions now from the audience. We have a question: have you interviewed folks who run closed archives globally as you did the development of this?

I don't think we've talked to people who run closed archives. Most of our conversations with archivists have been in US institutions, where I think that model is a little less common. I did my doctoral work in England, where that model is much more common, and I think that's true on the Continent as well. But that's certainly a good idea, something we should think about doing.

Here's another question: I'd like to hear a little more about the opportunities to link these on-demand scans with systems and tools maintained by the archive to help future researchers. Is the right thing to do to connect these to finding aids or other discovery tools? Can they end up in some kind of repositories for digitized special collections? It seems like there are quite a few mismatches between what current archival systems do and what they need to do to work with these.

So I
think this gets to this really interesting question about the different pipelines, right? Let's say we have one pipeline. We don't really; we have informal secondary and tertiary pipelines and so forth. But I think this group here on this panel has actually had some sidebar conversations about this very thing, which is: maybe you end up with a sort of semi-formal secondary pipeline that includes some Sourcery reference scans that go into a digital repository at the institution and over time may be connected through various means to finding aids or to metadata along the way. And I think this is actually fascinating for the CNI crowd, because I think there could be multiple modes for those kinds of enhancements of, let's call it, track two. So one could be, indeed, eventually those scans make their way over to my metadata group and they're given formal metadata over time. Some of them may go to crowdsourcing or other techniques. And then, just to connect it to some other CNI panels, if you look at the CAMPI project at Carnegie Mellon, which is the computer-aided metadata generation project, you could run those through some machine learning or computer vision techniques to at least get you 80 percent of the way there. And we're starting to do tests on these kinds of just raw scans that haven't been given metadata, and it's imperfect. I'm sure it will get better over time. But I think what's really interesting about that CAMPI project, and I encourage you to watch their 20-minute video as well, is that it's actually the combination of the human expertise and the computer vision APIs that gets you far along with a lower amount of labor. So there could be suggested tags, metadata, and so forth, and then a human expert comes through when appropriate, maybe a few months in the future, and formalizes that metadata. So I think for CNI this is sort of a perfect combination of, in a sense, more traditional human expert practices and some much newer digital
pipelines and techniques.

If I could add just one more... oh, sorry, Greg, go ahead.

I was just saying, again we come back to the question of how much is good enough, right? And you might say, and this just came into my head, this is pretty radical though, you might say we should actually stop doing high-resolution scanning and throw everything through the lowest resolution we can find and get it all out there, and then let the good stuff rise to the top, and then we do that. Now, I used to teach digitization years ago and that was like the last thing I ever would have said. I would have said, get it out, do it the best you can so you can use it later. But, you know, with the way we can digitize things now, the way people want things, I don't know. It's just an interesting idea.

I think, in terms of connecting back to finding aids and other sources of discovery, one thing to mention is that part of the motivation for Sourcery was the observation, and it's something that I think probably is underestimated by most archivists, that one primary means of scholars finding archival materials is not through finding aids that are prepared by the archives themselves; it's through footnotes that are prepared by scholars, right? So you're reading a book, and this was one of the original use cases for Sourcery. It's a new book that you've been asked to review or something. A footnote references a document that you think might be interesting to your own work, just that one document, and you want to get your hands on it. You go to the institution's website and it hasn't been digitized. What do you do? That was the use case. But there's a seed of an idea here where, if you could connect the information that's contained in footnotes to information that's contained in finding aids through this kind
of crowdsourced process, you start to end up with a whole different kind of information ecosystem, and to bring the promise of citation full circle. So we're thinking deeply about that as well. There's a whole richness of citation out there in books that's semi-standardized and that could be leveraged for discovery in other ways.

Yeah, that's a really important point. Here's a comment, I think more than a question, but it's an interesting one: the idea that Greg just raised is not unlike the Mellon-funded project for digital history monographs at UNC Press. Invest just enough in a new book to create a good digital file, make it easily discoverable for open access, and then, if it actually gets used, if the reception merits it, invest more in a future edition. And that is sort of what would happen in Greg's scenario as well. It's also interesting that you get a much better sense of demand for use if you get this kind of low-barrier-to-entry use. I mean, certainly any archivist who's curating a specific collection over time can tell you anecdotally that these are the kinds of things that people coming in mostly want to look at, but this gives you a much broader view of that.

I'm just looking... here's a comment: for our team managing digitization access, it's not the scanning time that prevents us from delivering access files. The big work is in file naming and structural metadata to manage the scans and get them associated with the right collections records.

Metadata has been the bottleneck for probably the last 10 years, once we moved to camera capture, which was very quick. And that's why Dan's comment about using AI and things like that to pull some basic information about even a photograph, you know, man on a horse or something like that, is really good. And yeah, it really isn't that; it's the metadata that's the problem.

Yeah,
and I think that goes to Barbara's comment about this tension between rigor and access. I think, honestly, it's priorities. If the priority is just getting the material into the hands of the researcher who wants it, maybe you don't describe it, maybe you don't connect it to the finding aid, you just don't do those things, but the researcher is happy. Or maybe you do it at a lower level, or maybe we experiment with ways to do it with artificial intelligence or some assistance from the researcher herself. But I do think it is a tension, and I think it's a tension we're trying to provoke. I don't think we have these answers, but it's worth exploring where on that continuum of easy access and complete rigor we want to be.

It's not just easy access; it's easy, useful access.

Easy, useful access. Well, and it's also what level of access we want to provide, right? Access to whom? I mean, the individual researcher who asks for the item doesn't need any more description, right? They know they want it. They've already found it. They probably know as much about it as the archivist does. They don't need it. It's who you're providing access to. If you just give it to them, you're now not providing access to the broader public. And we have to ask ourselves which value we care most about. Do we want to meet that need of the individual researcher quickly and easily, or do we think it's worth not meeting that need and instead meeting a longer-term need of a broader public and a future public? And these are ultimately, like everything
else, resource choices.

Yeah, and to my mind the wonderful thing about this project is it has framed many of these questions in a very tangible way, and we haven't been asking them enough until this came along. We are past time. I'm going to just give everybody a moment if they want to make a quick closing remark, and then I'm going to thank everyone and conclude the session. Greg?

I just want to say that this is a great opportunity to rethink how we do things, and we should keep this discussion going because, like Tom said, we don't have the answers, but we have a lot of questions.

Barbara?

Yes, I agree with Greg, and I think that what we're seeing here is just, again, different needs by different users and different approaches. And so having a variety of approaches, I think, is a very good thing in any work we're doing, and I think together we'll be able to start to answer some of these questions.

Tom?

I'll just say thanks to you, Cliff, and to everyone in the audience. And also, if there's anyone listening who is at an institution and is interested in piloting the enterprise version and working with us to develop it, we're very much looking for those kinds of partners, so please do be in touch.

And Dan, I'll give you the last word.

I'll just conclude with one more thanks, which is I want to thank my staff and actually all the archivists and librarians and technologists and many others who participated in the workshops that Tom and Greg and also my staffers ran with support from the Mellon Foundation. It's actually out of the tough questions that they've asked of all of us, and thinking about the roles, that I think we've made progress. Certainly there haven't been full solutions, but I really have loved the collaboration and thinking through with staff at all levels how this might work, and work in the most ethical and
productive way. So thanks to everybody who's helped us think through these issues.

All right, well, thank you all for your thoughts on this. This is a topic I am sure we're going to continue discussing at CNI, and we're certainly going to want to follow this project and continue learning from it. I thank the attendees for joining us and for all the very thoughtful comments here, and I'm going to declare this adjourned. Thank you again.