Good afternoon. While people are probably still walking in, let's start our first afternoon session. I hope you all had a great lunch and good conversations. The three of us are going to give you an introduction to the status of machine learning in our library communities. Mine is really more of an overview, based on the workshops and survey we ran with our community, to give you a sense of where we are. I was actually very lucky to get to know Liz and Harish during the grant, and they are going to give you some specifics about how machine learning can be applied and used in library science. We're going to talk for maybe 30 to 40 minutes and leave 20 minutes for Q&A, but there will be plenty of time for conversations with the speakers as well as with your peers in the room. The grant started about a year ago, with Notre Dame as the lead institution, so we're at about the one-year mark and actually wrapping it up; there are a couple of things we're still working on. The question we're really trying to answer is: how can machine learning better facilitate cross-disciplinary discovery? We as a community, as a library field, have been trying to better facilitate discovery of cross-disciplinary research for 20 years, but our classification systems are very disciplinary. So we're very interested to find out, with the advent of machine learning, what new opportunities we have as a community, and we're really looking forward to helping campus scholars and researchers understand each other's research better. In my next slides, I'll talk about why Notre Dame is so interested in that topic. But first, let's define who "we" are. When we say "we," we don't mean just the librarians; we're also looking at IT engineers, practitioners, and scholars.
So in our grant, we tried to reach out to very diverse communities, trying to understand from our faculty what problems they are trying to solve from a cross-disciplinary point of view and how libraries and IT professionals can help them along the way. Based on the study, we hope to find some immediate next steps we can take in the next three to five years. And as a community, let's see what opportunities we have. What gave Notre Dame the impetus to lead this grant is that we had the Convocate project, which happened about four years ago. We actually spent a lot of time on that project, and there are some links; once the PowerPoint is published, you can follow them to learn more. The key takeaway points are these. First, it was interdisciplinary research initiated by our faculty: our faculty were trying to understand the interconnectedness between Catholic social teaching and human rights law, with each discipline using different lingo, and they were trying to find out how to better facilitate that type of research. Another takeaway is that we actually got the right people in the room, a mix of scholars, students, librarians, and IT folks. We not only contributed on the web user experience and technology side, but also created a controlled vocabulary that maps to the disciplines very well. And that controlled vocabulary shows some potential for us to use machine learning mechanisms to harvest and bring more material into the subjects. As I said, we spent almost four years working on the project; it's very human-centric and labor-intensive. Now that we have the controlled vocabulary, we can see that if we do more of the harvesting automatically, it's going to save us a lot of time and energy. As for the grant component, we made some commitments to IMLS, and I really thank them for their support.
And I think we pretty much finished everything we said we would be doing, and we added two more components: a writers' workshop and, besides the white paper, an open access book we're going to publish. There are many authors contributing to the book, and they cover the theoretical aspects, the practice aspects, as well as some challenges they faced going through their own projects. So really look forward to the report, the white paper, and the book. Some numbers to share with you: the grant is about $50,000, and it supported a survey with about 324 respondents. I think that number probably gives you a sense of how early we are in this area. We actually tried to reach out to more communities, but the number kind of sustained, or went flat, after about nine months. So, plus or minus several practitioners, I think that's probably the size of the community we're looking at right now. Since people are still walking in, I'm going to stop for a second. I think by virtue of your attendance you've already voted with your feet; this is definitely one of your interests, right? So maybe by a show of hands, how many of you are actually experimenting with machine learning right now? Okay. And how many of you think machine learning has actually gone mainstream in your library already? No one? Okay. Good. That just gives you an impression. So I think the 324 is a really good number; that's pretty much the practitioner community, between the librarians, IT, and the scholars. We had four workshops, where we engaged 24 speakers to talk about various projects, online and in person, with about 104 attendees. And for the book, we got 21 authors for 19 chapters. That's kind of where we are.
So, something we learned. The core question for the grant is about facilitating cross-disciplinary research, and one thing we learned from our community is that this problem is probably going to be a long-term issue for the entire academy to solve. At every workshop, the onsite participants, and not just our library folks but faculty members too, talked about how difficult it is to actually start cross-disciplinary research. One thing is the language, the vocabulary; the other thing we heard throughout the workshops is that the university has structural issues and operates in very siloed ways. Many faculty do have the desire, and they gave a lot of ideas for how to break those silos, and they think the library should play a very critical role in helping to facilitate this. But with that said, as of today it is very difficult to do cross-disciplinary research. On campus, most such efforts are very organic, just like in our case: we did not initiate that cross-disciplinary research, our faculty did. Otherwise the institution needs to make a lot of investment and push, probably through cross-disciplinary research centers, to really make it happen. So we decided to put aside the cross-disciplinary focus for a second and see what's going on within our community. There are actually a lot of projects going on; from the show of hands you probably get a sense that many of our peers are really trying to understand the potential of ML in our field. So here is a set of challenges. You can see them as challenges or opportunities, but think of them as the mainstream, maybe the average of the community; some of our individual peers are doing better than others. As a community, through the grant, the survey, and the workshops, these are the challenges and opportunities we see.
The first point is at the individual level, and at the individual level most people felt they don't really have a lot of capacity to learn, to experiment, and to test. The problem has been that there are already a lot of services we offer to our community; people feel like, in a 40-hour week, there's so much on their plate that beyond that there's just not much more they can do. The second point is at the library resources level, and it's probably the same story: licensed material costs keep increasing, budget support is probably shrinking, and it's very hard to find resources beyond what the library already offers to the campus. Again, I want to stop on that point for a second, because these 324 people are actually very engaged in this type of research, and they didn't talk about the fear factor at all, right? By default, I think the 324 people felt there is a lot of potential for ML, and they didn't talk about whether machine learning will replace human jobs or not. In your own institution there may be that challenge, and you might have to really clarify with your staff and faculty how this will not replace their jobs. But I think for the library field the general understanding is: we're so understaffed and under-resourced that there's just more work than people can support. Machine learning is definitely a tool to help us; it's more of an assistant role rather than really taking over. And there's a lot of teaching that humans can do to help the machine do better, but the human is very much centered in this process. I just want to put that clearly on the table: if in your own community there is a fear factor, you probably need to work that out first. Another thing we heard from the community is that there are a lot of machine learning curricula online.
Actually, I took Andrew Ng's course, thanks to Stanford. But these curricula are not really tailored toward our academics. A lot of scholars don't really trust the black-box nature of machine learning; they want to learn how it actually works, and the courses are not catered to their own disciplines. So there were actually two schools of thought. One says you have to learn everything to be able to do machine learning. On the other hand, from the workshops we heard that machine learning is kind of like driving a car: you're not good with mechanics, but you're driving. So there are some thoughts there; I'm just putting it out there so you can think about which conversation is more predominant in your own community. But throughout the workshops, our scholars were really concerned about the ethics of the algorithms. They want to understand how they work, how they handle the data, and how they get to their conclusions. Also, I think commercial, or canned, algorithms are an opportunity and also a challenge for us. They give us a very quick start, but meanwhile there are tons of biases, as we learned from our workshops, because the commercial algorithms were trained on commercial data; the images they have probably don't match the volume or the character of an academic collection. So to get the kind of focus you'd like, you may need to retrain the algorithms. There's also a dilemma: how much control do you want to give to commercial vendors, versus engaging the community with open algorithms to understand what they can do for academics? The technology is not really the issue. It's really about the data: how clean your data is and how much data you have. That's how the algorithms get more accurate. And as for how we operate right now, I think, you know, at least you should read the report.
But right now, if you have, say, Hispanic collections, the way they're so scattered around in our community, it's very hard to get data at the size you need to train your algorithms to be more accurate. Right? So that's another challenge we found through the grant. And the last point is that, at least at Notre Dame, there are plenty of CS faculty members who actually engage in machine learning research, but it's very hard to get them across to the library as partners; they look at it more as research rather than on the utility side. With the help of the grant, we've actually just cracked open the door: there's one faculty author from the CS department who is going to contribute a chapter to the book and talk about their work. So I think the door is open, but it's still very difficult right now to create that collaboration on campus, to get the expertise in the room and help solve some of the real problems. So I'm going to stop here and let Harish take over to talk about a very specific project he has been working on. Thank you, John. I'm just going to set a timer for my talk so I leave enough time for Liz and Q&A. My name is Harish Maringanti and I'm from the University of Utah. We wanted to experiment with machine learning and AI because that was the thing to talk about last year, and we just wanted to see what we could do with our limited skill set in the library. Our project was about how we can use the latest and best machine learning algorithms that are out in the open on library data, and how we can improve our internal operations. The project started in the library. We worked with Dhanishka, the assistant head of software development; I've listed my co-collaborators here, including Bohan, our software developer. Excellent, excellent skill sets. But very quickly, after working on the project for nine months, we realized that we were out of our depth.
We didn't know how to interpret the results that we were seeing. So we went to the doorstep of a computer science expert, an expert in natural language processing and AI on our campus. And he was very open; he said he was dying to collaborate on projects related to digital libraries, so a natural partnership evolved from there. His name is Vivek Srikumar. We wrote him into the project, and we have been working together for well over six months now. For the local context at the University of Utah: we started digitizing different types of content in 2001. In our special collections and archives division, we have well over maybe two million images, of which at least one million have been digitized. But we only put maybe half of them online, for a variety of reasons: because of copyright, because we don't have staff to describe the images, and because we have to look at the deeds of gift and a bunch of other things before we put them online. So we do have a subset of images, photographs, online. And that is what we wanted to explore: what we can do in terms of image analysis and machine learning techniques to start adding descriptions and labels to these photographs. These are the high-level goals: expedite metadata creation, enhance the discovery experience, but also enhance the metadata itself. This point was explicitly made by Thomas Padilla in his report that was released yesterday, and it's a very valid point: if you are using machine learning, why not think about enhancing the metadata too, where feasible? And hopefully address the backlog issues that we have in our collections. Before we dive further, I just wanted to set the stage. At any AI conference you go to, you will invariably come across different definitions of AI. And because I have the stage here, I thought, why not take a shot at this, right?
What you see here is from a blog post that Chris Bourg, director of MIT Libraries, posted on her blog, which referenced a definition published in a book. What I like about it is the focus on the data, right? It's about what the data will tell you; that is what machine learning is. Here are a couple of other references. There was a report released last year by a think tank in Germany that compared national strategies to promote artificial intelligence. Several different countries have come out with reports on how they are going to tackle AI, and this think tank went through what was in those reports. They talked about AI definitions, too. One thing they mentioned was that we should stay open in how we define AI, because there will invariably be technological breakthroughs as we move along; we should be open to just redefining AI as we go. But one thing they found common across all these definitions was that AI is a driving force in the digital revolution, and I think that's pretty consistent no matter which definition you subscribe to at this point. I think this same image was cited by Dr. Keith Webster yesterday; if you were in that presentation, you probably saw it. It just neatly packs the different terminologies together: what neural networks are, deep learning, machine learning, AI, and how they fit with each other. There might be several different perspectives, but in our library we just stick with these definitions; it gives us a framework to think about all these terms. Coming back to our project: we used TensorFlow to begin. ImageNet is an image database with 14 million images. TensorFlow uses what is called the Inception model. So you take ImageNet, you feed it to the Inception model, and then you can apply it to your own collections. It's pretty easy, pretty straightforward.
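The "apply it to your own collections and look at the output" step boils down to turning a model's raw scores into a ranked list of labels with confidences. Here is a minimal sketch in Python, with an invented five-label vocabulary standing in for ImageNet's 1,000 classes; the label names and logit values are made up for illustration and are not the actual Utah pipeline:

```python
import numpy as np

# Hypothetical label vocabulary -- a real ImageNet/Inception model has 1,000 classes.
LABELS = ["suspension bridge", "surfboard", "balance beam", "basketball", "horse cart"]

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def top_k_labels(logits, k=3):
    """Return the k most likely labels with their confidence scores."""
    probs = softmax(np.asarray(logits, dtype=float))
    order = np.argsort(probs)[::-1][:k]
    return [(LABELS[i], round(float(probs[i]), 3)) for i in order]

# Pretend these logits came from running one archival photograph through the model.
suggestions = top_k_labels([4.1, 0.3, 1.2, 2.7, -0.5], k=3)
print(suggestions)  # highest-scoring label first
```

A cataloger would then review the ranked suggestions, keeping the plausible ones and discarding the "surfboard"-style misfires.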
And you just look at the output. Depending on the model you use, you get either labels or captions. We went a step further and thought: because ImageNet may not have all the different types of data, why not teach the model with the data we already have in our extensive collections? That's called transfer learning. You can repeat the process and retrain the last layer. We went through that approach, developed a new model, and started looking at what the labels and keywords would look like. Here are some examples, both good and bad. If you go to our digital library right now, for this image, the title you see on the top in the red background is the only thing that appears on the webpage. We don't have any description, we don't have any labels; "Campbell's Ferry" provides context, but nothing else. So if someone was looking for suspension bridges, or input some other keywords into our digital library system, invariably they would miss this image. So we thought, yeah, this is a perfect example of how you could simply run your existing collections through these algorithms and look at the output. And if the output makes sense, you can start ingesting that data into your systems. Here is another example, although it wrongly interprets the concrete slab as a surfboard, just because there's an ocean or some type of water body in the scene; it's trying to be too intelligent in this case. But again, you can look at different results, and in this case it was very useful. As I mentioned, in our special collections we have over one million images digitized. Most of them are stored on hard drives or network drives, and they are stored under different names; naming conventions are sometimes pathetic in different departments. You will have "final version," "final version two," "no, this is the final version."
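Transfer learning as described, keeping the pretrained layers fixed and retraining only the last layer on your own material, can be sketched as fitting a small classifier on top of fixed feature vectors. Here the "features" are random toy vectors standing in for the penultimate-layer outputs of a network like Inception, and the two classes are invented; this is an illustrative sketch under those assumptions, not the team's actual retraining code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for fixed feature vectors from a pretrained network's
# penultimate layer, for two hypothetical collection-specific classes.
n, dim = 200, 16
features = rng.normal(size=(n, dim))
labels = (features[:, 0] + features[:, 1] > 0).astype(float)  # toy ground truth

# "Retrain the last layer": fit a single logistic unit on the fixed features.
w = np.zeros(dim)
b = 0.0
lr = 0.5
for _ in range(300):
    z = features @ w + b
    p = 1.0 / (1.0 + np.exp(-z))        # sigmoid output
    grad_w = features.T @ (p - labels) / n
    grad_b = (p - labels).mean()
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = ((p > 0.5) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The point of the design is that only `w` and `b` are learned; everything upstream of the feature vectors stays frozen, which is why this works even with a modest amount of local training data.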
It's hard to say which one is which, right? And it doesn't capture what's in the photograph. So when we get questions such as "could you show us an image from the 1880s or the early twentieth century that shows people wearing a particular type of clothing," we rely on expert archivists who carry that knowledge in their heads right now. But through simple algorithms that are out there right now, it's quite possible to pick those images out of a vast collection. So we were pretty impressed with what the algorithm was telling us, even though it's not perfect. Here's another example. The caption does tell you that this athlete is on a balance beam, right? Whoever described this image just labeled it "unidentified gymnast" and left it; again, there are no labels, no keywords. This is another good example, where the title is pretty neat. But in the title, the words you see in the red background, nowhere do we actually use the word "basketball," right? Yet the machine learning output was able to suggest "basketball" as a likely term you could use. So what you have seen so far are good examples, in the sense that you can see a pretty neat tie-in with enhancing the discovery experience. Now, let me take a few minutes to talk about the challenges. We hear about narrow AI and general AI, right? Domain adaptation is one type of narrow-AI problem: a model is trained to optimize predictive accuracy on one domain. In the case of ImageNet, the images were pretty much collected from the web, in most cases probably captured with cell phone cameras, right? It's likely they are not suited to another domain, such as what we deal with in archives and libraries: archival scans of black-and-white photographs. For example, this image of President Wheatlake from 1955, I believe. If you look at the suggestions, right?
It's like, how come these models haven't seen a laptop before, right? How do they get so confused by such a normal thing, compared with what we expect from archives and the other types of things we have in our collections? The second challenge is diverse captioning: generating captions and labels from multiple perspectives to improve discovery of content. Right now, when our metadata folks describe these images, the description comes from their background, from their training, and we come up with maybe one caption at most, if we can. This is a good example: a picture of people transporting telephone lines. It's a historical image, because this was 1914, when the first transcontinental telephone lines were laid. In the title we only say "telephone lines," nothing else. So if someone was looking for a horse-drawn carriage, they would not come across this. If you have a system that can look at the image and give you multiple suggestions, it would probably be very easy for a human expert to not only include the telephone lines in the description and the context, but also not miss out on the simple captions. The last challenge is socially and culturally responsible description; this is a shout-out to what Thomas Padilla has put out, and we have seen it ourselves. Here is an example where the title is "Kabuki Theatre," but when we run it through the machine learning pipeline, you can see the suggestion: it says "a room filled with lots of different types of luggage." Kabuki, if you're not familiar with it, is a classical Japanese dance drama, I believe. And this is another example where we have just the name of the person the library described, and then just a generic description.
This is not a problem with machine learning itself, but if we are not careful, if we are not looking at the type of data we are feeding these algorithms, it's likely that we are going to perpetuate the problems we have in our systems over and over. The last slide is about the role of the library. On our campus, our provost recently initiated a new program called the Utah Informatics Initiative, and we are trying to see what the role of the library would be. In the first couple of meetings I attended, most of the focus seemed to be on data science and AI. Given the report that came out and the work we are all involved in, it seems that to have a seat at the table to talk about responsible operations, we need to demonstrate what we are capable of doing, both on our own data and in consulting and participating with other researchers on campus. What we see is that when some of the colleges talk about these issues, it's from a research perspective, but some of these problems, as we encountered, come up at the operationalizing stage, right? And I think your words will have weight if you have gone through the process, if you have actually been involved in a few projects, and if you are able to speak with authority about what data you have in the archives and the libraries and what sort of issues you are seeing. I think that would have a powerful effect on campus relationships too. The other point was about data sets. Just last week there was a conference, the Fantastic Futures conference, and one of the keynote speakers was from NVIDIA, the vice president for applied deep learning research there. The main point he was making at that conference was about data sets: for AI experts to do their work, they need data sets that are well described, that capture data from multiple perspectives, right? And I think that's a role the libraries can fill.
And again, this was explicitly mentioned in the OCLC report too. If you haven't seen it, here's a quick plug for AI4LAM: this seems to be one of the communities, or at least anchors, that we could all gather around to maybe start developing working groups, have some wide discussions, and develop some partnerships. Also, a quick plug for the LYRASIS Catalyst Fund: none of this would have been possible without its support. I know John is in the audience, so thank you so much for helping us kickstart this project and learn a few things. Good afternoon. It's been an eventful year for conversations and work around artificial intelligence and research libraries. If we look just over the last four months, we've seen the release of ARL's report on what artificial intelligence and the ethics of AI mean in the context of research libraries. The Library of Congress held a summit on libraries and machine learning. And this week, yesterday of course, we have the remarkable position paper from Thomas Padilla, Responsible Operations: Data Science, Machine Learning, and AI in Libraries. So while one could be forgiven if their eyes start to glaze over at this point at the mention of AI or machine learning, as these major documents and events make clear, we need to be talking more about these topics rather than less. And we need to begin doing so, I would argue, at greater levels of depth and with more nuance, as both Harish and John have started to do here today: to move from only high-level conversations to more degrees of depth and nuance in our conversations.
And so I'm going to attempt to operate at both of those levels right now, some depth and some abstraction, by talking through work we've been doing in the University Libraries at the University of Nebraska-Lincoln, where I co-lead a cross-disciplinary research team with a colleague from Computer Science and Engineering, Leen-Kiat Soh. Leen-Kiat and I began working together in 2014, and originally we gelled around a very particular, very specific project: the identification of poetic content in historic newspapers through image analysis, for the purpose of derivative corpus creation. Through this initial focused application, we have expanded our research agenda. Our team, known as AIDA, explores applications of image analysis and machine learning in digital libraries of historic materials. We are especially interested in what we might learn from the millions of digital images that librarians, archivists, and others are creating as they digitize the cultural record. And we're intrigued both by the questions that machine learning approaches might help surface in these collections and about our professional practices, as well as by the questions that our collections and professional practices might help to surface about machine learning. Across our team, which includes Leen-Kiat, myself, and two graduate students in Computer Science and Engineering, Yi Liu and Chulwoo Pack, we have expertise in intelligent systems and machine learning, computational methods, humanistic methods and approaches, and digital libraries. We also have collaborators at the University of Virginia on one of our projects, and they likewise bring expertise in literary studies and digital project development. So I want to take a moment to sketch a few parameters of our work, toward that call for more depth and nuance.
As we frame our work within libraries and machine learning, we're actually focused on a quite specific domain. We focus specifically on historic materials that have been digitized from the original or from a surrogate, so digitized from the original or from microfilm. We deal almost exclusively with two-dimensional materials, and our primary focus is on textual forms, such as newspapers and manuscript collections. We've done some work around graphical content in those textual forms, but we have not, for example, treated, as Harish's team has done, historic photograph collections or other artwork. At the same time, we're taking an image-based look at the textual materials, and this makes our work different from, say, natural language processing projects or those focused on optical character recognition and improvements in OCR. Ultimately, we have not actually been that interested in the semantic content, but rather in the visual signals we might glean from these materials. We've primarily used artificial and convolutional neural networks, and our projects have included both supervised and unsupervised learning. Again, the reason I'm spelling all of this out is that when we talk about machine learning in libraries, it's certainly not one size fits all; it's quite multifaceted. We've worked with historic newspaper collections for about five years now, with an emphasis on Chronicling America and, more recently, the Burney Collection of newspapers from the British Library, available through Gale. We have quite a few reports, presentations, and other materials about this work on our website, projectaida.org. In May 2019, for example, we released a report on what happened when we tried to take a model for identifying poetic content that we had trained on Chronicling America and apply it to the Burney Collection. It is a rather lengthy report.
The spoiler is that it didn't work that well, but we still created a pretty elaborate report outlining exactly what happened and where we think some of the limitations are. So although we didn't ultimately, through this application, identify a whole bunch of poetry, we took it as our responsibility to really talk through exactly what we had done and where things perhaps went off course. I'll note that the report reflects our first-generation approach, while our second-generation approach is faring significantly better, and we will be following up with additional reports. Most recently, my team has been collaborating with the digital innovation labs section at the Library of Congress, and I know many members of the Labs team are here at CNI and have been presenting their work. If you haven't had a chance to meet and talk with them, I can't encourage you strongly enough to do so. We've had that chance, and I'll talk a bit about the work my team has done with them; I've really been in quite some wonder at what that small but mighty team is accomplishing at the Library of Congress. So from July to November 2019, the AIDA team worked on the Digital Libraries, Intelligent Data Analytics, and Augmented Description demonstration project. The first half of this work is represented in Eileen Jakeway's post for The Signal blog, "Summer of Machine Learning: Collaboration with University of Nebraska-Lincoln." Over the summer and fall, we conducted a series of iterative explorations around machine learning, library processes, collections histories, materiality, and more. We have a series of explorations and could talk about any one of them, but just quite briefly: our first round of development focused on five explorations, namely document segmentation, extraction of content from segmented documents, document type classification, quality assessment, and digitization type differentiation.
In our second iteration, some of these projects converged and others branched, around document clustering, content extraction, advanced quality assessment, and differentiation of materials. Throughout, we focused on newspaper collections and minimally processed manuscript collections, with a brief foray into illuminated books. As just a brief peek into one aspect of our work: with the Civil War materials available through the Library's By the People crowdsourcing program, we tested approaches to distinguish between printed and handwritten materials, as well as between materials digitized from different sources, whether from the original or from microfilm. We also conducted advanced image quality assessment. That image quality assessment was not itself a machine learning endeavor, but it's facilitative of other machine learning endeavors, in part through better understanding the original materials and their digital representations. I think that's one of the other points I want to make: under the umbrella of what we need to be thinking about when we think about machine learning is not only the activities explicitly of machine learning, but everything else that we might need to prepare and understand in order to do that work effectively. Potential applications of just this work with the Civil War materials could be for end users interested in particular subsets of collections for teaching and research, something like whether a document is handwritten, or what it was digitized from, perhaps as a proxy for difficulty. So we're trying to think about different ways that even this kind of, in some ways, surface-level information has potential for how end users interact with the materials. But we also found ourselves thinking more frequently about connections with internal processes within the Library of Congress and with internal understanding of the materials.
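The point that quality assessment is facilitative rather than itself machine learning can be illustrated with one common non-ML sharpness measure: the variance of a discrete Laplacian response, where blurry scans (for example, from degraded microfilm) tend to score lower than crisp digitizations of an original. The sketch below is a generic illustration of that metric, not the project's actual assessment pipeline.

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Toy sharpness score: variance of a 4-neighbor discrete
    Laplacian over a grayscale image. Higher = more fine detail;
    blur suppresses high frequencies and lowers the score.
    """
    g = gray.astype(float)
    # 4-neighbor Laplacian on interior pixels: N + S + E + W - 4*center
    lap = (-4 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())

# Synthetic check: a high-frequency checkerboard vs a flat gray field
yy, xx = np.indices((16, 16))
sharp = ((yy + xx) % 2) * 255.0          # maximal pixel-to-pixel detail
flat = np.full((16, 16), 128.0)          # no detail at all

assert laplacian_variance(sharp) > laplacian_variance(flat)
```

A score like this could feed the kind of routing decisions described next, flagging which images are even candidates for downstream OCR or classification.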
For example, how might quality assessment help the Library make decisions about materials that go to crowdsourcing campaigns, versus those that might be good candidates for computational approaches, versus those that might require internal human expertise, with the understanding that most materials are potentially going to hit each of those? Likewise, how does differentiating among particular characteristics potentially help shape internal processes and procedures, as well as how we understand the collections, previous approaches, and decision-making? So we really saw machine learning as a point of inquiry toward lots of different avenues, including how we've treated materials in the past, the materials themselves, and our collections as a whole. We're still drafting the final report about this work, which we'll deliver to the Library of Congress in January. What you won't see in the report or our outputs is any splashy statement about all of the cool things we found in the LC collections with machine learning. We will be making code and data sets available, but these are toward transparency of method and approach, enabling others to critique our work as well as to contribute to shared benchmarking sets, among other challenges. The report will detail our varied explorations, including discussion of how accurate or not those approaches were, with some consideration of what accuracy even means in these contexts. Importantly, we'll also reflect on higher-level takeaways from this work, around such themes as machine learning as an exploratory mode through which we learn as much about ourselves and previous decision-makers. So machine learning as that exploratory mode, and not as a straightforward cost-saving end.
As well as the potential interplay of machine learning, crowdsourcing, and information professionals, and the full range of resources and expertise, computational, human, technical, and social, necessary to do this work within cultural heritage organizations. So that's an overview of the work we've been doing locally. You've heard from Harish and John, and in the last minute or so I want to share what the three of us have been thinking about as some next steps. You can see, I think, that we each come from working on our campuses, within our libraries, and with partners on campus in some different ways, and among us we have different areas of expertise. But what we knew is that we each are developing local expertise and conversations on this topic. So what happens when we begin to bring our different areas of expertise together? In the last few months, we've started framing this question: what would socially and culturally responsible machine learning application and development look like in cultural heritage digital libraries, and how might we achieve such a vision? I think you can see here the return to the opening comments about depth and nuance. We're saying specifically these applications within cultural heritage digital libraries, and seeing that as one place where we can perhaps have the most impact. And there's a real connection here to what we heard Thomas talking about this morning and what is in the OCLC report, and I think that some of the steps we've outlined converge nicely with the work he's presented there.
So we have sketched a project that begins to address this question through what we see as a series of investigations. These would lead to the development of sustained technical and social analyses; guidelines, frameworks, and knowledge bases; critical code and tool studies; collections assessments; code and data sets; and interviews exploring archivists' and digital librarians' knowledge of machine learning, including how they think about issues of bias. This work is quite new; we started framing it within the last several months, so there's lots of opportunity for you to help inform it. And as we understand from the research agenda that Thomas has shared with us, this is going to require all of us in the room and many more people beyond that. So we look forward to the conversation this afternoon. Thank you.