Thank you for attending our presentation today, and a big thanks to the CNI team for making it possible for us to be together, both virtually and in person. My name is Liz Gushy, and I'm the Associate Dean for Digital Strategies at the University of Miami Libraries, and I serve as the lead for the university's Esploro project. I'm joined by my colleagues and members of UM's Esploro team: Kinneret Ben-Knan, research and assessment librarian and subject liaison for Judaic Studies, and Angela Clark Hughes, the director of the Rosenstiel School of Marine and Atmospheric Science Library and the research impact strategist for the Esploro project. So today we're gonna tell you a story about our institution's investment in research infrastructure. Our story is about our institution taking risks on implementing new technology, but it's also a story about building partnerships, both internal and external, that have been critical to our success thus far. And ultimately, this is a story about our collective efforts to maximize both human and technical systems for the benefit of scholarship. So a little bit of context about Miami, and actually I think my stats are a little updated from what you're seeing here, but our university is a private, secular, not-for-profit R1 institution located on three separate campuses, which include a medical campus, a central residential campus, and a marine and atmospheric science research campus. UM has seven libraries that support 12 schools and colleges, and our population includes 17,000 students, which is our largest class to date, and 16,000 faculty and staff, 3,200 of whom are faculty. So Miami will celebrate its centennial in 2025, and in advance of this anniversary, our university has developed a roadmap to the new century, and among its core initiatives is a strategic reinvestment in research infrastructure.
And Miami, like many universities in the US, faces challenges in creating effective and efficient approaches to the management of its research information, and I'll go over a few of our challenges, which will probably sound familiar since you're likely experiencing them on your campus as well. We have many tools that perform overlapping tasks. Just one case in point is that until recently we used a combination of Digital Measures, Pure, and bepress for researcher profiles, and this resulted in an uncoordinated approach with incomplete coverage and lots of people maintaining systems that weren't really doing what we wanted them to do. We're distributed across three campuses, which makes it easier for individual schools and colleges to buy or build databases without consulting with stakeholders, so we have a lot of redundancy that's unintentional. Prior to UM's implementation of Workday, our systems did not necessarily benefit from having the same root source of data, nor did we necessarily leverage APIs to have data exchange where it was going to be most beneficial. And until the establishment of the RIMS Task Force in 2019, there had not been a cross-campus group assembled to make strategic and future-thinking decisions about research information on our campus and the resources required to support it over the long term. Alongside our challenges, however, are also our aspirations. And a driving force, of course, is to support scholarship and to increase competitiveness for grant dollars. And two initiatives that our school has had a particular focus on in recent years are interdisciplinary teams of scholars to address complex societal challenges, as well as backing teams who are invested in community engagement and positive change. Miami wants to be better able to highlight the expertise of our researchers both internally and externally.
We want to be able to facilitate connections across researchers, the academic units, projects, and their activities, especially in support of interdisciplinary teams, where tracking that activity can sometimes be tricky. And perhaps most importantly, we want to create a record of curated data about our university's researchers and their research and activities and to reframe this data as a valuable and durable institutional asset. So into this landscape steps Esploro, and you might ask, what is Esploro? So Esploro, produced by Ex Libris, is a research information management hub that aims to support the collection and use of multiple types of scholarly outputs and citations of researchers affiliated with a specific research institution. A core feature of Esploro, when fully matured, will be its ability to create an accurate and complete representation of researchers' output by automatically capturing research output using AI and machine learning algorithms. And Kinneret will talk about this in more detail. When Miami was approached by Ex Libris as an early development partner several years ago, we recognized this as an opportunity to rethink our institution's approach to research information management and to address many of the challenges that I previously outlined. So how did we start? We started with a strategic partnership on our campus, which was already in place. Our university libraries and the Office of Research and Scholarship have had previous positive projects working together that have been durable over time. One has been to create a vision of a library faculty commons, and another is to have librarians as active members in interdisciplinary research teams. And it was that kind of foundation of working together that had them now turn toward research information, thinking about what we wanted to do at our campus and getting stakeholders across the campuses invested in this project and talking about it as well.
And from that, the University of Miami then moved from an early development phase with Ex Libris into an early adoption phase. And then working groups were established by Ex Libris with partner institutions to prepare for migrations to Esploro, as well as to gather core requirements and features from the university partners who are investing in Esploro. And during this period, Ex Libris visited Miami on numerous occasions to talk to our Office of Research and Scholarship, the libraries and our research deans. And in fact, we also had an analytics workshop during this time, where Miami invited a number of partner institutions, thereby building our network of early adopters and Esploro users. And then turning the microscope a little bit closer onto Miami itself, we've had three groups that have been ongoing now for some time to keep our research infrastructure and conversations about research architecture in place. So we have the Research Information Management Group, which is a combination of top-level university leaders led by the libraries and the Office of Research and Scholarship. We have the Esploro Implementation Team, which is a mix of librarians and representatives from the Office of Research. And we also have the Esploro Change Management Group, which was recently put into place as we're moving Esploro from implementation to a campus service. And I can't emphasize enough how important it is to have information flowing from each one of these groups. And we've managed in part to do this by having several members that overlap into all three of these groups. So my colleague Angela and I, for instance, are a case in point. So we're particularly busy working on all three of these groups at the same time. So as this project matures, and again, as we're moving Esploro into a service on our campus in 2022, it's occurred to me we're going to need to build in more staff.
Up till this time, we've been working on this project while also being fully involved in other aspects of our jobs. And so at this time, I'm wanting to develop a lead for RIM services within the library, and you can see the staff that would be associated with that unit. And they will work in a cross-matrix sort of way with our liaison librarians, and Angela will talk about that, as well as with our research and assessment librarian, while also maintaining contact and interaction with our research navigator in our Office of Research and team members from UMIT. So in closing, here is what I know now and what I would tell you to do if you were going to embark on improving the research flow at your university. First of all, assemble your stakeholders across your organization to have these conversations. And consider having internal and external stakeholder conversations. Communicate often and in a variety of campus forums. Expect that this is a long-term project, but it will result in long-term benefits. And know that there'll be mushroom events. And I call these things mushroom events because they seem to pop up. It seems like you've got everybody on board and facing in the same direction for what you're doing for research infrastructure. And yet they go off and they build their own database and pretend they didn't know what was going on. Well, just move on. It doesn't really matter in the long run. And keep people invested in your project by tying it to other initiatives. So for instance, we are tying Esploro to the support of our interdisciplinary teams as well as our ORCID rollout. And invest. Invest in your people, invest in your data. It will absolutely be worth it in the long run. And at this point, I'm going to transition to Kinneret, who is going to talk about the collaborations between humans and machines.
Thank you, Liz. Hello, everyone. Good afternoon.
In my segment, I'll be focusing on the implementation phases of Esploro by laying out our strategy in adopting Esploro and its AI features. Our goal, as Liz already mentioned, is to build a unified system of record that reflects a complete picture of the research outputs produced by UM's current researchers using Ex Libris Esploro. This is with the intention to satisfy the research-related information needs of diverse entities across the university. Esploro can be described as a smart blank canvas. It includes integration and machine learning features to support the information flowing into Esploro researcher profiles. We began the implementation phases by first streaming information about UM researchers from our HR system, Workday, using Esploro integration features: information like researcher name, organization unit, research subject area, and internal IDs. Then we started bringing research outputs, publications, into each researcher profile by utilizing the system's machine learning capabilities and the Central Discovery Index. So the Central Discovery Index is Esploro's metadata source pool. Based on Ex Libris documentation, it includes billions of records that are updated either daily or monthly from multiple sources: publishers, aggregators, and repositories of various kinds. The CDI, based on Ex Libris documentation, harvests records from all subject domains. The machine learning, or author matching algorithm, features are embedded in two main processes. The first one is the smart expansion. It's a retrospective outputs import. The smart expansion loads assets that are known to belong to the researcher. I'll be talking about that process more in my comments. The smart harvesting, which comes after the smart expansion, fills in information gaps, bringing in new publications and ongoing output updates. Angela will provide additional information about the smart harvesting implementation.
So understanding what Esploro's author matching algorithms mean has been fundamental in shaping our strategy on how to go about implementing Esploro. We needed to understand first how this AI model works, what problems it's trying to solve for us, and what AI algorithms cannot do. So let's start with how it works. Esploro's machine learning algorithms learn how to match researchers to their research output by taking into consideration variants of researcher names, their affiliations, research domains, years of research activity, and many other data points from their record and from their previously known assets. All of this is matched against research assets covered in the Ex Libris CDI. The algorithm assesses the probability that a relevant record is the output of a given researcher. And if yes, the solution ranks the confidence level of its determination as very strong, strong, or uncertain. What problem is it trying to solve for us? The AI aims to automate the updating processes of researcher profiles, and thus reduce the effort needed to sustain a research management system that is relevant and always up to date. And what can the AI algorithms not do? AI cannot be self-taught from scratch. It requires humans in the loop and lots of tagged data. What does that mean? It means that for Esploro to automatically capture relevant outputs and match them with our affiliated researchers, then place these into the right researcher profile, our team needed to do a little bit of legwork first. We needed to locate UM publications recorded in other external sources, open sources preferably, and match these with the correct identity of our active researchers. Regardless of the solution, the algorithm constantly improves through the use of controlled import data and, preferably, feedback data. The more controlled information it collects, the better able it will be to accurately and consistently identify researchers' work in the future.
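To make the idea of ranked match confidence concrete, here is a minimal sketch in Python. It is not Ex Libris's algorithm, whose internals the talk does not describe: the three signals (name variant, affiliation, years of activity), the equal weighting, and every name in the code (`Researcher`, `Candidate`, `confidence`) are assumptions chosen only to mirror the data points and the three confidence labels mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Researcher:
    name_variants: set      # known spellings of the researcher's name
    affiliations: set       # institutions the researcher is linked to
    active_years: tuple     # (first_year, last_year) of research activity


@dataclass
class Candidate:
    author_name: str
    affiliation: Optional[str]
    year: int


def normalize(name: str) -> str:
    """Lowercase and strip punctuation so 'J. Doe' and 'j doe' compare equal."""
    return " ".join(name.lower().replace(".", " ").replace(",", " ").split())


def confidence(researcher: Researcher, cand: Candidate) -> str:
    """Toy rule: count agreeing signals and map the count to a label.

    Hypothetical scoring; a real system would estimate a probability
    from many more data points and from previously verified assets.
    """
    score = 0
    if normalize(cand.author_name) in {normalize(n) for n in researcher.name_variants}:
        score += 1
    if cand.affiliation and cand.affiliation.lower() in {a.lower() for a in researcher.affiliations}:
        score += 1
    first, last = researcher.active_years
    if first <= cand.year <= last:
        score += 1
    return {3: "very strong", 2: "strong"}.get(score, "uncertain")
```

In this sketch a record that agrees on name, affiliation, and publication year comes out as "very strong", while a record that only shares a name variant falls back to "uncertain", which is the case that needs a human in the loop.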
What I'm trying to say is captured by "we know more than we can tell": Polanyi's paradox, named in honor of the Hungarian-British philosopher and polymath Michael Polanyi. In 1966, Polanyi argued that to a large extent, human knowledge and capability rely on skills that are often beneath our conscious appreciation. Today, machines learn more than what was imagined back in the 60s, but only if we teach them through supervised learning. In the case of the picture shown here, in these slides, we need to tell the machine which is the puppy and which is the muffin a million times in order for it to learn. In Human + Machine, Paul Daugherty and James Wilson show that the essence of AI transformation lies in human and smart machine collaborations. In fact, many of the AI pioneer companies they interviewed see AI adoption as an investment in human talent first and technology second. In their book, they stress the need to develop the missing middle roles, where collaboration between humans and machines happens: investing in human roles such as trainers, who train machines with the proper data, and explainers, who explain the machine's outcomes. Then, when the AI is trained well, it can give humans superpowers. So, for our project, to obtain the right metadata for enriching and filling up our researcher profiles, we needed to identify UM publications recorded in other external sources, open sources, and match these with the correct, active researchers at UM. Then we needed to input that information into Esploro as our tagged data, tagged by humans to train the machine. The feature that Esploro provides for this procedure is called smart expansion via CDI. Smart expansion works with a list of DOIs known to belong to the researcher. To run smart expansion, we needed to provide the system with a list of DOIs and PubMed IDs of our researchers' assets and associate each DOI in the list with the internal IDs of our active researchers.
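As an illustration of what assembling such a list might look like, the following Python sketch pairs each known DOI or PubMed ID with an internal researcher ID and writes the result as CSV. The column names (`researcher_id`, `id_type`, `identifier`) and the function names are hypothetical; the talk does not specify the actual file layout that smart expansion expects, so this shows only the general shape of the task.

```python
import csv
import io


def build_import_rows(known_outputs):
    """Turn (researcher_id, doi, pmid) triples into de-duplicated rows.

    DOIs are lowercased because DOI names are case-insensitive; records
    with neither a DOI nor a PubMed ID are skipped. Assumed layout only.
    """
    rows, seen = [], set()
    for researcher_id, doi, pmid in known_outputs:
        if doi:
            key = (researcher_id, "doi", doi.strip().lower())
        elif pmid:
            key = (researcher_id, "pmid", str(pmid).strip())
        else:
            continue  # nothing to point the index at
        if key not in seen:
            seen.add(key)
            rows.append(key)
    return rows


def to_csv(rows) -> str:
    """Serialize rows with a header line; column names are illustrative."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["researcher_id", "id_type", "identifier"])
    writer.writerows(rows)
    return buf.getvalue()
```

De-duplicating before the import keeps one researcher's asset from being requested twice when the same publication shows up in more than one source.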
The smart expansion import file acts as a prescription-based request. The file is locally assembled by us to guide the CDI database on which assets to import and to whom these assets belong. Then after this, the smart expansion machine learning process attempts to suggest matches of co-authors that may also be part of the UM faculty. To assemble these files for running the smart expansion, we first used data from the University of Miami's Pure RIM system, which covers only a subset of UM researchers. The research outputs in Pure were already linked to UM internal researcher IDs, so it was a straightforward process to assemble that list and run the first smart expansion. Then we used Microsoft Academic open data by accessing their database in Microsoft Azure and using the REST API to pull University of Miami publications. We developed a matching mechanism to match the internal researcher IDs of our active researchers to Microsoft Academic author IDs, and then bring in only active researchers' outputs. This was possible only due to the transparency and openness of the Microsoft Academic project and to the fact that they provided unlimited API pulls for free. The Microsoft Academic project started in 2016 with the aim of assisting humans in conducting scientific research by leveraging machine cognitive power, machine learning. Unfortunately, last May, Microsoft announced that the Microsoft Academic website and the underlying API will be retired at the end of this year. Through this process, we have been able to bring in about 100,000 research outputs, publications, to more than 2,000 researcher profiles in our Esploro. With the use of Microsoft Power BI, we developed this dashboard that automatically ingests Esploro and Microsoft Academic data to present an overview analysis of the project status and progress. Through this dashboard, we are constantly trying to assess and inspect the AI performance.
This is to identify mistakes and issues and to better understand and explain its operation and outcomes. We are trying to be very deliberate about how we measure this project. And you can see this is just an overview, but when you click on one researcher, you can see a comparison between the Microsoft Academic publications and what we have in Esploro, and also the subject areas of the publications in Esploro. That's page one. We have other pages of this dashboard where we measure and inspect other areas of this project. And now I'll give the stage to Angela to speak about smart harvesting and collaboration.
Thank you, Kinneret. For today's researcher, there is no shortage of profiling systems that showcase a researcher's output and expertise. The drawback of many non-proprietary systems is their heavy reliance on mediated or manual input. Proprietary systems like Scopus or Web of Science do a better job of compiling output to create a profile, but they too require the researcher to identify those articles and claim the profile to affirm their collection, whilst limiting research that falls outside of their proprietary systems. And that is because the intelligence behind the system is prone to common errors in matching, like the puppy and muffin example that Kinneret displayed earlier. It requires a trained eye to see the difference. The smart harvest in Esploro works essentially the same way when the profiles are empty. The intelligence is rudimentary. For example, when we ran the smart harvest on one of our University of Miami law professors, Lily Levy, the system brought in any article from the Central Discovery Index that contained Lily Levy, as well as first initial L and Levy, in the author name field. Articles containing just the first initial and last name are brought into Esploro as an uncertain match, which means that it requires an action or a mediation to approve or decline an asset on that researcher's profile.
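The name-only matching described above can be sketched in a few lines of Python. This is an illustration of the behavior as described in the talk, not Esploro's implementation: the function name `name_match_kind` and the labels it returns are assumptions, and it only handles simple two-part names.

```python
def name_match_kind(researcher_name: str, author_field: str) -> str:
    """Classify how an author-name field matches a researcher's name.

    Returns "full name" when first and last name both appear,
    "initial only" when only the first initial and last name appear
    (the case described as an uncertain match), otherwise "no match".
    Hypothetical helper for illustration only.
    """
    parts = researcher_name.lower().split()
    first, last = parts[0], parts[-1]
    tokens = author_field.lower().replace(".", " ").replace(",", " ").split()
    if last not in tokens:
        return "no match"
    others = [t for t in tokens if t != last]
    if first in others:
        return "full name"
    if first[0] in others:  # a bare initial such as "l"
        return "initial only"
    return "no match"
```

With an otherwise empty profile, "L. Levy" matches on the initial alone, so there is nothing else to distinguish one researcher from another, which is why such records are queued for a person to approve or decline.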
The publication date on this particular article was 1886. So it was immediately red-flagged, verified as not belonging to our current researcher, and declined. But with over 3,000 researchers at the University of Miami, we knew that in order for this initiative to be sustainable, we were going to need to add additional resources. As Liz alluded to, we needed a bigger boat. So before turning on the smart harvest, we began expanding our system with known and verified content from other systems to train the intelligence, as Kinneret has detailed. We also implemented a series of train-the-trainer sessions to engage our subject liaison librarians in this process. These will be the humans with the trained eyes, who not only have more knowledge about the research and the types of research in their areas, but also have established relationships with the researchers. Together with the help of the liaison librarians, we can improve the accuracy of Esploro's smart harvesting AI capabilities for future publications. The librarians were given access to the back end of Esploro, and we developed an extensive workflow which covered reviewing assets, comparing and verifying, approving, denying, adding, and editing assets. This product is still under development. So of course the training will be ongoing, and we also anticipate others participating as well, adding to the human resources. But with the liaison librarians in training and able to assist in asset verification, we continue to expand the researcher profiles, bringing in known assets of all types across all disciplines to train the system. Now it's time to engage the researchers. So in February of 2022, we will embark on our first Esploro pilot program. The Esploro implementation team has selected three areas to focus on: my area, marine and atmospheric sciences; architecture; and nursing.
And the liaison librarians in those areas will act as Esploro contacts while simultaneously promoting ORCID as a mechanism to ensure the accuracy of future harvested assets. We are coordinating our efforts with UM's Change Management and User Adoption Unit, and this team is responsible for training services at UM and for communication and management strategies for all IT-related projects. And finally, we will engage the selected researchers. This is an example of one of our profiles. The selected researchers will be asked to view their public profiles and give us feedback on the pathway, validate our expansion efforts, and tell us what we got right and what is missing. We will have some in-person training and do some video tutorials on managing their profiles and on how they can add proxies to their account to assist in the management of their profiles, plus sessions on adding and editing their content as well as removing erroneous assets or just suppressing them from the public view. The end goal is that our product will also continue to develop, and we will switch on the smart harvest and allow a mature machine to do the heavy lifting for us as we build toward that unified system at the University of Miami. So stay tuned, there is more to come. And with that, we will be happy to take any questions. Thank you.
So my name, oh God, sorry. My name is Charles Watkinson. I'm Associate University Librarian at University of Michigan. Thank you very much for the presentation. I wanted to ask about your situation with these various research information management systems and how you're thinking about the sort of politics there and handling that. So you mentioned, I think, that you had Pure on campus. You also had Digital Measures, and Esploro is now sort of entering the picture. How do you see this playing out, and how are you thinking of articulating the particular role of Esploro, and therefore kind of the role of the library as well, in this evolving landscape?
Will these be complementary systems? Will they be broken down by discipline, do you think, in the future? Or do the other ones have to hit the floor for one to appear? Because certainly at Michigan, we're a Digital Science campus. We used to be a Pure campus, and it doesn't feel like there's space for more than one player in that environment.
That is a multi-layered question. Let me see what I can do here. Well, first of all, I don't think anyone on our campus is going to be particularly sad when Pure goes, primarily because it really has been a limited number of researchers that are using that tool. Another way to think about this is that I think our thoughts about Esploro are evolving on our campus, and I think what we're trying to do is to build Esploro as a source of truth so that the data from that system can populate other systems in a way which is truly beneficial no matter what the front end looks like. But then as I mentioned, we've got all sorts of different collaborative groups on campus trying to keep the conversation current. And admittedly, during the pandemic, there was an absolute halt, especially at the university leadership level with our RIMS team, and I'm sure everyone else experienced this as well, where the campus was completely overtaken with the digital-first and the teaching remotely and testing of students and testing of faculty and staff. And so that was just the thing that took over a full year. But again, as things are quote unquote getting back to normal, our RIMS group are having these discussions, and I do think we will have fewer systems and I do think we'll have more integrations, and I think that's where we're hoping to go. And having change management really helping us will hopefully make it a less bumpy ride. Thank you for your question.
Hi, thank you for a great talk. My name is Boyan Kim, University of Rhode Island Chief Technology Officer.
I have two questions, and I do not currently work very heavily with RIM. So if my question's a little bit out of touch, I hope you can still take that into consideration. So the first question that popped into my mind is, don't scholars, your research faculty on your campuses, currently do this for their Google Scholar profile? Like editing their publications' information in their Google Scholar profile. So in your planning for this, have you considered that at all, and how did that impact your decisions about this project? And the second question is, if you're working on author matching in scholarly literature, I think the traditional path that has been taken in many IT systems projects for this in the past is usually mapping authors' emails or institutions. That's, as far as I know, the most traditional, kind of straightforward way of trying this. So I was curious if your intention in trying this path, using machine learning, was based upon the intention to develop a model based upon discipline, or were there any other reasons for going down this path of matching authors with their DOIs and research topics rather than some other identifier information for authors?
Thank you for your question. I'll take the part about the author matching. The Esploro system does not just work on author name matching; that was sort of the example I gave for the smart harvest. It matches on other parts of the researcher's profile. So that is in essence why we are training the system. We're putting all that content in, different researcher identifiers as well, so that when information is passed from the CDI into our researcher profiles, it will give more than just an author name match. And an email address is also something that we've done in the past, so that could help us identify our researcher versus another researcher at another institution. So definitely those systems are in place.
One of the reasons why we chose Esploro and why we're so excited about Esploro has to do with a lot of the systems that exist. Take the example of Pure: it was a medical campus implementation. It was very heavily utilized by our School of Medicine, in which case the humanities were largely ignored, as were social sciences and even other science areas on our campus. So one of the reasons why we wanted Esploro is we wanna get away from that siloed kind of activity in setting up a system, and Esploro can cross all disciplines in all areas, and we're very excited about that, because the coverage has been lacking in areas like the humanities, and so we're bringing in that content now.
I didn't realize you could make these things work. Okay, here we go. So was the first question, and I may have not caught it quite, whether we expect the scholars to edit their own profiles?
No, I was just asking because we know that a lot of scholars already do their own editing in Google Scholar, and usually when you implement RIM systems, giving faculty all sorts of incentives to go in and edit their records to be correct is the biggest hurdle, right? But they are willingly doing that in Google, because Google Scholar is really important for promotion and tenure processes. So it's sort of like, you try to really engage your faculty in that respect on your institutional projects, but then these faculty will not be so responsive to that, yet they will all go to Google and spend hours and hours. So I was just curious if you considered that as a part of the consideration for this project and what kind of impact it had on your decisions?
Yeah, no, that's a good question, and thank you for elaborating. I'm sorry I didn't catch the whole thing. I think we're gonna learn more during our pilots. I really think so.
And I would have to say that some of our faculty, as we all know, are far better about maintaining their profiles than others, and I think the AI feature will really help to populate a lot of their research output automatically. I think if we can truly get that matured in the way that Kinneret described, we'll be in great shape. Our point is to be able to capture output across all disciplines but make it less intensive for the researchers themselves, and I think having integrations across different systems will help as well, such as ORCID and possibly other things.
Do you hear me? I wanted to say that, first of all, those systems are for the faculty, the researchers themselves, but there are also other reasons why we need this kind of unified system in our university. We don't have control over Google profiles, and this collection of our university research output can be used for other needs, research needs, university needs. So it's for the individual faculty, for the individual researchers, but it can also be of great value to the university.
Hi, Amy Couttsman, Sacramento State University, California State University Sacramento. We're using Esploro; we went live probably two months ago. What we did with populating our faculty profiles, and this is probably one of the only upsides of COVID, is that when we had to shut down our libraries, we had a covenant with our students to keep them employed, to give them jobs, but they couldn't come on campus. And so we trained up our students on how to enter data; they were proxies for Esploro. We asked faculty for their CVs, and we had several hundred given to us, and we just had people go to town entering this. And what's really fabulous about it, as you stated, is that our humanities faculty are represented in a manner that they've never been able to be represented before.
So if they're choreographers, or we have a jazz faculty member who's uploaded his work onto the page, people can listen to it. So we're able to connect with our community to be a resource in ways that we were never able to be before, and it's really exciting. It's not done. We probably have about 30% full profiles. We're about 70% of the way to having all the CVs in, and it's gonna be a lifetime of work as long as it sticks around, and scholars.csus.edu is up and running. Thank you.
Hi, we are live too. I'm Keith Webster from Carnegie Mellon. Just as the last person said, we certainly found the pandemic was a great time to use our student employees to update profiles in our non-Esploro system, and that truly has helped us dramatically. And that has been one of the incentives we've found, which is being able to show our faculty relatively well-curated profiles for them to verify and validate. The other has been the incentives around what they can use their profiles to do, whether it's faculty annual reports or promotion and tenure case books or biosketches for the NIH, et cetera. I'd love to hear a bit about what you anticipate Esploro being able to do once it's live and kicking. Do you have particular institutional use cases that you can share with us as we think about the opportunities these systems bring forward?
I think that will be part two. But thank you for sharing your ideas. Quite frankly, this year has been so completely focused on getting these profiles populated, and then we will turn toward the pilots, where I think we really hope to get faculty feedback directly from those users about ways that they could really see this as being added value.
Great, okay, thanks.
Sure, that may be it. We're blinded up here so I can't see if there's anybody else at the mics. Well, thank you very much for coming today and for your great questions and comments. We really appreciate it. Thank you.