to get started. My name is Diane Goldenberg-Hart. I'm with the Coalition for Networked Information, CNI. And you have reached a webinar that is part of CNI's Spring 2020 virtual membership meeting. And we're so glad that you made some time out of your day to join us here today. Today's webinar will be a panel discussion, including a conversation about current trends and issues in discovery systems, talking about various kinds of features, user behaviors, identifying needs by analyzing transaction logs, and also using artificial intelligence and machine learning for discovery. Our panelists will also be talking about the recent OhioLINK and Ithaka S+R white paper on user-centered library systems and the concept of full library discovery. You may have been fortunate enough to catch a talk on that report that we also had at CNI in April. And we will chat out a link to that video if you didn't get a chance to see that. Our talk today is entitled Discovery Systems in 2020, Issues and Trends. And we'll be hearing from four speakers. We'll be hearing from Lorcan Dempsey of OCLC, Tom Cramer of Stanford University, and Bill Mischo and Michael Norman of the University of Illinois at Urbana-Champaign. Before I hand it over to our speakers, I just want to orient you very quickly to a few features of the webinar environment. One is that we have a Q&A box. If you look at the bottom of your screen, there's a little button that says Q&A. If you click on that, a box will pop up. You can simply type in your questions or your comments in that box at any time. And after our panelists have completed their entire presentation, I'll come back on to moderate those questions. We also have a chat box, as I alluded to earlier. We'll be sharing some information with you there, but you should also feel free to use that chat box to communicate with us and communicate with the other attendees on this webinar. So without further ado, I want to thank everyone one more time for being with us here today. And a special thank you to our panelists for their presentation today. And with that, it's over to you, Lorcan.

Thank you very much, Diane. Pleased to be here. When we were talking about this session, Bill suggested that I say a few things maybe by way of general introduction, but also say a little bit about the BTAA operationalizing the collective collection report and full library discovery, as was mentioned. Given the times have changed somewhat since that discussion and we're in unusual circumstances, it seemed to me that it would be sensible to say a little bit about rediscovery in a changed environment. And I'm going to talk about three things very, very briefly. They're really quite high level and I think will impact and change the way we think about discovery over the next while. They accelerate current trends. My comments are partly based on three sources, which I just thought I'd put up very quickly. Resource Discovery for the 21st Century Library: I have an introduction in this volume. It's coming out next month from Facet Publishing in the UK. Then the BTAA report that was mentioned. And then I released a blog entry a couple of days ago talking about the ways in which collections have changed in the current environment, the way in which we think about collections differently, the way in which the relationship between the library and the collection has changed. The library collecting activity, as it were, has peeled away from the locally managed collection in various ways. So I'm going to talk about three things and I'm going to relate each of those three things to a pandemic effect. So one of the things we're seeing at the moment is a lot of discussion about what will change, what will persist, what will be accelerated, what might go away. So three pandemic effects have been very pronounced in the library context.
First, obviously, there's been a forced migration online, but I think the effect of that is, as people begin to interact with services online, we really begin to think about what a holistic online experience means. What does it mean to provide the full library experience in an online environment? The second thing is really a focus on mission. Universities and colleges are really now very focused on their distinctive impact, very focused on where they should be putting emphasis, very focused on strategic directions when they come out of the current situation. And I think for libraries there is going to be an increased focus on alignment with evolving institutional priorities. And this means that they will want to optimize. There will be pressure on budgets, there will be pressure on institutional alignment, on being seen to contribute to institutional priorities at a critical time, at a difficult time. So online, on mission and optimize. And I'm going to say a little bit about a discovery effect of those three pandemic effects. So a little bit about full library discovery, a little bit about the discoverability of institutional assets in the context of a focus on research. And then under optimize, stretching a little bit, thinking about D2D, discovery to delivery, but suggesting that increasingly we're going to think about D2D in the context of decision support or dashboard. Increasingly we're going to have data-driven decisions, we're going to want to think about how to optimize things in the context of data, usage data, traffic that suggests behaviors, choices and so on. So discovery to delivery to dashboard. Okay. So first of all, holistic online experience. Now I am using PowerPoint in the way it was supposed to be used here, with lots of bullet points. I've shifted back to bullet points for this presentation.
So I think one of the effects, one of the pandemic effects we see, is that the library identity is this sort of strange hybrid between a set of services and an actual building, a physical manifestation, symbolically manifest on campuses. And I think one of the effects of the pandemic will be to make that fully online experience very real, and the experience of the library and the identity of the library manifest in that online experience. And this means, as a target or as a new target on the horizon moving forward, we're going to have more pressure to think about how to deliver fully online, how to deliver the full range and richness of the library experience online. So you have things like consultation and expertise. How do you substitute for the face-to-face interaction that creates the relationships that allow you to develop research support or other services? Clearly there are information integration issues, integrating into learning management, interaction and programming. Big focus on personal interaction. We've had discussion about customer relationship management systems, but also profiling over the years, and how to get into the user flow, how to use social more. So I think all of these things, mixes of these things, are going to become more prevalent as we think about that fully online library experience. At the same time, we're seeing a new relationship between collections and the library, a new relationship between the library and collections. Increasingly we're thinking about facilitating access to collections that may not be locally owned and curated. We're thinking about collective access to collections across groups of libraries, across consortia. So our model of collections was of the careful construction of a locally acquired collection.
But in a sense, even though that's still quite central to library organization and operation, a lot of what we do has moved away from that, because now what we're really thinking about is how you optimally satisfy research and learning needs from a facilitated network of resources. There are resources that are acquired locally. There are resources that are collaboratively provided. There are open resources, there are commercial resources. So we still provide literature search and so on through that discovery layer, but you don't own everything in the discovery layer. Resource guides are a really interesting phenomenon. They're like tribbles in Star Trek. You looked in the room once and there were two or three of them, and now all libraries have all of these resource guides, and this is a signal, if you like, I mean a signal of various things, but one thing is that you're facilitating access to a range of things that are arrayed around the needs of that particular course, that particular subject, that particular area. Clearly big emphasis on open access, thinking about how to deploy, array, open access resources, how to access them more effectively. Open educational resources: we facilitate access to a whole array, then, of network resources, free network resources, and try and tie library resources into those. At the same time, we connect to acquisitions. Discovery connects to ways in which we acquire materials, to demand-driven acquisition. We're offering spot acquisitions, the ability to order a document. Increasingly we'll see more smart fulfillment around resource sharing, the integration of acquisition, resource sharing, discovery to develop a richer view onto what is available to the person. We will buy the professor the book from Amazon rather than request it if it's not available within a certain amount of time. So this whole set of services is beginning to provide facilitated access to a network of resources.
And interestingly, we used to be in a situation where the collection drove discovery. You had a collection and you wanted to discover what was in the collection. In a demand-driven or a facilitated environment, that's sort of flipped a little bit. Discovery, what somebody has access to, tends to sometimes influence or drive the collection. And what comes out of this as an early manifestation is a focus on full library discovery, thinking about not just access to a literature search but access to the range of materials that you facilitate access to. And I think over the next while we might see a sort of trend emerge here where you have different levels. So historically, then, the library provided access to the acquired library collection, what was bought and licensed, and provided discovery to that. We're beginning to see, through bento box displays, through a focus on full library discovery, and Bill will talk about this in a while, access to the website, to events, to various things, to programs, access to expertise by pulling up relevant librarians or experts in response to a particular query, and access to a broader facilitated collection. Yes, you have the articles, you have the library catalog, but you also have a web search. You have potentially access to Google Scholar, various other things. So we're beginning to see this broader array of things pulled in. Now beyond that, over the horizon again, there's how do we think about that full library experience? So I think we're seeing a move, as library discovery is peeled away from that library collection, to think about this broader array, and currently we're sort of thinking about an array of services across wider aspects of the library. And I think that trend will continue as we think about that full library experience. The second thing I said was important was thinking about the mission.
So we did some work a while ago with Ithaka S+R where we sort of pulled out a model saying that universities tend to have three poles, three emphases, and these will vary depending on the institution that you're in. Clearly there's a distinctive research focus, with doctoral research and scholarship, but then a focus on liberal education, broad undergraduate education. And for many institutions, quite a strong focus on preparation for professions, on credentialing, on moving forward. And most institutions will have a combination of these, but they will lean one way or the other. And consortia are quite interesting because they contain a mix of these quite often. BTAA less so because it's a consortium of peers. OhioLINK, very much so. You have Case Western Reserve University. You have Ohio State, very strong in research, very strong in undergraduate education. You have somewhere like Franklin University, very strong in career preparation. Other institutions, very strong in undergraduate education. But universities are going to get much more purposeful, much clearer about where their distinctive value resides, and sharpening what they do to deliver value in that context. And I think one result of that is, thinking in the context of a CNI audience, thinking about research institutions, and thinking about what has happened over the last while, where research itself has been affected by the pandemic. So a strong pandemic effect is the impact on the research culture that we're experiencing all around us. Now, a sort of temporary ceasing of some types of laboratory research, but at the same time, a really big focus on short-circuiting processes and practices to get material, to get research outputs out earlier. Big focus on collaboration across disciplines, across institutions, urgency about reporting results, much greater use of open channels, and then concern about assessing validity and relevance, which has really come up in the last week or two.
At the same time, we could look at the way in which publishers are making deals for temporary access, various other things that are sort of changing the way in which we think about how research is communicated. Some of that will rebound, some of that might stick, but we're in this period of questioning also about research. There's a stronger desire to showcase expertise, potential contribution within the institution. So I think we'll see, coming out of this, research libraries much more purposefully curating, managing, making more discoverable research outputs like preprints and research data, and also becoming maybe more involved in the disclosure, the discoverability of expertise on campus, in a way that's already quite common in other parts of the world, and there's a lot of activity in the US as well. It's very clear, we're very familiar with how this manifests itself, but I think this focus on discoverability of institutional resources will grow. Currently, when we talk about discovery, quite often we talk about discovery of outside-in resources, the ability to find articles, the ability to find books, the ability to find resources more generally. I think one of the things that we'll see in research institutions now is this focus on discoverability of inside-out resources, discoverability of institutional assets, institutional materials. So things like research data, preprints, institutional repository, and I quite like the way the Purdue webpage is organized because it very clearly shows this: at the top you have discovery, discovery of materials as it might be, and at the bottom you have discoverability of Purdue assets, Purdue intellectual outputs. So you have Purdue e-Pubs, the institutional repository, publications; you have e-Archives, special collections and archives; and then you have PURR, which gives access to research data. So these are all institutional assets that you're interested in sharing with the world.
And the dynamic is very different here because, sure, you want your local population to see these and understand what you have, but from a reputational point of view, from a scholarly point of view, from a dissemination point of view, you want to share these materials with the rest of the world, you want to push them out, you want to make them discoverable. So that inside-out focus is becoming more interesting. At the same time, we have seen quite a few libraries becoming involved in the development of expertise systems on campus. This is the one at Minnesota, where you're looking at faculty profiles and outputs and sharing those with the world. Number three, optimize. Everybody is going to be very focused on optimizing against particular goals, optimizing their collections, their services. And really you have to choose the goals, and in a teaching and learning institution, you will be very optimized on immediate support for learning, student success, retention, thinking about that new experience. In some other environments you may optimize for other things. Just thinking about collections, they'll be much more optimizing for value, and the discussion with publishers is very interesting in that regard. Probably more optimizing for open, certainly for curricular support, and some emphasis on regional and local affairs. Big push towards collaboration; it suggests there will be more collaboration. And I think one area that will maybe get a bit more emphasis is pluralizing collections, diversifying collections, representing and respecting communities that are overlooked, shunned, ignored in the context of collections that have been developed according to certain characteristics or criteria. So all of this means, though, that there will be an increased emphasis on decision support, because to optimize you need data, you need to understand how things are being used, what's not used, how to really focus in on making choices. So data is required to support choices and that leads to dashboard.
Now, Bill had suggested talking about the operationalizing the collective collection report which we did for BTAA last year. And really here what we were looking at was discovery to delivery, the complex array of services within a major library consortium that allows them to share materials across those libraries. And discovery in that context is part of a very complex ecosystem, because Illinois is part of BTAA. It's also part of Illinois infrastructure. Ohio State is in OhioLINK. Rutgers is in PALCI, but also in a variety of other organizations potentially. So we have a very rich, very diverse, very complex ecosystem within which libraries are sharing material and making decisions about their collections. At the same time, these are large, relatively autonomous, self-standing, relatively wealthy institutions that are building large collections. Illinois, good example. So what we recommended at a very high level is that the libraries begin to think about the optimal distribution of collections. How do you begin to manage your collections at the level of the consortium as well as at the level of the institution, so that libraries can specialize because the network will take care of things they're not specializing in, or that shared print facilities are in particular areas, or that you do eventually move to a prospective coordination model where you share interest in subjects? But thinking about the distribution of collections, this optimal distribution, this needs to be supported by efficient network fulfillment, tying together the various requesting, delivery and discovery systems that currently exist in a smarter way. But all of this depends on system-wide awareness. This all depends on knowing what's available, knowing where it is, knowing what terms it's available under. And a lot of the inefficiency of the current system is that you lack forward knowledge of those things. The systems have to go and look, or people have to make joins between systems.
So it's a very fragmented environment. So from a system-wide awareness point of view, what we were saying was that increasingly we will see the need to think about more data-driven decisions, more data-driven systems, more data-driven choices around integrations, choices around where collections go, distribution of collections. And this will mean sort of greater integration between discovery, resource sharing and acquisition, because they're all about making choices about materials. It will mean greater coordination between shared print, digitization, and specialization at individual institutions. All of this depends on better system-wide awareness, better data. It depends on dashboarding, pulling transaction data, holdings data, acquisitions data into a way of looking at things. Now this is very aspirational, it's on the horizon, but we can see that we're sort of gradually moving in this direction where we want to have ways of making decisions about collections that discovery can contribute data to. But then, because we're in this facilitated environment, it might be influenced by what happens. We can see in the licensing arena, consortium manager in Rome from the same company, but a lot of data about what is being bought. Unsub, recently renamed, a lot of coverage around that at the moment. We provide GreenGlass in the monographs area. So I think increasingly we're going to see dashboards, decision support systems that help manage this increasingly fluid way in which we look at collections. So the discovery choices will be made in data-driven environments that they help shape, because choices that people make in discovery can be factored in: you have downloads, you have the whole way in which they play in this ecosystem. But then in turn they're shaped by it, because you want to offer for discovery things that you recognize are valuable. So in this sort of more collective, more facilitated, more fluid environment, we're really sort of beginning to see data play a bigger part.
So that was what I wanted to say: rediscovering discovery, three examples of how the current changes may make us think a little bit differently about discovery, very much in a library environment. Thank you.

Thank you very much, Lorcan. So I assume everybody can see my screen here. Michael and I are going to talk a little bit about what is really an extension of what we did in 2017 at a CNI briefing. Discovery trends: in particular, we're going to talk about the bento systems, what we learned from transaction log analysis, and then some of the other elements that are going into what we're seeing as really sort of a transformation in discovery. Lorcan touched on a lot of those points and I think that's a very good introduction. Again, we've also opted here a little bit for putting together some fairly dense slides, the idea being that these can be useful for later reference, and I'm going to go through these fairly quickly. So we're still thinking that library discovery is at a crossroads. Roger Schonfeld wrote a nice briefing in 2014 about academic libraries reconsidering their vision for discovery. Tom and his group have done a lot of work at Stanford, and there's a quote here from Catherine Coleman about a revolution in discovery, and he's going to talk about that later. And then the OhioLINK white paper, which was referenced by Diane. And she also actually gave you a link to the earlier presentation on this. This is the OhioLINK manifesto, essentially, which is proposing that we reexamine how we're designing ILS systems and discovery systems, that typically they've centered on the collection and not on the user, and that what we need to do is look at systems that are much more user dependent. They talk a lot about some of the things that Lorcan mentioned: full library discovery, inside-out libraries, and providing modern business intelligence to library systems. Again, I'm just going to go over this quickly.
Full library discovery is something that a lot of people have tried to incorporate now into discovery systems. We spend a lot of time working on this, moving beyond the retrieval of collection materials to include local information, local services, local content. In our case, we've integrated websites, web guide information, subject specialist lists, course management content, et cetera. And the goal is really to bundle and interconnect these related information services through interoperability. Briefly, historically, there's a few of you that can still remember what we used to call super catalogs that loaded abstracting and indexing services. Some of you remember when libraries were loading the BRS software locally and loading indexing services. We moved from there to federated search systems, and then to web scale discovery systems, which are now used literally in thousands of academic libraries. Web scale discovery systems are characterized by having the metadata and full text content aggregated into a single consolidated index. So you're searching one large system. More recently, we've seen the introduction of what are really hybrid bento-style systems that do utilize some broadcast searching from federated searching techniques. But they're typically done over the top of web scale discovery systems. Bento systems are characterized by having the result displays presented in zones through a partitioned screen display, with content grouped by type of material. So typically there's a search for articles, a search for books, a search of the library website, a search for journals by title, et cetera. There's a very rich literature on web scale discovery services. There's a nice bibliography by François Renaville at the University of Liège. The literature centers around a number of issues that are connected with web scale discovery services. One is general confusion with blended result displays.
So a lot of people that moved into bento displays or bento-style systems are basically doing this in reaction to user concerns with seeing displays that blend book results, monographic results, journal article results, dissertations, newspaper articles, all into one result display. This affects known item retrieval and the relevancy rankings. You might be doing a known item search and the item would be on the second or third page of a blended result display in a web scale discovery system. We also are seeing concerns about the lack of full library discovery, the lack of access to local services, and concerns about better addressing known item searching. The advantage of a bento-type system, again, is that it does partition results by material type and format. One of the things that we've found and others have found is that a large percentage of searches that are done by users are known item searches. Bento systems address the web scale discovery issues of blended results, relevancy ranking, and known item access. They incorporate full library discovery features and they are able to provide nice one-click links out to full text. This is something that a number of us have been working on, to try to expedite full text retrieval and bypass the link resolver. Here's an example of our page; we're gonna talk about some of the specifics here. I'll walk and show the slide of this also. Typically, if you look at the results in the articles page on the upper left hand side, you'll see links to open access articles, links to table of contents, PDF links, links to article data. If you look at the article link on the upper left, number two, you'll see links to data sets. And these are all done by essentially integrating a number of what are basically siloed services into the bento-style display. And here's sort of a model of our display and the elements in the bento display.
We have a suggestion box, articles on the left, catalog items in the center, subject suggestions on the right, and we have a place to do some advertising. And this all makes up our bento-style display. As for the features in our bento system, we provide a lot of context-specific adaptive search assistance: spelling suggestions, links to LibGuides, direct links for frequently performed searches, limit suggestions. We identify DOIs, so if the person types a DOI, we put a link directly to the dx.doi.org or doi.org site. There's a link to our Ask a Librarian online chat, journal title links, direct links to PDF when available, and DOI, publisher, OpenURL, and custom value-added links. The next slide talks about that. We also try to recommend several relevant subject A&I services and provide links to those that, when clicked on, open up at the point of the completed search, and then library and departmental library subject content. Again, these are all following in the philosophy of providing full library discovery. Specifically, in terms of our system, we've added a number of sort of value-added links over the top of the article APIs. So we use the EBSCO EDS and Scopus APIs for article results. We take these results, take the DOIs out of these results, and then asynchronously go out and provide links to clickable Altmetric badges, to give the Altmetric attention scores. Using Scholix, we pull out the dataset and article data links. Unpaywall pulls out the open access links. BrowZine pulls out the direct PDF links and the issue table of contents links. The PDF links complement the links we get from the EBSCO Discovery Service. Here's the list, and I'll expect you to memorize this, of the bento libraries that we're following. We can go back and refer to this later if you want to look at some of the examples. We're looking at about 42 different bento libraries, academic libraries, right now. And interestingly enough, we found in the last year about 10 libraries have dropped the bento approach.
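The DOI handling just described (spot a DOI in what the person typed, then offer a direct doi.org link) can be sketched roughly as follows. This is a minimal sketch and not the Illinois implementation: the regular expression is an assumed, commonly used heuristic for modern DOIs, and `find_doi` and `doi_link` are hypothetical names.

```python
import re
from typing import Optional

# Assumed heuristic: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs to the next whitespace. Real DOIs can be messier than this.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")


def find_doi(query: str) -> Optional[str]:
    """Return the first DOI-like string in a search query, or None."""
    match = DOI_PATTERN.search(query)
    if match is None:
        return None
    # Trailing punctuation usually belongs to the pasted sentence, not the DOI.
    return match.group(0).rstrip(".,;")


def doi_link(query: str) -> Optional[str]:
    """Build a direct resolver link when the query contains a DOI."""
    doi = find_doi(query)
    return f"https://doi.org/{doi}" if doi else None


print(doi_link("doi:10.1000/xyz123."))
print(doi_link("discovery systems in 2020"))
```

The same extracted DOI would be the natural key for the asynchronous enrichment lookups mentioned above, though those service calls are left out here because their request formats are not given in the talk.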
So this used to be a figure that was over 50 libraries. Some of those are Primo installations, and we're going to talk a little bit about the problems with Primo APIs later. But we have a nice spreadsheet that sort of characterizes the features, the feature sets, of all of these bento-style instances. They all have the books and articles areas. So everybody is recommending monographs from the online catalog and articles from, typically, a web scale discovery service. And of the web scale discovery services that are being used for articles, you'll see a good number of them are using Summon, but a number of them are using Primo and EBSCO Discovery Service. In addition to the articles and books, we're finding that website search is probably the most popular in terms of what else is being provided. There's 34 of the 42 providing that, research guides, some 24 journal title links or journal title searches, databases, digital collections, search of the repository. Contacts, which is something that a number of us are providing, has grown dramatically recently, and it's about 18 libraries providing that. So in terms of observations about these bento systems: feature sets vary, and a lot of bento versions do not do spell checking and do not provide top-level direct links. Spell checking turns out to be critically important. You'll see that later when we look at the analysis of our click-throughs. Only three employ the one-click-through full text without going through the link resolver option, which our users at Illinois find very, very useful. The OPAC varies; sometimes it's a separate application like VuFind or Blacklight. Sometimes it's part of the web scale discovery service. So we are seeing systems where the catalog results are from the web scale service, article results are from the web scale service, and perhaps the archives management results or digital collections might also be from the web scale service, but they'll just be in separate bento windows.
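The bento idea, results partitioned into separate windows by type of material rather than blended into one ranked list, can be sketched as below. This is only an illustration under assumptions: the `Result` fields, the `BOX_ORDER` list, and the per-box limit are hypothetical and do not describe any particular library's system.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Result:
    title: str
    material_type: str  # e.g. "article", "book", "website" (assumed labels)
    source: str         # which backend returned it (assumed field)


# Assumed display order for the boxes most bento pages seem to share.
BOX_ORDER = ["article", "book", "journal", "website"]


def to_bento(results: Iterable[Result], per_box: int = 3) -> Dict[str, List[Result]]:
    """Group results into boxes by material type, keeping the top few per box."""
    boxes: Dict[str, List[Result]] = defaultdict(list)
    for result in results:
        if len(boxes[result.material_type]) < per_box:
            boxes[result.material_type].append(result)
    # Known boxes first, in a fixed order, then any other types encountered.
    ordered = {t: boxes[t] for t in BOX_ORDER if t in boxes}
    for material_type, items in boxes.items():
        ordered.setdefault(material_type, items)
    return ordered


hits = [
    Result("A", "article", "EDS"),
    Result("B", "book", "catalog"),
    Result("C", "article", "Scopus"),
    Result("D", "guide", "LibGuides"),
    Result("E", "article", "EDS"),
]
for box, items in to_bento(hits, per_box=2).items():
    print(box, [r.title for r in items])
```

Note that the boxes can be fed by one backend or several; the grouping step is the same whether the catalog and article results come from separate applications or from a single web scale service.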
Bento provides a lot of local control and customization. You can see that by looking at the various options that people have been using, but it does require programming, service staff, maintenance, and a fairly significant amount of work to locally maintain a particular system. There are a couple of systems now that are available, that are being used in more than one institution, but a lot of the systems are still homegrown. We'll talk a little bit about custom transaction logs. We think it's very important to look at user search behaviors and what we learn from user search behaviors. We have a very heavily instrumented transaction log program. It records all user actions, the suggestions the system makes, and all the search reformulations, and it identifies sessions and all the click-throughs. Click-throughs are actually routed through one of our websites, where they're recorded and then redirected. So we know our click-throughs into external resources. We have a lot of transaction logs going back something like 11 or 12 years. This study goes up through April 2018, where we looked at a million and a half searches and a million and a half click-throughs and then took out a sample of about 5,400 searches where we redid the searches and analyzed the type of search, success rate, and particular user behaviors. Two or three important points here. One of the things that we've seen is that the average words per query is going up dramatically. So we're now at 6.1 words per query. There's a lot of copy and paste searching where people are taking results. We also know this from our focus group interviews and from a user survey we did last year, where people are searching Google or Google Scholar, pulling out references and pasting them into our system. There are only a little over two searches per session, and 60% of the sessions are one search.
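The kinds of figures quoted here (average words per query, searches per session, share of one-search sessions) fall out of a transaction log fairly directly. The sketch below assumes a simplified log of `(session_id, query)` pairs, which is a hypothetical format; the actual Illinois log records much more than this.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Iterable, Tuple


def query_stats(log: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Summarize a search log of (session_id, query) pairs.

    Assumes a non-empty log; statistics.mean raises on empty input.
    """
    entries = list(log)
    # Words are split on whitespace, so a pasted citation counts as many words.
    words_per_query = mean(len(query.split()) for _, query in entries)
    searches_by_session = Counter(session_id for session_id, _ in entries)
    single = sum(1 for n in searches_by_session.values() if n == 1)
    return {
        "words_per_query": words_per_query,
        "searches_per_session": mean(searches_by_session.values()),
        "single_search_share": single / len(searches_by_session),
    }


sample_log = [
    ("s1", "deep learning"),
    ("s1", "deep learning libraries survey"),
    ("s2", "10.1000/xyz123"),
    ("s3", "web scale discovery"),
]
print(query_stats(sample_log))
```

How sessions are identified in the first place (cookies, timeouts, IP addresses) is not stated in the talk, so the session IDs are simply taken as given.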
We look at the use of our suggestions: for about 20% of the searches a suggestion is made, and for almost a third of those the person follows the suggestion, particularly the "did you mean" spelling suggestions and direct links. We're seeing a lot of local DOI searches. This has been growing over the years. This is the Sci-Hub phenomenon, where people love to put a DOI into a system and pull up the full text. The other really important point is that in the sample of these 5,400 searches, we found that about 64% of them, which is up from the last time we looked at this, are known-item searches. Often these are title word searches, an author and a couple of words from the title, and in many cases a full citation. In fact, when we looked at the sample, a percent and a half of these searches were full-citation known-item searches, which isn't a lot in the sample, but this extrapolates to 45 per day where people are literally copying and pasting a full citation into our system. If we look at the usage within the bento, our article links are 57% or 58% of the usage, with books and monographs, the OPAC, less. There's a lot of use of our suggestions and added links, and even for some things like the library links or contacts, which is one-tenth of 1%, it still means it's happening more than once a day. In fact, if you look at the click-through actions here for the month of April: 141,000 searches, and 3,700 or almost 4,000 clicks per day. Full-text clicks are 1,600 per day, with a lot of clicks in the browsing and a lot of clicks into our Bing API results. The "did you mean" spelling suggestion is 55 per day. So for those systems that don't offer spelling suggestions or don't offer "did you mean", you can see how heavily it's used: literally 55 times a day in our system. The open access links are 20 per day, and this has been increasing since we started remote teaching at Illinois.
The direct link suggestions, these are for the most frequent searches, are about 20 per day, A-Z journal lists about 20 a day, Ask a Librarian four times a day, and again, even emailing a subject librarian twice per day. We have a lot of library services that are not used every day, so having services used twice a day, we would consider successful. We did a user survey in November 2019, with about 483 responses and 230 users providing comments; 24% of the respondents were daily users of the system, and 40% of them were daily or weekly users. We got a lot of nice suggestions, but there was a high level of satisfaction with the entire bento approach. I'll put up one slide here, which is a question about discovery in general, and I think a lot of the literature centers around this. It has to do with whether the library should even be the starting point for users seeking content. There's a nice Ithaka S+R survey question about how important the gateway function is, which dipped over the years and now has gone back up. So there's a growing consensus, I think, among users that the library is in fact a valid starting point. You might argue that library systems have always played a supplementary role, and that's true in many cases. And you might also argue, as a lot of people have, that our focus should be on aiding known-item discovery. We've done an awful lot of that in our system, and you can see, with the known-item searches at 64%, the importance of providing access for known-item searches. There are a number of plans here for next steps within bento systems. You can take a look at this. There are a lot of mega-indexes, especially discovery services; these are Dimensions and Lens. Tom's going to talk about machine learning and AI techniques. We still have a lot of complementary digital services, with open data, data management services, publication metrics, visualizations, course management content, and faculty profile systems that we can add in. I'm going to turn this over to Michael now.
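Returning to the DOI and pasted-citation searches described in the log analysis above, a search box could flag such queries with a simple heuristic before deciding where to route them. The Python sketch below is only an assumption about how that might be done, not a description of Easy Search: the regular expressions, the eight-word threshold, and the function name `classify_query` are illustrative choices, and the DOI in the usage example is a made-up placeholder.

```python
import re

# A DOI starts with "10.", a 4-9 digit registrant code, a slash, and a suffix.
# This pattern is a common approximation, not a full validator.
DOI_RE = re.compile(r"\b10\.\d{4,9}/\S+")

# A four-digit year is a weak hint that a long query is a pasted citation.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")

def classify_query(query: str) -> str:
    """Label a query as 'doi', 'likely_citation', or 'other' (heuristic only)."""
    text = query.strip()
    if DOI_RE.search(text):
        return "doi"
    # Assumed rule of thumb: long queries containing a year look like citations.
    if len(text.split()) >= 8 and YEAR_RE.search(text):
        return "likely_citation"
    return "other"

if __name__ == "__main__":
    for q in [
        "https://doi.org/10.1000/xyz123",
        "Smith J. 2018 Discovery layers in academic libraries Journal of Web Librarianship",
        "climate change",
    ]:
        print(classify_query(q), "<-", q)
```

A query labeled `doi` could be sent straight to a resolver or an open access lookup, while a `likely_citation` could be parsed for title words before searching; both routing ideas are suggestions, not features confirmed in the talk.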
He's going to talk a little bit about our Primo implementation.
All right, thank you, Bill. Hopefully you can hear me; I was on mute there. Yeah, Bill and I thought it'd be a good idea to just give you a little introduction to the implementation that we're going through for Primo. In June 2020, we're going to be migrating to Ex Libris's Alma and Primo VE systems. That's really a new deployment model that combines the backend processes of both Alma and Primo into one integrated platform. Currently, we work with separate systems, in that we have Voyager and our catalog discovery is VuFind. So that will be a new process for us, really going into these combined backend processes and then, almost in real time, having that reflected in the Primo VE library catalog. And we're transitioning with 90 other libraries in the I-Share consortium. So what kind of systems are Alma and Primo VE? Alma really is a unified resource management system that allows libraries to manage print, but what we're really looking forward to is help managing our electronic resources and services in this single environment. That's going to be a big change for us, so we're looking forward to that. For an electronic title, just a reminder to everybody, Alma creates an electronic inventory, a portfolio, that then permeates Alma and associates the electronic access in Primo, so in the Primo catalog, and in all instances that it can match on identifiers. Alma also has a network zone, which we're getting accustomed to, that takes over the function of the union catalog. We've encountered some problems with the network zone within the consortium. The network zone is built on a first-in premise: the first copy that comes into the union catalog, this network zone, becomes the master record for the consortium.
And this master record is a shared bibliographic record that is linked to our local holdings, or the local holdings of each of the libraries that have that title, that material. And much of our local data, and we do have a lot of it, particularly in our rare books and special collections materials, really didn't migrate over from Voyager into that master record. So we've been doing a lot of customization work to reintroduce that local information. There are ways to do it, but we're all having to put some time and effort into that. And then one of the big issues that we've encountered, and Bill can talk more about this, is localized URLs to e-resources populating the master record. This would be a link to an electronic resource that one of the other I-Share libraries may have, and then that pops into the master record. But we may not have access to it, or it has their proxy appended to it. So we've run into some issues with that. It's important to note that even though we now have access to Primo through the consortium, we actually had access to Primo six or seven years ago. We had a pilot where we had access to it for about three years, and then we went away from it. And so now we've come back to it. But we do want to emphasize that the Easy Search bento will remain the library's primary and default discovery service, available in the single search box on the library's web page. Primo will replace VuFind as the catalog search. We'll talk a little bit more about how we're doing that in a later slide, but we are testing certain features of the Primo Central Index, particularly the ProQuest collections. When we had Primo previously, six or seven years ago, we did not have access to the ProQuest collections, including a lot of their newspaper collections at that time. And so now we do, and we're testing a lot of those features with the Primo Central Index. And we do still continue to see the benefits of these separate bento zones.
Here Bill has got the image up of our library's front gateway. So there's the single search box. When you do a search within it, you get results. And again, as Bill and Lorcan were showing earlier, you've got your article section there, and in the middle there is our library catalog. In a few weeks, Primo will be populating that. And then one of the things we're looking forward to, which we've been testing and may be able to offer since we do have access to the Primo Central Index, the Primo Central Discovery Index, is newspaper articles. We've had some of our users, in the surveys and comments Bill was gathering, say that newspapers are one of the things that they'd like to see emphasized a little better. And so we do have access to that. What we've run into, and it's really painful right now as we're trying to work through it, is these API issues. Bill mentioned this, and I mentioned it a little earlier: we are currently using the Primo Search API to pull results from the library catalog into Easy Search. We were very pleased with how quickly results came in from VuFind. Many times it's less than a second to pull in those results, and it averages maybe one to two seconds of response time. With Primo, we're encountering some real slowness in the performance of those APIs and in the results coming back through those API calls. It's averaging about 10 to 11 seconds per search, and we're getting a lot of comments from testers about just how slow that is. So API performance for the Primo catalog definitely needs to be optimized and improved to gain the full benefits we have with Easy Search right now. We've been working with Ex Libris, and we are hoping that some improvements are coming in a few weeks. But those APIs are just really critical.
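To make the response-time concern concrete, here is a hedged Python sketch of one common way a bento page can keep a slow source from holding up the other boxes: query every source concurrently and give each one its own timeout. It is not the Easy Search implementation. The source names, delays, and the `fake_source` helper are invented stand-ins that sleep instead of calling any real API.

```python
import asyncio

async def fake_source(delay, results):
    """Stand-in for a real API call: waits `delay` seconds, then returns results."""
    await asyncio.sleep(delay)
    return results

async def search_with_timeout(name, call, timeout):
    """Run one source; return (name, results) or (name, None) if it is too slow."""
    try:
        return name, await asyncio.wait_for(call, timeout)
    except asyncio.TimeoutError:
        # The page can render this box as "unavailable" rather than wait.
        return name, None

async def bento_search(sources, timeout):
    """Query all sources concurrently so total time is bounded by `timeout`."""
    tasks = [search_with_timeout(name, call, timeout) for name, call in sources.items()]
    return dict(await asyncio.gather(*tasks))

def demo():
    # Hypothetical sources: a fast article index and a slow catalog API.
    sources = {
        "articles": fake_source(0.01, ["article hit"]),
        "catalog": fake_source(0.5, ["book hit"]),
    }
    return asyncio.run(bento_search(sources, timeout=0.1))

if __name__ == "__main__":
    print(demo())
```

With this pattern a slow catalog call costs the user at most the timeout, but it also means the catalog box comes back empty, so it mitigates a slow API rather than fixing it.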
And you saw all the APIs that Bill mentioned that he's pulled into the articles area, being able to pull in some of this other information. So I think that's the last slide there, but... Yeah, we can turn it over to Tom.
Yeah. Give me a second to find the right screen to share. This looks like it. All right, that appears to be working correctly on my end, so I'm going to proceed. Hello, everyone. It's a pleasure to almost be in San Diego with you. I kind of wish I were there right now after two and a half months at home. I'm going to talk a little bit about a different facet of discovery, which is some of the things that seem to be emerging right now on the leading edge, in some cases the bleeding edge, but which may soon become core parts of all of our discovery environments in one way or another. The first of them is linked data. I used to do a lot of work with Dean Krafft of Cornell, and he had this beautiful slide showing Eden and the Tower of Babel and how linked data would be able to basically be a Babel fish and allow us to work collectively across all of the different schemas, vocabularies, ontologies, and domains where we have our data. I think we still haven't achieved that, but we are making some progress. One of the areas where I think the progress is most notable, and where linked data is most important, is when linked data serves as the bridge or the gateway between library data and our ways of representing knowledge and resources and that of the web in general. We really look at this in two ways: getting library data out onto the web for discovery, reuse, and linking, but also pulling data from the wider web into our environments. And as beautiful and as much fun as it is to talk about ontologies and RDF and the semantic web, really where most libraries are interested in linked data is because it's going to enhance discovery.
And so through the LD4P and the LD4L projects, the series of projects that have been funded by the Mellon Foundation in partnership primarily between Cornell, Stanford, Harvard University, and the University of Iowa, we've been focusing most recently on seeing if we can augment existing discovery environments with linked data features. One thing that we do know about linked data is that it's not going to appear quickly, and it's not going to appear magically, where one day everyone is using MARC-based systems in the current environment and then the next day we'll be in this new, wonderful, and somehow different world where everyone is using different interfaces, different systems, and different features. So we've focused, as part of the LD4P grants in the LD4 community, on trying to do incremental enhancements to existing discovery environments. We've identified five areas where we think this might be most fruitful. Those are represented here, and I'll give you a sample for each one of these, and where I think we're seeing some progress and where we're seeing some challenges. The first represents a way to get library data out onto the web in general. In many cases, our catalogs are being indexed by Google and other harvesters and search engines. One way to accelerate that is by doing better and more rigorous schema markup, so schema.org markup in our catalogs. This exposes the data to harvesters, where it can be incorporated into the web of data and then eventually emerge through search engine optimization and higher search rankings. We have, over the course of the LD4P project, seen some progress on this. We're by and large focusing on Blacklight as an open source application using Solr underneath, for a couple of reasons. One is that lots of the LD4P institutions are using Blacklight.
Two, it's open source and uses common technologies, so even if you're using VuFind or a commercial search engine, a lot of the lessons and techniques are portable. The second area, and this really exemplifies bringing external data from the web into library discovery environments, is knowledge panels. I think people are generally familiar with this from Google. We're now beginning to see this more and more commonly within library discovery interfaces. I believe it's now a feature that is included in Primo, or at least some Primo instances. And this is a great example from the University of Wisconsin-Madison, which has a home-built discovery environment where they put a lot of work into integrating the knowledge panel, but also into building a service around it, including what the ethical and service considerations are for when they find bad data. We're doing more work and giving more consideration to how to get better forms of browse. I think it's telling that most browse interfaces from this decade look like they may have been coded or designed 10 years ago or even 20 years ago. Spatial browse seems to be the one exception to that, and we're seeing some good breakthroughs there. Then semantic search: instead of searching just on the text, can you search on the meaning of the text? So if I search for a heart attack, could I actually get search results for myocardial infarctions? The best example of this is using MeSH from the National Library of Medicine and search there. This is yet to become a common technology or a common appearance in most library search engines, but perhaps this is in our future. And then finally, as Bill was suggesting, and I was really glad to see this, type-ahead, auto-suggest, and spell-check suggestions really have a chance to help people with both known-item search and general browse. Progress on this, especially semantically aware progress, would be a great advance.
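To illustrate the first of the five areas above, schema.org markup in catalog pages, here is a small Python sketch that turns a catalog record into a JSON-LD description using the schema.org `Book` type. It is a sketch under stated assumptions, not code from LD4P or Blacklight: the record dictionary, its field names, and the function `book_jsonld` are hypothetical, and the sample title and author are placeholders.

```python
import json

def book_jsonld(record: dict) -> str:
    """Build a schema.org Book description as a JSON-LD string.

    `record` is a hypothetical, already-parsed catalog record with
    'title', 'author', and 'year' keys and an optional 'isbn' key.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": record["title"],
        "author": {"@type": "Person", "name": record["author"]},
        "datePublished": record["year"],
    }
    # Only emit optional properties when the record actually has them.
    if record.get("isbn"):
        doc["isbn"] = record["isbn"]
    return json.dumps(doc, indent=2)

if __name__ == "__main__":
    sample = {"title": "An Example Monograph", "author": "A. Author", "year": "2019"}
    print(book_jsonld(sample))
```

The resulting string would typically be embedded in the record page inside a `<script type="application/ld+json">` element, which is the form search engine harvesters generally read.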
We've seen some very good work on this kind of suggestion from the University of Ghent in Belgium. We recently presented on a Blacklight linked data workshop that we held at Stanford last September and October. Of all of these techniques, the one that seems the most relevant and the most visible is the knowledge panel. At this same workshop that I just referenced, about 25 people came together and began to crowdsource a document on how to add a knowledge panel to your existing discovery environment. This document is linked. You can see the bit.ly at the bottom, so bit.ly ld4-kp-recipe. It's written as a how-to document. It's currently a draft, but it covers everything from what a knowledge panel is and why you might want to use it, to identifying data sources, to considering the minimum amount of data that you need to have a quality display, and then the technical strategies for actually implementing it. If you are interested in any of these techniques or advances, or if you have your own advances that are working with linked data, we would welcome you in the LD4 Discovery Affinity Group. This is a set of bi-weekly calls that are open to anyone. There's an extensive knowledge base that has been built up over about the last eight months or a year at this point. The co-chairs are Huda Khan from Cornell University and Jessie Keck from Stanford. You can see the link at the bottom, or if you do a search on LD4 Discovery Affinity Group, it is, not surprisingly, the first result you'll find. The other area that I'd like to forecast just briefly is artificial intelligence and its potential to impact library discovery. I believe there are two big opportunities emerging specific to discovery. Now, there are a lot of places and a lot of ways that artificial intelligence is going to affect, and has already affected, the library, library services, and the information environment. This includes back-of-house operations. This includes digital curation.
This includes things like chatbots and reference services. But specifically on discovery, I think we're seeing that artificial intelligence just changes the equation in terms of the kinds of metadata that are possible to derive and produce. And it also hints at new opportunities around new interfaces. So artificial intelligence is something that might scale in a way that our libraries, our technical services departments, and even the networks of data exchange have been unable to, at the same pace, the same scale, and the same expanse. Looking at the common techniques that are emerging as not even state-of-the-art now, but very commonplace in many different industries, the ability to process lots of different formats of information, whether it's text, images, or time-based media, can generate a lot of what looks like traditional metadata. So this could be techniques for named entity recognition or text classification for texts; for images, doing aboutness or labeling for descriptive metadata generation, or recognizing objects that are within an image, which might be helpful for description; or just extracting parts, like better OCR. For time-based media, speech-to-text is really a game changer, not only in accessibility, but in overall discovery. And there are lots of examples where structural analysis of things like video allows people to pick apart these complex, time-based objects and get just the captions or just the segments that are interesting. I think all of these techniques are going to become standard for library processing in the not-too-distant future, I think actually within the next five years, if not 10. The second opportunity that we're really seeing emerge with artificial intelligence is new interfaces. For one of the best examples of this, there are actually some academic ones that I could cite, but none that I could find running in production right before this presentation.
But Google is one that you might be familiar with, and it's reverse image search. So by entering an image, can you find other images that are like it? Here's an example of a killer rabbit from medieval times. If you paste this into Google Images, you'll actually find lots of different images that are like this. Now, this can be done completely without any text and without any descriptive metadata. It's a potentially revolutionary approach. We also see different types of interfaces for different types of recognition. This is an example of a knowledge graph exploration from Yewno, which might be familiar to people in this audience. This is actually interesting because there are two ways to enter it. One is by keyword search of a concept, but another is by uploading or examining an existing document. Yewno will understand what the concepts are within that document and then draw links to other documents and other concepts within this environment. It's a completely different kind of discovery and search, very different from known-item searching or even keyword searching. At Stanford, I must admit, we are still getting our feet under us for AI. And I think the exciting thing is that's probably true almost everywhere. Even for people who are far ahead of us, we're still at the very beginning of the revolution. We have two sets of projects going. One is around theses and dissertations: can we do more and richer descriptive metadata and descriptive work on these to enhance discovery? We happen to have the full text for the theses and dissertations that were deposited electronically at Stanford, so this is a rich proving ground. Currently we're comparing multiple different models and seeing how we might expose the richer metadata through various interfaces. We're doing the same thing with images, with image labeling, with object recognition, and with image-based search.
So my colleague at Stanford, Claudia Engel, has done a great case study comparing what happens when you use commercial machine learning engines, Clarifai, Google Cloud Vision, and AutoML, and how good these are for academic purposes. At the bottom of this, you can see some of the results. They've been working with a basically under-described set of 50,000 images taken from an archaeological dig over the last 20 years. As I said, I think we are all collectively at the beginning of this, and there's a real opportunity for libraries, archives, and museums to better understand and to build our own capacity for leveraging artificial intelligence for discovery, but for other areas as well. With the National Library of Norway, the British Library, the Smithsonian Institution, and the Bibliothèque nationale de France, Stanford is helping form an open community, which looks very similar to others in the space for things like IIIF, and is in fact patterned after IIIF. The question is, can we collectively work together to understand what the use cases are and build common technologies, models, and capacity? So if anyone is interested or is already active in artificial intelligence, we invite you to start participating in this as well. And with that, I will end and see if there is any time for Q&A discussion, I think for any of the panelists.
Thanks, thanks so much, Tom. And thank you to all of our panelists. What a wonderful, sweeping overview of discovery systems, the demands on these systems, and the potential, which is really extraordinary. Thank you so much for that great collection of presentations. Given the hour, I'm mindful of folks' time, and we already do have some questions. So let me just dive right in, beginning with Rob Cartolano, who comments regarding Primo, with 10 to 11 seconds per search, question mark. What are reasonable performance expectations for search?
Shouldn't it be sub-second response for API queries, as we already have today via Solr? Yes, that's what I would anticipate. Bill can probably talk better on this than I can. But yeah, we did a little study just to get that information back to Ex Libris about the API calls and the speed at which those were coming back, and so that's what the time is on those. They've actually done some modifications to the Primo Search API to see if they can speed that up. And we're also going to be on a production server farm in a few weeks, so we're hoping that that might speed up some of the activity too. But Bill, you would know more about performance of APIs than I would. In a typical search for us, we will use the EBSCO API, Scopus, Unpaywall, Scholix, and Altmetric; I think the other ones are BrowZine and Bing. So we'll send out 10 or 15 asynchronous searches at one time, and we get things back in a matter of seconds. VuFind is the other very fast one. So yeah, any response time that we're seeing that's above four seconds is really unacceptable, and unusual. Thank you. Thanks very much. Thanks for that question, Rob. All right, the next question comes from Steven Bell. When you talk about AI and discovery, could that include voice bots that can search the discovery layer and return results on a screen or send them to an email, and work with the common search assistants people have on their phones and in their homes, et cetera? Absolutely. I'll answer that, though others might want to jump in as well. At the Fantastic Futures conferences that we've held for the last two years, much of the activity around AI has been around natural language understanding and conversational agents. So clearly this is a different modality for conducting queries. And one of the interesting things to me is the way AI stacks upon itself. Karen Cariani from WGBH has been very active in using AI for better understanding of and access to videos.
And what she and her colleagues at Brandeis University have demonstrated is that teasing apart a video is actually about nine or 11 different functions that stack on top of each other. So you do segment analysis and find out where the captions are, you extract the captions, you do the OCR on the captions, and then you do entity extraction on the extracted text. So I think conversational agents are just one more link in this chain. Interesting. I'd also add that Eric Frierson from EBSCO EDS does a demo of the use of Alexa with EBSCO EDS for voice search, and the results are spoken back to you. Interesting. So it's a very rich area. Lorcan, you seem to be on mute if you're talking. We're all experimenting with Alexa, certainly. I mean, there still is a gap, just thinking from my own experience, you know, asking for Irish names or various other things. But it's definitely worth exploration, yeah. Interesting. Thanks, Steven. Now another question. Thanks for the wonderful and rich presentation. The complexity of the sources that need to be managed is unbelievable. I don't mean this question to come across as a kind of "well, what have you done for me lately" question, but many libraries are also struggling to integrate museum objects into their discovery systems. Can someone comment on that aspect of discovery? No one wants to take that one, huh? We need to add another panelist, I think. I can. We've looked at doing this with a couple of different groups, the Archaeology Center at Stanford and also the museum. One of the challenges is just the differences in the level of description and also the schema. The interfaces, what the indices expect, and what the patrons might expect given the current layouts for library-based environments are very challenging.
It seems to me that bento actually could be a good approach to this, and approaches like knowledge panels or hyperlinks that link out to other types of environments are also potentially successful ways of dealing with this. We've been part of an immersive scholarship grant that was awarded to North Carolina State University, and we've been looking at virtual reality implementations and, I guess, a lot of different emerging, immersive technologies. These are all very fertile areas, and at some point we need to try to integrate some of these things. We're moving forward on a lot of fronts. I just thought I would put that question out to our attendees as well. If there's anyone in the audience who has any experience with this or thoughts about it, if you just raise your hand, I can turn your microphone on and you can participate live here in the conversation, since we still do have quite a large audience. But that's a great question, and thank you so much for bringing that to our panel today. And just to relay from Rob Cartolano, who asked about the search times, he thanks you for your responses, and his comment is that these were all great presentations. So thank you so much. And just to reiterate what I said... I'm sorry, Tom, go ahead. Oh, if there aren't obvious questions, I was wondering if I could ask Bill one of the things I'm curious about: whether your research has uncovered where library patrons are typically starting their search, the library homepage, the bento, or the catalog? I don't know if you have any data on that. Yeah, and in fact, that's interesting, because we just had a recent discussion about this and we looked at the literature. The Ithaka S+R survey asked people where they start their searches. There have been a couple of other large surveys of users asking, do you start your search at an A&I service, or at the catalog, or at a search engine, or in the library?
And of course there are varying answers, and people are starting their searches in all these different places. Some of this, for us, comes down to the idea that we want to be able to provide the best delivery services we can. So we know that people are starting at Google or Google Scholar. They're coming in from off campus now, which everybody is, and they don't have the proxy link in front of the Google Scholar search. In fact, it's impossible to even do this within Google, to put a proxy in front. They're being asked to pay for articles. They're being asked to log into articles. So we're seeing a lot of people doing this copy-and-paste searching, where they're literally taking something from Google or Google Scholar and pasting it into our system. And we're really encouraging that. We're hearing that in the user surveys. We saw that in the survey we did in November. And again, I think better integration of all these different silos is what we're all trying to do. I mean, that's the holy grail. The museum question was interesting. Just going back to Tom's answer and some of what Bill was saying, it seems to me that for a long time we were very obsessed with Google-like searches. But even Google doesn't do Google-like searches anymore. You do a Google search and you get back a river of results. I mean, there's a load of advertising at the top. But they have the knowledge card. If there are scholarly articles, they pull them out. If there are images, they pull them out. If there's news, they pull that out. So, effectively, they're giving you a sort of bento-style result without the boxes. As I say, even Google doesn't do a Google-like search anymore. So I think, to Tom's point, shared searching across libraries, archives, and museums, cross-domain searching, as I called it in a previous life, has been a big aspiration for many years.
And because of the different curatorial traditions, the different metadata traditions, and the different orientations of the services, it has been difficult to pull together. But I do think the more bento-style approach might be interesting there, and also linked data. I mean, one of the things we're doing, and we're working with Tom and his colleagues in this context, is, with Mellon support, developing entity backbones. I think over time, if we have persistent identifiers for people, for places, for a variety of things, and we begin to share that infrastructure and have those identifiers in our descriptions and our discovery layers, it gives you a way of connecting things together at some level, but it's still some way out in the future. And interfaces can then take advantage of some of those ways of doing things. Different contexts or domains can link to the same entity infrastructure for particular shared entities. I think in the future, that's something that we should see more of. And we see it happening already with Wikidata and so on. So, Lisa Hinchliffe brought up a good point here in the chat, that the where-do-you-start question differs a little bit depending on whether you're doing a known-item search or topical searching. And the relevancy ranking of these services, I think, is really critically important. For classes I teach, I typically do a demo of Google Scholar and pull it up and do a search for the term "federated search". And what I find typically is that among the first handful of results is an article that I co-wrote in 1999. Well, if I were going to tell people interested in the topic of federated search what to read, a 1999 article is not going to be very useful. I've even written 20 articles since then that are more relevant. So, we do a lot of analysis of relevancy ranking in these web-scale systems and in A&I services. And I think that's a really critical element. Thank you.
Thank you all for your thoughts. Thanks, Lisa, for that comment. If we have any other questions, please feel free to type those in. And we invite any attendees who wish to stay around and have a chat with our panelists to please do so. A sincere thanks to our panelists for your time and for sharing your experience and valuable information with us at CNI, and to our attendees for making the time to be with us here today. So thank you, everyone, and be well. Yeah, thanks to everyone for attending. Take care. Thank you.