All right, we're going to go ahead and get started. Thank you so much for joining us all this morning. My name's Emily Singley. I'm the Vice President for North American Library Relations at Elsevier, and I'm going to be moderating our discussion today on how advanced technologies are shaping the future of publishing. I'm very pleased to be joined today by our three panelists. We have with us Jud Dunham, who is going to be giving us his perspective on how AI technologies are developing within search. We also have Corey Harper. Oh, can I get the slides up, please? We also have Corey Harper with us today, who's going to be talking about some of the risks of generative AI as well as the potential role of librarians in mitigating that risk. And we're very pleased to have Emily McElroy with us, who is going to serve as our librarian respondent, will be providing the higher ed and library perspective on this topic, and will be participating in our moderated discussion and Q&A after our short presentations. So, yes, slides are up, great.

I am going to start us off with a very brief introduction and summary overview of the topic before I hand it over to Jud. So, of course, this is a very hot topic right now. Just having come from the table discussions at breakfast, two of those tables were about AI, and of course we've had other presentations here at CNI, and a bunch of presentations and webinars and blog posts; I've sort of lost count of all the different conversations happening around this topic right now. But it's not a new topic, right? It's been around for a while. These technologies, large language models and machine learning, have been around for a while now and have been used, and are being used, by publishers in different ways. I just want to highlight a couple of those at Elsevier, one of which is our ScienceDirect topic pages. Topic pages are automatically generated content, built with machine learning over the corpus of ScienceDirect articles. And on the editorial workflow side of things, we also have a tool we call the reviewer recommender, which helps our editors select peer reviewers based on both the credentials of the reviewers and the content of the submitted articles. And there are AI tools, developed both within publishing and externally, that support just about every stage of the publishing workflow: performing literature reviews or systematic reviews, helping with scientific writing and other types of writing, citation management, citation verification, and on and on. So we're not going to go into great detail on all the different kinds of tools that are already out there and being used by authors, by scientists, and by publishers. I would point you to a really good post that came out on The Scholarly Kitchen, a summary of an SSP webinar, where they go into this in a lot more detail, and I've got that citation there. And alongside all the different kinds of technologies that have arisen in this space, a lot of regulation, policy, and best practices around responsible AI have arisen as well. So I'm highlighting here what we follow at Elsevier, which is our parent company's responsible AI principles. Many organizations, including ours, are building teams and best practices around responsible AI. And of course, policies for authors who submit to our journals have been around for a while and are being tweaked every day.
And they were tweaked recently. You may have seen that many major publishers decided not to allow ChatGPT to be cited as an author, although ChatGPT snuck in before we were able to do that, for the simple reason that these bots can't actually sign contracts. To be an author, you have to be able to sign a legal document. But authors can, of course, use different types of AI in their workflows, as I previously mentioned, and we have policy around how to disclose that and how to responsibly describe what you did. And I'm just going to close my brief summary here with this wonderful slide put together by my colleague Georgios Tsatsaronis, our Vice President for Data Science at Elsevier, showing, as I mentioned, the government policy that has arisen around this. And you can see it's really quite global: policies around responsible AI are emerging or are already in place worldwide. So, with that, I do have my references here, if you want to do any further reading on some of those topics, and I'm going to hand it off to Jud.

Thanks, Emily. My name is Jud. I am head of product for ScienceDirect, and I'm going to talk to you for a few minutes about some of what we are doing to explore the utility of AI in literature discovery, some of the opportunities and risks we see awaiting us, and how we're approaching it. Before I get into that, I wanted to mention a few principles that we hold ourselves to, to keep ourselves grounded and honest. And this isn't just for AI; this applies to all the product development we do. The first is that my team focuses on problems, not solutions. What I mean by that is we try to avoid jumping on the latest thing, or rushing ahead and saying, well, we're going to deliver this really shiny new thing that we think users might like. Instead, we focus on really deeply understanding and empathizing with the problems that researchers have, what they're trying to achieve, and how we can help. Related to that, we try to remain humble and realistic about what we can deliver, because we know that researchers have big, complex jobs and we are a small part of them. So we try to make sure that we're fitting into existing workflows as much as possible, helping where we can and otherwise staying out of the way. And finally, we look for opportunities to deliver incremental progress with immediate results. When I say incremental, what I'm really saying is we try to experiment our way into everything. We don't try to throw out everything, start fresh, and build some massive new thing; we try to see how we can improve on existing things, make them better, and always focus on delivering some immediate benefit to our users. Now, focusing on problems sounds very deliberate and cautious, and I think that is the case, but it's not that we think everything is great. We know that there are a lot of problems. And what's on the slide is a representation of one such problem that researchers have, described to us by a researcher we were speaking with last week. This is something we hear from a lot of people. He's setting up his experiment in his lab, he's got his mouse model, he's got his protocol, but something's not going right and he doesn't know why. So he turns to the literature, which is what people do when they have a problem: they're trying to figure out what somebody else did.
And what you see here is the current workflow in a nutshell, and a demonstration of how clumsy and annoying it is. He has to figure out what words to throw in the box, and then he gets back a hundred or a thousand papers that he has to sort through to find one that looks like it might have what he wants, open a PDF, scroll down to the methods section, and find out that yes, it's similar to what he's looking for, but it actually references a protocol in another paper. So he goes to that paper in hopes that the protocol is described there, comes across some terminology he doesn't understand, and on and on and on. When I say empathy, I mean you can really feel this person's pain. It's a terrible situation that he's in, and I think we have to ask ourselves if that's the best we can do. So again, with the idea of focusing on a problem, and unpacking that a little bit: the reason we're excited about this, and why we're moving into experimenting with this kind of stuff, is that we think it can help us break what is obviously a broken paradigm. It's an amazing achievement that together we've built this wonderful institution of scholarly communication, and we've made all this information available worldwide, at everyone's fingertips, but it's still pretty broken. So we see opportunities here to break what we call the "three words, 10,000 papers" paradigm. Going back to the process and focusing in on the problem: what is this person trying to do? He's trying to figure out how to fix his experiment. And the systems available to him force him to dumb down his query into just the right set of words, ones that might return papers relevant to him but won't exclude the paper that might have the one thing he needs. And I should say there's a happy ending to the story, because in the end he didn't solve his problem directly through the literature. He found an author who seemed to be running an experiment more like what he wanted to do, communicated with that author via email, got on a Zoom call, and worked it out. That's instructive: he solved his problem in a way that he couldn't within the existing system. And one of the things we think could be amazing about this technology is that it can help us build systems that allow people to simply put their problem into the system in plain language, with a much better chance that, if there is information that can solve that problem, they get it out of the system. So, do something more like asking questions, instead of the dumbing down of queries and prompt engineering that this person had to do in order to access the literature. And again, his goal was not to go out and find 10,000 papers; his goal was to find bits and pieces of information, potentially spread across dozens of papers, and get that information back in a way that would answer his question. We think this technology could allow us to get to that point as well, where we can allow someone to use plain language to interrogate the literature, and build systems that interpret those queries, turn them into something that will pull back just the relevant bits, and present those in a better way. And we think we have evidence that this is what people really want when they're interrogating the literature. They don't want to read papers front to back.
They want bits and pieces of information that are relevant to the task at hand. And we think we're getting to a point where we can actually set about building systems that do this. It's worth mentioning, though, that one thing we think we will never get away from is always maintaining a connection to the verifiable, citable source document, because provenance is critical. I've never met a researcher who would want to turn over to a machine the critical task of evaluating the trustworthiness and quality of an individual piece of research, and whether they can and should trust it and reuse it. I don't think anyone is about to do that. So where that leaves us is: I know I'm excited; I think we're cautious, but if you take anything away from this, I think we're excited. Everybody's excited about what we can do with this. So I'll just talk about a few things we're thinking about. We have this new lexicon of things like guardrails; I think of these things as handrails. How can we wade into this world in a way that protects and preserves what we've built and the value of what we've built, but solves researchers' problems in a new way? And I think the biggest thing there is really trust. If you think about the scholarly communication system and the systems we've built to access the literature, interrogate it, and ask questions, it's a gigantic, amazing, beautiful institution of trust. We've managed to put all this trustworthy, authoritative information on the internet in ways that people can access and trust. We talk to people who say they're lost, they don't know what to believe, but when they come to a site like ScienceDirect, they at least believe that this is something on the internet they can trust. And that's hugely valuable. I think it's our core strength and asset as publishers, as providers of information services, and we need to protect it and guard it jealously. But as we move into this, we need other things we can hold onto that allow us to stay on a good path and stay safe. So what are we doing? We are looking, for example, at technologies we already deploy, at how we can use them in new ways and take advantage of recent advances. One of those is domain-specific machine learning and extraction technologies that we've already used in many products, now applied in new ways. And we are also looking at large language models, but not in the sense of a ChatGPT, or something that would just throw everything in the world into it; rather, we're looking at how we can apply them in smaller ways, with a particular focus on focused training data, feeding in the right information to make sure that the results that come out are, again, trustworthy. And we will go slow. As I said, we will experiment our way into it, and we will try things with summarization and Q&A and the like. But I think what we are all unified on is that we need to make progress. There are problems we can solve, but we need to do it in a way that protects the value of the things we've built and delivered. So again: excited, cautious, but excited, and I will turn it over to Corey to hopefully excite you about even more things.

Thanks, Jud. I'm gonna start my timer here so I can track my time, so bear with me one moment. Cool.
So I like going after Jud, because he can talk about the Elsevier side of things, and I'm not gonna talk about Elsevier. I'm gonna riff on our panel title around human-AI interaction, but I'm gonna talk about librarian-AI interaction, specifically in the context of foundation models and a role for libraries and scholars and academics. Foundation models are my preferred term for the kind of generative models we're talking about here, right? All these large-scale, self-supervised, pre-trained models like DALL-E and Stable Diffusion on the image side, and GPT-whatever-number on the text side. And I love this slide from Alex Ratner at Snorkel, because he lays out foundation models on two axes: problem complexity and accuracy requirements. He's basically saying, if your accuracy requirements are low, these things are good out of the box. You can use them for exploratory tasks, creative tasks. But when you get into high accuracy requirements, which is most of our enterprise automation, unless it's a really simple, general task, you typically need to adapt these models further: add more data, do additional pre-training or prompt-tuning or fine-tuning. And I love that because it brings me back to this idea of data-centric AI that I've been thinking about for a few years, really since I saw this Andrew Ng talk where he talks about MLOps, machine learning operations, and the move away from model-centric AI. Training these huge new models is very expensive; most of us aren't gonna be building new pre-trained machine learning models. That work is being done by big organizations, and it's becoming kind of commodified, right? There's a new one every couple of days; I just heard last week that Bloomberg launched a finance GPT. The data side is a lot more interesting, because as you put these things into practice, you're bringing your own data to the table and really getting gains by adding new data to the mix of what the model is doing. Now, the hype in the press the past couple of months would have you believe otherwise. You see all this talk about these generative AIs replacing people, passing coding exams, passing MBA exams, but at the same time, you see hype around the threat of misinformation. I love this quote from Neil Gaiman where he says, ChatGPT doesn't give you information, it gives you information-shaped sentences. It connects to another quote, from Dieter Castel, about it pulling together book recommendations that don't actually exist. So there are some implications for libraries here, right? This thing fabricates information. And if you want to think about why that is, I really, really recommend this article. It's a couple of years old now, by Emily Bender, Timnit Gebru, and their colleagues, called "On the Dangers of Stochastic Parrots." I love their title. They're talking about what these foundation models are. They're self-supervised. All they are is probability sequence generators: probabilistic models that predict the next most likely token in a sequence. And they do it haphazardly, and they do it without any understanding of the meaning of what it is they're saying. I think that's tremendously important, because it's why they sometimes create information out of whole cloth.
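To make that "information-shaped sentences" point concrete, here is a toy illustration of next-token generation. This is not a real language model, just a hand-built table of next-token probabilities, but it shows the core mechanic Bender, Gebru, and colleagues describe: the system samples whatever plausibly comes next, with no notion of truth or meaning.

```python
# A toy sketch (not a real LLM): a "model" that only knows
# next-token probabilities, with no notion of truth or meaning.
import random

# Hand-built bigram probabilities, purely for illustration.
next_token_probs = {
    "the":      {"study": 0.5, "results": 0.5},
    "study":    {"shows": 0.7, "suggests": 0.3},
    "shows":    {"that": 1.0},
    "suggests": {"that": 1.0},
    "that":     {"the": 0.6, "results": 0.4},
    "results":  {"are": 1.0},
    "are":      {"significant.": 1.0},
}

def generate(token, max_len=12):
    """Repeatedly sample a likely next token: fluent-sounding, not factual."""
    out = [token]
    for _ in range(max_len):
        probs = next_token_probs.get(out[-1])
        if probs is None:
            break
        tokens, weights = zip(*probs.items())
        out.append(random.choices(tokens, weights=weights)[0])
    return " ".join(out)

print(generate("the"))  # e.g. "the study shows that the results are significant."
```

Scale that table up to billions of learned parameters and you get fluent text produced the same way: plausible continuation, not verified fact.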
And my one doom-and-gloom slide is on the dangers here. If there's one thing to read in that space, I highly recommend this article by Gary Marcus in The Atlantic, because he talks about the threat of an information-sphere disaster: people weaponizing these things, turning them into misinformation, generating misinformation at scale. I think that's a real danger. We're gonna see it in our political landscapes, we're gonna see it in other aspects, and I think it's a potential threat in scholarship and the sciences as well. But setting aside the risks and moving into the opportunities, I always love to fall back on this quote by George Box, a statistician. He said this in the 70s, I think: all models are wrong, but some are useful. And I think that's a really important way to think about this kind of stuff, right? These models can generate false information, but they're also really powerful if we align them well, if we adapt them to our use cases, and if we ground them. And something I've seen in the last month or so that I've been really excited about is this notion of augmented language models. I have two links here. One is to the Toolformer article, where my image is drawn from, and the other is to a review article on this topic. The idea here is to take the things that these large language models are not good at: they're bad at remembering facts, they're bad at generating citations or tracking provenance, and they're kind of bad at math; they get math wrong a lot of the time. Let's outsource that. Let's not have the model do that. Let's have the model call out to a system that's actually designed to do that, whether it be a calculator or an information retrieval engine, or maybe a library catalog. The first article I saw about this was from Stephen Wolfram of Wolfram Alpha at the beginning of this year, right after ChatGPT had come out. He said, hey, ChatGPT hallucinates, it makes things up, but it's still really powerful as an interface for dialogue. Put something like Wolfram Alpha behind it, which is a knowledge base of factual information that they've built over many, many years, and then let those two interact, so that the language model can correct itself, check its facts, cite sources, and pull information together. Now, I'm not necessarily recommending we use Wolfram Alpha for something like this; it's kind of a niche tool. But I do think we have all these other kinds of knowledge graphs out there. Elsevier is starting to look at this in the context of HGRAPH, which is our health markets knowledge graph, and some knowledge graphs we've been building around the life sciences. Connect these models to those knowledge graphs to ground them and provide more information. And as I think about that, I'm struck by a connection back to the original Semantic Web article, and those of you who know me know I've been beating the Semantic Web drum in libraries for a very long time. The original article by Tim Berners-Lee, Ora Lassila, and Jim Hendler talked about highly functional agents that would use Semantic Web data to get day-to-day things done. I'm starting to think that the reason this never really materialized is that we didn't have those agents. But I think these large language models are those agents finally coming together, and they're going to bring the structured data that libraries and other organizations have been working on for a long time more to the forefront of how we use these kinds of tools.
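Here is a minimal sketch of that augmented-language-model pattern, in the spirit of the Toolformer paper: the model's draft output contains tool-call markers, and a controller executes them against systems that are actually good at the task, here a calculator and a small vetted lookup. The marker syntax, the `run_with_tools` controller, and the tiny fact table are all illustrative assumptions, not any real product's API.

```python
# A minimal sketch of an "augmented language model" loop: the model
# emits tool calls; a controller executes them against grounded systems.
import re

def calculator(expr: str) -> str:
    # Deliberately restricted arithmetic evaluator (digits/operators only).
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expr):
        return "error"
    return str(eval(expr))

def kg_lookup(query: str) -> str:
    # Stand-in for a vetted knowledge source: a library catalog,
    # a domain knowledge graph endpoint, Wolfram Alpha, etc.
    facts = {"boiling point of water": "100 degrees C at 1 atm"}
    return facts.get(query.lower(), "no result")

TOOLS = {"CALC": calculator, "LOOKUP": kg_lookup}
TOOL_CALL = re.compile(r"\[(CALC|LOOKUP)\((.*?)\)\]")

def run_with_tools(model_output: str) -> str:
    """Replace each [TOOL(args)] marker with the tool's grounded answer."""
    def substitute(match):
        tool, args = match.group(1), match.group(2)
        return TOOLS[tool](args)
    return TOOL_CALL.sub(substitute, model_output)

# Pretend the model produced this draft instead of guessing at math and facts:
draft = "Water boils at [LOOKUP(boiling point of water)], and 17*23 = [CALC(17*23)]."
print(run_with_tools(draft))
```

The design point is the one Corey makes: the language model supplies the interface and the dialogue, while facts, math, and provenance come from the grounded systems behind it.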
And this is just from a couple of weeks ago, from a marketing webinar around the Microsoft 365 Copilot launch. I thought this was really interesting, because I only learned about the Microsoft Graph, in the bottom corner of this slide, maybe last year. For Office 365 customers, Microsoft builds a graph of your institutional context. It's reading your emails, your Outlook calendar, your contacts, the structure of your teams and organizations, and your chats, and it's extracting information from all that about your organization and putting it into a graph. And what they're doing with Office 365 Copilot is using that knowledge graph to ground the way the chat model interacts with your authoring tools. So you're working on a PowerPoint presentation, and it can ground the suggestions from the model in your organizational context. That's potentially really, really powerful. And I just want to close this out by saying that library curation is becoming more and more important in this context. Vetted information is necessary for these models to work the way we need them to and to not go off the rails. Foundation models need that grounding, and our tools and systems have the capacity to provide it. And that grounding and vetting contribute to AI safety, which, as I meant to say before, I think is one of the most important areas of AI research going on right now. So how does that grounding work? It works by building knowledge graphs of domains, of the science itself or of the humanities, and of organizational structures in the research landscape. We're working on some of that at Elsevier through things like Pure and SciVal and understanding the research landscape, but I know libraries are very invested in understanding these organizations and their structures too. So let's build those knowledge graphs and expose them with API endpoints, so that these augmented language models can retrieve from them and make sense of them. We can then use that to curate the output of language models, to better track provenance, and to enhance reproducibility in a space where people are gonna be using these tools to advance their scholarship. So that's all I had to say on that, and I'm gonna hand it back to Emily to handle the questions and stuff.

That was fantastic, Jud, thank you so much, and thank you, Corey, for those two really good presentations. It gives us all a lot to think about, I think. So yes, we are going to open up to Q&A, and if people do have questions, you can come to the mic; make sure you identify yourself when you ask your question. And I am going to lead off while you all think about what you wanna ask us. I have a question for the panel after hearing those two great talks. Corey, you talked a little bit about some of the risks and doom and gloom, right, and the way some of this misinformation is an issue, but you also talked about some of the opportunities. And Jud, you talked about the importance of trust. So given the potential for falsification of information and this threat of increased fraud in scholarly publishing and communication, what are some of the actions or approaches we can take as a research community to ensure the integrity of research and trust in science? Why don't you lead us off, Emily, and give us the higher ed perspective?

Sure. So, just as an example of some of the challenges that we face in terms of research integrity as a whole:
About a month ago, we had a company come out and meet with our research integrity officers to talk about some tools they could use to verify things, everything on your checklist around research integrity. And she basically said, look, research integrity officers across the country are not going to use these tools. We don't have the staff to proactively go out there and verify that all of our authors are following rigor and reproducibility and meeting all of these requirements; we're too busy handling the current investigations. Her response was that what really needs to happen, at the higher level of the institution, is to look at partners such as the libraries, working with our authors on proper research ethics and things like that, but then also to look at the publishers and the tools they're developing, and embedding in their products, to highlight some of these issues. Related to that, a couple of weeks ago Toby Green had a piece in Research Information where he talked about how libraries, publishers, and researchers have a history of working together on developing standards, infrastructure, and tools. I think that's a perfect example of what we really need now, something that will help build confidence in the research community because they're involved in the process. So as all of these companies come out with these tools, and there are new ones all the time, and I really appreciated the examples that Emily provided, include the researchers in formulating and testing these systems: who better to tell you if there are flaws, help improve what we're offering, and build that trust? I think if the individuals or companies putting these tools together can be very transparent about how the tools are being developed and what they're checking, that raises the confidence level, not only with our researchers but also with the larger community, that this is information that can be trusted. And finally, one of the things that I hear most often at the University of Florida, and I'm sure this is not unique to us, is that all of our researchers are just incredibly overwhelmed by the administrative workload, filling out a form for every single thing they do. So one of their concerns about research integrity and AI is that it's going to just add more steps to that workflow. The more we can automate, the better, and I think that goes right to what Corey just described: using structured metadata that can move from one thing to another. We're starting to see this with some journals asking for certain checklists and things like that as authors submit an article. So, speaking for researchers, that's something that needs to be automated; we need to make it as easy as possible for them. And I have a whole other tangent about libraries, but I'll hold off on that in case anyone has questions.

Jud or Corey, do you wanna respond to the research integrity question?

I don't really have a huge amount to add to that. I would just say that there's an opportunity side of this as well, where you think about these tools as an authoring assistant, as a way to remind authors of things they should be including in their methods section to help enhance reproducibility. One of the problems we have right now is that a lot of the information necessary to reproduce an experiment isn't captured in the manuscript.
So if there's something like Copilot helping you drive the manuscript drafting process, it might be able to be tuned to remind you that there are certain pieces of information you really should be including in order to help us better reproduce that research down the road.

And I forgot to mention that I think one of the key pieces is looking for those trust markers for articles. I think it was one of your colleagues who gave the example of the model cards or the data cards, and that's an example of the structured metadata that, if it's easily available to readers, again builds that trust for the scientific community.

Great, thank you both. Questions from the floor?

Hi, Mackenzie Smith from UC Davis. I don't see a line, so I'm going to ask you a half-baked question that just flitted through my mind as you were talking. What got GPT-3 across the finish line was that word "large": it's a compression of the entire web. And you both talked about domain-specific models, which I think is where we would like to go. But, you know, Jud talked about kind of an unsupervised subset of the web, and you, Corey, talked about a more knowledge-graph-driven one. So I'm wondering if either of you could speculate on what it will take to get to the point where we have a large enough data set to reach that domain-specific model across all of the research subjects that we work on.

I'll take a stab at this. It's a fantastic question and a difficult one to think about, but I'll try to ground it in a concrete example. These large language models are expensive to train, and they're also expensive to run; the inference gets expensive, and as you expand the context size, which they've been doing with each generation, that inference gets more and more expensive as well. One of the techniques I've seen recently that I thought was really interesting was using the large language model as a way to generate more training data for simpler, smaller language models to work with. So you iteratively ask questions of that large language model about specific things in our landscape, often things like genes and proteins, how they act on particular human cells, and how they act in particular behaviors in the context of disease, right? You ask those specific questions, get back what the large language model has to say, then pick out the answers it's not super confident about, or that you're questioning, put those in front of humans, and get them to correct and improve them. That way you can build a training dataset that you can then run with a much smaller model to predict those kinds of things quickly and easily. So that's part of that adaptability thing I was referencing. I think it's iterative. I think the large language models are gonna be tools that are never really specific enough to handle all the specific use cases, but that can help contribute to solving those problems at a smaller scale, if that kind of addresses what you're getting at, Mackenzie?
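A rough sketch of the loop Corey describes, drafting labels with a large model and routing its low-confidence answers to human reviewers before training a smaller model, might look like the following. The `ask_large_model` and `human_review` functions are hypothetical stand-ins for a model API and an annotation tool, not anything the panelists named.

```python
# A rough, assumption-laden sketch of LLM-assisted training-data creation
# with a human in the loop, for later fine-tuning of a smaller model.
import random

def ask_large_model(question: str):
    # Hypothetical stand-in for a large-model API: returns a draft
    # answer plus a confidence proxy (e.g., sequence log-probability).
    return f"draft answer to: {question}", random.random()

def human_review(question: str, draft: str) -> str:
    # Hypothetical curation step: a domain expert corrects the draft.
    return f"expert-verified answer to: {question}"

def build_training_set(questions, threshold=0.8):
    """Draft labels with the big model; send shaky ones to humans."""
    dataset = []
    for q in questions:
        answer, confidence = ask_large_model(q)
        if confidence < threshold:
            answer = human_review(q, answer)  # human in the loop
        dataset.append((q, answer))
    return dataset

pairs = build_training_set(["How does BRCA1 act in breast tissue?"])
print(pairs)  # (question, vetted answer) pairs for a smaller model
```

The resulting question-and-vetted-answer pairs are what you would fine-tune the smaller, cheaper, domain-specific model on.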
Great question. So, a bit of a half-baked response, and I feel frankly uncomfortable sitting here opining about this technology, because I am not a technologist. You know, I talked about how we try to focus on researcher problems, and that's my job, but I spend a lot of time around people who talk about this, and I'm sensing a growing confidence because of examples we see in the market. You might have seen the Bloomberg model that came out, which is focused on Bloomberg's training data plus a kind of bolt-on from some of the web. The people around me seem to have confidence that we do have enough data, and there have been enough tools, like the Stanford Alpaca model, which is much smaller than ChatGPT but seems to function at a fairly high and similar level of quality, to suggest that we can do these kinds of smaller models. And as Corey said, one of the strengths of our organization, and why I'm confident we can do it, is that we already have a lot of experience with machine extraction of knowledge with a human in the loop to improve it. We have pretty well-established protocols for how we do that; for example, we have a bunch of chemists working on tools like Reaxys already. So I sense a confidence that we can take smaller, highly curated, highly authoritative data sets, put them into these models, and have, again, guardrails, handrails, humans in the loop, focusing on quality and trust. We're never gonna throw something out there that we don't have full confidence in, but it seems like there's potential there. So I find that very exciting, but it also wouldn't surprise me if a year from now we haven't cracked it.

But on this idea of synthetic data generation, the problem is that we know a lot of the data that gets generated from these models is questionably true. So that brings us back to the human in the loop, which is a great concept, but the scale is interesting, right? Even a subset of what the large language models cover would still be incredibly huge. So how would we even imagine structuring ourselves to provide that kind of curated, authoritative feedback? I really don't know. It's a great question, but I suspect we're just gonna have to try a lot of things.

And I'm wondering if either of you could talk about what you are actually trying.

Well, it's challenging to be sitting up here when there's something new every day, and Elsevier is obviously a big, risk-averse company, and I'm wary of saying anything that might not represent the Elsevier position. But I can tell you, and I referenced it a little bit in my talk, that we think there's probably already huge value to be created and delivered for people without using any kind of generative technology.
One of the things, and I mentioned it briefly when talking about the user problem, is question generation and query generation from the text, which would help with question interpretation and query interpretation, to let people do what they cannot do today: throw a bunch of words in, maybe even just paste in a whole bunch of text. And step one of the process is going to be using some of this technology to identify the passages in the corpus. I think I saw somebody estimate that if you took apart the entire ScienceDirect corpus, you would end up with something like 600 million individual passages, and starting to look at those passages, that's our new result list (there's a toy sketch of that passage-level idea below). And we show things, I mean, Emily mentioned topic pages; that's a very blunt-instrument approach to this, but they're an amazing advance in helping people solve some of those simpler problems. They're a godsend to a lot of students, high school students and undergraduates who don't even know where to start, because none of those people are gonna crack open one of these thousand-page books and find the exact passage that might help them write their term paper. So we think there's a lot of value in the incremental approaches that we would need to follow to get to a system that looks more like one of these generative things, which I don't necessarily have confidence we're gonna build and deploy anytime soon. As we experiment our way through that, I think we will learn a lot, and, again, we can do things safely, in ways we're already pretty familiar with, that will help us figure out what's valuable and what's not, and build our way into it. I don't see this as a feet-first approach.
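As a toy sketch of that passage-level idea: split each article into passages, score passages rather than whole papers against a plain-language question, and keep the source identifier attached so provenance survives. A real system would use learned relevance models; simple word overlap stands in for the scoring here, and all the names and the sample document are illustrative.

```python
# A toy sketch of passage-level retrieval: rank passages, not papers,
# and carry the source document ID along for provenance.
def split_into_passages(text: str, size: int = 40) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(passage: str, question: str) -> int:
    # Crude relevance proxy: shared-word count (a real system would
    # use learned embeddings or another trained relevance model).
    return len(set(passage.lower().split()) & set(question.lower().split()))

def search_passages(documents: dict[str, str], question: str, k: int = 3):
    """Return the top-k (source, passage) pairs; provenance travels along."""
    candidates = [
        (doc_id, p) for doc_id, text in documents.items()
        for p in split_into_passages(text)
    ]
    candidates.sort(key=lambda pair: score(pair[1], question), reverse=True)
    return candidates[:k]

docs = {"paper-123": "The mouse model protocol used a modified buffer ..."}
print(search_passages(docs, "which buffer does the mouse protocol use?"))
```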
Over here. Hi, Dan Cohen, Northeastern University (and thank you very much for lowering that speaker; I was talking to myself). I was just wondering if we could think another step ahead, not to the technology, which is clearly evolving very quickly, but to the human behavioral reaction to the technology. When search engines started up, I think we all changed our behaviors in terms of the things we would mention and the keywords we would use, because we knew they would be found, and maybe if we didn't want something found, we wouldn't use certain words in social media, things like that. So I'd be interested in your perspective on how researchers may, in fact, change their behavior, knowing that their articles are now going to be even less likely to be read front to back, but will instead be part of a kind of pool of information that extractions are drawn from. It seems to me that it will continue to devalue, you know, obviously good writing. I say this as a humanities scholar; I know in the sciences maybe writing is less important, but still, I think there will probably be less attention to these kinds of synthetic works, to making a very good article end to end, when you know it's much more likely that a chunk of your methods section will be extracted for better protocols for another study, or will be compared, in a way that might not be visible to the end searcher, with other studies. It's sort of an extension of systematic review, where you could imagine a kind of devaluing of the article itself, which I won't lament here. But I'm wondering if we could talk about one step ahead, where you could imagine: do we need to write articles? Should we just post data sets and some bullet points, rather than having a chat bot complete those bullet points into some kind of end-to-end narrative article, since we don't really care about the language anymore? I'm being flippant here, but you know what I'm saying. I'm just wondering about the human factor here and whether you have thought that through. And again, I don't have any answers, but I'm curious about your views on that.

That's a fantastic question, thank you, Dan. I immediately think of a recent, well, not too recent, back in December, I think, webinar from SSP. Lucy Lu Wang, from Semantic Scholar, was one of the panelists, and she said something along the lines of: what is the purpose of papers, and are they for humans to read, right? And I think we are starting to see that questioning. A lot of science is done by machines, in labs and on computers, right? So can the machines themselves communicate what the science was, and communicate with other machines? Especially with scientific writing and scientific communication, there are a lot of questions around where the human fits in. I'm gonna shoot this one over to you, Emily. What are you hearing from your researchers at the University of Florida in terms of how they might be thinking about the human element in their research?

So I haven't heard as much about it, and I should state that the University of Florida is one of the institutions in the country with a huge AI initiative. In the last couple of years, we've hired AI faculty across all the colleges, and they're embedded in the curriculum and things like that. We have an AI librarian who's actually speaking later today, so she might have some insight into this based on the departments she's working with. But I'm going to take an optimistic approach to this. One thing I've heard over the years is that with the proliferation of articles, it's almost impossible to keep up with the literature. So in some ways, my hope is that being able to really drill down, and to be exposed to a lot of these summaries, hopefully once the citations are a little more reliable, will expose articles, things that are maybe a little bit more hidden. True, some of that could be based on the method, some of that could be based on reproducibility, but my hope is that it would actually help cut through some of the noise. I think that's probably what I would say, just because there's so much information; I mean, no one can keep up with it as it is now.

One thing I would add to that, excuse me, is that I think there's always going to be a matter of personal preference in some of this, right? Even before all this AI stuff, I knew a lot of scholars, particularly younger ones, who don't like to read papers, but would rather watch a five- or ten-minute YouTube summary of a paper. There are a lot of YouTube content creators who put those summaries out there; Two Minute Papers in computer science is a fantastic feed and resource. And I think people are going to use ChatGPT-type tools in the same kind of way, to interact with the research in new and novel ways. But I think there's always going to be a cadre of us, myself included, who really like reading a journal article. I mean, I'm part of multiple journal-article reading groups, both through my PhD program and at Elsevier.
And I really enjoy reading an article with eight or ten other people and then all sitting around and discussing it for an hour, and I sincerely hope that doesn't go away. As long as there are people who want it, I'm not sure it will completely go away. I think this just adds another modality to the ways you might interact with this information, to better serve a broader array of information behaviors and preferences. Go ahead.

Hi. Oops. Thank you all for your presentations. My question is an addendum; my name's Green Bogeeta, Stony Brook. An addendum to Mackenzie's question: this morning in the ML-AI round table, we were discussing, what if all publishers decided to put their LLMs together? Because then it would truly be a really large domain-specific model. I know there are whole layers of architecture and trust, but you, as Elsevier, are you open to it?

Great question. Anyone from Elsevier want to take a stab at that?

So, another thing that keeps me busy is an initiative, which you may or may not have heard of, where we are actually actively trying to bring essentially everything that would be contained in Scopus into ScienceDirect in full-text form, to allow people to access it, find things there, and access it more quickly. Emily and I were talking about that this morning. We have lots and lots of people using ScienceDirect, but most people, if they're sensible, are going to go to a tool like Google Scholar, because that's where everything is. So, you know, we are focused on trying to make tools that are comprehensive and give people the ability to discover across everything that might be relevant to them. So it seems obvious to me that an eventual tool like this, one that uses large-language-model-type technology, is only ever really going to be useful and used if it somehow includes, if not everything, at least most things. And it's absolutely our vision for ScienceDirect that eventually we can evolve it into a tool that does include everything, because that's what people want. And I think, you know, Elsevier's a bit of an oddball in the industry, because it's a publisher but it also has all these other capabilities, and we think that's part of the value we could bring to other publishers: somebody's going to build it, and we're probably building it, so wouldn't it be a lot easier to allow your content to be interrogated in that way and let your customers and users get value out of it? So, yeah, I would certainly be optimistic that that's where we're heading.

Yeah, thank you, Jud. I don't think we have a specific strategy around that, but it's an interesting question, and something that I think about for sure. Other questions, yes?

Maybe one last question. Shimawa from Northwestern. Corey, you mentioned something that caught my eye: you said "curate output." I took it that you meant output from the GPT. Can you elaborate on what you mean by that? Mostly, what do you see or imagine the library's role would be, from the expertise to the infrastructure?

That is a fantastic question, and like many of these, I'm gonna give, I guess, a half-baked answer, because it's such a new area. But I think increasingly we're gonna see scholars wanting to interact with these systems in order to refine their thinking and learn new things.
And in the process, these dialogues are gonna bubble up between a research group and this AI that represents the corpus of all information in human history, and there's gonna be a conversation between those. I think those conversations become information resources in and of themselves, if they are managed correctly and thought about in interesting ways. I think there's a tremendous amount of potential for libraries to add this to the set of things that they collect and then provide access to, in the same way that you're starting to do for other kinds of multimodal content, other kinds of primary-source content, video, whatever else. Why not also include the output of these models, along with the prompts that got it there, when that work was done by a particular part of your institution, right? Is that something that should land in an institutional repository? I think there's potentially really interesting value in that. But again, this stuff is so new; I don't purport to know exactly what that looks like. But I hope that gives a little bit of a flavor of what I was thinking about. And again, it's a fantastic question.

Great, thank you, Shima. We have two minutes left, so we have time for maybe one more question, if anyone has a comment or a question or anything they're wanting to ask. Otherwise, I will just shoot one more question over to all of you. Our topic today is human-AI collaboration. What are some of the new skills that researchers or librarians or people in the publishing industry need to acquire to collaborate and interact with these tools?

So, I was playing around with ChatGPT quite a bit, trying to push it and ask just really bizarre things, and like many people who've done this, I found some things alarming. So I went to Brie, our AI librarian, and she really explained it all to me in a way that I understood, about how it's based on statistics and things of that nature. About a week later, I was on a call with some librarians across the state of Florida who were really alarmed by, for example, nurse clinical researchers coming to them with all these citations and being really concerned because they were all false; it could have led to something in clinical care. The librarians on the call were really disturbed and had a very strong reaction to it. And I was able to take a step back and, using what I had heard and my own experiences, explain a little bit more about the foundations of this AI, what it's doing, and why they're seeing what they're seeing. So I think that for librarians, the more we understand what is behind these systems, the more it helps ease some worries. And then we can also think about learning about the tools that are out there, so that we can guide people to them, work them into our information literacy efforts, and start to work with our students and researchers on the responsible conduct of research and things like that.

Great, thank you very much. All right, well, we are pretty much at time, so I wanna thank everyone for coming today and for all the wonderful questions. Thank you very much.