Hello, everyone. We're just about ready to get started here, and we're going to continue with the theme of trust and automation, thinking about how we can improve the systems we use around scholarly communication by automating some of those processes to achieve trust. I'm Christina Drummond. I know we're still working on getting some slides up here, but I'm the executive director for the Open Access Book Usage Data Trust. It is my honor today to be here with this panel to talk to all of you about a workshop we held just a couple of days ago, immediately before CNI. The conversation is about how we think about and improve our ability, as an ecosystem here in the United States, to share usage data and impact metrics for publicly accessible research.

For those of you who heard the talk yesterday around public access mandates and everything coming out of the Nelson memo, I'll speak for myself: I was struck watching the NIST presentation, because I felt like I wasn't alone. For OA books, we've been working since 2015, with some funding from the Mellon Foundation, to look at some of the challenges when you have these open research outputs, books in this case, and they live in multiple places and you're pulling usage from multiple places. How do you aggregate that? How do you benchmark that? How do you exchange it in a trusted fashion? Long story short, it's complicated, it's time intensive, and you run into issues with data quality. And we wanted to know if we were alone. Is this an issue just with books? Is this a shared issue with other outputs? That really became the basis for this workshop, which was supported by the National Science Foundation. So today it is our honor to bring to you some of the findings and some information about what we discussed.
I'm honored to be joined here by my co-organizer Charles Watkinson and also Niels Stern, who traveled from Europe to help share some European perspectives on how they're handling this with infrastructure as well. So with that, I want to hand the mic to Charles.

Thank you very much, Christina. I'm Charles Watkinson. I'm the Associate University Librarian for Publishing at the University of Michigan, Director of the University of Michigan Press, and this year's President of the Association of University Presses. I'm going to talk about the objectives of the workshop and the perspectives that were represented at it. For after the event, we do have a copy of the workshop program that you're welcome to take if you're interested.

The first thing we really wanted to do was to capture an understanding of the current state of public access usage and impact reporting, because the reality is that there are a lot of projects going on. There's a lot of interest in this, but it's incredibly dispersed interest. And the dispersal is based on the sort of outputs one's talking about: books, journal articles, data sets, and also an interesting area of other stuff. So there are lots of people thinking about this in different spaces, but they've not really come together. The other part of this is that there's a very strong conversation going on in Europe about these kinds of issues, and it's not really manifested in quite the same way in the US. There are also conversations going on around the rest of the world. So the aim is really to try to get a snapshot of our current state.

The second part here is to explore how to achieve FAIR usage data. And Christina has put here "FAIR and CARE to Share," which may be my new favorite motto.
But essentially these are the different values frameworks that are being used to think about data, the principles and the behaviors we want to adopt around data. Of course, FAIR will be familiar to you, but CARE introduces the perspective of indigenous knowledge. And I think it also introduces the concept of the human into data, whereas FAIR is perhaps more focused on the machine. The human aspects of this activity, I think, are right at the heart of what we wanted to explore, because the last session was all about trust, and this is also a space, with usage data, where trust is absolutely central.

And then the third area is to develop recommendations. What we really wanted to do is open up the conversation. We wanted to get some nascent recommendations out there, with the hope that we can nudge a discussion that needs to happen and is currently happening in silos, which is odd because really it's a convergent conversation. It used to be that we cared about different things when we looked at usage data. For journals, libraries cared about use per article. Books were this one-shot thing where you bought the book and then you kind of had it sit there. Essentially all of these are now converging around the idea of public and open access, and how we move from usage information to an understanding of provable effects on the real world. In other words, impact, and how we get from here to there.

Participation brought in perspectives from a number of different parts of the landscape: data infrastructures, consortia, publishers and platforms, funders, and standard-setting organizations. What was interesting in the room is that everybody had multiple hats, so there's a lot of overlap between these perspectives. I'm now going to pass back on to the next slide.

Thanks, Charles. And while you do that, I do want to just elaborate for those who aren't familiar with the CARE acronym.
You know, we're used to FAIR: findable, accessible, interoperable, and reusable. The CARE part really comes back to collective benefit, authority to control, responsibility, and ethics. So making sure that there is a shared common good, and making sure that there is responsibility and ethical use of that data. And when we think about usage data, I'll share a quick quip before we go into takeaways, as I mentioned in the workshop. Raise a hand if you were following the TikTok hearings in Congress. I might just be the only data geek who really loves that stuff. But long story short, the conversation around TikTok being used in public institutions is about usage data, about geolocation data and who gets access to that data. Now we're talking about usage data with respect to scholarship. So I just want to pause and note that these are the same things from a technical perspective, and that we're trying to think about what that means when we're talking about trust and ethical use at scale for usage and impact metrics.

For the takeaways from this day-long gathering, we really asked folks: what's needed, and what's next? And there were four general themes that emerged. The first was around discovering what our shared vocabulary is. What are the exact definitions? When we say usage, what specifically do we mean? Is it a glossary, or are there crosswalks? We have a lot of existing standards in these different spaces that we refer to. So making sure that we have that common communication framework to build from is going to be key.
Second, and perhaps I think of this one because, as you already heard, I'm a data geek and a policy geek: making sure we have a solid understanding of the compliance elements of this, the contractual elements of this, and what that means in the global scholarly communication space, where, depending on where your author may be and where the publisher may be, laws around IP addresses, and whether they're private information, for example, may differ.

Third, we really had a conversation around values and principles, and how we need to understand what our shared frameworks are around data stewardship for usage data. I'd mentioned earlier that, with respect to publicly accessible scholarship, the information about how that scholarship is used could become sensitive. Is it something that needs to be as open as possible, but as controlled as necessary, because of potential negative impacts and unintended uses that could happen down the road? So having this conversation across scholarly outputs and across our different communities, around what those values and principles are, specific to how we steward that data and what the ethical use of usage data is, is something that came out.

And then finally, there was, I would say, a chicken and egg problem. It's always hard when you think about systems and infrastructures to figure out where to start. And I have to credit Tasha Mellins-Cohen from COUNTER for introducing not the MVP, but the MLP, the minimum lovable product. What is a solution here that not only makes our lives easier, but that we love so much we're willing to sustain it, to invest in it, to continue using it? So we had a lot of conversation that circled around what that means.
I'll talk more in a moment, towards the end, but in the book space, for the OA Book Usage Data Trust, we've been leveraging some of the frameworks coming out of Europe, which have been heavily funded by multiple programs through the European Commission, to develop something called an industrial data space. They're looking at how to create interoperable ways for both public entities and commercial entities to share their data through infrastructure, and to provide that data governance layer in the same way we would think of the internet today, where it just works. We want to simplify data sharing agreements, contractual agreements, and all the processing, connecting, data curation, and normalization that we have to do today around usage. So we're exploring that on the book side with the IDS model, and in this workshop it became clear that, as an entire ecosystem, we can take what is being piloted in this very narrow way, learn from it, and perhaps see what is extensible to this greater conversation and what isn't.

So what we're now going to do is go into some initial reflections, just personal reflections from the three of us on what we heard two days ago. And it is very initial, because it was just two days ago. So this is our dinosaur reactions, our gut reactions to what we heard. And Niels is going to start.

Thank you, Charles. And thank you for inviting me to this panel. I'm really delighted to be here. I'm Niels Stern. I'm Executive Director of the OAPEN Foundation and the Directory of Open Access Books, two not-for-profit organizations dedicated to providing open infrastructure services for monographs and edited collections. I was brought in also, I think, to give a European perspective on how we manage, or how we try to deal with, these challenges. Just to give an initial simple explanation of how it works from our perspective: OAPEN provides services for open access books as an infrastructure.
We are operating globally, but we are based in a specific country in a specific field. Now, we are also part of a larger infrastructure called OPERAS, which provides support to scholarly communication in the social sciences and humanities. So that's a broader perspective, also including output types other than books: journals, data, and other things. And OPERAS, again, is part of a larger interoperability framework in Europe called the European Open Science Cloud, the EOSC. In this way, you can see there are different levels and ways to connect initiatives. Things are not perfect, but it's a work in progress, and maybe that can be inspirational in these conversations.

So I was very honored to be part of a workshop dealing with U.S. national explorations on this topic. It was an overwhelming workshop, and I had a lot of takeaways and a lot of things to digest. After just two days, I'm still trying to figure out how to take it all in. I can report back on all four of the initial takeaways presented on the slide here, but I'll start by saying that, while we're talking about a national exploration exercise, I think one of the key takeaways was that it's actually an international challenge, and we need to find international solutions, because there are definitely actors in different places that need to come together. So again, I really like this way of bridging between continents and, of course, thinking of it more globally as well. Also, a lot of the focus was on books, which of course I like because that's my field, but I also think it's because there are still a lot of unsolved challenges in the field of books.

This brings me to the first initial takeaway listed on the slide here: education and advocacy. Quite simple things like definitions and common language were discussed.
For instance, in Europe we talk about open science as all-inclusive, covering all disciplines, including the social sciences and humanities, which I've understood is not the case in the US. This is a small thing, but it does make a difference. There is also the definition of the book, the academic book. In Europe, we have 24 different languages and very diverse practices of book publishing, so we need to agree on what we are actually measuring. This is not a simple exercise, but we respect what we call bibliodiversity, which is very meaningful to what we do. Then we also need to understand what we are measuring and why we are doing it. What kind of impact? Charles was talking about real-life impact, so how do we get there? We need to ask the stakeholders in the field. I'm currently the scientific coordinator of a European project which is convening research funders to discuss what kind of metrics they want for evaluating their policies. So we're trying to help them in developing policies and implementing policies, but also in the impact reporting. And then, of course, we need to identify the organizations that are needed to solve these challenges. There are many small organizations that are vital to the community, and we need to bring them together to solve this problem.

The next one is about applicable law and policy. I was a bit hesitant to talk about law at this stage, and I think that was also a takeaway from the workshop: could we try to talk about trust first, and not go down a potential legal rabbit hole? What we're trying to achieve is basically to find a way of having a shared data space for the exchange of usage data. As long as it's transparent, accountable, and open, we can hopefully maintain a level of trust that will make contributors feel confident in what we propose, and we can try to keep it as simple as possible. I think that was also a takeaway. And we have the tech in place.
We have a book analytics dashboard in place that is currently up and running, another Mellon-funded project we're part of, but we need this clearinghouse. There's a whole aspect around that, which is what Christina mentioned.

And then on the values: it's absolutely important that we come to alignment around our big values and the vision, and I think the Nelson memo did a good job. I was very inspired when the European Commission came out with its work programs seven or eight years ago, really showing how open science and open innovation can help solve some of our big, grand societal challenges. But all that vision, all that narrative, also has to boil down to principles for good data stewardship. So there have to be connections in both directions.

And then finally, I think what came out as one of the best terms of the day was the minimum lovable product, a nice way of phrasing an MVP. I would just finish by saying that my personal minimum lovable product would be to come up with a simple visual articulation of a data space for open access book usage reporting, so that we give a clear overview of what we are actually talking about. Who are the actors? What is the direction that we are heading in? What are the gaps? Then we can share that with you and with other stakeholders, get feedback, and improve our minimum lovable product into something more sustainable, which will then have a better chance of fitting into a larger research infrastructure framework and being useful for scholarly communication across the board. Thank you.

And we agreed that I would go next. Education and advocacy also was really interesting to me. I think that we are seeing a lot of investment in open infrastructure. Niels has some leaflets, actually, which are very, very interesting, about investing in open infrastructures for scholarly books.
And it was just announced today that our facilitator, Katherine Skinner, has just been hired by Invest in Open Infrastructure, and that Invest in Open Infrastructure has received a million-dollar Mellon grant. I think that's very exciting. So as we think about investing in open infrastructures, let's not forget the usage infrastructures. That would range from the identifier suppliers, so Chris Shillum from ORCID was with us, through to some entities that we just assume are sustainable, but may not be. One of the particular ones that came up in the workshop was COUNTER, Project COUNTER. Of course, all of us as publishers and libraries are relying on COUNTER stats to a greater or lesser degree. But we learned from Tasha Mellins-Cohen, who is in one of the videos we will share a link to in a moment, and who wasn't able to attend in person but gave a sort of testimony by video, that COUNTER has 0.5 FTE in paid staff. So the whole infrastructure we're relying upon to actually analyze usage stats has half a paid person at its heart. That is a pretty scary thing. Tasha has just taken over as director, and she's managed to get another half person to be this education and advocacy person.

Thinking from a university press perspective, it's really striking how many of our members, AUPresses members, who are mostly very small publishers, don't know how to create a DOI. Many of them don't know what an ORCID is. And as we heard from the public access reports out yesterday, a DOI is absolutely central to being recognized by a federal grantor, right? So that's really scary. We've got a big gap to bridge, and we really need to think about these organizations that we rely on and their health and sustainability.

The second thing is really around values and principles and applicable law and policy. And that's this point that trust is core.
Christina and I work together with Charlotte Lair, also in the audience, and a number of representatives on this Open Access eBook Usage Data Trust project, which is funded by the Mellon Foundation. There's a parallel project called the Book Analytics Dashboard, which is closely connected but is deliberately kept separate, so there's an arm's-length separation. The Book Analytics Dashboard is really interesting in that it's done very well with getting usage stats from the usual suspects, as it were: from JSTOR, from Project MUSE, from OAPEN. What's problematic is that it's now running into barriers. Try getting usage stats from ProQuest, try getting usage stats from EBSCO, try getting usage stats from a publisher who wants to keep things close. But also try getting usage stats from all of the library repositories that are actually holding book chapters, journal articles, et cetera.

One of the other participants in our project is Joe Moore from IRUS. IRUS, as you may know, has normalized usage data from 209 repositories, but now it's run into the problem that Joe doesn't know whether IRUS can share the normalized data from all these individual repositories with publishers, because of this lack of clarity over who has guardianship over which data. That chilling lack of clarity, and the lack of a trusted space, is making it impossible for publishers, for authors, and for funders to get a 360-degree view of how we're advancing from openness to public impact. So it's a really deep problem, and trust is at the heart of it. And for those of us who say, "I just want my dashboard," the problem is that we'll only get to a certain level, because all these open outputs are being re-deposited across multiple platforms.
And unless there's a trusted way for those platforms to share the data, one that is mutually beneficial for all of them, using a technology or a framework like the industrial data space, which is used in far more competitive industries, there will be no advance in our ability to measure progress towards the grand challenges we seek to address, or to link openness and public access to them. And I'm going on too long.

So lastly, I'll just say that I think a particularly interesting problem is looking at these "other stuff" outputs. We're seeing scholars who are increasingly developing digital projects that have multiple components in them. They're not book-like, they're not journal-like. They incorporate articles and video and data sets, et cetera. And these new types of outputs have even more complicated usage issues. Just slightly anecdotally, we at University of Michigan Press now publish 75% of our frontlist open access, thanks to the libraries supporting Fund to Mission. There are two books that hit the COUNTER stats hard. They have many, many more COUNTER stats than the others; together they have all the usage that MIT Direct to Open reported yesterday. But that's very skewed, because they're not books. They're book-like things with multiple resources on the Fulcrum platform connected to them, and all the COUNTER usage for those resources rolls up to the book level. So we're kind of gaming the system. When we tell you, hey, we're more successful than MIT Press, that's completely false. It's because we're relying on a COUNTER system that needs to keep adapting, and needs to keep researching, as these outputs change. So that's me. Christina, sorry, over to you.

So a couple of quick observations, and then we'll shift to questions. The one thing I will note is that, as you hear about usage data aggregation, I'm sure there are some in this room who are thinking, isn't this an easy problem? We have technology.
Can't we just build a data warehouse or a data lake? There are open data lakes that I'm sure some of you have heard about, and that some of our consortia are building. But the problem is not technology. It's about governance and trust. Can an entity trust that the data they put into a space is used and protected in the way that they need? I'll note that this really is disruptive innovation. The International, or Industrial, Data Spaces model, the IDS coming out of Europe, is radically different from how we do things today. It is a decentralized, semantic web solution where the data is sovereign. It stays with the entity that creates the data and is processed in transit to where it needs to go. So I'll just footnote that there is a lot we will have to go through as a community as we learn how to adapt those types of frameworks and technologies to the governance and stewardship principles we develop. I think we have to start by defining those values and principles, keeping an eye on ethics and on what we truly need from this usage and impact information, before we get to the technology.

With that, we did want to save some time for questions. As Charles advances to the last slide, I'll just note that one of the great things about this workshop is that we had 11 invited talks, five-minute talks from a number of the players in this space, and we wanted to invite you to watch them. You can see the ones that were prerecorded; the ones that happened on site are not yet in there. But for those of you who may be interested in taking a deep dive on the current state, you can watch those videos. If anyone's interested in finding out more, you're welcome to contact me, and I can put you in touch with the players or engage you in the work we're doing on the book side. But with that, let's shift to questions, reflections, and any thoughts from all of you. I'm going to keep this slide up there so you can get the snapshot.
And I just want to say, we've sort of shortchanged Christina on her reflections, but I really wanted to pay tribute to Christina. She's bringing the whole concept of the industrial data space into our industry, and it's really, really exciting. I also wanted to say, please do come up at the end or at the break and talk to one of us, because we definitely see this workshop as the start of the gathering. We'd love to see some more videos in our collection, hearing about other projects in this space. So please do come up to us or email Christina after the break. But if there are any questions or comments, we'd love to hear them.

I'm a senior research scientist who looks at a lot of data sharing and reuse among researchers, and I'm currently working on a paper, actually with folks from the University of Michigan, looking at trust in data repositories. We're finding that there are typically several precursors to trust, right? There are factors that lead to it. I'm interested to hear what you're thinking around trust, if you could break it down a little, because there are different factors: integrity, or benevolence, or structural assurance. There's also identification. So there are different ways in which you can do that. And you're dealing with a complex community, those who are giving you data and those who are using the data, so trust can vary across those. I'm curious about your thoughts on that and how you're thinking about those things.

That's a wonderful question, and I'll footnote that what we got to do with Mellon's support was to really do a lot of literature review around this. There are some excellent resources from the Open Data Institute around trustworthy institutions and data collaboratives and how to sustain them. I'll say that there are layers of trust. Everything from cybersecurity and technical controls.
We were lucky to have Ken Klingenstein, from the InCommon identity federation and Internet2, with us. And it goes all the way through to making sure that you do no harm. So trust needs to come both from those who put in the data, that it's secure, and from those who rely on the data, that they know its quality and that the data is what it's supposed to be. But at the end of the day, and this is where we get back to the CARE principles, we need to have trust in this ecosystem that we're not doing any harm to the scholars and to the readers who are reflected within that information.

And we have time for one more question. And by the way, we would love a video from OCLC. Hint, hint.

Hi, thank you. Bruce Heterick from ITHAKA and JSTOR. We've been working for the past several years on building infrastructure services for the dreaded "other": the special collections, distinctive collections, primary source collections. You've focused on books, journals, and data sets, and I'm wondering, is there really room for the other in your conversations? Because I can tell you, from aggregating thousands of collections from hundreds of libraries, there is a big question about usage data and impact there that people are trying to get their arms around.

CNI is such a great meeting. I had the chance to talk with a senior developer from UT Austin Libraries on the first day, just at one of the tables. He was talking about some of the special collections that UT Austin is making available, and about the usage data that comes from special collections, but also about the letters that they receive from the people whose lives these collections have touched, in many cases people from marginalized communities. That is impact to him. That's what keeps him in our industry. So I think the special collections conversation really needs to come in there. And I think that's maybe a missing group for us.
And we should make sure to actually incorporate them. So we have arrived at the time for the break. Thank you very much. Do come up and say hi, or email Christina, or talk to us at the break. Thank you so much. Thank you.