I would like to invite Sarah de Rijcke to the podium. She will be talking about surveying the global uptake of research assessment reform. The floor is yours.

Hello, everyone. For those who don't know me, I'm a professor of science and evaluation studies at Leiden University, and co-chair of the Research on Research Institute. Over the past 10 years, I've tried to develop a research agenda around the interactions between evaluation and knowledge production. I also spend half of my time trying to intervene in policy settings, for instance by way of evidence-informed advice. So, as a disclaimer, I've been involved in several of the responsible metrics initiatives that I'll be talking about, including the publication of the Leiden Manifesto for Research Metrics in 2015 and the UNESCO Recommendation on Open Science in 2021.

I'll be talking about the state of play when it comes to the global uptake of research assessment reform. Reform movements for responsible metrics, or responsible research assessment, started to emerge around the 2010s with interventions such as the San Francisco Declaration on Research Assessment (DORA), the Leiden Manifesto I already mentioned, and the Metric Tide report that James Wilsdon, who's also here somewhere, led on. The principles shared across these reform documents include not relying on journal-based metrics when making assessments, and only using indicators to support rather than replace expert judgment. More recently, these reform initiatives have also broadened out to include movements for open science, research integrity, and diversity, equity, and inclusion. They all share the concern that an excessive emphasis on metrics overlooks and also disincentivizes other important academic contributions, including teaching, collegiality, openness, and integrity. I recently chaired a global scoping review on the future of assessment.
For that review, we looked at the regions from which these movements emerge. Much momentum is currently coming from Europe, perhaps with the exception of global actors like UNESCO and some regional actors who have also championed research assessment reform. These efforts use campaigning to raise awareness of problems around metrics, but have also issued global standards and principles of good practice, and have appealed to self-regulatory mechanisms, like asking individuals and organizations to sign pledges. They place much emphasis on responsibilizing individuals and organizations to become better citizens and to rework what is meant by good evaluative practice.

Now, if signing DORA can be seen as an indicator of the impact of the responsible metrics movement, then we would have to conclude that this movement hasn't yet had much impact here in the US, because as of February, only two academic research-performing institutes had signed DORA, along with only four academic departments and three university libraries. That's not many, given that the declaration is 10 years old and smaller research systems like the UK and the Netherlands have scores of academic institutes that have signed it.

For the rest of my talk, I will draw on survey data and interviews that my colleague Alex Rushforth and I undertook as part of Project TARA, a project led by DORA. You see the current team members here on the slides. I'd also like to flag that Anna Hatch, who's also here (there she is), was the program director who actually made this all happen, and she was the guiding force in getting the funding from Arcadia.
What we tried to do with TARA is to identify, understand, and make visible the criteria and standards universities use to make hiring, promotion, and tenure decisions: by developing an interactive online platform, a dashboard that tracks criteria and standards; by developing a toolkit of resources; and by doing research on US academic institutions to understand attitudes and approaches to assessment reform.

For the survey and interview part of TARA, Alex Rushforth and I wanted to know how researchers and evaluators in the US engage with the responsible metrics agenda, and how notions of responsibility are experienced and practiced amongst them. Our survey focused on current practices around hiring, tenure, and promotion assessments, and on whether organizations had or hadn't signed DORA or taken other measures to reduce reliance on metrics in these procedures. In this talk I focus on the qualitative data from the open-text questions of the survey and on the data from the interviews. We held 18 in-depth interviews after the survey, which covered issues ranging from respondents' awareness of these responsible metrics campaigns, to their uses of metrics in hiring, promotion, and tenure, to their views on the prospect of reforming these procedures in their institutes.

So what we looked at, basically, with the survey and the interviews was how fluent respondents were with the responsibility language around the DORA statement and the responsible metrics movement, and how they engaged with the reform movement's responsibility language. We were asking: is it appropriate, or when is it appropriate, to use metrics? What are your professional obligations to use these tools responsibly? And do you recognize the accounts of good and bad practice advanced within these responsible metrics reform movements?
I'm an STS scholar, and this will show now, because I'm drawing on some literature from science studies that we used to understand our materials. One strand is the citation folk theories literature. Folk theories are generalizations based in some experience but not necessarily systematically checked. Their robustness derives from their being generally accepted and thus part of a repertoire current in a group, or in our culture more generally. These types of studies are great for understanding whether and why scientists find citations valid or important tools for assessing impact, but what they don't do, or haven't done, is link citations to questions of professional responsibility. And that is something the responsible metrics movement prompts scientists to do.

This is why we also turned to science studies accounts of other responsible science reform movements, and in particular to the concept of responsibility languages, coined by Arie Rip to analyze the responsible research and innovation movement. Responsibility languages set out a grammar for responsible action, packaged in the form of rules, standards, principles, mantras, and narratives, and they seek to transform the world by pushing a division of moral labor. Texts like DORA, the Leiden Manifesto, and the Metric Tide offer model languages that universities can copy into their documents and websites to communicate that they are responsible evaluators. General characterizations of bad evaluators also circulate within the movement: for instance, individuals or organizations that persist in using the journal impact factor, or that replace expert judgment entirely with metrics. But the question is: does this new division of moral labor travel and reorder the world as much, or as straightforwardly, as advocates hope?
Multiple responsibility languages are floating around in today's science, and even scientists unfamiliar with or uninterested in guidelines or standards for responsible conduct will still most often identify with a responsibility to do good and to do good science. These kinds of bottom-up accounts may or may not align with the new divisions of moral labor that a reform movement's responsibility language articulates.

So, tentatively and without trying to claim any generalizability here, our overall impression was that there was not a very strong familiarity amongst our US-based respondents with the language of responsible metrics. Informants rarely referenced specific points in responsible metrics statements, and displayed a lack of awareness of who is behind the strong push for metrics; nobody, for instance, mentioned publishers or research intelligence providers. And they had a clear propensity to present their own bottom-up responsibilities, which were different from the reform movement's language, or were similar only by coincidence.

I will give you some quotes from the interviews before I wrap up. These correspond with three types of responses to the problems and solutions set out by responsible metrics reform movements. One of these largely agreed with the framing of the problems, while the other two kinds of responses were more ambivalent.

"I would shut it down if someone wrote it on their CV. Needless to say, I have to write a lot of letters for promotions, and I will see that on people's CVs. I've never seen anyone, none of my faculty have ever put that on their CV, and I would have it removed. This is the kind of thing where I'd go back to them before we send the stuff out and say, take that off."

So this respondent clearly signals that they are responsible, and take it as their job as a senior academic to challenge any uses of inappropriate indicators when they encounter them.
After the interviewer read out a section of the DORA statement calling for abandoning JIFs, this interviewee responds:

"Yeah, exactly. I would say that that's in line with what I was saying, although we like the differences, that if it is there and noteworthy, we would say something about it. We don't base the assessment on that."

So several respondents firmly argued against any notion that assessments were skewed towards metrics, arguing that metrics appear in certain parts of the decision-making process, but do so alongside other considerations and do not dominate assessments.

The core emphasis of the pragmatic rejection accounts was on the need to live with the imperfections of metrics. It goes something like this:

"I think for the most part, people kind of begrudgingly are okay with it, in the sense that it may not be the best, but it's probably the best alternative that we have. And so if there were other alternatives, maybe some of the types of metrics that we've talked about that gain more traction, open science in this case, as we discussed earlier, maybe they could gain some momentum. But otherwise I would say that people are kind of, this is the way that we've done it and it's worked out okay. So we're just going to keep going down that path."

So this is obviously a very different repertoire from the technical and methodological criticisms of the journal impact factor and the h-index that reform advocates, and before them scientometricians, have used. When people defend the persistence of the JIF and the h-index, they implicitly reject the bad-expert construction that the responsible assessment reform movements are using. So again, tentatively, it seems that among our respondents the responsible metrics agenda wasn't very visible, and that even when presented with information on the movement, scientists may still construct the division of moral labor around indicators in different ways to the movement.
And that means that the movement's problematization doesn't disrupt or displace already embedded bottom-up responsibilities. On its own, raising awareness and providing further information on the technical limitations of indicators, or on better alternatives, is not enough for mainstreaming this agenda. We hope that this relatively small-scale study has some broader relevance for empirical research on reform, given that reform movements have an international, or at least a large regional, focus. We hope that it provides a potential blueprint for posing questions of responsibility to science elsewhere. And I'll leave it at that. Forthcoming: the preprint of this study will be published in May, the scoping review I mentioned will be published this Thursday, and I'll also be talking in a podcast on all things good and bad about research assessment in a couple of weeks' time. So thank you.

Thank you very much. We have time for just one question.

Hi, I'm Chanetta Jones. Sarah, thank you so much for your presentation. We've heard a lot about the reward and incentive system, and presumably that's the same around the world. So the question is: what makes Americans different? I'm wondering if you and your team have speculated on this, and if there's any hint of some national research assessment structure that might be linked to what you're observing.

Absolutely, that's a good one. So this is an answer that could take an hour or 30 seconds, but there's a lot going on, yes. Definitely there's the link you pointed out with the capacity of science systems to move when they have a national research assessment scheme, like the UK and the Netherlands that I mentioned. Also, the funding structures for universities are very different in the US. But still, I think it's striking that this type of policy incentive hasn't really caught on when you compare it to other policy pushes, around open science, for instance. Three seconds? [Laughter]