Hi everyone, and welcome to the workshop on collaborating to reduce research waste. I'm really excited about this session, and I'm delighted that we have our three panelists here to discuss the issue: Mel Bond, Tom Luechtefeld, and Alex Bannach-Brown. This is a recorded session; however, while it's being played on YouTube, the panelists will be available live to answer questions in the Slack channel for this workshop, number four, Collaborating to Reduce Research Waste, and they will also be available in the YouTube live comments. So, take it away.

Okay, so did you two want to introduce yourselves briefly?

Yeah, sure. I'm Tom Luechtefeld. I'm the founder of sysrev.com. Sysrev is a platform for doing systematic review, and a lot of the motivation for building it was actually to reduce research waste. Sysrev is an open access platform, so projects are publicly accessible. I think that's the short introduction to me.

And I'm Alex Bannach-Brown, a research fellow at the Berlin Institute of Health at the QUEST Center. I've been working in the systematic review space for a while, building tools to help us automate systematic reviews at any step where a tool can help. Over the past couple of years we've seen new tools emerging, on top of the loads and loads of tools we already have, so just keeping up is one challenge. I'm really excited to be part of this panel.

And I'm Mel Bond. I'm an EPPI-Reviewer support officer based at the EPPI Centre at University College London in the UK, and I'm also really excited to be part of the panel.

So, we have a list of questions that we thought we would discuss. I'm going to ask Alex the first one: what do you think constitutes research waste, and when does a tool contribute to research waste?

Yeah, I think that's a really good place to start. You might think that having two or three different tools that do exactly the same thing constitutes research waste, but I wouldn't necessarily say so, because one of the tools might have been developed for a specific context. A tool could have been developed using an underlying set of assumptions, including statistical assumptions, and we need to know the conditions under which a tool was developed in order to apply it to new research. So while there are so many tools out there that it's very hard to keep track of which is the best one to use — with the Systematic Review Toolbox, you can go and find new tools all the time — I don't necessarily think that's research waste. But I do think that when we're developing new tools, we need to be careful to check what's out there already.

And Tom, when do you think a tool is redundant, or becomes redundant, in terms of research waste?

Yeah. In terms of when a tool becomes redundant: a little bit of redundancy is a good thing. People developing the same ideas and taking different approaches to them is very useful; that's how we make progress and build robust systems. But in systematic review there's a special kind of research waste, which is people looking at documents and extracting information from those documents. If you go on systematic review registries and search for a chemical like mercury, you'll probably find a hundred-plus reviews of mercury. And while those reviews might have different topics, the people doing them may review the same documents and extract similar information, and that's real human time being wasted. A special case of that is when the data extraction being done by humans could be automated, which is a particularly depressing kind of research waste: you have people extracting information from documents that is readily available, but not in a machine-readable format. Maybe not a great example, but think of the species in a paper. The species that was studied may be very obvious, but it isn't available to you for a statistical analysis, so you have to have people extract that information. So in terms of redundancy, yes, we need to make better tools that allow people to access that data and reduce both wasted human effort and redundant tools.
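To make that "machine-readable" point concrete, here is a minimal sketch of what publishing such a data point as structured data could look like. The schema is entirely hypothetical — none of these field names come from a real standard — but once a record like this exists alongside the paper, no one has to re-extract the species from the PDF by hand.

```python
import json
from dataclasses import dataclass, asdict

# A hypothetical schema for one extracted data point. The field names are
# invented for illustration; the point is only that information like the
# species studied can be published as structured data instead of being
# re-extracted by every review team.
@dataclass
class ExtractedRecord:
    doi: str        # identifier of the source document
    species: str    # species studied, stated explicitly
    chemical: str   # substance under study
    endpoint: str   # outcome that was measured
    value: float
    unit: str

record = ExtractedRecord(
    doi="10.1000/example",  # placeholder DOI
    species="Danio rerio",
    chemical="mercury(II) chloride",
    endpoint="LC50",
    value=0.9,
    unit="mg/L",
)

# Serialize to JSON so any downstream tool can consume it programmatically.
print(json.dumps(asdict(record), indent=2))
```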
Yeah. Mel, do you want to take that question as well?

Well, I was just thinking about research redundancy; it's kind of what you were saying, with people extracting the same sorts of data from the same research articles that are already out there. I'm wondering what kind of repositories, what kind of infrastructure, we need in order to create more awareness of the fact that these extractions already exist. You see so many reviews covering the same sorts of topics, even within my space, education, which is only just starting to rev up in terms of the number of systematic reviews being conducted. There's already a lot of overlapping research that just revisits the same ideas without really expanding on them or gaining further insight into anything. Is that simply a lack of awareness that the research has already been done? And what could we do to increase that awareness within the community? I don't know if you have any ideas around that.

I think that's a really good point, Mel, and I think it comes back to Neil's points on open synthesis: understanding the underlying assumptions behind the way the data was collected. We need to know how the systematic review protocol was made and the settings under which the data was collected — how it was done, was it two students, was there one PI, were there five PIs? And I think the same goes for tools. If there's a lack of transparency about how a tool was developed — a lack of documentation — then people are reluctant to use it: either they don't know about the tool, or they don't trust it because they don't know how it was built. And the same goes for data. If you made the meta-analysis data that you extracted individually from all those graphs available, but without any information on where your protocol is or how that process happened, people would probably be reluctant to use it because they couldn't be sure of the quality.

The community aspect is really similar to the open source community, so GitHub is very relevant there, and other Git platforms as well. When you're writing code — at least in my process, and I know in the process of many others — you usually look first to see what open source packages exist, and a lot of the time, when the license is amenable, you just use existing open source packages when building new tools. So in programming there's this really rich ecosystem of tools being built for every purpose. In the document review and systematic review space we have a similar situation starting: there are some open source systematic review tools now, and that's one way to approach this redundancy and waste issue. But the data is at least as valuable as the tools being built, and the tools we have for making that data interoperable and reusable are not great right now. I don't know what the solution is. Making data open access is part of it, but it's not the only thing, because if we make the data open access on our platform and somebody else does it on their platform, you just get a hundred places where people can access data, and it's overwhelming. Maybe there's a technical solution; I hope we can figure out what that is together.
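One small illustration of what such a technical solution might involve: if two platforms export the same kind of extracted data under different column names, a thin normalization layer can map both onto one shared schema. Everything below — the platform exports, the column names, the values — is invented for illustration.

```python
import pandas as pd

# Two hypothetical extraction exports from different platforms that use
# different column names for the same underlying fields.
platform_a = pd.DataFrame({
    "doi": ["10.1000/a1", "10.1000/a2"],
    "species": ["Apis mellifera", "Bombus terrestris"],
    "outcome": ["mortality", "foraging behaviour"],
})
platform_b = pd.DataFrame({
    "article_doi": ["10.1000/b7"],
    "organism": ["Apis mellifera"],
    "endpoint": ["mortality"],
})

# Rename each export's columns onto one shared schema, then stack them so
# downstream analyses see a single normalized table regardless of which
# platform the extraction came from.
normalized = pd.concat(
    [
        platform_a.rename(columns={"outcome": "endpoint"}),
        platform_b.rename(columns={"article_doi": "doi", "organism": "species"}),
    ],
    ignore_index=True,
)
print(normalized)
```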
Okay, well, we had thought of a slightly different take on things, when it comes to people doing projects where they need to undertake systematic reviews and potentially look at creating their own tools to do that. Often the projects have a definite lifetime — a length of time that they're funded for — and that can be a real problem when it comes to building tools and also making data available, in terms of who's going to be responsible for keeping it available going forward. Alex, did you have any thoughts on how we might improve that situation?

Yeah, I think there are really two things there. In the academic space, projects are funded on a short-term basis, and there's not necessarily a legacy plan for what's going to happen to the data, or to the code or the tool, once the project is finished and the funding runs out. It's about how we can create a community around some of these tools, and I think GitHub and our community are some of the best places to grow those kinds of communities, so that we can say: hey, I've got a tool out there, we had funding, but only until the end of January 2021 — is anyone working in this space, and does anyone want to take over? That, I think, could be a really valuable way to make sure the tools live on: if a tool is being used and is useful, either the code gets integrated into something else, or there's someone there to maintain it.

Do you think that's something that could be built into research funding application requirements — at the policy level, where we need to be making policymakers aware of these kinds of issues? Do you think that could have a positive effect?

Definitely. I mean, engaging key stakeholders, funders being one of them. Sadly, applying for maintenance funds isn't very sexy to put on a grant application; most funders want to see the development of some shiny, beautiful new tool that's going to solve everything. But we need to recognize the reality of tool development: it needs expertise. Where software is built around research, you've got very research-heavy people and very software-development-heavy people, and depending on the resources in a project, those two groups don't necessarily overlap.
So you might have a researcher building a tool with very little coding experience or awareness of good coding practices, or even awareness of what GitHub is. And on the other hand, you might have groups of tool developers making amazing, shiny tools that end users aren't engaging with — which can also contribute to research waste in this space.

Tom, did you have anything to add on that?

Yeah, sure. I think a lot of it comes down to the value of the research. I actually don't think it's necessarily a bad thing that if you have a graduate student who's just starting, and a grant or a department that can support them, you give them some sort of project. You want them to do some original research and to learn rapidly, and having that starting-out student work on an existing review is a good way to do that. But you also probably want them to create something of their own, something where they're independent and have led the research; a lot of the educational process is about learning how to create your own things and become a leader in a space. So I don't think you want to get rid of that. But you do want to recognize that some research projects are really valuable to society, or create data that's actually useful for end-use applications. Once you start creating those kinds of reviews, ideally you identify an actual market for that data — it could be data that's valuable to government. One good example is reviews in chemistry. Most of what I do is related to chemical hazards, and regulating chemicals based on their hazards is important, so collecting that information and making it available in an open access way is of value to regulators, chemical companies, and so on. So — none of us is saying this, but — it's easy to look at the tools and say we just need to make the tools better and easier, and then people will start creating these long-maintained things. Really, though, it comes back to the funding and to making these projects valuable; once you do that, the rest will come quickly. And it is happening: just the fact that we're having this conference, and that people are making all of these tools, is recognition that there's valuable work to do in this space. I think we're going to get there.

I wanted to jump in with a question as well, I hope that's okay. I was just listening, excited about these synergies, and I wondered if you could all talk about the potential of collaboration — not only to reduce research waste, but the positive side of what it means for developers and synthesis scientists, and what kinds of opportunities it opens up.

Yeah, Emily, thanks — that's a really good question. One of the main things I can think of is learning as a community. We have people with all different kinds of experience, and having these conversations, bringing groups of people together, really means we can learn a lot from each other: sharing best practices for code, best practices for sharing, open data, and things like that.
That's something really valuable that we get at amazing conferences like this one, and that the Evidence Synthesis Hackathon has been great at facilitating over the past couple of years as well.

Just becoming aware of people doing the same kind of work you're doing is extremely valuable, and that comes back to open access projects and registries for projects. If you can identify somebody doing a project that's similar to yours — even if it's not exactly your project — there are some really interesting opportunities on the technology side. Say two people are doing research on insects and extracting similar information, but with different end goals: maybe we can make their data interoperable, so that people reviewing the same documents can share the information they extract in common, even if they don't extract all of the same information. I think that's a really interesting future to look forward to.

Definitely. And that would potentially mean reducing the strain on resources — pooling knowledge, skills, and abilities between research teams, across communities, across countries. That could have fantastic benefits, I think.

Yeah, that's a great idea. I would love to see more opportunities to share knowledge across the different tool sets. I know that even I have only a limited skill set when it comes to development tools, but I'd love to see more openness in terms of open practice and sharing knowledge within the community, and with research students as well, so that even undergraduates start coming through with these skills before they reach the stage of applying them in a more consistent and frequent way.

Yeah. These conferences are probably the only way I can think of — I guess we're in one, so it's hard to think of other things — for people to recognize that somebody is doing similar work and ask: how can I collaborate with them rather than just reproducing what they already did? Frankly, what I usually see in my space is that when we go and talk to a group that's building similar tools, they're rebuilding the same tools we built. And you want to say: stop doing that, how can we collaborate — it's a waste of time and resources. Mel, what do you think about other ways of finding the people who are doing your research right now? What tools are you aware of?

Well, I have to be honest: I use Twitter a lot, and I use ResearchGate, and my professional learning network is generally based around those two social networks. I've just jumped in on a hashtag, found other researchers with similar interests, and gotten in touch. In fact, that's how Neil came to invite me to this panel: he'd seen me tweeting madly about some great new evidence gap maps that people have created using EPPI-Reviewer and EPPI-Mapper. Even ResearchGate, to an extent, can let you connect with like-minded researchers, and LinkedIn works really well to a certain extent too. But again, I'm in a slightly different space than you two, I think. Alex, do you use networking tools other than those?

Yeah, I mean, I use social networks like those too. But I think it comes right back to being open. There's a sense in academia that we're competing — competing for funding and so on — and I hope we're slowly seeing a change towards the recognition that science is for everybody, that science should include everyone. Something I hope we've seen with a lot of the COVID research is that we make much longer strides when we work together. So step one is actually just being able to share your research ideas on something like Twitter, or wherever you're active.

COVID is a really interesting example. I've seen a few people doing systematic-review-type research on it — some people have done that on Sysrev too — but just the sheer number of articles that have already been published on COVID is enormous, and collecting that information is really critical to public health, as the presenter just before us was saying. I guess we haven't brought up the idea of living reviews much, but when we have topics that generate such a huge volume of publications, there's a lot of value in collecting that information, understanding it, and discovering how it's changing over time. I think we really need better tools for doing that.
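One ingredient of any living-review pipeline is a repeatable "what's new since the last run?" query. Here is a minimal sketch of that step against PubMed, using NCBI's public E-utilities esearch endpoint; the search term and date window are placeholders standing in for a real review's query and update schedule.

```python
import requests

# esearch is part of NCBI's public E-utilities API. We ask PubMed which
# records matching the review's query were added within a date window.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    "term": "covid-19 AND systematic review",  # placeholder query
    "datetype": "edat",       # Entrez date: when the record was added
    "mindate": "2020/06/01",  # i.e. records added since the last run
    "maxdate": "2020/06/30",
    "retmode": "json",
    "retmax": 100,
}

resp = requests.get(ESEARCH, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()["esearchresult"]

# New PMIDs to feed into screening for the next iteration of the review.
print(result["count"], "new records")
print(result["idlist"])
```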
We've actually got one happening. The EPPI Centre has been doing a living review of the COVID-19 research that's been coming through, and they have a massive evidence gap map available through the EPPI Centre website. And I've been doing two other living reviews on the teaching and learning research happening during the pandemic — one based on the K-12 school literature, and one based on higher education. We're extracting our data through EPPI-Reviewer and then using EPPI-Mapper, which is itself available for anyone to use, and uploading the maps to our website. But getting the knowledge out there that we've done this, and sharing it with other people, has been quite difficult. I've started resorting to emailing the authors of the studies in my review just to get the word out. Even then — we're doing the work and making it publicly available, but how do you make people aware of it? That seems to be another issue.

Yeah. And I think COVID has also highlighted the need for interoperability. Like Tom was talking about: you've extracted information from some data, and guaranteed there's someone else who wants that information. How can we use the tools we've got available? How can we use APIs? How can we use data sharing to reduce waste in that space? We're hopefully finding new ways of doing things quickly in the COVID era.

We saw a presentation about electronic lab notebooks becoming publications, and as people who review papers a lot, it's pretty easy to look at everything through that lens. But there is certain well-defined data where everyone publishing in a field knows that everyone else is interested in it; in my field, it's chemical hazards. The reaction I always have is: let's create reviews to extract that data and normalize it. But you could definitely think about doing that on the front end instead, right? Probably there are movements to make publishing data in a normalized fashion easier — I just don't know which journals are doing that. Are either of you aware of journals that have something like a database of results that a developer could plug into?

Well, I think eLife and F1000 are supporting a lot of these kinds of living databases of results, but I'm not sure whether you can then pull the results back out programmatically from the other end; I'd want to look into that.

Yeah, with most of the journals I know, results just go into supplementary data as an addition to the article; I don't think they have repositories for that. But it's a great idea, and there really should be some sort of technical solution to this, because a big part of the reason people keep reproducing the same work isn't even about whether the data is open access. At some point you have to take the data and put it into a form you can access programmatically on your computer — you're going to build your own spreadsheet or your own database or whatever. So even if the data is already open access in a publication, you still end up needing to review it, extract it, and renormalize it, doing the whole process over again.

Maybe one simpler example: with PubMed, there are MeSH terms on articles. I'm actually not certain how that process is completed — I assume NCBI has people who assign MeSH terms to documents — and when I publish, people always ask me what terms I want to associate with my paper. That's a controlled thing, right? If you can create a library of terms, an ontology of terms, you can start creating that sort of system, and a lot of people use PubMed, so that would be one good way to do it. So there's a project for anyone listening: create a good way to share normalized data at publication time, on the front end, rather than when reviewing articles.
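Those MeSH terms are already retrievable programmatically today, which is part of what makes them a useful model for shared, controlled metadata. A minimal sketch using NCBI's public E-utilities efetch endpoint; the PMID here is an arbitrary placeholder.

```python
import requests
import xml.etree.ElementTree as ET

# efetch is part of NCBI's public E-utilities API. We fetch the PubMed
# record for one article as XML and read off its MeSH headings.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

resp = requests.get(
    EFETCH,
    params={"db": "pubmed", "id": "31452104", "retmode": "xml"},  # arbitrary PMID
    timeout=30,
)
resp.raise_for_status()

# MeSH headings live under MeshHeadingList/MeshHeading/DescriptorName
# in the PubMed XML; iter() collects every descriptor in the record.
root = ET.fromstring(resp.text)
mesh_terms = [d.text for d in root.iter("DescriptorName")]
print(mesh_terms)
```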
Yeah. And the Open Science Framework has so many different plugins and integrations — there's the osfr package for integrating with it from R, for example. But what standards do we need around that for evidence synthesis specifically? It's still an open space. The Open Science Framework is wonderful.

Yeah, that's a great suggestion. I haven't looked at osfr; it would be interesting to see what it's like.

Yeah, I need to look into that as well. And unfortunately, I'm going to have to jump in here and bring the discussion to a close. I'd like to thank you all for bringing this topic to the group, and I'm hoping that we can have a very vibrant discussion in Slack as this is posted live. So thank you all.

Thank you so much.