Welcome, everybody, to the second session of Stream 6. This session is on quality of research for development and scaling: concepts and approaches. During last week's session of Stream 6, which was chaired by Brian, and which I hope many of you attended, we discussed how evaluating FTA research requires mixed methods and a complexity-aware approach. Much of FTA research addresses level-three problems, as was highlighted by Holger Meinke's keynote presentation in the opening session. The research pursues multiple impact pathways, aiming to produce knowledge and otherwise support processes of change, and the research itself needs to be able to learn and adapt accordingly. Typical quantitative impact assessment tools, as we discussed, are insufficient on their own, because each engaged, transdisciplinary research case is unique; it is not possible to replicate it or identify a counterfactual. FTA research evaluation therefore needs to be able to assess whether and how research contributes to outcomes in the sphere of influence and to impacts in the sphere of interest. Throughout the conference we've heard about the importance of contextual factors, which can affect the quality of research, and how identifying these factors is essential to systematically consider how research is being enabled or constrained. We've also heard from Professor Dolores Armenteras how inequality pervades research, with more difficulties for researchers from the South, and women in general, to be heard, published and recognized, with significant consequences for the identification of research questions. So in today's session we'll continue the discussion of the challenges of research that aims to help solve problems in complex conditions, and the implications for research design and implementation. Some of the key messages our presentations will try to convey are that context and scale are critical in designing research for development, and that well-defined principles and criteria can help guide and evaluate research proposals, project implementation and evaluation.

Today's session will begin with a keynote presentation by John Gargani on scaling science: what can we learn about scaling our impact? John Gargani, who is nice enough to join us today, is founder and president of Gargani + Company, based in Berkeley, California, and he's past president of the American Evaluation Association. Together with Robert McLean, he's co-author of the book Scaling Impact. Following that, Richard Coe will present two methodological frameworks to explore the connections among methods, quality and impact of research aiming to influence policy change and farming practice, respectively. Brian Belcher will then present a research quality assessment framework designed to be applied to engaged transdisciplinary research. And finally, Frank Place will provide an overview of the One CGIAR stage gating concept. We'll take a few questions after each presentation, and then, after the four presentations are over, we'll open up the discussion with our panelists and take questions from the audience. We would like to ask you, and if you've been attending the conference you know this already, to please put your questions in the Q&A box and to also use the voting function so that you can let us know which questions you find most interesting. Joining us for the panel discussion, and we're very grateful, is Holger Meinke, Strategic Research Professor for Global Food Sustainability at the University of Tasmania, Australia, and chair of the CGIAR Independent Science for Development Council.
So with this, I am just going to hand over the floor for our keynote presentation, if John's presentation could be pulled up. Thank you.

My name is John Gargani. Robert McLean and I recently published the book Scaling Impact: Innovation for the Public Good. In my presentation, I'll introduce some of the ideas in the book. The presentation is intended as a conversation starter, so I'll move quickly and stick to the main points. This is a picture of Bananalandia, a banana plantation in Mozambique. On the lower left is the plantation. It's so large that it has single-handedly transformed Mozambique from a net importer of bananas to a net exporter. On the upper right are small farms called machambas, found throughout the country. And running between them is a road, a bright line dividing one large-scale agricultural approach from another. For Rob and me, this picture raises questions. Who decided how big the plantation should be and how many small farms there should be? How were the decisions made? What's the optimal mix of these and other approaches to agriculture, and who's responsible for achieving it? The same questions can be asked in other settings and contexts. And we didn't believe they were answered well by current approaches to scaling. So we set out to find answers elsewhere, from innovators in the Global South. In the end, we arrived at a different understanding of scaling that's based on the work of southern innovators. We put what we learned in the book. It's organized around four principles and five case studies, and we believe it challenges organizations to adopt a different mindset when scaling. The book is available for free, along with other scaling resources, at the IDRC website. Just Google "Scaling Impact IDRC", or use the link below.

I'll start by clarifying what we mean by impact and scaling. Then I'll introduce the four principles of justification, optimal scale, coordination, and dynamic evaluation. Some of what I present goes beyond the book, reflecting our most current thinking. Rob and I joke that everything about scaling impact is simple to understand, except for the terms impact and scaling. Let's quickly define them. Impact has multiple definitions. Many of you learned about impacts in the context of logic models. In logic models, impacts are only some consequences of a program, policy, or project. Outcomes and outputs are also consequences, but they aren't impacts. Impacts are the most important consequences. They are the last to occur. And they are usually compared to an implied counterfactual. Counterfactual is another term that may be confusing. Without getting into detail, a logic model asserts that the consequences of a program, policy, or product are different from, and better than, those of some alternative program, policy, or product. As a rule, logic models don't identify the alternative, which is called a counterfactual. We define impact differently, because we believe it helps us better understand scaling. For us, impact is all consequences, not just those we intend or want, of any importance, good or bad, to anyone. They may occur at any time, and they should have an explicit counterfactual, so if we say our program is better, we know what it is better than. If this has your head swimming, be grateful: you have something to ask about later. Scaling is also a problematic term. Let's unpack it using the metaphor of an apple tree, where the apples are impacts. You may have heard people talking about scaling up.
That's like taking an apple tree and growing it larger so it produces more fruit. Bananalandia is a scaled-up plantation. Scaling out is like growing more trees to produce more fruit. Machambas are small farms that have been scaled out across the country. Scaling deep nurtures the tree in order to change the qualities of the apples, making them larger or more delicious. These different approaches to scaling can be combined, so we can scale up and out at the same time. There is also same scaling, which is intentionally maintaining the same scale, and descaling, which is scaling down, possibly to zero. And finally, not scaling, which has superficial similarities to scaling but the opposite effect. We consider any and all of these to be valid ways to think about scaling. But notice that they describe different strategies for how to scale. Bigger trees, more trees, better trees: these are the means of scaling. If you're a manager, this is what you spend a lot of time thinking about. Somehow it has also come to dominate how most people think about scaling for the public good. We want to change that. We want to focus on ends, not means. This is why we talk about scaling impacts. We want organizations to scale the positive impacts they have on people, places, and things. And they should scale up, out, deep, or any other way that produces optimal impact.

The first principle of successful scaling that emerged from our work with southern innovators is justification. Scaling must be justified in a public way, because scaling is a choice. We may feel pressure to scale from funders and peers, what we call the scaling imperative. But organizations are free to choose. Sometimes, perhaps most of the time, it is better not to scale. Because scaling affects others, the choice is shared with them. And it should be based on evidence and values. It's not enough to know that our actions will create change. We need to understand how much and in what ways it matters to people. Much of the writing on scaling starts with the question, how do we scale? The principle of justification cautions us to take a step back and start with the question, should we scale? A big idea underlying the principle of justification is impact risk. It's the risk that organizations fail to produce the impacts stakeholders desire, or produce impacts that stakeholders find undesirable. There is impact risk when organizations are uncertain about the consequences of their actions. Consider a continuum of certainty. At one end is high certainty of impact. Here, we can reliably predict the result of a program, policy, or product. Pharmaceuticals fall here. A public health organization cannot justify using a new drug unless it knows the effect it will have on people. At the other end of the continuum is low certainty of impact. Here, an organization cannot predict what will happen if it acts. Fine art falls here. A museum can justify hanging a new painting on its walls without knowing the effect it will have on people. Your innovative program, policy, or product probably falls somewhere in between, and where it falls determines how you use evidence and values to justify it. The same idea applies to scaling. Let's add another dimension, scale, which ranges from small to large. In general, the larger the scale, the more certainty we need to justify our actions. With this in mind, we divide the space into three levels of risk. Impact risk is too high when we have less certainty than is appropriate for the level of scale.
This would be treating pharmaceuticals like fine art, using them without understanding their effects. Impact risk can also be too low. This is like treating fine art like pharmaceuticals, and waiting to complete large-scale randomized trials before exhibiting art. There is a Goldilocks middle ground of acceptable impact risk. It's difficult to identify, which is why the choice to scale should be shared with those affected. It changes as scale increases. And it depends on the urgency of the problem: we may be willing to assume more impact risk to address a more urgent problem. Currently, this is a topic of great debate with COVID-19. Should we test potential vaccines less, increasing the impact risk, in order to inoculate the public more quickly, or should we test as we always have and inoculate more slowly?

The second principle is optimal scale. More is not always better. So rather than thinking about achieving maximum scale, we should be striving for an optimal scale that balances multiple considerations. This requires a holistic view that we believe should consider at least four dimensions: magnitude, which is probably the most common concern, regarding how much impact and how many people are affected; variety, which is the range of different impacts that are created; equity, which has to do with the fairness of who is helped and harmed; and the sustainability of impacts and of the efforts to create them.

The third principle is coordination. It acknowledges that scaling takes place in complex systems. Given this, bringing an innovation from first idea to optimal scale requires an evolving set of actors and a flexible scaling process. Think of a journey that starts with a first idea, for example: is it possible to make a new type of smart fertilizer that adjusts itself to local growing conditions? The journey ends at impact at optimal scale: the right mix of magnitude, variety, equity, and sustainability. The first part of the journey is called discovery science. It may be undertaken by a bench scientist and her collaborators. If the idea seems feasible, the next part of the journey is implementation science. There may be an implementation expert, investors, manufacturers, and distributors all working to bring the idea into practice. The whole journey may also be supported by scaling science. Scaling experts and stakeholders help all the actors to justify their efforts, define optimal scale, engage with collaborators and competitors, and support evaluation. I've laid this out as an orderly linear process, but successful scaling is almost always messy. Those on the journey move forward and backward; they may overshoot optimal scale and need to descale. Sometimes, to be successful, they make the difficult decision to stop their efforts altogether.

The fourth principle is dynamic evaluation. It starts with a simple but powerful idea: scaling is an intervention. We talk a lot about scaling an intervention, but scaling is an intervention. When we scale, we change our actions in order to change the magnitude, variety, equity, and sustainability of impacts. However, scaling creates dynamic change, which makes it vitally important to evaluate before, during, and after scaling. What do we mean by dynamic change? Well, evaluators who are interested in impact focus most of their attention on two relationships. First, the relationship between an organization's actions and its impacts. Second, the relationship between context and impact. That's what the arrow pointing to the arrow means.
In some contexts, impacts may be larger or better; in others, smaller and different. The same actions in the same context are assumed to produce stable impacts. Scaling changes all of that. When we scale, we change actions in order to change impacts. This is why we say scaling is an intervention. If scaling is successful, the way it changes impacts has the potential to change the context, making it easier or harder to create impacts. In addition, scaling may have side effects that affect context. For example, when an organization attracts philanthropic investment to scale its work, it may become difficult or impossible for similar organizations to attract investments in the same location. As scaling continues, these feedback loops can ripple through complex systems, making it more difficult to predict how impacts will unfold next. Dynamic evaluation challenges us to widen our gaze. In the past, evaluators were focused on two relationships. When scaling, we may need to focus on five. Unfortunately, evaluators may not be well equipped for this. I've covered a lot of ground quickly. We've talked about impact and scaling, and the four principles of justification, optimal scale, coordination, and dynamic evaluation. You can learn more about scaling impact at the IDRC website, and I look forward to answering your questions. Thank you.

Thank you very much. I thought the presentation was really, really interesting, and this last part on the need to learn, which is something that evaluation always stresses, but the part about learning and evaluating during scaling I find very interesting, as well as the trade-offs. Are there any questions in the Q&A box, or if not, would any of our panelists want to ask a question to John, maybe? Rachel?

I actually have a question myself, but seeing we've got a question from Lee: just a quick question on the actors involved, discovery science, implementation science, and scaling science. Are we sure there are only bench scientists, and their role is in discovery science, or could both stages two and three benefit from scientists, field and bench scientists, not only stage one? John, do you have any comments on that?

Yes, that diagram was meant to be illustrative, not to define the exact players who would be there. Scientists may play a role throughout, and experts in implementation science may be scientists. So I think it's more to demonstrate that there is a need for a variety of actors with different levels of expertise and capabilities, and that planning for their cooperative work, and for one set of actors to hand off to another, becomes critical in evaluation, rather, in scaling, and is not always considered from the beginning.

We've got another question from Ileana, asking what similarities or complementarities do you see between the discussion around scaling and the use of theories of change?

Well, Rob and I discussed for some time the idea that we need to have scaling theories of change. If we think of scaling as an intervention, then we should have a theory of change about how and in what ways scaling will change our impacts. So I would say that they're complementary in that sense. And I would encourage people to think through that. There are various kinds of scaling effects that make it easier or harder for you to achieve impacts, or to achieve different kinds of impacts, as you scale. Herd immunity, for example, is a famous scaling effect.
I would say, though, that some of what we think about with theories of change or logic models is a more restrictive view of impacts than we are bringing to the discussion. And we believe that organizations, especially organizations that are scaling, are responsible for all their impacts, not just the ones that they say they intend to create, not just the ones they anticipate. They're responsible for all their impacts. So we want to get that on the table as important, and that changes how you do evaluation and manage scaling.

Great. I think we have time for maybe one more quick question, from Deep Mara. Could you please elaborate on the types of partnership models, for example public and private sector and civil society organizations, engagement mechanisms, and resources required for going to scale?

Yes, in one minute. There are an infinite number of ways for organizations to work together. And the funding requirements are considerable, which is why sustainability is one of the four dimensions that we think need to be considered. You have to have a plan in place when scaling to acquire resources in some way, through funding, through the market, through something that would allow you to sustain this effort and to change it as it needs to be changed. So, you know, it can be collaborative, it can be more competitive, it can involve short-term relationships or long-term relationships. It really, from our perspective, doesn't matter. You're trying to choose a partnership model that meets your context and your purposes, with a vision of optimal scale, informed by stakeholders driving that whole process.

And we'll take one more question from Vincent. Vincent, do you want to pose this question live?

Well, yes, I certainly can. Thanks, John. My question was about the impact of this change of understanding, not only on how evaluation of research for development can be done, but on the very definition of our research for development objectives, which is sometimes very top-down, very detailed, whereas, as you said, a wide range of impacts needs to be considered, including those that would be desired by the community.

So you're asking how that relates to... Yeah, I think we're challenging that view. It's not that the more top-down view isn't sometimes appropriate. But I do think that it's driven by governments and funders who see themselves as actors with a high level of control and power, directing the work of others, versus the people who are doing the work seeing their role as achieving something for other people, where those other people are involved in important and intimate ways in what they do. And the funding should also be viewed that way, I believe. I mean, countries have objectives, and organizations have objectives of their own through their funding. But that critical stance of, we serve another group of people, we must understand them, include them, advance their interests, and resolve differences in their interests as they arise, I think is a very important part of scaling, more so with scaling than with impact as we typically think about it. But that's part of this different mindset we're asking people to consider. Thanks, John. Sure.

Thank you very much; you're getting plenty of questions, and I'm sure we can keep some for later. In the meantime, let's continue with the presentations, just for the sake of time. But thank you so much, John.
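To make the keynote's justification principle more concrete, here is a minimal sketch of the idea that the larger the intended scale, the more certainty of impact is needed to justify acting, with zones of too-high, acceptable, and too-low impact risk. The thresholds, the shape of the required-certainty line, and the 0–1 scales are illustrative assumptions, not values from Scaling Impact.

```python
# Illustrative sketch only: the thresholds and the "required certainty" line
# below are assumptions made for demonstration, not values from the book.

def impact_risk(certainty: float, scale: float) -> str:
    """Classify impact risk given certainty of impact and intended scale (both 0-1).

    The keynote's idea: the larger the scale, the more certainty is needed to
    justify acting. Risk is 'too high' when certainty falls short of what the
    scale demands, 'too low' when far more evidence is gathered than the scale
    requires, and 'acceptable' in the Goldilocks middle ground.
    """
    required = 0.2 + 0.6 * scale   # assumed: certainty needed grows with scale
    margin = 0.15                  # assumed width of the acceptable band
    if certainty < required - margin:
        return "too high"          # e.g. treating pharmaceuticals like fine art
    if certainty > required + margin:
        return "too low"           # e.g. treating fine art like pharmaceuticals
    return "acceptable"


if __name__ == "__main__":
    print(impact_risk(certainty=0.3, scale=0.9))   # too high
    print(impact_risk(certainty=0.95, scale=0.1))  # too low
    print(impact_risk(certainty=0.7, scale=0.6))   # acceptable
```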
So, Rick, next presentation.

Okay, so good afternoon, or evening, or morning. I'm Rick, and I was just going to share a few thoughts that I've been having about research quality and impact for the sort of research we do, not particularly focusing on scaling, but thinking more about the way we manage ourselves within our institutes and organizations. One of the advantages of speaking towards the end of a meeting such as this one is being able to adapt what I've planned based on what's happened so far. And my first thought around that was that it's all been said before, particularly in the first session on this theme. Then I thought maybe it is worth continuing anyway. However, I just wanted to refer to a couple of things that have come up during the week. I've been struck by how many people have referred to Holger's three levels, and how Robert said there's only level three, actually: if you think you're on level one or two, then you're not looking carefully enough. There have been frequent references to transdisciplinary research and the need for it. I did a bit of looking around, and the earliest reference I could find in the literature that was relevant to our sort of business was from 1974. So after 46 years, it seems as if we're still wondering how to use this concept well. And there is a body of thinking that I find particularly relevant and that I haven't heard referred to during this week, and that's the framing of what we do as post-normal science. I've just put one example from a larger literature, which is now available, on the screen here, from a paper on environmental services that came out just two years ago. It describes the nature of science and its relationship to the sort of problems we work on, and what is different about what they label post-normal compared to what we're maybe used to and what much of our scientific standards and practices are designed for. And it's interesting to me that what comes under transdisciplinary research, I think, only really refers to a couple of these things; there are other dimensions captured in post-normal beyond those covered in transdisciplinary.

Anyway, let me revert now to what I was going to say anyway. It's really, I guess, something we all recognize: our work is based on applied research. Our research aims to bring about some sort of positive change, development, and to facilitate that change. And it's reasonable that the people who are paying these substantial bills want to know that that's happening. That's really the basis for demanding evidence of impact. At the same time, we pride ourselves on providing high quality and aim to assess ourselves on that basis. So pondering what quality means is something that's been going on for the 30 years that I've been engaged, and the evolving roles of the centers mean that it's going to continue. There are competing schools of thought and a large literature, which I won't go into. But just to ground what I'm talking about here, two specific events recently that I was involved in prompted what I'm going to present. One was a SPIA meeting I attended where, basically, in summary, they presented their evidence that natural resource management research doesn't generate impact. The other was a discussion within FTA prompted by a demand for something called high-impact data. And when we look at these things, we see, first of all, looking at impact...
There's no doubt there are some misalignments in our system. Influential groups such as SPIA still use very much a linear problem–research–change way of thinking, and they say that they compare the cost of the research with the attributable value it has produced. That's just not the way the research that we do works. We have multiple overlapping problems, tackled typically by multiple overlapping teams. The boundary between research and development and scaling has disappeared. Changes happen irrespective of our work. Adoption is not really a core issue any longer, but it still seems to be a large part of what people are looking for. Problems evolve rather than getting solved, and so on. And when you look at it like this, then you think, well, what do we actually mean by impact? John gave us some ideas in the last presentation; I've not had time to incorporate those yet, unfortunately. And much the same is true when we look at research quality. Concepts used in the CGIAR have moved beyond those of academia, which are largely based on credibility. But these broader dimensions of research quality are pretty hard to apply, except very generally. I think we can define research quality as fitness for purpose. It's probably quite hard to argue with that, but it just shifts the problem to being clear about the purpose. Yet it is that being clear about purpose that provides a connection with impact. And the proposal here is simply to look more carefully at the purpose of any component of what we do, and use that to determine the relevant definitions and indicators of impact and quality.

Now, to illustrate this, I'm going to take a couple of frameworks that describe R&D processes, break those down into steps, and see whether we can get something more grounded on which to base concepts of impact and quality. I'm going to take two generic frameworks here. There are very many we could use, and we could also use theories of change that people have developed for their programs or projects. But I'm going to take two, one of which is a classic that has been around for nearly 50 years, and the other more recent, and I think both have proved useful within FTA work. The first example is the issue cycle, first described in 1972, which describes the stages that attention, research and action go through when working on topics of public and policy interest. I'd say the stages they sometimes go through; of course, they don't always go through all of this. But when you have a framework like that, it describes stages or steps, which I've broken down now into the rows of this table. The stages can be described in various ways, but there's a clear objective for each, and we can describe what the impact of each of those steps looks like if that objective is met. For example, the objective of the first step is raising a flag about a new issue of potential public importance, and the impact occurs when we manage to raise interest and concern among researchers or others. As we go further down the table, the nature of impact changes at each step. Step three concerns possible interventions or solutions, and impact there means maybe demonstrating that they work. Somewhere near the bottom, we have the conventional view of impact as changing the state of the problem. And that narrow view is the current standard for many people demanding evidence of impact.
But I think if we take the totality of what is here as a broader view of the impact of the sort of research and development efforts that we are making, then it is far more realistic. And there's one more point on this slide: a project, or even an individual, that has so far only taken step one might have had impact long before getting to the end of the process. Looking at the research quality column here, much the same thing applies. We can identify characteristics of the research that make it fit for purpose, and those will be very different at different steps. For example, at the top, novelty must come in strongly, whereas legitimacy is crucial later on in the process, maybe down here. And if we were to pursue the discussion of what high-impact data means, then to me it's quite clear that the sort of data you need, the data which is going to allow you to generate impact, is also going to be very different at different steps of a process such as this one.

The second example I wanted to look at is the research-in-development framework that has been used by FTA. The process is no longer a linear sequence but includes cycles, yet the same sort of analysis applies. Two minutes, Rick. We can break it down into steps. So here are steps in that cycle, and again we can identify the conventional narrow view of what impact is all about and the more realistic broad view that it is all of these things. If we look at some of the steps in more detail, then, for example, this schema does include conventional agricultural trials, and quality standards for those have been articulated since the time of Fisher, but it also includes these large-N or planned comparison trials, which have been discussed in one of the other themes, and they have very different requirements for effectiveness and impact compared with the sort of trials conventionally described. And here is a step that requires synthesis of qualitative and quantitative information from multiple sources to understand trade-offs and interactions with context. Impact is generated if the results lead to more nuanced and realistic development messages and expectations, and that again is maybe not something that's conventionally recognized.

So the key messages I want to get across here are, first, that if we break a complex research and development process down into coherent steps, it allows more nuanced, relevant and applicable definitions of impact and quality, and this applies whether one group or programme or organisation is responsible for all the steps or only for some of them. And second, these definitions will depend on the frameworks used to conceptualise the work and the linkages between the different components. As for the implications of this, well, I think there are internal implications for our organisations. We can do this at any level of our organisation, and we can use it to adapt the ways in which we manage, particularly manage ourselves, and monitor what we're doing. It's 12 minutes, Rick, if you can wrap up. And I suspect there are also external implications: if we use these, then we might be able to influence some of these demands for impact evidence that people sometimes feel are disrupting the work. Thank you.

Thank you very much, Rick. Rachel, do we have any questions? Yeah, we've got one here from Pliny OIC, asking: funders expect impact, but society does as well.
Do we use the same indicators to assess our impact, or do they change according to the public? I would say they have to change, because surely impacts are to do with the changes that matter to the audience. But I must say, I was thinking as I was talking about the contrast between what I've been trying to think through here and what John presented in the previous presentation. One is to do with whether the focus is on our internal management and the way we organize ourselves and plan and implement projects and programs, compared to taking a look as an outsider at something that has been happening and asking what has gone on, what has changed as a result.

Thank you. We don't seem to have any questions for the moment. Ah, there, I spoke too soon. Yeah, they're just coming in now. Okay, so Carl has a question on how to convince the powers that be to accept such an alternative framework, claiming it is insane that we are treated like development institutions. Any comments on that?

I mean, the only suggestion I have is to actually use this, communicate the results of using it, and see whether anybody finds it as interesting as I think it should be. I agree entirely that we are research-led research and development organizations; we are not primarily development organizations, and therefore the impacts we're looking at should be different and the way we're evaluated should be different. So I would say, you know, I know there are demands, but somehow we should be in a position to modify the demands rather than just give in to them.

One more question from the audience, a point of clarity, asking you to elaborate on why adoption is not a key issue. That's from Tess.

Okay, adoption. I wouldn't say it's not always a key issue, but so much of what we do is not just about technologies. I don't quite know how we use the term adoption when we're not talking about agricultural technologies but about many different aspects of processes and systems. System change, to me, is not characterized by adoption of practices, even institutional practices. There's something bigger about system change than adoption. And yet still, many of the impact assessment studies and reports we see start from the position: okay, how many people adopted something new, and what happened when they did?

Frank has a question. If you'd like to ask that quickly live, then, just for the sake of time, I'll probably move to the next presentation afterwards.

Oh, hi Rick. Okay, sure. Yeah. I really liked that. Both you and John have pointed to the variety of types of impacts, and you to the various stages of research as well, which I agree with entirely. But I think it then poses a challenge for us all to prioritize across different types of research areas, because the impacts are so diverse and different. So I was wondering if there are any frameworks that have been used to sort that prioritization out.

I'm sure there are, but I can't think of any; you probably know better than I do. It's challenging me too. Okay. Thank you.

We should perhaps move on to the next presentation. How are we doing for time? Do we have time for more questions, or should we just move on to the next presentation? You're muted, Rachel. Sorry. Three minutes goes quite quickly. So yeah, we should move to the next one, and we can come back to any of the many questions. Yes. Thank you very much. Brian? Great.
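As a rough illustration of the approach Rick describes, breaking a research-and-development process into steps and defining impact and quality per step, here is a minimal sketch built around three steps paraphrased from the issue cycle example. The step names and descriptors abbreviate the talk and are not an official framework.

```python
# A minimal sketch of the idea in Rick's presentation: break an R&D process
# into steps and define impact and quality (fitness for purpose) per step.
# The step names and descriptors below paraphrase the talk; they are not an
# official framework.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    objective: str
    impact_indicator: str   # what "impact" means if this step's objective is met
    quality_emphasis: str   # what "fit for purpose" stresses at this step

ISSUE_CYCLE = [
    Step("raise the issue",
         "flag a new issue of potential public importance",
         "interest and concern raised among researchers or others",
         "novelty"),
    Step("develop interventions",
         "identify possible interventions or solutions",
         "evidence that candidate solutions work",
         "methodological rigor"),
    Step("resolve the problem",
         "change the state of the problem itself",
         "measurable change in the problem (the conventional, narrow view)",
         "legitimacy with those affected"),
]

for step in ISSUE_CYCLE:
    print(f"{step.name}: impact = {step.impact_indicator}; quality = {step.quality_emphasis}")
```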
Thank you. So good day, everyone. We're covering a lot of time zones, with Holger in Tasmania, and John and myself and probably some others on the west coast of North America. I'm going to present in a little more detail some of the quality considerations that Rick has just discussed, presenting on behalf of my co-authors Rachel Claus and Rachel Davel, both with us here on the panel, and Stephanie Jones. We've had a lot of discussion about the complexity of the problems that we deal with and the research challenges. To address these problems effectively, the way we do research has evolved substantially, and I think we've had a lot of discussion in the last 10 days about that. We've heard many different examples of transdisciplinary approaches, ideas of co-generation of knowledge, and engagement with stakeholders. Much of FTA's research, as well as research in other CGIAR programs, some of the grand challenge programs, and many different research for development programs, deliberately crosses disciplinary bounds, that is, it is interdisciplinary, and engages stakeholders and other societal actors in the research process, which is what we call transdisciplinary research. There are a lot of different definitions, but in my team we're using the idea that when you engage stakeholders and societal actors in the process of research, that's transdisciplinary. The CRPs themselves were designed to encourage wider and deeper partnerships and a stronger focus on outcomes. But this evolution in research approaches has not been matched by an evolution in evaluation criteria. Research quality criteria used to guide proposal development, to assess proposals, to review journal submissions, and in other research assessment are still primarily defined along disciplinary lines. This misfit means that good research might be undervalued, which manifests sometimes in difficulty getting articles accepted in certain journals or in problems with granting councils, and also that poor quality research might slip through the cracks, because there aren't good, comprehensive quality criteria for this sort of research. So there's an urgent need for appropriate principles and criteria to assess the quality of research for development, and this presentation will discuss our work to develop a transdisciplinary research quality assessment framework.

We started with a systematic review of the literature, looking for definitions and measures of quality in an interdisciplinary or transdisciplinary context. We searched broadly, covering some of the literature that Rick was talking about: post-normal science, sustainability science, engaged research, transdisciplinary research, translational science. Ultimately we came up with 38 articles, none of which really had a comprehensive overview, but which talked about different aspects and considerations in both defining and measuring quality in research that crosses disciplinary bounds. We identified key themes and ideas from the literature and organized them within four principles, guided by the idea that, in order to be used, research-based knowledge needs to be perceived by users to be relevant, credible, and legitimate. The literature also places a high emphasis on outcomes and impact, so we added a fourth principle of effectiveness.
We organized the principles, pardon me, organized the criteria in a typical project chronology, in other words, first listing criteria that relate to project design, followed by data collection, analysis, and reporting. We put considerable effort into developing clear and unambiguous definitions and corresponding rubric statements to be used in the assessment. We proposed a three-point scoring system in which a score of two is assigned if the criterion is fully satisfied, one if it is partially satisfied, and zero if it is not satisfied. We've now tested it in a series of outcome evaluations of FTA projects and a few other projects, and we've further revised the criteria and the scoring system. This slide lists some of the key themes that emerged from the literature review: ideas of engagement with the problem context, collaboration and stakeholder inclusion, and the increased need for communication all feature prominently. In interdisciplinary work there's the challenge of how to integrate epistemologies and methods, and in transdisciplinary work the added challenge of how to integrate stakeholder knowledge and values. There's also emphasis on multiple kinds of outputs and outcomes, that is, not only peer-reviewed articles and citations, and a strong emphasis on problem solving. A key point is the attention to fitness for purpose: we need to evaluate research against its aims, and this echoes something that Rick raised.

In this framework, relevance refers to the importance, significance, and usefulness of the research to the problem context and to society. In FTA, a relevant project should consider and address CGIAR goals, flagship strategy, international, national, and local processes, and the state of the science. It needs to engage with stakeholders and other societal actors to ensure that it is asking the right questions in the right way, that it is drawing on the full range of available knowledge and expertise, and that it will be targeted appropriately to the intended audience. Credibility refers to the quality of the science and the idea that research findings are robust and sources of knowledge are dependable. The credibility criteria are the most analogous to disciplinary quality criteria, but add criteria relating to contextual awareness, scientific integration, and reflexivity to the more traditional criteria of scientific rigor. The principle of legitimacy captures the idea that the research process, and the researchers themselves, have to be trusted in order for users to value the research. This reflects the fact that research users may not have the necessary information or the capacity to evaluate the scientific quality of the research; rather, they assess the research based on whether they believe it to be fair, ethical, and representing their knowledge, values, and needs. And finally, we have renamed the fourth principle positioning for use. We're drawing here on RQ+, which is another research quality assessment framework, developed by IDRC. This principle reflects the idea that the research itself is designed and managed to enhance sharing, uptake, and use of research outputs, and stimulates actions that address the problem and contribute to solutions. In order for the research process to be effective, the research needs to be organized so that all necessary functions are performed at each stage of the research cycle.
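As a rough illustration of the scoring system just described, the sketch below applies the 0/1/2 rubric to criteria grouped under the four principles. Only a handful of criterion names are shown; some are taken from the talk and the rest are paraphrased placeholders, whereas the published framework defines many more criteria, each with its own definition and rubric statements.

```python
# A minimal sketch of the three-point scoring described for the quality
# assessment framework (0 = not satisfied, 1 = partially, 2 = fully satisfied),
# with criteria grouped under the four principles. Criterion names are
# examples only; the full framework has many more.

PRINCIPLES = {
    "relevance":           ["clearly defined problem context", "relevant research objectives"],
    "credibility":         ["appropriate methods", "clearly stated research questions"],
    "legitimacy":          ["ethical process", "inclusion of stakeholder knowledge and values"],
    "positioning for use": ["documented theory of change", "outputs targeted to intended users"],
}

def principle_scores(scores: dict) -> dict:
    """Average the 0/1/2 criterion scores within each principle.

    `scores` maps criterion name -> 0, 1, or 2; missing criteria are skipped,
    since each project is assessed against its own purpose.
    """
    result = {}
    for principle, criteria in PRINCIPLES.items():
        rated = [scores[c] for c in criteria if c in scores]
        result[principle] = sum(rated) / len(rated) if rated else None
    return result

# Example assessment of a hypothetical project
example = {
    "clearly defined problem context": 2,
    "relevant research objectives": 2,
    "appropriate methods": 1,
    "clearly stated research questions": 0,  # an example documentation gap
    "documented theory of change": 1,
}
print(principle_scores(example))
```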
As we've been developing theories of change with researchers in a lot of different projects, there's always anxiety from researchers that so much is expected in order to achieve outcomes and contribute to impacts; there's so much work that needs to be done. I think this idea reflects the fact that our research teams, or our research for development teams, really have to engage or include more skills and a broader range of expertise, so that we are actually covering all the bases and the scientists can focus on the science, while other aspects of research for development are covered by other members of the team.

I've got a couple of examples. I can't present all of the criteria, but here are a couple just to give you an idea of what they look like. They're presented in a table by principle. Each criterion has a name, a specific definition, and a set of guidance points that are intended to help an assessor focus on the pertinent aspects of project design and implementation. The first criterion under the principle of relevance is clearly defined problem context. This is defined as: the context is well defined, described, and analyzed sufficiently to identify a research problem and corresponding entry points. The guidance then draws attention to the need to truly appreciate the context in which the research is being done and to define the research problem accordingly. And there are specific definitions for key concepts, in this case the concept of the problem context itself. A second example is from the principle of credibility: the criterion is appropriate methods, defined as methods that are fit to purpose and well suited to achieving the objectives and answering the research questions. This criterion also illustrates how traditional academic quality criteria have been augmented and adapted for transdisciplinary application. In this case, the criterion recognizes that methods need to be fit to purpose, but still well justified.

As part of our work with FTA, we have conducted several theory-based outcome evaluations of FTA projects. One of those case studies, on oil palm, was presented in last week's session by Rachel Davel. In those studies, we collected detailed information about project design, implementation, outputs, and outcomes. With that information in place, it's fairly straightforward to assess each project against the quality assessment framework criteria. This slide illustrates the QAF scores of five projects; these are the old criteria, which we've subsequently improved since we did this. It is then easy to identify areas of strength and weakness in an individual project, but also areas of more common strength and weakness across sets of projects, as illustrated in a diagram like this. So, for example, note that only a few of the cases in this set had well-documented theories of change or made the effort to consider... Two minutes, Brian. Thank you. ...and explicitly disclose the researchers' perspective. Generally speaking, these projects are strong on problem definition, relevant research objectives, and design, and have made contributions to knowledge. Several different assessors have used the tool, and the scoring is reasonably consistent, which gives us confidence that the tool is reliable.
As mentioned earlier, it's important to assess each project against its own purpose, and therefore it's not possible to compare projects directly based on their scores, but the scores usefully identify areas of strength and weakness. We found that the three-point scale is not precise enough; we need a broader range. And in testing the tool we found some problems in the original criteria, definitions, and rubric statements, which we have revised and updated. And then I'll move along. Considering FTA research from this perspective, we see that there is high attention to relevance across the portfolio, with obvious scope for improvement in the use of theory of change in planning and implementation, and some lack of overall coherence in the program. We've seen some surprising gaps against the credibility criteria; for example, several projects did not have clearly stated research questions and objectives in any of the project documentation, and we recommend that every project should fully document its research focus, even if funders do not require it. We have seen that transdisciplinary characteristics in projects are associated with multiple outcomes and impact pathways. The ISDC has adopted these principles, and this quote from the technical note puts some emphasis on the need to foster a culture of high-quality research for development.

Just to wrap up: the quality assessment framework provides clearly defined criteria that expand upon traditional disciplinary research quality criteria. The overall set aims to capture aspects of research design and implementation that, theoretically, should produce outputs that are perceived by intended users to be relevant, credible, and legitimate, and that are well positioned to be useful and used. The framework can be used by researchers themselves to guide project or program design and implementation, by academic supervisors, by research managers to guide and assess research for development work, and by funders, journal reviewers, and others to evaluate research that crosses disciplinary and academic boundaries. And there are a couple of references, to our paper and also to the RQ+ paper. Thanks.

Thanks, Brian. I've got a question that came in quite early on from Fergus, saying transdisciplinary science generally involves three elements: involving stakeholders, as you mentioned, but also a focus on real-world problems and solutions, and an iterative method of development so that methods evolve in relation to reflection on progress in addressing those problems. He's asking why it is useful to privilege only one of these.

Oh, it doesn't, Fergus. If you look at the full set of criteria, each of the ideas that you mentioned is incorporated in the criteria. In my slide I mentioned some of the themes that came out very strongly in the literature, and we've built them into the assessment framework with criteria for each of those considerations.

We've got another question from Anne. Over time, we've seen some really strategic theories of change and also some that look a lot like logframes. Do you see improvement in this, and how can we do better? Also, if we understand that everything is connected, as in level-three problems, how do we limit the scope to something manageable?

Yeah. Good. Thanks, Anne. First question: yes, I think we started out with a lot of fairly simplistic theories of change when we got started doing this, but we're getting more sophisticated, more intuitive.
And I think more and more projects are using them as part of their management as well as their planning. So they're looking to see: are we moving along the theory of change the way we anticipated? And if not, they're asking why and making adaptations accordingly. I think to make them stronger, one of the things that came up in the discussions last week is that we really need stronger and better articulated theories, social theories, about why we expect to move from one box to another, in other words, better articulation of our assumptions about the arrows between the boxes. Why is this change expected to happen? And we should test those theories explicitly as part of our work. Oh, sorry, what was the second part of the question? Oh, yeah, how to limit the scope to something manageable. I think the beauty of a theory of change is that, as we've seen with some of the projects we've worked with, we're building a theory of change with a purpose to which the project intends to contribute, while recognizing that the project alone is certainly not going to solve that problem and end up with the final results. But the theory of change allows us to see which part of the process we are fitting into and contributing to, who the other actors in that system are that we have to engage with, and what the other processes in that system are that we have to at least be aware of in terms of context, but possibly contribute to as part of our intervention strategy or our scaling strategy. So I think it's about not just having a theory of change for our small project, but a theory of change for how our project fits into a process that's happening with or without us, so that we can do our work in a way that will be as effective as possible.

Thank you very much. I think perhaps we should move to the next presentation, Rachel. I see that there is a question from Tony Simons in the chat box, and we were hoping that it could be kept for the Q&A at the end, because it seems to cover the entire session. So if that's okay, we'll move on to Frank's presentation and then ask Tony to ask that question verbally later on.

So thanks, everyone, and hi to the many friends I have there in FTA. I've been invited to give this talk about stage gating not because I'm a CRP director but because I've been involved in a core group, and I'll get to that later on, which has been trying to see how we can develop a stage gating concept or mechanism for the One CGIAR. So, some background. The purpose of stage gating is to help improve the efficiency of input use and the effectiveness of research in terms of results, for internal purposes, but really it has come onto the agenda, onto our tables, from external accountability to the funders. It has been mandated by the funders to assure them that their pooled funds will be managed effectively to increase the delivery of impact-oriented results. And, quite frankly, I think some of the funders have concerns that the CGIAR is not very good at stopping lines of research that don't seem to be very promising. The concept of stage gating is not really new, but its systematic formalization through the use of metrics and methods would be new to the CGIAR. We are always making decisions about continuing, adapting, or halting research in many different ways, and we realize that there are a lot of good examples of that in the CGIAR already. So where do we stand in terms of developing this concept? As a starting point, we had some design principles from TAG 2.
Stage gating was already being mentioned back in February at the Ashburn meeting. Then a group of us decided to come together to see how to develop this. There was quite a lengthy workshop held over three days in May; perhaps some of you attended it, more than 60 people did. We also had a report back to a wider community on May 20th. All the workshop resources are available at this link. You may remember we had some presentations about how stage gating is used in other organizations and also within the CGIAR. We looked at gender perspectives. We had teams looking at how you might do ex ante assessments of projected benefits and costs, for example. And we took some deep dives into what this could look like in application. So all of that is there. We then also picked up on this at the Science Leaders meeting, where we had a lot of discussion about how to do it, reported on progress to date, and received some good comments. And you may have seen, if you've looked at the draft 2030 research strategy, that stage gating is referred to mostly on page 40. There is also a companion document on performance management, which hasn't been widely shared, where there's a little bit more written about stage gating.

So, coming back to the research strategy, here we go: decision points will be used to manage CGIAR projects and their components, to implement this approach universally; CGIAR projects will be divided into stages separated by assessment and decision points known as gates, et cetera, et cetera. So this is quite clear in the language for the new One CGIAR projects. There could also be stage gating to determine whether projects should continue after the three-year duration, or whether they evolve into something else. In our meeting in May we tried to unpack this; we were given the remit to look at these One CGIAR projects and how they could be broken down. And we said, well, in theory you could break them down into many different stages: even before the approval of a project, into the idea or design stage and the concept and proposal stages that we're typically used to for various donors; during the implementation of a project, you could do stage gates at various times, every year, once at midterm, or once at the end; and then at the end of the project. We discussed some of the pros and cons of these at our meeting, but we don't have a one-size-fits-all answer. It should also be mentioned that the donors themselves have, I think, different views and concepts of how stage gating will work. This slide was presented by the funders at the Ashburn meeting in February, where they conceived of stage gating as helping to guide us through a discovery phase, to a testing phase, to a scaling phase. This links a little to some of the presentation that John gave to kick off this session. So we're looking at a lot of different innovations that are perhaps trying to achieve a common objective; we test them out, some will look more attractive, some less attractive, and then we scale up. So it's quite an innovation focus, and quite a linear focus as well, but that's the mindset of some of the funders behind this. So we had quite a discussion about the principles around stage gating, which would then need to be integrated into a workable mechanism.
The first principle is to enable effective resource allocation and reduce cost overall: we try to be as smart and simple as we can, facilitating these tough decisions on allocation. The second is that stage gating should be embedded in the theory of change: it should be based on the likelihood of achieving development impact, conform to the quality of research for development aims of the system, and be clear on the moment of transition to partners for wider adoption and scaling. It should be transparent and evidence-driven; this comes up quite clearly in the discussions and in the research strategy, about having clear metrics and criteria for making the major decisions on continuing and advancing certain lines of research. It should be universal at the portfolio level but flexible at the project level, informed by the specific needs of the project and its components, for example the stage of research or the scaling readiness of innovations, and applicable to both technological and non-technological CGIAR innovations. And then we also want to make sure that we don't crush creativity, that stage gating still helps to encourage innovation, so we want to support learning, reflection, adaptive management, reorientation, and reallocation of resources. Those are the principles, and they have been quite widely agreed upon by the funders and by the science leaders when they were presented, and also amongst the different scientists; there is consensus that this sounds good and looks good.

However, there are a lot of challenges and issues that were raised, and I could have presented about five slides of concerns, but I'm just going to flag a few of them so you know. They are basically around the what, when, how, who, and how we could actually make this thing work. The why of it, the principles, is not so contentious, but there are quite a few other challenging areas. First, the effects themselves: stage gating could provide undue disincentives for higher-risk, higher-payoff research if we are strictly guided by certain types of metrics, so we have to be very careful and cautious about how we define those so that it doesn't stifle creativity. Then the decisions themselves. I guess this is one of the age-old challenges: if we go through stage gating and find that progress is kind of slow, does that mean our research should be halted or adapted, or does it mean our theory of change was wrong from the beginning and we need to modify it? That's always a challenging area. How much tolerance is there really for adjust decisions versus halt decisions? I think the perception from the donors is that we always choose the adjust decision and don't choose the halt decision often enough. And what effect will stop decisions have on our staffing in the CGIAR and on partnerships? And, in fact, do partners have a voice in the stage gating process themselves, because they will be quite affected by these decisions? On doability, there's a long list of things that are challenging, but here are just some of them: stage gating may not easily apply to some types of CGIAR research, and when different research components are highly interdependent, can you stop one line without effectively halting the potential for impact from a number of them?
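As a rough illustration of how gate decisions might be made transparent and evidence-driven, the sketch below records a review at a single gate and aggregates a few metrics into a continue, adapt, or halt decision. The stage names, metric names, scales, and thresholds are assumptions for illustration only; the One CGIAR mechanism is still being designed.

```python
# Illustrative sketch of recording a gate decision against theory-of-change
# metrics. The criteria, stages, and decision rules are assumptions, not an
# agreed CGIAR mechanism.
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    CONTINUE = "continue"
    ADAPT = "adapt"     # adjust the line of research (or revisit the theory of change)
    HALT = "halt"

@dataclass
class GateReview:
    project: str
    stage: str                                     # e.g. "discovery", "testing", "scaling"
    evidence: dict = field(default_factory=dict)   # metric name -> score on an assumed 0-1 scale

    def decide(self, continue_threshold: float = 0.7, halt_threshold: float = 0.4) -> Decision:
        """Aggregate the evidence into a transparent, evidence-driven decision."""
        if not self.evidence:
            return Decision.ADAPT
        score = sum(self.evidence.values()) / len(self.evidence)
        if score >= continue_threshold:
            return Decision.CONTINUE
        if score < halt_threshold:
            return Decision.HALT
        return Decision.ADAPT

review = GateReview(
    project="smart fertilizer",
    stage="testing",
    evidence={"progress along theory of change": 0.6,
              "projected development impact": 0.5,
              "scaling readiness": 0.4},
)
print(review.decide())   # Decision.ADAPT
```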
In terms of different types of research, we've already received a very interesting piece from a few people at IFPRI who wanted to show why policy research might not fit very well here. I've also had a few separate conversations with Ravi about transformative systems research at a very high level, which may not fit very conveniently into this paradigm either. And can we effectively link stage gating to theories of change and the quality of research for development indicators? It's one of our principles, but we have to make it work. There's also how to balance rigor and feasibility in terms of metrics: how much do you weigh past performance of a group versus the projected impacts, which are continually changing? And do we have the capacity to implement this across the many One CGIAR projects that are envisioned?
Moving forward, we can draw on experience with stage gating in the CGIAR. There are many examples: breeding programs, CCAFS's annual performance review, the Inspire Challenges, challenge allocations, and many processes of reflection and annual budgeting. We have a core group composed of Jules Colomer from the SMO, myself, Marc Schut from RTB and Helen Elchel from ILRI, who is on the MEL CoP. We've also engaged a wider reflection group, and we have quite a bit of expertise on theories of change, quality of research for development, projected benefit calculation and performance management to draw upon. Funders have also volunteered assistance; a few of them are doing their own stage gating, like GIZ, for example. At the moment we're awaiting discussions with the executive management team to determine how this will be further developed, but we have quite a few interested people at hand once we have clear guidance. So that's all I had to say. Thanks very much.
We've got a question from Fergus quite early on here that's been voted up. Doesn't reflective iteration within research processes with stakeholders represent a bottom-up stage gating, and is that more appropriate than the top-down model that One CGIAR seems to be moving towards, or can they be combined? Yeah, can you just repeat the first part of that? I missed it. Sorry, go ahead. Sorry, I'm just trying to find it, because now it got voted up again. Okay: doesn't reflective iteration within research processes with stakeholders represent a bottom-up stage gating, that being more appropriate than the top-down model One CGIAR seems to be moving towards, or can they be combined? Yeah, I think they could be combined. The one thing that struck me as not having been reconciled yet is when we apply stage gating, because I think some of the donors feel it should be used for what they call major decisions, and we don't really know what that means yet. The way GIZ apparently uses it is that they allow for a three-year project cycle, because they know you can't make quick, hasty decisions very frequently. Then they run a more formal process at that point to see whether there should be another phase to the project going forward. And they leave the other decisions that you might make in a normal project cycle to these bottom-up processes and other kinds of things, I guess.
So that's still to be decided, but I would suspect that any final solution will involve both of those.
We've got another question, from Carl, asking whether there was any discussion in the meetings on the uncertainty of what innovations will ultimately be impactful. We will likely need to invest in many things, only a few of which will actually be impactful; it seems very positivist. Do you have comments on that, Frank? Yeah. Fundamentally, I think the mindsets of many people, and the examples we had from outside the CGIAR, were often about breeding lines or varieties and so forth, which can be subjected to some common metrics to make choices along the way. There's another group within the CGIAR who are looking at scaling; we have scaling readiness tools, scaling scan tools and so on, and they are looking at this from an innovation perspective and broadening it out to non-technology types of innovations, to see whether their tools still work across those. They have tested them in a few cases and find that they do. But generally speaking, I think we really have to think much more clearly about how the scaling could work across the wider types of research in the system, because we don't always have the same bucket of innovations as we do in crop variety types of research, which makes it quite a bit more challenging to implement it as some of the funders see it in their minds.
Okay, thank you very much. Perhaps we should move on to the interactive panel discussion now, because as usual we're running short of time and we're getting so many interesting questions. So we might ask all of our presenters to put their cameras on, if that's okay. In this discussion we will also be joined by Holger, the chair of the ISDC. Rachel, would you like to see if there are any questions to get the conversation started? Yeah, I think maybe we'll start with Tony Simons' question. Is he able to ask that live? Hello. Yep, we can hear you. Hi, I'm just trying to get the video for you as well. Okay, so I can turn the video back on. And I had to put my clothes back on and get out of my pajamas. So there we go. Anyway, the question was: John and Brian and Rick and Frank and others, it's great. Let's imagine we had this all in place and it was all working really well; it was just brilliant, we had the best things. At the end of the day, any intervention or policy or technology or awareness-raising event or change has winners and neutrals and losers. How then do we use this, anticipating that, to say, okay, what are our principles in arriving at that? Is it to do no harm and have no losers, or is it to say we would accept 85% winners, 10% neutrals and 5% losers because it's all for the greater good and we see the balance of that, but we wouldn't accept 40% losing, 30% neutral and 30% winning because of the threat to people? Therefore, I think it's good if we can think about what the next step is when you've got all of this in place and researchers are working like this and it's all working wonderfully. The outcomes are going to be incredibly diverse and variable, so how do you, in the process of arriving at that, account for it? Because I didn't see any mention of winners and losers in your conversations.
Thank you very much for your question. Is there anybody in particular who would like to answer that? Holger, would you want to start, and then perhaps other panelists can complement?
I can give it a go. In terms of winners and losers, a lot of that comes back to how clearly articulated your theory of change is, because if you have a strong theory of change then hopefully it will have clearly articulated what you are actually trying to achieve with that type of approach, with that research, with that technology, and who the main beneficiaries are; and if it's a really well-thought-out theory of change, it also covers some of the contingencies and some of the negative aspects that might be associated with it. Which brings me to a related point that came up a couple of times already in the discussion, and that is the importance of the theory of change in achieving the impact that is desired. As we progress through a research process, new insights and new knowledge are generated, which inevitably need to feed back into the theory of change; the theory of change needs to be dynamic enough to accommodate that, so that you can change it as you go along when an assumption doesn't hold. I don't know, that's a starting point; I'd be interested to hear what others think about that point.
I'm not sure how we do this, but I think there are a couple of points to consider. In our book we talk a lot about trade-offs, so this idea that there are winners and losers does matter to us; I just didn't discuss it much here. When you think about winners and losers, we actually carry with us some values about who we'd like to win and who we'd like to lose. It's not like we want everyone to win. If we're talking about certain kinds of research endeavors, we may want stakeholders to win in some sense by gaining more control, and researchers and funders to lose to some extent in terms of that control. With stage gating, part of the purpose is to bring funder values into the process more; there's some sense in which funders winning, by having those values included, is a good thing, and researchers losing some control over that is a bad thing. So we carry into this a view of winners and losers, and we want it to shake out a certain way. The question is, what do we believe should happen, and how do we set up systems to make that more likely rather than less likely?
I'll just make a comment. To me there's a difference between what we plan or envisage and what actually happens, and I believe theories of change are most useful when they're set up as you're designing a project, not retrospectively. When you're starting something, if you can predict that there will be losers, then the negotiation of whether that's acceptable has to be part of the project design. That's quite different from the retrospective assessment of what has happened, particularly those unexpected losers. And as John pointed out in his presentation, the post-hoc assessment of impact has to include all the changes that have happened, including the losers, not just a check-off of the good things we expected to happen.
Maybe one last thing, if I can add quickly too. Whenever there's a price change as a result of what we do, there are winners and losers. We can't help it; there are always going to be winners and losers on these things.
But I think it's also good to think about the longer term and the shorter term, because we had some work, for example in the Philippines, where we knew there was a group of farmers who were going to lose in the short term, but since the country was winning, it could then put those saved resources into helping those farmers out. So we then had some further research and engagement with them on how to do this over time. So yeah, I think it's good to be able to anticipate these things at the beginning and see how best you can work through them.
We've got a question from the audience that's been voted up. It's about scaling, so maybe John will start this answer off: isn't a key part of scaling changing the context, that is, creating a more enabling environment to overcome innovation adoption barriers? John, do you want to start with comments on that? Yeah. I think what we decide to label context versus the intervention we're engaging in is always a little bit arbitrary. There can be the context of the society in which people live, and your intervention is to change some aspects of that context to make other things happen. So the first question I'd ask is, are you really just giving a different name to the thing you do? In which case it's exactly the same as what I presented. If you're talking about the more complex model I was discussing around dynamic evaluation, your scaling efforts, through your impacts, will potentially change the context, and the side effects of scaling may also affect the context. You could be trying to engineer that in such a way that it enables your larger purposes, while monitoring it to see if that's what it does. But I don't really see a contradiction here; I think it's largely a linguistic question about what you're labeling as the focus of your work, and if I've misread that, let me know.
Holger, Brian, would you like to comment? Okay, I've got a note. Yeah, sorry, I'm conscious of the time; I'm just trying to find out whether we can be indulged with a couple of minutes extra. We're supposed to close in three minutes, but we might try to run until 20 past six; I hope that is okay for everybody. So perhaps we could take one final question, and then towards the end I would ask Holger to wrap up the session for us. Maybe you want to pick one last question from the ones that were voted up. Yeah, so we've got a question from Valentina. Our research trajectory started from level one, to solve level one problems and scale within the framework of level one issues. We've reached evidence on level one, but we know that this does not necessarily fit the issues and problems that can be defined if we start by analyzing level three and take a theory of change approach. Sometimes Valentina feels that tackling level three issues redefines the needs for level one research; it goes both ways. Does anybody want to take that on? I'm happy to have a quick go at it. I think that's correct: as soon as we start looking at level three issues and understand them better, they help us to define some of the level one issues that could be embedded within them, and help us to set clear boundaries, and in doing so we become much clearer in articulating the research objective. That iteration can be extremely helpful.
The other comment that I would like to make in relation to level three problems is that we have to be very careful to identify which component of them actually falls within the responsibility of science. A lot of those issues are very complex, and often science plays only a minor role, or in some instances even a very controversial role, which we need to understand much better before we engage in a very detailed scientific process. We need to make sure that the questions being asked can actually be answered by science before we use science in trying to answer them. That's where the conflicts often arise, particularly between politics and science.
I would just add quickly, I agree completely, and I think it's important not to get overwhelmed by all of these ideas and expect or assume that each and every project has to do everything. Rather, what we should be doing is having a good, clear theory of change at the program scale, identifying the elements of the level three questions or problems that can then be answered with a more mode one, level one type of research. Some things have to be defined in a more narrow way, and some things have to be done to deal with more of the complexity. So I agree very much with the statement.
And I think that was exactly behind what I was trying to say. Remember the diagram I put up of what we were calling the research in development process, which together is a way of conceptualizing what might go on with some level three R&D work. Within that, you can break it down into pieces, and those pieces can often be addressed by level one type methods and research. And I think we ought to give ourselves credit for doing those things well, and make a noise about doing those things well, even though the whole picture might be something that's rather hard to assemble and hard to track exactly.
Yeah, hi. I was just wondering: aside from the Q&A, we have a chat box, because the panelists cannot write questions in the Q&A, and I see that Vincent, our director, has been very active. I was just wondering whether, Vincent, you have a question that you might want to ask? Yes, I have a question. Thanks, Federica. Maybe a direct implication of John's talk, and also of what the panelists have said, is that alongside type one problems versus type three problems, and type one research versus type three research to address them, there are also type one evaluation or assessment approaches versus type three approaches. So what are the implications? Very often, in fact, we are caught in a mismatch: type one evaluation is being applied to us while we are in fact doing type three research on type three problems. Does that mean that we really need to shift, as you've said, John, probably in your book as well, some of the theories of evaluation and impact assessment?
I would say yes, we do need to shift those theories, or at least our thinking about them. Expand them, I would say, is really what I'm suggesting, not taking what we have and throwing it away. If we have just a minute, I can maybe show you something that relates to how evaluation approaches and scaling approaches may be related to what we've been calling level one, level two, and level three sorts of problems. I'm asking our moderators if we have a minute for me to try to show that. I think if it's a minute we can... Absolutely. All right, let's see if this works. I'm going to show you right up here. So imagine we have a couple of axes. We have impact risk vertically, from low to high.
So there's a chance we're going to do something bad if we act, versus there isn't much chance something bad will happen. And then urgency: we'd better act now, which is high, or low. When the impact risk is high and the urgency is low, that's where phased research and stage gating, I think, work really well. We have the time to go through all of that and cycle through, right? So that's where the motto of scaling what works does well. When the impact risk is low and the urgency is low, that's what's happening a lot in the business world; they call this lean scaling, where the only urgency they have is around markets. They can take their time to develop a new product and just bring it into the market. It doesn't really matter; it's not going to hurt anybody. So you let the market decide there. Now, I work a lot with people who have one foot in the market world, and a lot of what's been going on is these two things getting shifted: people in markets are using lean approaches where they probably should be more cautious, and phased research is being pushed on them in contexts that don't really need the traditional phased research, stage-gate approach. So let's put those in the right order. Over here the urgency is high but the impact risk is low, in which case we should just go, right? Let's just do this. There's not a lot of risk and we need to act now; the ship is sinking, let's do something. And this is the quadrant which I think is really the issue: the urgency is high and the impact risk is high. This is crisis, and this is what's going on with COVID right now. This is why Rob and I said we needed a principled approach to innovation, or scaling, or evaluation, that takes that into account. I'd say that that circle is where most of what we do sits: we are acting because there is some urgency, and we have a high degree of uncertainty about what will happen. That's what we need to try to understand, and trying to superimpose any one approach on how we go about that is dangerous, because we mismatch the approach with the context. Having said that, knowing which is appropriate when and where is really quite hard. So I feel like there's a lot of ground being broken in these discussions, with a much wider view of what is possible than I've heard in the past. So I'm really excited. Was that a minute? Yeah. No, that was fantastic. Thank you. I think we're all in awe of your tech skills, especially. Thank you so much.
We are running terribly late; we're already seven minutes past time. But I wanted to thank all of you for being here, all of our panelists, and I wanted to ask Holger if you wouldn't mind closing the session for us. I know it's very late for you there, and thank you so much for staying up so late with us.
No, that's okay. It's ten past two in the morning, and I hope I'm still reasonably coherent. So thanks for that. I thought it was a fascinating session, and I really enjoyed the different perspectives, because I think they're critically important as we move into the new One CGIAR world. I appreciated very much John's talk on scaling and the fact that he reminded us that we need to look at acceptable impact risks, and that scaling changes the context. I thought that was a very important message, at least one that I took away. And that it is the responsibility of all of us to capture all of the impacts as they occur, rather than just the intended impacts.
I also thought that was really nicely followed up by Rick when he pointed out the misalignment between some of our conceptual approaches and the real world, and that we need to break the complex research and development process into more logical steps. That really helps us to think through the problem domains much better and to match the approach to the problems we need to solve. Brian did a fantastic job in actually applying the quality of research for development framework that we have developed, and I thought it was very interesting that he found the three-point scoring system they used wasn't fine-grained enough for the purpose. I thought that was a really good learning experience. The other thing I found fascinating was that simply by developing the rubric and applying it, it already identified some major gaps in science quality that could have been readily fixed right at the outset of many of the projects, had they used a framework like that at the beginning to conceptualize the research. That's a discussion that I really think we need to continue. And finally, the conversation about stage gating is, I think, very important for us as we move forward into a very different way of doing research, with a different relationship with the funders as well. I was encouraged to hear that the whole stage-gating process is about encouraging innovation, with the aim of not crushing creativity, and I think that's something we really need to keep in mind as we move forward. With that, I know that we are out of time. I appreciate the opportunity to be part of this, and let's keep the discussion going. Thank you, everyone.
Thank you. Can I just intervene for a second here? Thank you very much, Holger, that was really interesting. I just want to remind everybody that we're closing now, but tomorrow we have another three exciting sessions: two parallel poster sessions and a hot and controversial session on systemic approaches in a silver bullet world. It will be led in quite an innovative fashion, very different from the other hot and controversial sessions we have had. So I invite you all to participate and join us. Thank you for having stayed a bit longer; it was an incredible session, so really, congratulations to everybody. Wonderful. I'll see you tomorrow, everyone. Thank you. Thank you. Thank you. Thanks to everybody. Thank you. Bye-bye. Good day. Good night. Bye-bye. Thanks, Holger. See you. Thanks, John. Thanks, Brian. Thank you. John, Federica, Rachel. Thank you.