All right, so we've waited our customary minute or two for Zoom meetings. Okay, so let me introduce myself again. I am Ajay Pillai. I'm a program director at NHGRI and part of the team for the Morphic program. My colleagues include Adam Felsenfeld, Colin Fletcher and Riley Wilson. And I would also like to take this opportunity to thank Gerald Somani for his help with the webinar. We are gathered today for this webinar specifically focused on RFA-HG22-019. This is for the data analysis and validation centers, which are going to become part of the Morphic program. Next slide please, Riley. Okay, so I am going to start off with a whole bunch of links. As I mentioned before, these slides are available as a PDF on the Morphic webpage that NHGRI has. So the relevant links that I have decided to include in this slide include the link to the current FOA. If you want to reach us as a group, you have the email contact available, the webinar link, which you probably already have, and the program webpage. There is an FAQ that was developed for the Morphic program. That's also available on the website. There is the NHGRI 2020 Strategic Vision; one of the outcomes of that vision is the Morphic program. And there are the past FOAs, which were for the Data Production Centers and the Data Resource and Administrative Coordinating Center. Next slide please. So this slide covers what is going to happen today in this webinar. After I tell you a little bit more, Adam is going to take over and talk about the overall aims of the Morphic program. And then I'm going to come back on and talk about the FOA specifically, and then we have some time allocated for your questions. Okay, so I want to remind you, as I mentioned before, that this call will be recorded and posted to the NHGRI Morphic webpage. If your questions appear to be of broad interest, we'll convert them into FAQs and post them. You do not, of course, have to identify yourself when you ask a question.
And also, this is a Zoom webinar, so you have a Q&A button instead of the chat button, which some of you may be more familiar with. Please ask your questions using the Q&A button. As I mentioned, the Morphic program has two RFAs that have already been issued. The funding announcements for the experimental component and for the data coordination center are going to be made, in all likelihood, this fiscal year, which means in the next week or so. And once that information is public, we will update our website accordingly. And with that, next slide, Riley, please, I'm going to hand over the conn to Adam to provide a general overview of the Morphic program. Adam.
Great, thank you, Ajay. I'm Adam Felsenfeld. I'm another program director involved with the Morphic program. And I should also say that on the phone with us today are Colin Fletcher, another program director for the program, and Riley Wilson, a program analyst for Morphic. So Riley, if you could advance to slide six, the next one. Yeah, so it's easy to talk about Morphic in terms of its long-term goals, so beyond the initial five-year grant period. The point is to develop a catalog of molecular and cellular phenotypes for null alleles in human across the genome. And ideally, such a resource would be consistent, meaning that there is standardization, the assays are well characterized, et cetera. We would want strong loss of function, probably null alleles for everything rather than different kinds of alleles for everything. We would want informative molecular and cellular phenotypes in multicellular human systems. And we want it comprehensive, so all genes, or as close as we can get. And of course, that's an ideal. And as an ideal, I hope it's easy to see the data would be highly informative for multiple tissues, relatable to other kinds of data, so other alleles and phenotype data, consistent, standardized, et cetera, et cetera. But we think this is not feasible yet.
We don't know the best way to design such an effort. There are a fair number of technical and scientific challenges or barriers, which I'll get to in a bit. We don't understand scaling issues like costs and throughput. We don't understand the strategies to employ, the tools to be developed, or the trade-offs that need to be made to best explore what's likely to be a very large experimental space. And just getting the data, of course, is not enough. We need to understand how to use it, with all that entails, from data management to QC to making data interoperable to demonstrating the scientific value of the data in multiple applications. So next slide, please. So why do this? Well, first, there's a paucity of data on human knockouts, though there are resources like gnomAD that do yield information. There are mouse knockouts from the Knockout Mouse Project, but for most of them there are no molecular or cellular data. And while mice are an outstanding model for many reasons, they're still not people. Knockout phenotypes are useful for interpreting other alleles. Data like this would complement a number of large efforts underway to study gene regulatory variation, for example. It could be a resource for insight into biological pathways. And finally, if the cells on which the assays are done can be saved and propagated, they may ultimately represent a collection of disease models for further study. The purpose of phase one, and next slide, please, is really to understand the main barriers to that long-term goal by getting started at a scale of about 1,000 loci over the first five years. So it is really about feasibility and about the information that's needed to design a potential phase two. So certain barriers that we call out, though there are certainly others and this is not an exhaustive list, are how to select genes and how to optimize making alleles.
We need to know what cellular systems are most informative and in what circumstances, what assays are informative, and what the potential for scale is, of the parts and of the whole. And there are a number of interesting scientific issues like specificity and pleiotropy and cellular autonomy and compensation and genetic variability and other variability, and getting maximum value from the data. And again, phase one intends to address all these things by starting at a modest scale and at the same time giving room for some diversity of approach. So next slide, please. There are ultimately going to be three components to the Morphic program. As Ajay mentioned, there are data production awards; there are four awards that will be made by October. There is a data resource and administrative coordination center; that will be one award, by October. And then there is what Ajay and I are going to talk about today, mostly Ajay, which is the data analysis and validation centers, and that's this FOA. So next slide. So I'm switching gears a bit here just to give some general advice for applicants. These are going to be cooperative agreements, which means that there's substantial NHGRI program management. There are collaborative tasks that are going to be apparent. Some we've defined and some will get defined once the consortium really gets going. Ajay is going to talk about those, but please read the FOA to find out the terms and conditions for how this will be managed. There will be flexibility within the program to set and adjust milestones, which is always needed in a complex program. There will be a kickoff meeting after the grants are funded in order to establish the consortium. Letters of intent are not required, but we strongly encourage them. They're due October 1st. They really help us get a handle on, especially, review workload. It's difficult to find good reviewers, and the longer our review branch has in advance to look at these, the better.
The next slide, please. In general, always read the review criteria section of any FOA. This is what reviewers will use to evaluate your application. Please read the instructions to applicants for the research plan sections. These FOAs, all three of them, have a separate resource sharing section, which is not typical, but those resource sharing sections will be evaluated by reviewers and considered in the score. Please also read the section on review and selection, which lists criteria that NHGRI may apply in selecting among well-scoring applications. Please read the budget section, especially regarding the minimum time commitments and the need to budget for consortium meetings. And please choose any letters of support judiciously. It's great to have lots of collaborators and lots of enthusiasm, but if you have a lot of letters, you may be disqualifying potential reviewers in what may be a small field. Next slide, please. So we especially encourage applications from investigators from demographic groups or institutions that are generally underrepresented in genomic science, from new investigators, from experienced investigators who are new to genomics, and from investigators who have not previously participated in an NHGRI consortium or program. And the next slide, please. Overall, as Ajay mentioned, we will probably post some FAQs, depending on the questions that are asked. Please look out for these in the next month on the website, which will be updated. And I think that's it. And I think we should hold questions for after the whole presentation. Do you think that's reasonable, Ajay?
Yeah, that's fine, Adam. But if people are worried that they may forget their question or something, you can type it into the Q&A right now, and we will address them all together at the end. All right, thank you, Adam. So I am now going to get to the specifics of the RFA for the data analysis and validation centers.
Next slide, please, Riley. You did two. Can you go back one, please? Yes, thank you. As I already indicated, if you want a PDF version of these slides, they are available on our website. Some of the fonts, especially later on, are too tiny. Those slides are there for you to look at later; they are not something we need to talk through today. Okay, so the general outline of what this FOA tries to achieve. The primary goals for the data analysis and validation center FOA are to make sure that the consortium's data variability is controllable, that the data is useful for understanding basic biological processes, and that it is interpretable for undertaking future hypothesis-driven science by the community. So essentially, a lot of the things that Adam addressed in his overall goals for Morphic. Our main product is going to be the data that we put out, and the data analysis and validation centers are going to play a key role in ensuring that the data is of high quality and that useful biological insights can be gathered from it. So generally speaking, we are looking for projects with high potential to illuminate the strengths and weaknesses of the data being generated and with a focus on community utilization. Feedback on things like how we can reformat the data, how we can better our experimental designs and so on will be obtained from the community. Next slide, please. So the FOA itself lists a whole bunch of non-responsive criteria, and I wanted to highlight some of them. We do not expect that there will be wet-lab data generation in your applications to this RFA. You should make sure that you propose to use Morphic data, and say how you propose to use Morphic data, and not just other similar or complementary data sets. Also make sure that you address collaborations within the consortium. These are, again, non-responsive criteria, as are applications that do not contain a data sharing plan. Next slide, please.
So some of the key challenges. As Adam said, this is not an exhaustive list; these are some of the key challenges that we have identified, and I'm sure you will think of many more. Some of the key challenges include integrating data between the data production centers. As you will see when I describe some of the experiments that are being proposed by the data production centers, there are differences in the experimental designs and all the other related details. Another is integrating the data that Morphic generates with other data sets that are available in the community, through either other consortia or R01 work. So the types of projects that one can think of, and again, this is not limited to this set, are figuring out how to identify and correct technical bias, batch corrections and other issues that come up with large-scale data production and analysis, work on improved experimental design within Morphic, methods for separating the effect of the knockout from background changes, building things like gene regulatory networks or other pathway-based analyses, and data integration across multi-omic measurements. Again, you'll see some multi-omic measurements being performed by the data production centers. So a fair amount of the work that you will be proposing will involve paying attention to the availability of metadata and to quality control. Of course, the data production centers will also be working on quality control and assessment, and both of these things, metadata and quality control, should reflect both the underlying biology and the data generation activities. And also think carefully about how small labs outside the Morphic program, around the country or around the world for that matter, can use the data and the models created by the Morphic program for designing downstream experiments that validate a lot of the insights from the large programs, data resources and catalogs that we put out. Next slide, please.
In general, I wanted to remind you that, in addition to the scientific aims, there are consortium responsibilities. And again, how good and useful are the data, how good and useful are the metadata, the API and the data access? There is a whole range of open problems that we have not resolved yet, and we will start resolving them during the first year and later on during the Morphic program. There are questions around common pipelines, and we expect that the data analysis and validation centers will take a leadership role in making assessments around common pipelines and related issues about sharing data. Okay, next slide, please. So again, as Adam mentioned, there are particular parts of the FOA that you should pay attention to. And of course, the review section, with the questions listed there on how reviewers are going to review your applications, is an important section to read, as well as what the research strategy section of the FOA says. As a reminder, the budget is at 350K direct costs maximum, for five years. And of course, the budget should reflect what the actual needs of the project are. Applications are required to include a separate data sharing section. And again, to repeat Adam, this will be reviewed and considered in scoring. There are very specific review criteria, in addition to the standard NIH review criteria, that are again listed in the FOA. As a general reminder, the due date for the application is November first. Next slide, please. Okay, I am going to take you now through a whirlwind tour of the four data production centers, okay? And this is just to give you a general idea of the types of technologies that are being used to generate the knockouts themselves, the types of cellular and molecular systems that are being proposed for analysis, and the types of assays that are being proposed by the data production centers.
And again, as a reminder, once the awards are made, you will get a lot more information about who the data production center awardees are and other details about their projects. Okay, as some general background, each one of the data production centers has proposed to analyze its own data, that is, each of the single-modality data sets that it proposes to generate. They also propose to do some limited amount of integrated data analysis to discover new biology from the experiments that they are proposing to do. What I'm going to do in the next series of slides is summarize graphically, at a very high level, what each one of the Morphic data production centers has proposed to do in its application. And again, it's a very high-level summary. I've tried to make certain things in the graphics uniform so that there is consistency between the different projects that you will see. At the end of the slide deck there is a tabular summary which complements the high-level graphics that you're going to see in the next few slides. I'll just show you one of those slides today, but as I said, the PDFs are available if you want to look at them right now. Next slide, Riley, please. So this is project one. What they propose to do is undertake five different types of assays on differentiated cells. Most of the assays that people are proposing are going to be found in these graphics in yellow boxes. And there is a generic description of the data analysis pipeline that they propose to use, along with the specific technology; for example, this one lists 10x Genomics. And the green boxes here are where the data comes out of the data production center into the DRAC, the data coordination center. So that's the general structure for these graphics. And on the left, almost all projects are starting with iPSCs. They are going to knock out genes.
There are four projects, as I mentioned, and all of them propose slightly different ways to knock out genes. They include CRISPR-Cas9, CRISPRoff, as well as degron approaches. A lot more of these details are there in the tabular version. And okay, next slide, please. So this is project two. As you can see, this one is proposing to use different types of cellular systems. They are going to produce three-germ-layer differentiation as well as cardiomyocytes. They are proposing to do bulk RNA-seq and ATAC-seq assays. There are a lot more phenotypic, non-genomic measurements, including calcium transients per cell, as well as microelectrode measurements of how well the cardiomyocytes are beating and how well they perform electrically, and also a lot more assays around genotyping. The slide also gives you some more details about the types of genes they plan to knock out. One more comment: during year one we are going to figure out the details of coordinating among the different data production centers around which genes to knock out, as well as issues like sending samples over to someone else to run their assays and things like that. None of these things have been decided yet. As the year progresses, we will make this sort of information publicly available. Next slide, Riley, please. This is project number three. One emphasis of this project is to make a lot of their knockout iPSC lines available to the community, on a scale which is much larger than some of the other projects. They also propose to differentiate into three different models: 2D gastruloids, pancreatic islet organoids, and a triculture system with neurons, astrocytes and microglia. And here they have a variation on the type of transcriptomics they're going to do. They specify that this is one of two centers that are going to perform Perturb-seq type measurements. Next slide, please, Riley. So this is the fourth and final data production center project.
They are going to focus mostly on embryoid formation, as well as a much larger set of assays. I want to hasten to add that, at least for this project and maybe for some of the other projects, not all assays are going to be run on all perturbations, right? There's going to be some sort of triaging that's done. And this is part of what we are going to assess in the Morphic program, as Adam mentioned, right? What is scalable? What is practical? What is practical five years from now? All those sorts of questions. So they're going to perform Perturb-seq in both a transcriptomic and an epigenomic manner. They also propose to measure nuclear protein abundance in a multi-omics setting. And finally, spatial transcriptomics using the Visium platform. Okay, that's the summary for all four projects. As I mentioned, there is a tabular representation of this information. Next slide, please. So another important element of the Morphic program, of course, is the data coordination center. Now, of course, this center is going to do a lot more than data coordination. They're also going to be the administrative coordination part of the Morphic program. I will summarize what they propose to do in one graphic, with a focus again on data, because a lot of what all of you are going to be interested in is what is going to be happening to the data and how you are going to play a role in all of this. So next slide, please, Riley. So this is a summary of what they propose to do. I'm sort of highlighting the parts of the DRAC's application that relate to the data analysis and validation centers. So as you can see, a lot of their focus is on making sure that data ingestion works properly, that there are identifiers and metadata and all of those things, and on putting the data and the analysis in an open manner on a well-designed web portal, as well as on the design of the APIs to access the data and be able to query the data in some reasonable manner.
As I mentioned, you should think about what sort of external data sets are important for the sort of analysis that you propose, and if that data is accessible, you should think about making it available within the Morphic ecosystem so that one can analyze the data together. And that basically brings me, next slide, to almost the end of the presentation. I will get to questions in a minute. I just wanted to show you one slide. So Riley, if you could jump two slides down, please. Thank you. So for each one of the tabular representations, there's a label, project one here at the top left, which is the same as the graphical project one, right? So you get a lot more details about what people are proposing to do, what sort of biological questions they might be interested in, how knockouts are being made, things like that, okay? And with that, Riley, if you can come back to the slide that says questions, that brings me to the end of my presentation. Adam, Colin, Riley and I are available to answer any questions that you have. You can ask your questions verbally, or you can type them in, or send us email later. Any of these things will work. That's an interesting one: the PDF is missing images from slides 21 to 26. I saw that. Okay, we will reconfirm and we will fix whatever is in error. Apologies for that; I thought I saw them, but anyway. So the next question is, would small-scale experimental validation be allowed in this proposal? I would think that the answer generally is no. Maybe my colleagues have other thoughts on this, because I think most of the experiments should really be done within the context of the Morphic data production centers. How many awards are expected, is the next question. I think basically we are hoping to make about three to five awards. It depends on the details of what the budgets are. This is specified in the FOA.
So the next question is, will the presentation be available? Yes, the presentation will be available. The presentation already is available, but apparently it is not right. So we'll fix that and update it. For RNA-seq, will any of the data have full transcript coverage, or only the three-prime end? So the bulk RNA-seq probably will. I'm unsure about the details; if this matters to your proposal, please send that as a question to the Morphic Program at NIH.gov email address, and I'm happy to figure out the details and let you know. Is collaboration with all DPCs expected? Yes, collaboration with all DPCs is expected, but that does not mean that you have to work equally with all of them, to the same degree and the same extent. The next question is, so I'm answering all of these live. One second, let me just update my screen with the questions that have already been answered. What types of entities are eligible to apply? So there is a long official list of entities that are eligible to apply. Mostly those include the usual universities; there are some federal agencies; applications from foreign institutions are allowed, and applications with foreign subcontracts are allowed. If you have very specific questions, feel free again to email us and we'll be happy to answer those sorts of eligibility questions in more detail. Okay, can we propose new analytic methods and algorithms in the proposal, or will it be more engineering driven? Yes, you can and you should propose new analytical methods and algorithms in the proposal. And some of the methods would be engineering driven. If there are standard methods that are available in the field, you should of course compare your new methods with the existing methods to make sure that one is doing the best work possible.
There will be some amount of engineering that is required, you know, at least at the GitHub and Jupyter notebook level. Your source code should be made available, and those sorts of details. The idea for the data coordination center is that they are more likely than not going to wrap them up as, I'm missing a noun. They're going to wrap them up and use them in a cloud-driven framework. So, at least at that level, they are engineering driven. Are the knockouts het or hom? Adam, do you want to try to answer that?
Yeah, I think the assumption is mostly homozygous as a default, but there are certainly going to be circumstances in which heterozygous will be important to analyze. For example, when there's a huge amount of pleiotropy, or just a really early and awful lethal effect of a homozygous allele, something like that. It's one of the things I think we have to understand the answer to. I think there's already some data that provides insight into that, but I think we need to answer that in the context of the program.
Okay, thanks. So next question. It appears one key aspect of, whoops, I lost my question. It appears one key aspect of the program is to create bridges to other synergistic programs and data collections. The answer is yes. There are multiple ways one can think of doing this, and you are welcome to propose your own methods. So I think that's one key aspect of the program. The next question is, where can we find planned data details? That is, the number of samples, number of models, types of data modalities. So you will likely not find that level of detail, the total number of applications and the total number of samples. If you think this is critical for you for some reason, please send us an email and we will try to get this question answered. I will say the following, generally speaking.
So the data production centers actually propose to do a lot of QA/QC around how well the gene has been knocked out, how well the phenotype gets carried through once they start differentiating into specific cell types, and a lot of those sorts of details. And presumably all of these things will work, so that you will actually have data available to the community. But if you need, for your application, details about the total number of samples and things like that, please send us an email.
Yeah, especially if they go beyond what Ajay put together for the tabular summaries. Please have a look at those first.
There are so many data modalities from project one to project six. Can we cover part of the data modalities, or is it required to cover most of the data that's generated by the DPCs? The answer is no, you do not have to cover all of them. Having said that, if you just say that you are only going to deal with RNA-seq data, that's probably not going to make the reviewers happy, but, you know, I'm not a reviewer. But generally speaking, you should think about what Adam said, right? The overall goals of the Morphic program are what we emphasize when we do our reviewer meetings with the SRO, right? So at the SRO's pre-meeting with the reviewer group, we are going to emphasize the roles and goals of the Morphic program to the reviewers, so that when they read the applications, they understand how a particular proposal fits in with the overall goals of the Morphic program. I hope that answers your question.
I can give another kind of answer, which is, boy, we would sure like to get out of phase one what it takes to take some currently difficult-to-use or difficult-to-harmonize data and make it really useful. And maybe it's just not the right data. Maybe that's the answer, or maybe it's interesting data, but it just isn't obtained the right way. It's too variable. Something like that.
But as a consortium, at the end of the day, we do need to get a much better feel for that than we have now. I know that doesn't help with an individual application, but I think Ajay gave the individual-application-level answer for that.
Okay. Next question. No experimental work within DAVs, but how much interaction might be feasible? That is, validation raises specific issues, so what's needed, and how much feedback is fed back to production centers to be resolved by more lab work there within the time frame of the project? And the answer is yes, to all of these things. We do expect this to be a very active consortium. And we do expect that insights from your analysis of the data sets will be taken into consideration when retooling the work in the data production centers in year three and in years four and five. So yes, we expect that. Adam, do you want to add anything?
No, I think there are obviously going to be some resource limitations on how much can be experimentally validated. But if a production center has data that clearly would be useful and needs to be validated, and within the consortium there's not the right kind of validation being done, then that needs to be part of the discussion of the consortium, and certainly it's the kind of thing we could ask production centers to do more of.
Okay, next question. As for the engineering aspects of part of the work, would it be okay to include software and web developers to go beyond just GitHub repositories, to interactive web servers? So this is a tricky question and I want to be careful answering it. We do not want multiple grantees within the Morphic program to just spawn websites of their own, because it just creates a lot of confusion within the community.
Now, having said that, I'm guessing that people want to propose specific tools that will do specific things. If, for example, these tools require a web interface to enable access for biologists who don't want to be writing programs, then we expect that there will be links from the main Morphic page which will lead off to other pages that you may want to develop. So yes, software and web developers would make sense under those circumstances, but I just wanted to clarify that it's not going to be a completely independent website that you can put up. It's not very helpful for the large program to do that. I hope that answers the question. Adam, do you want to add anything? Colin, do you want to add anything to this?
I think Colin should probably add it.
All right, Colin's silence I'm going to assume means he has nothing further to add. Okay, so the next question: how long do you expect it to take for the data from the Morphic centers to be generated and shared with the data analysis and validation centers? Well, I will give you a generic answer. I don't know. I would guess that there will be hardly any data in the first year. Specifically, there will be very little data that comes out of any of the differentiated and more complex, organized systems. There might be some data, some preliminary data, especially at the iPSC level and especially from some of the simpler assays to run. Adam, do you...
Yeah, the only thing I really want to add is that the production centers and the DRAC will have almost a year's head start. I think, if we're lucky and things work very well, data should start arriving about the time that this FOA would be funded. There should be some reasonable amount of data to get started, but it wouldn't surprise me if it took another six months to a year beyond that to have enough data that was QC'd to the extent that people would really want to start devoting a huge amount of effort to it.
So I do think that second year of data production is going to be a make-or-break year for the timing of what the analysis centers can do.
So the second part of that question is, in other words, for a five-year plan, how long do you expect us to develop methods and explore them with non-Morphic data? So, based on Adam's answer here, right, by the time you join, there should be a reasonable amount of data. However, I think it is very important to compare across high-quality data sets. So if you do have relevant data sets to bring to the table, please do. Okay. How important is novel visualization development for this RFA? So I will say this, and this may be an insufficient answer. I think visualization remains a big challenge for a lot of biomedical data sets, for a lot of reasons. So yes, that will remain true within Morphic as well. If you have new ideas and new methods by which data can be visualized, especially in terms of integrating across data types, that would be a useful thing to have within the consortium, and as a tool that is deployable on the website. I think I have answered all the questions. Are there any questions I have not answered? As I said, I will go back and fix the slide availability problem. It's really annoying me at this point. Sorry, I'm rambling. So as I said before, please send us an email if you have more specific questions. Some of the details that we were not able to address, we are happy to address via email. One of the things we try to do is, as I said, put out an FAQ, so that we are fair and open to the community about answers to questions that are generally useful. Okay, it looks like there are no additional questions. Is that true? Does anybody have any lingering questions? Okay, I just confirmed that for some reason the images are not there in the PDF. I will fix this problem shortly. My apologies for this.
Okay, it doesn't seem like there are any additional questions. And Adam, should we just bring this to a close? Or do you want to stick around?
Ajay, I think we should stick around to the top of the hour. But I agree, I don't see any additional questions. And we can see the participants; if everybody leaves, we can wind up, but I do think we should stay.
All right, Ajay, I think it is essentially the end. Thanks, everyone.
Thank you all. Thanks, Gerald.