Okay, Ajay, you want to come up? So in September of last year, 2016, NHGRI sponsored a workshop. I think it was in this very room, as some of you know, yeah, the room we can't get away from, right? Yes, it is never changing. The purpose of the workshop, I think, was to try to identify opportunities in computational biology and data science. And Ajay is going to present the findings, or the report, from that workshop, correct? Okay. Thanks, Rudy. So I'm here on behalf of the Computational Genomics and Data Science Group and the co-chairs of the workshop to present the report. Essentially, as Rudy mentioned, this workshop was held here in Bethesda, or Rockville, I guess, at the end of September last year. The goals were essentially to prioritize genomics research topics for their relevance to the NHGRI extramural program, with a focus on computational genomics and data science, and basically to look at our portfolio and see what things need to be continued, what things need to be enhanced and supported, and what new challenges we need to address over the next three to five years. It essentially worked the same way a lot of our workshops do. We identified co-chairs from the external scientific community, and three of the council members here, Carol, Trey, and Aviv, were among the co-chairs. Mike Benke and Lincoln Stein made up the other two members of the organizing committee. And along with the NHGRI computational genomics staff, we held various meetings to identify who we should invite and what the general topic areas would be. And during the meeting, we had 39 extramural researchers come here, along with staff from NHGRI, NCI, and NIGMS, and the ADDS office (Phil Bourne's Big Data and Data Science office) at the NIH-wide level.
As far as organizing the meeting goes, as I mentioned, the session topics were designed by the organizing committee, and the speakers and the details of what was to be presented, and by whom, were organized by the attendees along with the co-chairs and NHGRI staff. There were five sessions, sort of like breakout sessions, where a lot of details were discussed. These essentially boiled down to challenges facing basic science; challenges in the clinical realm; what data and compute resources need to be there for genomics computations to be done at scale, which was the third topic; and the final topic, how we collaborate with other institutes and other resources that exist within the genomics arena. So as usual with these workshops, we started out with a general background presentation where, you know, our portfolio was presented by staff. We discussed the aims of the workshop, then we broke out into the separate sessions I talked about earlier. Each session recommended a set of areas of focus and topics, and then we got back together and used some technology to try to understand the different suggestions and priorities, and to communicate to the entire group what each of the individual groups had discussed. We used dot storming, which is essentially a technique by which you vote on different ideas, and during the voting process all the questions about what exactly was meant by a particular recommendation or suggestion were resolved over multiple rounds. And eventually we boiled it down to 13 recommendations. The import of each recommendation is essentially highlighted; I'll just read the highlighted portions. These are not in any recommended order, so being first doesn't mean a topic got the most votes. So interactive analysis and visualization of large datasets was identified as an important topic.
Another long-standing one was understanding how genotype translates to phenotype; ensuring genomic data sharing was a third; and then identification of causal variants and the computational tools for that. There was a lot of focus on developing phenotype-focused ontologies, on developing efficient and scalable algorithms, and on supporting what were called vertically integrated resources along with horizontally organized knowledge bases. Vertically integrated resources are specific resources that focus on single data types, and horizontally organized ones are like model organism databases that go across data types. The next recommendation was scalable, intelligent, and cost-effective development of metadata. The slide says "reproducible"; it should read "reusable", that's a mistake on the slide. Developing a cloud environment for NHGRI investigators was the ninth one; then rigorous benchmarking and gold standards for analytical methods, doing the right set of experiments and collecting the right data resources as well as phenotypic annotations. Then integrating genomic data into clinical decision support, and improving the process by which that can be enabled. The next one, 12, is integrating patients more fully into genomic medicine research, and the last is support for the informatics and computational needs of single-cell work. That was essentially the set of recommendations. Now I get to what we as staff plan to do about it. We are going to undertake a range of portfolio analyses, and I'll show you an example, and take these recommendations and either continue a certain emphasis that exists in current programs and current PAs, work on (or continue working on) policy, or create new initiatives. So we actually at NIH now have some reasonably good tools for doing portfolio analysis, and I'm going to take you through an example of such an analysis: genomic visualization, just the first recommendation.
Basically what I did was create a bunch of queries, the simplest one being "genomic visualization". The tilde operator is essentially a way by which the natural language processing algorithm looks for "genome" and "visualization" within ten words of each other. So essentially the idea is to create a range of queries that avoids missing relevant grants. We can do this within NHGRI, within NIH, as well as for funded grants across many other agencies, and also the European Union and so on. Anyway, you can take these queries, look at the results, place any type of constraint you want on them: you can constrain the number of years, you can look at non-awarded ones, NHGRI ones, and non-NHGRI ones, and you can then manually curate the results, and it's a reasonably efficient way to go about it. So just to finish the story for genome visualization, what we found is that for the last decade the query initially returned about 42 funded grants, which after curation boiled down to 11. At the non-NHGRI-funded level, this was the breakdown before curation and after curation. And the key to the genome visualization recommendation was that it was supposed to be interactive, so if you add the word "interactive", you get a much smaller number of awards, and you can actually see that all the awards that we actually do fund are reproduced in the portfolio analysis. So it's a pretty efficient and usable system, and we can actually use it to figure out what we need to emphasize in existing PARs, what needs continued support, and what new initiatives we can propose. So what we propose to do as next steps is publish the report on the website and advertise it. We expect to finish the portfolio analysis by the summer. These recommendations will also be used as input to the Institute-wide strategic planning exercise.
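To make the proximity operator concrete, here is a minimal sketch in Python of the kind of matching the speaker describes: two terms counting as a hit if they occur within ten words of each other in a grant abstract. The function name, the prefix-matching choice, and the exact tokenization are illustrative assumptions; the actual NIH portfolio tool's query syntax and matching rules are not specified in the talk.

```python
import re

def proximity_match(text, term_a, term_b, window=10):
    """Return True if a word starting with term_a and a word starting
    with term_b occur within `window` words of each other in `text`.

    Illustrative sketch only: prefix matching stands in for stemming,
    so "genom" matches "genome", "genomic", "genomics", etc.
    """
    words = re.findall(r"[A-Za-z']+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w.startswith(term_a.lower())]
    pos_b = [i for i, w in enumerate(words) if w.startswith(term_b.lower())]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

abstract = ("We develop interactive visualization methods for "
            "large-scale genome sequencing datasets.")
print(proximity_match(abstract, "genom", "visual"))  # True: 5 words apart
```

In a real portfolio analysis this check would run over every abstract returned by a broad keyword query, with the surviving hits then curated by hand, as described above.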
I would like to end by acknowledging all the co-chairs, Mike Benke, Carol, Trey, and Aviv, and Kevin Lee, who helped carry us through the entire process, from planning the workshop through today; my colleagues within the computational genomics subgroup within NHGRI; and Eric Green, Carolyn Hutter, and Jeff Schloss, who started this. Thank you. I can take questions, and I'm sure Carol and Aviv (I don't know if Aviv is still on the phone) may want to say something. Any questions or comments from the council? When you do the search, are you searching just the NIH portfolio? Is there some way to also look at what the Wellcome Trust or other international groups, the Global Alliance, whatever, might be doing in these areas? Yeah. So I think one can look at the European ones, certainly the UK ones, but we only find the funded portfolios, not what they didn't fund, whereas within NIH we can ask which grants we didn't fund. First, thank you for the summary, and thank you, Carol and Aviv. Actually, I've given this speech before and I won't give it again. In my opinion, this is one of the most important things NHGRI can be doing, but the challenge is: how much did you talk about, A, how do you prioritize in this field, and then, B, at the same time, the critical question of what NHGRI in particular can be doing to promote the field? Did the workshop address those two points? I mean, 13 probably sounds to you like a wonderfully narrow scope because you probably started with 130, but 13 is still a very large number, and some of those are broad in their character; they're not the results of a study, for example. Yeah, I think people did, to various degrees, try to think about how you improve the efficiency of the resources and the tools that are developed. I don't think anybody really had any magic bullets. I mean, I think we are all sort of stuck in some ways about how to proceed.
I think some ideas came out during the discussions with other institutes about how we interact better. So for example, during the discussion of what used to be called the Sandbox, and the Genomic Data Commons that NCI is piloting, there are lots of lessons we can learn that would improve efficiencies. There are broader conversations within NIH, which Dr. Patti Brennan alluded to, about contracts at the NIH level, where BD2K is trying to look for ways to reduce costs by leveraging cloud resources, not only in terms of the financial ability to negotiate, but also with respect to recruiting engineers from the various cloud providers to help us do implementations that are efficient and can cross various boundaries of scientific discipline. So we had hoped to try to get to that very specific question. I think the challenge was that we brought people into this workshop from very many areas, and these individuals weren't necessarily familiar with the NHGRI portfolio balance. So to ask them to come up with, you know, here is what NHGRI can do uniquely out of all of these things was really beyond the scope of the time, and actually the data, that we had to ponder during the course of the workshop. So, you know, clearly a lot of the things that came out as recommendations overlap a lot with what we saw in Patti's slide deck today. There's nothing that really jumped out as being particularly surprising. But I will say that NHGRI has really been at the forefront in a very impactful way. For example, in the whole ontology development area, right? NHGRI has led that from the very beginning, and that's been transformational to the analysis of genome data. So NHGRI has already been a leader in promoting data science principles and concepts.
And whether the portfolio analysis Ajay is working on now will identify anything where NHGRI is uniquely positioned to be the only one to lead in filling those gaps, I don't know. But at least we'll have a better understanding of what the gaps are, and then I think the idea would be to step back and ask that question again. So, I mean, I think as usual this is an ongoing conversation, and some of the new efforts Eric is leading with the council subgroups would be informed by it. In some ways, this can form an early charge for that council working group: to think about how you would take this nice list, prioritize it, and give it a little granularity. I also think, you know, prioritization for what? I mean, we divided it up, and there are infrastructure priorities; then there are genome science slash biology priorities, which might be different from genomic medicine priorities. So the priorities really are context driven; it really depends on what area you're talking about as to where the priorities will fall out. But I think that also emphasizes the importance of NHGRI leading on this, because I think everyone can get stuck in this prioritization conversation for a very long period of time without anything actually happening as a result. Whereas the rate of data generation in genomics is such that NHGRI simply can't afford the privilege of sitting there and waiting. And that's a good thing. And in some ways, it's a repeat of my earlier comment: it seems like the worst thing we can possibly do is get ourselves stuck as a field. There are many things that can be done. The field's moving very rapidly. So in some ways that creates a tendency maybe to get stuck, but we just can't do it. We have to push ourselves and push the field.
So I also think that these areas that are going to come out of this portfolio analysis are going to be great subject matter for the unsolicited pool. And I think that's a good thing to think about for individual investigators, where we can get innovation and new ideas coming from individual investigators that will end up supporting the overall mission of the institute and some of the big programs. But I do think this will help us bring more people into the computational field relevant to genomic biology and genomic medicine. I also want to emphasize, just to further echo and strengthen the things that were said, that because of the need to move with some agility and speed, and because for other institutes that are much bigger than NHGRI these are very big initiatives, I think the instinctive response is, well, we need to ponder this with great seriousness because this is such a major investment, and so on, and there's kind of an instinctive response to not do anything. Whereas trying out a bunch of things could be a lot more beneficial. And with software today, it's true that it's expensive and you have to invest a lot into it, but in fact, at least compared to what you get in return, it is cheaper to get started than it used to be. It used to be that you had to invest a lot more in infrastructure and in coding time in the old days than you do today in order to get the same or an even better outcome, in terms of the code bases that exist and the fact that the cloud exists; all these things really change how quickly you can go and make an impact. And, you know, a stellar four-person team can go a very, very long way in writing a piece of open source software that really transforms what people can do, if it's the right team.
That's not to say that you don't need major investments in infrastructure, you do, but it is just important to be timely and to move; the two are not mutually exclusive. I think that was a sentiment that was heard very substantially in the workshop: we need to get going. We can't stay in the past; we're getting behind the times. So I just want to second, third, or fourth, or whatever, those comments. I think this is a critically important area. I really like the idea; we do need to move forward, and we need to move as quickly as we can. The idea of putting out something to get some investigator-initiated work going, get some ideas going, get some people thinking about these things, I think is really important. Note that only this morning, when I read the slides and recommendations again, did I realize that the recommendations kind of touch on two things, and they tend to be separated from one another. There's a set of recommendations that talks about the data, and there's a set of recommendations that talks about the analysis. And there are very few recommendations that actually talk about how data and analysis come together. So for example, we have technologies and policies to ensure genomic data sharing, but not much about sharing the data and the analysis capabilities on that data together. And so I think that would be an interesting thing to think through in terms of some specific programs. I thought I should have made that comment a long time ago, but it only occurred to me now. Well, this is Carol, so I wonder, and Ajay might remember: these 13 recommendations were kind of the result of that dot storming voting, and I think that in the fuller documentation the point you just made is fleshed out as an important area.
It's fleshed out better in the full document, but if these statements are the ones that kind of resonate in everyone's mind, it's important to know that the spirit of the conversation was a little more nuanced than what we can say in this very short bullet form. Maybe that's the right way of phrasing it. Yeah, I mean, I think the document actually does contain some of that. Okay. Thank you, Ajay, and thank you, council, for your input.