Good morning. My name is James Coulomb, and I have the honor of serving as the NIH Working Group Coordinator for the Gabriella Miller Kids First Pediatric Research Program. I'm here to give you a brief overview of the program and a progress report, and also to talk a bit about a very recent collaboration with KOMP. Excuse me if I read this. I want to stick to the time, and I don't want to skip over a lot of salient points. So the Gabriella Miller Kids First Pediatric Research Program was initiated in response to a congressional act, the 2014 Gabriella Miller Kids First Research Act. Both the program and the act are named in honor of a 10-year-old Virginia girl who in her short lifetime became a very forceful advocate for pediatric research. The act did a number of things. It ended taxpayer contributions to presidential nominating conventions, it transferred the $126 million that had been accumulated into a pediatric research initiative fund, and it authorized an appropriation of $12.6 million per year for 10 years to the NIH Common Fund for pediatric research. The first appropriation was for fiscal year 2015. This is separate from NIH's normal budget, so it's an additional chunk of money for NIH to advance research with. The vision of the program is to alleviate suffering from childhood cancer and structural birth defects by fostering collaborative research to uncover the etiology of these diseases and by supporting data sharing within the pediatric research community. And notice the emphasis is really on collaborative research and data sharing. We're operationalizing this vision by building a pediatric data resource that provides access to high-quality genomic, clinical, and phenotypic data in order to facilitate collaborative research. The intent is to accelerate gene discovery and ultimately to improve diagnostics and therapy.
Kids First is a trans-NIH effort supported by the NIH Common Fund, and the institutes that chair the Kids First working group are, well, it's basically institute leadership from NICHD, NHLBI, NCI, and NHGRI. In the working group, there are also representatives from a variety of other institutes and the CDC. Basically, everyone who's interested in pediatric research, we try to rope into our program. The list of members can be found on the Common Fund's Kids First website at NIH.gov. The working group, which thinks up, organizes, and puts together the nuts and bolts of the program, is led by a smaller leadership team, which is named here. They're mostly program officers, but we also have grants management specialists and a team from the Common Fund, and since all of us have other jobs, sometimes a few too many jobs, we have Valerie Cotton as a program manager to ride herd, keep us in line, and make sure the program doesn't go off track. We also have a governance structure, as it were. This is an overview of the four basic elements of the program: the working group, the data resource center, the sequencing centers, and the X01 investigators, who all have representation on a steering committee that's chaired by Virginia Papiano. We've also recruited a group of five outside experts to individually give us advice about the program. The program initially was conceived of as three major initiatives. One, we wanted to identify cohorts of children with cancer and/or structural birth defects and their families and provide whole-genome DNA sequencing in order to build a database with the phenotypic, genomic, and clinical data that the community could then use. The second initiative was to build that data resource and make it available to the community.
And a third initiative, which is still in the planning stages for the future, is basically demonstration programs to show Congress that this is a good approach and that we've managed to do something with the money. In the first year of the program, we had sequencing done by the Baylor and WashU sequencing centers. Subsequently, and for the last three years, we've been working with HudsonAlpha and the Broad Institute for sequencing our cohorts. As Charlene said, we use the X01 mechanism, which is not really a grant. It's a grant, I don't want to say the word grant. It's a way to provide access to NIH-funded centers, in this case sequencing centers. And in the first three years of the program, we've put into our pipeline the cohorts named here for sequencing. This represents about 6,000 cases. And since most of the studies are of trio design, with the affected child and their parents, this represents over 18,000 genomes. Five of these studies are available now to the public. The rest are somewhere in the pipeline, and more should be available soon. In this last year, 2018, the fourth year of the program, we were able to add 11 cohorts representing over 8,000 genomes. This is an expansion of what we expected to be able to do with the money, and we were able to expand our sequencing scope by collaborating with other programs at NIH. These include the INCLUDE Project for studying Down syndrome and NIAAA, who've provided additional funds for sequencing, and the data will go into the Kids First program. These are great examples of how NIH programs can work together to amplify our effects. So some of this is going to seem like down-in-the-weeds administrative trivia, but in the four years of cohort sequencing, we've faced a number of surprising challenges and come up with solutions.
And basically, this program has sort of been a wild ride with one problem after another. That's kind of surprising, because other programs have gone here before, but we seem to be facing new problems all the time. We're managing, though, with a dedicated staff and dedicated folks at the sequencing and data resource centers. So I'm going to tell you about three of those problems. The first was not really a surprise. X01s don't come with any funds for analysis support. So we repeatedly heard from our X01 investigators that, wow, it's great you've given us hundreds of thousands of dollars for sequencing, but now what can we do with the data? We don't have money, and it's hard to get funding for analysis support through NIH's regular grant mechanisms and the review process. So in response, we have been issuing funding opportunity announcements for R03s that are dedicated to supporting analyses of Kids First data sets and the development of appropriate methods for analyzing those data sets. These are not using Kids First monies. Instead, we've gotten a number of institutes to agree that if applications get review scores that meet their pay lines, they'll pony up the money. The combined direct budget for these two-year projects is twice the normal R03 budget. So it does give you a little bit of leeway for at least hiring bioinformatics personnel and getting started with analysis. If you're interested in this, you can contact me, my e-mail's there, or the IC representatives that are named as scientific contacts in the program announcements. We're soon going to be reissuing this announcement, so this number is going to change, but there will be an announcement for continuing the R03 support. Oh, and by the way, these are program announcements with referral, or PARs, and that means that they should get a more targeted review than normal, and the reviewers should be well aware of what the intent of the program is and review accordingly.
Okay, another challenge that was a bit of a surprise was that the cohorts often had data use limitations that interfered with the intent of the program. Our mandate is to make data generated by Kids First accessible to the research community and to facilitate collaborative research by enabling researchers to easily combine and compare data sets for cross-data-set analyses. We found that many of the data use limitations were restricting the ability to use these data sets fully. And, I've got to be careful here because Valerie's going to kill me if I say this wrong, we were under the impression that possibly some of these data use limitations were misinterpretations of the original consent language that the participants had agreed to. So we undertook about a four-month exercise to educate our cohort PIs in what the data use limitations really meant, and in this last selection of cohorts, we were able to specify our expectations of broad data sharing. Didn't say anything wrong? Thanks. Another challenge was that when you go through dbGaP to apply for access to these controlled-access data sets, your application is typically reviewed by a data access committee. Up till now, requests for access to Kids First data sets have been sprinkled about the different institute-specific data access committees, and we foresaw that as being a problem both in terms of consistency and possibly in terms of timing. So we've formed our own data access committee just for requests for Kids First data, and that committee is being chaired out of NCI's Office of Data Sharing. We think this is going to streamline the process, make it consistent, and hopefully speed up the processing of these data access requests, making our data sets more easily available to the community. I talked about cohort sequencing, and we're building a set of data, but that's not very useful unless people can access it.
So we funded, in the third year of the program, a data resource center. That data resource center is really a consortium of five different sites and investigators at those sites, with CHOP, the Children's Hospital of Philadelphia, as the lead institute. The overall PI is Adam Resnick, and he and his team were funded just a little over a year ago, as of last July. The work of constructing this data resource center was divided into three cores: a data coordinating center, an administration and outreach core, and a team working on a data resource portal. Their stated goals are on this slide, but I'm just going to touch on some of what they've accomplished in this brief last year. OK, the data coordinating center is charged with taking in the genomic, phenotypic, and clinical data, and that's surprisingly difficult. These are huge data sets, as you probably know. And so just piping data over to a cloud resource and having control of it is actually quite a challenge to arrange. They also were charged with, and have constructed, a phenotypic and clinical data harmonization framework using controlled ontologies. And this, like everything in this field, is a source of continued improvement and a lot of interest, and hopefully of powerful tools for analysis. The genomic data coming into the Kids First data resource is being harmonized against the latest reference build of the human genome, and they're using an optimized and scalable pipeline. So all the data sets are harmonized the same way, and that makes it easier to compare and search data sets bioinformatically. Again, that's actually a pretty huge project. Well, the data resource center as a whole recognized that the patient and clinical communities can and should be able to access and contribute to the research. So they recognize that we have partners: not only the researchers, but also health care professionals, the patients and family members, and the interested community.
So they've done outreach to help design both the data portal and the website. They went through a process of having community meetings, at least three, with the interested community, getting feedback on what features were desired. One of the things asked of the data resource center was more outreach. People, especially in the advocacy community, really wanted to know about the researchers. They wanted to know about the projects. They wanted to know about the data resource center itself. And they wanted to know about scientific progress. So the administration and outreach core has developed flyers, handouts, and posters for meetings explaining the program, and is reaching out through social media with investigator spotlights and spotlights on the interested community and advocacy groups so that some of their efforts can be better coordinated. OK, finally, again, it's great to assemble and index the data and put together phenotypic frameworks. But it's not very useful if you can't access it, or if it's not easily accessible. So an important part is the data resource portal. The PI of this part is Vincent Ferretti. They're charged with, and I believe have been successful in a very short time in, developing a compelling user interface that will let users of all skill levels browse, visualize, and perform in-place analytics across the Kids First and related data sets. Of course, these are controlled-access data sets. So it was necessary to integrate into this a framework from the Gen3 platform that allows controlled access to the data sets. And they also integrated in Cavatica, which is sort of a, now I'm not a bioinformatics person, but I sort of view this as a Santa's workshop where you can import power tools, as it were, into the workshop and set the elves to working on your data sets. It really looks intriguing to me. And you'll get a little glimpse of it in a minute.
I can't really convey how this data portal looks and feels and how useful it might be. So what we're going to try and do is show a, and I can't see, oh, I've got to use that, sorry. So what I want to do is show a quick demonstration of the portal. You can do this yourself. You go to the kidsfirstdrc.org website, click on support, go to support, oops, and getting started. I'm not sure why that scrolled. Oh, yeah, yeah, forgot about that. Got to scroll down. And down here is a quick demo. Now, they only had a year to work on this, so this is their first iteration, and it will certainly be improved upon. But it's pretty impressive, I think. And I'm supposed to go, OK, sorry.
Hello, and welcome to the Kids First Data Resource Center demo of the website and portal system. The goals of the Kids First DRC website are to increase research, collaborations, partnerships, and engagement in the research, health, patient, and foundation communities, as well as to provide a centralized location for the dissemination of information and resources to support the overall research goals of the Kids First DRC and the Kids First program. The website will provide user support, news, and information, as well as encourage people to use the portal. It will provide real-time summary data for updates, as well as news to keep the community informed about study updates and research. Once users have navigated to the portal, they'll see this landing page, where they can log in with popular identity providers like Facebook and Google. Let the pro do it. Can you get rid of the feedback? Once users are logged into the portal, they'll be taken to a wizard to orient them. They can select a self-identified role and are then taken to the terms and conditions for data access. After reading through these, they are required to check off that they have read and agreed to them, and then they can enter the portal system. I'm just going to navigate to a user that has a little more information filled in.
From here, you can see that the user dashboard has information about things that have been saved and interaction points. Users can also edit their profile with some information about themselves and their interests. The next major setting is the partner integration area. Right now, there are two integrations. The first, from the University of Chicago, is the Gen3 Data Commons, which allows people to gain access to data through the NIH dbGaP program. The second partner is the Seven Bridges Cavatica workflow platform, which will allow you to quickly analyze data from the Kids First file set in an online cloud environment. The last page that users can interact with is the file repository. On the left, you can see highlighted data features that you can browse through, as well as a table of files with some summary stats and actions that you can take on those, including downloading a manifest or a report. Once you've identified a study of interest, you can download that information. You can see that the download has started there. Another thing that you can do is identify multiple studies of interest and then push those files onto the Cavatica platform to analyze the data in the cloud. Before we do that, we'll always make sure that users have permission to access these files. You can see I don't have access to one study, so I won't be able to copy those. But I do have access to CBTTC, so I will be able to copy those. I select my Cavatica project and copy the files right over. Once I'm in Cavatica, I can check and see that all of the files that I just copied over are listed there for me, in addition to any other files I might need for workflow usage, like reference genomes or input data sets that would supplement them. In the Cavatica app area, I can add apps to my workflow area. I can search through public apps and add them to my projects, or I can create my own app. So for Kids First, a custom joint genotyping workflow app has been created.
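The permission check described in the demo, where only files from studies the user is authorized for get copied to the analysis project, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the portal's actual implementation: `PortalFile`, `split_by_access`, and the study and file identifiers are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalFile:
    """Hypothetical record for a file selected in the file repository."""
    file_id: str
    study: str


def split_by_access(files, authorized_studies):
    """Partition selected files into those the user may copy and those they may not.

    `authorized_studies` stands in for whatever set of study identifiers the
    access-control layer reports for the logged-in user (an assumption here).
    """
    allowed = [f for f in files if f.study in authorized_studies]
    denied = [f for f in files if f.study not in authorized_studies]
    return allowed, denied


# Hypothetical selection spanning two studies; the user is authorized for one.
selected = [
    PortalFile("file-001", "STUDY-A"),
    PortalFile("file-002", "STUDY-A"),
    PortalFile("file-003", "STUDY-B"),
]
allowed, denied = split_by_access(selected, authorized_studies={"STUDY-A"})

print("copy to project:", [f.file_id for f in allowed])
print("skipped (no access):", [f.file_id for f in denied])
```

Only the `allowed` list would then be handed to the copy step; the `denied` list is what the interface would report as not copyable.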
I've already set up a demo task that shows what this would look like. So from the task tab in Cavatica, you can see the Kids First joint genotyping workflow that's already been set up. Going into that workflow, you can see all of the steps are here with their files attached. I'm going to select the files that we chose from the portal. So I'm going to be looking for gVCF files that are harmonized. Once I've found the correct files, I can select them and save them to the workflow, which you can now see is resolved. And then I need to name the output. And I can kick off the workflow with Kids First data. You can see that it has queued up. In my task tab, I can see a list of all the running tasks. In my collaborative environment, I can see there are other tasks that are running as well, and even one that has failed, where I might need to go in and fix something about one of the preset inputs or the files I selected.
I was supposed to stop that. And if I can get back to my slides, on the bottom. OK, so I don't know about you, but I'm pretty impressed with what they've built in one year. It's time. And looking forward to the future, this is a very dedicated team. And I think they're doing fantastic work. OK, back. This is a timeline, sort of an overview of the Kids First program. In the first couple of years, we only did sequencing. And then we've continued that and expect to continue it through fiscal year 2012, building the data sets, the data resource. We funded a data resource center, and that funding will at least continue through 2012. We've issued R03s for data analysis, and we'll continue to do that at least through 2012. Pardon? Oh, 2021, sorry, a little dyslexia time here. The last three years of the program, however, are sort of terra incognita. We have no idea what we're gonna do. That's not true. We know a lot of possibilities for what we could do, but we're just now embarking on planning for those years.
One thing we know is that it will be important to demonstrate the value of the data resource we have built. Some possible projects to consider are funds to support phenotypic curation, funds to recruit and/or re-consent cohorts of affected children, and data mining or demonstration projects. And of course, developing animal models is likely to be important. Your suggestions are solicited and very much wanted. Finally, since this is a KOMP collaboration day, I wanna talk about a very recent collaboration. I think we found out about this less than three weeks ago. So it's pretty fresh. We're trying to move at, for NIH standards, the speed of light here to get this up and running. The collaboration is to develop mouse strains to study phenotypes and validate coding and non-coding genetic variants identified from Kids First whole-genome data sets. We want to invite the community to nominate variants identified through analysis of Kids First data sets for mouse model production and phenotyping, to take advantage of the powerful production and phenotyping pipelines that KOMP has developed. We've put together a team of NIH program officers to evaluate the nominations. We call that an administrative review. And if you have a variant to nominate, we'd like you to give us the information on this slide. If you have a series of variants to nominate, you don't need to fill out three pages for each variant, but what we came up with in brainstorming was this set of information. It's likely to evolve as we see what comes in. And if we need more information, we'll come back to you with additional questions. As I said, in order to get this moving quickly, well, the nominations will be reviewed by a subcommittee of NIH staff from the Kids First working group. The variants will be prioritized basically based on the strength and breadth of the supporting evidence. In other words, whatever you tell us is important.
Make your best case. And decisions will be finalized in consultation with the KOMP2 staff. So again, we're moving very rapidly. Our first deadline will be October 26th. And if we need more information, it'll be requested after our preliminary review. If you have technical questions about the variant production, you can go to this email and KOMP staff will answer those questions. Okay, we at Kids First want to use this as a pilot for how to evaluate nominations, and to pilot a process for feedback, to see how this program evolves and how best to go about accumulating nominations, with the idea of maybe in the future having support for producing model organisms using Kids First variants. Mice aren't pictured here, but they're certainly in the mix already, as it were. So finally, I want to put in a plug. We're going to be at the ASHG, the American Society of Human Genetics, annual meeting. We'll have an evening poster session and social, as it were, so that you can meet Kids First cohort PIs. We can talk about collaborations. We can get your feedback on what you think we should be doing with the program. Data resource center staff will be there, as well as people from the Kids First program. We're hoping that this will be a truly interactive meeting, and it's another way we're hoping to have meeting structures that are maximally productive. I have to say, I haven't encountered an NIH group that is as motivated as this group, and we're constantly trying to innovate and tailor our program to use the money Congress has given us to the best effect, and we'd love your feedback. So you can catch me here or you can contact me. You can find out more about the program on the Common Fund website for Kids First. And I guess we're not going to take questions, but you can get to your break now.