Okay, so we saved the last presentation of the day for last partially because this is, I think, just an interesting, fun, different kind of topic, and we thought we could squeeze it into this council meeting. It's something I've been meaning to have a presentation about for probably the better part of a year. We wanted to pick the right council meeting, where this seemed like an appropriate time to do it, and we thought that was the present one. But to really understand it, I want to set a context, because there's a bit of a storyline associated with this. I guess the first thing I'd say is that I take some of what you're going to hear about personally, with a little bit of ownership, because of my strong devotion to the field of genomics, with the field getting named and coming to be literally the year I graduated medical school and graduate school. I grew up, you know, as a postdoc in the Human Genome Project. I just think it's a remarkable field that I'm incredibly proud of, as I know all of you are. So I think it's really important for us to think about how young the discipline is and how a lot of really interesting things unfolded at this Institute that shaped this important field of genomics. Meanwhile, when I came to the Institute 24 or 23 years ago, even as an intramural investigator, I was reasonably impressed with how the Institute seemed committed to trying to capture a lot of the things that were going on. You know, we had a staff photographer, we did a lot of videotaping even back then, and there were a lot of workshops and a lot of documents coming out of workshops, and it just seemed like there were lots of remarkably valuable materials being accumulated at the Institute. So I was impressed with that. But the real story begins in September of 2008, literally when I got a phone call, and, true story, the phone
call I got was from Francis. I was the intramural director; Francis was about to leave the Institute in, like, a week. It was his last week as NHGRI director, and he called me and said, you know, I'm cleaning out my office, and I have to leave all documents behind; that's sort of the rule in the government, you can't take everything. He goes, I'm just coming up with all sorts of really cool notes of mine and summaries and things from the Bermuda meetings, things that are really, really valuable, and I actually don't want to just leave them behind. So I'm thinking maybe I should copy them or do something. And he sort of said, you know, we have a copier over here, but it's not a particularly great one. He knew that in the intramural scientific director's office we had just purchased a reasonably high-throughput scanner for getting ready for site visits and various things like that, and he sort of said, do you think maybe we should be scanning some of this? I said, that's a good idea. I said, look, make a pile of the stuff you think is the most valuable, and I'll send somebody over and we'll at least get this stuff scanned, because he couldn't take any of it. True story: we started scanning a bunch of stuff, and then we put together a bunch of CDs. This is actually one of mine. We made about six or seven copies; I gave him one, I kept one, and I forget what we did with the rest.
We said, at least this is Francis's most precious stuff, as he's literally walking out the door. So that's the beginning of the story. But one thing I remember thinking is that, you know, genomics deserves better than this. This is ridiculous, just to have some CD that was thought about at the last second. And Francis was sitting on a treasure trove of his notes that he kept in his immediate office. It turns out that was just the tip of the iceberg, because what happens if you fast-forward from September 2008 to December 1 of 2009? At that point he has bungee-corded back to the NIH as the NIH director, and meanwhile I had applied for and been appointed to be the NHGRI director. On December 1st I took over his old office, and I walked in and found drawers filled with the stuff that he never got to, because he could only get to what he could in the last couple of hours on the job. They were just incredibly rich files of really cool stuff. But that was just the start of it. By the way, there were boxes in the corner, and every box you opened up was another famous workshop, another interesting thing: videotapes, audiotapes, all sorts of things. And that was just the tip of the iceberg, because then we decided, well, we actually need to renovate the suite. So you start packing up the old offices of various people, and there are just abandoned files and things that are of great historic value, and they're all getting boxed up. And, oh, by the way, we also had people starting to retire, and all of a sudden their offices were getting boxed up. Oh, and then we realized that we had a really nice computer server where lots of stuff was put into shared folders, but we really needed to reorganize, and there was just stuff everywhere. This really started to worry me. And then what I should tell you, and you have to appreciate, is that as boxes start getting moved around, the
government has a lot of rules associated with official government documents. Every one of these pieces of paper is an official government document. You can't just do anything you want with them. These things have to eventually be formally archived and eventually find their way into sort of big archives that look like the last scene of the Indiana Jones Temple of Doom movie, right? And then they're lost forever. I just kept thinking, you know, I'm so worried. And people started coming to me saying, we've got to do something about these boxes, they've got to be moved, they've got to be archived, they're overdue, and I was worried we'd lose track of some of this stuff. So, faced with all of that, and it just kept coming in barrages, I turned to Kris Wetterstrand, who was now working in a role as liaison to the extramural program and had a lot of history at the Institute. I said, what are we going to do? We've got to do something. So Kris and I started figuring out things that she could do to try to gather stuff and organize stuff, and it became apparent very quickly that she was in over her head and I was in over my head. This was just a herculean task, and we were up against some deadlines with respect to literally rooms filled with boxes accumulated from staff that were important as far as I was concerned. And yet these were going to have to go off to government archives if we didn't get our house more in order. So I said, you know, we really need some help. You look around NIH, and the truth is NIH doesn't do this very well.
There are pockets of NIH; the National Library of Medicine does some of it. But if you look at most institutes, they just don't have a particularly strong infrastructure for this, and the little bit of infrastructure there is gets made smaller every time there's a budget cut. And it's just not campus-wide. We had to solve this problem ourselves. So I said we should hire somebody who actually knows what they're doing, a professional who knows how to archive and how to take advantage of modern technologies to get all of these, I think, valuable assets into a form that would be safe, and also into a form that could be utilized by people who like to study these things and want to have a good accounting of the history of the field and of the Institute, both of which I thought deserved something better than a government archive. And so, through really just somebody who knew somebody and a little bit of luck, we identified Chris Donohue. I'm not even sure he was a doctor then; he was just finishing his doctorate in the history of science at the University of Maryland. He fit the bill: somebody who had expertise, was energetic, and thought what we had in our collection was incredibly cool. So I hired Chris and, no pun intended, the rest is history. I just said, you own this, you tell us how to do this. I think this will be a very exciting opportunity for you, not only to create a strong infrastructure, first, so we don't lose this stuff, but also to do scholarly work around this and stimulate scholarly work around this. So he's going to tell you about what he's accomplished since coming to the Institute. I'll be candid: this is somewhat unprecedented.
Most other institutes haven't done something like this. But what the hell, you know, that's what we do all the time at NHGRI: we do things that are unprecedented, and it has served us well. So I think it's in that spirit that we decided to create an internal history program, the History of Genomics Program, which Chris Donohue leads. That's what he's going to summarize for you. In bringing this to council, there are really two things. One, I wanted you to be aware of this; you may even be interested, or you may know people who are interested in accessing some of these materials we've put together, because that's why we have them organized. But also, we'd be very receptive to your feedback because, again, this being so unprecedented, there are lots of ideas of things we might or might not want to do. It's still a very young program, and it's sort of out there without much precedent to guide its future. So with that as a context, I'll turn this over to Chris.
Well, thank you, Eric. All right, so to begin, this is really what we started with. This is not an actual picture, but it's a good representation. As Eric said, if we did nothing, the files would have gone into sort of the gray box of the National Archives. So we embarked on a flurry of digitization from 2012 to 2013. This would enable us to keep a copy of our history, a copy of the files. At the same time, we began thinking about the various ways to capture the history of the Institute. So even beyond digitization, beyond capturing, we were really thinking about starting an oral history effort. Other ideas were proposed, and in 2014 the History of Genomics Program was actually born. The program has three goals, around which this talk is designed. The first is getting the NHGRI organized. Here I'll describe how we succeeded in capturing the rich history of the HGP, genomics, and the NHGRI's role through document digitization, file capture, and video archiving. Goal two: next I will outline the process of making these materials available to outside scholars. This includes, first, a significant database initiative, which I'll describe in some detail, and second, gathering input from outside scholars. And lastly, the History of Genomics Program now has a regular seminar series, has hosted an anniversary symposium, and now conducts oral histories to augment and capitalize upon the rich history of genomics. But I'll start with getting the NHGRI organized.
Starting in 2012, we created a process for digitizing and scanning, inventorying, and cataloging our history for eventual storage and retrieval. This also included a culture change of getting stuff when people leave; working with program staff especially to find historical program files, which were then, as Eric mentioned, in office filing cabinets; asking where historical program files, really the historical work that programs had been doing in the Human Genome Project, could be found in digital form on our various shared drives; and working with analysts and program staff to organize program files so that they could be accessed later. So here are two really cool examples from our digitization efforts. The first is an example document from the prehistory of the HGP and of the NHGRI: this is the first meeting of the Program Advisory Committee on the Human Genome in January 1989. I also have the agenda for the first meeting of this group, the National Advisory Council for Human Genome Research, from January 1991. After 80 meetings, Bettie Graham is still here, and the agenda is pretty much formatted in the same way. Open session was shorter back then, so that is a change. Next, I would like to give everyone a very quick overview of the scale and scope of the archive. We have about 1 million pages of digitized paper documents. 700,000 of those pages are from directors Watson and Collins and acting directors such as Michael Gottesman. We also have 300,000 pages from retired staff; retired staff include Elke Jordan, Mark Guyer, Jeff Schloss, and Jane Peterson. We also have born-digital documents, which total about 25 million pages. This is the vast, vast majority of our archive. Born digital means Microsoft Word, PowerPoint, and Excel documents from our various shared drives at the Institute, covering extramural scientific business mostly, but not exclusively, after 2005. We also have a growth rate of 800,000 pages of born-digital files per year.
So this is an expanding archive. It is about the same size as an archive for a medium-sized research university or a medium-sized state school, so it is a significant archive. Okay. We are also actively working on video and audio archiving. We've digitized about 300 videos so far. These include program meetings, symposium talks, and TV presentations from folks like Francis Collins and Jim Watson. We've also digitized some documentaries and educational materials. All right, so now two brief examples, checking the sound: one serious and one silly. Oops, that wasn't supposed to happen. "Using the powerful tools of the Human Genome Project, researchers at the NIH have identified for the first time a specific gene which, when misspelled in a subtle and particular way, confers a very high risk to that person of developing Parkinson's disease." So this is the Parkinson's gene identification announcement, and the second is Francis in Gene Spicer. I'll stop that. Very importantly, all of these videos are ones that either no one has or no one has in a convenient place. Many of these materials are from the 1990s, but we also have recordings from the 2000s up to the present day.
So it's a pretty significant video archive. Okay, so, goal two: making these materials available. As the archive grew larger and larger, it became obvious that it was extremely important to enable access to these files. To that end, as I noted, the History of Genomics Program began thinking very seriously about the technological, logistical, and regulatory challenges of making these materials available to outside scholars. The process began in 2014 in consultation with Microsoft, through which it became apparent that, for our needs, building a SharePoint database was our best option, and it ended with a database launch in April of 2016 in a pilot phase. So next I'd like to give you an overview of the database resource that we've developed for scholarly access to our files. But first I want to give you the distinction between the database and the archive. The archive is all of the files that we've saved, including the materials that we've digitized. The database is all the files that we are hoping to eventually make available to scholars. In consultation with program staff, we select and add metadata to all files in the database and organize them so that they're keyword searchable. As of right now there are about 300, uh, 35,000 files organized for database use at some time. We have 2,500 files that are in the database right now, with plans to put another thousand in the database with metadata this year. I'm emphasizing that this is a very, very secure database. We worked with Microsoft and also NIH CIT to make this a secure cloud instance. The metadata is extremely extensive, and the whole resource is browsable and searchable. Researchers have to apply to access this database, and we do not include files that are in any way confidential. All right, and here's the top level of the database as it now stands. Examples of file categories include general HGP history, bioinformatics, MODs, comparative and organismal
sequencing, technology development, mapping, especially early mapping (I love it, early mapping), human variation programs, large-scale sequencing, and ENCODE. Almost all of these files are from before 2010. The database also includes files curated and given metadata in response to requests from outside researchers. If there is a specific interest from a researcher, we will curate and make those files available with metadata for that researcher in about four to six weeks' time. Okay, next I'd like to give you a more detailed picture of how the database is organized. The first picture illustrates the folder substructure for materials around the HapMap project; users can browse after finding the folder that they want. The second illustration details the database at the file level, with metadata, which can be searched by keyword, which is, as you know, another way to find files. Metadata inputted by staff, especially for the files that are highly requested, is extensive. The reason for the metadata is so that enrolled individuals can find a single file instantly and with no problems. Often these researchers know a lot about genomics and the HGP, but they don't know precisely how we do business on a day-to-day level, so some of our nomenclature may be a little unclear, and some of the metadata is a bit of an educational process for them. But after some back and forth between myself and the researchers, they are able to find things very, very quickly because of this very extensive, very custom metadata. So this is a map of all the enrolled researchers in the database; it shows all the institutions where the enrolled researchers are located.
In a few cases, we have multiple individuals from one institution enrolled. It's a truly international enrollment, covering Western Europe, the US, the UK, Southeast Asia, and Australia. At the moment we have 10 to 15 active researchers, and the goal of the second phase of database development moving forward is to increase enrollment to 100 researchers. We certainly see that as possible in the next year, so there is high demand for this database. To move to the other side of this engagement with scholars, which is not only giving them access to the files that they need for their scholarly work but also getting feedback from them, and consistent with general outreach in terms of what the program does, the program held a two-day workshop entitled Capturing the History of Genomics here at the NIH in 2015. This brought a number of prominent scholars in genomics and molecular biology to spend time working in the archives, to meet with program staff, and to advise on database development efforts and the future directions for the program. Out of this meeting we received a clear mandate to push forward with our database development efforts and to assist scholars in publishing work on the history of genomics. Also out of this meeting came both special issues listed on the next slide. All right. So both the conference and our database development have led to two special issues focusing on the history of genetics and genomics, in the Journal of the History of Biology, which is on the right, and Studies in the History and Philosophy of Biology and the Biomedical Sciences, on the left, which are both top journals in the field. They will be published in late 2017 and early 2018. Also, the fact that they're sort of a little bit rivals in the field and might be published around the same time is nice. These special issues
I think will exemplify how historians using our files are moving towards what I would call a more evidence-based history of genomics and of the HGP. Okay, and goal three. As I noted earlier, the final goal of the program is to create our own history by actively hosting events, such as an ongoing lecture series as well as an anniversary symposium, and by chronicling in multiple ways the experiences of retired staff and key figures in the genomics community, basically through developing oral histories. Our first effort in this regard is the History of Molecular Biology and Genomics lecture series. We've already hosted three lectures, on topics as diverse as theoretical biology and genomics and the changing understandings of genetics and disability, and there will be three in the summer and fall months of this year. The scholars we invite are typically those that we want to encourage to move more definitively into writing on the history and social implications of genomics research, rather than just the history of biology or the history of evo-devo. To these ends, scholars visit for a few days, meet with staff, and give a talk in the lecture series. This provides ample time for informal discussion about the History of Genomics Program and its efforts, and speakers, to a person, have emerged extremely enthusiastic about the program's efforts, emphasizing its uniqueness and the variety of things that it has done. So it's a nice coaching mechanism, and it's also a nice engagement mechanism. Just as importantly, in order to commemorate the 25th anniversary of the launch of the Human Genome Project, the NHGRI History of Genomics Program hosted a seminar series entitled A Quarter Century after the Human Genome Project's Launch: Lessons Beyond the Base Pairs, featuring HGP participants. The first of the six seminars featured a panel discussion involving Elke Jordan and Mark Guyer that Eric moderated, which is shown on the right. Other lectures included those of
Maynard Olson, Ewan Birney, Bob Cook-Deegan, Marco Marra, and David Bentley, so all well-known names. And last but certainly not least, since 2014 we have actively conducted oral histories, where I ask program staff, intramural investigators, extramural grantees, and leaders in the genomics field to discuss their life and work. Most are about 90 minutes long when edited for posting on GenomeTV. Many, like Maynard Olson's, are quite a bit longer. We've completed 35 so far, and many more are planned. The first set of five, featuring Ewan Birney, David Bentley, Howard McLeod, and living legend Maynard Olson, are now on GenomeTV on YouTube. Also included are recordings of two panel discussions with current and former NHGRI directors. Upcoming oral histories to be posted on GenomeTV include a recent one with George Church, which is really interesting, that I did not so long ago. We've also partnered with two biotech and, basically, biology history organizations, the Life Sciences Foundation and the Chemical Heritage Foundation, and they've done five oral histories with us. All right, so I'd like to play a clip of Elke Jordan, who is the former deputy director of the NHGRI, on the role of Bernadine Healy in the early years of the HGP. This is one of my favorite quotes. She did, I must grant that to her, she did recognize that this was a very valuable program and the community was behind it.
So she didn't try to kill it, which she could have tried to do. So she... Yeah, so that sort of exemplifies a lot of the interesting things that were going on in and around 1992. And I'd like to outline some of the key goals moving forward. First and foremost is to expand database content and to ensure that the database has sufficient coverage of all areas of genomics history. Next is to expand the database user base, although we're not going to need a whole lot of encouragement from this side; we're getting a lot of encouragement from the community. And, as much as possible, to improve the user experience inside the database, especially the efficiency of the keyword searches, because if it has not already, it will certainly become, in a year or so, a lot to browse and search through. So we're really thinking about how to make searches as efficient as possible. To these ends, in July this year we are hosting our first archive database users meeting, just like the users meetings of other consortia. Here we will ask some of the heaviest users of our database questions like: What areas do you want covered? What areas do you think need more coverage? What do you like and dislike about file search, about the interface, about viewing?
And the meeting, I think, will give us a really clear picture of how to move forward with the second phase of our database design. I think it will also give us an opportunity, at least on this side, to try to move the historical community into topics that we think are really well represented in our archive holdings here and are not represented in the historiography outside. So it's sort of a push-pull. We are also going to actively court leaders in the genomics field to sit down for oral histories. Last, in September this year, we will also be hosting a handwriting lab, during which we will begin the process of transcribing, and making machine readable and searchable, all of Francis Collins's handwritten notes. There are thousands of pages in our archives that have been digitized, but they can't actually be keyword searched at the moment because they're handwritten. During this lab we will develop and implement a pipeline for the transcription process. That way I don't have to, you know, basically interpret these notes for researchers; it's basically going to be part of the database. An example of this is from the prehistory of the HapMap project in 2001, and it reads: this is a medical project, useful in all people, not the HGDP. That is, not the Human Genome Diversity Project, which was an older, different effort than the HapMap, focusing on the interaction between genes and cultural evolution.
I see this handwriting lab that's going to come up in September as an opportunity for graduate students. They will not only aid us in capturing the history, but they will also be able to use these notes and these materials in their time here to write their papers and their dissertations. Okay, I need to note that the only full-time staff on this project are myself and an archivist, so it wouldn't be possible to do this without the help of many, many people across the Institute. So Chris and I would first like to thank the folks who have been closely involved in the program from the beginning. Adam Felsenfeld and Lisa Brooks from extramural, as well as Larry Brody from Genomics and Society, have been really important in terms of guiding this program and guiding me as I'm here. We would also like to thank folks from the communications branch, who really helped with our oral histories and getting them on GenomeTV; folks in the Division of Management, so that we are in full compliance with federal records laws; but especially the core team, including Edson, who is our database project manager, and Eric. Okay, thank you. Questions?
So while you're thinking of questions or comments, I might make a couple of comments just from listening to this. The first thing is, just at a very practical level, I think it's highly illustrative to see how far our internal culture has come for capturing stuff: watching the recent retirement of Jeff Schloss, who was sitting on this incredible historical set of documents related to the $1,000 genome efforts and his major role in it, and just watching his office get deconstructed while making sure nothing got lost.
I mean, it was a very seamless thing, compared to where we were, I don't know, three, four, five years ago, when people like Mark Guyer and Jane Peterson retired. We faced a similar circumstance then, but it wasn't quite as easy or straightforward. So that was a great example of how we've advanced just in internal efficiencies. The other thing I'd say is I have found it fascinating to interface with the history of science field. I mean, it's a different cultural group. Some of you might have interacted with these scholars over the years, but it's very different; Chris Donohue has taught me a lot about this. They just have a different academic environment. Nonetheless, as Chris mentioned, they're extremely enthusiastic about what we're creating, and what we end up doing is leveraging a lot of their energies. Graduate students and even some of these scholars, if we can put some minimal infrastructure together, are the ones coming in and putting in the time to sift through it, to help annotate it, to do lots of things. So it really is about leveraging, which I think has proven to be very effective through some of these outreach efforts that Chris talked about. So those are the two points I wanted to make.
How much of this is currently machine readable? In terms of, like, how much of it can you search?
Yeah, so you can search about 400,000 files at the moment.
It feels like searching the electronic record, looking for particular documents. I mean, you know, we have a capability at our place where anybody can just ask the question: how many records do we have that have X and Y and Z? If the number is less than five, then we don't tell them how many there are, but if the number is greater than five, we say, yeah, we have 17 records that do this. It feels like the technology ought to be very similar; you could steal, or sort of adapt, our ideas.
Sure. Yeah, we're always looking to adapt.
I mean, the other thing, of course, is just the way they have the files organized. It turns out you can come in very generically or you can come in very specifically. I think, did you mention the topic of one of the special issues?
It's about, basically, genetics and anthropology.
But what about the special one you're doing on HapMap?
Yeah, there's a huge effort underway on, basically, HapMap.
So you just take HapMap as a discrete project. You can do something similar with ENCODE; you can do something similar with a number of them. We take HapMap, and, you know, even without doing searching, you have these organized folders that you can sort of work your way down: start at the beginning with some of the earliest planning meetings, some of the initial set of grants, some of the earliest...
Yeah, well, then there's that, right. And the one drawback, a little bit, with this number of files is that I QC'd all this stuff and I organized all this stuff that we basically made machine readable. So I know where everything is in terms of that huge bank of about 400,000 files, and I can instantly tell you, maybe not the floor of the house, but certainly the address of what you're looking for. It's basically just because I've spent all this time organizing this archive, and because I have all this sort of tacit knowledge. But we are getting to a point where the archive is big enough that even that is starting to strain a little bit.
But yeah, so I saw Eric and Val and Gail. Eric?
First, thank you for your effort and the presentation. The sense I get is that this is more of a history of genomics and less of a history of NHGRI, but I...
It's both.
Okay, it's both, because I remember a fascinating conversation in this room with Mark Guyer about the early discussions of whether there should be a genome institute, right, and I hope that's captured. My second point is, I hope the international flair of the genome project is also captured, because I'm afraid in some very large initiatives today we could use some of that, and we seem to have lost it. In my opinion, in some of our very large efforts, we need to be cooperating more, particularly with the UK, than we are today. I wasn't around during that history of how those early days of cooperation and some competition on the genome projects were born, but it'd be interesting to see how that was initiated and how it came to fruition because, again, I think we could use it today.
So there is an archiving project and history project being undertaken by the Wellcome Trust. Also, the Beijing Genomics Institute does have an effort that's sort of beginning, and Henry Yang has in some ways started sounding the alarm bells about the international history of the Human Genome Project. But I do have a concern, just for other country-specific efforts, about whether those national flavors are being lost, although we do have some records even from the entries into the Human Genome Project that never quite got to be entries, the famous example being the Russian, or Soviet, entry into the Human Genome Project. We do have materials from that side, and we have country representation. But I've been sounding some bells, not exactly alarm bells, trying to figure out, you know, France, Germany, what's going on in those countries? And I share your query about things being lost.
We're just waiting for the Russian stuff to come out of WikiLeaks. We're not doing anything about it. We may not need to wait, really. Yeah, so Val, and Gail, and Carol. Yeah, so a couple of questions I had. One was: is there a possibility for a small amount of funding for, like, a fellows program, think postdoc-level people, who might be interested in coming and working with you, funded by NHGRI? So that's one question. The second question regards the application, not so much the process of the application but the principles in applying for getting access. For example, do you have to propose something that's genomics history, or if, for example, I or someone else wanted to write a biography on, say, Eric Green, could we apply? Absolutely. Go ahead, you can answer it. So, right. Claire Driscoll, who's in our technology transfer office, and I developed basically a database researcher user agreement. And basically, previous to really this year, we only invited researchers into the database who were working on these two special issues, and we started very, very small. And what we're thinking about, and this is still up to some debate, is expanding that out to a bit of a broader researcher community, where you're not specifically working on a genomics special issue. But at this point, just because of the nature of some of those files, you have to be enrolled. You know, basically you have to be working as a history of science or a history professor at an institution where the institution signs for you and basically says: we know that these files that you are giving us of Human Genome Project history are your property, that we're just using as part of a researcher collaborator agreement, and we will not use them, you know, basically poorly, and only for the stated purpose that we said, like a biography of Eric or a biography of Francis Collins or something like that. So it's not onerous, but it is a very narrow series of things
that people agree to when they apply. And you can't just sort of email me and say, hey, I'm just some guy who's interested in the history of genomics. You have to, you know, at the very least have a very specific project in mind, and you have to be with an institution where I know that you can undertake that project, and so on and so forth. But the intention, I think, part of Val's question... The intention, you know, we had to start this small, make sure we're comfortable with all that. The intention is to grow the universe. I mean, you said you have 25 or so doing it. To grow the community as we get better. It's just sort of keeping a watchful eye on this. But yes, we want to, and we don't want to just have people work on stuff that we want them to work on. We want them to bring ideas of things that we have never thought of. Right. So, not to interrupt, but one thing that we are seeing a lot of is that there are centers, like in Edinburgh, that do history of genomics. And I've gotten emails from people that I know very well saying, I have four postdocs that are really interested in the history of large-scale sequencing, or, one example being, you know, the history of murine sequencing. Can you enroll all of these people into your database, who are all part of this group who will collectively work on a history of murine sequencing? I'm saying yes, we can do that. And we're going to see more and more of basically these labs, you know, a PI plus a bunch of postdocs working on a very, very big funded project, and all those people under that umbrella will be enrolled into the database. So do you want to quickly answer Val's question about possible trainees, which I'm sure you'd love? I would love it. Yeah, that would be a yes.
We've talked about it various ways. But so far, a lot of times folks sort of already are funded, already are part of academic programs. It's just a matter of, they come here and work. Oh, yeah, but you gave him ammunition to really get after me, because he's floated this idea before. So I think I have Gail, and then Carol, and then Jay, and then Sharon. Sorry. So, you all know Jim Watson said, you know, for the ELSI program, that there should be money to study unknown consequences, right? And so I was reading Lori Andrews, who some of you may know, who in her 1999 book quoted him as saying, "I wanted a group that would talk and talk and never get anything done." Well, so the history of the... It's supposed to be a laughing point, Eric. Well, I bet he said that. But he also said other things which were much more inspiring, about wanting to think about social impacts, and we have lots of those quotes. So the idea, though, was just to ask you about your documents and history of the ELSI program, and whether people are working on that kind of thing as well, because it's a really unique program in science and policy. And so I just thought you guys would enjoy a little something funny at 4:30 in the afternoon. Yes. But reviewing some of those notes, we find those sorts of bits. You find those things all the time. Okay, fine.
Anyway, so in fact I suspect you've got lots on the ELSI program. Yes, so we've got lots on the ELSI program. I could give you kind of a percentage of what the ELSI program is in our digitized paper files, and it's about 24 percent. So we also have most of Joy Boyer's files, which we digitized a while back for safekeeping. I also know that multiple people, some are at Hopkins, some are working with Susan Lindee at UPenn, are very, very interested in doing the history of ELSI, or doing an ELSI history. And also, you know, occasionally we get individuals from UNC social medicine and Eric Juengst's group. Yeah, right. That's right. Yeah, so I actually have Eric Juengst's table of contents for his history of ELSI. And we also interviewed Elizabeth Thomson before she passed away. So ELSI history is extremely well represented in our archives. There are scholars that are extremely interested in doing that type of work. We haven't seen quite the, basically, the coming together of a very, very focused research group in my field similar to what's occurring over HapMap, also what's occurring over technology development, and also ENCODE, and also a little bit... There's the mod squad. They're down in fit. Uh, sydney. They do model organisms, the databases. I think we will see a lot of discussion of ELSI history in about two or three years. Carol? So, kind of following what Val was saying, I wonder if you've contacted the Smithsonian, the American History Museum, for potential interns or volunteers to help us. I mean, they're like our nation's archivists, so they might have a lot to offer in a project like this. So that was number one. And the second is, there's, you know, the history of NHGRI in the context of genomics history. So there's a lot of history about genomics that occurred outside NHGRI, including the coining of the term genomics, right?
Which was coined by Tom Roderick of the Jackson Lab at Testa's restaurant in Bar Harbor, Maine. So, I mean, Tom has unfortunately passed away, but his wife probably has some of his notes. He kept amazing notebooks, right, which might be relevant to this but aren't documents that come from the boxes of people leaving and retiring from NHGRI. So is there a mechanism to get some of those documents into this kind of history? So, not at this time. But I'm certainly not alone, and we've been thinking about that for a really long time. So Ari Patrinos has all of his notebooks, and I go, really? And he says, oh yeah, but they're in Greek. I said, well, that's no problem. So we have been thinking about, you know, there is a huge, wide world of genomics outside of this institute and outside the things that we've funded. The issue is two full-time staff members and also just the amount of hours in the day. And I think there might be a point, and there might be a mechanism, where if you're, you know, basically looking for a repository of last resort, we may be, basically, the repository that you're looking for. No, I mean, I think we'd be receptive. I mean, we don't have an active program to grab stuff. There are practical limitations, but I think we'd be receptive, especially for an example like that. Let's keep going. Jay? I'm just going to... well, just say I think this is fantastic that you all had the presence of mind to do this right now, and in the long view of history I think this will pay off. Maybe you have this in your presentation somewhere and I missed it, but it seems like a lot of the most interesting dialogues, you know, from a historical perspective, would be not necessarily captured in printed documents, but rather verbal or email. And I'm just curious about what the considerations are.
I mean, imagine just all of Francis's back-and-forth transactions in itself, or Jim's, would... it's, you know. So we do have a lot of email from Francis and from some other people as well. We have a pretty established method for, if you are sort of at the division level, say for example Jeff Schloss. Jeff, before he left, basically went through his email and said, these are the historically important ones that capture the conversations that you're thinking about; here they are. They go into the archive. So we have email capture. And we also have email from various directors and also various staff who kept copies of the email, things like that, in a way that sort of captures the substratum that you're discussing. We also have a lot of... we have Airlie too, all the videotape from that. I mean, so there's sort of a lot of things that together, I think, paint a little more of a composite picture than what I presented sort of very quickly. Does that answer your...? Okay. To the best of our ability. To the best of our ability, right.
Yeah, so Sharon? Yeah, I think I had a somewhat similar question to Carol's, which is that it would be nice, at least for the initial genome sequencing centers, to see if they have documents on their side to add to the archive. Because, you know, most of our universities are not archiving our work, and I would think they may have documents that are interesting and may not have shown up in your collection. Probably a lot of documents complaining about us. That'd be good to hear. But they might be, you know, I don't know. Yeah. So MIT does have a very extensive science and society program, and I have talked with historians at MIT, and I have been in the room with MIT archivists and MIT librarians. And that was the big alarm bell that I sounded last year about, basically, sequencing centers losing their history. But it's just not there quite yet. No, I mean, I meant them contributing to you. So, I mean, Baylor does not have such... We don't have a lot of archiving. So I'm just saying I think it's something you might consider, if some of the centers... Well, I just meant that there might be documents that they're willing to donate to the archive. No, I think the point is well taken, maybe, because that would not necessarily require a huge amount of effort. It could be the kind of thing to sensitize people: these are the kinds of things that, if you have them, you know, you could quickly scan and send to us, and we could intake and make part of the archive. We could think about it. A good suggestion. Yeah, but we also don't want to run afoul of their own sort of archival practices, if they exist. So if they... Well, somewhere it exists.
Yeah, Eric. Eric, I think this is more of a question for you, and it's a bit difficult. This is an institute that has promoted public release of data. We've engaged people and asked them to release their genome to the public. Then why aren't we making these records publicly available, rather than making people jump through hoops and document their particular question? Just, why aren't all of these records publicly available? Or is that the long term? Well, I think the long term is to make as many of them available as possible. Second of all, I think this is analogous to a controlled access situation, because, you know, we're making them available to qualified researchers. We have a fairly low bar for what's qualified. And then the other aspect of it is that we have to be a little bit careful. These are government records, and they include some information that we haven't thoroughly convinced ourselves should be publicly released, and they may never be publicly released. So to have them at least, you know, in an archive is valuable. But just a wholesale open-everything-up would potentially run into other problems. And so that's why I'm sort of trying to find a middle ground. You know, I've been thinking about the public face of this in regards to sort of our data release norms for five years. So it's a struggle.
And I think eventually... The other thing is, to make these things totally public requires a differing, basically, public face and a differing level of metadata. I mean, you know, if you want to make it searchable, you can't have the same sort of metadata that you have for just the internal database. So there are some technological and logistical challenges to this as well. But the goal is to make at least some of these files publicly available, so that there isn't that existing paradox. But I think this is an area that it's good to keep in mind and just keep pushing ourselves on, because again, we look people in the eye and ask them to make their genome public, and make an argument why that's good for society. And I think then, when it comes to our own selves, we should follow those same ideals. What's challenging with historical records in this context is that at the time that people made certain statements and wrote emails and documents, they didn't know you were asking them that, so you don't have the equivalent of informed consent. Yeah, they're all accessible under Freedom of Information, indeed. Well, I'm far from an expert, but on certain things that are solicited under FOIA, the subject gets a chance to comment. Things might be redacted. You might do all of that ahead of time, and there might be nothing sensitive there, I have no idea. I was just pointing out that that's the difference between asking people ahead of time and having things done. So I just wondered, based on your second slide where you show all the archives, or Raiders of the Lost Ark: if you digitize these, can you get rid of the paper copies? So the paper copies are all in the National Archives. They all go there. We're totally compliant with all federal records laws. So if you really want to look at the paper copy, you can go in there, or you can come here. It's your choice. Yeah. Oh god.
Don't make me watch. So I think we're probably... all right. Thank you very much. Now I have something far less appealing to offer.