Alright, thanks Adam, thanks for the opportunity to be here. I decided to talk about something completely different, not in my area of expertise in the lab, but something that I think is a much more focused project than the grand, let's-understand-all-of-life projects that we've been hearing about, but still, I think, right. So many of you are familiar with the way the adaptive immune system works. Vertebrates are happy to have a unique type of immune system, with T cells and B cells; with B cells you'd be thinking about antibodies and so forth. The T cells give what's called the cell-mediated response, and there are a couple of different types, but this cartoon shows the kind of interaction that we'll be talking about. There's a specific type of T cell, CD8-positive, that reacts to this class I MHC, and I'm not going to go through all the details.

So basically the background is that almost every cell in your body presents antigens through its HLA molecules, as you see in this picture. This peptide, the green bar, is just 8 to 10 amino acids, chopped out of one of the proteins being made inside the cell, so you can think of the cells as being forced to display a snapshot of what's going on inside them at any point, in terms of all of these little peptides. When a T cell has a T-cell receptor that recognizes that peptide, other interactions ensue that I won't talk about, but some of those can lead to destruction of that cell if indeed something bad is going on. That's the idea: we have constant surveillance for that. And you could say, well, what if a cell isn't presenting anything at all? Then it's a different system that gets suspicious: the natural killer cells are suspicious if you're not displaying anything at all. So there's a kind of interrogation system going on all the time, and that's a very powerful thing.
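As a rough illustration of the "snapshot" idea, and not something shown in the talk, the sketch below lists every contiguous 8- to 10-residue window of a protein sequence. The function name and the toy sequence are made up for illustration. Real antigen processing is selective, so this only enumerates the candidates a cell could in principle chop out, not the peptides it actually displays.

```python
def candidate_peptides(protein, min_len=8, max_len=10):
    """Return every contiguous window of `protein` with length min_len..max_len.

    Hypothetical helper: it enumerates candidates only and makes no attempt
    to model which peptides are really processed and displayed.
    """
    peptides = []
    for k in range(min_len, max_len + 1):
        # A sequence of length n has n - k + 1 windows of length k (none if n < k).
        for start in range(len(protein) - k + 1):
            peptides.append(protein[start:start + k])
    return peptides


# Toy 12-residue sequence, invented for this example.
toy_protein = "MKTAYIAKQRQI"
toy_candidates = candidate_peptides(toy_protein)
print(len(toy_candidates), toy_candidates[:3])
```

Even this toy sequence yields a dozen candidates, which is why deciding which candidates actually get displayed is the interesting part.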
So I think the immune system has enormous potential, and I think we're really on the brink of understanding how to control that potential, which makes a large-scale, resource-type project very appealing at this point, because the accuracy of our impending control of the immune system will depend on understanding precisely what kinds of antigens are displayed in this way and what the response is. The combinatorics of this kind of antigen display by class I MHC and recognition by CD8-positive cells are simplified by the fact that this is a short peptide, 8 to 10 amino acids. And while we can make an enormous number of T-cell recognition molecules, we're finding that we're also able to tap into the diversity of those and actually start to think about documenting it. So I think it's a large combinatorial space, but not so large that we couldn't attack it with a coordinated approach, and it would certainly be better to start with T cells than with the B-cell antibody problem, which is a much larger combinatorial space.

So, control: what would it mean if we could control the cell-mediated response? It would have profound applications in infectious disease, obviously; that's the first thing you think about, the body's response to any agent that is able to infect the cell and create abnormal peptides. Cancer is another case, and if you were at the AACR meeting, the immunotherapy approaches to cancer were the darling of the meeting. What can I say, it's a very exciting area: after being shunned for many years as hopeless, it has had a spectacular resurgence. And of course autoimmunity, aging, stem cell transplant; there are many, many applications across medicine. If we could control the cell-mediated immune system, we could do amazing things. So I think it's a worthy scientific challenge to think about this. It's clearly a genomics-proteomics hybrid project, but the genomics part is very important.
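To put a number on "large but not too large", here is a minimal sketch of the size of the raw peptide space, assuming only the 20 standard amino acids and ignoring everything that makes most sequences irrelevant in practice (processing, HLA binding affinity). The names are illustrative.

```python
# Assumption: peptides are built from the 20 standard amino acids only.
NUM_AMINO_ACIDS = 20


def peptide_space_size(length):
    """Number of distinct amino-acid sequences of the given length."""
    return NUM_AMINO_ACIDS ** length


for length in (8, 9, 10):
    print(f"{length}-mers: {peptide_space_size(length):,}")
```

For 9-mers this comes to 512,000,000,000, about 512 billion sequences. That is an upper bound on the table to be filled in, not the number of peptides any one HLA molecule will actually bind.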
So for example, you can think of the concrete question: given its genome and its RNA expression, predict the peptide antigens that will be displayed by a cell. Many of you may not know that there is already software doing this quite well. I particularly like the program NetMHC, out of the Danish group; I'm hearing from colleagues who are actually in the field using these tools that it does a pretty good job. There are various processes in the cell that decide which of these 8-mers are actually going to be chopped out and which will successfully be displayed. That is a tractable problem that we could get very, very good at if we tried, so it's definitely not a hopeless thing. You also want to know which HLA alleles are present, which we can figure out from the genome, and whether all the pieces are in place to display these peptides.

The second part is: predict whether a T-cell receptor will recognize a given displayed peptide antigen, given the sequences of the receptor, the peptide, and the HLA molecule that displays it. This is a much more ambitious but, I think, not out-of-reach question. There aren't that many combinations, and I think similar large-scale, big-data approaches could address it. If we could do that, then we could possibly create a master reference for recurrent human T-cell recognition events, accompanied perhaps by live cell models where we could actually experiment with these recognition events, learn how they are controlled, and ultimately create a predictive model for this process.

So what would be the data elements that NHGRI would actually supply? First of all, we need accurate sequencing of the MHC region and an understanding of the key molecules, the HLA molecules and the other immune-related molecules that contribute to this process. We still aren't as good at sequencing the MHC region as we should be. We would need mutant cell panels so we can do
the experiments with this, and there are many methods to do that. It's exciting that we can do single-cell RNA sequencing now, because you would really want to know this at that level. As we learned looking at Aviv's brilliant paper recently, there are so many different ways in which cancer cells, for example, vary from one to another, and it's great to be able to see that at a single-cell level. If we can then determine computationally, from whole-genome sequencing and all this other information, what peptide antigens are displayed, then we would be at the next level. We would also like to have information about local immune environments and time courses in vaccination, infection, during cancer treatment, and so forth.

Who would participate? Obviously NIAID is a great collaborator in this case, and NCI increasingly, because cancer immunotherapy is so hot at this point. You might even consider going outside of the NIH and collaborating with some of the patient advocacy organizations concerned with diseases that have a profound immune component or that may have an immunotherapeutic solution.

I tried to gather all the projects I could find that are already going along these lines, and I apologize deeply if I left you off the list. Rick Myers is here; he's got a great project at HudsonAlpha. There are already several groups doing T-cell receptor sequencing. In other words, you can take a blood sample and I can tell you all of the different clonal expansions of T-cell receptors that are in that blood sample, and that gives you essentially a snapshot of what the patient's T cells are configured to do at that given point. This is a great project that's doing that in many different areas. There are two commercial companies that do that as a service as well. Mark Davis at Stanford has, through NIAID funding, a genome core that is collecting these data and other data that would be relevant to this. NIAID has an immune
epitope database, where you have examples of experimentally validated cases in which certain peptides were displayed in certain disease or normal conditions, and I mentioned the NetMHC program, which makes very good predictions that are then validated by these experiments. They have a large database called ImmPort, and I found that to be a little bit of a grab bag. I think we could do much better informatics; we could do this right if we really pulled all these data together. TCGA of course has all these full tumor genomes, and we could predict the antigens being displayed by all of the tumors from this process. And we're trying to pull together something at UCSC that would allow you to pull in literature and other ideas based on that.

So as I said, the combinatorics are not all that outrageous. A typical MHC class I displayed peptide is about 9 amino acids, so 512 billion possibilities, but only a small fraction of these will actually have the affinity to nest in any particular HLA molecule, and we're getting to the point where we can make that prediction quite accurately. We sequence the TCR beta chain, and the highly variable and relevant part of that is the CDR3, the receptor part, which is coupled to the alpha. If you look at the CDR3 beta alone, one person has 100 million different versions of that circulating in their blood, but only a few of them are actually clonally expanded, indicating that they responded to something and have been expanded, maybe for a purpose, and we can measure that quite accurately. So I think the combinatorics of this, while large, are not so frightening for me, putting on my big-data hat. If you think about the problem of speech recognition, or any of these other problems that have a big-data component, we can make the machine learning methods work along with the experimental data. So we don't literally have to fill in a big table of all possibilities here, but we will be able to
extrapolate from what we learn. And if this is all successful for the T-cell, cell-mediated response, we could go back to the much more complicated antibody and B-cell recognition problem, where the antigens are much more diverse and the combinatorics of creating antibodies are also much more diverse, so you have much larger repertoires, orders of magnitude larger, but still a possibility. Okay, that's it. How did I do for time?

Okay, so can you help fill in for me the connection? Presumably you need a lot of data of the form "this TCR sequence recognizes this peptide," right? So if you had a ton of that data, you might bootstrap your way up. Tell me how you're going to get any of that data in any significant amount.

Yes, so that's the key technology challenge to get looped into this thing.

Right, so currently, in parallel, I see how to do individual experiments, but you probably want millions to hundreds of millions of data points, and as individual experiments that seems...

Step one is absolutely to figure out a way to do that. So that's the step-one technological challenge, absolutely, yes. Without some kind of experimentally grounded large data set, one would be shooting in the wind.

So a starting point might be to start with known drug-HLA interactions and work from there. Is there a plan to use structural biology approaches to understand how drugs might interact with varying HLA alleles? Drugs as modifiers of this process; drug-HLA interactions are pretty important clinically.

Yes, absolutely. That would be a great way to get at an incredibly important, clinically important aspect of it.

That was great, David. I think this is a terrific area, and it does illustrate a kind of policy dilemma for the NHGRI, but it's one that I would encourage developing some approach to. As you pointed out, you know, there are
all these other parties. NCI is very excited, for good reason, right now about these interactions, and NIDDK kind of grabbed the type 1 diabetes story, and NIAID has of course much interest in HLA. I've followed this field closely for 20 years, and there's always been this somebody-else-is-doing-it kind of attitude, right? And I think that the opportunity here for genome is just to recognize, as politely as possible, that much of that effort has simply failed to grab hold of the core genomic issue that's on the table and that is of interest to NCI and NIDDK and so forth. I mean, just backing up from the focus that you put on peptide recognition: the association between class II genotypes and type 1 diabetes, as was discovered in the 1960s, is still one of the strongest allelic associations that we've ever found. After all the money that we've spent looking for tiny effect sizes, this is a huge effect size, and we still don't know whether the primary lesion that leads to type 1 diabetes happens in the pancreas or the thymus. So that's not a very good record after 50 years of studying this problem. And if you read the literature, you'll see quite easily why the problem has not been solved: it hasn't been attacked as a general problem in genetics, genotype-phenotype, and genomics. It's always been attacked from a narrower angle. So anyway, there's an opportunity here, and I think it's a good model. There's way too little discussion of HLA in genomic circles. There's this tendency that we inherited from our forefathers that, you know, it's very special, it's weird, it's different. It's just another part of the genome; it happens to be highly variable. We can solve this.

Thank you so much, Maynard. Yeah, my reaction, and I'm not an immunologist, which is obvious, is that it seems like a complete cottage industry at this point that is waiting to mature the way genetics matured when we
got a hold of the Human Genome Project and made it happen. It's very, very frustrating. Ankylosing spondylitis is another disease that I've started to study, and, I mean, 90% of the cases have a particular variety of HLA-B, the HLA-B27 allele. It's by far just a whopping signal, and you know that it is displaying some peptide and there's some reaction to that in these people, but how could we have gone 30 years? Very frustrating. Yes?

David, I like this too, from the perspective of tackling difficult regions of the genome. I mean, we say we have the genome, but we really don't, and some of the most medically important regions of the genome are quite challenging, actually, even today. So I think pushing areas like this that are so medically relevant, and getting the right resources there, is very important.

Thanks, Debbie. Yeah, so if anyone has a solution to the little issue that Eric brought up, how to scale the key experimental part of this, that would break this whole thing open.

Thanks, David. I think we have to move on.