So thank you very much, Andy, for coming. He's from MD Anderson Cancer Center and he's going to talk about the genomic medicine capabilities there.
Well, thanks very much for giving me the opportunity to fly up and give you a brief overview of what we're building, and I think building is the key word, at MD Anderson, certainly since my arrival last April and with the arrival of Lynda Chin as well. I was telling somebody it took as long to get from the shuttle over here to the hide as it did to fly from Houston to Dallas, so I spent about as much time on the ground as I did in the air. This slide is actually supposed to slightly depict the problems we are trying to sort out: all the houses look alike, high-altitude to low-altitude views, building big pipes, and at the end we'll talk a bit about branching and how things are related to each other in a heterogeneity, evolutionary sense. You've all seen this slide, or variations on this theme, multiple times, but I'll put it up one more time. You can call it personalized, stratified, precision, terms ad nauseam, but essentially it all comes down to the same sort of continuum we'd like to work on, speaking mostly about cancer: the right target, identified through various forms of patient omics, basically genomics at this point, with large-scale efforts like The Cancer Genome Atlas project and the International Cancer Genome Consortium; taking that sort of retrospective cataloging data, validating it in current patient populations, identifying things that look like targets, identifying what might be mechanisms of action, and then moving those into the appropriate drug development, assay development and biological assays to at least come up with agents that can hit the target we think we've identified by virtue of mining omics data. And then of course the hard things begin to happen, where we have to develop the ability to find the right patients.
Lots of what we're finding in genomics tends to be clusters of relatively rare mutations across different tumor types for any given gene. The BRAFs and the p53s and the RASes are quite uncommon, it turns out, and that makes life more difficult to even start on this process, let alone get to the end. So we began to think about what needs to be done, at least in terms of a single institution, to try to advance or accelerate this process, and one of the ideas that's been put together at MD Anderson is the so-called moonshot program. This is moving from a paradigm of what is essentially clinical expertise, which is very tumor-centric, one could argue siloed to a certain extent, and trying to leverage that expertise across tumor types and across commonalities, but doing that in such a way that you now leverage the expertise across multiple departments, and also building an infrastructure that allows the generation of the deep data sets that one might begin to need to put yourself on that upward-turning arrow. The focus is on patient impact: deliverable-driven, as opposed to a more academic research environment where everything generates new questions and you can chase things endlessly down rabbit holes; and comprehensive, in that it's an attempt, anyway, to span the whole cancer care continuum, from looking at things like early diagnosis and prevention all the way out to issues of the sequelae after a cancer diagnosis and treatment in terms of survivorship. I think the key component, which I'll try to highlight here, is the collaboration, internal and external, and the organizational constructs and technology that need to be put into place. Again, fabulous clinical research, but not much in the way of cross-cutting translational research on a very large scale. One of the key things that we've tried to do in thinking about how you build genomic medicine capability was to basically try not to tackle every single thing at one time.
So note these selected cancer types, which are triple-negative, high-grade serous, leukemia of several flavors, lung, melanoma and prostate. What we did instead was specifically go after a subset, at least to try to build the system and to prove and work out the structures and the kinks in the systems that had to be put together to leverage large-scale data generation in both the clinical and genomic arenas, put those two things together, and try to move things forward. As I said, one of the key components of the moonshot is not just the idea that we want to reduce cancer mortality; we all want to do that. One of the key components is the serious thought and investment given to what sorts of infrastructure need to be developed to empower that activity, both within a tumor type and across tumor types, and what's been put together, much of which is actually ongoing and being built, are the so-called platforms. Now, platforms are somewhat akin to cores but not quite the same thing, and I could be at pains up here for the rest of the afternoon trying to tease out the difference, but essentially what we're talking about are research enterprises which are run almost on a semi-industrial scale with a professionalized staff, and which are not subject to the academic sort of evaluation structures that are in place for most cores, which are run as a part-time enterprise by an academic faculty member. So they're on a deliverable-based, industrial scale. They're subject to, you know, contract renewal on a yearly basis depending on performance. So the ones being put together won't surprise you too much.
The Center for Co-Clinical Trials is essentially trying to leverage large-scale model organism development for the cancer types that are primarily the focus of the moonshots, and to build that into a system where patient samples are being explored in xenograft animal models, and genetically engineered mouse models of the most pertinent cancer types are being developed and essentially run in parallel, looking for new drug development and for genomic markers at the same time as the human cancers from the patients are being analyzed. A freestanding Institute for Personalized Cancer Therapy will also be leveraged. This is run by Gordon Mills and John Mendelsohn, and it is already doing targeted genetic testing on the back of clinical trials to try to identify new indications for therapy in patients who have failed the primary treatment, looking for things like, you know, the usual suspects: EGFR mutations, RAS mutations and the like. There are platforms being built around cancer control and early diagnosis, primarily focusing a lot on proteomics, with the recruitment of Sam Hanash. Clinical genomics needs a major retooling and scaling to be able to handle the workflow we see coming; that's underway. Jim Allison has arrived and is working on building a platform, if you will, that will take into account, and begin to have a systematic deep dive into, the immunological parameters of each patient and each patient's tumor. The Institute for Applied Cancer Sciences is essentially an academic drug development unit which is now encompassed within MD Anderson, giving the ability to go after targets that may not be at the first order of intention for large-scale pharma but may be driven by smaller efforts in a more boutique or esoteric sort of target space.
The translational research continuum is a fancy term for basically saying you want to try to merge this clinical genomics and everything else in as close ties and interaction with the phase one and two units at MD Anderson as possible. And last but not least, the last three things here are what I'm going to concentrate the rest of my talk on: building research genomics and informatics, thinking about leveraging big data, and I'll tell you what I mean by that, and molding this into a framework that we've come to call adaptive learning. So the little man's head's been cut off, which is a bit unfortunate, but anyway, he's really sick. The overall gestalt and flow is given here. It's reasonably complicated, so I'll take a few minutes to go through it. We're putting in place the normal things you might think we want to do. One of the first ones, which is almost at the cusp of being put forward or approved, is a universal consent, which allows the collection and generation of omics data, at this point primarily focused on genomics data. It doesn't specify the uses of those data; those would have to come as different IRB-driven protocols which say: these data have been generated on this set of patients, and I now want to access them for a specific study. So you slightly shift the paradigm, in that you don't have to write a study each time you want to do genomics; you have to write a use case and a use-driven IRB protocol to study the universally collected genomic data, at least on the subset of the moonshot tumors at this point. And of course one needs the consent in place, but also more regularized, ordered and SOP-driven biospecimen collection, banking and indeed biomolecule processing and preparation of analytes. All of these things have been put into place and are modeled somewhat on, and driven by, the experiences we've had with both the ICGC and the TCGA, so we're beginning to know how to do some of these
things a bit better. What you have down this side is the research data, and this comes from all of those platforms that I just showed you. All those platforms, other than probably IPCT, are operating in research space in terms of generating proteomic marker data, genomic data and immunological monitoring data; all of these data will come out of this box. On the opposite side you have the clinical information and tests. This is essentially all the information that's collected on a patient from entry through treatment through follow-up that resides in any clinical database within the institution, and currently I think that number is about 26 or 27 separate databases. Ideally one doesn't want 26 or 27 different databases; what one actually wants is to merge all of this and all of this into an integrated patient data warehouse: that simple notion of having your research data and your clinical data living in the same database space, such that it becomes much more tractable to query it, to write analytics over it, and also to regulate access, in terms of making everything again a use layer, such that everybody's running analyses inside the framework rather than downloading data onto laptops and USB sticks and walking around with patient data. This is the model, anyway. Now, I can say this, and it's easy to say in a sentence, but we all know that the road to big data warehouses and merging clinical and genomic data sets is littered with the burned-out husks of thousands of cars. We think we have a way forward. You can invite me back at some time in the future and I can tell you whether we're a burned-out car or not, but I'll give you the model of what we're putting in place at the moment. This was actually already ongoing before we arrived at MD Anderson, in terms of trying to merge the clinical databases, in a staged, multi-year process, into a centralized data warehouse, so we sort of leveraged that capability and the backbone that was already
there, basically an Oracle-type backbone, and began to have serious conversations with providers about how we actually move genomic data and omics data into this same warehouse and what the relational structure of that database should be. Those are literally the conversations that are happening almost on a daily, if not every other day, basis. What we'd also like, obviously, is to pull in external data. We'd like the TCGA and ICGC data in there, PubMed, and all the other data that one might think would be useful in terms of understanding the collective experience, from a published standpoint or a research enterprise standpoint, for any given patient that you see walking in the door who may share characteristics with another patient that you have collected in the database. On top of this, obviously, one of the things that attracts certain facets of the institutional hierarchy is the ability to improve efficiency of operations. It's a big hospital; every efficiency saves money, and saving money is good. One can envision how having your radiology data and your pathology data integrated may bring cost savings in terms of no longer recapitulating these databases: individual informatics personnel supporting all these siloed databases are now integrated into supporting a much larger data structure. I think one key facet, and something we're working on quite hard, is how you leverage all of this in terms of learning about each individual patient, moving from what has essentially been almost a frequentist-type model, where you collect a thousand patients and then go back and ask questions about where they were similar, where they were different, and what facets may have impinged upon the differences in those patients, to what is arguably almost a Bayesian sort of approach, where each patient becomes almost a research participant, moving forward in terms of integrating both their clinical data and research
genomics data, and essentially moving into an era where each patient becomes a model for at least an overlapping phenotype, and for learning about that phenotype for the next patient who enters the study. And again this is an iterative cycle: this whole thing swirls around relatively rapidly, hopefully, in that we're learning from each patient and eventually informing better treatment decisions by integrating this data, so each time around the circle one hopefully does an incrementally better job of at least identifying what the key issues are and hopefully actioning and impacting those key issues. So, big data. It is Texas after all, so it couldn't be anything else. Again, this was already being built, the longitudinal patient data warehouse. The key facet here is that this isn't snapshot data anymore; this is data over the living history of a patient as they're seen at MD Anderson, all the way through the process, including every follow-up. We're very much interested in the longitudinal aspect, where a lot of what's been collected, at least in terms of genomic data, has been very much snapshot-driven. So the challenge is there: how do you obtain samples longitudinally? It's no surprise that we started with leukemia, because that's the easiest place to get at the samples longitudinally. Solid tumors present a huge challenge, which we're beginning to think about. But inside the massive data analytics box, for lack of a better term, all sorts of things can be envisioned: clinical decision support; operational efficiency gains, where the system's learning how to compact particularly the IT support for a lot of the independent databases and individual clinic databases and mechanisms that are in place; and again research and development. One of the key facets, and one of the things we're trying to put forward first, is this notion of end-user interfaces with understandable, or at least hopefully understandable, data; we'll move on to
actionable later. One of the key things we've engaged with first is making sure that one of the first things that gets stood up from all this genomic data and integrated patient data is an interface where a physician, a clinician who's actually seen that set of patients that they helped enroll, can go back, query that database and begin to extract information without having to have a bioinformatician on the payroll, without necessarily having to know how to code, and without having a clinical fellow spend the next three weeks digging through individual patient records. I think this is critically important because it builds support within the clinical community at MD Anderson. Again, it allows them to begin to use the data to see what the history of their patients is, to begin to see things that maybe they haven't appreciated before, and to contribute and become partners in the research process rather than passive participants. There is also a layer for researchers, and the hardcore bioinformaticians would of course be able to go in and work on the raw data themselves, but the order of importance in implementing any of this is to get something out for the clinicians in the first instance. So, the leukemia project. I'll give you a couple of slides on exactly what it is and what the plan is. The plan is to take the next 1,000 leukemia patients who essentially, quote, walk in the door, with a pretty decent focus, it has to be said, on enriching for MDS, AML and CLL, because these are the moonshot tumors; focused on, but not limited to, newly diagnosed patients. We started enrolling at the end of September, and I think as of about a week ago there were 250 patients enrolled, again fairly heavy in MDS and AML, with CLL just coming online, but also with a decent number of acute lymphoblastic leukemias as well. Samples are taken at diagnosis, or at presentation in terms of previously diagnosed or
referral patients, and thereafter at each patient visit along the normal clinical path of referrals and patients coming back in for evaluation. We're taking saliva and buccal for the normal and, depending on the leukemia type, bone marrow and/or peripheral blood for the tumor sample. Again, the key here is that bone marrow and blood are assessed in the context of the normal clinical workup and care; that is, the saliva and the bone marrow are being collected in the bone marrow clinic. Everything remains in a CLIA-compliant chain of custody. The research samples are split from that and taken out, but the loop is essentially closed, in that we don't have to resample the patient per se if we want to move back from the research environment into the CLIA sample, which has been held the entire time. So that's a key facet of the design. As for what we're doing: we're generating genomic data. At this point we're generating whole-exome sequencing and starting to think about generating low-pass whole-genome shotgun. The latter gets you rearrangements, which in this case are quite interesting for at least a subset of the AML patients who don't have apparent translocations. This is being generated on each pair: data generated from the normal and the tumor at presentation, and from the relapse samples when the patient actually relapses. 20 to 50 percent of these patients will go into relapse over a period of months; that's the next sample. So essentially each patient gets sequenced for three samples up front, or at least in the course of the initial sequencing wave: the normal, the tumor and the relapse. We keep the individual samples from the inter-relapse space to be able to dive back in and start asking questions about minimal residual disease, early detection in peripheral blood from circulating DNA, and the other things one might want to do. All clinical data are currently collected in the departmental database; again, leukemia has its own database where they collect every
single facet of information that they think is important for leukemia patients. That's being handled by a combination of automated extraction into the large-scale database and also, something we're exploring a lot at the moment, natural language processing, crawling through patient records. For instance, cytogenetic data is semi-structured; you can't automatically download it into the database. You actually have to process that data through natural language processing, and, I mean, I've been remarkably impressed. I was highly skeptical of how well it was going to work, but for some of the fields where we need the data it seems to be doing a really credible job. One likes to think about the questions one could begin to dive into with the first 250 patients or the first 500. I just list a couple of them here that are interesting: what facets are important in the progression of MDS to AML? Can we identify patients who are at risk for death during induction chemotherapy? And, an issue I'll come back to in a few minutes, can we understand more about the subclonality of any given patient's disease and the propensity to have relapse and progression on standard-of-care therapy? So that's the leukemia project. Again, we're enrolling patients, and the sequencing has just begun, so no data to show you. In fact this is the most data-free talk I've ever given in my life, but there we are. The other opportunities I think we're going to try to really hammer on over the course of 2013 are, first, an issue that I'm quite interested in, this issue of genetic and genomic heterogeneity; and second, moving into an era of thinking a lot more than certainly has been done about this notional idea of comprehensive cancer patient genomics, that is, really thinking about the germline and the somatic genome of the patient as an integrated, interacting whole, rather than having the more traditional model where, you know, germline genetics is the parlance and the
purview of this group over here, somatic genetics is done by these sort of geeky sequencing people, and they never actually get together and talk very often. They sort of have embassies in each other's country but they don't actually visit very often, and I think that's something we need to sort out. I'm sure everyone in the room is on the same page with that one as well, because there are clearly some interesting things going on, not only between the germline and the somatic, but also thinking about the other genomes in play in each patient, in terms of all the little bugs and whatnot that are crawling around over and inside of us. One area as well is to move into the impact of genomics on outcomes, particularly the notions of survivorship, thinking about long-term issues, which I'll go into now. Just briefly on the H word, heterogeneity: there's no doubt that genetic heterogeneity is a key determinant of variation in all the outcomes we care about. What we'd just like to know is what cancer genes are operative in any given set of tumors. We and others have shown remarkable variety, at least at the somatic mutation level, in the cancer genes that may be operative in an otherwise homogeneous disease like ER-positive breast cancer, for instance. If we could learn that, we'd then like to know what the level of intratumor heterogeneity is within each patient. There's a tremendous amount of work going on now, from multi-sampling of primary tumors, deep dives into whole-genome sequencing data, and leveraging single-cell sequencing technologies, that is uncovering a pretty bewildering array and depth of the intratumor heterogeneity problem, and for none of it have we really nailed down how much it actually matters. I think that's the important question here. Heterogeneity is nice or important or interesting, but what impact does it actually have? Does it actually matter in the context of whatever treatment you're giving?
Does it matter where the mutations are in that tree structure that we all like to draw? And I think the other important facet is: what are the germline and somatic sequence variants that are impacting the other things that clinically matter a lot? Drug metabolism, immune response; what role does cancer susceptibility play, moving away from the highly penetrant Mendelian disorders and perhaps into a middle ground between the population-based risk and the intermediate susceptibility driven by somatic mosaicism and all the other things which seem to appear on almost a weekly basis in the journals; and, as well, thinking about toxicity. And finally, this notion of cancer patient genomics on a comprehensive scale: each patient of course is a composite or interaction of the germline genome and the somatic genome and also, as I said, all the other genomes that we're beginning to understand as well. One of the things that's been reasonably well covered is risk in response to exposure from tobacco, UV radiation, diet and stress, but I think these are all areas that are being explored heavily now, trying to think about the interaction and the axis between these two things. Two of the ones I think we're going to focus on heavily here at MD Anderson are treatment response, including acute toxicity and resistance, which goes hand in hand with doing lots of work in terms of genomics and target-driven trials, but also survivorship issues and long-term toxicity. We're particularly interested in late recurrence and relapse and second primary cancers in the context of several diseases, starting by working a lot in the CLL world now, where the patients are at extraordinary risk of developing second malignancies, and thinking about late relapse and recurrence in the context of breast cancer and other tumor types as well. So this gives you a sort of overview of what we hope to get up and standing, at least to a certain extent, over the course of 2013.
And the key people involved, particularly in the leukemia project, and again a sort of thematic, symbolic gesture here with the bridge between the omics, labby-type people and the leukemia physicians over here: particularly Hagop Kantarjian, chair of leukemia, Guillermo Garcia-Manero, Michael Keating and Bill Wierda, the people in the molecular diagnostics lab, and all the people on the other side of the bridge, particularly John Frenzel and Keith Perry on the informatics side, and of course in collaboration with Lynda and the team here. So that's sort of what I wanted to tell you, just to give you an overview. I'm happy to answer any questions if we have time. Thanks very much.
Great. All right. Thank you, Andy. So we do have time for questions. I see Jean.
Could you get into a little more detail on this business of consent? What exactly are you getting up front and what do you do after that?
So what we're planning on, and again it's an ongoing discussion with the IRB, is to obtain consent on a universal basis in terms of obtaining samples for analysis and sequencing or omics, and in this case it's going to be sequencing in the first instance. It remains to be seen whether we're actually going to be able to move it forward in that exact fashion, but let's take it as written that it will move forward, and essentially the sample collection and data generation step becomes the norm for each patient who gets consented under that protocol. The view then is that this creates the data in the warehouse, and then if you want to do a study you have to start applying for use protocols. So the notional idea is that it should empower the whole system, in that if you decide you're interested in hepatocellular carcinoma, you don't now have to write a protocol to go out and collect and sequence the hepatocellular carcinomas; those patients will have already been consented and collected.
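As an illustration only, and not something shown in the talk, the consent-then-use-protocol model described in this answer could be sketched roughly as follows. Everything here is an assumption made for the example: the `UseProtocol` and `Warehouse` classes, their fields, and identifiers such as `USE-001` and `dr_a` are hypothetical and do not describe MD Anderson's actual systems. The sketch only shows the shape of the idea: data are generated for every consented patient, and what is controlled afterwards is access, one approved use protocol at a time.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UseProtocol:
    """Hypothetical record of an IRB-reviewed use protocol."""
    protocol_id: str
    investigator: str
    tumor_types: frozenset
    approved: bool


@dataclass
class Warehouse:
    """Toy stand-in for an integrated patient data warehouse."""
    records: dict = field(default_factory=dict)

    def add_patient(self, patient_id, tumor_type, universal_consent, data):
        # Data generation is the default, but only for consented patients.
        if not universal_consent:
            return False
        self.records[patient_id] = {"tumor_type": tumor_type, "data": data}
        return True

    def query(self, protocol, investigator, tumor_type):
        # Access is gated by the use protocol, not by a collection study.
        if not protocol.approved:
            raise PermissionError("use protocol not approved")
        if investigator != protocol.investigator:
            raise PermissionError("investigator not named on protocol")
        if tumor_type not in protocol.tumor_types:
            raise PermissionError("tumor type outside protocol scope")
        return {
            pid: rec["data"]
            for pid, rec in self.records.items()
            if rec["tumor_type"] == tumor_type
        }


warehouse = Warehouse()
warehouse.add_patient("P1", "hepatocellular carcinoma", True, {"exome": "placeholder"})
warehouse.add_patient("P2", "AML", True, {"exome": "placeholder"})
warehouse.add_patient("P3", "AML", False, {"exome": "placeholder"})  # not stored

hcc_protocol = UseProtocol(
    protocol_id="USE-001",  # hypothetical identifier
    investigator="dr_a",    # hypothetical investigator
    tumor_types=frozenset({"hepatocellular carcinoma"}),
    approved=True,
)
hcc_data = warehouse.query(hcc_protocol, "dr_a", "hepatocellular carcinoma")
print(sorted(hcc_data))
```

In this toy version, asking for AML data under the hepatocellular carcinoma protocol raises `PermissionError`, which is the point of the design: the samples and data already exist, and each new study only needs a new use protocol.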
Again, it remains to be seen whether we're going to be able to stand it up or not, but it's a notional idea to move forward and think of moving into an area where sample collection and data generation become the norm in that research environment, thinking about each patient as a research partner, and what you now control is the access to the data: use-case-driven IRB protocols and use-case-driven access to the data.
Great.
These are very vulnerable patients and I suspect you're going to have some discussions.
Yeah, I mean, in fairness it is an ongoing, very active discussion.
We have Pearl and Jonas.
Thanks. Very interesting. Following up on Jean's question regarding the consent, just for the clinical data: if I come in as a patient, am I asked to consent for my clinical data, let alone exome sequencing, to go into the big center database?
Not at present. I mean, that lives in the departmental databases at the moment anyway; that's all collected there, and we're trying to port that into the big data warehouse. But it's an interesting question: does that change the nature of that data now, when you integrate it with genomic information? Again, it's a conversation that's happening. I'm not sure how it's going to play out, and I think one can make a case for both sides of the coin, but you have fundamentally now altered the structure or the value or the interactivity of that data by moving it into this big data warehouse context, and it may well be that we have to move to that model on an individual study basis. But what I would still like to see layered on, in my view, is that even if you did that, you are still bounded by the use case scenario on the study-based research side. So you have to come back in with a use protocol; even if they've consented to have their clinical data and their genomic data included, that doesn't necessarily mean it's, quote, fair game for anyone to come into. The challenge, I think, for all this going forward
is, while it would be wonderful to have this uber-consent, if you say no, you have to let those people know your stuff's going to be used anyway. So, I mean, it's a challenge.
So I think Mark had a comment on the data warehouse. Did you want to comment on the data warehouse to Jonas?
Yeah. Oh, okay. Yeah, so I think the point to make is that, you know, we're not on new ground here. A lot of places that have an integrated data warehouse have different classes of data, in particular payer data for systems that have provider-owned health plans, which operate under very different data use agreements than the clinical data. You can still keep those all in the same data warehouse with the same structure but then use the permissions (yeah, exactly) to allow the access. So I think the problem, while somewhat different in terms of research versus clinical, has been approached before, and it's a soluble problem.
Yeah, absolutely. I mean, one of the people who's been brought on as a consultant has a history in the finance industry, where they do this layered access all the time; it's the way they run their business. So I think there are certainly models, and within the health care system as well. So, you know, it has been done before. The challenge at Anderson is actually getting it all done, which, yes, we'll see.
So my question is about the adaptive learning platform attached to this big data and the integrated patient data. This is almost like a long-term, ongoing adaptive clinical trial, so how close do you get to calling it a clinical trial? And the other, related question is: how much would you expose these classifiers to external inspection for validation?
I mean, the first question is, and I'm not trying to give any short answers, but it's an ongoing discussion about when you move from that sort of almost research question, can we develop what look like good tools for identifying patients at higher risk of X or Y or Z outcome, and then how
do you validate those. And, you know, the models of almost running on a clinical trial basis have certainly been put forward; in particular we're collaborating to a certain extent with IBM and their Watson platform as well. So that's the first place, and certainly there's at least been an inter-institutional sort of validation of the things that are being put out in terms of what Watson thinks it knows about leukemia, for instance. The ability to vet externally and have external advisory boards for that sort of process I think is completely valid. I think we're just not ready to stand it up and have people come kick the tires yet, but I think it's a valuable way to go. And the whole adaptive learning platform is again very much like this sort of large-scale clinical trial where you don't have to write the protocol: the patients are already there, the data is already there, and you design from the data you already have, or at least impinge upon that collection of patients which you already know about. So I think it's a challenge to alter the mindset to a certain extent about how you do these things, but also how you put the checks and balances and traps and snares in place to make sure it's actually valid in the first instance. I think it's a really important point.
Great, thank you. Did somebody holler? Yeah, okay, please.
I'm just curious: as you move this into the clinical system, is there certain education or special education that you've had to provide to staff so they can answer the patient questions?
I mean, again, it's something that's being actively thought about. One of the first challenges, and it's quite crucial as you can imagine, just picking leukemia in and of itself in terms of that moonshot project drive: it took a long, long time, a period of months, even with the clinicians, to bring them up to speed on what these concepts were, what genomics actually was;
Formalizing that in terms of answering questions, in terms of what will come up with the patient, I mean, is an active area, which is sort of subsumed within the platforms to a certain extent, which I'm not particularly... but I've been talking with Ellen Gritz about various sorts of approaches, and I think there's a fairly strong program at Anderson that hopefully will get on board with thinking about how you develop those materials. But I think, you know, even getting the clinicians up to speed was not trivial, and I think again that's sort of where we had to start to even get the thing off the ground.
Great. Mike?
So I'm curious, as a cancer center, how do you envision the incidental findings component of the genome being dealt with? I mean, you've got largely a somatic environment, where, you know, you're going to throw out those things from the normal side or retain them as the incidental findings, but they're not going to be just cancer; they could be a wide range of things. I'm curious as to how you think that's going to...
I mean, I know before I left the UK it was a subject of nearly endless agonizing discussion and highly heated opinions at both extremes, where, you know, you have to report everything back or you report nothing back, depending on who you talked to. And even as I was leaving they were still in the throes of that discussion. In fairness, I haven't been involved in the discussions of it here yet. I think it's an important point, but it really hasn't reared its head too much yet. I keep sort of poking it with a stick in meetings, but I haven't got anybody to really bite or squeal yet. Because just coming from even the Sanger, where they're doing large-scale population, 10,000-genomes-type sequencing, you know, they're finding germline BRCA1 mutations and all the other alleles that you would expect in a population sweep, and even what to do with those when they're pseudo-anonymized or anonymized samples, I think, is a huge
issue there. So I keep poking it with a stick and bringing it back up, but I think it's an issue that has to be dealt with. I mean, again, is there a coherent national policy at this point here? Are there coherent local policies?
No. I expect to see a policy in the near future on those kinds of targets. The CSER network has actually written a summary of the incidental finding policies at all of the CSER sites that's being submitted any day to Genetics in Medicine.
Okay, that would be brilliant to see.
And actually the Garnet network has a paper that's now accepted.
Right. Okay, that's brilliant. Yeah, we have it sorted.
I mean, I was mostly curious about just the fact that it's so somatically based at a cancer...
Well, I think we're moving into an era, I mean, what I tried to highlight there is that we're going to actively engage the germline now, whereas, you know, in the last 10 years we've been treating it as a damn nuisance, nothing other than something to filter out, you know, to find your somatic mutations, and putting it in a bucket. I think we need to move away from that, because there's rich detail there about sort of phenotype we need to learn. So we're going to be forced into it whether or not.
And far beyond cancer.
Yeah, absolutely.
Other comments? Great. Well, Andrew, thank you very much for coming; we really appreciate it.
Good.
All right, so our next speaker is Josh Denny, who's going to be describing the Vanderbilt PREDICT program and eMERGE-PGx.