The clock says three o'clock, so why don't we get started? Good afternoon. I'm Griffin Rogers, the Director of the National Institute of Diabetes and Digestive and Kidney Diseases, and I'd like to welcome you to the ninth lecture of the Genomics and Health Disparities lecture series. This series is part of an ongoing dialogue about how innovations in genomic research and technology can impact health disparities. In addition to the NIDDK, this series is co-sponsored by the National Human Genome Research Institute, the National Heart, Lung, and Blood Institute, the National Institute on Minority Health and Health Disparities, and the Office of Minority Health at the Food and Drug Administration. The speakers for this series have been chosen from these five organizations to present their research on the ability of genomics to improve the health of all populations. It's really my great pleasure, therefore, to introduce today's speaker, Dr. Jose Flores. Dr. Flores obtained his MD and PhD degrees from Northwestern University, then he completed fellowship training in internal medicine, neurology, and then endocrinology at the Massachusetts General Hospital and the Harvard University Medical School. Dr. Flores is not only an accomplished investigator and clinician, but he's also advancing diversity in research and in the scientific workforce. He and his group have contributed to the performance and analysis of high-throughput genomic studies in type 2 diabetes-related traits and complications in international ongoing studies, including MAGIC, GENIE, DIAGRAM, and SIGMA. For all of you in the government, you know that all of these studies are acronyms, and so I won't go through the details of what they stand for.
He leads a genetic research initiative in our Diabetes Prevention Program and is the PI on the Study to Understand the Genetics of the Acute Response to Metformin and Glipizide in Humans, something appropriately called SUGAR-MGH, and other pharmacogenetic studies at the Massachusetts General Hospital. Dr. Flores serves as the associate director of the Boston Area Diabetes and Endocrinology Research Center and the director of the Broad Institute's Diabetes Research Group. He has over 170 peer-reviewed publications, and outside the lab, he's very active as a chief of the Massachusetts General Unit and an associate physician at the Mass General and associate professor of medicine at the Harvard Medical School. He also serves on numerous editorial boards and is the editor-in-chief of Current Diabetes Reports. His exceptional research productivity has earned Dr. Flores numerous awards and accolades, including the 2011 Presidential Early Career Award for Scientists and Engineers. This is the highest honor bestowed by the U.S. government on science and engineering professionals in the early stages of their independent research careers. In 2014, he was also inducted into the American Society for Clinical Investigation. We're really grateful for his service as a member of the Advisory Committee to the Director, where he co-chairs a very important committee together with the deputy director of NIH, the Next Generation Researchers Initiative working group. So with that, please help me in welcoming Dr. Flores to the dais.

Thank you, Griff. Thank you. Thank you. Can you hear me? So I think I need to find the name of the person who wrote that, and maybe I'll hire them or something. It's really beautiful. So thank you very much for the invitation to speak here at NIH. This is a particularly appealing topic to speak on, and not a topic that I've really given any lectures on before. This is a completely new lecture for this purpose.
I was very comforted to hear that this initiative is organized by many institutes within NIH collaborating. It's great to see that kind of spirit of collaboration among ICs here, and the day has been fantastic. I've really enjoyed all my interactions. So I will briefly go through my one disclosure. This is within the last 12 months. And we'll start with three years ago. It seems like a long time ago. I don't know if you remember him, but he used to be around. And in the State of the Union, three years ago, he introduced the Precision Medicine Initiative, which was then unveiled in greater detail at this ceremony in the White House, where many of the people here were present. That was an attempt to utilize the resources of the U.S. government to assemble the largest ever collection of individuals who are willing to participate in research and, by using available technologies in -omics, genomics, et cetera, as well as clinical phenotypes, electronic medical records, and whatever other samples they want to volunteer, to have a better sense of giving the right drug to the right patient at the right time and making the correct diagnoses. So let's take that as a starting point in this talk on health disparities. And let's think about Barack Obama, who's now a private citizen, and say, well, if he was a patient, what about his genes? You can see them here, right? He put them right here in his DNA molecule. So does he have African genes? Does he have European genes? And so the very first point to make, just so that we're on the same spot, is that he has human genes. There's no such thing as African genes or European genes. There are carrot genes, there are dog genes, there are sheep genes, but these are human genes. Now what he does have is genetic variants in his genome that many, many years ago, hundreds of years ago, arose in an ancestor who lived in a different continent.
And so we can say that he has an ancestry that says that those variants derived from a particular set of ancestries. Now we know his mother was a white American from Kansas, his father an African from East Africa, from Kenya. And so you can say that a little bit over 50% of those alleles, based on the larger size of the X chromosome and the mitochondrial genome, so a little bit over 50%, are alleles that have a European ancestry. And then a little less than 50% are alleles that have an African ancestry. Now if you ask him, he self-describes as African American. He gave very compelling speeches on issues of race prior to and during his electoral campaign as president, and at very moving events in Charleston and other places where he discussed these issues, and he always self-identified as an African American, even though if you look at where the variants in his genome came from, they have that sort of distribution. And so suppose Barack Obama came to my hospital as a patient, and he developed heart failure, and I wanted to treat him. And I wanted to say, well, what are you, African American? Or even worse, I looked at him and said, you look like you're African American, based on my own judgment of what it means to be African American. Then I might say, well, the drug for you is going to be this sort of drug that you all remember was approved with a race-based indication. We will treat heart failure with a combination of a nitrate and hydralazine, because there was some subgroup analysis and then a subsequent clinical trial that seemed to say that it worked better in people who call themselves African American. And then there was actually a marketing effort that took place that looked at regions of the country where people with heart failure were more likely to be of those self-described racial groups. And that was the basis for trying to market this drug, which was not particularly successful.
And this has been really well collected into this book that really delves into these very thorny issues. Now, what is the point? The point is, if you think that response to a drug has to do with the genetic variants or the molecular makeup you have that makes you more or less likely to respond to a drug, what really matters is not so much what you call yourself or, worse, what you're perceived to be by some outsider. It is: what is the genetic variant that determines response to that drug? That genetic variant may be something that arose in one continent or another, and your chance of carrying the genetic variant has some likelihood based on your population and your ancestry. But it's typically not 100%. One example that brought this home was in this earlier paper, about eight years ago now, where a group at Duke looked at the response to anti-hepatitis C therapy, so interferon with ribavirin, and found that there was a variant in IL28B that had a very strong association with sustained viral response. So here's a Manhattan plot. Most of you know this, but for those of you who don't do genetics all the time, here's the entire genome arrayed in alternating colors by chromosome. Each dot is a SNP in the GWAS array. On the y-axis is the negative log of the p-value for association. So the higher the signal, the more significant. And you see that there's a signal that emerges in this very good candidate gene. Now what's interesting about this one candidate gene is that it has marked allele frequency differences between people of African descent, European descent, or East Asian descent. So the frequency is very, very high in East Asians. It's fairly high in people of European descent and much lower in African Americans.
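As a side note on the Manhattan plot just described, the height of each dot is simply the negative base-10 logarithm of that SNP's association p-value. A minimal Python sketch follows; the function names are hypothetical, and the 5e-8 cutoff is the conventional genome-wide significance threshold rather than a number quoted in this talk.

```python
import math

# Conventional genome-wide significance threshold (an assumption here;
# the talk does not quote a specific cutoff).
GENOME_WIDE_P = 5e-8


def neg_log10_p(p_value: float) -> float:
    """Return the Manhattan-plot height for one SNP: -log10(p)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value)


def is_genome_wide_significant(p_value: float) -> bool:
    """True when the SNP clears the conventional 5e-8 threshold."""
    return p_value < GENOME_WIDE_P
```

A smaller p-value gives a taller dot, which is why the strongest signals stand out as peaks above the rest of the genome.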
So then when you ask who are the people who tend to respond to this medication, it is more likely for a white person, self-described people of European descent, to respond better to the drug than people of African descent. But it is not 100%. And if you delve down into the allele frequency differences and you group them by European Americans versus African Americans, yes, it is true that overall, as an average, people of European descent are going to respond better. But that doesn't mean that people of African descent don't respond at all, particularly if they carry the same genotype, because that frequency is not zero in people of African descent. So there will be some people, self-described African Americans, who happen to have the genotype that predicts response, as was then mentioned by the authors of the paper, who pointed out that African Americans with a CC genotype, the response genotype, actually respond better than people of European descent with the alternate genotype. And so when we talk about the use of genetics to be able to make conclusions about therapy, about treatment, about health disparities, I am not talking about race as a very large construct that deals with many, many different issues. I am talking about the alleles that people carry, where they arose thousands of years ago in one continent, and what they can tell us about biology.

So what I'd like to cover in the talk today, focusing on the specific subject of type 2 diabetes, is this. Are there race-ethnic differences in genetic risk that we can determine? If those differences exist, can we leverage them to lead to greater discovery based on ancestry differences? Can we use genetics to measure ancestry, and does that tell us about disparities? And finally, maybe I will show a few examples where it's possible that such genetic differences may matter in the clinic. So are there race-ethnic differences in genetic risk for type 2 diabetes?
So we know that the observation is that there are differences in prevalence of the disease, and that also goes along with racial and ethnic groups or distribution. We see that, for example, people of Latin American descent have a much higher incidence of diabetes than, say, people of European descent, and this is the projection and the very rapid pace of what is happening worldwide. When you look at the Mexican experience, for example, you see prevalences of type 2 diabetes of upwards of 20% in some age groups. That is about twice as much as what you see in people of European descent, and whether you live in Mexico or you live in the United States, if you are of Latin ancestry, this is the kind of prevalence that you see. Now, that has happened because of the very rapid progression of dysmetabolism characterized by obesity and type 2 diabetes, and you're all familiar with these pictures across the United States. So in the span of 20 to 30 years, you have had this rapid progression where, if you really want to be saved, you want to move to Colorado, and maybe that's the one place where you can more or less be protected in some sort of citadel fortress from the onslaught of obesity. But other than that, this is happening, but there are regional variations. Now, of course, these regional variations are not driven by genetics, right? Genes don't change this fast. This is not evolution manifesting itself. These are rapid changes in our environment that are taking place on the background of a genetic predisposition, and they may be affecting people differently depending on their genetic makeup. And so what is the reason for these variations in allele frequencies across populations? We know that we descended from a common ancestor with chimps in Africa, and there was a migration out of Africa where only a restricted subset of people made it across into the Eurasian land mass.
And so the very first thing to point out is that all of the original genetic variation is in Africa, and only a subset of that genetic variation made it out. So there's greater diversity, of course, that exists in people of African descent, based on their being a better representation of the original human population. And then as people migrated across the various continents that they had access to, there were so-called bottlenecks. Similar to what happened in the migration out of Africa, there's also a large bottleneck into the Americas across the Bering Straits, where again only a restricted group of people made it across. What happens to allele frequencies when that happens is a stochastic event: the families that carry a variant and make it across are going to over-represent that variant in comparison to the population they left behind, because they happen to have over-representation purely by random events among the people who make it through. There may be other issues like drift, and drift may be different: once one population is in this continent and another in that continent, they may drift at different speeds. There may be selection pressure, either positive or negative, that, depending on the geographic environment, drives the frequency of a variant up or down. There's of course the time when people finally begin to cross oceans and meet: Europeans who make it to the Americas beginning in 1492 with my ancestors, and then the forced migration of West African peoples across the Atlantic during the slave trade, and there's an admixture that takes place that mixes some of this variation. And then there is the persistence of isolates, for example in Iceland or other places where it's more difficult to get to. All of these issues are what drive allele frequency differences, and so even though we're 99.8% identical across the genome, these variants have different prevalences across the geographic space. So how do they affect type 2 diabetes?
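The bottleneck effect described above can be illustrated with a small simulation. This is only a sketch under simple assumptions (founders drawn at random, two chromosomes each, no selection or later drift); the function names and the numbers used in the example are hypothetical, not data from the talk.

```python
import random
import statistics
from typing import List


def founder_allele_frequency(source_freq: float, n_founders: int,
                             rng: random.Random) -> float:
    """Allele frequency among founders sampled at random from a source
    population in which the allele has frequency source_freq."""
    n_chromosomes = 2 * n_founders  # each founder carries two chromosomes
    carried = sum(1 for _ in range(n_chromosomes)
                  if rng.random() < source_freq)
    return carried / n_chromosomes


def simulate_bottlenecks(source_freq: float, n_founders: int,
                         n_replicates: int, seed: int = 0) -> List[float]:
    """Repeat the founding event many times and collect the frequencies."""
    rng = random.Random(seed)
    return [founder_allele_frequency(source_freq, n_founders, rng)
            for _ in range(n_replicates)]


def spread(frequencies: List[float]) -> float:
    """Standard deviation of founder frequencies across replicates."""
    return statistics.pstdev(frequencies)
```

Comparing `spread(simulate_bottlenecks(0.1, 10, 200))` with `spread(simulate_bottlenecks(0.1, 500, 200))` shows the point: the smaller the founding group, the further the founder frequency can land from the frequency in the population left behind, purely by chance.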
So we can now study that because of the amazing, amazing accomplishment of the human genome. And not because I'm a guest of NHGRI, not because Francis Collins is here, I just want to say that it has been amazing to be alive and have the use of reason and be involved in science and medicine at the time when this happened. So you all remember where you were maybe when the shuttle exploded or when 9/11 happened; you all remember when tragic things happened. Being alive and doing science when the human genome was sequenced, to me, is a before and an after. I think human history tends to be cyclical. You know, in human history in general we repeat the same mistakes, we go in cycles, we elect the same types of people, we get into the same kinds of wars, we kind of make the same things over and over again. It seems like we're never getting out of the spin. But when it comes to science, progress is incremental. There is a before and there's an after, and the world is not the same after something happened. And so, coming from Mass General, where ether was used for the first time in surgery, there is a time when people had to have amputations raw, where they were basically biting on a piece of cloth and drinking a lot of whiskey, and then there's ether that completely makes them forget and not feel any pain. And there's a time when a child diagnosed with type 1 diabetes had a uniform death sentence, and by age 10, 15, 20, they died of inanition in a starvation facility, and then after 1920, these people transformed what was a uniformly fatal disease into a chronic condition where people lead essentially normal lives. There was a time when you had a tooth abscess and could die from the infection, and then all of a sudden with penicillin life is different. And of course there's highly active antiretroviral therapy, which again changed HIV from being a fatal disease.
I remember in the 90s, being a medical student, the kinds of AIDS wards that I had to see, and then what we see today. So I think the human genome is in that league. There's a before and an after, and what we are engaged in is trying to find out whether the wealth of information that has been acquired through having the complete catalog of bases that dictate our biology and that explain differences between people can be leveraged to transform medicine. That has now been harnessed in understanding differences through projects like the HapMap project. And one of the insights that came from the very early efforts at understanding human variation was that we share common variation: most of the haplotypes, meaning the linear arrangements of variants in a chromosome, tend to be shared because they arose a long, long time ago in Africa. They made it out of Africa into the various populations and are common between populations. There's a little bit of extra variation, as I mentioned, in people of African descent, meaning the variants that never left Africa at that point, but for the most part, common variation is shared. So when you look at all the mutations or polymorphisms that exist in the human genome in the entire population, it is true that most sites are rare, because there are many individuals in which mutations can arise, and those mutations will arise, and they're really, really rare. But in terms of what varies in a single person, most of the variants are common and shared. So in type 2 diabetes, we have been able to use this information through genome-wide association studies to begin to catalog upwards of 100 or so variants across the genome, cataloged here by year of discovery. The color is the mode of discovery, so blue is GWAS, and then the odds ratios.
And there are about 100 or so variants that we can now conclusively, reproducibly, say affect risk of type 2 diabetes, the majority of which have been discovered in people of European descent. So the first question on the map is, do they differ across ethnic groups? Initial attempts took the variants, which again are common because of the mode of discovery, and looked to see whether the odds ratios that increase risk of diabetes vary from European Americans to African Americans, Latinos, Japanese Americans, or native Hawaiians in this study from the MEC cohort. They essentially show that what you discover in Europeans has a similar effect in people of other ethnic groups. And then you construct a genetic risk score, meaning an aggregate of the risk that is conferred by those genetic variants, and you see whether that genetic risk score differs in distribution depending on the population you're studying. Europeans here are the comparison, because they are the discovery sample, compared to the other four ethnic groups. You see that essentially the distribution of the risk score is largely overlapping, meaning common variants that are shared and have an effect on diabetes in one group also have an effect in the other. Now, in this study from the NHANES population, we did a similar thing, not only for type 2 diabetes variants, but for variants that are associated with quantitative glycemic traits. We also constructed genetic risk scores, and we have non-Hispanic whites, non-Hispanic blacks, and Mexican Americans. And when you look at the fasting glucose level driven by the genetic risk score, here are the European samples, with increasing numbers of alleles giving you a higher level of fasting glucose. And here's an essentially overlapping distribution for non-Hispanic blacks and an overlapping distribution for Mexican Americans.
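For readers who want to see what a genetic risk score amounts to, here is a minimal sketch. It assumes the simplest common forms, either a plain count of risk alleles or a sum weighted by the log of each variant's odds ratio; the talk does not specify which weighting these studies used, and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence


def genetic_risk_score(risk_allele_counts: Sequence[int],
                       odds_ratios: Optional[Sequence[float]] = None) -> float:
    """Aggregate risk across variants for one person.

    risk_allele_counts: 0, 1, or 2 copies of the risk allele per variant.
    odds_ratios: optional per-variant odds ratios; when given, each count
    is weighted by log(odds ratio), otherwise alleles are simply counted.
    """
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each allele count must be 0, 1, or 2")
    if odds_ratios is None:
        return float(sum(risk_allele_counts))
    if len(odds_ratios) != len(risk_allele_counts):
        raise ValueError("need one odds ratio per variant")
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))
```

Computing this score for every person in each group and comparing the resulting distributions is the kind of comparison described above.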
Now, when we do that in a larger array, the IBC CARe array, which is a gene-centric array, and you do a similar exercise for genetic risk scores across three ethnic groups, in this case African Americans, Hispanics, and Asians, again, variants discovered in European populations have an essentially overlapping distribution for the genetic risk scores. So the message again here is that we are more alike than different. Variants that are shared are shared, and they are common. That is not to say that you can't detect differences if you are very fine in your approach, and there may be things that are unique to one population versus another. But when it comes to the genetic architecture of type 2 diabetes, it is fair to say that most of it seems to be shared.

So can we leverage ancestry differences for genetic discovery? Can we take advantage of the fact that some populations have been separated across oceans for a long, long time, and there's genetic drift and positive selection and admixture, to try to maybe discover more than we could discover before? I showed you the very happy and proud success of the many variants discovered in European populations, and I remember what happened early on, when we had this idea that we had really only scratched the surface, that we had only explained about 10% of the heritability for type 2 diabetes, that there are maybe potential sources of frequency difference across populations, which I already described, and that we may need to expand the search beyond Europeans. Because on the heels of our successful GWAS papers for type 2 diabetes in European populations came these two papers from Japanese groups, two separate Japanese groups, saying that we had actually missed KCNQ1. So KCNQ1 is a gene that has variants that are associated with type 2 diabetes at genome-wide significance.
These two papers, companion papers in Nature Genetics, made our community of European-focused geneticists somewhat unsettled, because what is it that they could discover that we didn't find? We had a much larger population size. We have very smart people working. What do they do in Japan? Are they smarter? Do they work harder? Do they have more postdocs? You know, what is it? And what it is, of course, is that the variant that they happened to run into was at a much higher frequency in the Japanese population than in the European population. So it was there in our sample. It just had not had enough observations to meet the statistical threshold that would have called attention to it. Whereas in Japan, the variant being much more frequent, with fewer cases they were able to see more observations and reach the genome-wide significance that would lead to publication. So that prompted the effort to really understand, of course, and accept that we need to expand the search beyond European populations, which is a very sensible thing to do. And the one that we've been more directly involved in is a Latin American initiative called the SIGMA Type 2 Diabetes Consortium. SIGMA, as Griff mentioned, is an acronym. It stands for the Slim Initiative for Genomic Medicine in the Americas, or in Spanish, Iniciativa Slim en Medicina Genómica, all being funded philanthropically by Carlos Slim. And this is a number of investigators in Boston, Mexico, and Los Angeles that take part in this initiative. So we conducted, oh, by the way, let me just acknowledge Carlos Slim.
So he is now a philanthropist and, kind of egged on by the likes of Bill Gates and Warren Buffett and others who have been very generous, decided that he also needed to make a mark. He then identified diseases of importance to the Latin American population as cancer, diabetes, and kidney disease, and decided to invest. Here is a meeting when he visited the Broad Institute and I was introduced to him. I didn't realize that the basis for my introduction was the fact that one of the people in his entourage has diabetes and forgot to bring his medicines with him. And so they needed a local doctor to be able to prescribe a couple of medications for him, and so I'm having this conversation and I'm saying, yes, I will prescribe. Of course, you know, this is a big philanthropist, and so you can imagine the scene where there's a black limo that drives through Boston with a person who's going to collect these medicines from the CVS in downtown Boston, and so the big black limo parks there and this person gets out and says, I have this prescription from Dr. Flores. Anyway, there are a lot of people on the ground who helped collect samples in Mexico, so they were able to then do a GWAS in 9,000 people, which had not really been done before in people of Native American descent with a large degree of European admixture, as the Mexican population is. And so in our Manhattan plot, there were a lot of the usual players that had risen to the top in previous European-type efforts, like TCF7L2. There's a KCNQ1 variant that had been very high in Asian populations, which I just mentioned, but there's one that was completely novel, SLC16A11, and I will spend some part of my talk talking about how this one variant discovered in Mexican populations went all the way from genetic association to function and may illustrate a therapeutic avenue that should be helpful to all ethnic groups.
So SLC16A11 had an odds ratio of 1.28, which is on the level of things like TCF7L2, among the top four common variants, but because of the very large allele frequency differences between Europeans and Mexican Americans, this variant alone can explain up to 20% of the difference in prevalence of type 2 diabetes between people of European descent and people of Latin descent. So I mentioned earlier a two-fold higher risk of diabetes in people of Latin descent; of that two-fold risk, a fifth can be ascribed to the existence of this variant. It turns out that the haplotype that increases type 2 diabetes risk contains four missense variants with amino acid changes, and the gene encodes a previously obscure and not particularly well-described monocarboxylate transporter. So what about the population frequencies that led to our discovery? What's interesting about the variant, if you look at the derived allele in orange in this pie chart, is that it's completely absent in people of African descent. It is present in about 2% of Northern Europeans and about 11% of people in East Asia, and then it's very, very common in the Americas. That's why it escaped detection in previous GWAS of European peoples and even Asian peoples and then showed prominence in our GWAS in Mexican populations. Now, what is the population history? We did some fancy genetic analysis and we were able to show that this variant was introgressed into modern humans via Neanderthals. Okay, so let me explain what that means. When I went to high school, I had a very linear explanation of human evolution. So I learned about Australopithecus, which I think was the first hominid that was known at the time I went to high school, and then Pithecanthropus came after that, and then Neanderthal, Cro-Magnon, Homo sapiens, modern humans. And you remember, right?
You have the monkey and then you have all the people, and now you have like the computer or the very fat person at the end of the progress. But it turns out that human evolution was not that linear. What happened is that Neanderthals and modern humans descended from a common ancestor with chimps and diverged at one point, and here's the migration out of Africa. So here are Africans who stayed in Africa, here are people moving to Europe, and the Neanderthals were in Europe for a while, and then they coexisted. And when they coexisted, they fought for resources and they competed, but there was also something called gene flow between the two populations, also known as hanky-panky, just to make it clear. And so we all carry some degree of Neanderthal ancestry, 2 to 4% of Neanderthal ancestry, and it turns out that this haplotype that confers risk of diabetes came into us, into modern humans, via Neanderthal admixture and then migrated eastward, rising in prevalence as it made its way. It is not clear if this is because of drift or because of positive selection, but then when you cross the Bering Straits, this phenomenon of a distortion in allele frequencies takes place, because an overrepresentation of carriers perhaps came across the straits carrying this haplotype, and it remained highly prominent or highly prevalent in the Americas.

Now, on to the functional work. All we know is that there is a genetic variant, well, a number of genetic variants, missense variants, on this gene that seems to be the gene that matters. It's a monocarboxylate transporter, so let's see if we can develop an assay where we can measure what it transports, and here is an assay for pyruvate using FRET sensing. The canonical way in which these monocarboxylate transporters transport this molecule across is proton-based.
So whether you measure pyruvate or protons, you see the same curve, and you see that cells transfected with a construct with SLC16A11 have an increase in proton transport both inwards and outwards, confirming that it is a member of that family. Now we want to know whether the haplotype that gives you type 2 diabetes risk has a gain of function or loss of function on this molecular mechanism of transporting pyruvate. When you transfect a plasmid that contains the reference haplotype versus the type 2 diabetes haplotype, you see that the effect of having the type 2 diabetes haplotype is a decrease in pyruvate transport, and again the same results are observed for proton transport. Now, how are the variants doing this? We have a number of coding variants, but as you know, when you do a genetic association study you have a part of the genome that is overrepresented in disease versus health, but you really don't know whether the gene is the gene that's affected and how the molecular variant does that. So you need to do some fine mapping, and we took a number of genetic resources at our disposal to try to generate what is called the credible set in the genetic association region: what are the SNPs that fall within the 80% probability of being causal? There are a number of variants that fall out. Of the four coding variants, there are two that fall out and two that remain, and then there is three variants in the near the transcription start site, two in the promoter and two in the five prime UTR, that also are equally likely to be associated with type 2 diabetes as the rest of the haplotype. So we've done some fine mapping so that we can then concentrate some of the molecular work. Based on the non-coding variants, we tried to see whether maybe the non-coding variants in the risk haplotype affect levels of expression.
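The credible set mentioned above can be sketched in a few lines. This assumes each variant already has a posterior probability of being causal (how those are computed is not covered in the talk), and the function name and example values are hypothetical. The 80% default mirrors the figure quoted above.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float],
                 coverage: float = 0.8) -> List[str]:
    """Smallest set of variants whose posterior probabilities of being
    causal add up to at least the requested coverage."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posteriors must sum to a positive value")
    # Rank variants from most to least likely to be causal.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for name, probability in ranked:
        chosen.append(name)
        cumulative += probability / total  # normalize so the region sums to 1
        if cumulative >= coverage:
            break
    return chosen
```

With made-up posteriors `{"v1": 0.6, "v2": 0.3, "v3": 0.07, "v4": 0.03}`, the 80% credible set is `["v1", "v2"]`; variants outside the set are the ones that "fall out".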
This monocarboxylate transporter is lowly expressed in all tissues, but it is detectable in liver, so we went to liver, being a type 2 diabetes relevant tissue. When you look at the various genes in the region, including one studied by the NIDDK PMA group and all the other genes in the region, the one that has what is called an expression quantitative trait locus, meaning that the existence of the variant affects levels of expression, was SLC16A11. That shows two things: one, that SLC16A11 is likely to be the effector transcript, and second, that some of the variants in the haplotype seem to change levels of expression. This is not seen in visceral adipose tissue, confirming the liver specificity of this action. Now, you can have this sort of pattern where the genetic variant is associated with different levels of the transcript based on cis effects, meaning the variant is directly affecting the transcription of the gene downstream, or maybe on a trans effect, where through some sort of feedback loop that cis variant is affecting some other phenotype that then comes back and regulates expression. So you have not established that the existence of the variant on that chromosome, on that DNA linear arrangement, is actually personally responsible for the changes in expression. One way you can do that is by looking at allelic imbalance, and you concentrate on the heterozygotes. You take people who have two chromosomes, one that has the risk allele and one that has the non-risk allele. By measuring RNA, you can actually sequence it and figure out which of the two chromosomes it is coming from, and then in single individuals who happen to have one chromosome of each flavor, you have allelic imbalance, meaning that in heterozygous individuals the lower levels of expression are due to the presence of the type 2 diabetes risk-associated allele.
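The allelic imbalance idea just described can be made concrete with read counts from a single heterozygous individual. The sketch below is an illustration only: it assumes reads can be assigned cleanly to one chromosome or the other and uses an exact two-sided binomial test against a 50/50 expectation, which is one common choice and not necessarily the analysis used in this study. Function names are hypothetical.

```python
from math import comb


def allelic_fraction(risk_reads: int, reference_reads: int) -> float:
    """Fraction of RNA reads that come from the risk-allele chromosome."""
    total = risk_reads + reference_reads
    if total == 0:
        raise ValueError("no reads to compare")
    return risk_reads / total


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of outcomes
    no more likely than the observed one under the null proportion p."""
    probabilities = [comb(trials, i) * p**i * (1 - p) ** (trials - i)
                     for i in range(trials + 1)]
    observed = probabilities[successes]
    # Small tolerance so outcomes tied with the observed one are included.
    tail = sum(pr for pr in probabilities if pr <= observed * (1 + 1e-9))
    return min(1.0, tail)
```

If a heterozygote shows, say, 30 reads from the risk chromosome and 70 from the other, `allelic_fraction(30, 70)` is 0.3 and `binomial_two_sided_p(30, 100)` is very small, pointing to a cis effect: both chromosomes sit in the same cell and see the same trans environment, yet one is expressed less.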
And so we established that this is really having a direct effect on expression, and you can do that in liver collected from people or in primary hepatocytes where we happen to know the genotype. So that was the missense variants, sorry, the non-coding variants, that are having an effect on expression. So what about the missense variants? I told you about two of them that remain in the credible set. It turns out that the coding variants also affect the way in which the SLC16A11 protein, this transporter, interacts with basigin, which is a carrier protein. So this is out of juice, let's see if this works. When you do co-immunoprecipitation, and there are two different experiments, and you compare the reference haplotype versus the type 2 diabetes haplotype, the one protein that is very, very different in its ability to be co-immunoprecipitated with SLC16A11 is basigin, which is a known chaperone for these monocarboxylate transporters. It is responsible for taking SLC16A11 to the plasma membrane. And when you see whether the presence of the coding alleles leads to changes in levels of expression, you see that the reference allele has much higher levels of expression at the membrane than the type 2 diabetes allele, confirming that. So now we have two mechanisms by which this haplotype leads to lower levels of transport of a still unknown molecule, possibly pyruvate, possibly others. There's one way, which is diminished levels of expression caused by the non-coding variants, and there's another way, which is a diminished interaction with basigin and diminished presence on the membrane. And so in the normal condition you will have the gene that is expressed at normal levels, interacts fine with basigin, gets to the membrane, and does its work.
In the type 2 diabetes condition you have some variants that lead to lower levels of expression but also reduced protein interaction, so that you have levels at the membrane that are on the order of about 10% of what you see in the healthy condition. So what that tells you in terms of a therapeutic approach is that you've been able to establish direction of effect, and what one would need to do is to raise levels of this protein at the plasma membrane. And so this insight, which is the first step toward a potential therapeutic implementation, is an insight that has been derived from one population but is applicable to all populations, and can be used to develop drugs that may be helpful for all groups. So there's one example in which leveraging genetic information across different ethnic groups can lead to greater discovery. Another one, which I alluded to briefly, is by having a transethnic approach, as has been shown by the large type 2 diabetes consortium in this effort, with Anubha Mahajan as the first author, where you take very large GWAS and meta-analyze across many different ancestries: European, East Asian, South Asian, and Hispanic. By having all this variation and meta-analyzing, you can reduce your catchment area, and then you can come up with better intervals for the credible sets of variants. So this is before you do your ancestry meta-analysis: you have an association signal at this locus and you have this very large credible set, and when you bring in the transethnic group you can really narrow down your area to this size. And the same thing for SLC38 and another gene: you can really bring down the credible set to only a couple of variants that then can be associated. So these are ways in which you can leverage genetic differences for discovery.
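The pooling step of a meta-analysis across ancestries can be illustrated with a simple fixed-effect, inverse-variance calculation. This is a swapped-in textbook technique, not necessarily the method the consortium used, and the effect sizes below are hypothetical. It shows only how combining cohorts sharpens the effect estimate; the narrowing of credible sets additionally relies on differences in correlation between variants across ancestries, which this sketch does not model.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (beta, standard_error) pairs by inverse-variance weighting.

    Returns the pooled effect, its standard error, and the z-score.
    """
    weights = [1.0 / se**2 for _, se in estimates]
    weight_sum = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-ancestry estimates for one variant: (log odds ratio, SE).
per_ancestry = [(0.10, 0.04), (0.10, 0.04), (0.10, 0.04), (0.10, 0.04)]

beta, se, z = fixed_effect_meta(per_ancestry)
print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, z = {z:.1f}")
```

With four equally sized cohorts the standard error halves, so a signal that is modest in any single group becomes much stronger in the combined analysis.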
So what about using genetics to measure ancestry and maybe explore health disparities? This is something that has been known for a while. This is one very early example from Bill Knowler, who is an intramural NIDDK investigator in the Phoenix area and works with the Pimas, and I have this paper here because it has that beautiful abstract that captures the entire story. It turns out that if you look at this Gm haplotype, and it's not necessary to know what it does, there's a negative association with type 2 diabetes in the Pimas, and you would think that maybe this is a risk factor for the disease, but it is not a risk factor for the disease. What happens is that this haplotype is present in people of white European descent and not present, or at a much lower frequency, in people of Native American descent. So it is not a marker of disease, it is a marker of ethnicity, and so there's a confounding that takes place, which is known as population stratification, and that sort of difference can be leveraged to characterize the global genetic ancestry that people have. So this is for a single marker in the genome. When you take all of the variants that exist, the majority are going to have similar allele frequencies, say between Europeans and Native Americans, but there's a number that will not. There's a number that will be much more frequent in people of Native American descent and very rare in Europeans, and there are some that are really frequent in Europeans and more rare in Native Americans. Those are called ancestry informative markers because, similar to that Pima haplotype, they can help you estimate the amount of global ancestry that a particular person has.
So when you take a GWAS array and you perform principal components analysis, and here are two different axes of principal components using GWAS in people of European descent, principal component number one and principal component number two, and you have the data plotted out and then you color it by country of origin, it's beautiful how you see that in this two-dimensional space driven by the genomic data you can almost reproduce the map of Europe. People of the Iberian Peninsula, southwestern Europe, coalesce in this part of the space, then you have people of western Europe, mostly Germany, here are people from Italy, here are people from the Balkans and Greece, and people from Scandinavia, and then the Slavic people, all of them basically falling out in this two-dimensional space based on genomic ancestry. So we are fairly sophisticated in being able to capture ancestry through these tools. We did this in an effort to understand whether the differences in type 2 diabetes that you see in Latinos are driven by the ancestry you have versus socioeconomic factors. We had two populations, in Mexico and Colombia, and we were able to genotype a few dozen ancestry informative markers that allowed us to estimate global ancestry proportions for each person. Shown in black are people with type 2 diabetes, shown in gray are people without type 2 diabetes, arrayed across the gradient of the percent of European ancestry. What you can see is that the more European ancestry you have, the less diabetes you have, and the more Native American ancestry you have, the more diabetes you have, both in Mexico and Colombia, with the effect being much more marked in Mexico.
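Estimating a global ancestry proportion from a handful of ancestry informative markers can be sketched as a one-parameter maximum likelihood problem. The version below is a deliberate simplification and not the method used in the study: it assumes only two source populations, independent markers, known allele frequencies in each source population, and a simple grid search. The function name and all frequencies and genotypes are hypothetical.

```python
import math


def estimate_ancestry(genotypes, freq_pop_a, freq_pop_b, grid_steps=100):
    """Grid-search estimate of the fraction of ancestry from population A.

    genotypes: count of the tracked allele (0, 1, or 2) at each marker.
    freq_pop_a, freq_pop_b: frequency of that allele in each source population.
    """
    best_q, best_loglik = 0.0, -math.inf
    for step in range(grid_steps + 1):
        q = step / grid_steps
        loglik = 0.0
        for g, pa, pb in zip(genotypes, freq_pop_a, freq_pop_b):
            # Expected allele frequency for a person with ancestry fraction q.
            p = q * pa + (1 - q) * pb
            # Keep p away from 0 and 1 so the logarithms stay finite.
            p = min(max(p, 1e-6), 1 - 1e-6)
            loglik += (
                math.log(math.comb(2, g))
                + g * math.log(p)
                + (2 - g) * math.log(1 - p)
            )
        if loglik > best_loglik:
            best_q, best_loglik = q, loglik
    return best_q


# Hypothetical markers: allele common in population A, rare in population B.
freq_a = [0.9, 0.9, 0.9, 0.9]
freq_b = [0.1, 0.1, 0.1, 0.1]

print(estimate_ancestry([2, 2, 2, 2], freq_a, freq_b))
print(estimate_ancestry([1, 1, 1, 1], freq_a, freq_b))
```

The more the allele frequencies differ between the source populations, the more each marker narrows the estimate, which is why a few dozen well-chosen markers can be enough for a rough global proportion.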
Now we had some rudimentary socioeconomic status information, and we wanted to tease apart how much of this was driven by socioeconomics versus ancestry. Essentially we were not able to really disentangle the two because, and it's remarkable in this day and age but it still happens, socioeconomic status very highly correlates with ancestry in these countries. This has been explored in greater detail by a collaborator, John McKinlay, in an NIDDK-funded study in Boston where they surveyed various neighborhoods in Boston, trying to represent African Americans and Latinos so as to have basically equal samples of each of the ethnic groups, and then we typed these markers for them to try to determine ancestry, both in terms of African versus European and Native American versus European. They constructed models that included socioeconomic factors; psychosocial factors, so for example major life events or sense of control; environmental factors, such as crime in the neighborhood, food environment, ability to exercise; lifestyle behavior, so diet, physical activity, smoking, et cetera; and then biophysiological things like lipids, BMI, et cetera. They constructed a theoretical model and they used structural equation modeling to see what determines what, what is causal for what, and how much is captured by each one of them. It's a complex study and there are many things that one could raise about it, but to really drill it down, it turned out that when you look at the actual data from these people that were collected and phenotyped, about 20 to 25 percent of the excess type 2 diabetes that you see in African Americans and Latinos is explained by race or ethnicity, and of that effect of race or ethnicity, that about a quarter of the excess risk, 38 to 46 percent is mediated by other domains, with socioeconomic factors being the prime driver.
So there is about 20 percent in this data set that can be explained by race or ethnicity, and the major driver of the race or ethnicity difference is socioeconomic factors, which really tells you that there are things about health disparities that are fixable. We may not be able to do much about our genome, but there are definitely things that we can do about access to care, about neighborhoods, about architecture, about healthy lifestyle, about interventions, et cetera, particularly in the people who need it the most. And so I just want to close in the last few minutes with some examples where genetic differences may matter. I've been trying to make the point that we're more alike than different, that we share our genome, that common variants are the same in all groups, that we can leverage genomics to make discoveries. But are there situations where having a specific ancestry may lead to different treatment decisions based on a single genetic variant? That can happen when you have unique variants of strong effect. So we took the same SIGMA group of 9,000 people, we selected the 4,000 that were in the extremes of Native American ancestry, and in those people we performed whole exome sequencing, trying to discover coding variants that would maybe be more rare but have stronger effects. And that effort led to the discovery of a variant in HNF1-alpha, which is a MODY gene, a gene that causes monogenic diabetes. I don't know what I just did. Okay. Recoverable. And here's the meta-analysis for the cohorts. The bottom line is that there's a five-fold odds ratio. Remember, the odds ratios before were 1.2, 1.3; this is a five-fold odds ratio conferred by this variant on risk of diabetes in this population, a previously undescribed variant in a MODY gene. And the reason it's previously undescribed is because it is nonexistent in other groups, and it's present in about 2% of Mexicans.
So here are the previously described MODY variants, here's ours, and it's in a transactivation domain of this transcription factor. When you do reporter assays and you see what the effect on molecular function is, here's the wild type, which has 100% transcriptional activity, and here are the known MODY variants that basically induce complete loss of function, with a very radical phenotype of diabetes at an early age. Our variant, where glutamate becomes a lysine at position 508, gives you about 40% transcriptional activity, so middle of the road, and of course it is only carried in the heterozygous form, so these people have residual function, and maybe that's why their diabetes is not particularly distinguishable from type 2 diabetes. So if you take the population that we examined, the 4,000 people, and you plot their age of onset by their body mass index, with the hope that people with MODY-like diabetes, monogenic forms of diabetes, would be in this quadrant of being young at onset and lean, everybody who carries the variant is in the typical quadrant for type 2 diabetes, so they're clinically indistinguishable. But we know that people with MODY of this type respond better to sulfonylureas than to metformin. So it may be useful to examine whether these people have a preferential response to sulfonylureas, because that would change our treatment algorithm. And it is not a trivial question, because if you make the calculation from the millions of patients with diabetes in Mexico and the 2% prevalence of this variant, that means that there are 140,000 people in Mexico who carry this mutation who might be treated differently based on knowing the genotype. So there's one example where maybe knowing genetics may help: a variant of strong effect carried in a specific segment. A similar story has emerged in Greenland.
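Returning to the Mexico figures quoted above, the arithmetic can be checked in a few lines. The transcript gives the carrier prevalence (about 2%) and the resulting count (140,000) but not the exact patient total, so the sketch below back-calculates the total those two numbers imply. That total is an inference from the quoted figures, not a number stated in the talk, and the function name is illustrative.

```python
def implied_population(carriers, carrier_frequency):
    """Population size implied by a carrier count and a carrier frequency."""
    return carriers / carrier_frequency


carrier_frequency = 0.02   # "about 2%", as quoted in the talk
carriers_quoted = 140_000  # carrier count quoted in the talk

patients = implied_population(carriers_quoted, carrier_frequency)
print(f"implied number of patients: {patients:,.0f}")
```

The two quoted figures are consistent with roughly seven million people with diabetes, assuming the 2% prevalence applies uniformly to that group.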
So if you're a physician who takes care of people of Inuit descent, whether it's in Alaska or Greenland or northern Canada, you may be interested to know that there is a stop codon in this gene, TBC1D4, that has a minor allele frequency of 17%, so it is highly represented in this population, and that has a big effect on two-hour glucose and a huge effect, 10-fold higher risk, on type 2 diabetes. It turns out that this gene works by mediating AKT-induced glucose uptake in muscle by mobilizing GLUT4, and so you have loss of function by the introduction of this stop codon. You have less GLUT4 that makes it to the membrane of the myocyte, less glucose utilization, higher glucose in blood. And so a physician who's taking care of this population may want to know that for about one-fifth of your patients this is the mechanism by which they get diabetes. And so the drug that you want to use is probably an insulin sensitizer rather than an insulin secretagogue. And the last example I will show is something that we just published recently. This is work from the MAGIC consortium that was mentioned. MAGIC spells out the Meta-Analyses of Glucose and Insulin-related traits Consortium. It's amazing. We spent three months of conference calls trying to come up with the acronym, people vote, you know. And there are a lot of British people in this consortium, and Harry Potter was famous at the time, and so, you know, magic was sort of the thing that they came up with, and then the next three months were for the logo. It turns out that Inga knows an illustrator, and so this is the top hat of the magician, and here's the magic wand, here's the DNA, here's the Manhattan plot. It's kind of unfortunate. It looks like this is smoking the DNA, so it's sort of, you know, not great for a diabetes study, but anyway, that's what turned out.
So this study was led by Ellie Wheeler and Aaron Leong, who are trainees, with Inês Barroso and my colleague and friend James Meigs at MGH as the senior authors: the trans-ethnic GWAS for hemoglobin A1C. Of course, hemoglobin A1C is not only a marker for glycemic response in diabetes, but it is also now used for diagnostic purposes. As of the last three or four years, if you meet a threshold for hemoglobin A1C you're going to be called as having diabetes or not, and that is different from the previous practice of simply using A1C in the individual person to follow the evolution of glycemia over time based on treatment. In this trans-ethnic analysis there were 60 SNPs that were associated with hemoglobin A1C, and it turns out that, hemoglobin A1C being glycated hemoglobin, there are many things that affect those levels: the levels of hemoglobin itself, which may vary depending on the red blood cell, its life course, and the levels of hemoglobin being synthesized, and then the levels of glycemia. And of course what you care about for diabetes is the levels of glycemia and what correlates with fasting glucose and two-hour glucose, and not necessarily with the red blood cell lifespan. Of those 60 variants, 22 could be very clearly ascribed to erythrocytic ways in which they modulated A1C, because of their impact on other red cell parameters, and 19 could be ascribed to glycemia, based on their impact on fasting glucose and two-hour glucose. So why is that important? Because when you construct a genetic score based on either the total number of variants or the erythrocytic variants, and you divide the population into those at the bottom 5% of the genetic risk score, the top 5% of the genetic risk score, and those in between, and you look in Europeans, you have this monotonic, linear relationship, with the genetic risk score being highly correlated with hemoglobin A1C.
The same sort of thing happens in Asians but with a narrower span, but there is a huge difference in people of African descent because of the presence of a single variant in G6PD, which is a coding variant at position 202 that gives you G6PD deficiency in the carrier state. It is an X-linked variant that has been pretty much ignored in GWAS because of the difficulties in analyzing the X chromosome, but when it was done here, and when you look at women or at men in the hemizygous state, there is a much larger spread in the levels of A1C depending on whether you carry the variant or not. So what does that mean? This is eminently translatable. First of all, we know that the glycemic genetic risk score is associated with type 2 diabetes, while the erythrocytic genetic risk score is not associated with type 2 diabetes in any population, but we know that this variant is highly prevalent in people of African descent. So if you're going to use this threshold for diagnosing diabetes, with a variant that affects levels of A1C through non-glycemic parameters, simply based on red cell lifespan, you're going to underestimate the number of people who have diabetes if they happen to be of African descent, because there is a reduction of about 0.8%, which is clinically significant, in the A1C level based on the presence of this allele, and a little less in women. So the estimate would be that if you were to use A1C for type 2 diabetes diagnosis in the United States, 2% of African American adults would go undiagnosed based on the 6.5% threshold, which is 650,000 people. So these are three examples in which maybe genetics in specific populations may be getting into clinical practice. So, take-home messages for the talk, and I'm very close to concluding. Common variants are shared across populations. I made that point. After you account for allele frequency differences and patterns in different populations, it turns out that genetic variants tend to have very similar effects on glycemic phenotypes.
Transethnic analyses definitely enhance discovery, whether it's for variants that are unique to a population or whether it's for fine mapping across populations. These large allele frequency differences can really increase power for you to be able to find things that you didn't discover before in a single population. We can use ancestry markers to estimate the global ancestry proportion that a person carries, and maybe help in determining where health disparities come from and what is fixable and addressable through public policy. When we discover things in a single population, the insights that are gained from that discovery can actually lead to the development of therapeutic agents that apply to everybody with diabetes, regardless of where they come from. And finally, in some specific cases, unique genetic variants may lead to differences in treatment. A lot of this information that we are collecting in our field is being put together in a portal that is funded via the Accelerating Medicines Partnership, a joint academic, government, and pharma consortium that is now concentrating on three diseases, one of them being type 2 diabetes. The major mandate for the type 2 diabetes community is to produce this portal that collects all of the genomic and metagenomic information, expression data, and epigenomics from relevant tissues, puts it in a place that is safe and secure, and develops an analytical engine that allows any user in the community, not just geneticists, not just bioinformaticians, not just statisticians, not only smart people who code all the time, but anybody who has an intelligent question, to come in and ask the question in English and then have the analytical engine find the result in real time and spit it out to the community.
And so this is already up and live, based on funding by this initiative, funded by NIDDK and five major drug companies that are involved, and you can already use it for your preferred gene of interest, your knockout mouse, or your drug target that you're working on, or your high school paper that you're trying to write. So let's end on a little bit of a philosophical note. You know, I think there was a large national investment that was made into the Human Genome Project. There are still a lot of investments that many institutes make in genomic studies. There is sometimes pushback on whether all this investment is really leading to things that transform medicine. Some people think it's maybe a lot of hoopla for nothing, and question whether we should really continue to invest. So I'll bring you back to another former US president, this one from Massachusetts, who in May of 1961 had this crazy idea that somehow we would make it to the moon. And I don't think he had any clue as to how getting to the moon would improve life on this planet. You know, why would putting Americans, or any human being for that matter, on the moon lead to better life on Earth? But this is something that the United States could do. This is something that we could try to muster as a national effort. And if we had the technology and the ability and the know-how to answer the question, we would go and answer the question, and then we would find out whether it was useful or not. So I think the human genome is very much in the same category, where we say we have the human genome and we need to answer the question: will it transform medicine? Yes or no? And either answer is useful. Either answer will help us see, maybe in this case it will, in this case it will not. But we have the technology, we have the know-how, we have the interest and the passion.
And so all of us who are working in this area are very excited about being able to say that there was a before and an after, and being able to close that loop and say that medicine was different because the human genome was sequenced. So here are all the people that I work with. We had last week sort of a mini laboratory where we reviewed the human resources driven questionnaires that people fill out to see what it's like to work in the group, and we had a conversation as to what our research group is like and what people enjoy or don't enjoy about being in it. And it was really interesting how, almost to a person, people were saying that what they enjoy the most about our research group is the diversity that it has. Diversity along many different axes, but diversity in disciplines: you see that we have some people who do genetic analyses, people who are clinically trained, people who do basic work in the lab on functional genomics, software engineers, many collaborators. And what they appreciated is that because it's so diverse, no one feels like they know everything. And so it's a friendly environment. It's so diverse, with many different points of view, many different opinions, and everybody feels free to voice an opinion because it enriches the conversation. So I'm very grateful to all of them for what they bring to the table, and of course to all the people who support us financially. And then there's the not-so-diverse home environment, where we have four daughters, a Cuban American wife, and the empiric lack of evidence that I can transmit a Y chromosome. And so with that I will close and be open for questions. Thank you very much. I guess no one understood anything. Okay, I can do it again. I can do it again.
So thank you for your talk. I want to go back to the concept of using race and ethnicity. And I know in the title of your talk you used race slash ethnicity. How do you think as researchers we should talk about populations in our research?
So yeah, that's a complex question. I'm by no means an expert. What I can tell you is that if people are trying to think about the biological axis, that is a subset. What was shown in the New England research studies is that ancestry informative markers of ethnicity were basically subsumed by self-described ethnicity. But we're not a biology. What I talk about is the alleles that you inherited based on where your ancestors lived thousands of years ago. They lived on different continents. And so I'm not talking about your skin color. I'm not talking about your language. I'm not talking about what you look like to me. I'm not putting you in any category. All I'm saying is you carry a set of variants that in some ways are similar to mine and in some ways are different. And to the extent that those variants affect biology and thereby affect your risk for disease or your response to medications, I care about those variants. But to the extent that they're not 100% in one population and 0% in the other, I will not make assumptions about you based on the variants. I will genotype your variants and then figure it out, because you could carry variants that were carried by a European ancestor in your past, and that's the one you inherited even if your global ancestry is 90% of African descent. So when it comes to the biology, I will not talk about race and ethnicity. I will talk about the variants you carry and where your ancestors lived thousands of years ago, on what continent. And I think that's a conversation that is scientific and, in a sense, doesn't have any value judgment attached to it. It's simply the biology of it. But of course, many other things influence risk of disease besides the biological variants you carry.
And as we showed, there is a large correlation between those biological variants and your skin color, and how you're perceived, and how people made it to this country, and what's happened in the intervening 300 years to make them less able to access care, et cetera, et cetera. So there are all these other layers that go along with the biological variants that mark you. And because they affect disease risk, I will also think of those things, and they may affect your approach to medicine. They may affect mostly Latino patients. They may affect how you perceive who's involved in your healthcare, in terms of the family environment, who's responsible for the cooking in your house. Holding two jobs to maintain your newly immigrant family here, so you can bring in the rest, may affect your ability to exercise. So there are a lot of other constructs that go into how you treat a person based on what they self-describe. So when it comes to biology, I talk about ancestry; when it comes to everything else connected with race or ethnicity, I will take into account the cultural and socioeconomic context. And I was telling trainees at lunch that as a Spanish-speaking diabetologist, I do connect with my Latino patients in a different way than I connect with all the other patients. And I think if you asked them, they would say the same thing. We laugh at the same kind of jokes and we talk about soccer. And with a disease like diabetes, which is a lot about encouragement and motivation and about a therapeutic alliance, having that sort of personal relationship means a lot. So it's a long answer.
So is our Neanderthal contribution overrepresented in disease risk variants? You gave the one example. I'm just wondering, it seems like I've heard that before.
So I'm not aware of any systematic look, actually. Maybe that exists. Maybe Eric or others can comment.
I'm not aware of whether the Neanderthal contribution to our genome is overrepresented, in terms of, for the amount of DNA that is Neanderthal DNA, does it have more disease variants than not? I don't know. It's a great question.
I'm going to take the bait on your last slide about transforming medicine as a way to just pick your brain. So in April, it will be the 15th anniversary of the completion of the genome. Probably, yes, we're 15 years into this odyssey, if you will, which really began when we had the genome sequenced. From your perspective as a physician-scientist, what have you been most impressed with en route to a transformation? And then maybe in contrast, what have you been most disappointed by? Where do you think we're really lagging in terms of seeing transformation? And then there's a third question. Where does your work as a diabetes researcher and physician fit? Are you more transformative or more on the lagging side?
So, okay, several steps to that question. Number one, I think the public has not been necessarily fair, and maybe that's partly driven by the hype of geneticists themselves, but the public is not particularly fair in giving time for these things to play out. If you look, for example, at when the microscope was first designed, and the time it took from the microscope as a tool to the description of the cellular architecture that a microscope could yield, it was decades. If you look at aspirin being discovered as a drug that is helpful for pain, and then maybe even for coronary disease prevention in the landmark trials in 1982, to then describing the mechanism of action by which aspirin achieves those effects, there was also a lag. So I think 10, 15 years is still within the very early phase when it comes to discovery translating into clinical medicine. And there are many examples in the development of therapeutics where such a lag is not unexpected. So that's number one.
Number two, I think we've realized very clearly, and it's something that we kind of knew but maybe ignored, that it is very difficult to go from the relatively straightforward genetic association of a variant in the genome with a phenotype to the fine mapping of that variant, identifying the gene involved, the effector transcript, and the molecular mechanism by which that happens, let alone the physiology and the systems level. So far that has not been able to be done in high throughput. It has to be done tissue-specific, disease-specific, mechanism-specific, and as I showed, it takes three, four years of dedicated postdocs and graduate students doing a lot of basic science. So going from gene association to function is a laborious road. We kind of knew that, but it takes a while. It's clear, at least for type 2 diabetes and other complex phenotypes, that genetics for prediction is not really going to be that useful. And that, we maybe didn't know: the effect sizes are small at the population level. We are really good at predicting outcomes on the basis of very simple tests that you can get in your PCP's office. And so the hype of precision medicine, that I'm going to be able to predict everything that's going to happen to you, your complications and so on, based on genetics, I think was hype, and it's not being realized. That is not the same for pharmacogenetics. One thing is disease, where there is selection pressure against it. The other thing is exposure to drugs, where maybe effect sizes could be a little bit larger than they are for disease, because the pressure from drug exposure is only 100 years old, and so there hasn't been any time for selection to really drive things. But that's an open question. I think what I'm most excited about, in addition to discovering new mechanisms, is the use of genetics to be able to understand disease heterogeneity.
And so I didn't present any of this, but type 2 diabetes, you know, is a very heterogeneous phenotype. People are given the diagnosis based on hyperglycemia, which is the end result of many different processes, that is not autoimmune, so not type 1, and that is not monogenic, so not MODY. If you don't have MODY, your antibodies are negative, and you have high glucose, you have type 2 diabetes. And type 2 diabetes in a North African lean person is different from the one in a South Asian or in an African American, et cetera. Hyperglycemia is like tissue growth in cancer. You would never say you just have one disease because you have tissue growth. You have many different things that can lead to tissue growth. And so in work that Miriam Udler and their group have been doing, she's been using GWAS for 50 different phenotypes to then take the type 2 diabetes associated variants, a hundred of them, and across the 50 phenotypes have that data drive clusters that are genetically defined but phenotypically driven. So I can use the variants to define the clusters, but I'm letting the phenotypes tell me how they cluster. And from that, there's an emergence of a clear beta cell subgroup that has a proinsulin subgroup. And then there's an insulin-resistant subgroup that has an obesity flavor and a liver flavor and a lipids flavor, which then may allow us to understand the heterogeneity of disease and to say that maybe for strata of the population, or maybe for the top people in each one of these distributions, I will have different therapeutic approaches or different surveillance approaches. So I think in terms of understanding the heterogeneity of disease, we should be able to move the needle. Thank you.