Good afternoon. I'm Vence Bonham, the Acting Deputy Director at the National Human Genome Research Institute. I'd like to welcome you to the 27th lecture of the NIH Genomics and Health Disparities Lecture Series. The first lecture was held in May of 2015. The series aims to highlight the opportunities for genomics research to address health disparities. The series is co-sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases, the National Institute on Minority Health and Health Disparities, the National Heart, Lung, and Blood Institute, the National Human Genome Research Institute, and finally the Office of Minority Health and Health Equity at the U.S. Food and Drug Administration. Speakers have been chosen by the lecture series co-sponsors to present their research on the ability of genomics to improve the health of all populations. We are pleased to have Dr. Andres Moreno Estrada joining us this afternoon, and my colleague, Dr. Eliseo Perez-Stable, who is the Director of the National Institute on Minority Health and Health Disparities, will introduce our speaker. Eliseo.

Good afternoon, everyone. It's really a pleasure to be here with you today. I'm here in person with our speaker and a couple of other people. I'm the Director of NIMHD, as you heard. Dr. Andres Moreno Estrada is a Mexican population geneticist who is interested in human genetic diversity and its implications for population history and medical genomics. He trained as a physician at the University of Guadalajara and subsequently went to Spain to complete a PhD in Biomedicine at Pompeu Fabra University in Barcelona, where he focused on evolutionary genetics. He worked on population genetics as well as the analysis of genetic variation in candidate genes under positive selection in the human lineage. Dr. Moreno Estrada was a postdoctoral fellow at Cornell and at Stanford School of Medicine, and later became a research associate in the genetics department at Stanford University. For his work in Latin America, he was awarded the George Rosenkranz Prize for Health Care Research in Developing Countries. His work integrates genomics, evolution, and precision medicine in projects involving large collections of understudied populations from the Americas and the Pacific. Dr. Moreno Estrada is currently the principal investigator of the Human Evolutionary Genomics Lab at the National Laboratory of Genomics for Biodiversity in Mexico. He is the co-founder of the Latin American Alliance for Genomic Diversity and a member of the Executive Committee of the International Common Disease Alliance, or ICDA. Dr. Moreno Estrada is also here in Washington because he's part of a National Academy commission to study the proper use of population markers in thinking about genetics, a study that was initiated by the Genome Institute, NIMHD, and other ICs here at NIH. Following Dr. Moreno Estrada's presentation, my colleague, Dr. Leonardo Mariño-Ramírez, an Earl Stadtman tenure-track investigator in the NIMHD Division of Intramural Research, will facilitate the discussion. Please submit your questions at any time during the presentation via the Q&A box, and we will get to them afterwards. Now, let me welcome Dr. Moreno Estrada to give his presentation on medical genomics and underrepresented populations from Latin America and the Pacific. Andres?

Thank you very much, Eliseo, for the kind introduction. Thank you, Leonardo, for organizing this exciting meeting. I am super pleased to be here. It's really an honor to be part of this amazing lecture series. As you heard, I'm Andres Moreno, originally from Mexico. And today I will actually go directly into the topic of disparities, but by mentioning something that is not in my CV.
So, after all you heard from Eliseo, this is something you could not have known, because it is something I want to mention in the context of my family history of community engagement and anthropology research. It turns out that my father worked for many years as an anthropologist throughout many rural areas in Latin America, including Mexico and Brazil. You have him here, in some pictures of his community engagement work. My sister and I grew up in these communities throughout our childhood, so we were really exposed to the reality of many of the disparities that these communities face on a daily basis. And if you talk about disparities in a general context, there is nowhere else you can find a more direct contact with them than living in these areas and communities, and that really got me interested in how to translate this somehow into a benefit through research. So with this motivation I went to med school, not to do clinical practice directly, but to understand better what makes humans, as individuals and as populations, different, why we differ between communities, and how we can address those differences through biology concepts and then health care. As I mentioned, I then moved to Barcelona, and here the main challenge was not to do top-level research but to stay concentrated and do a PhD thesis while having these amazing views from the building that is right there. You can see this research park, which is a very attractive place to be. So, to make it even harder to stay focused on research, I moved to California, which is also full of really nice views and weather, so I was very fortunate to be there for five or six years in my postdoctoral training. Finally, the goal was to go back to my home country and establish my own laboratory, which is right here in the middle of Mexico. You can see it's a very modern campus with the infrastructure to do genomics in all sorts of species, including humans.
We have a graduate program called Integrative Biology that accepts national and international candidates. So, just very briefly, this is how my lab looks, and of course these are pre-pandemic times, when we were able to have lab retreats in amazing locations in Mexico. This is basically the first lab of its kind on the campus focused on human evolutionary and population genomics, and I think it has been very interesting to integrate the view of evolutionary biology and population genomics back in Mexico. I think the main motivation of my lab and many others, and I don't need to convince this audience of course, is that the biggest challenge we are facing, those of us who are interested in diversity and population genomics, is that there is a very big bias in the existing diversity in databases. This has been said for decades now, and we have been realizing over recent years that most of the participants in genomic studies, particularly studies of association with health traits, have been primarily people of European descent. And that is in really sharp contrast with the actual proportion of people from different ancestries across the world. You can see here the proportion of participants over different years, and how it has been increasing disproportionately: even though there are more participants from other countries as well, the proportion of Europeans is still the largest by far. You can also look at the same thing in near real time on this Diversity Monitor web page, where you can check and browse yourself, for any time window you want, how this proportion is reflected in the number of participants from different countries, and you can still see the same trend right here.
Even more concerning is what you see in previous years. In the upper part you see a snapshot from 2008, for example, where participants from Latin America were mostly absent, except for some in Mexico and many others of Latino descent in the US. And despite that, you can see in the proportions on the left that they actually account for nearly 1%, right here, Hispanic or Latin American, which is quite small of course, and that's part of the problem. But look at more recent years. This snapshot I took just last night, actually, and the proportion has decreased even more. Now we're down to 0.25%. And although the picture on the right looks better somehow, because we have more representation from other countries, that is not really reflected in the proportion of genomes being deposited in these databases. So that means that even though the representation might be getting better, proportionally we are still quite behind, or even more than before, in terms of the number of genomes from other ancestries that are represented in these databases. Despite that, of course, there are a lot of efforts that have been trying to address diversity as much as possible globally, and one of the prime examples is the 1000 Genomes Project, where they were able to sequence more than 1,000 genomes from all over the world. And the key point here is to see how many novel variants are being deposited in these databases by sequencing new populations of other ancestries. So at the bottom you have how many of these SNPs were new after sequencing the individuals of the 1000 Genomes Project.
So what you can see here is that most of the genomes of different ancestries, like Europeans in blue and Asians in green and so on, contribute on average a similar amount of novel SNPs, except for this other group, as expected, of African populations, which you can see here in these yellow shades, all of them contributing more SNPs, on average five million SNPs per individual, compared to the rest. And here, in this transition area, is the interesting part, with all the individuals from Latin America. You can see here PEL from Peru, CLM from Colombia, Puerto Ricans, Mexicans. They have this gradient, where some individuals actually contribute as much as an African individual and others even less than European individuals. That variation might be mostly explained by the fact that in Latin America you don't have a homogeneous population pattern. You actually have individuals that are heavily admixed, meaning that they descend recently from different source populations. We know that from history, because several source populations have contributed to the genetic makeup of Latin America very recently: primarily Native American sources, of course, but more recently also sources from Europe, about 500 years ago, and after that people from Africa as well. This gets translated, of course, into genetic signatures, which we can actually detect by applying different algorithms that assign chromosomal segments to different ancestries, and that's what we leverage to make a lot of the inferences we make in population genetics. So if we want to see how people from Latin America, or of Latino ancestry, would look in a global space, we can summarize the axes of variation of a global cohort, which is the PAGE Consortium.
And here we have more than 150 individuals from all over the world that you can see represented in PCA space, and they cover basically every continental reference you can imagine, like Europeans, Africans, Asians, and so on, and also Native Americans from different places in the Americas, Oceania, the Middle East, and so on. The grey individuals here are the participants from the PAGE Consortium, and if we ask in which area of this PCA space Latino individuals cluster, then we get this picture, where all these red symbols with an H are the space that Latino individuals occupy in this PCA space. So basically every possible corner of the PCA space is also represented by Latin America, and this is an argument in favor of the fact that we cannot just categorize all Latin Americans in a single homogeneous cluster or under a single label. It is basically a heterogeneous group that deserves a lot of fine-scale investigation. This is not only important for analyzing these fine-scale ancestry patterns. It is also important for medical genomics directly, because thanks to the inclusion of multi-ethnic samples in GWAS, for example, and this is again among the main results of the PAGE Consortium, novel associations can be detected, such as the ones you can see here at these yellow locations for different chromosomes and different traits. Some of them were only discovered thanks to the fact that new ancestries were included in the PAGE Consortium.
So these novel associations, we can actually map them according to which axis of variation, represented by the principal components, they were detected on or driven by the individuals along that particular axis of variation, such as this one here that we see on PC4, which is mostly accounted for by this Native Hawaiian component, or these other ones, also in blue, on PC6, which account for these other novel variants in APC specifically driven by Native American individuals. So again, only thanks to the inclusion of those individuals in this multi-ethnic GWAS was it possible to detect these new associations. Now, we know that the Americas, and Latin America in particular, have a lot of diverse roots, linguistically, culturally, and, as we will see now, also genetically. So the question is, how is this impacting their genetic history and therefore their health profile? For that, over the past decade I would say, we have been addressing these questions in a regionalized way, trying to carry out fine-scale diversity projects in different parts of Latin America. I will mention just a few examples, like the Mexico Diversity Project and similar efforts in the Caribbean and, more recently, South America as well. So, let me start by recapitulating a little bit of our early work on the Mexico Diversity Project, which aimed at characterizing for the first time more than 15 indigenous communities across the country. This was done in collaboration with UCSF, Stanford, the National Institute of Genomic Medicine in Mexico, and several other institutions of geneticists and anthropologists throughout Mexico. These individuals, more than 400 of them, were genotyped using microarrays, and we ended up with a panel of more than half a million SNPs across the combination of all the populations. And you can see that it covers all the major areas in Mexico.
The first thing we learned is that a large proportion of these individuals are mostly genetically isolated. This is the picture you get of how related these individuals are among different communities. As you can see, each node here represents one individual, and a line connecting any given pair means that those two individuals share more than a certain number of centimorgans. You can see that most of the connections are only within communities, and the very few that are actually between communities are between geographically close ones, or along the Pacific corridor, or this exception here, between the Yucatán peninsula and the central part of Mexico. But apart from that, most of the communities only share relatedness within their own communities, and the shape of the cluster is proportional to how much they share as well. So, for example, the Seri population in the Sonora Desert is a very isolated community. This is also known from anthropology, but you can see how it is reflected in their genomes: they form a very tight cluster with no connections outside, and the degree of relatedness there is much higher than in any other population. Another example here is the Lacandon, in the Lacandon jungle in Chiapas, which is another example of this isolation pattern. Also, because of these isolation patterns, we can ask how much differentiation there is between these populations, and that can be measured easily by the fixation index, the FST.
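As a rough illustration of the fixation index mentioned above, here is a minimal Python sketch of one common estimator (Hudson's FST, computed as a ratio of averages across SNPs). The talk does not say which estimator was used in the study, so the function name `hudson_fst`, its arguments, and the toy frequencies are all assumptions made for illustration only.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST between two populations (ratio of averages across SNPs).

    p1, p2: lists of per-SNP allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population.
    Hypothetical helper for illustration, not the study's pipeline.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator if denominator > 0 else float("nan")


# Toy allele frequencies (invented numbers, not data from the talk).
isolated_a = [0.90, 0.10, 0.75, 0.05]
isolated_b = [0.20, 0.60, 0.30, 0.55]
print(hudson_fst(isolated_a, isolated_b, n1=40, n2=40))
```

Values near 0 indicate little differentiation and values near 1 indicate nearly fixed differences, which is the scale on which the pairwise percentages in the talk are reported.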
And we can see here in this table a pairwise estimation of these FSTs. On average you can see that most populations have the expected average of 5% or so, but many other pairs have way more than the 5% that would be expected for a within-continent comparison. Look at these other values, for example between the Lacandon and the Seri that I mentioned before: they have more than 13% difference between them. To put this into context, that's even more than the difference that has been found between Europeans and East Asians, which is 11%. That is a comparison between different continents, and that would be the kind of FST value expected there. But it means that within Mexico, just two isolated communities can have even more differentiation than two continental populations separated by thousands of kilometers, which is much more than the distance between northern Mexico and southern Mexico. You can also see here in this PC plot that they are clearly separated by geography, following a north-south axis, which is consistent with a north-to-south colonization in the early stages of the peopling of the continent. So this clearly reflects the demographic history, but we also wanted to know the consequence of this pattern in the general population. So now let's look at the cosmopolitan population across different states in Mexico. This is a collection of other samples that were collected contemporaneously by INMEGEN in Mexico. So we combined both datasets, and the result is something like this. We have here the reference panel of the Native American populations from our study, now more than 20 individual populations. And then on the right side you see the cosmopolitan samples from different states, also ordered along a geographic gradient.
So, the first thing we see is, of course, a higher proportion of Native American ancestry, here in blue, in the southern states, as has been documented before. But when we look at higher resolution, in this model of nine sources, we see that there are actually different indigenous components that are regionalized very clearly, like this light blue, mostly represented by the northern individuals, this darker blue, mostly represented by the Oaxaca indigenous communities and extended throughout Mexico, and this Mayan component in orange. You can see how these are actually reflected in the corresponding cosmopolitan samples from their regional states, meaning that there is a lot of genetic continuity between the recently admixed population and the local sources. But we wanted to quantify this even more. For that we developed a method that combines local ancestry estimation, which is basically an estimation of which ancestry is located in a particular chromosomal segment. We can do that very confidently with many methods now, there are a lot of methods doing this, like RFMix and more recently Gnomix, and there are several methods that do it very accurately. That creates a map of coordinates where we can easily detect where the segments of a particular ancestry of interest are. So we can then take only the European segments, for example, combine them with additional reference populations from Europe, and then analyze where within Europe these segments are coming from, and so on for the different ancestries. So what we did for the Native American fraction of cosmopolitan Mexicans was to analyze in PC space how they cluster in the presence of this reference panel of more than 15 to 20 indigenous groups across Mexico, which we have as a reference for the location of the cosmopolitan samples.
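The masking idea described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the local-ancestry calls are taken as given (in practice they would come from a dedicated tool), the helper names `mask_to_ancestry` and `closest_reference` are hypothetical, and the comparison step uses a simple mean squared distance to reference allele frequencies as a stand-in for the ancestry-specific PCA described in the talk.

```python
def mask_to_ancestry(haplotype, local_ancestry, target):
    """Keep alleles only where the local-ancestry call equals `target`.

    Sites assigned to any other ancestry become None (treated as missing).
    """
    return [allele if anc == target else None
            for allele, anc in zip(haplotype, local_ancestry)]


def closest_reference(masked, ref_freqs):
    """Return the reference label closest to the unmasked sites.

    Uses mean squared difference to reference allele frequencies; this is a
    simplification that replaces the PCA step for the sake of a short example.
    """
    best_label, best_dist = None, float("inf")
    for label, freqs in ref_freqs.items():
        pairs = [(a, f) for a, f in zip(masked, freqs) if a is not None]
        if not pairs:
            continue  # no informative sites for this comparison
        dist = sum((a - f) ** 2 for a, f in pairs) / len(pairs)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy haplotype with per-site ancestry calls (invented values).
haplotype = [1, 0, 1, 1, 0, 0]
ancestry = ["NAT", "NAT", "EUR", "EUR", "NAT", "NAT"]
references = {
    "north": [0.9, 0.1, 0.5, 0.5, 0.1, 0.2],
    "south": [0.2, 0.8, 0.5, 0.5, 0.9, 0.7],
}
native_only = mask_to_ancestry(haplotype, ancestry, "NAT")
print(native_only, closest_reference(native_only, references))
```

The design point is that everything outside the ancestry of interest is treated as missing data, so the downstream comparison is driven only by the segments of that ancestry.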
And here on the PCA map, using only the Native American fraction of the genome of these admixed individuals, you can see that just this two-dimensional plot reflects a picture similar to what a geographic map of Mexico would look like. You wouldn't get this kind of correlation just by analyzing the entire genome of a Mexican, because it would be mostly dominated by these indigenous groups, but by applying this method of local ancestry and then selecting only the Native American portion of the genome, you actually recapitulate this picture. That is quite interesting, again, not only for anthropological purposes: we also found that there is a correlation between these axes of differentiation and lung function values, for example, which is important for genetic research in asthma and other respiratory diseases. This was published in 2014 in Science. We didn't get the cover of the magazine, but I always like to present the suggestion we submitted, because this is how we see our work represented graphically. So what really motivated us next is that we realized something more was needed. We needed to increase the scale, not only to a few hundred individuals and some states represented. We really wanted to see whether these differences could be recapitulated nationwide. And on top of that, we needed a broader representation of traits to really explore the consequence of this structure in the medical context. So for that, we partnered with the National Institute of Public Health, INSP, in Mexico, and we submitted our project for funding to the Newton Fund and also to the national agency CONACYT in Mexico. We got funded for the first phase of this project, which we call the Mexican Biobank Project. It is based on a collection that has more than 40,000 DNAs collected nationwide in the year 2000.
As you can see here, it's a national health survey that has not only biological samples but also a lot of traits associated with it, which I will mention in a minute. From this collection, we selected 10,000 samples to be genotyped on the MEGA array, which has 1.8 million SNPs across the genome. And like I said, we were lucky enough to get funded for the first phase of the project. I would like to highlight the fact that this is the largest nationwide biobank to date. It is one of the largest, although there are other collections that are larger. For example, the Mexico City Prospective Study has more than 150,000 individuals, but they are only from the Mexico City area. Interestingly, and this is a really nice effort, they have recently also been generating genomic data, not only the epidemiological data that has been published over the past years since the samples were collected. In contrast to that larger effort, which has a lot of power to detect associations, our project aims to give a broader view of the ancestry in every location of the country. And as you can see here, I think no other collection, even one with really dense sampling, can reproduce a geographic map just by plotting the coordinates of the sampled individuals, which is what you see here. These are just individuals collected across the biobank, in either rural or urban areas. Like I said, it also has a lot of demographic and biomedical traits associated with the collection. You can see some of the examples here. And interestingly, we also added a panel of pathogens: from the blood sample we are measuring antibody titers against 20 different pathogens that commonly circulate globally. This panel was actually developed for the UK Biobank initially, and this is also in partnership with the UK Biobank researchers. So a subset of the UK Biobank has been typed on the same panel, and we did the same on our Mexican Biobank.
So hopefully we will find some associations with immune response. A question that we can also explore is whether ancestry has something to do with the accumulation of different mutations across the genome, which is the mutation burden. And what we found very quickly is that it is actually correlated, but only in the rare-variant spectrum of the variation. These three plots represent only variants that are below 5% in frequency, whether they are intergenic, synonymous, or deleterious variants. You can see here a trend that is very clear. This is the proportion of indigenous American ancestry, and more indigenous American ancestry correlates with a lower number of variants, whether deleterious, intergenic, or synonymous. The opposite trend is seen with European ancestry: the more European ancestry we find in the biobank, the more variants we find in every category. We think this mostly reflects the bottleneck associated with the ancestry of the indigenous people, which in general carries less mutation burden and lower counts of variants across every category. So that means that demographic history also affects these patterns of deleterious and non-deleterious variants across the genome. The fact that we also see this in the European component, in the reverse direction, tells us that these populations have not been as heavily affected, or not as much as the indigenous components, as a source population. The other thing we can explore is models that take into account the factors that can affect complex trait variation. This is a mixed model that simultaneously considers all these factors you can see here, like sex, educational attainment, genetic ancestry, latitude, and, interestingly, ROH, which is runs of homozygosity. That is important, because we actually found some examples that are correlated with ROH.
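To make the idea of modelling a trait on several covariates at once more concrete, here is a minimal Python sketch. It is deliberately simplified: it fits plain ordinary least squares rather than the mixed model described in the talk, the function name `ols` is hypothetical, and the small data set is synthetic (generated from made-up coefficients), not taken from the biobank.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X: list of rows, each row a list of covariates (include a leading 1.0
    for the intercept). y: list of trait values. Returns the coefficients.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [v - factor * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic individuals: (sex, ancestry proportion, education level).
covariates = [(0, 0.2, 1), (1, 0.8, 2), (0, 0.5, 3),
              (1, 0.1, 0), (0, 0.9, 2), (1, 0.4, 1)]
# Made-up generating model: height = 160 + 12*sex - 8*ancestry + 1*education.
X = [[1.0, s, a, e] for s, a, e in covariates]
y = [160 + 12 * s - 8 * a + 1 * e for s, a, e in covariates]
intercept, b_sex, b_ancestry, b_education = ols(X, y)
print(round(b_sex, 3), round(b_ancestry, 3), round(b_education, 3))
```

Because all covariates enter the model together, each coefficient is an effect adjusted for the others, which is the reason for considering sex, education, ancestry, and the remaining factors simultaneously rather than one at a time.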
The first thing I would like to highlight, and the most notable example, is that we found a clear correlation between height and, as expected, factors that are not about genetic ancestry, for example sex, with males being taller, and educational attainment, which is also a proxy for income. But interestingly, look at these two significant results: height is negatively correlated with ancestry from the Americas, specifically along the axis that separates the Mayan component from the rest, towards shorter stature, and that was very significant as well. We found other examples that are interesting and also correlated with ROH, such as triglycerides here, and also BMI, or for example glucose levels, which are also positively correlated with ancestry from the Americas. As you know, there have been several investigations finding markers that segregate specifically in populations and that are correlated or associated with diabetes traits. So, of course, the hope is also to find novel associations, and here I'm just showing some examples of the GWAS that we have been carrying out on biochemical traits, for example lipids and so on. Some of these are actually known loci, so this is a good sign that we have been replicating those. But the hope is also to find other ones, which is hard to do, because we are still underpowered with only the 6,000 individuals we have been analyzing so far. We really need to scale this up to have more power to detect other signals. So, the aim of this whole project is also to create a resource that can be used by other researchers nationwide, or by any other project interested in this kind of research, in a similar way to what the Wellcome Trust Case Control Consortium did several years ago, where a single generation of data can really inform many future studies by providing a set of shared controls.
So that's something we can do by providing a common set of already genotyped individuals that represent nationwide diversity, so that new efforts only need to focus on, for example, sequencing or genotyping cases of particular diseases. The other highlight that I think is important, and this is one of the major goals of the project, is to create local capacity. The entire project has been conducted locally, starting from the sampling. Like I said, it is a national health survey conducted periodically in Mexico, and the biobank has also been maintained and conserved in the facilities of the National Institute of Public Health in Cuernavaca. Genotyping was conducted in our facility in Guanajuato, which is also equipped with high technology, like high-throughput genotyping, and the computing and everything else has also been done locally, of course in collaboration with the other participants of the project. So this is really one example, the Mexico chapter of our efforts, but let me move to our interest in Latin America at a broader scale. We were also motivated by the fact that the profile is very different from the one we can detect in Mexico. As you can see here, again in PCA space, these are different countries in Latin America, and these are the sources that represent the different contributors to the makeup of people from, for example, Colombia, Peru, Ecuador, Chile, and Argentina. You can see that the area occupied by the Native American fraction of these individuals does not overlap with the space represented by, for example, Native Americans from Central America, which is mainly Mexico.
Even within Latin America you can see that people from Colombia mostly cluster closer to sources like these people from the northern part of South America, which are Amazonian components combined with some others from the Chibchan area, also from that linguistic group here, and the Peruvians mostly trace their genetic roots to this Andean component at the southern extreme. So there is really fine-scale structure even if you look within South America. Motivated by this, like I said, we have been conducting several other studies in South America. One quick example is one that just got published a few weeks ago in the American Journal of Human Genetics, which is a study that aimed to understand the genetic basis of preeclampsia in a minority population that has been understudied in many aspects. These are the highlanders of the Altiplano in Peru. The study was set up with the motivation that preeclampsia has been observed at a higher frequency in highlanders compared to lowland individuals, most likely due to the hypoxic environment up there. But we also think that there might be some adaptive process compensating for this, since you can actually have full-term births and healthy individuals even at high altitude, even though maybe the cost of it is higher levels of preeclampsia. This collection of individuals took several years to compile: almost a thousand cases collected from family trios, with samples from mom, dad, and the umbilical cord, in the regional hospital of Puno, which is a city almost 4,000 meters above sea level. And we found that there is one locus on chromosome 13, in a region that has several clotting factor genes, that is associated with preeclampsia. These are actually novel variants that have not been previously reported, around genes that had already been related to pregnancy diseases before.
So we think this is really giving a signal of which kinds of functions are vital for this disease, and we think the clotting factors have a major role there. This is a zoomed-in version of the locus that we found. As you can see here, the clotting factors in the region are F10 and PROZ, which, like I said, has already been documented to be involved in preeclampsia as well, along with other genes in the same region. Again, this was just to give you an example of another study that enabled new associations, starting by going out there and focusing on sub-collections of really underrepresented populations in community engagement settings. Also in connection with those South American collaborations, we have continued to be part of the ChileGenomico project, which we are advising somehow. And from that project, which is a national initiative in Chile to characterize their population, we proposed a separate, specific project for one population that is part of Chile but really has a very different origin, culturally and, as you will see, also genetically, which is the population of Easter Island. These are the Rapa Nui people. The indigenous people of Rapa Nui are very different. As you will see, this is really one of the most isolated places on earth, and it has been studied from many angles in terms of anthropology and archaeology. It is very fascinating, and very famous for these big statues that were built on the island. There is a very big mystery as to the way they were built. But what is clear is that there is a different origin in terms of the culture on the island. These are clearly people of Polynesian origin, with a different history, coming from the Austronesian expansion all the way from Taiwan across the Pacific until they reached the far corners of the Polynesian triangle. And to propose this project to the community, we actually visited the community several times to give informal talks and invite the community to participate.
Here we are, my wife and our daughter, visiting the community to give informal talks. She's also an anthropologist, so also part of the team that did all this ethnographic fieldwork. Here we are doing some of the sampling on the island. And this was really a great experience because we also returned the results to the participants, in a way that we were able to have testimonies and feedback on how they interpret their own results in terms of ancestry and what this means for their own family history. So the way we approached this project was that we wanted to understand, together with the community, the genetic makeup of this population: what is the genetic origin of the people that first arrived on the island, and what is the genetic makeup of the people today in Rapa Nui. So this is really what took us all the way from Latin America into the Pacific, because the roots, like I was saying, are actually coming from the Pacific. As I was mentioning, the Austronesian expansion came about 4,000 to 5,000 years ago from Taiwan, crossing what is today the Philippines, Melanesia, and all the way to Polynesia here. This is not debated, but whether or not they reached the Americas, and whether there could be something indicating that or not, was a very big question in anthropology for decades. And also, it could be either way: you can see this schematic arrow here, meaning it could be Polynesians getting all the way to the Americas or the reverse. Very quickly, what we did, after analyzing individuals not only from Easter Island but also from other collections that we will see soon, was to apply methods similar to those we developed for other admixed individuals, because in Easter Island you have a very similar situation in terms of admixture: in a similar way to Latin America, we see here the confluence of different sources, Polynesians in the first place.
And then there's of course European colonization in the area in the 19th century, and then, because the island is part of the political territory of Chile, there is Native American ancestry on the island as well. So we said, let's apply similar techniques and try to do some of the deconvolution to understand what's going on. So, like I said, we needed a bigger database to compare these patterns across the region, not only Easter Island. It was thanks to a collaboration with the University of Oxford and the University of Chile that we were able to put together this panel of Pacific Islanders and Latin American populations. What we found is that some individuals have not only these ancestries that I mentioned, but the positions in which they were observed in these individuals show a very, very interesting pattern for the Native American segments. If they were part of this recent migration of Chileans going into the island, we would expect them to be together with the European segments, here in red. Instead, we saw some individuals that have Native American segments embedded only within Polynesian chromosomes. So that means that maybe there was some contact between Polynesians and Native Americans alone, even before Europeans. So for that, like I said, we applied a pipeline similar to the one we developed for the Mexico diversity project: local ancestry estimation, then ancestry-specific PCA here, and then we applied other methods to estimate some of the timing.
Now, in the interest of time, I will just summarize this figure, which basically tells us what we found in terms of the Native American genes in Polynesians. These were found not only in Easter Islanders, but also on other islands, like the Marquesas, or the Tuamotu Islands, both Mangareva and Palliser here, that had a little bit of Native American ancestry that was not coming from this recent migration from Chile into the island, because we were able to differentiate those with a lot of European ancestry. Their Native American genes actually cluster in the same space as the central Chileans, like the Mapuche and the Huilliche. But look at these other positions here, which are the relative positions of the Native American genes in Pacific Islanders against the references across Latin America. It turns out that the closest source is actually in Colombia. So these Zenu individuals, who are just the representatives or proxies for the source population, are the ones that best represent the signal coming from these Native American genes in the Pacific. So that's number one, already interesting, because we were expecting that, if they made it into the Americas, the most natural place to land would be maybe a direct line into Chile or Peru. This opens different questions: it could be that somehow the currents took them more toward the northern part of South America, or the other way around, maybe Native Americans actually made it into the Pacific. That's an open question that actually got picked up by the news. These are just other papers also proposing different hypotheses around the same questions, some of them arriving at the conclusion that there was contact, some others that there was no contact.
But like I said, our project at least demonstrated that there is evidence for this admixture. What we cannot tell is whether the Polynesians went into the Americas and came back with this new signal, or Native Americans made it all the way to somewhere in the Pacific and then carried the signal to the rest of the islands. So this direction is not something that we can tell from the results of the project, but at least we have evidence that there has been contact, which already answers a very interesting question in anthropology. It was again picked up, not only by the news but also by cartoon makers; here we are on Easter Island, and, you know, this is a fascinating story. This was published in Nature last year. Again, we didn't get the cover, but I think it's also nice to present our artwork, so to say. What this project really opened up for us is the possibility of doing larger studies in the region by creating, again, biobanks that can be very useful for medical purposes and not only for disentangling historical events. We have a project called the Oceanian Genome Variation Project, which is an even larger study involving more than 7,000 samples, of which we have already genotyped around 1,000. You can see here the representation is much denser now in the Southeast Asian part and Melanesia, including again Polynesia, but now focused more toward these other areas that we hadn't explored before. The purpose is also to quickly ask, for example during the pandemic, what is the frequency of variants that have been reported for COVID risk, and see quickly what their frequency is in the region. This is important for variants that have already been detected to confer risk.
We can see the sharp differences in different islander populations. You can see Micronesia here, where one of the alleles is almost fixed, whereas the same allele can be at intermediate frequency in some other areas, being again the same risk allele documented by other GWAS. So this is another very useful aspect of having biobanks from underrepresented regions, because we can quickly see these kinds of patterns. In other cases, you have to really go very specifically after a variant that underlies a particular disease. And this is just a quick example of a collaboration with Jean-Laurent Casanova, from Rockefeller University and in Paris. They were observing the case of seven different pedigrees of families affected by respiratory diseases, including SARS-CoV-2, that is, COVID. So when they sequenced the genomes of these children, they all shared a common variant that had not been reported before. This was a nonsense variant in a gene called IFNAR1, which encodes one of the chains of the receptor for type I interferons, which as you know is key for the immune response that gets triggered after pathogen exposure. So, of course, if the variant is nonsense and the gene is not producing this chain, the receptor is just not connecting with the interferon, and then the immune response gets disrupted. And the clinical expression of that is, like I said, very adverse effects of live-attenuated virus vaccines, or herpes virus, or COVID. So they discovered this variant, and they did beautiful experimental work to show that it is actually not expressed at the membrane. But they reached out to us because we have this panel in the Pacific, and they said, we need to understand where this variant is present across the region.
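The question raised here, where a variant is present across a region, comes down to tabulating allele frequency per population. A minimal sketch in Python, assuming hypothetical genotype counts; the population labels, the counts, and the function names are all made up for illustration and are not data or code from the study:

```python
from typing import Dict, Tuple

# Hypothetical genotype counts per population: (hom_ref, het, hom_alt).
# Illustrative numbers only, not results from any study.
GenotypeCounts = Tuple[int, int, int]


def alt_allele_frequency(counts: GenotypeCounts) -> float:
    """Frequency of the alternate allele from diploid genotype counts."""
    hom_ref, het, hom_alt = counts
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each heterozygote carries one alternate allele, each homozygote two.
    return (het + 2 * hom_alt) / total_alleles


def frequency_by_population(table: Dict[str, GenotypeCounts]) -> Dict[str, float]:
    """Alternate allele frequency for every population in the table."""
    return {pop: alt_allele_frequency(c) for pop, c in table.items()}


example = {
    "population_A": (90, 9, 1),   # variant segregating
    "population_B": (100, 0, 0),  # variant absent
}
freqs = frequency_by_population(example)
for pop, freq in freqs.items():
    print(f"{pop}: {freq:.3f}")
```

A table like this, computed over a reference panel, is what lets a variant be described as common in one island group and rare or absent elsewhere.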
What we know, from what was found in these seven families, is that they have in common that they are all of Polynesian origin, but they wanted to know where in Polynesia this is distributed. So we actually Sanger sequenced the variant and found that it's specifically present in Western Polynesia. You can see here in orange the frequency of this allele, which is actually significant in Samoa, the Cook Islands, and Fiji, and very low or completely absent everywhere else in the world. I mean, here we have this reference comparison to Europe and East Asia, but also everywhere else in the Pacific it is very low or absent. So again, this is epidemiologically very relevant, because by finding a variant that is specific to Polynesian ancestry, now you can issue recommendations, for example, that inherited IFNAR1 deficiency should be considered in individuals of Polynesian ancestry with severe viral illnesses. So finally, just to close, I would like to mention some of the most recent efforts we have been spearheading in Latin America. This is part of the International Common Disease Alliance that was mentioned in the introduction by Eliseo, which is really aiming to issue recommendations for the human genetics community worldwide on how we should address the biggest challenges in the next five years. One of the recommendations is to increase the diversity of biobanks focused on genetic analysis, and also to promote global equity, which is directly related to this part. So, to answer that recommendation, and this is just how the global organization of ICDA looks, it is mainly organized in three main axes, with this motto of going from maps to mechanisms to medicine, so ultimately to improve healthcare through enhancing the research on mapping, also on the mechanisms, and then ultimately on the medicine that yields medical benefits.
So you can see here that global equity is also a transversal component, and I am co-leading this global equity working group together with Nicki Tiffin from South Africa. So what we have been doing in the Latin American branch of this is to get organized, at least within Latin America, to propose projects at a global scale that can have regional impact and promote local capacity building. This is something I'm co-leading together with Ricardo Verdugo here, from Chile, and you can see here a whole host of investigators from six different countries across Latin America. We call this the Latin American Alliance for Genomic Diversity, or LatinGenomes for short; you can look it up. And like I said, we are also translating our goals into these three different levels. So the first one, about maps, is basically to leverage these existing biobanks. I already told you about the Mexico Biobank, which has 40,000 DNAs. Right now we have only been able to genotype 6,000 of them, which is really a small proportion of the entire potential of the biobank. The ChileGenomico biobank also has only a few hundred individuals genotyped, but it has collected more than 3,000 individuals, and Argentina is doing a similar effort called PoblAr that can also be scaled up. And these will allow us to have better reference panels for imputation, for genetic trait association, and so on. The other one is about the mechanisms: we also want to jump into a different level of characterization of diversity, which is at the cellular level.
So we just got a seed grant from CZI to be part of the Human Cell Atlas project, and this is one of the Latin American efforts along that line, which is the human cell map of Latin American diversity, including five different countries in Latin America and the United States, particularly in collaboration with the Mayo Clinic, to also have participants of Caribbean descent. Here the idea is to collect samples across Latin America, primarily blood, and secondly, we also have an interest in analyzing gallbladder, because in Chile particularly there's a higher incidence of gallbladder cancer that correlates with Mapuche ancestry. That's something very unique to the Chilean population. So the idea, again, is to do single-cell RNA profiling in immune cell types across different indigenous and admixed populations across Latin America, and then create a new bionetwork that will be specifically about gallbladder, but also create capacity by having a community portal for single-cell information, data sharing, analysis, and so on. So the overall workflow would be to have hubs centralizing all the processing within Latin America, and then sequencing hubs also within Latin America, again to try to concentrate the whole project within the region. But it has a lot of potential to be scaled up to other tissues, because the team already has access to, for example, skin, liver, gut, and brain across the same sites. This is basically the only project awarded by CZI that is going to Latin America directly; the other projects have been based in either Europe or the United States.
This is, I think I can proudly say, the only one whose host institution is in a low- to middle-income country, so we're very happy to have that project. The other thing we're doing, now moving into the medicine efforts, is of course to get together as a network to do joint analyses of COVID cohorts, as many of these countries have also been collecting cases of COVID in recent years. Just a short example: the Chilean cohort has been able to replicate some of the hits found by the global initiative called HGI. In Mexico we're in the very early stages of analyzing the data. You can see here we're really underpowered; we need more samples to have more power to detect signals. So the next step will be to combine the cohorts in Latin America to do a meta-analysis. A quick thing that I can mention, just to go back to how these disparities are important in COVID research as well: in this other example, genomes were sequenced from COVID patients at Stanford and then correlated with the ancestry of these individuals. It's from the early stages of the pandemic, and you can see here how in the beginning it affected more or less everyone regardless of ancestry. Here you can see a lot of individuals of European and West Asian ancestry affected in the early months. And then it turned very quickly, in May and June, to this different pattern, with a lot of people of indigenous ancestry being mostly affected. And I think this reflects the fact that those were the minorities that were not able to shelter when the lockdown came into place. Of course they had to keep working, and they weren't able to just shelter in place like the rest of the communities, and so they were more exposed to COVID infection.
And then the most ambitious project we have, or the goal of the network, is to combine these existing biobanks and create new ones to reach the goal of a biobank of one million Latin American individuals. And this really needs the support and the joint effort of not only researchers and institutions but funders as well, to get new recruitments and enhance existing biobanks. And I think this picture has to change. This is basically where we want to participate at a bigger scale and be represented in a similar way to these other biobanks across the globe. I mentioned already these two that are happening in Mexico, but I think we have to do better in Latin America, with Costa Rica, with Argentina, and other countries that are also keen to be represented here. And Latin America is ready for this next level. I think infrastructure is no longer a limitation, and talent is no longer a limitation: there are programs, talented students, and institutions that have been devoted specifically to genomics in the last few years. So that's something that is in place. Community engagement is also something that we have demonstrated: we can partner with local communities to spark interest and to train, which is also important to retain the capacity to analyze our own data. And I think this is really the right moment to partner with the bigger players in genetics, and that's why I think it's really key to be here speaking about this, and we need to join forces to do this at a truly global level. So, as a conclusion, I would say that these examples demonstrate that we're ready in terms of a scientific community; we have been leading numerous genetics projects. But there's a lack of a coordinated effort at a regional level that serves the whole region. And we have also very well established that local diversity remains underrepresented despite these efforts, and also that there's insufficient funding to complement these efforts and scale them up. So genomic research in minorities should definitely be essential to close the gap in health disparities and to build local capacity. So hopefully this is also motivating us to try to speak among all the decision makers and researchers that have this common interest, in a similar way to what was done in H3Africa, for example, a whole program that funded several projects, with the NIH and also the Wellcome Trust funding this effort for several years. And with that, I will just close by thanking all the people that have been part of the Mexico Biobank project and many of the other efforts. Thank you for listening; happy to take any questions.
Thank you very much, Andres, for your talk. This is very exciting. Let me open with a question for you. In your first slide, you talked about the persistent genomics research gap. What do you think we can do here at NIH to help bridge that gap? Because clearly it is persistent. You know, we've been talking about this for at least a decade, if not more. I see your efforts, but I would like to know how we can help.
Yeah, that's a great point. Thank you. We need a consistent commitment, I think, from the whole community. And that's what we're trying to do as part of the ICDA global effort: work more locally to try to make a difference as a local network of collaborators that is already established, and try to really bring the attention of the major players, such as NIH. Definitely NIH is one of the major players in shaping human genome research; for decades it has been the leader in many of these efforts.
So without the support of these major players, we will maybe stay at the same scale, left with our own resources, which have been very limited within each country. We know that countries are not always very supportive of local research, even though it's about the local community; ironically, local governments are not always very supportive of our research. So I think if we scale it up to a worldwide level, which is what ICDA is aiming to do, but keeping the leadership local, that effort will really return the benefits to the local community, like students being trained, like infrastructure. So I think that we need to be heard by NIH and other players, hopefully as a joint coalition, like H3Africa did, where the Wellcome Trust was interested in a joint goal, and NIH as well, for the benefit of the entire region, as has been done in the past 20 years in Africa, where more than 50 projects have been funded thanks to that. This is not like a single project proposal of something specific in Latin America. I think it has, like I said, three axes, maps, mechanisms, and medicine, with a lot of existing capacity. We just need a higher level of support to crystallize that and scale it up because, like I said at the very beginning, we do have more representation of Latin American populations, but in proportion to the other efforts that are also growing bigger, we're just shrinking even more. So we just need those numbers to scale up, because that matters statistically for associations to be detected.
It's not only about numbers because the others are also getting more; it's because, if we want to really understand the fine scale of genetic diseases, we need to have that statistical power, and you don't get it unless you go higher in numbers, with different phenotyping, and in a way that it returns to the community, so they understand what is going on in the research, not just sending samples somewhere else where others will take care of all the research. I think the community is eager to be part of that. But I think we need to build a bridge between agencies like NIH and local communities, and I think that's why I'm here; I think we can make a bridge.
Excellent. Thank you so much. Let's go to questions from the audience. The population structure based on geography observed among Latin Americans could be environment driven and may not be genetics driven. Do you have any comment about that?
Yeah, absolutely right. That's why I mentioned some of the models that I showed from the Mexico Biobank: we actually demonstrated that some of the correlations are just due to environmental factors, like age or latitude, not even genetic ancestry. We do find some of them that have some genetic component, such as height; I think height is one of the strongest traits that were associated with something that is clearly genetic. And even within the ancestry correlation, it is particularly the axis of variation along the north-south component; it's not just the global ancestry from the Americas. We found the correlation with the two variables, which maybe I showed too quickly: the global ancestry from the Americas is correlated with height, but so is MDS1, which is basically the axis separating the Mayan component.
And I think it's very clear when you go around and you actually see that people are shorter, but I think it's also very interesting to demonstrate that it has some genetic component to it. That's not to say that it's the only component; of course it's a combination of both environment and genetics.
Thank you so much. We also have a number of questions that are related to community engagement. So, could you please tell us about some challenges or success stories in how you were able to reach these communities? And do you think that your being Latino is a factor in facilitating those interactions? Do you want to comment on that?
Well, it's not as simple as that. Definitely it's not just an affinity of culture. I mean, it helps, but I think it's thanks to the hard work of anthropologists and geneticists working in the field for years with those communities, who have really worked together. They have their trust because the communities have been informed about what has been done in previous studies; then, okay, now let's move to the next step, because now we want to understand further this particular result that you were already part of, for example. And that's what we have been doing. Some of the examples that I mentioned were about returning results. I just mentioned the example of Easter Island, but we are also returning to Peru now that we found this association with the clotting factor genes in preeclampsia. We also went back to the Mexican communities after the first Mexico diversity paper. So those relationships are key to keep the community informed about what's going on with their samples. It's not about being a one-time thing, where they're interested in donating and then they don't hear anything about it. So those kinds of relationships are key. And of course, there is centralized support by institutions that have a lot to say in this kind of research, which is another strategy.
It's not community based, but more about recruiting in hospital settings, for example, which is a whole different setting for community engagement, but still you can get back to participants or recontact them about the results of the research. So I think this relationship is really key to make that happen.
Great. We have another question about how genomics can inform treatment of cardiovascular disease for Latinos, or any other disease, or how we can use genetics to our advantage to try to address health disparities, for example.
Right. Well, that's the ultimate goal for sure. But first we need to understand the makeup of those populations to then move to the actual actionable targets, right? So I mentioned this other example, which is not necessarily about Latino populations, but the one that I gave about the variant that is specific to Pacific Islanders of Polynesian ancestry is very specific to this community. And if you don't know that from genetics, then you don't even have it on your radar to look for that potential genetic disease that is segregating in these few cases of viral disease. So that is a clear example of how diagnostic procedures can be informed by the fact that you know that this is a disease that is only affecting that ancestry in that region, and you should be suspicious about it. So that can accelerate treatment. It's not necessarily translating into a specific treatment for that disease, but I think you can also look closer into those communities that are more vulnerable. So I think our very first step is more about diagnostic measures, and then of course there is that whole idea of maps to mechanisms to medicine, to find new drug targets and things like that, which I think is the next level.
Yes, Andres, thank you for your presentation. I was fascinated by your population migration assessments. I almost feel like the indigenous people of South America had a different entrance to the Americas than the indigenous people of Mexico and Central America; I don't know, you kind of implied that at one point. But my question really focuses more on what you see as the opportunity here. What is Latin America? Latin America is this blend of all these populations, you know, three major flows of population. Studying more isolated, more indigenous populations has its own value. But what about the mainstream of Latin America, you know, the typical people in Mexico City, who are really just a mix, a full mix of indigenous and European, mostly Spanish?
No, we should really think about Latin America as an entity of richness, and definitely its diversity is one of its biggest values, and not think about that mixture as something that is confounded, or as a profile that is no longer present in, like you said, the general population, or the mestizos, or cosmopolitan samples, whatever you want to call it. I think one of the things that these studies have demonstrated, that we have learned from them, is that there's a lot of continuity of those genetic roots of the ancestral sources of the contributors, namely indigenous populations, in the present-day admixed individuals.
So, in a way, after applying these computational methods for dissecting this ancestry in your genome, we can tell exactly where your Native American ancestors come from, and therefore what the profile is, in terms of history, in terms of medical profile or genetic risk, that plays into that component in these different contexts. So I think we shouldn't really make this big distinction about what is an indigenous population; we use it as a convenient way to describe methods and how we use the data. But I think we have also learned that there's definitely a continuum between individuals that are of fully indigenous ancestry and heavily admixed individuals. So Latin America is really this very rich canvas where these methods can dissect, with no problem, whichever roots we want to analyze. So that gives you the advantage of having, like I said, more statistical power because you can have larger sample sizes, and then methods are no longer a problem in terms of understanding this complexity. But definitely there are additional layers you have to take into account, compared to a study in a homogeneous population, like Europeans, for example. But I think that's also part of the opportunity. We have been developing those methods, so again, it's no longer a limitation; we have everything in place to do it. We just need to scale things up to have projects similar to what they do in the US or Europe, for example. I think the methods are ready, and the talent locally also.
Thank you. Okay. One more question. How do you see the role of epigenetics complementing this research that you're describing here?
Yeah, it's a great question.
Yeah, we should definitely include it, and I would be keen to look at that additional dimension, which is something we have not been discussing so far, in a similar way to how we are trying to also include this single-cell sequencing dimension, which is telling us about the gene expression profile of the same individuals that we already have genomes for, for example. So having the epigenetic profile will be another orthogonal line that we should definitely include. Again, that's another advantage of having biobanks, where you can recurrently go back and add more characterization using different approaches once you have those collections, and we have some of them, as you have seen. So we would definitely be keen to also add those kinds of characterizations.
Sorry, one more question quickly. How can researchers from the US who are interested in studying Latin American populations engage with you or the initiatives that you have described?
Sure. Actually, the initiative that I mentioned at the end, LatinGenomes, is open to everyone interested in the topic. We don't actually have an email address, but I think you can look it up through @LatinGenomes; we're on Twitter, and you can reach out to us because we are open to growing the community. We have also been in discussions with people, as long as they are aligned with the principles and the vision, which is not only studying Latin Americans and that's it; it's about this commitment to doing research that is reflected in local capacity building, which is what we want to have as the ethos of the network. We're open to inviting everyone who's interested.
Great. Thank you so much. And with this, I would like to thank you, Andres, and invite Vence Bonham again to close the event.
So again, thank you for this great talk today, on behalf of the Genomics and Health Disparities Lecture Series co-sponsors.
I want to thank all the attendees who are participating today, and look for the next lecture, which will be in the fall of 2022. So, thank you, and have a good evening.