So, hi everybody. I'd like to welcome you all to this SIB Virtual Computational Biology Seminar Series. On behalf of the committee of the seminar series, I would like to wish you all a very prosperous year. Today we have the pleasure of hosting Murielle Bochud from the Institute of Social and Preventive Medicine at Lausanne University Hospital. It's a great pleasure for me to have you because we collaborated when I was doing my PhD, so I'm very pleased to have you back here. Murielle Bochud holds a degree from the University of Geneva and an MD from the University of Lausanne, as well as a PhD in genetic epidemiology from Case Western Reserve University in the States. She currently works as a full professor of epidemiology and public health, and she's the head of the Chronic Disease Division at the Institute of Social and Preventive Medicine in Lausanne. She has also been appointed director of this Institute of Social and Preventive Medicine and will assume her position in August this year. Her research focuses on the epidemiology of chronic diseases, in particular the genetics of blood pressure, renal function and obesity. She's also developing nutritional epidemiology, with a specific interest in dietary determinants of common chronic diseases. Her work is based on a multidisciplinary approach with epidemiologists, public health specialists, clinicians, geneticists, statisticians, bioinformaticians and molecular biologists. She's interested in developing public health genomics, which attempts to translate molecular findings into clinical and public health applications and vice versa. So today, Murielle will tell us more about the genome-wide association studies performed on urinary phenotypes. Murielle, thank you again for accepting our invitation, and the floor is yours.
Okay. Thank you very much. Thanks a lot for this invitation. Can you hear me? Okay. Great.
It's a great pleasure for me to be here today. I have, as you could hear, a long introduction and a very specific background, a little bit schizophrenic background, because I'm specialized in public health and in genetic epidemiology, where we have totally different aims. In genetics, you want to personalize things, and in public health, you want to find solutions for the general population. So in my everyday life, sometimes I put on my public health hat and sometimes my genetic epidemiology hat, and sometimes we have conflicting decisions to make. And that's very nice to do and also very stimulating. My message today will be to tell you that what the field of genome-wide association studies brought, at least from my perspective, is that we are stronger together. So, this story is pretty long. It's been about 15 years that we have been able, thanks to novel technological developments, to assess the association between genetic variation throughout the entire human genome and disease and physiological phenotypes, which may be continuous. And one of the first results was that most of the effects that are detected are quite small. The proportion of phenotypic variance explained is very small. So any single study, at least studies from 10 years ago, was not large enough to have enough power to robustly identify an effect on its own, so that researchers and investigators had to team up with other teams throughout the world so that together they could reach a large enough sample size to robustly detect and replicate findings and to be sure that what they had found was not a false positive result, but a true positive result. And I think a positive side effect of all this has been that a lot of collaborations could be initiated. A lot of large-scale international consortia have been built.
And I am pretty convinced that, beyond the results of those studies, these consortia have allowed for extremely fruitful interdisciplinary and international collaborations and have brought us further in terms of understanding the link between genomic information, genetic information, and health and disease. In public health, we are mostly interested in population-based approaches and solutions, preferably cheap ones, aiming at preventing disease. This contrasts with the classical clinical approach, where we find a personalized solution, such as treating a given patient with a solution that is optimal for this person. So in my work and with my team, we have been working mainly using population-based approaches and trying to focus on early determinants of disease so that we can, we hope, find interventions to prevent chronic diseases. And this approach has multiple steps. We start with collecting population-based data using cohorts. Cohort meaning that we try to follow up those people over time to have not only a baseline examination, but a first follow-up, a five-year follow-up, a 10-year follow-up, so that we gather longitudinal data on those participants. Then we collect samples, blood, serum, plasma, urine, and we extract DNA, mostly at large scale. Part of those samples are analyzed immediately and part of them are stored in large-scale biobanks for future analysis. And then we generate omics data, for the topic of today, genome-wide association studies, with DNA chip arrays, using, for instance, Illumina or Affymetrix technologies. So we generate a lot of data, but we might also sequence a genome or use epigenomic or metabolomic analyses.
Then comes a step of data analysis, creating new knowledge and then trying, from our perspective, to either change health care organization or develop novel public health policies aiming at primary prevention. At the step of data analysis, as you know better than me in bioinformatics, we also try to enrich, to annotate, the genetic information or all this additional information with publicly available databases. For instance, I'm also responsible for a cancer registry. We try to develop linkages between our population-based cohorts and those kinds of population-based systematic monitoring approaches, and also try to link with publicly available environmental data, such as air pollution. We systematically link our population-based cohort data with air pollution data to enrich the analysis. And then the principle is that we have several actors in this show. Genetic information is one of them, but phenotypic information is another one. And the exposome concept was put forward a few years ago now. As epidemiologists, we are interested in collecting as much information as possible to better understand disease mechanisms and to better prevent diseases. And this is not just at one point in time; we try to have a longitudinal assessment of all those exposures. Genetic information is one of them, which is a very useful and practical one, because most of the time it doesn't change. It's present since conception. That's a very useful property because we are sure that it precedes the effect. Epigenetic information is another one, which is a bit more complicated because it's tissue-specific and time-specific, so there are a few more headaches associated with it. And then many, many other things can be analyzed and put together to better understand what the determinants of diseases are. And in public health and social medicine, we are also interested in socioeconomic determinants of disease.
We know that, depending on your social class, even within the same country, there is a 10-year difference in life expectancy, even within Switzerland, for instance. So socioeconomic status is a very strong determinant of health and longevity, of course via various mechanisms. And one of my very strong interests during the past years has been to develop nutritional epidemiology. Why? Because we know that diet is a major modifiable determinant of many chronic diseases: cancer, cardiovascular disease, neurodegenerative diseases. And we all need to eat, so we are all exposed to it on an everyday basis. So it's a very strong determinant, but it's also a domain where our knowledge is extremely, sorry about this, that was just to show how strong my opinion is on that, this domain is extremely poorly understood. We currently make general recommendations: you should eat fruits and vegetables, you should try not to eat too much fat, not to eat too many calories. But those are not personalized, those are not targeted, and there is a lot we don't know in terms of how our dietary intake on a daily basis influences our health status. And the important thing in this domain is to have a life course perspective and to understand that there are sensitivity windows, during which a given dietary exposure might have a very different effect if it occurs early in life or if it occurs at the age of 50 years. So this is very important, and we are trying to put in place a project where we would like to collect longitudinal nutritional exposure data on a large number of people, starting from conception until death. So I think there is a lot to be developed here, both in terms of gathering the data and in developing nutritional biomarkers that allow us to better understand the link between diet and health status and disease. And while trying to focus on nutrition, an important sample to collect is urine.
So urine may be a thing of not much interest to many people, but to me that's a very, very precious biosample to collect, extremely precious, because it can provide us information on many different types of exposure. Of major interest to me is of course diet. If you collect 24-hour urine, we have information on sodium intake, potassium intake, magnesium intake, iodine intake, many different things. But we also potentially gather information for drug monitoring, if we measure metabolites of drugs. We can have disease diagnostics: proteinuria and albuminuria are part of the diagnosis of many diseases. We can have information on exposure to toxins: urinary cadmium excretion is a marker of exposure to cadmium and so a marker of toxic exposure. Metabolic state can be inferred as well, and some cancers can be diagnosed on the basis of urine. And recently it was also found that there are about 150 metabolites that come from the intestinal microbiome, are absorbed in the intestine, circulate in the blood and are excreted by the kidney in the urine. So it is also an indirect way of getting some information on the microbiome, knowing that for large-scale population-based cohorts collecting feces is not very easy to do logistically. It's much easier to collect urine samples. So that's a very, very useful phenotype to collect if the interest is to understand the determinants of chronic diseases. So a genome-wide association study consists in screening the entire human genome using mostly single nucleotide polymorphisms, the most frequent genetic variation in the human genome, and associating this genetic information with a phenotype of interest.
You do it in a large number of people for one study, and then you team up with other teams which have similar data and run a similar type of analysis. Then, in the second stage, you meta-analyze those aggregate results, and then you replicate those results. Then, depending on what you find, you will have further steps, which will be mainly in your type of daily work, I imagine: different types of bioinformatics analysis to understand what drives those signals. So now, as you know, we are able to screen the human genome in a pretty easy way, more or less expensive depending on what we do. We all have a unique sequence; we all carry a few million, two to three million, single nucleotide polymorphisms in our genome. So that's a very useful property. Our genome has a given structure, and recent methods have allowed us to understand this pretty well. Of course, the correlation structure of the genome, what we call linkage disequilibrium, is very important knowledge for interpreting the findings of genome-wide association studies. An individual of European descent and a person of African descent are not going to have the same correlation structure, the same linkage disequilibrium and haplotype blocks in their genomes, so that a given signal might have a different meaning depending on the ethnic origin of the participants. However, several studies in the past years have shown that whenever a signal is identified in individuals of European descent, most of the time it can be extrapolated to people of African descent. The other way around is a bit less true, but in that direction it's usually pretty good, and it can also be extrapolated to people of Asian descent for most of the results. So the portability of those GWAS results is pretty good, and most of the time these are the result of common variants.
So there is not that much evidence showing that the signals we get in genome-wide association studies would only reflect the combined effect of rare variants at these given locations. So far it's rather pointing towards capturing common variation. So as you know, the genetic information is very important and we are now able to capture it, and among the methods currently used in many studies are Illumina DNA chips. In our SKIPOGH cohort we have lately been using the Illumina Infinium Omni 2.5, which means we generate 2.5 million SNPs for each of our participants. This also allows exploring copy number variation, insertions and deletions. But so far we are just at the data cleaning step, which Tanguy, here present, is doing together in collaboration with other members of the team. The first stage is to generate this information. The second stage is to generate additional variants by using publicly available databases, what we call imputed SNPs. So on the basis of the genotyped SNPs, we can impute variants using publicly available data. And the underlying principle of this type of analysis is that for every single nucleotide polymorphism we generate an association with a phenotype of interest. If it's a disease, let's say chronic kidney disease, then we have people who do have the disease in the study, and people without the disease, the controls. And then for all the SNPs, we run an association. If it's a dichotomized trait, we usually do a logistic regression. And if it's a continuous trait, we will do a linear regression. But then of course, depending on the phenotype, we might need to transform it so that the residuals approximate normality as well as possible. And we repeat this millions of times. And each time we get the regression coefficient, its standard error as a precision estimate, and the p-value.
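The per-SNP test for a continuous trait described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the studies discussed: the function name `snp_linear_association` is hypothetical, genotypes are assumed to be coded as allele counts (0, 1, 2), no covariates are included, and the p-value uses a large-sample normal approximation rather than the exact t distribution.

```python
import math


def snp_linear_association(genotypes, phenotype):
    """Simple linear regression of a continuous phenotype on one SNP.

    genotypes: allele counts (0, 1 or 2) per participant (assumed coding).
    phenotype: continuous trait values, same order as genotypes.
    Returns (beta, standard_error, p_value).
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    # Sums of squares and cross-products around the means
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    # Residual sum of squares gives the precision of the estimate
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Two-sided p-value, normal approximation (reasonable for large samples)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value


# Toy data: each extra allele raises the trait by about one unit
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_phenotype = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
beta, se, p_value = snp_linear_association(toy_genotypes, toy_phenotype)
```

In a real analysis this test would be repeated for every genotyped or imputed SNP, usually with covariates such as age and sex, which this sketch leaves out.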
And then we put all these together and meta-analyze the results across studies and generate what we call a Manhattan plot. And then a forest plot, where we show the estimate and its precision, the 95 percent confidence interval, across all the studies. And then we have a meta-analysis result. So this is a situation where we have a patient sample and a control sample for each study, and then we meta-analyze across studies and see what appears to be genome-wide significant. One example is a Manhattan plot from one of the studies published within the CKDGen consortium. The phenotype of interest in this consortium is renal function. This is the result for estimated glomerular filtration rate, which is a way to estimate renal function based on serum creatinine. We use a specific equation to infer this function. This is a phenotype which is available in most large-scale population-based cohorts, and it's pretty standardized across cohorts. There is a specific laboratory standardization method so that we ensure that what we compare is truly comparable, which is quite an important factor if we don't want to get noise out of these projects. What you see here are the chromosomes. Each dot represents the result of an association test between a SNP and the phenotype of interest. And what you see on the y-axis is minus log10 of the p-value, so it's the strength of the statistical association. When you have big peaks, it means this specific region of the genome is associated with the phenotype of interest. Here it's a continuous phenotype, so we use linear regression to get the result. And what we usually do is put a gene symbol on each of those peaks, just to make it easier to remember, because if we put the SNP number, if you put rs166679, when you've seen 200 of them, you don't really know where you are.
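To make the meta-analysis and plotting quantities above concrete, here is a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis for one SNP, with the 95 percent confidence interval shown in a forest plot and the minus log10 transformation used on the y-axis of a Manhattan plot. The talk does not say which meta-analysis model the consortium used, so the fixed-effect model is an assumption, as are the function names and the normal approximation for the p-value.

```python
import math


def fixed_effect_meta(study_results):
    """Inverse-variance weighted fixed-effect meta-analysis for one SNP.

    study_results: list of (beta, standard_error) pairs, one per study.
    Returns (pooled_beta, pooled_se, (ci_low, ci_high), p_value).
    """
    weights = [1.0 / se ** 2 for _, se in study_results]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, study_results)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # 95 percent confidence interval, as drawn in a forest plot
    ci = (pooled_beta - 1.96 * pooled_se, pooled_beta + 1.96 * pooled_se)
    # Two-sided p-value under a normal approximation
    p_value = math.erfc(abs(pooled_beta / pooled_se) / math.sqrt(2))
    return pooled_beta, pooled_se, ci, p_value


def minus_log10(p_value):
    """y-axis value of a Manhattan plot for a given p-value."""
    return -math.log10(p_value)


# Two hypothetical studies reporting the same SNP
pooled_beta, pooled_se, ci, p_value = fixed_effect_meta([(0.2, 0.1), (0.4, 0.1)])
```

Studies with smaller standard errors, typically the larger cohorts, receive more weight in the pooled estimate.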
So it's easier to remember, although sometimes a bit tricky, because if you've picked the wrong gene to label the peak, you think the signal is due to this gene, but maybe it's due to another gene close by. So be careful when you look at this; it might not be the true underlying causal gene. It just helps you to tell where we are in the genome. For this specific example, one peak was much, much stronger than the others. The gene symbol here is UMOD, which encodes uromodulin. I'm going to focus a little bit on this protein, which is produced by the kidney only, the only organ, and released in urine, and which we have assessed in multiple cohorts in collaboration with Olivier Devuyst from the University of Zurich. So, we have this first step. We have results covering all human chromosomes. And then we need to decide what is statistically significant and what is not. So far, most studies have used thresholds of the order of 10 to the minus 8 to declare genome-wide significance, which corresponds to approximately 1 million independent tests. We are testing more than 1 million variants, but a lot of them are correlated, so we do not have as many independent tests as this. So this first step is to avoid having too many false positive results. And then, to further decrease this risk, we replicate the top findings in independent cohorts. So that's the next stage. And then what comes next? We have a lot of regions. If we have, let's say, 200,000 or 300,000 participants, usually we now have 50 genome-wide significant hits in those types of studies. Under each peak, you may have one gene, you may have zero genes, you may have 200 genes. So it's still a lot of work to understand what the gene of interest is in that specific region.
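The significance threshold mentioned above follows from a Bonferroni correction: dividing a conventional alpha of 0.05 by roughly 1 million independent tests gives 5 × 10⁻⁸. A minimal sketch, in which the function names and the example SNP labels are hypothetical and the alpha of 0.05 is the usual convention rather than something stated in the talk:

```python
def genome_wide_threshold(alpha=0.05, independent_tests=1_000_000):
    """Bonferroni-corrected significance threshold."""
    return alpha / independent_tests


def significant_hits(results, threshold):
    """Keep the SNPs whose p-value is below the threshold.

    results: list of (snp_label, p_value) pairs.
    """
    return [snp for snp, p_value in results if p_value < threshold]


# Hypothetical association results for three SNPs
example_results = [("snp_a", 1e-72), ("snp_b", 3e-7), ("snp_c", 4e-8)]
hits = significant_hits(example_results, genome_wide_threshold())
```

SNPs that pass this threshold would then go on to replication in independent cohorts, as described above.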
Yet if we started following up every gene in the region and functionally characterizing each of those genes, it would be a huge amount of work. It would take us a lot of time and cost a lot of money. So that's where various techniques and post-GWAS types of exploratory analysis can help us in prioritizing the signals of interest and in better understanding their overall meaning. Here you see some examples: expression quantitative trait locus (eQTL) analysis, that is, are there eSNPs in the results, SNPs that are associated with the expression of a specific gene in a given tissue; DNase I hypersensitivity site mapping; chromatin mapping. You can also try to collaborate with other teams on animal experiments. In my team, we do not do those types of experiments, but we do collaborate with teams who have this expertise. Zebrafish experiments are very frequently used, as are experiments in mice, rats and so on. Other ways of further understanding those results globally are pathway analyses, which may be of very different types. What was used in one of the latest projects in which we've been collaborating, in the CKDGen consortium, was the DEPICT software, in which the entire results of the genome-wide analysis are used and a pathway analysis is run using available pathway information. Some pathways have been manually annotated, others are just commonly available information. Publicly available expression data are considered, and all the GWAS results, including the null GWAS results, that is, SNPs that do not show any association, are used in that analysis to try to extract which pathways are overrepresented in those global results and which genes appear to be overexpressed in selected tissues in those signals.
And if I take the example of the CKDGen consortium, the latest publication, which included data on several thousands of participants, what came out of this type of analysis was that among the tissues in which those genes appear to be overexpressed was the kidney, which was not a surprise, but quite reassuring, let's say; the urinary tract, a logical finding, of course, for urinary function; but also the liver, and also the adrenal gland, where steroids are synthesized. So that's quite logical and very interesting. And then selected pathways have been identified as well. And also in terms of where those genes were differentially expressed, the adult kidney was really overrepresented. So this helps us get more insight into the usefulness of those results. And if we look at a further way to present those data, which is the gene set overlap analysis, you can see here in those boxes the pathways that happen to be overrepresented globally in this GWAS meta-analysis of renal function from the CKDGen consortium. Among those, we can see renal system development, decreased kidney weight, abnormal kidney cortex morphology, so very logical things. The lines that you see between those boxes, those pathways, represent overlap. When a line is thick, it means there is a lot of overlap between the genes that belong to those specific pathways, and when it is thin, it means that the overlap is less strong. So a lot of this makes quite a lot of sense, but it's quite interesting for us to know which of those pathways appear to be overrepresented, meaning that the genes involved in those pathways appear to play a significant role in the control of renal function in the general population. The vast majority of the studies which were part of this consortium were population-based cohorts. So we are talking about renal function in the general population, not renal function in conditions of rare diseases.
So the limitation of these types of analysis, as you know better than me, is that the information we use is only as good as it is. There is a stronger enrichment of genes that have been studied a lot, there are many genes and many regions of the genome for which we don't have a lot of high-quality information, and the function of many, many genes in the human genome is very poorly known. So we cannot produce better information than what we use, and we can clearly imagine that repeating those analyses a few years from now might generate better quality information. So if I come back to this top hit, the UMOD gene, we do know that this gene is involved in rare forms of kidney disease. So genetic variants in this gene are associated with renal function in the general population, but we also know that mutations in this gene are associated with rare forms of kidney disease. So this is a very nice example of a continuum between rare diseases and common complex traits in the general population. This protein is produced only by the kidney, in a very specific portion of the tubular segment of the nephron, and it has been known for about 50 or 60 years. It was discovered by Tamm and Horsfall, and it was first called the Tamm-Horsfall protein. 35 years later, another study was published that identified uromodulin. It was called uromodulin because it was found in the urine of pregnant women and was associated with their immune status. A few years later it was discovered that Tamm-Horsfall protein and uromodulin are just the same molecule, and it happens to be the most abundant protein in human urine. So in terms of the amount of protein, or milligrams of protein, produced by the kidney, this is the most abundant one. So that's quite an important and interesting protein.
It has many functions, but we are still not very clear about its function. It is known to play a role in kidney disease because it's associated with immune function and rare forms of kidney disease. It is also known to be associated with the immune response during acute kidney injury. It is known to be associated with innate immunity and also with defenses against urinary tract infections. Depending on the genetic variation in this gene, people are more or less able to resist bacterial urinary tract infections. And it also happens to protect against kidney stones. It is, for the most part, the only genetic region so far that has been identified as being associated both with renal function and with blood pressure, with chronic kidney disease and with arterial hypertension. So, being myself interested in both phenotypes, that was a gene of major interest to us. In collaboration with Olivier Devuyst, we also found that in mice this gene is associated with blood pressure sensitivity to salt, meaning that if you eat a lot of salt, depending on the variants you carry at this gene, your blood pressure will increase more or less, or not at all. And this is a very important phenotype because we know that with aging, the blood pressure of humans becomes more and more sensitive to salt, and hypertension is a common modifiable cardiovascular trait affecting about one in four adults in the general population worldwide. And since we are now in an era of population aging, hypertension prevalence is going to increase. Being able to better control blood pressure in the general population will have a major impact in terms of public health, reducing the cardiovascular complications of hypertension. So this is a very interesting molecule, because if we can get to better knowledge, we can maybe find a solution to prevent both chronic kidney disease and arterial hypertension in the general population.
So it means that starting from a genome-wide association study, identifying selected SNPs and further running experimental studies in zebrafish or mice or rats, better knowledge could be gained. What was also found is that uromodulin, which is solely expressed in this specific segment of the tubular nephron, is responsible for, or at least influences, sodium reabsorption by the kidney. This part of the kidney is responsible for about one third of the sodium reabsorption. So that's a very key phenotype, and we are now running further studies to better understand this. And what we find in the SKIPOGH cohort, which has population-based data from the cantons of Geneva, Vaud and Bern, is that there is effect modification between uromodulin and salt excretion for the effect on blood pressure, meaning that the association between sodium and blood pressure is not the same depending on the level of uromodulin produced. So we're going to analyze this further and see what we can get out of it, but that's very interesting, to us at least. In terms of the molecule, uromodulin is produced and then secreted in the urine, where it is cleaved, and then it forms networks and urinary casts. This is why we think it might influence urinary tract infections: bacteria in the urine are trapped by those networks, and this helps protect the kidney from being injured. And what happens if mutations occur in this gene, at least for the rare disease forms, is that uromodulin is produced but cannot be secreted, so that it accumulates within the cells and destroys them. The people who carry those specific rare mutations develop chronic kidney disease maybe at the age of 30 or 35, so it destroys the kidney at a very early age, and then they end up needing dialysis or a kidney transplant maybe at the age of 40. So there are very severe consequences for those people.
So the next step for us was to try to run a genome-wide association study on urinary uromodulin, so the production, the gene product. We got a peak on the UMOD gene, of course, which was expected, and pretty strong, 10 to the minus 72. This is the result of putting together six population-based cohorts. Our hope was that we would have other peaks to try to better understand, but no, we did not. We have now increased the sample size and still we do not have other peaks, but we will try a little bit further. But if we zoom in on this specific location, the UMOD locus, including the UMOD gene, we get a very strong signal. Those SNPs are correlated with each other, but we get a second, independent signal: when we condition on the top SNP, we still get a genome-wide significant signal, and this signal comes from a nearby gene. So we have two genes located close to each other providing independent signals. Work is ongoing to try to understand what this means. So there is a meaning, and this gene is likely involved in uromodulin production. PDILT, this is just impossible to remember, PDILT, yeah. Okay, yeah, exactly. So further work is now being conducted to try to better understand the link between the two. And then, other examples: Tanguy, here present, has also run those analyses. We have a lot of urinary phenotypes; uromodulin was one of them. We've measured, in collaboration with Olivier and other partners, elements in urine, and we are also looking at the ratios. We have seen that using a ratio sometimes gives a different result from analyzing each of those elements separately. This analysis led to a few peaks. One of them was located in a region where there is a claudin gene, CLDN14, the claudin-14 gene. This gene has been further explored using experiments, and the findings are logical for this ratio of urinary magnesium to calcium.
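The conditional analysis mentioned above, testing a second SNP while conditioning on the top SNP, can be sketched as a regression that adjusts for the top SNP. This is only an illustration under stated assumptions: the function names are hypothetical, genotypes are coded as allele counts, no other covariates are used, the p-value relies on a normal approximation, and the talk does not describe how the conditional analysis at the UMOD locus was actually implemented. The sketch uses the fact that regressing the top-SNP-adjusted phenotype on the top-SNP-adjusted candidate SNP gives the same coefficient as a joint regression on both SNPs.

```python
import math


def _residualize(values, covariate):
    """Residuals of values after simple linear regression on covariate."""
    n = len(values)
    mean_c = sum(covariate) / n
    mean_v = sum(values) / n
    sxx = sum((c - mean_c) ** 2 for c in covariate)
    slope = sum((c - mean_c) * (v - mean_v) for c, v in zip(covariate, values)) / sxx
    return [(v - mean_v) - slope * (c - mean_c) for c, v in zip(covariate, values)]


def conditional_association(candidate_snp, top_snp, phenotype):
    """Effect of candidate_snp on phenotype, conditional on top_snp.

    Returns (beta, standard_error, p_value).
    """
    n = len(phenotype)
    g_res = _residualize(candidate_snp, top_snp)
    y_res = _residualize(phenotype, top_snp)
    sxx = sum(g * g for g in g_res)
    beta = sum(g * y for g, y in zip(g_res, y_res)) / sxx
    rss = sum((y - beta * g) ** 2 for g, y in zip(g_res, y_res))
    # n - 3 degrees of freedom: intercept, top SNP and candidate SNP
    se = math.sqrt(rss / (n - 3) / sxx)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value


# Toy data: both SNPs contribute independently to the trait
top = [0, 0, 1, 1, 2, 2, 0, 1]
candidate = [0, 1, 0, 1, 0, 1, 2, 2]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
trait = [2 * t + c + e for t, c, e in zip(top, candidate, noise)]
beta, se, p_value = conditional_association(candidate, top, trait)
```

If the candidate SNP were only a proxy for the top SNP through linkage disequilibrium, its conditional effect would shrink towards zero; a signal that survives conditioning suggests an independent association.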
And we can see that this helps in understanding the physiology in terms of ion transport in this specific segment of the nephron. What is very interesting with the kidney is that depending on the location in the nephron, the transport systems are very different. So depending on the phenotype we use, we are targeting a different segment of the nephron. So it's very powerful to be extremely selective in where we look. There have been many, many GWAS run so far, thousands of them. The results are being stored in a publicly available database. A few years ago it was possible to show all the dots, but now it's so dense, it has been so vastly annotated, that it's not readable anymore. But the database is available for those who are interested. You can search by gene or by phenotype, and then you will know, among the 30,000 association results that are stored there, which ones are relevant depending on what you are looking for. Once you have a result, if you have a phenotype and a top SNP, it's very interesting to know whether this specific SNP has been associated with other phenotypes in other studies. So it's quite interesting to use. Despite all this genetic information that is being used, so far we only explain a tiny fraction of the variance. It's only the tip of the iceberg. GWAS, of course, is not the only tool to be used. We are likely unable to capture signals from rare variants, and unable to capture signals from interactions, gene-gene interactions or gene-environment interactions, so that of course it's necessary to complement those approaches with other types of approaches such as whole-genome sequencing or whole-exome sequencing. What are the advantages of those methods? It's relatively cheap compared to whole-genome sequencing, for instance. It is hypothesis-free. We scan the entire genome. We can identify novel biological candidates linked to the phenotype of interest. As I said at the beginning, this needs large collaborative projects.
We get information on the ancestry of participants, because when you have two million markers on a given individual, you know where this person comes from, obviously. And it also allows exploring copy number variation. The limitations of GWAS are that the effect sizes are usually very small, so that we need replication and collaboration with others. We do not get information on the causal, the functional association. It's mainly suitable for common variants. And because they do not explain a large proportion of the variance, they are usually not useful for risk prediction at the individual level. So it's not very useful to genotype people to know whether they will develop chronic kidney disease 10 years from now, or whether they will develop diabetes 10 years from now. So far, it's not that useful for this, but it's mostly useful for understanding disease mechanisms. So I will end by just briefly showing what we are doing now, which is exploring genome-wide association studies in the SKIPOGH cohort. Each dot is a participant. We have three recruitment centers, Bern, Lausanne and Geneva. So far, we've been collecting and generating Omni 2.5 chip data, but also epigenomic data and, in a subset, steroidomic data, so steroid hormone metabolites in urine, about 50 of them in everybody, white blood cell transcriptomic data, and blood and urine metallomic data, meaning elements. 24 elements have been measured in the blood and urine of everybody at baseline, and the three-year follow-up data are being measured now. So we have magnesium, calcium, cadmium, mercury, arsenic, so a lot of things, with very nice genome-wide association signals so far. Urine metabolites are also being analyzed. And we are trying to better understand the GWAS signals using those additional phenotypes. So in conclusion, those methods, genome-wide association studies, are very useful to provide insight into disease mechanisms, disease physiology and biology.
They are not useful for risk prediction at the individual level. And what we have learned in the past years is that it takes a lot of time and effort to understand what drives a signal in any given region of the genome. So I'd like to thank all the people who have participated in some of those projects. Olivier Devuyst has been instrumental in this uromodulin theme, as have the CoLaus team, Zoltan and Tanguy. Tanguy has been running a lot of these analyses. Sven's team has been collaborating a lot as well, and all the SKIPOGH team in the three recruitment centers. Thank you for your attention.