[Opening partly unintelligible in the recording. The recoverable fragments refer to GWAS, millions of variants across the genome, linear models, and the phenotype.]
For each SNP, we plot the position on the x-axis and the minus logarithm of the p-value on the y-axis. As you can see, in this region we have many associated SNPs. So using only the GWAS results, it is very difficult to identify the causal variant or the causal gene. And this limitation cannot be resolved even by using much larger sample sizes. The problem is that with GWAS we are only searching for a statistical association between the DNA, typically represented by SNPs, and the phenotype. But we don't know anything about the biological mechanism behind the statistical association.
So one way to interpret this statistical association is to interrogate an intermediate phenotype, such as gene expression. And to do that, we can apply an eQTL analysis. eQTL analysis is a powerful way to understand how genetic variants affect gene expression. We can see eQTL analysis as a GWAS where the phenotype, in this case, is the gene expression quantification. And we say that an eQTL is a SNP which affects gene expression.
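To make the "eQTL analysis as a GWAS on expression" idea concrete, here is a minimal sketch that regresses a simulated expression level on a simulated SNP genotype. The function name `snp_effect`, the allele frequency, the sample size, and the true effect of 0.5 are all illustrative assumptions, not values from the talk; real eQTL pipelines also adjust for covariates, which is omitted here.

```python
import numpy as np

def snp_effect(genotype, trait):
    """Ordinary least squares slope and standard error for trait ~ genotype."""
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(trait, dtype=float)
    g_c = g - g.mean()                    # centre genotype dosages
    y_c = y - y.mean()                    # centre the trait
    sxx = np.sum(g_c ** 2)
    beta = np.sum(g_c * y_c) / sxx        # regression slope
    resid = y_c - beta * g_c
    se = np.sqrt(np.sum(resid ** 2) / (len(g) - 2) / sxx)
    return beta, se

# Toy data (assumed): 5,000 individuals, allele frequency 0.3, true effect 0.5.
rng = np.random.default_rng(0)
n = 5000
genotype = rng.binomial(2, 0.3, size=n)            # 0, 1 or 2 copies of the allele
expression = 0.5 * genotype + rng.normal(size=n)   # expression depends on the SNP

beta_eqtl, se_eqtl = snp_effect(genotype, expression)
print(f"beta_eQTL = {beta_eqtl:.3f} (SE {se_eqtl:.3f})")
```

The same function applied to a disease or quantitative trait instead of expression would give the corresponding GWAS effect for that SNP.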
So typically, when we perform a GWAS and we identify a SNP affecting the phenotype, the first thing that we do is to check if the same SNP is also an eQTL for a certain gene. Because in this case we can link the SNP to a gene, and this can help us to interpret our statistical association biologically. But if we find that a SNP is associated with a phenotype, and it is also an eQTL for a certain gene, can we say that we identified the causal variant? Can we say that we solved this diagram, so we found the SNP which affects the gene expression, which affects the phenotype? Unfortunately, it's not that easy. Even if recent studies show that we have an enrichment of eQTLs among the trait-associated SNPs, given the large number of eQTLs in the genome, many of these overlapping associations are just coincidental and not driven by the same functional variant.
So, when we have these overlapping associations, we can have three different explanations. We can have a scenario of causal effect, where the DNA change affects the gene expression, which affects the phenotype. Or we can have a scenario of pleiotropy, where the genetic variant independently affects the gene expression and the phenotype. Or we can have a scenario of linkage disequilibrium, where the genetic variant is in linkage disequilibrium with two different variants, one affecting the gene expression and the other affecting the phenotype.
[A few sentences are unintelligible in the recording. The recoverable fragments mention the first scenario, the effect of the gene on the phenotype, and the eQTL analysis.]
Mendelian randomization is a statistical method that we apply when we are interested in estimating the causal effect of an exposure, in our case the gene expression, on an outcome, in our case the phenotype.
Mendelian randomization relies on several assumptions, but briefly we can say that the idea is that there exists a SNP which is strongly associated with the exposure and which affects the outcome only through the exposure. In our case it means that the SNP affects the phenotype only through the gene expression, so we cannot have any direct effect of the SNP on the phenotype. And if these assumptions hold, it means that the effect of the SNP on the phenotype, beta_GWAS, is equal to the causal effect of the gene expression on the phenotype, alpha, multiplied by the effect of the SNP on the gene expression, beta_eQTL. Since we know beta_GWAS and beta_eQTL, because we can extract this information from the GWAS and the eQTL study, the only thing that we don't know is alpha. But if this formula holds, it means that alpha can be estimated as the ratio of the two betas, beta_GWAS divided by beta_eQTL.
Here I just presented the easy example where we have only one SNP, but we know that a gene can have multiple independent eQTLs. So in this case we have to include all these SNPs in our Mendelian randomization model, and we call this the single-gene approach. But we also know that a SNP can be an eQTL for multiple genes at the same time. For these reasons we propose a transcriptome-wide Mendelian randomization approach (TWMR), which uses multiple SNPs and multiple genes simultaneously. So the easy formula that I showed you before to estimate alpha, the causal effect of the gene expression on the phenotype, now becomes this more complicated formula, where C is the LD matrix, which contains the linkage disequilibrium between all the SNPs included in our model.
To demonstrate the advantages of our multi-gene approach, we performed a simulation analysis in which we simulated 1,000 regions containing 15 independent SNPs and three genes. And for each SNP we also simulated the number of genes affected by that SNP.
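The single-SNP ratio and its multi-SNP, multi-gene extension described above can be sketched in a few lines. The transcript does not reproduce the slide formula, so the generalized least squares form used here, alpha = (E' C^-1 E)^-1 E' C^-1 G, is an assumption, with E the matrix of eQTL effects (SNPs by genes), G the vector of GWAS effects, and C the LD matrix. The function names and all numbers are hypothetical and noise-free, chosen only so that the result is easy to check.

```python
import numpy as np

def ratio_estimate(beta_gwas, beta_eqtl):
    """Single-SNP causal estimate: alpha = beta_GWAS / beta_eQTL."""
    return beta_gwas / beta_eqtl

def multi_gene_estimate(E, G, C):
    """Assumed GLS form: alpha = (E' C^-1 E)^-1 E' C^-1 G."""
    Cinv_E = np.linalg.solve(C, E)   # C^-1 E without forming the inverse
    Cinv_G = np.linalg.solve(C, G)   # C^-1 G
    return np.linalg.solve(E.T @ Cinv_E, E.T @ Cinv_G)

# Toy region (assumed values): 4 SNPs, 2 genes; SNPs 1 and 4 are eQTLs for both genes.
E = np.array([[0.5, 0.3],
              [0.4, 0.0],
              [0.0, 0.6],
              [0.2, 0.2]])
alpha_true = np.array([0.8, -0.5])   # causal effects of gene A and gene B
G = E @ alpha_true                   # noise-free GWAS effects implied by the model

# LD matrix: SNPs 1 and 2 are mildly correlated, the others are independent.
C = np.eye(4)
C[0, 1] = C[1, 0] = 0.2

alpha_multi = multi_gene_estimate(E, G, C)            # both genes modelled jointly
alpha_single = multi_gene_estimate(E[:, [0]], G, C)   # gene A alone, gene B ignored

print("multi-gene estimate:", alpha_multi)
print("single-gene estimate for gene A:", alpha_single[0])
print("single-SNP ratio:", ratio_estimate(0.06, 0.2))
```

In this toy region the joint model recovers both causal effects, while the single-gene model for gene A is pulled away from 0.8 because two of its SNPs also act on gene B, which is the pleiotropy problem the multi-gene approach is meant to address.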
For example, SNP1 is an eQTL for gene A and gene B, SNP2 is an eQTL only for gene B, and so on. And our results showed that in case of pleiotropy the multi-gene approach gives a better estimate of the causal effect. As you can see, the root mean square error is more than tenfold lower in the multi-gene approach. And more importantly, we found that without losing power the multi-gene approach controls the type I error better, as opposed to the single-gene approach, where it can easily reach 20%.
To run TWMR we need three different kinds of data. We need summary statistics from a GWAS for the phenotype that we want to study. We also need an external reference panel to estimate the linkage disequilibrium between the SNPs included in our model. And for the gene expression data we are using the summary statistics from a large eQTL meta-analysis performed on 32,000 samples. So we merge these data together, and at the end, when we run TWMR, we obtain a list of candidate causal genes.
We applied TWMR to 43 different complex traits that you can see listed here. In total we found more than 2,000 putative causal genes associated with at least one phenotype, and these result in almost 4,000 gene-trait associations. It's interesting to see that more than 30% of these gene-trait associations were missed by the previous GWAS. It means that when we look within 500 kb of the gene that we found associated with TWMR, we didn't find any SNP reaching the genome-wide significance level. We think that these regions were missed by the GWAS due to power issues, and this assumption holds when we look at the results from UK Biobank. From UK Biobank we extracted several datasets of increasing sample size of unrelated British individuals, and for each dataset we performed the GWAS and then TWMR for BMI.
As you can see in this example, for this gene associated with BMI we need only 60,000 individuals to detect a significant association with TWMR, but we need at least 180,000 individuals to detect a significant association using the GWAS alone. More in general, when we use the complete dataset of unrelated British individuals from UK Biobank, we found 318 genes associated with BMI, and 44 of them were found only by TWMR. So it means that these genes were completely missed by the GWAS. To assess if those genes were meaningful, we looked at the results obtained by using a smaller dataset. When we use 100,000 individuals from UK Biobank, we found 56 genes associated with BMI, and 40 of them were already confirmed by the GWAS performed in the same dataset. But it's interesting to see that 13 additional genes were confirmed by the GWAS performed in the complete dataset of UK Biobank. So our results suggest that by applying TWMR after the GWAS we can find in advance new loci that the GWAS alone can find only with much larger sample sizes.
Another example of regions missed by GWAS is given by this locus, which was missed by the GWAS on educational attainment. As you can see, the top SNP doesn't reach the genome-wide significance level, but our TWMR analysis showed a significant association for BSCL2. And this gene is already known to be associated with a Mendelian form of encephalopathy. In this case it's also interesting to see that when we look at the results from a more recent GWAS performed with a much larger sample size, this region becomes significant also for the GWAS. So again, our results suggest that TWMR can find in advance new loci that the GWAS will find in the future, when the sample sizes become larger and larger.
Then, typically when we perform a GWAS we indicate the closest gene to the top SNP as the causal one. But we know that in many cases this is not true.
For example, in this region associated with height, the closest gene to the top SNP is SOX5. But according to our TWMR results, this gene doesn't show any significant effect on height. Instead, we found a significant effect for CRIPT. And CRIPT is already known to be associated with a Mendelian form of short stature. So our results suggest that the same gene also has an effect on height in the general population. In general, when we look at all the genes found associated by TWMR for all the 43 complex traits that we analyzed, we found that in 71% of cases the closest gene doesn't show any significant association.
Then finally, to check if our TWMR genes were functionally relevant, we took the lists of genes extracted from OMIM that are associated with abnormal skeletal growth syndrome, hypercholesterolemia, and cognitive impairment. We overlapped these lists with the lists of genes that we found associated using TWMR with height, total cholesterol, and educational attainment. And while we found only a trend of enrichment for height and total cholesterol, we found a significant enrichment for educational attainment. These results provide additional supporting evidence for our TWMR genes.
So in conclusion, I showed you that TWMR is a powerful method to identify putative causal genes starting from already existing data. Not only can we identify new loci missed by the GWAS, but we can also prioritize functionally relevant genes in already known associated regions. Today I didn't have the time to present many results, but if you are interested in this method, you can find many more results in our paper published two months ago in Nature Communications. Thank you very much for your attention, and thank you very much to all the people involved in this project.
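As a note on the OMIM overlap analysis mentioned in the talk: the transcript does not say which enrichment test was used. A minimal sketch, assuming a one-sided hypergeometric test, is shown below. The function name `enrichment_p_value` and the counts in the example are hypothetical and are not results from the study.

```python
from math import comb

def enrichment_p_value(n_tested, n_disease, n_twmr, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_tested  : genes tested in total (the background)
    n_disease : background genes on the disease list (e.g. from OMIM)
    n_twmr    : background genes found associated by TWMR
    n_overlap : genes that are on both lists
    """
    total = comb(n_tested, n_twmr)        # all ways to pick the TWMR genes
    upper = min(n_disease, n_twmr)        # largest possible overlap
    tail = sum(
        comb(n_disease, i) * comb(n_tested - n_disease, n_twmr - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy example (made-up counts): 10 genes tested, 3 on the disease list,
# 4 found by TWMR, 2 of them on the disease list.
p = enrichment_p_value(10, 3, 4, 2)
print(f"enrichment p-value = {p:.4f}")
```

A small p-value would indicate that the TWMR genes overlap the disease list more often than expected by chance, given the background set of tested genes.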