So it's my pleasure to introduce the last speaker of the day, Ewan Birney, the director of EMBL-EBI and the deputy director general of EMBL. He's one of the world leaders in bioinformatics and genomics. He has won numerous awards. I'll just mention some of the highest honors. He's a fellow of the Royal Society. He's a fellow of the Academy of Medical Sciences, and a Commander of the Order of the British Empire. He's a non-executive director of Genomics England, and I think also very importantly for our network, he's the chair of the executive committee of the Global Alliance for Genomics and Health. So there's hardly any better person than him to tell us about the grand vision for genomics and medicine and the role that machine learning could play in that. So thank you for agreeing to speak here and we are looking forward to your talk. Okay, well, thank you for inviting me. After some thought, I'm actually gonna give you an example. When I put this together, I thought it would be better to give you an example, partly because I think it's good to be concrete and partly because I think it's important to realize what we're trying to do in machine learning in biology, but I'm gonna kind of open up and frame it in a variety of ways. So my first piece of framing is just reminding you about this amazing organism that we study. So these are humans. We are pretty cool. We're actually a pretty standard mammal. If you want to understand mice well, you should study humans as well, and vice versa. And just to say, we have changed the way we can measure humans. So to remind you about humans, if you're a geneticist, we're outbred, which means that we mate randomly. We mate seemingly randomly. And we have pretty standard and pretty good genetics. There are a large number of humans around the world and we get measured by this amazing thing called healthcare. And that means that we get phenotyped and we get genotyped by the process of healthcare.
Actually, another interesting thing about humans compared to other organisms is we're quite big. This is annoying if you're in imaging because you need quite a big thing to image us, but it's really useful if you're in molecular biology. You can actually get quite a lot of a human, in particular blood, but if you ask nicely, you'll be amazed what you can get. So you can study humans in ways that are quite accessible. And so there's a huge number of opportunities around studying humans. And a lot of this ends up being about, now choose your phrase, machine learning. I've decided to call it machine learning because that's the name of your course. If you went back 20 years, it'd be called multivariate statistics. And if you want to be super trendy, you can call it AI or deep learning or something like that now. In some sense, this is all just very high dimensionality, extremely high: lots and lots of measurements at the same time, often highly correlated with each other. And these are very large-scale data sets. And I would like to draw your attention to the fact that we're often trying to do two different things when we do this, whether we call it multivariate statistics or machine learning or deep learning. So one is effectively a describing process. That is, in some sense, recognizing the correlation patterns inside the variables that we have and coming up very often with a smaller dimensional space that represents that higher dimensional space. So that smaller dimensional space is a kind of description, a mathematical description, of the higher dimensional space. And you can think about this in lots of different ways. So maps are a smaller dimensional space of what is happening in the real-world geography. Again, there's a whole bunch of smaller dimensional spaces, for example, in video compression. Video compression is a smaller dimensional space of the higher dimensions, which is the final video shown to you. But description is not the same as understanding.
And that's a really important thing. Understanding is when we can understand why this high dimensional space operates in the way it does. And just creating a lower dimensional space is not enough. And I really encourage you to read this book called The Book of Why by Judea Pearl. I don't actually agree with everything he says, but I think his fundamental points are absolutely right. That is that in these complex and multivariable systems, correlation is commonplace, but what is key is understanding what is causal. And Judea in this book lays it out extremely well. Like I said, I don't agree with all of it, but I really recommend it. So I actually want to go into a particular case here. And I'm just going to introduce the set of humans that I look at. So these are half a million people. They are middle-aged, middle-class British people. They were recruited between the ages of 40 and 60, and they are now between the ages of 60 and 80. And in fact, they include both of my parents. So amazingly, I was studying at some level my parents here. I do not know, of course, which two of them out of the half million were my parents. But that will tell you immediately that this is slightly ascertained for healthy British people, healthy on recruitment. They underwent an intensive half-day phenotyping on recruitment. So they spent half a day wandering around, getting measured in all sorts of different ways. Blood and urine and other things were taken then. They have all been genotyped. They are all being exome sequenced, and they will soon all be genome sequenced. So we have their full genome. And lots of other phenotypes have been measured on them, including 100,000 MRI scans, so imaging. And there's 18,000 of these available now. In fact, that's gone up since I wrote these slides. And I'm a little bit biased, because my parents are in this, but I think if you go around, you'll find that it's not so much bias: it is probably the best human cohort in the world.
The only other one I would highlight here, which I think is amazing as well, is the Danish healthcare records, which is sort of rather remarkably all of Denmark. But it's a really wonderful cohort. And what we decided to study here was an observation made actually by one of Europe's first scientists, Leonardo da Vinci, who drew these beautiful pictures of the human heart, and actually cows' hearts as well. And in some of these pictures, he noticed that the human heart was rough on the inside. There was a rough surface on the inside of the human heart, in particular on the left-hand side of the human heart. This is a picture of that. This is a human heart peeled away on the left-hand side here. And you're seeing this really quite rough surface of the heart, and these structures are called trabeculae. And we developed from the MRI scans in the UK Biobank a way of measuring this. And to do this, it's worth noting that the MRI scans are actually sort of 2D sections that are then reconstructed into a 3D image. So we take these 2D sections, and we're particularly interested in the left-hand side, the left ventricle, which has a much rougher surface. And because there are 18,000 of these things, one cannot do the next steps by hand. And so a very talented cardiologist developed a deep learning mechanism for recognizing the boundary of the left-hand side of the heart, and then tracing out the blood-muscle boundary, which is shown in this blue line here. And then we took a statistic developed actually in geography called fractal dimension, which aims to describe how rough a surface is. So the fjords of Norway are far rougher than the coastline of the Netherlands, and exactly the same thing can be applied to this inner surface of the heart. This is called fractal dimension, and it gives us a single number for each of these slices. And now I'm gonna skip over two years of work about getting this image analysis robust.
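The idea of a single roughness number per slice can be illustrated with a box-counting estimate of fractal dimension. This is only a minimal sketch: the talk does not say which estimator was used, and the function names `box_count` and `fractal_dimension`, the box sizes, and the binary-mask input are all assumptions made for illustration. The boundary is taken to be a 2D boolean array in which traced boundary pixels are `True`.

```python
import numpy as np


def box_count(mask, box_size):
    """Count boxes of side `box_size` containing at least one True pixel."""
    h, w = mask.shape
    # Pad with False so both dimensions are multiples of box_size.
    padded = np.pad(mask, ((0, (-h) % box_size), (0, (-w) % box_size)),
                    mode="constant")
    big_h, big_w = padded.shape
    blocks = padded.reshape(big_h // box_size, box_size,
                            big_w // box_size, box_size)
    return int(blocks.any(axis=(1, 3)).sum())


def fractal_dimension(mask, box_sizes=(1, 2, 4, 8, 16, 32)):
    """Box-counting dimension: slope of log N(s) against log (1/s)."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask contains no boundary pixels")
    counts = np.array([box_count(mask, s) for s in box_sizes], dtype=float)
    # N(s) ~ s^(-D), so log N = -D * log s + c and D is minus the slope.
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope


# Hypothetical sanity checks: a straight line and a filled square.
line = np.zeros((64, 64), dtype=bool)
line[10, :] = True
filled = np.ones((64, 64), dtype=bool)
line_dim = fractal_dimension(line)
filled_dim = fractal_dimension(filled)
```

A smooth curve should come out near 1 and a space-filling shape near 2, so a rougher traced boundary would land somewhere in between, in the same way that a fjord coastline scores higher than a straight one.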
So that's two years of work from a clinical cardiologist, an image expert in deep learning, and then this fractal analysis. And these are some of the diagnostic plots that we use to ensure that the measurement works correctly. So I'm sure in your course you're doing some of this, and please do, because the engine room of machine learning is this transformation of datasets into a set of metrics that you can use further on. Be reassured that we could transform these images into a view of how rough the inside was. And then we were able to do a pretty standard piece of GWAS, so let me just go back here. Just to say, across the heart, we kind of standardized on nine different sections: basal, mid, and apical, each with three sections. So we have nine measurements per heart. And you can see now the GWAS of all nine measurements here. So if you haven't come across a GWAS, this is actually incredibly straightforward statistics: a linear model of your measurement against a genotype. The complicated thing, but it's now quite well established, is where you draw your statistical cutoff, and plenty of effort has gone into establishing five times 10 to the minus eight as an appropriate genome-wide threshold. What's being plotted here is what's called a Manhattan plot. On the X axis is every SNP that was tested in the human genome. There are about a million such SNPs in this scenario, so there's a million points on the X axis. And the Y axis is minus log10 of the P value of the model. And the little line there is the threshold at five times 10 to the minus eight. And so what you can see here is that we have different places in the genome which go above the line. And this characteristic business of having a number of points at the same place is due to the structure of human genome variation, which is called linkage disequilibrium. It's actually a sign that your GWAS has worked out well.
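The per-SNP test described above, a linear model of the measurement against genotype with a genome-wide cutoff, can be sketched on simulated data. Everything here is an illustrative assumption: the function name `snp_association`, the allele frequency, the effect size, and the noise level are made up, and the p-value uses a large-sample normal approximation. A real GWAS would also adjust for covariates such as age, sex, and ancestry, which this sketch leaves out.

```python
import math
import numpy as np

GENOME_WIDE_THRESHOLD = 5e-8  # the cutoff quoted in the talk


def snp_association(genotype, phenotype):
    """Least-squares fit of phenotype on genotype; returns (beta, se, p)."""
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    gc = g - g.mean()
    yc = y - y.mean()
    sxx = float(gc @ gc)
    beta = float(gc @ yc) / sxx
    resid = yc - beta * gc
    se = math.sqrt(float(resid @ resid) / (g.size - 2) / sxx)
    # Two-sided p-value from a normal approximation (assumes large n).
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p


# Simulated cohort: genotypes coded as 0, 1 or 2 copies of an allele.
rng = np.random.default_rng(0)
n = 18_000
causal_snp = rng.binomial(2, 0.3, size=n)  # assumed to shift the trait
null_snp = rng.binomial(2, 0.3, size=n)    # assumed to have no effect
roughness = 1.2 + 0.01 * causal_snp + rng.normal(0.0, 0.05, size=n)

beta_c, se_c, p_c = snp_association(causal_snp, roughness)
beta_0, se_0, p_0 = snp_association(null_snp, roughness)

# The Manhattan plot's Y axis is minus log10 of each p-value.
manhattan_y = [-math.log10(p_c), -math.log10(p_0)]
```

Repeating the same fit for every tested SNP and plotting `manhattan_y` against genomic position would give the kind of plot described, with the threshold drawn at `-log10(5e-8)`.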
So you can see here that we've got some nice signal in slice two and slice three and slice four, et cetera. You could also see that, for example, that signal on slice two is echoed by a signal in slice three, but is not present by the time we get to slices seven and eight. There's a signal in seven and eight on chromosome three. It's just shifted to the end of chromosome three. So we have different genetics up and down the heart, but the dimensionality here is not perfect. We explored a number of ways of how to combine this. And actually we did a meta-analysis. So this is where you consider each readout as being one way of estimating some underlying distribution, using a method called MTAG. And this came up with a single discovery set shown here, and the plot below indicates where the significant hits happen. And when there's an orange line, it was significant only in the meta-analysis and was never significant in any one particular analysis. So those are the orange lines that go across, for example, the two on chromosome one. So we have a number of different pieces of genetics that underlie this roughness in the heart, in different places up and down the heart. And I will not now bore you, but we went through replication to prove to ourselves and the pesky reviewers, reviewer three, that this was all kosher. And indeed we found that this replicated in a separate cohort. And if you're a cardiologist, these are the places on the genome and the other phenotypes that this is associated with. And each one of these is a little story and a little drama in and of itself. And for some people, you stop here. This is quite interesting. We have taken now a high dimensional dataset, these images of the heart, we've reduced the dimensionality into a lower space. We've analyzed that lower space with genetics and we've found some places in the genome that are associated with it. One could end the story here.
We actually want an answer to the question, what is it actually doing? Why are our hearts rough? And one of the observations made by us and others is that people who have rougher hearts, so a higher fractal dimension, FD, have better hearts in terms of the amount of blood that is pumped out. And there is a correlation between these two things. Rougher hearts on the left are better pumpers, but we don't know which way round it runs. Maybe when you pump more blood, your heart becomes rougher, maybe it's a response, or rougher hearts are actually better at pumping blood. And there's actually a physical model of this that can be created, and colleagues in the University of Padua and Milan created a physical model of the inside of the heart. And they suggested that there are good hemodynamic principles that suggest that rougher hearts should pump better. And my naive view of this is it's a bit like golf balls with their dimples. Their dimples seem to catch air and create an air-air interface for golf balls. Here the roughness of the heart might capture some liquid and create a liquid-liquid interface in that key pushing moment of the heart. So we have a suggested hypothesis that trabeculation levels impact cardiac output. But remember, the correlation and all of this dimensionality reduction cannot tell us that it's not the opposite way around, that cardiac output, stroke volume, influences trabeculation. And more irritatingly, it can't even rule out another thing: that there's a third factor that we've not measured, exercise or pollution for example, which both makes your heart rougher and means that your cardiac output is higher. Or rather for pollution, maybe it's the opposite way around. Pollution reduces the amount of cardiac trabeculation and means you don't have such great hearts. And we often see this in human data sets: unmeasured factors that influence the things that we are measuring.
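The third-factor problem can be made concrete with a small simulation. This is a toy sketch under made-up assumptions: the function name `simulate_confounding`, the effect sizes, and the Gaussian noise are illustrative only, and "hidden" stands in for any unmeasured factor such as exercise. Roughness and output have no effect on each other in this simulation, yet they come out clearly correlated.

```python
import numpy as np


def simulate_confounding(n=50_000, seed=1):
    """Return (raw correlation, correlation after adjusting for the hidden factor)."""
    rng = np.random.default_rng(seed)
    hidden = rng.normal(size=n)  # unmeasured third factor
    # Both traits depend on the hidden factor but not on each other.
    roughness = 0.7 * hidden + rng.normal(size=n)
    output = 0.7 * hidden + rng.normal(size=n)
    raw_r = np.corrcoef(roughness, output)[0, 1]

    # If the hidden factor could be measured, regressing it out of both
    # traits would remove the association.
    def residual(trait):
        slope = (hidden @ trait) / (hidden @ hidden)
        return trait - slope * hidden

    adjusted_r = np.corrcoef(residual(roughness), residual(output))[0, 1]
    return float(raw_r), float(adjusted_r)


raw_r, adjusted_r = simulate_confounding()
```

In practice the hidden factor is unmeasured, so the adjustment step is not available, which is why an outside source of randomization is needed to separate these scenarios.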
This is where bringing genetics in really helps because we know that genetics can be associated with both or either. And then importantly, we can convince ourselves that genetics is not associated with many of these other factors in society. And the best analogy here is with randomized controlled trials. So in randomized controlled trials, you take 100 people, you randomize, and very often you make it a balanced randomization. So 50 people get a drug, 50 people get a placebo, and we measure the outcome. So I'm actually on a randomized controlled trial. I'm on the AstraZeneca Oxford vaccine trial at the moment for SARS-CoV-2. And I do not know whether I got the actual vaccine or a dummy vaccine, which will protect me against meningitis but not SARS-CoV-2. So I am in one of these trials. We can think of genotypes in a very similar sort of way. So you take now many more people, and the reason why is that, sadly, the assignment of genotypes is not a perfectly balanced randomization. It's not 50-50. So very often on a genotype, you'll get 50 with one variant and many more with another variant. I am now skipping over the fact that we actually have two alleles because we have two parents. This all generalizes when this becomes a genotype with three options. But just to keep it simple, I've imagined that we are haploid, with only two alleles, and then we measure outcomes. There are two great things about genetics. One is that we don't have to actually administer this drug. So the genetics ends up being like a tiny drug with effects. But the other thing about it is that we can often find many different variants. So variants that associate with the same aspect, but it's a different set of 50 people in each case in the sort of smaller arm, or they only overlap by a very small amount. And this process of using genotypes like a drug trial is called Mendelian randomization.
And the way it's shown is by these plots and I should go through these plots. Maybe the bottom one is the top one, the PowerPoint screen grab did not work quite so well. So in the bottom one each dot here is a SNP and I'm plotting, or rather my student Hannah is plotting, its effect on one measurement. This is an eye measurement called intraocular pressure, and its effect on another measurement. In this case, it's the eye disease glaucoma. So up here, this is a SNP. I don't know, can you see my cursor? I don't know about that. We can, yes, we can. Up here is a SNP which has this amount of effect on intraocular pressure and it has this amount of risk on glaucoma. Now it's interesting that one SNP associates both with this endophenotype, this phenotype we can measure on the eye, and with this disease risk, but what's even more impressive is that the many SNPs that influence IOP in general also influence glaucoma, and they influence it in the same direction. That's really important. And you can see that you can fit actually a reasonable linear model of all of these SNPs. The SNPs lie on a straight line. This is another example here where the endophenotype that we're using is lipids, LDL cholesterol, and the outcome here is coronary heart disease. Now in both cases, we happen to know about this association. So again, think of each one of these as like a little drug trial. There are some people here who have slightly more pressure in their eyes, that's intraocular pressure, and then they're slightly more likely to get glaucoma. On these SNPs up here, there are people who are slightly more likely to have high LDL, and therefore they are, well, not therefore, they end up being slightly more likely to have coronary heart disease. Not only is that true for one SNP, but it's true for all of these SNPs in aggregate.
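The straight-line fit across SNPs can be sketched with the standard inverse-variance-weighted (IVW) estimator, which is a weighted regression through the origin of each SNP's effect on the outcome against its effect on the exposure. This is one common choice and not necessarily the method behind the plots in the talk. The summary statistics below are simulated, and the name `ivw_estimate`, the true effect of 0.5, and the standard errors are all assumptions for illustration.

```python
import numpy as np


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW slope and its standard error."""
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    weights = 1.0 / np.asarray(se_outcome, dtype=float) ** 2
    denom = np.sum(weights * bx**2)
    slope = np.sum(weights * bx * by) / denom
    return float(slope), float(np.sqrt(1.0 / denom))


# Simulated per-SNP summary statistics (hypothetical values).
rng = np.random.default_rng(2)
n_snps = 20
true_effect = 0.5
beta_x = rng.uniform(0.05, 0.2, size=n_snps)  # SNP effect on exposure
se_y = np.full(n_snps, 0.01)                  # SE of SNP effect on outcome
beta_y = true_effect * beta_x + rng.normal(0.0, se_y)

# Each SNP alone implies a slope (the Wald ratio), like one small trial.
wald = beta_y / beta_x

# Combining all SNPs gives the aggregate estimate and a 95% interval.
slope, slope_se = ivw_estimate(beta_x, beta_y, se_y)
ci = (slope - 1.96 * slope_se, slope + 1.96 * slope_se)
```

If the interval `ci` excludes zero, the SNPs in aggregate support an effect of the exposure on the outcome, under the usual Mendelian randomization assumptions, which this sketch does not check.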
And the way we show that aggregate behavior is this meta-analysis. What is being plotted is basically the slope of the line for each SNP. So each SNP implies a particular slope, and the red points here are the best fit line, and there's a variety of ways of doing that best fit. You can just do the best fit, imagine it's a regression, but that ends up being not so sophisticated because some SNPs have bigger effects than others, and there are certain systematic biases that can come into these measurements. And these different ways of doing the meta-analysis are different ways of coping with systematic bias. So in each plot here the point estimate is the slope of the line defined by that SNP, and the black bars are the confidence interval. Of course, if the confidence interval crosses zero, we're not confident that there is an association. You can see across the SNPs, this is kind of biased towards the top right, but there are some SNPs that actually show the opposite direction. When we take all the SNPs together, the meta-analysis is pretty confident that it's not zero, and therefore there is an association between LDL and heart attacks. Now, both of these are sort of positive controls because in both cases we have trials. In the case of LDL and heart attacks, it's absolutely nailed down that if you lower LDL, you reduce heart attacks. And in this case, there's actually a surgical procedure on intraocular pressure: if you do this surgical procedure to relieve intraocular pressure, you reduce, in fact, the progression of glaucoma. So this is also quite nailed down. You can see the meta-analysis here is good. I want to point out that this analysis can also come out the opposite way. I'm now showing the meta-analysis for many SNPs in many different cohorts.
This is for HDL, which is sometimes called good cholesterol. Now, good cholesterol observationally has been found to be counter-correlated with LDL cholesterol. And so observationally, people who have high HDL have had fewer heart attacks, but we've never really known whether high HDL is directly protective for heart attacks or whether it's just a correlation with other aspects of lipid metabolism. Now, a number of drug companies made drugs to the HDL pathway and they performed very expensive trials, billions of dollars of trials. A number of people also did this genetic analysis, which was aiming to use SNPs in different places in the HDL pathway to estimate whether there was any effect. And what you can see here is the meta-analysis, which is this diamond right at the bottom. So the one means there's no effect. It's an odds ratio of one. The SNPs are neither protective nor non-protective for heart attacks. And you can see rather depressingly that it is bang on one. Just to cut to the chase, the three-billion-dollar drug trials also discovered the same thing. That is, drugs that change HDL do not affect heart attacks. So now going to my original question. So back to this question here. Is it this way, or is it this way, or is it this? We were able to do the same analysis here between fractal dimension and stroke volume, and later on with a different phenotype called QRS. Now it's not as good as intraocular pressure or LDL, but these straight lines are convincingly not zero. And that is what these red lines show you here. It's not quite as beautiful as LDL or glaucoma, but it's convincing that this association is really the case, that the trabeculae difference leads to changes in stroke volume. And there's a different phenotype that we measured, or other colleagues of ours measured, about electrical conductance. And there is reasonably good evidence in this case, though it's actually down to pretty much one SNP driving it.
So I've just talked to you about hearts and there's no reason not to do this on everything. So, I say jokingly, I will do this on any part of human physiology, and my current student is working her way through the eye. And there are all the different ways of measuring eye morphology, also in UK Biobank. And this is a picture of some of the different parts of that eye and here are the GWASs. And in this case, actually, she is showing that these endophenotypes are not involved in some of the classical diseases that they were thought to be involved with. So it's the opposite way round, where we sort of disprove our hypothesis rather than prove our hypothesis. So what was this piece of machine learning? It was biology. We developed a robust way to measure cardiac trabeculation and we found many loci associated with these anatomical features. And as a process, this involved an awful lot of machine learning and multivariate statistics. But we are able to go beyond the descriptive view of this towards a causal view by using both physical modeling and Mendelian randomization. We have strong evidence that these anatomical structures have a role in cardiac output and electrical conductance. So just to acknowledge the people who contributed to this. This was work done by an extremely talented student in my lab, Hannah Meyer. She now runs her own lab at Cold Spring Harbor. But she worked with a great set of colleagues led by Declan O'Regan and Stuart Cook at Imperial College London, with many patients being measured through the NHS here as well. And I'd highlight Antonio de Marvao and Timothy Dawes, who spearheaded this analysis. And that goes to another aspect of this kind of machine learning. It is definitely a team sport. Individuals here very rarely have all the knowledge and expertise. So some people may be good at the statistics but less so the data production and data management.
Some people are good at the data management but less so at the logistics of the people coming in and out or the measurements. And some people are good at the measurements and not so good at the statistics. And you really need a team to deliver some of this. And this has been a great and wonderful team. So thank you very much. And I'm very happy to take any questions. Good. Thank you. We have two ways of asking questions. One, from inside the network, is by raising your hand here in the Zoom channel, and the other is through Slido. This is for the audience on YouTube in case they want to ask a question. So is there a question from within the network at the moment? So maybe I'll start with a question then. So you emphasized this interpretability and understanding very much. So that's a bit at odds with the current trend, or you could say hype, in machine learning, this deep learning. So which role do you see for deep learning? So I, like many people, have made my peace with deep learning. So I think if you'd asked me that question a year ago or two years ago, I would have said we're really misleading ourselves by having these sort of black box, in some sense uninterpretable, high dimensional non-linear models that transform one space to another. But there's two things. Irritatingly, they work really well. So that's one really annoying thing. And so image analysis and other things. And then colleagues, you know, Oliver Stegle and Anshul Kundaje and Ian Holmes and, who's that friend of course from Harvard? Oh, I've forgotten his name. Oh well. They've all done some very clever things where you can do two different things. The first thing is you can often embed a deep learning model inside a more interpretable model. So then the deep learning model becomes a kind of way of modeling, you know, something which is really quite hard to capture otherwise. So splice sites or these other things.
And then you've forced the deep learning model to sort of output a probability distribution. And then the other thing happens where you can put little phylogenetic trees or little mechanistic models inside the deep learning model. And then there's like a whole class of cases where little parts of your deep learning model have tiny little bits of mechanism in them because you understand how that little bit of the world works in much more detail. So I don't see there being now a big gap, a big difference, between deep learning and non-deep learning. It's really a question of modeling, and appropriate modeling in appropriate times and places. I'd like to emphasize that neither of those gets at causality. So, you know, interpretability is in some sense the lower dimensional space: can I talk about the lower dimensional space? That's one way of interpretability. But actually causality is a different thing from interpretability, which is, do I understand what I need to change to change another part of the output of the model? And again, both mechanistic models and deep learning models don't magically give you interpretability. I mean, it's easy to think about causality in mechanistic models. But it's not a given. It's not true by definition that, because you've used a mechanistic model, you will have captured the causal components. Chloe has a question. Please, Chloe first, and then Lucas. Okay. Hi, Ewan. Thank you for the talk. So I have a question, because you showed several hypotheses and then I think you only disentangled two possibilities. So you had, is this heart... That way or that way, yeah. Is this influencing heart disease or conversely? But you also had the possibility that there was a hidden factor that was influencing both. Are you able to say anything about that? Yeah, actually, in Mendelian randomization, what it actually does is it takes away the hidden factors in many ways. Okay.
So in some sense, the harder thing is actually getting the direction of the arrows. But in a lot of these things, there's an easier... Once you've decided these two things are causally linked, that truly one of these things is driving the other, then usually there's a different thing, which is a temporal progression of these things. Yeah, okay. Which is that very often disease happens after you can measure the physiology. So it really can't be the case that the disease trivially influences that. There are some times when disease is the causal factor. But then you see the temporal direction in the other way. You see the... Okay, yeah. Then it happens the other way. So I skipped over that. There is a technique, which is basically asking which way the correlation is stronger, called MR Steiger, to work out which way the directionality goes. But actually it's as important to think about the logic model of causality here. What MR really does well is it excludes other factors, which is very, very complicated in observational human data. Okay, thanks. It's my pleasure. Thank you. Then there is another question by Lucas Miranda. Lucas, please. Hello, everyone. Thank you very, very much for your talk. It was very interesting and clear. My question is this one. As you mentioned, the UK Biobank is one of the best, if not the best, cohort in the entire world right now. I was wondering if you're aware of how much effort is being put into the generalizability of the models and the analysis today. Yeah. So I think that's a good question. This is a great thing for the UK, that the world's biological machine learners come to play around with healthy human Brits, or not so healthy now in some cases, sadly. I think there's two sides to that generalizability question. So one is a kind of human biology generalizability, and actually humans are pretty much the same all across the world. So that one is sort of reassuring.
We're a very, very tight species. I have rarely seen something which, when it's grounded in mechanistic cellular and molecular biology, does not generalize. In fact, I've never seen it. But I do think one does need to be cautious about some things. And those are the things that are associated with societal factors. You know, human society is much more varied. And then there are the peculiarities of how a particular healthcare system works. So the NHS works in a particular way. It's a completely different debate whether that's appropriate. But the important thing to realize is that it won't automatically transfer to somewhere else. So just to say that for some diseases in particular, the sort of diagnostic pathway can be quite different, actually, in different countries. So you do need to worry about it. But actually those are, for me, second-order effects. It's totally fine to make a lot of your discoveries in the UK Biobank and then to kind of bring them back and try and work out whether they are still consistent in your local setting. So there you can leverage the measurement and discovery power in the UK and bring it back locally. And I actually think we don't do enough of this. There's a kind of myopia sometimes that we have about these things. Thank you very, very much. Thank you. So now we have a question on Slido that I'm reading out, by Rahul Tiaghi. You talked about either confirming or rejecting known hypotheses for common conditions. Are there cases of novel hypotheses that can be confirmed by others? That's the Slido question. I mean, I think the universe of hypotheses in this world is very, very big. So in some sense, this data leads you to creating new hypotheses. One does need to be slightly careful here. So often to create a new hypothesis, you get inspired by a piece of data analysis, exploratory data analysis.
It's really rather important that you do not use that same data set to then go off and confirm your hypothesis. So you do need to be kind of clean about how one goes about doing this. I think that's less of a problem in cohorts the size of UK Biobank, because you can just split the cohort in half and do some things like that to convince yourself that you're not misleading yourself. But there are many other cases where you need to be careful about where you do your hypothesis generation versus where you do your hypothesis testing, even if it's sort of logically a very similar sort of test. And then, you know, at some level, there's some attractive thought about trying to write down every single possible hypothesis in the world. And then we kind of do some Bayesian estimation over all of them. But actually it ends up being a bit unfeasible and just not the way science... I mean, you just can't do that hypothesis enumeration very easily. So this goes back to thinking about where you do your hypothesis generation and where you do your hypothesis testing, and understanding how you do those two different things. Following up on this thought, so you presented these genome-wide association studies between an image phenotype and then all SNPs in the genome. Have you seen a convincing study where this is done the other way round, where you restrict the genetic space to SNPs that are, for instance, involved in eye morphology? And then you look for image phenotypes that are associated with these SNPs. So have you seen this the other way round in any... So that way of kind of turning the axes, where you look at a particular SNP and you look across phenotypes, is often called PheWAS in the field, phenome-wide association. And in general, we tend to... We do a lot of this when we do it. Actually, practicing geneticists are, and I'm definitely part of this, a little bit allergic to restricting our genome hypotheses too much.
And the reason why is that one of the beauties of the genome for this is that it's all-encompassing. I mean, it's not perfect, because we don't have variation in every single part of the genome at the frequencies that we want to see. Okay, so it's not a perfect thing, but it is one of the most unbiased ways to do this hypothesis generation. So taking a SNP and asking what all the other phenotypes are that it's associated with is commonplace, but it's built up, of course, from the full matrix: all SNPs, all phenotypes. And that full matrix is at the heart of some of these big browsers and called... And cutting it out... But even if you consider all patches in an image, so that would be a very big matrix. Have you seen that? Yeah, I mean, I haven't seen that. That goes to something that we have tried a little bit with the eyes, which is trying to use the genetic space to help you understand the dimensionality reduction problem. I think there's some really interesting research to be done here, where you say, I am convinced that these seven SNPs work together. Now, which patches of the image are maximally associated with these seven SNPs in some specific way? And so you can think of this as a dimensionality reduction problem. So what is the smaller space that represents these SNPs, rather than the other way around? I think there's a lot to be done there. And with Hannah Meyer, confusingly, both of my students are called Hannah. So Hannah Meyer explored this a little bit. One of the slightly irritating things is that once you're powered to do it without clever dimensionality reduction, it's actually easier to do it without sophisticated dimensionality reduction. So that's what I found with my current student, Hannah; we explored this.
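The full SNP-by-phenotype matrix described earlier in this answer can be sketched as follows: a genome-wide scan for one phenotype is a column of that matrix, and a PheWAS for one SNP is a row. This is a toy sketch under stated assumptions, not how any particular browser is built: the data are simulated, the function names (`association_matrix`, `phewas`, `gwas`, `simulate`) are hypothetical, and a Pearson correlation stands in for a proper association statistic.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, n_snps, n_phenotypes, effects, rng):
    """Simulated data only. `effects` maps (snp, phenotype) to an
    effect size; every other SNP/phenotype pair is unrelated."""
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n)]
                 for _ in range(n_snps)]
    phenotypes = []
    for p in range(n_phenotypes):
        values = []
        for i in range(n):
            signal = sum(effects.get((s, p), 0.0) * genotypes[s][i]
                         for s in range(n_snps))
            values.append(signal + rng.gauss(0.0, 1.0))
        phenotypes.append(values)
    return genotypes, phenotypes


def association_matrix(genotypes, phenotypes):
    """Full matrix: rows are SNPs, columns are phenotypes."""
    return [[pearson(g, ph) for ph in phenotypes] for g in genotypes]


def phewas(matrix, snp):
    """One row: phenotypes ranked by |association| with a single SNP."""
    row = matrix[snp]
    return sorted(range(len(row)), key=lambda p: abs(row[p]), reverse=True)


def gwas(matrix, phenotype):
    """One column: SNPs ranked by |association| with a single phenotype."""
    col = [row[phenotype] for row in matrix]
    return sorted(range(len(col)), key=lambda s: abs(col[s]), reverse=True)
```

Computing the whole matrix once and slicing it either way keeps the SNP axis unrestricted, which matches the preference voiced above for not narrowing the genome hypotheses in advance.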
And at the end of the day, we just said, we'll just use what the clinicians use, and it makes the conversation with the clinicians so much easier, because you're not trying to explain to them what you're measuring; they know what you're measuring in these things. So I think there's a rich thing to get into there. And I've touched it, but I encourage all of you to get stuck in, because I think there's a good set of questions to be asked there. I agree. I quickly checked whether there are further questions: from the Insight Network I've seen no raised hand, and on Slido neither. We have had a longer discussion. We thank you very much, Ewan, for presenting your work. That was very exciting. Thanks a lot. And we send you a round of virtual applause for that. And thanks again for joining our summer school as a speaker. Thank you very much for inviting me.