Good afternoon and a very warm welcome to this edition of the Thijsig Talks. We are broadcasting live again from the Deprez building in the vibrant and lively Spoorzone. Our team today consists of my co-host, Pieter Spronk, and our technical support team: Maarten, Priscila and Florence. And my name is Mirjam Siesling. We are very proud and honored to stage two speakers today, Marijn and Boris, and Pieter will introduce them properly later. Now, without further ado, over to Pieter to introduce our first speaker. Pieter, the floor is yours.

Thank you, Mirjam. Our first speaker is Marijn van Wijnaldum. He is an assistant professor at CSAI in the field of computational psychiatry. He was previously the head of the social rodent lab at Heinrich Heine University Düsseldorf, where he investigated social decision making in animals, probably very small animals. Already during that time, he specialized in recording electrophysiological signals, that was one of those tough words, and the intriguing ultrasonic vocalizations that rodents exchange during social interaction. He is also well known as the founder and president of the Dutch Brain Olympiad, the Stichting Hersenolympiade Nederland, from 2018 onwards. In his current research, he focuses on analyzing the neural activity of bigger animals, namely humans, to predict their social behavior. Today he will talk about data science in healthcare. You have the floor.

Thank you very much. Thank you, Pieter and Mirjam, for the introduction. And yes, there is a connection between rodents, especially rats, and healthcare, although usually a bad one. But I won't be talking about rats in this talk. Rather, the talk is about data science applications to clinical data, so data collected by GPs or, in this case, by hospitals.

To briefly illustrate the thinking behind this field of work: the promise of using AI in healthcare is that we can arrive at precision medicine, namely identifying which treatment works for which segment of the population. Illustrated here, you see the typical treatment as usual: a list of treatments is worked down in a particular order, based on what generally works for most people and possibly also on their side effect profiles. And if the most preferred option doesn't work for a particular person, you just go down the list until you get to one that hopefully does work. Now, ideally, we would instead predict which treatment, whether medication or, for example, psychological or psychiatric treatment, would work for which person, so that they don't have to wait as long for the appropriate care. To make those predictions, we look to AI and machine learning models, and that is how we would arrive at personalized healthcare.

So there are opportunities for AI in healthcare. There are usually large amounts of data stored in hospitals and other care facilities. Some of it is well structured, for example imaging data such as CT or MRI scans. The International Classification of Diseases (ICD) also provides a taxonomy that makes it very clear which patient suffers from which condition. And there is a large opportunity for what I would call smart triage: trying to figure out which patient is most in need of particular care. This can be implemented in decision support systems, so that you can maximize the benefit of care with limited resources, such as staff, but also technical resources.
Now, there are also a lot of challenges, of course. A lot of clinicians' expert knowledge is actually hidden, so not immediately present in tabular data. It might be hidden in unstructured clinical reports, or in the translation the clinician makes from what they see at the bedside into the classification scheme that is then used in the machine or in the reporting, and that translation might not capture the data accurately. There is possible data ambiguity when data has been entered later in the day, perhaps with some hindsight bias. And it is usually pretty difficult to translate findings from one hospital to another unless, for example, the scanning equipment and the protocols are very well matched, which is quite a challenge. So out-of-sample validity is difficult to achieve.

I wanted to talk about the WeCare collaboration, started by ETZ Hospital and Tilburg University under the Impact program, which brings together clinicians, AI researchers and others to identify areas for collaboration where data science and artificial intelligence might benefit clinical care. This is one of the sub-fields highlighted under that collaboration. What I will be talking about is two projects. One uses artificial intelligence to predict the outcome after trauma, for WeCare. The other, if there is still time, is data-driven clustering of psychiatric symptoms in a population cohort, based on data collected in the Rotterdam Study at Erasmus MC. And then we also have a shared project to discuss at the end of this session.

The first project is about predicting self-reported outcomes after trauma. We know that mortality after trauma incidents has dropped sharply with the advent of better care, but there is still a large difference in recovery profiles between individual patients, that is, in how well they recover after trauma. This is captured in subjective quality of life ratings that address both physical rehabilitation and psychological recovery. The goal of this project is to see if we can predict how well a certain patient will recover, in order to provide them with information at the beginning of the trajectory, both to set expectations and perhaps to inform some shared decision-making.

The data set we use for this is the Brabant Injury Outcome Surveillance System (BIOS), a large long-term data collection effort run in collaboration with the Brabant trauma registry. There are five follow-up samples after the initial hospitalization in one of the medical centers in Brabant. The measures are self-reported scales: the EuroQol five-dimension score (EQ-5D) and the Health Utilities Index (HUI), and there is the Hospital Anxiety and Depression Scale (HADS) as well. On the slide below you can see the co-authors on this project. This is the description of the study, and this is the protocol, so you can see here the five measurement moments at which all of these questionnaires were administered to the participants. All in all, somewhere in the neighborhood of 1,500 to 2,000 records are retained across the five sampling points.

Previous approaches looked at predicting the precise values of these self-reported scores, with varying success: R-squared scores ranged between 36 and 48% overall, and this was on the physical indicators only.
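To make the data structure concrete before turning to the clustering step, here is a minimal sketch, in Python, of how repeated self-report measures like these might be arranged into per-patient trajectories. The column names (patient_id, wave, eq5d, hads_anx) and the values are illustrative placeholders, not the actual BIOS variables.

```python
import numpy as np
import pandas as pd

# Illustrative long-format data: one row per patient per follow-up wave.
# Column names and values are hypothetical, not actual BIOS fields.
long_df = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 2],
    "wave": [1, 2, 3, 1, 2, 3],              # follow-up moment
    "eq5d": [0.4, 0.6, 0.8, 0.3, 0.3, 0.4],  # EQ-5D utility score
    "hads_anx": [9, 7, 4, 12, 12, 11],       # HADS anxiety subscale
})

# Pivot to one row per patient and one column per (measure, wave) pair,
# then stack into an array of shape (n_patients, n_waves, n_measures),
# the kind of trajectory array longitudinal clustering methods expect.
measures = ["eq5d", "hads_anx"]
wide = long_df.pivot(index="patient_id", columns="wave", values=measures)
trajectories = np.stack([wide[m].to_numpy() for m in measures], axis=-1)
print(trajectories.shape)  # (2, 3, 2)
```

Longitudinal clustering methods, including the kml3d approach mentioned next, generally expect trajectories in roughly this patients-by-timepoints-by-measures shape.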
In this approach, what we wanted to do is see whether we can detect patterns in all of these variables jointly, measured at different intervals after trauma, by clustering on the longitudinal data, which consists of four plus two outcome measures. What we want to achieve is a limited number of clusters that we can then use as targets for prediction in a supervised machine learning approach, to see if we can predict cluster membership from baseline measurements and demographic data.

For longitudinal clustering, we used a couple of methods: KML3D, an extension of k-means clustering; a Bayesian method called HD-classive; a Gaussian mixture model; and also a longitudinal autoencoder. And here is an example of how to decide on the optimal number of clusters. One way to do that is by looking at the gap statistic, but the Bayesian information criterion (BIC) is also sometimes used to select the optimal number of clusters for your clustering solution. At the bottom you can see the co-authors who contributed to this paper, which has now been submitted.

Here are some of the recovery profiles. We see an example with six clusters. The scores take a pre-injury baseline into account, and then you can see recovery over the six measurement moments, with quite some disparity in outcomes between cluster one, with the best outcomes, so near-complete recovery, and cluster six, which has rather poor outcomes. On the right-hand side, we see a profile for the HADS anxiety scores. In this case, lower values are better. So on the left-hand side, we see a group that didn't have much anxiety and actually further reduced their anxiety after injury, while at the top right-hand side is a group that had quite severe anxiety and didn't improve at all over the course of the measurement period. So there is quite some disparity there.

We then used supervised classification for cluster membership. To make a selection between different kinds of models, we looked for models that have relatively high accuracy, well above the majority-class baseline, and also high clinical sensibleness. The clusters we find need to be reasonably evenly distributed across the sample, so that the different clusters are actually meaningful, and this meaningfulness was also discussed with the help of clinical experts.

Here we see a visualization of our approach. We first selected plausible models based on criteria such as the gap statistic and BIC, then applied machine learning metrics for supervised classification, and then asked the clinical experts whether they thought the resulting clusters were actually clinically sensible. What we mean by that is that they conform to known risk factors for subjective recovery profiles. For example, having a hip fracture is considered predictive of bad outcomes in comparison to not having a hip fracture. Here we see a visualization using t-distributed stochastic neighbor embedding (t-SNE). This is actually a bad example: the clusters are not very well separated at all, and there were also solutions where this was much better. Clinical sensibleness is explained here in a little more detail.
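As a hedged illustration of the cluster-count selection just described, and not the study's actual pipeline, one can scan the BIC of a Gaussian mixture model over candidate numbers of clusters. The data matrix here is a random stand-in for the flattened patient trajectories.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Random stand-in for the real data: one row per patient, with the
# trajectory flattened to n_waves * n_measures features.
X = rng.normal(size=(500, 12))

# Fit a Gaussian mixture for each candidate number of clusters and
# record the BIC; lower BIC means a better fit/complexity trade-off.
bic_by_k = {}
for k in range(1, 9):
    gmm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    bic_by_k[k] = gmm.bic(X)

best_k = min(bic_by_k, key=bic_by_k.get)
print(f"BIC-preferred number of clusters: {best_k}")
```

As the talk emphasizes, a statistic like this is only the first filter: candidate solutions are then screened for predictive accuracy and clinical sensibleness.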
On the right-hand side, this arrow shows that between the different clusters, the different rows in this table, the percentage of hip fractures changes quite a lot, going from the best outcomes in cluster one to the worst outcomes in cluster six. It also recapitulates an axis of age: we can see that the mean age increases across the clusters, and the number of comorbidities does as well. So this clustering solution actually recapitulates a lot of the known risk factors, and therefore it was judged to be clinically sensible.

The conclusion here is that we can actually do longitudinal clustering of patient-reported outcomes, and that we can predict cluster membership with reasonable accuracy from initial hospitalization data that includes both demographic data and pre-injury health scores. Clinical assessment is essential for sensible clusters, so we really depend on our clinical partners here to pick the best models. And now the challenge is to move towards implementation. The goal would be to make a dashboard for this, a visualization that can be used to set expectations for patient recovery, both by the patient and by the clinician, and perhaps to discuss potential interventions if risk factors predict a bad outcome; maybe you want to start psychological care or physical therapy early on. We think that, for example, reporting the probability of ordered cluster membership would be a good way to do that: give a probabilistic forecast of what the expected recovery profile would look like, and then engage in a discussion with the patient about what they think of that.

The second project, if I am good for time, and I will look at the moderator... yes. This is another project that also looks at clustering, but in this case data-driven clustering of psychiatric symptoms in a population cohort. It is based on the Rotterdam Study, a project also known as ERGO, in which a large neighborhood in Rotterdam was evaluated in many ways, so there is a lot of information on these participants.

The problem with classification of psychiatric illness according to the DSM is that it usually proceeds by scoring a number of symptoms that are endorsed by the patient or by the clinician. But to arrive at the same diagnosis, for example major depression, you can actually have quite disparate sets of symptoms. So there is a problem: people with the same clinical outcome can display rather different symptomatology, and symptoms can also be shared across diagnoses. We wanted to see what would happen if we just look at the population, see how the symptoms that are assessed co-occur naturally, and find clusters that are maybe a little bit different.

What we did is take symptoms from three validated scales that were assessed in the Rotterdam Study and perform hierarchical cluster analysis. The summary is that the items from the different subscales do not cluster nicely within their own scale, but rather interleave and mix. So the similarity in symptom endorsement actually transcends these questionnaire boundaries: some of the anxiety subscale items cluster better with depression items, and so on. When we did that, we arrived at these profiles for different participants.
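A minimal sketch of this kind of item-level hierarchical clustering follows, assuming binary symptom endorsements; the item names are hypothetical, not the Rotterdam Study variables. The key point is that the items, not the participants, are clustered, using a co-occurrence-based distance.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
# Stand-in endorsement matrix: rows are participants, columns are
# symptom items pooled from several questionnaires (names invented).
items = ["dep_sad", "dep_anhedonia", "anx_worry", "anx_tension", "sleep_problem"]
X = (rng.random((1000, len(items))) < 0.3).astype(float)

# Cluster the items, not the participants: transpose so each row is one
# symptom's endorsement pattern, and use correlation distance so items
# that co-occur across people end up close together.
Z = linkage(pdist(X.T, metric="correlation"), method="average")

# Cut the dendrogram into two data-driven symptom clusters, which need
# not respect the original questionnaire boundaries.
labels = fcluster(Z, t=2, criterion="maxclust")
for item, label in zip(items, labels):
    print(item, "->", label)
```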
Actually, let me just go to the conclusion here, which is that when we looked at patients with a known diagnosis, those diagnoses actually transcend the boundaries of the data-driven clusters we generated here. The data-driven clusters, which really reflect the co-occurrence of these symptoms in the population, do not match nicely with the boundaries of the traditional classification of diagnoses. We can see that here: the red patients are those with anxiety, and they are basically all over the place in this cluster space. So the conclusion is that there might be a need for epidemiologically defined clinical sub-groups. Maybe there are sub-groups to be found in the anxiety and depression clusters that can be defined by their epidemiological profile, and they might actually respond to different treatments, for example. We don't know this, but it comes back to my first slide: maybe there is a personalized treatment for those sub-groups as well.

There are some open issues that we don't know much about yet. One of those is whether these clusters are stable over time, and that also relates to our second speaker: maybe these profiles can be analyzed as time series, so stay tuned for that. I want to thank my collaborators at the hospital and at Erasmus MC in Rotterdam for collaborating here. Thank you.

Thank you. Let me ask a quick question, because I see there is nothing yet in the chat. If anybody wants to ask a question, please raise your hand or type it in the chat and I will communicate it to Marijn. I was looking at your first piece of research, and a lot of what I see there is about treatments. You try to find the best treatment for people who start at a certain point after an accident, or after something for which they need to be treated. But I also saw that, well, you have these six clusters, and many of them start at different points, which means you can already make an early prediction of where things will go. For some of them, though, that wasn't the case: they started at the same point, and only after the third measurement or so did they begin to diverge into different clusters. Does that mean they started out with the wrong treatment, or what was happening there? How early can you make such a prediction of where things will go, and would it still be possible to make adaptations?

Yeah. What I didn't show here, but it's kind of interesting: the more data you take in, so starting with the pre-injury data, but then also the first week and maybe the first month, the better the prediction becomes, obviously. But then it is less useful, because you are already somewhere down the trajectory. So the most interesting cases are those clusters that start off at a similar, reduced recovery profile and then begin to diverge, like you mentioned. To capture that divergence, we actually need to observe the entire time series, and for a new patient we don't have that data. But we do now have this supervised model that assigns a probability of a person ending up in a relatively good recovery profile versus a poorer one, using only the data that is already available after the initial hospitalization. So it is in time, basically, to still make changes.
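To illustrate the kind of probabilistic forecast described in this answer and in the dashboard idea above, here is a hedged sketch: a classifier trained on baseline-only features outputs a probability distribution over the six recovery clusters for a new patient. The feature set, model choice and data are all illustrative assumptions, not the study's actual model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Stand-in baseline features available at initial hospitalization
# (e.g. age, comorbidity count, pre-injury EQ-5D, fracture type);
# both the features and the labels here are random placeholders.
X = rng.normal(size=(1500, 8))
y = rng.integers(0, 6, size=1500)  # recovery-cluster labels for clusters 1..6

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_train, y_train)

# Probabilistic forecast for one new patient: a distribution over the
# six recovery clusters, suitable for a dashboard, rather than a
# single hard cluster assignment.
proba = clf.predict_proba(X_test[:1])[0]
for cluster, p in enumerate(proba, start=1):
    print(f"P(cluster {cluster}) = {p:.2f}")
```

Reporting the full distribution rather than the top cluster is what makes the forecast usable for setting expectations and shared decision-making, since borderline patients show up as genuinely uncertain.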