So welcome back, also on YouTube and on Moodle for the people who can't watch it directly. I will put the answers online and I will also add some more description to them so that you guys can look through them. Just relax and make sure that you understand what's going on. There are two ways of doing this whole analysis. The first way is to make one big linear model and put everything in. The other way is to first adjust one of the effects out of your data and then test the remaining factor to see if it still has an effect. All right, so let's switch to the PowerPoint. So we did the assignments. Next up is a slide in Dutch. The lecture today is based on a tutorial by Bodo Winter. He has two tutorials: a tutorial on standard linear models, which is his first tutorial, and a tutorial on linear mixed models. Both tutorials are worth reading through. So that was it, and I will do it in English as well. The lecture is an adaptation of the introduction by Bodo Winter. He has a really excellent tutorial, so it's worth reading the whole thing. The tutorial is not something that you can just skip, because the PDF is 22 pages. So I want you guys, and this is the homework for today, to read all of the 22 pages and understand what's going on. For you guys, I compressed everything into 29 slides, so we are just going through the highlights, but read the PDF. There will be at least one question about the PDF on the exam. Ask questions. As always, if you have any questions, just put them in chat and say, stop, I don't understand this, and then we will explain. Since I'm on holiday, there will be no Tuesday meeting next week. So sorry, Daniel. I know how much you like our one hour of chatting when no one else shows up, but we can't do that next week, so we'll have to figure out how to do that. But I don't think that there will be many questions this week, because the PDF is really, really good and it's really in depth.
So, linear mixed effect analysis. Today we will talk about linear mixed effect analysis, and I will explain to you what random effects are. I will show you how to fit these models in R and how to get the significance out of them. When we are talking about random effects, we're talking about two different types of models: random intercept models and random slope models. I will try to explain to you guys what the difference is and when to use which one of these models. So basically, linear models are modeling a relationship, right? You have your response and you have a certain predictor or a set of predictors. The tutorial of Bodo Winter focuses on pitch, also called frequency. Pitch is kind of what you do with your voice. You can have a very low voice or you can have a very high voice, and that's the pitch. So when you do a recording, then based on the recording you can see if someone has a very low voice or a very high voice. The data set is here and you can directly load it in R. It's one of the nice things about R: you can directly load data from a URL, so you don't have to download the data set. It's not a very big data set, so you could also just download it. "So again, the high voice, please." No, I'm not going to do that. But that's the difference, right? There's a pitch that you normally speak at and you can speak at a higher pitch. "I would like to redeem 5000 points if you do the next slide in a high pitch." No, I already did it a couple of times, so we're not going to do this anymore. But it's funny, right? Anyway, the whole PDF, all 22 pages, focuses on the relationship between pitch and the different covariates or predictors that they have. "I'll add another 5K for low voice." I'm already using my low voice, because it sounds more manly when you use your low voice.
The data set is structured like this. I downloaded the data set and opened it in Notepad++, and you see that there's a column called subject, which is just a person, because they're interviewing different people. A person has a gender, which is a misnomer, because it should be sex. Sex is your biological thing, male or female, while gender can have many different levels. Then there's a scenario. In this data set, the different scenarios are the different questions that they have people ask. The idea is that someone comes into a room, gets a list of questions, and has to say these questions in two different settings, and that's called the attitude. In one attitude you do the questions in a very polite setting, where everyone's wearing a suit and you have to sit at a table and these kinds of things, and the other is very informal. So pol means the polite setting, and then you have inf, which is the informal setting. That just means you walk into a room and you read the question, and there's only one person around, and this person is friendly or you know them. And then we have the frequency, which is the response variable: the pitch of the voice. So it's a very basic data set, it's very small, but it's a very interesting data set to model, because as you can already see, we have female one who did question number one, but the same female one also does all of the other questions, in different settings, with different pitches. So there are repeated measurements in here, and we need linear mixed models to deal with the fact that there are repeated measurements. The first model that they introduce, though, is just a basic linear model.
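To make the layout just described concrete, here is a minimal sketch in base R that builds a small simulated data frame with the same five columns. Everything in it is an assumption for illustration: the subject labels (`S1` to `S6`), the effect sizes, and the noise levels are made up, and the numbers are not the real measurements from the tutorial.

```r
set.seed(1)

# Simulated stand-in for the politeness data (NOT the real measurements):
# 6 subjects x 7 scenarios x 2 attitudes = 84 rows, one pitch value per row.
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))

# Pitch = baseline - sex effect - politeness effect + subject shift
#         + scenario shift + residual noise (all values are invented).
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

print(head(d))
# Every subject appears 14 times: these are the repeated measurements.
print(table(d$subject))
```

The `table()` call shows why the rows are not independent: each subject contributes 14 of them.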
So the hypothesis of Winter and Grawunder in 2012, and they talk about this more explicitly in the PDF, is that the frequency, so the tone of your voice, depends on the setting that you are in. If you are speaking to the examination board, your voice will be different than when you're speaking to your friends. That was the hypothesis. The attitude in this case was treated as a categorical factor with two levels, a formal setting versus an informal setting, and the idea was that this setting will change your frequency. Of course we have to extend this linear model, because we have males and females, and everyone knows that males on average have a much lower voice than females. So we have to include gender in our model. The most basic model that you can come up with is that the frequency of your voice depends on the setting that you're in and on whether you're male or female. But now things get a little bit more complicated, because in the way that they did their study, they have multiple measurements per subject. They don't have just one question, they have multiple questions. You have the same person going into a formal setting, and then the same person going into an informal setting, each time having to ask a couple of questions. So by design there are multiple measurements per subject, and that means that we cannot use a standard linear model. We have to use a different type of model, because there is a grouping factor in there. This is where the random effects come in. Every person has a slightly different voice pitch, and this is going to be a factor that affects all responses from the same person, thus rendering the different responses interdependent rather than independent. And this is just because if I'm speaking, my voice is different from someone else's voice.
So if someone else's voice is, for example, 233 hertz on average, my voice could be a little bit lower, like 210. And this is the example that they give: subject one may have a mean voice pitch of 233 hertz while subject two may have 210. This is very logical. I think everyone understands that different people have different voices and that you can't just compare the voice of one person to the voice of another person. You have to compensate for this effect, which makes the measurements interdependent. They are not independent, because the same person has the same voice and you can't really change that. So there is something which is person specific. And when we plot the data, the thing that I did here is to just take the different subjects. So we have nine females that participated in the study and we have seven males. And here we see the mean pitch, shown as a box plot of all the frequencies that were measured for each of the individuals. We can see that the females on average have a much higher voice, a much higher pitch, than the males. You can see that there's this one outlier where one of the females had a voice which was more or less a male voice, and that also occurs. If you're doing research and you have people coming in on different days, doing different research studies, then it's not always going to be exactly constant. This was probably just a day on which someone had a cold, so their voice was much lower. But you can see very clearly that females have a higher voice than males. What you can also see is that even the males don't all have the same average pitch. Male one has a much higher voice than, for example, male five, so there's a big difference there. And of course we need to compensate for this when we are doing these kinds of models.
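The per-person differences described above can be checked with a few lines of base R. This is only a sketch on simulated data: the data frame `d`, the subject labels, and all effect sizes are assumptions made up for illustration, not the real recordings.

```r
set.seed(1)

# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# Mean pitch per person: one row per subject, with that subject's gender.
subject_means <- aggregate(frequency ~ subject + gender, data = d, FUN = mean)
print(subject_means)

# One box per person, as on the slide: differences between the sexes and
# between individuals of the same sex both show up.
boxplot(frequency ~ subject, data = d, ylab = "Pitch (Hz)")
```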
So we need to extend our model. We model the individual differences by assuming a different intercept for each individual. We talked about intercepts: the intercept is the point where the line crosses the Y axis, so the value of Y where X is zero. Here it's kind of the mean of the data for a certain individual. You can do this in R by building up a model which looks like this. You're saying that the frequency is based on the attitude, so the setting that we're in, formal or informal, and it's based on whether you're male or female. And there is a subject specific component, so I say one by subject. The first part of the formula, attitude plus gender, gives the slopes, and the second part, one by subject, gives the intercepts. So I'm saying that every individual has the same slope, the same directional coefficient, but the average speaking voice of every person is allowed to be different. You can think of it as the model estimating a mean for each subject and then estimating all of the other effects relative to the mean of that subject. And this is more or less how you would write it down as well when you are writing a scientific paper. The study is a little bit more complex, though, because we have different questions. Similar to the by-subject variation, we also expect by-item variation. The same question asked by different people might have its own difference in pitch as well. It could be based on the number of consonants in there or the number of O's or A's, because an I is of course much higher than an A, in a way. So there might be something special about "excuse me for coming too late", leading to an overall higher pitch compared to asking for a favor, because the A and the O are relatively low sounds, but the I is a relatively high sound. And this is of course regardless of the influence of politeness.
So this might not be due to which setting you are in, but just due to the fact that every question has different letters in there, and these different letters all carry their own pitch. The average pitch across the whole utterance is probably different for something like "excuse me for coming too late" compared to asking for a favor. And when we plot this, here we see the different items, which are the different questions. They ask every person seven different questions, and you can indeed see that even if we disregard the fact that there are males and females and that everyone has a different voice, question number two, for example, has on average a much higher pitch than question number seven. So there is not just per-individual variation, there is also per-question variation. Is it clear that there are two sources of variance which we want to control for and which we're not really interested in? These two things are more or less nuisance variables. We talked about nuisance variables, and generally you want to block them out or randomize them away. But if you do a study like this, where people just come in, there is no real randomization that you can do and there's also no blocking that you can do to get rid of this effect. So you just have to compensate for this effect in the model. How do you do that? Well, we just account for it in the model. We say that the frequency of your voice is based on the attitude that you are speaking in, so a formal or an informal setting, and on your gender. Every person is allowed to have a different mean voice, and every question is also allowed to have a different mean compared to the other questions. All right, so this is the extended model that we have for the data set that was provided. So in R, there is no default support for linear mixed models.
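Written out as an equation, the extended model described above might look as follows. The symbols are my own notation and are not taken from the slides: i indexes the subject, j the scenario (question), and k the attitude condition; the normal distributions are the standard assumption of a random intercept model.

```latex
\begin{aligned}
\text{frequency}_{ijk} &= \beta_0
  + \beta_1\,\text{polite}_{k}
  + \beta_2\,\text{male}_{i}
  + u_i + w_j + \varepsilon_{ijk},\\
u_i &\sim \mathcal{N}(0,\ \sigma^2_{\text{subject}}),\qquad
w_j \sim \mathcal{N}(0,\ \sigma^2_{\text{scenario}}),\qquad
\varepsilon_{ijk} \sim \mathcal{N}(0,\ \sigma^2).
\end{aligned}
```

Here the beta terms are the fixed effects (intercept, attitude, gender), while u and w are the random intercepts that let every person and every question have their own baseline pitch.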
R has support for linear models, but linear mixed models are not supported by default. So you need to install the lme4 package, so the linear mixed effect package, version 4, which is the current version. It might be that soon there will be a version 5. The function that we need from this package is called lmer, for linear mixed effects regression, and this function is very similar to the lm function. So if you know how the lm function works, you can just use the lmer function. So how do we now build up this model in R? Well, the first thing that we need to do is load the lme4 library. You have to install it as well. If you haven't got it installed, then you have to do an install.packages("lme4"), but because you only have to install a package once, after that you can just use library(lme4). The next thing that we're going to do is read the data set from the URL and put it into a variable called politeness. That just holds the data that I showed you. And then we want a box plot. The box plot that I just showed you, split by gender, is one of the standard things to do: if you get a data set, make some plots to see what the data looks like. The first thing that I wanted to do was just the basic model, where I say that the frequency depends on the attitude, so on the setting, and on your gender. But if I use the lmer function for that, it will tell me that I cannot, because the lmer function is only for linear mixed effects models. You have to have a random effect term in your model before you can use the lmer function. So if you're ever in a situation where you see a message like "no random effects terms", then that means that you just forgot to specify them. We can specify the random effects here: we can say one by subject and one by scenario.
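The two steps just described, the call that fails and the call that works, can be sketched as follows. This assumes the lme4 package is installed. The data frame `d` is a simulated stand-in with made-up numbers (not the politeness data from the tutorial), and `msg` and `m` are names chosen here for illustration.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# Without a random-effect term lmer() refuses to fit the model;
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch({
  lmer(frequency ~ attitude + gender, data = d)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Adding random intercepts for subject and scenario makes the call valid.
m <- lmer(frequency ~ attitude + gender + (1 | subject) + (1 | scenario),
          data = d)
print(class(m))
```

For a model without any random effect, the plain `lm()` function is the right tool instead.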
So then we just say lmer: the frequency is determined by the attitude plus the gender of the person, and then we have the person specific variance, where we allow every individual to have a unique voice, and we allow every question to also have a unique frequency. I specify that it should use the data set politeness, which I just loaded in, and then I'm going to make a summary of the whole model. If we do this in R, then this is what it looks like. It tells me that we have a linear mixed model, fit by restricted maximum likelihood, by REML, and then it gives me back the formula. I left out sex here. With sex in the model it's a little bit different, but I left it out to keep the output slightly shorter, otherwise it wouldn't fit on a single slide. The output tells you that the restricted maximum likelihood converged at this number, which is not that important, and it shows you the residuals, which are also not that important. The thing which is important here is the fixed effects. For the random effects, you can see how much variance each of them picks up: the scenario accounts for a variance of around 219, while the subject accounts for around 4000. So you see that there's a big difference between different people and that the difference between questions is not that big. It's only like 1 in 20, but we already saw this from the box plots as well. It will also tell you the number of observations and the groups. This is relatively important, because the number of observations relates to the degrees of freedom that you have in your model. In total we have 83 observations, but these 83 observations are grouped into 7 scenarios and 6 subjects. So we're only looking at a single sex, right? Because I left sex out of the model, I'm only looking at a single sex, and I'm looking at the males, because we have 6 males and 7 females.
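The pieces of the summary walked through above can also be pulled out one by one. The sketch below assumes lme4 is installed and uses a simulated stand-in data frame `d` with invented numbers, so the variances and counts it prints will not match the slide (the simulated data has 84 rows, not 83).

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# Same structure as the model on the slide: gender left out of the formula.
m <- lmer(frequency ~ attitude + (1 | subject) + (1 | scenario), data = d)
print(summary(m))

# Variance picked up by each random effect, plus the residual variance.
print(as.data.frame(VarCorr(m))[, c("grp", "vcov")])

# Number of observations and number of levels per grouping factor.
print(nobs(m))
print(ngrps(m))

# By default lmer() fits with restricted maximum likelihood (REML).
print(isREML(m))
```

Because gender is left out here, the sex difference ends up in the subject variance, which is one reason that component can be so much larger than the scenario variance.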
And here it will now tell you the fixed effects. Again, very similar to a standard linear model, it will give you the estimates of the beta coefficients. It gives you the intercept, which is the average voice after correcting for the individual and the question. And then we have attitudepol, so this is attitude and then the polite setting, and what it shows us is that in the polite setting your voice seems to be 19.6 or 19.7 hertz lower compared to the informal setting. That means that when you're speaking to something like the exam committee, your voice tends to be lower than when you are talking to your friends. It also shows us the correlation of the fixed effects, which will become interesting when you have many different fixed effects, because it shows you how strongly the estimates of the different fixed effects are correlated with each other. In this case, there's almost no correlation between the attitude and the intercept. So how do we now get the significance? Because the summary here only tells us the estimate, the standard error and the t value, but it doesn't give us any probabilities. This is where linear mixed effect models are very different from linear models. In linear models, because there are no random effects, we can directly go from our t statistic to a p-value, like we saw before. We have the estimate and the standard error, so we can calculate the test statistic, and using the test statistic we can find the probability, so the p-value that belongs to this effect. But this is not how it works in linear mixed effect models. In linear mixed effect models, you are always doing model comparison, so you're comparing one model against another model. If we want to get the significance, we have to define a null model.
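As a sketch of the point above, the fixed-effect table of an lmer fit can be extracted and inspected: it has an estimate, a standard error and a t value, but no p-value column. This assumes lme4 is installed (and that the lmerTest package, which changes this table, is not loaded). The data frame `d` is simulated with invented effect sizes, so the printed estimates are not those from the slide.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

m <- lmer(frequency ~ attitude + gender + (1 | subject) + (1 | scenario),
          data = d)

# Fixed-effect table: one row per coefficient, no probabilities.
fe <- coef(summary(m))
print(fe)
print(colnames(fe))

# The estimated shift in pitch for the polite setting relative to informal.
print(fixef(m)["attitudepol"])
```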
The null model is the model without the factor that we are interested in, and we compare it to a model which does have the factor of interest. So here you see my lmer call with gender and the two random effects, and I want to compare this to the model which also has attitude in there, because the difference between these two models is going to tell me how significant the effect of attitude on the frequency is. Another thing that you have to remember is that when you are comparing two models like this, you have to set REML, the restricted maximum likelihood, to FALSE. Restricted maximum likelihood differs from maximum likelihood in that it estimates the variance components after accounting for the fixed effects, which means that the likelihoods of two models with different fixed effects cannot be compared directly. The details are not that interesting. Actually, if you forget to set REML to FALSE and you do the model comparison between the two models, then the anova function will automatically refit your models using real maximum likelihood instead of restricted maximum likelihood, exactly because the REML fits are not comparable here. So for model comparison you have to use maximum likelihood. Then a question from chat: "But if you only want the effect of attitude on the model, why don't you do it without the other independent factors?" Well, we want to get the effect of the attitude, but we do have to correct for the fact that we have males and females, and we do have to correct for the fact that different individuals have different speaking voices. So if I'm only interested in the attitude effect, I cannot just do a model saying frequency by attitude, because that doesn't compensate for the fact that the same individual is measured multiple times. And it doesn't correct for the fact that the same question is measured multiple times across different individuals.
So I have to put in all of the factors that are nuisance variables, the ones that cause variance that I'm not interested in, so things like gender and subject and scenario. And then I compare the model which has my variable of interest to a model which does not have it. In linear mixed effect models, you are always comparing models, so it's more or less always a model selection problem. So when I have these two models defined, how do I now get the significance? Well, the way that I get the significance is by doing an ANOVA test. I'm saying: do an ANOVA, take my null model, and compare it to the politeness model. It will do the model comparison and it will give you the AICs. It will tell you that the null model takes 5 degrees of freedom and that the full model, which has the attitude included, takes 6 degrees of freedom. You see that the AIC drops from 816 to 807, and you can see that there's a significant improvement when you add attitude to the model. So we learned two things. We learned that when you are in a polite setting, your voice changes by around 19 hertz downwards, and that this change is highly significant, with a p-value of 6.5 times 10 to the minus 4, so 0.00065. So really, really significant. All right, so now we have compared these two models. We know what the effect of attitude is, based on the summary, and we know what the significance is, based on comparing our model with the model which does not have attitude in there. So now we have everything that we need. If we were writing a scientific paper, we could now more or less write our conclusion. The conclusion in your publication would be that politeness affects pitch. You mention that this has been tested by a chi-square test, which has one degree of freedom.
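The model comparison described above can be sketched like this. It assumes lme4 is installed. The data frame `d` is simulated with made-up effect sizes, so the AIC values and p-value it prints are not those from the slide. The names `null_model`, `full_model`, `cmp`, `lrt` and the other variables are chosen here for illustration. The last lines recompute a p-value from the chi-square value of 11.6 quoted in the lecture, using base R's `pchisq()`.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# Both models are fitted with maximum likelihood (REML = FALSE) so that
# their likelihoods can be compared.
null_model <- lmer(frequency ~ gender + (1 | subject) + (1 | scenario),
                   data = d, REML = FALSE)
full_model <- lmer(frequency ~ attitude + gender + (1 | subject) + (1 | scenario),
                   data = d, REML = FALSE)

# Likelihood ratio test: AIC, log-likelihood, chi-square and p-value.
cmp <- anova(null_model, full_model)
print(cmp)

# The same test by hand: twice the difference in log-likelihood, compared
# against a chi-square distribution with (6 - 5) = 1 degree of freedom.
lrt      <- as.numeric(2 * (logLik(full_model) - logLik(null_model)))
df_null  <- attr(logLik(null_model), "df")
df_full  <- attr(logLik(full_model), "df")
p_manual <- pchisq(lrt, df = df_full - df_null, lower.tail = FALSE)
print(c(chisq = lrt, df = df_full - df_null, p = p_manual))

# p-value for the chi-square value of 11.6 with 1 df quoted in the lecture.
p_lecture <- pchisq(11.6, df = 1, lower.tail = FALSE)
print(p_lecture)
```

The hand calculation is only there to show what `anova()` does with the two fits; in practice the `anova()` call is all you need.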
That one degree of freedom is the difference between the degrees of freedom of the null model and of the model that you're looking at. The output also tells you this: the df of the chi-square is 1. We have a chi-square value of 11.6. So you report a chi-square value of 11.6 with one degree of freedom, which corresponds to a p-value of 6.5 times 10 to the minus 4, or 0.00065, with politeness lowering the pitch by around 19.7 hertz, plus or minus 5.6. That last number is just the standard error, which we can also get from the summary. So that is how you do linear mixed effect models. If you have repeated measurements, you have to deal with these repeated measurements, otherwise you are massively overestimating the significance of your effect. Say I measured two individuals and they differ by some amount. The significance of this is not very high, because I only measured two individuals. But if I now measure the same individual 100 times, and I measure the second individual also 100 times, then all of a sudden a linear model would see it as 100 measurements versus 100 measurements. This is of course not true, because you have only measured one individual versus one other individual. So the significance should not change based on the number of repeated measurements that you do. The power of linear modeling and of regression comes from the fact that you have multiple independent measurements, and having 100 measurements of the same individual does not make an effect more significant. All right, so random slopes versus random intercepts. In our model, you see that each scenario and each subject is assigned a different intercept. If we look at the model that we get, we see that each of the seven scenarios has a different intercept, and every subject also has a different intercept.
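The per-group intercepts mentioned above can be listed with `coef()`. The sketch below assumes lme4 is installed and again uses a simulated stand-in data frame `d` with invented numbers; `per_subject` and `per_scenario` are names chosen for illustration.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
d$frequency <- 250 - 100 * (d$gender == "M") - 20 * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# Random intercept model: (1 | subject) and (1 | scenario).
m <- lmer(frequency ~ attitude + gender + (1 | subject) + (1 | scenario),
          data = d)

# One row per subject / per scenario. The "(Intercept)" column is allowed
# to differ per row; the attitude and gender columns repeat the single
# fixed-effect estimate in every row.
per_subject  <- coef(m)$subject
per_scenario <- coef(m)$scenario
print(per_subject)
print(per_scenario)
```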
But we see that the influence of the attitude, the polite attitude, is always estimated as minus 19.7, and the effect of being male is always estimated as lowering your voice by 108 hertz. That is what you would expect when you give this model the one by subject and the one by scenario, because it takes the by-subject and the by-item variability into account only through the intercepts. However, it might be that there is a relationship, that based on the scenario, so based on the question, the effect of the polite attitude actually changes. That is a relatively normal assumption: the effect of politeness might be influenced by the question that you have to pose. Some questions might not show a massive difference, while other questions might show a much bigger difference. Then you are talking about a random slope model. In this case, we're just doing random intercepts. We say that every individual is allowed to have their own intercept, every scenario is allowed to have its own intercept, and the effect of politeness is equal across the different scenarios and equal across the different individuals that we are looking at. So a random intercept model means that the fixed effects, attitude and gender, are the same for all subjects and all items. In this model, we account for baseline differences in pitch, but we assume that whatever the effect of politeness is, it's going to be the same for all individuals and all items. However, it might be expected that some people are more polite than others.
In other words, politeness could have a bigger effect on the frequency for some people than for others. For example, you can imagine that if you are someone who has a very high voice, you might drop more than 19 hertz when you are in a polite setting, while someone who has a very low voice might drop only like 7 hertz, because their voice is already low. What we then need is a random slope model, where subjects and items are not only allowed to have different intercepts, but where they are also allowed to have different slopes, so different beta coefficients for the effect of politeness. So how do we do a random slope model? Well, we write it down like this in R. We say that the frequency, which is the thing that we want to predict, is based on the attitude and on the gender, and now we allow every subject to have its own intercept, but we also allow every subject to have its own attitude effect. So subject one can have a drop of like 25 hertz, while subject two might have a drop of only 10 hertz. And we also do this attitude by scenario. If we look at the formula, we also write a one plus, and that just has to do with the way that the model is specified in R. The notation one plus attitude by subject means that you tell the model to expect different baseline levels of frequency, the intercept, represented by the one, as well as a differing response to the main factor in question, which is our attitude. All right, so if you want to learn much more about random slope and random intercept models, then read the tutorial. The tutorial is part of the exam, so you will have to read it, otherwise you will fail at least one or two exam questions. But today I told you about random effects, so random effects are more or less the way we deal with repeated measurements.
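A random slope model as described above could be written like this in R. This is only a sketch: it assumes lme4 is installed, and the data frame `d` is simulated with an invented per-subject politeness effect, so none of the numbers correspond to the real study. With a data set this small, lme4 may report a singular fit or a convergence warning; the model object `m_slope` (a name chosen here) is still returned.

```r
library(lme4)

set.seed(1)
# Simulated stand-in for the politeness data (NOT the real measurements).
d <- expand.grid(subject  = paste0("S", 1:6),
                 scenario = factor(1:7),
                 attitude = c("inf", "pol"))
d$gender <- factor(ifelse(d$subject %in% c("S1", "S2", "S3"), "F", "M"))
# Invented: every subject gets its own politeness effect around -20 Hz.
subject_slope <- rnorm(6, mean = -20, sd = 15)[as.integer(d$subject)]
d$frequency <- 250 - 100 * (d$gender == "M") +
  subject_slope * (d$attitude == "pol") +
  rnorm(6, sd = 30)[as.integer(d$subject)] +
  rnorm(7, sd = 15)[as.integer(d$scenario)] +
  rnorm(nrow(d), sd = 25)

# (1 + attitude | subject): random intercept (the 1) plus a random slope
# for attitude per subject; likewise per scenario.
m_slope <- lmer(frequency ~ attitude + gender +
                  (1 + attitude | subject) + (1 + attitude | scenario),
                data = d, REML = FALSE)

# The attitude column is now estimated per subject and per scenario
# instead of being one shared value.
print(coef(m_slope)$subject)
print(coef(m_slope)$scenario)
```

To test attitude in this setting, the null model would keep the same random slope terms and only drop the fixed effect of attitude, as in the model comparison shown earlier in the lecture.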
I told you about mixed models, what a random intercept model is, and what a random slope model is. For the assignment, I want you guys to read the tutorial and of course practice, practice, practice, because the only way that you will become a good programmer is if you spend the time on answering programming questions. All right, so any questions so far? I would bet there are some, but we'll see. I have another presentation lined up for you guys, which is not really part of the lecture, but I just wanted to show you what I have been doing with random slope and random intercept models recently, just to make it a little bit more practical. Because this example is based on frequency and pitch, and I hope I explained everything well: that people have different voices, that the voices are also different based on the questions, because questions just have different letters and every letter is a little bit different in pitch as well, and that you can assume that politeness is just a single influence, or that the effect of politeness might be different based on which individual is speaking. But I also have a more or less similar example which I wanted to show you guys. The Bodo Winter tutorial is definitely worth it. Also check out tutorial number one. Tutorial number one is just about linear models, and tutorial number two, which you have to read for the assignment today, is about linear mixed effect models. All right, so just a shout out in chat. Do you have the idea that you now kind of know what the difference is between a standard linear model and a linear mixed model? So the difference between an LM and an LME or an LMM, depending on how you shorten it? "Yes." Okay, that's the first one. Okay, then I achieved part of my goal. Testesartos says no. Okay, so what don't you understand? Another yes. I do think it's a little bit sad though, Test, that you don't have the diamond anymore.
I'm going to actually give you back your diamond as well. At least you have your diamond back then. "I don't want that." Why don't you want your diamond back? You just completely unsubscribed after I muted you for five to ten minutes. Then the question: "I don't understand why it is different. Isn't it only a more precise linear model?" No, no. Imagine a case where I have measured three big mice and three small mice. Now I can do a t-test between these two groups. But if I would start measuring the big mice over and over and over again, then it would look like I have a whole bunch of data. It would look like I have a lot of power to detect effects. But that is actually not true, because the power comes from independent measurements. And if measurements are not independent, you cannot treat them as such, because then your p-value will artificially get lower and lower and lower. A linear model has no concept of things that might be measured on the same individual. But with a linear mixed effect model you can tell it that this is the same individual measured 10 times. Then when it calculates the p-value, it will not treat this as 10 independent measurements, but as repeated measurements of a single individual. So that is the big difference between linear models and linear mixed models: a linear model cannot deal with repeated measurements, while a linear mixed effect model can. That is the main difference between these two things. And then we will have another break, and after the break I will show you another example. We will go into a little bit more detail, and it will again be based on our own Berlin Fat Mouse, which I am always trying to promote. So I will do kind of the same lecture again, but now on data that I collected myself. And oh, an anonymous cheerer. Thank you for the cheer. Thank you for the one bit. So good. So Topias was the last one, right?
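The big-mice versus small-mice argument above can be illustrated with a small simulation in base R. All numbers are invented for illustration: six hypothetical mice with fixed "true" weights, each weighed 100 times with a little measurement noise. A plain linear model on all 600 rows treats them as independent; a model on one mean per mouse respects that there are only three animals per group.

```r
set.seed(42)

# Hypothetical true weights (grams) of three big and three small mice.
true_weight <- c(big1 = 31, big2 = 29, big3 = 33,
                 small1 = 28, small2 = 30, small3 = 26)

# Each mouse is weighed 100 times with small measurement noise.
mice <- data.frame(mouse = rep(names(true_weight), each = 100),
                   group = rep(c("big", "small"), each = 300))
mice$weight <- true_weight[as.character(mice$mouse)] +
  rnorm(nrow(mice), sd = 0.5)

# Naive linear model: pretends there are 300 independent animals per group.
p_naive <- summary(lm(weight ~ group, data = mice))$coefficients[
  "groupsmall", "Pr(>|t|)"]

# One value per mouse: the comparison is really 3 animals versus 3 animals.
mouse_means <- aggregate(weight ~ mouse + group, data = mice, FUN = mean)
p_means <- summary(lm(weight ~ group, data = mouse_means))$coefficients[
  "groupsmall", "Pr(>|t|)"]

print(c(naive = p_naive, per_mouse = p_means))
```

Averaging per mouse is only the simplest way to respect the grouping; a mixed model with a random intercept per mouse, as in the lecture, does this without throwing away the individual measurements.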
So the break screen now will be insects, which is the one that you already saw last time, but then we only had two hours, so we didn't really get to use the break screen. So I will see you guys back at like 4:05, so after a 10 minute break. I will stop the recording.