Good afternoon everyone. Welcome to this session on moderation. Just as a quick check to start with: can everyone hear me okay? Yes? Great. You'll see a third name listed on the screen, Andreas Richter, a colleague of ours who submitted this PDW with us but, in the way the Academy sometimes works, had another session scheduled at exactly the same time. So we figured we could cover this one between the two of us, and if you wanted to hear Andreas specifically, I'm sorry, but you'll have to wait for another occasion. This session covers a wide variety of topics around moderation. We're going to start from a fairly low base, not presuming much knowledge, but we'll get onto far more complex models by the time we're done in just under two hours. Now, there are some resources that may be of use to you at the website you see here; it's got the URL at the bottom and also a QR code, and we'll show that a few more times before the end of the session. It includes links to the slides we're using, to an example data set we'll use for the examples, and to syntax we have put together in three different software packages, SPSS, R and Stata, because we think these are the three most commonly used by people who attend the Academy. There are also links to Excel templates, which I'll be talking about a little bit later on. A couple of housekeeping things before we get going. One is that, although I'm not quite sure how this is working this year, there will be some evaluation forms distributed at the end. If you have a moment at the end to evaluate what we've talked about, that would be great. We may well offer this again in future years, we'll see.
If there's anything that can be improved, or anything you'd like to see, it's worth telling us about it. The other thing is about questions. I'm sure you're going to have questions at different points, and there will be specific opportunities to ask them when we complete each section. We ask that you hold your questions for those points, unless something is going wrong behind me or there's something you didn't quite understand, in which case do ask for clarification. Otherwise, I know from experience we need to keep this moving to get through what we have planned. So, I'll put that QR code up again in a moment, but on this next slide there's a different QR code. Yeah, I have this hobby of putting my teaching materials online on my YouTube channel, and when we were preparing this session, I took a look at what material I have about moderation and interaction models and made a short collection of six videos for you. The playlist covers some of the things in this seminar and some things that build on it. And please subscribe and hit the notification button. One more thing: I'm recording this, and I'm not sure yet what I'll do with the recording, but if you say something and you object to your voice being heard on the internet, please send me an email and I will edit you out. Great, thank you. I'm just going to give a few minutes' broad overview of what moderation is, because I expect most of you have some idea about moderation, which is why you're here in the first place. It relates to a broad category of different types of model, which are connected by one main thing: a relationship, or a set of relationships, differs depending on the value of one or more other variables.
So very often we'd be concerned with a link between an independent variable and the dependent variable, and that particular relationship, either its strength or its existence or its direction, might differ depending on what a third variable, the moderator variable, is doing. We'll look at a range of different types of moderation. Incidentally, I'll talk about moderation and I'll talk about interaction; broadly speaking, I use these two terms in the same way. Strictly speaking they're not quite equivalent, and if you want to know more about that, come along to my session with Torsten Beeman on Monday afternoon. But if I say one for the purposes of today, I'm broadly talking about both. So, interactions or moderation: any time we've got a situation where a relationship differs. Just a few examples from the literature of slightly more complex models. To begin with, this one, where we've got a link between the number of weak ties and creativity in this paper by Zhou and colleagues: they found a curvilinear relationship at some values of the moderator and a linear relationship at other values of the moderator. So that's a more interesting, more complex type of model, which we'll come on to a little later this afternoon. We've talked about two-way effects, but what about three-way effects? We'll come on to these later as well, where you've got multiple moderators, or three independent variables interacting with each other, producing more complex relationships. I'm not going to go into the detail now; we'll come back to this example in 45 minutes or an hour or so. And then you might have interactions within non-linear models. Non-linear could mean curvilinear, quadratic, but it could also mean non-linear regression, such as, in this case, Poisson regression models, and you can see the non-linear pattern there.
So that's something else we'll get to a little bit later. But to begin with, we're going to take it right back to the start: what a simple two-way interaction using moderated regression analysis looks like. You've probably seen this type of equation before. If we've got a relationship between x, an independent variable, and y, the dependent variable, normally that would look like y = b0 + b1x. That's what a typical simple linear regression looks like. But if we've got a moderator, represented here by z (and I'm British so I'd naturally say zed, but I'm in the US so I'll try to say zee while I'm here; if I get that wrong, forgive me, that's my disclaimer at the beginning), we also have two other terms: a z term, and an xz term, where x and z are multiplied by each other to create the interaction term, giving y = b0 + b1x + b2z + b3xz. Why is that? Well, if you rearrange this mathematically, it looks like this: y = (b0 + b2z) + (b1 + b3z)x. So we've got the predicted value of y equal to an intercept part, b0 plus another coefficient, b2, times z, which doesn't depend on x, and then another part which does depend on x, with slope b1 plus b3z. In other words, the relationship between y and x depends on the value of z, and that's why multiplying these two variables together gives us the means to test for moderation. So Mikko is going to talk to you now about how we actually start interpreting these before we get into the software. Yeah, if you think about how you interpret a normal regression coefficient in this equation: increasing x by one unit increases y by b1 units. That's the normal regression interpretation, but it assumes that everything else stays constant. If you increase x a little, then typically xz will also increase a little.
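The rearranged equation can be sketched numerically in a few lines. This is my own illustrative sketch, with made-up coefficient values that are not from the session:

```python
# A tiny numeric sketch of the rearranged equation. The coefficient
# values below are made up purely for illustration.
b0, b1, b2, b3 = 1.0, 0.5, 0.2, 0.25

def slope_of_x(z):
    # y = b0 + b1*x + b2*z + b3*x*z can be rearranged as
    # y = (b0 + b2*z) + (b1 + b3*z) * x,
    # so the slope of x is b1 + b3*z: it changes with the moderator.
    return b1 + b3 * z

for z in (-2, 0, 2):
    print(f"z = {z:+d}: slope of x = {slope_of_x(z)}")
```

Note that at z = 0 the slope is just b1, which is the basis of the "indirect method" discussed later in the session.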
And because x affects two different terms in the equation, the interpretation of these models is a bit more complicated than the interpretation of normal regression. For that reason, we normally do plotting. So, a few plots to motivate this. This is a plot from the paper by Techman; this was an AMJ best paper winner. They study the effect of quality on satisfaction ratings for men and women, so they are looking at discrimination, and their hypothesis is that men are rewarded more for quality than women. They plot two different regression lines to interpret what the effect actually looks like. Now the question is: how do you get from this very standard regression table to this kind of plot that tells you what the effect looks like for the two groups? You do it this way. You take the coefficients from the regression model — and now I realize the line arrangement on my slide went wrong, but we take the second regression model, the one with the interaction terms. To produce this plot, we take the regression coefficient of gender, the coefficient of quality, and the coefficient of the interaction of gender and quality, and we calculate predictions: we plug in some values for gender and quality and calculate four points on the plot. For men and women we would normally have gender coded zero for men and one for women, but because these data are standardized, we have to work with standardized values, and those turn out to be -0.78 for male and 1.27 for female. So we calculate four points at two different values of gender and two different values of quality. One convention when you work with standardized data is to pick the high value and the low value as plus one and minus one. So we do that.
So quality gets values of minus one and plus one. And now we just calculate: gender of -0.78 and quality of -1, plugged into the equation, gives us a predicted value of satisfaction. We calculate predicted values using the regression equation at these four combinations of the moderator and the main independent variable, then plot them: four points that we connect with lines, and that gives us the plot. So this is how you interpret moderation using plotting. You calculate predicted values from the regression equation — you multiply the values of the two variables with the regression coefficients at four different combinations of those two variables — and then you plot. So there are two stages, calculating the predictions and doing the plotting, and we'll talk more about both of those a bit later in the session. So that's the same plot. One nice way to practice is to calculate the plots by hand from published papers, like I did here: can you reproduce a plot from a published paper using the coefficients they report? This paper works really well for that. So hopefully by the time we finish today you'll feel comfortable producing these plots yourself, but hopefully you won't need to, because we're going to talk about how to do it using software too. That can save you a bit of time, but it's worth understanding. Yeah, a lot of things are more easily learned if you do them by hand. Using Excel you can, for example, do a regression analysis by calculating the sum of squares and minimizing it; if you can do that, you understand regression analysis. Absolutely. So all of the examples, or most of the examples, we're using today are based on this example data set, which is available at the link.
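The four-point calculation Mikko just described can be sketched in code. The coefficients below are hypothetical stand-ins, not the actual estimates from the paper; only the standardized values for gender (-0.78 and 1.27) and quality (-1 and +1) come from the slide:

```python
# Four-point moderation plot, by hand. Made-up coefficients:
# y = b0 + b1*quality + b2*gender + b3*quality*gender
b0, b1, b2, b3 = 3.0, 0.40, -0.10, -0.20

def predict(quality, gender):
    # Plug one combination of values into the estimated equation.
    return b0 + b1 * quality + b2 * gender + b3 * quality * gender

# Standardized values from the slide: -0.78 male, 1.27 female;
# quality at one SD below and above the mean (-1 and +1).
points = {(g, q): predict(q, g) for g in (-0.78, 1.27) for q in (-1, 1)}
for (g, q), yhat in sorted(points.items()):
    print(f"gender={g:+.2f}, quality={q:+d} -> predicted y = {yhat:.3f}")
# Connecting the two points within each gender gives the two lines
# of the interaction plot.
```

The same arithmetic is what the Excel templates and the software commands discussed below do for you.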
For those of you who have joined more recently, I'll put up the QR code again in a little while; it takes you to the web page of resources. But this data set, which is in SPSS format on the web page, includes 424 cases. For independent variables we've got training, autonomy, responsibility and age, all continuous. We've also got dependent variables which are continuous, job satisfaction and well-being; whether or not people received a bonus, which is binary; and the number of days absent people have had in the previous year, which is a count, a discrete variable. We don't need to worry too much about all of this, but we'll use some of these variables in the examples as we go through. So if you're familiar with regression analysis and you've understood how these models are formed, what this slide is doing shouldn't be any surprise to you. It just gives you the syntax in the three different software packages we talked about for running a two-way interaction or moderation model. It's worth asking at this point who uses which software. Can you put up your hand if you use SPSS? Okay. Can you put up your hand if you use R? One or two hands, okay. And can you put up your hand if you use Stata? Okay, so, whether it was a majority or not, Stata wins, but actually there were good proportions for all three.
Personally I don't use Stata very often, so Mikko is very much the Stata expert here. I use SPSS quite a lot and R quite a lot, so I'm going to talk about the SPSS material and maybe a bit of the R material, and Mikko will cover Stata and R as well. It's worth making this distinction, because the way you go about some of these things depends on what software you're using. In particular, for those of you who use SPSS, it's a lot less flexible than the other packages for this type of thing. You can still do almost everything you'd want to do, but you might then have to, as we'll see in a moment, go and use different software for the plotting, and you may need some extra steps along the way. For example, if you're using the regression procedure in SPSS, you actually need to compute an interaction term separately before running the regression, which is not something you have to do in the other software. You don't have to do it in SPSS either if you use a different function rather than the regression one, but we'll leave that aside for now. All of this syntax is available in the files on the resources web page. For each software it's in two formats: the raw syntax in the format of the package itself, and a PDF which shows not only the syntax but the output it generates, so at the very least you should be able to recreate that. Anyway, moving on to Stata.

One big difference between SPSS on the one hand and R and Stata on the other is that in SPSS you produce the interaction term yourself, whereas in Stata and R you specify the interaction as part of the model. You could go the SPSS way in Stata and R too — it's possible to generate the interaction term yourself — but that is a bad idea, because if you specify the interaction as part of the model, then all the postestimation commands that take the model as input work properly. If you instead generate a variable called, say, train_age, then Stata would not know that it is an interaction between train and age. So always specify interactions as part of the model instead of computing a new variable before estimation.

And one slide about the Stata syntax, because this is something I wondered about when I was learning Stata. Stata has a background in fields other than management, and in other fields categorical interactions are more common than continuous interactions, so Stata by default treats all variables that you put into an interaction as categorical: every unique value of age would be treated as a separate category. We don't want that, so to specify a continuous interaction you put the c. prefix on the variables, so that Stata knows each is a continuous variable. In contrast with R and SPSS, which store the variable type as part of the variable itself, Stata does not really have a variable type. Then you have the two hashes. How many of you know the difference between two hashes and one hash in an interaction? One person, two, three. The difference is that two hashes automatically add the first-order effects; if you use just one hash, you have to remember to specify train and age yourself to define the full model. You almost always want two hashes: if you accidentally use one hash with this kind of syntax, Stata will just drop the age main effect, and that's not what you want. If you want to learn more about how to specify interactions, help fvvarlist is the Stata command that gives you the help page on factor-variable syntax.

Right, okay. So I told you there would be help in plotting these interactions, because plotting the interactions is by far the most helpful thing you can do to interpret them. There are other things you might want to do as well, but plotting tells you, most of the time, the vast majority of what you need to know. There's a web page I have, which I think is on the web resources page; some of you may have seen it before. It's a very dull-looking website that could do with a refresh, and one of these days I might get around to that, but the important thing is it's got lots of links to different Excel files which enable you to plot the interactions. If you run that first interaction model, you get output which looks like this, with the coefficients table here. To use these Excel files, in every case what you need are the regression coefficients, and specifically the unstandardized regression coefficients. Some software only gives unstandardized coefficients; some, like SPSS, gives both unstandardized and standardized. It is absolutely essential to use the unstandardized coefficients here, because they're what make those equations we showed earlier work. If you really want standardized estimates with interaction models, the way to do that is to standardize the variables before estimation: don't standardize the estimates, standardize the variables, then form the interaction term, then estimate. That way the interaction will be standardized correctly; otherwise the estimates will be badly distorted. We'll talk a little more later about standardizing and centering variables, because we will come to have a discussion about that. But it's these four numbers here that you need to put into the Excel file, specifically into these cells here. Now, this particular Excel file gives you lots of other options: you can specify at which values of the moderator, and indeed of the independent variable, to plot the effects — the four points it draws the lines between. It is fairly standard practice with a continuous variable to plot at one standard deviation above and below its mean, but that doesn't have to be the case. Sometimes it makes more sense to choose a value that's more meaningful; as long as it represents a fairly typical high or low value of the variable, that's fine. In fact, if you've got a variable which is skewed, choosing values one standard deviation above and below the mean won't necessarily give you equivalent points, so you might choose a more percentile-based approach: take, say, the 10th and 90th percentiles of the variable and plot the slopes at those values. If you leave those cells blank and put in the means and standard deviations, it will just use one standard deviation above and below the mean, but I would caution everyone not to do that automatically: think very carefully about what values you need. And of course, if you are using binary variables, it's really important to plot at the actual values of those variables, which won't be one standard deviation above and below the mean unless the variable is perfectly evenly split. So this is a relatively straightforward way of plotting these, and it's probably how you should do it if you're using SPSS. But if you're using R or Stata, you can do it in a simpler way.

So this is the Stata command for running an interaction model and doing the plotting in the same pass: we use Stata's margins command, which calculates the points we need — the four combinations of training and age — and then plots them, so the calculation and the plotting happen in one go. In R there are at least ten different packages for doing interactions, and I have gone through maybe four different packages myself. I started with John Fox's effects package, then I used a few others, and now I've decided that marginaleffects is the best package at the moment. When you pick a package, pick one that works for you, and if you know one package well, stick to it unless there's a reason to do something else. But if you're starting from a blank slate, think about what is the most general: how many different kinds of models it supports — the marginaleffects documentation says it supports 80 different statistical models, including multilevel models and generalized linear models and what not — and which graphics library it supports. R has four different graphics systems: there is base graphics, there is grid graphics, there is lattice, which builds on grid, and there is ggplot2, which also builds on grid. I used to use base graphics, but nowadays I use ggplot2 almost exclusively, because it forces me to think more about my data and what I want to communicate, instead of thinking about which line I draw and which coordinates I use for that line. It abstracts away calculating the points for the lines and lets you focus on what you want to communicate with the graph. So marginaleffects is probably the best choice at this point, and this is the command from marginaleffects.

Another way of producing a plot is to build it in two steps: first calculate the predicted values, and then do the plotting. In Stata that would be running margins first and then marginsplot. Why would anyone want to use two commands instead of one? The answer is that separating the commands gives you more control: marginsplot allows you to customize the plot a lot more than the one-step approach does, and in some rare cases you might, for example, combine results from different margins analyses and plot them in a single plot. Sometimes you want plotting that marginsplot doesn't do, and then you go to Stata's twoway plot, which is the more general plotting command. The same with R: sometimes you have a plot in mind — for three-way interactions, for example, you might want a single plot with four lines, but plot_predictions by default plots a three-way interaction as a set of separate two-way panels. If you don't like that, you can calculate the predictions yourself and then specify the plot yourself using ggplot2, and ggplot2 is the graphics system that makes that easiest. If you want to learn more about marginaleffects, there is a really great interactive e-book about the package: the design principles, how you use it, how it integrates with other packages, and how it compares with alternatives. So I recommend you take a look at that and consider which tools work best for you.

This is from my 2022 paper, showing some fancier interaction plots that you can produce with these packages: this one is done with Stata, and this is the same with R. We have the code for both of these plots in the appendix of the paper. This is also a normal two-way interaction, but we have confidence intervals drawn with transparency, and we have markers for the observations above the plot indicating the different values of the women variable, which is our moderator. So you can do much more information-rich plots when you practice. It doesn't have to be just two lines; it can be more than two lines, and it doesn't have to be a line, it can be curves and so on, as some of the more advanced examples will show.

So you can get a plot, and as I say, that's most of the information you need, but sometimes you want to go a little bit further, and there are various ways in which this is commonly done. I'm going to talk a little bit about simple slope tests, whether you should use them, and if so how, and then a couple of other alternatives which might be better or less good depending on what you're trying to do. So, first of all — sorry, I'm just going to go back to this plot here. We've got a plot which shows the relationship between training and job satisfaction, and we've plotted it for people who are older and people who are younger: specifically, the dotted line shows people who are 55 years old, and the solid line people who are 25 years old. You can see that the solid line is steeper than the dotted line. This suggests that for people who are younger, the relationship between training and job satisfaction is more positive. But you might have supplementary questions about this. For example, for the age-55 group the slope is still positive, but it's much closer to being flat. You might ask: at that particular age, is there any evidence that there is a relationship at all — is it different from zero? That's what a simple slope test is. And note that it is very specifically asking about that age: it's not asking about older workers in general, it's not asking about people aged between 50 and 60; it's asking whether, at that particular point, there is evidence of a relationship between training and job satisfaction. So if that's something which is useful for you, you do a simple slope test, and there are actually two different ways to go about it. I tend to refer to these as the direct method and the indirect method, and again they're a bit easier in R and Stata than in SPSS, so Mikko will talk about the other software in a moment. If you're in SPSS, what you need to do is get certain bits from the output of your analysis: specifically, the variance of the coefficients of the independent variable and of the interaction term, and the covariance between them. To get those, you need the variance-covariance matrix of the coefficients — notice, of the coefficients, not of the variables. If you put that information in, this Excel file automatically gives you simple slope tests for the two values at which you've plotted the slopes. You can change those values and get different tests, but again, remember it's only for those two specific values. They appear on the right: in this case, for age 55, the shallower line, you see a p-value of 0.006, so still statistically significant.

However — oh, sorry, yes — how do you do this in Stata? In Stata you again use margins: pretty much all interaction post-estimation analysis in Stata is done with margins. In some specific cases you might use contrast as well, but pretty much everything that you see in management would be done with margins. For plotting you calculate predictions — the four points, or however many points you want. For simple slopes, instead of predictions, you specify the dydx option; the name dydx comes from the derivative of y with respect to x, and it gives you the slope. So we ask margins to calculate the slope of train at two ages. In R, older packages like John Fox's effects might not be able to calculate this, but marginaleffects has a slopes function: you just specify which variable's slope you want and at which moderator values, and you get the estimates — the same simple slope estimates you would get with Excel — and the p-values for those estimates. Yeah — how many of you know what the s-value is?
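Before we get to the s-value: the direct and indirect simple-slope methods can be cross-checked against each other with a quick simulation. This is my own pure standard-library Python sketch on made-up simulated data (not the session's example data set), with the direct method's variance formula spelled out:

```python
# Simple slope test two ways: "direct" (variance-covariance formula)
# and "indirect" (re-center the moderator and re-estimate).
import math, random

random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]            # independent variable
z = [random.gauss(0, 1) for _ in range(n)]            # moderator
y = [1 + 0.5 * xi + 0.2 * zi + 0.3 * xi * zi + random.gauss(0, 1)
     for xi, zi in zip(x, z)]

def mat_inv(A):
    """Invert a small matrix by Gauss-Jordan elimination."""
    k = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[k:] for row in M]

def ols(X, y):
    """OLS coefficients and their variance-covariance matrix."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    XtX_inv = mat_inv(XtX)
    b = [sum(XtX_inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(bi * ri for bi, ri in zip(b, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)
    vcov = [[sigma2 * v for v in row] for row in XtX_inv]
    return b, vcov

# Model: y = b0 + b1*x + b2*z + b3*x*z
X = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
b, vcov = ols(X, y)

z0 = 1.0                                   # test the slope of x at z = 1
slope = b[1] + b[3] * z0
# Direct method: var(b1 + z0*b3) = var(b1) + z0^2*var(b3) + 2*z0*cov(b1,b3)
se = math.sqrt(vcov[1][1] + z0 ** 2 * vcov[3][3] + 2 * z0 * vcov[1][3])

# Indirect method: subtract z0 from the moderator, re-estimate, and read
# the simple slope (and its SE) off the coefficient of x.
X2 = [[1.0, xi, zi - z0, xi * (zi - z0)] for xi, zi in zip(x, z)]
b2, vcov2 = ols(X2, y)

print(f"direct:   slope = {slope:.4f}, SE = {se:.4f}")
print(f"indirect: slope = {b2[1]:.4f}, SE = {math.sqrt(vcov2[1][1]):.4f}")
```

Because re-centering the moderator is just a reparameterization of the same model, the two methods agree to floating-point precision, which is exactly the kind of do-it-two-ways sanity check recommended later in the session.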
there is p-value but now there is an s-value s-value is similar to p-value it's called surprisingly and in quantity-wise the p-value so they are surprisingly how many times you can get a scale of a coin and how unlikely that would be so getting this p-value is equally unlikely as it's to get 42 tails straight when you click a coin and some people have suggested that that's better than p-value I can't agree but I still don't think that's going to manage that very soon but some things use s-values that's why I don't know about this until this morning either I don't know but I do like it it's quite intuitive in the way that you understand the lack of likelihood of something so that's what the direct method is because basically it plugs all of the values into a formula and gives you a test but there's no way you can go about doing simple slope tests or similar types of tests called the indirect method and it's worth just mentioning this because if you understand the principle of the indirect method it enables you to do all sorts of other post-talk tests with not only these interaction models but any kind of regression-based model it relies on the the fact that any coefficient of a variable is equivalent to the effect of that coefficient when everything else remains the same specifically if the moderator has a value of 0 then the interaction term would also have a value of 0 so it will mean that the independent variable at that point gives the simple slope test for the moderator being 0 which means you can rescale the moderator so that the 0 point is any particular value that you want it to be so if you want to test an effect when the moderator has a value of 100 it subtracts 100 from the original value of the moderator to create a new variable rerun the regression and then your independent variable is what gives you that simple slope test so it's not necessary to do it for this but that knowledge can give you all sorts of extra capacity to do other things later and in 
a few moments we'll talk a little bit about centering variables and the logic for why that might be beneficial relies on the same principle there another useful use for this technique is if you're learning about interactions so one thing that I find useful for me when I want to learn something new is to do the the new thing one way and then do the new thing in another way and see if I get the same result so if you use the simple slope test using the Excel sheet and then you have this risk variable take a look at it if you get the same result if you don't then you've done something correct that's very important so we won't go through the syntax again again it's on the on the files which are available to you but just a few thoughts about simple slope tests before we continue because in some journals they seem to almost require them by default and I don't think that's a healthy thing they can be very useful if you've got specific values which are worthy of testing but I don't think doing them automatically for one standard deviation of below the mean of a moderator is particularly informative because if you change what those values are you're going to get a different result to the test most of the time there's nothing specific about those one standard deviation of below values the test that I often use is can I specify in advance for example even collecting the data what values of the moderator would be worth testing this at and if you can and that suggests it's worth doing or it might be worth doing if you can't it might not be worth doing and there are some examples for example if you've got a binary moderator with only two values in that case clearly it makes sense you know what the two values are that is different from zero those two values it's sensible but remember that's all you're doing you're testing to see whether an effect is different from zero at a given value of the moderator so I wouldn't suggest doing this all the time and I do do that myself but I 
always think carefully before I do. So what can you do instead? One thing which has been used a bit in management research is the Johnson-Neyman approach, also known as regions of significance, and this in some ways gets around the arbitrary nature of choosing values of the moderator at which to test slopes. It asks the question in reverse: at what values of the moderator would this give me a significant result? In one sense that gives us extra information, but it does also, certainly in what I've read, lead to inappropriate conclusions about what it means, because you get a region, but that region isn't describing the population in a useful way. It's just saying: if we happened to test the slope at any of these values, would it be different from zero? And the truth is, if you've got a larger sample size, you're going to get a larger region of significance. Is that a helpful thing to be able to say? Well, there are situations when it can be descriptive, but again it's not something I think is particularly useful to do all the time, and I tend never to use it myself. So what can you do instead? Well, this is actually work in progress, but it comes down to the fact that what we're really trying to do here is describe an effect. We're not trying to describe it in significance terms; we're trying to say something about a population: what is this effect? And there are three elements to the interaction effect that are worth commenting on or thinking about. The three elements are basically given by the coefficients for the independent variable, the moderator and the interaction term. Now, one thing I've just realised is that it would have been more sensible to cover something else before this, so let's do that and come back to this in a moment. I'll let you take this one. OK, centering is something that a lot of people do before computing interactions, and when I teach my students in my core class I have a big slide that says there are reasons for centering and reasons for
not centering. I mean, it depends on the context. I never centre my data; I prefer to calculate effects at different values of the variables. I use margins if I use Stata, and in R I use a package that allows me to do post hoc tests, so I don't need to centre anything. Then again, if you use the rescaling method for simple slopes in SPSS, that serves much the same purpose; in the Stata and R context I just don't centre. So the pro of centering is this: when you have that one regression coefficient for x in your table, without centering it gives you the effect of x when the moderator is 0, and 0 may well be beyond the range of the data. If you have age as a moderator in a working-age population, then the effect of x would be calculated at age equals 0, and that makes the coefficient uninterpretable, because it's outside the range of the data. When you centre the data, you instead get the effect somewhere in the middle of the sample: with data from a working-age population, the effect of the interesting variable would be evaluated at around 40 years, roughly the average working age, instead of at age 0. So why not always do it? Because if you plot your results, your plot would be offset: age would run from minus 20 to plus 20 years rather than over its actual values. Ultimately, though, it doesn't matter for the interaction effect: the coefficient of the interaction term will be the same whether you centre or not. To show that, here is some R code that generates data. x1 and x2 are simulated variables with non-zero means, and x1x2 is their product. We can see that the product is highly correlated with x1 and x2, and when we centre the data and calculate the interaction term after centering, it is uncorrelated with x1 and x2. They are still not independent, as you can see; there is a clear statistical relationship, but it is not a linear correlation, it rather shows up in the variance of the data. So centering means
that you take the mean of the variable and subtract it from the original values, and standardization means that you centre and then also divide by the standard deviation. So what is the difference? If we estimate these regression models using this data, we can see that with x1 and x2 alone, the only thing that centering changes is the intercept. We don't normally look at the intercept; I don't remember any papers interpreting the intercept. So there centering would be completely useless, because it only affects a parameter that is not interpreted. When you add the interaction, what happens is that not only the intercept but also the x1 and x2 coefficients change, but the coefficient of the interaction term stays the same. How to understand that? Well, take a look at this three-dimensional plot. If I had a screen where I could animate it, this would be easier to understand, but think about it a bit. This is our x2, this is our x1 and this is our y variable, and our model says that the effect of x1 varies as a function of x2, or the other way around, but let's focus on the effect of x1. If x2 equals 0, then the effect of x1 is this blue line: it goes up, but not as steeply as the others. When x2 increases, the regression slope of x1 gets bigger; the regression coefficient of x1 is not constant but a function of x2. And the question about centering is this: in a regression table we present only one regression coefficient for x1, so which one of these lines should it be? If you don't centre, you are picking the blue line, the effect when x2 is 0. If you centre, you pick the green line, which is the effect of x1 when x2 is at its mean, in the middle. If you are interested in interpreting the x1 coefficient itself, centering makes a lot of sense. But most of the time we don't interpret that coefficient directly; rather, we plot the data, and in that case centering complicates plotting, because the scales of your axes will be shifted sideways and you will have negative ages or
negative genders. OK, that's it for centering; we'll pause for questions, but let's just finish this little bit first. The whole point about centering, then, is that there are benefits and there are disadvantages, but essentially it doesn't make any difference to your findings as long as you go through the procedures correctly. However, and regardless of which software you're using, whether you get this through the margins commands or, if you're doing it in SPSS, by centering the data: the thing about those x and z main effects is that, if the variables have been centred, they effectively give you the average effect of that variable across the range of the other variable, and if you're describing an interaction fully, I think that's often a useful thing to be able to say. So in this particular example here, the three elements are given as follows. We've got an effect of x which is positive on average; we can see from the main effect of the x variable that it's a positive relationship on average, which we wouldn't necessarily have got from the plot, depending on how it's scaled. By studying the plot we can see that it's a disordinal moderation effect, where the lines cross; sometimes being able to say where they cross can be a useful thing. And the interaction itself is positive: if the xz coefficient is positive, that means that as the moderator increases, the relationship between x and y becomes more positive, and that's a really useful thing to be able to say sometimes. In particular b3, the xz coefficient, tells us the size of that effect: as the moderator increases by one unit, how much does the x-y relationship change by? That is far more informative than a lot of other interaction effect sizes which are out there, like f-squared or the change in R-squared; they give a certain amount of information, but they don't tell you in
terms of the original variables what the scale of the effect actually is, and I think most of the time that's the kind of effect size we're really interested in: what's actually happening as these variables change. I'm not going to go through this in great detail now, because you can read it yourself if you download the slides, but I think any attempt to further describe the interaction in this way will be helpful, and a lot of the time more useful than doing simple slope tests. But as Mikko says, it's probably time we had a break for some questions. I'll just add one thing first: I've mentioned binary variables a couple of times. We're not going to give separate examples of those, because if you've got a binary moderator, or even a binary IV, you just follow exactly the same procedures; it's just that you've got two values of the variable, and the simple slope tests are definitely meaningful in that case. OK, so let's pause now and take any questions on everything we've discussed so far. I think your hand was up first. I have a quick question about the confidence intervals: when you run something like marginsplot in Stata, unless you suppress them with an option, you get confidence intervals by default, and sometimes I'm not exactly sure how to interpret those confidence intervals. A lot of the time I just make them disappear so the graphs look clean, but I wasn't exactly sure how I should deal with them. OK, we'll talk about confidence intervals a bit later; I'll have an example in Stata. But the way to deal with confidence intervals is: don't leave them out; make them semi-transparent, so that they're not distracting but are still available for a reader who wants to see them. Also, don't draw them as error bars, but rather as bands, so they are less obtrusive. That's what we have here, and the semi-transparency
is hard to see because of the projection, but semi-transparent confidence intervals are not distracting, and we can see that they overlap quite a lot here; for example, though, the confidence intervals of these two lines are clearly distinct from those of these two lines, so you can still read something from them. If one line is within the confidence interval of another line, I would interpret that as saying we don't have enough evidence to conclude that those lines differ substantially, even if the regression results give you a significant interaction effect. Yes, because then you're looking at basically a different quantity: the effect can be different, but the outcomes might not be. For example, we might have a scenario where studying has a statistically significant effect on exam performance, but the exam performance of a student who studies very little and one who doesn't study at all might not differ detectably. So you are looking at different questions: is the effect statistically significant, and is the difference in the outcome implied by a certain change in the independent variable statistically significant? One final thing, sorry to take so much time, but you said if the moderator is a continuous variable, the confidence intervals and the plotting could also depend a lot on which values you pick for the plotting. Yes. So we could say that studying statistically seems to have a significant effect on exam performance, but if you study just for one minute, it doesn't have a detectable effect on your performance; these are two different questions. OK, thank you. OK, one question about centering that I'm curious about: if I have repeated observations, in a panel model, what is the correct way to centre? Do I demean by time point, so for example at time point one I take all
my observations and demean them, then time point two, and so on? Or should I demean taking them all together? Because if I'm interested in the within-subject effect over time, and in interactions with it, I shouldn't demean by time. Yes, if that's what you're interested in, that is one solution for centering, but you can also accomplish the same by using a within estimator, or you can add dummies for the subjects. This is a bit beyond the scope here, because how you centre clustered data is really a multilevel modeling question; we'll say a little bit about multilevel modeling later, and group mean centering is explained in that part. Thank you. What is the best way to pick the values of the moderator for plotting when it's a variable that is expected to be high for some cases and low for others, rather than having obvious values across the scope of the moderator? Sure, so if there are no specific values specified beforehand, I would use a percentile approach; I'd probably take the 10th and the 90th percentiles of the observed distribution to represent low and high values. And just to follow up, what about a clustered moderator? If you have clusters, like a multimodal distribution, it means you have multiple peaks when you plot it. If it's gender, it's bimodal: a lot of people identify as men, a lot of people identify as women, and then some people identify in between, but that's a minor part. So in that case, even if gender is measured as a continuous psychological variable, I wonder whether it still makes sense to look at those two typical percentile values. Yeah, you can look at the modes, and also think about how the variable is measured: if there is a personality scale measured with an agreement scale of 1 to 5, for example, then you might pick 2 and 4 as the moderator values. Is that interpretable?
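Coming back to the centering demonstration for a moment: the claims made earlier, that centering changes only the intercept and main-effect coefficients and removes the correlation between the product term and its components, while leaving the interaction coefficient untouched, can be checked in a few lines. The session's own demo is in R; this is a hedged, self-contained Python sketch with simulated data and invented names:

```python
import random

def ols(X, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    A = [XtX[i][:] + [Xty[i]] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def corr(u, v):
    """Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

random.seed(2)
n = 500
x1 = [random.gauss(5, 1) for _ in range(n)]   # non-zero means, so the raw
x2 = [random.gauss(3, 1) for _ in range(n)]   # product correlates with x1, x2
y = [0.5 * a + 0.3 * b + 0.2 * a * b + random.gauss(0, 1)
     for a, b in zip(x1, x2)]

raw = ols([[1.0, a, b, a * b] for a, b in zip(x1, x2)], y)
c1, c2 = center(x1), center(x2)
cen = ols([[1.0, a, b, a * b] for a, b in zip(c1, c2)], y)

print(corr([a * b for a, b in zip(x1, x2)], x1))   # clearly non-zero
print(corr([a * b for a, b in zip(c1, c2)], c1))   # near zero after centering
print(raw[3], cen[3])                              # interaction term: identical
```

The centered product is a linear combination of the original four regressors, which is why the interaction coefficient is exactly the same model either way; only its lower-order companions are re-expressed.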
Yeah, there's one other question, sorry, a simple question: if we have a crossover in the model, so the moderation effect is significant but in the simple slope tests only one of the lines is significant, is there a way to find out where the crossover point of the model is? Yes, in other words, at what value of the moderator the slope would be 0. There is a relatively straightforward way of doing it, I just can't do it off the top of my head. In Stata you can use the test or lincom commands to calculate any linear combination and test whether that combination is 0; the simple slope is based on the regression coefficients and the moderator value, so if you write out the equation, set the x effect to 0, you can work it out and specify it in the test. It's not that difficult, but it's hard to do in your head. I have a question regarding the simple slope tests: in your Excel file you mention the covariance matrix, so can you just explain how you get the covariance values you need to put into the Excel sheet? Sure, it depends on the software you're using, but if you're using SPSS, for example, then on the regression procedure you need to include the BCOV keyword on the statistics subcommand, and it gives you the matrix in the output. How many of you are familiar with the idea of the variance-covariance matrix of the estimates? Not many, so let's spend one or two minutes explaining the idea. When you fit a regression model, the regression line is your best estimate of an effect, but you also get an uncertainty estimate, given by the standard error. Let's go back to the slide that's used in the Excel sheet, the one with the lines on it. Let's say this green line is our normal regression line, and let's ignore the moderation for a minute. Because we have a small sample, if we estimate the same regression line from a different independent
sample of the same population, the regression line is not going to be the same: across different samples there is sampling variation in the regression line. Let's assume that these five lines, ignoring the moderation again, are five regression lines estimated from five independent samples of the same population. Because they vary, the standard error, or the variance of the slope, quantifies the variation in these slopes; that's one element. We also have variance in the intercept: the intercept is not the same in all samples but varies, and that is quantified by the variance of the intercept. And we can also see that the intercept and the slope are correlated: because all the lines go through the middle of the data, when the slope increases, the intercept tends to decrease, so there is a negative correlation between slope and intercept. This purple line has a high slope and a small intercept; the red line has a small slope and a larger intercept. You need those quantities to do these calculations, and the statistical software will give you the variance-covariance matrix of the estimates if you specify it as an output option, because it's useful for calculations like this. Did that help? OK, I suggest we move on, because we're short on time. We've covered a lot of the fundamental concepts, so if you've understood most of what we've talked about so far, you're in a good place in terms of how it applies to more complex models. So let's talk for the next few minutes about some different types of more complex models, starting with three-way interactions.
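Before the three-way material, the variance-covariance idea just described can be made concrete. The Excel template uses exactly these quantities: the standard error of a simple slope b1 + c*b3 is sqrt(Var(b1) + c^2 Var(b3) + 2c Cov(b1, b3)). Here is a hedged, self-contained Python sketch (simulated data, invented names, not the session's files) showing that this formula agrees with the rescaling trick from earlier:

```python
import math
import random

def ols_cov(X, y):
    """OLS coefficients plus variance-covariance matrix of the estimates."""
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on an augmented identity
    A = [XtX[i][:] + [float(i == j) for j in range(k)] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        d = A[col][col]
        A[col] = [a / d for a in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * e for a, e in zip(A[r], A[col])]
    inv = [row[k:] for row in A]
    b = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    s2 = sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2
             for r, yi in zip(X, y)) / (n - k)     # residual variance
    return b, [[s2 * inv[i][j] for j in range(k)] for i in range(k)]

random.seed(3)
n = 300
x = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 2) for _ in range(n)]
y = [0.4 * xi + 0.3 * zi + 0.25 * xi * zi + random.gauss(0, 1)
     for xi, zi in zip(x, z)]

b, V = ols_cov([[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)], y)

c = 1.5                                      # moderator value of interest
slope = b[1] + c * b[3]                      # simple slope at z = c
se = math.sqrt(V[1][1] + c * c * V[3][3] + 2 * c * V[1][3])

# Indirect route: shift z by c and read off the coefficient and SE of x
bs, Vs = ols_cov([[1.0, xi, zi - c, xi * (zi - c)]
                  for xi, zi in zip(x, z)], y)
print(slope, se)
print(bs[1], math.sqrt(Vs[1][1]))            # same slope, same standard error
```

Both routes are the same model in different parameterizations, so the point estimate and the standard error agree exactly; the direct formula is what the template computes from the BCOV output.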
A three-way interaction is basically a model where we might have two moderators, but it's not just the two moderators operating separately: they interact with each other in how the x-y relationship changes as the two moderators change. It can be conceptualised in all sorts of different ways as well; it doesn't have to be thought of as two moderators operating like that. This was the example I put up earlier from a paper which looked at creativity and implementation and a couple of different moderators, showing how those change things, and we can see that there is one slope in particular here which is more negative than the others: that's where we've got low implementation instrumentality and low strong ties. If that means nothing to you, it doesn't matter, because the same principle applies to any moderator variables. But what do you do in practice? Basically, you extend the model we had earlier. With the two-way interaction we had x, z and x times z as the three things we put in. Here we've got another variable called w, and we have to put in the three variables separately, then three two-way interaction terms, xz, xw and zw, and then one three-way interaction term, xzw. It's that three-way term which determines whether or not there is a three-way interaction, in other words whether the combination of z and w affects the relationship between x and y. So there are eight different terms here including the intercept, seven actual variables you need to include. Again, if you're doing this in SPSS you need to calculate those product terms separately first; if you're doing it in R or Stata you don't. However, a word of warning: three-way interactions are not that complicated, but we are getting into more complex territory here, and in particular these effects are more difficult to find. The power to detect a three-way interaction is lower, quite a lot lower, than it would be to detect a two-way interaction; you probably need a few hundred
cases to have reasonable power to detect a three-way interaction. So before you start to do this, ask yourself: is your theory good? Because if it's not, anything you find might not be a real effect. Is your measurement good? Because if it's not, your reliability is lower and your power is going to be down. If you've got both of those things in place, by all means carry on; if not, you might want to stop and think about your research choices. As for actually testing them, again the syntax is in the files that we supplied. You can see that in SPSS we actually calculated each of the two-way terms, the two new ones, and the three-way term before running this part of the regression procedure, but in R and Stata, as we said earlier, you just specify the three variables and the software creates the product terms automatically. And the hypothesis we'd be testing? Well, it could be all sorts of things, of course, but it would be something like: training predicts job satisfaction most strongly for younger workers with high autonomy. So you're specifying what combination of the two moderating variables would be associated with what change in the relationship. We don't need to go through the details of exactly how to do it; hopefully it'll be fairly obvious from this three-way interaction template. We've got eight different coefficients to put in on one side, for the seven variables plus the intercept, then the values at which to plot the slopes, the means and standard deviations for the three variables, and we get something like the plot on the right. If you want to plot it in Stata or R, the logic is the same: we now have three variables with two values each, so instead of calculating four points we calculate eight points. In R, this is an example of the value of the strategy of calculating the points first and plotting them later, because if you ask this particular R package for the plot, you specify
four points per panel, and it draws you two panels of two-way plots: it doesn't produce one three-way plot but two sets of two-way plots, and you then judge the three-way interaction by comparing those panels. So instead of the four lines in a single panel that we had on the previous slide, this R example produces two panels of two lines each; that is just how the package is designed. So instead of using the drawing function of the package, we can use ggplot directly: we calculate the eight predicted points, specify a grouping over them, draw four lines, and specify the line types. Sometimes a package doesn't give you the kind of plot that you want, and in that case you can take the estimates out of the package and plot them yourself. The package gives you the correct plot most of the time, but when you need something different you can build the graphic yourself in the normal way. In terms of interpreting it, plotting again gives you quite a lot, but for a three-way interaction not quite as much of the possible interpretation. When we plot a two-way interaction, depending on how we did it, we end up with two lines, and the fact that the interaction is significant means that those two lines are significantly different from each other; there is a direct equivalence. But because we've got four lines for a three-way interaction, we can't say the same thing: we can't say which lines differ just from knowing that the three-way interaction is significant. And that might be a really important part of the interpretation; in fact, most of the time it should be, because if you specified a hypothesis for a three-way interaction carefully, it should imply which lines would be more positive or more negative than others, and which might not matter. So for example, what we said here is that training predicts job satisfaction most strongly for younger workers with
high autonomy. That means that the blue line in this simplified graph here, which is for young workers with high autonomy, should be more positive than the rest of them. We can actually test that using a slope difference test, and this is something I would always recommend doing for a three-way interaction. You can do simple slope tests too, and exactly the same caveats apply as for a two-way interaction, but the slope difference tests are what help you understand the three-way interaction a lot more thoroughly. You need to put in more information from that covariance matrix; if you're doing this in SPSS, again you need to extract it and put it into Excel. And I should add a warning: if you are using SPSS for this, for some bizarre reason it often changes the order of the variables in the covariance matrix output. I don't know why, and it's really frustrating, so you just have to take extra care that you are choosing the right values. But if you put those in, you get six slope difference tests, because there are four slopes in the plot, four lines, which means there are six pairs of lines, so six differences. The Excel file will automatically test all six, but for your hypothesis there's probably only a subset of those which is relevant. So in the case of younger workers with higher autonomy, that's the third slope on this plot here; it's difficult to see with the projector. What the hypothesis suggests is that that slope is more positive than the other three, so we look at the pairs of slopes one and three, two and three, and four and three, and we would see that indeed all three pairs are significantly different from each other, and in the direction we would expect. So that's the purpose behind the six tests we had there. If you're doing it in Stata, for these six tests you can use the test command or the lincom command; it doesn't make a difference. Test is my
choice here, because we are doing hypothesis testing, so the test command makes a lot of sense. What we're doing is testing the values of the regression coefficients at specific values of the moderators, so it's a matter of writing out the regression equation. We are saying that the difference of 30 years in age, 55 minus 25, multiplied by the interactions of training and age and of training, autonomy and age, equals zero; in other words, that this 30-year age difference between the young and the old people makes no difference to the regression slopes. So we multiply all the interactions that contain age by 55 minus 25. We could also calculate the 30 ourselves, but writing 55 minus 25 communicates the intent of the analysis better than just putting in the number 30: if I had 30 there, you might not be able to connect it directly to the plot that had 25 and 55. Doing calculations in code is sometimes more transparent than doing them on paper, even though it produces slightly longer output. So that tests the difference between slopes 1 and 2, and you can do the same for the other pairs, for example testing young people with high autonomy against old people with low autonomy: there we adjust both variables, and for the interaction term that contains both of those variables we use the difference of the products. So that is the interaction variable, and then you calculate the difference between the two lines; if you have plotted the lines, I think it becomes pretty clear how you do it. With the test command you get the test statistic and p-value. In R you would do the same thing with a hypothesis test command, and there is one small detail here: R's regression output writes interaction terms with a colon in the variable name, and the colon is a special character in R syntax, so you have to escape the variable name using backticks; without them it does not work.
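The slope difference logic can also be sketched numerically. Below is a hedged Python illustration (simulated data, invented variable names, not the session's files): the difference between two simple slopes of x is a linear combination of coefficients, and its standard error comes from the same variance-covariance matrix, which is what test or lincom compute for you.

```python
import math
import random

def ols_cov(X, y):
    """OLS coefficients and variance-covariance matrix of the estimates."""
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    A = [XtX[i][:] + [float(i == j) for j in range(k)] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        d = A[col][col]
        A[col] = [a / d for a in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * e for a, e in zip(A[r], A[col])]
    inv = [row[k:] for row in A]
    b = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    s2 = sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2
             for r, yi in zip(X, y)) / (n - k)
    return b, [[s2 * inv[i][j] for j in range(k)] for i in range(k)]

random.seed(4)
n = 400
train = [random.gauss(0, 1) for _ in range(n)]        # x
auto = [random.gauss(0, 1) for _ in range(n)]         # moderator z
age = [random.uniform(25, 55) for _ in range(n)]      # moderator w
y = [0.3 * t + 0.2 * a + 0.1 * t * a - 0.01 * t * a * g + random.gauss(0, 1)
     for t, a, g in zip(train, auto, age)]

# columns: 1, x, z, w, xz, xw, zw, xzw
X = [[1.0, t, a, g, t * a, t * g, a * g, t * a * g]
     for t, a, g in zip(train, auto, age)]
b, V = ols_cov(X, y)

def slope(z, w):
    """Simple slope of training at moderator values (z, w)."""
    return b[1] + b[4] * z + b[5] * w + b[7] * z * w

# Difference between "high autonomy, young" (z=1, w=25) and
# "high autonomy, old" (z=1, w=55): a linear combination a'b of coefficients
z1, w1, z2, w2 = 1.0, 25.0, 1.0, 55.0
a_vec = [0, 0, 0, 0, z1 - z2, w1 - w2, 0, z1 * w1 - z2 * w2]
diff = sum(ai * bi for ai, bi in zip(a_vec, b))
se = math.sqrt(sum(a_vec[i] * V[i][j] * a_vec[j]
                   for i in range(8) for j in range(8)))
t_stat = diff / se                      # compare to a t with n - 8 df
print(diff, se, t_stat)
print(slope(z1, w1) - slope(z2, w2))    # same difference, computed directly
```

Writing the moderator values out, as in the Stata test syntax above, keeps the link between the plot and the test transparent.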
So we are testing again the same thing: the hypothesis that a 30-year age difference makes no difference to the effect of training. It's basically the same principle, and if you go from Stata to R it works in almost exactly the same way; it's like speaking the same language with the same grammar, but the vocabulary is a bit different. So, to summarize some principles about testing three-way interaction effects: if you do have a hypothesis about a three-way interaction effect, specify it as clearly as you can, and that should tell you which supplementary tests, occasionally simple slope tests, quite often slope difference tests, you would want to do to try to verify that hypothesis. If you don't have a reason to do a test, then you don't need to do it; even if the software produces it automatically, you don't need to report it. Of course you can use this in an exploratory manner, but if you're using something like this for an exploratory piece of data analysis, you really do need to apply appropriate care to your conclusions. That's always the case for exploratory analysis, but in particular with this type of model there are all sorts of post hoc tests that can be done. Any questions about three-way interactions? We have a question: what if we have three variables which are actually binary variables, and a reviewer asked us to treat two of the variables as a set and test the interaction with the third variable? So you've got two independent variables treated as a set? Yes. It's exactly the same; the way you frame it when you report it is slightly different, but statistically it's absolutely equivalent. You just have to be more specific about what it means to test them as a set. Do you mean that you are testing the multiple hypothesis that both coefficients are the same in both groups at the same
time? Then you would do a multiple hypothesis test rather than a single hypothesis, and you would use a Wald test for that. Statistically it's straightforward: in Stata you would use the test command. So instead of specifying one constraint, like in the Stata test example earlier, where we specified that one combination is supposed to be zero, you just add the second constraint, that the other is zero as well, and Stata calculates the joint test for both hypotheses at the same time. It gives you three p-values: one for the first constraint by itself, one for the other constraint, and one for both constraints jointly. Thank you. Next question: when you do it with hierarchical steps, and you also have hypotheses on lower-order interaction effects, what does it mean for my interpretation if the lower-order effect vanishes in the last step with the three-way interaction? So, in general, if you've got a significant higher-order interaction, then that's the model you should be interpreting, and whatever you found in the earlier models is effectively irrelevant at that point, because you've got a better model. In that situation, if you're still interested in the lower-order effects, I would suggest either centering your data to interpret those in the final model, or performing additional hypothesis tests, using margins or the equivalent depending on the software, to see whether those earlier hypotheses are still supported. But you should always interpret the best model you've got, and if the three-way interaction effect is significant, then that's your best model. If you think about it, it's the same issue that occurs when you have a normal regression model with no interactions, you add an interaction term, and then your original effect becomes non-significant: it doesn't really mean much, just that the coefficient is no longer the average effect but the effect when the moderator value is zero. So when you add a third
order interaction, then your second-order interaction coefficient is given at the point where the third variable is zero, and sometimes that point might not make sense. Like here: if we think about the interaction of training and autonomy and we add age as a moderator, then the coefficient for training times autonomy would be for workers whose age is zero, which doesn't make any sense. So yeah, I would centre, or calculate the simple forms of the interaction at sensible values. Yes, one more: I have a question about categorical three-way interactions. When you do those, there tends to be perfect prediction or multicollinearity: with a three-way interaction where each variable is binary, zero-one, zero-one, zero-one, when you run the regression, some coefficients cannot be estimated because of perfect prediction. So how do you interpret it when some lower-order coefficients are not estimated at all because of perfect multicollinearity? That multicollinearity problem is the same as with any dummy variables. If you have a binary variable, say a gender variable, you can't have a dummy for men and a dummy for women and an intercept as well, because you have just two values of the gender variable and that would be three parameters; one of them is redundant. So it is not really a collinearity problem at all: with categorical variables you typically drop one category as the reference, so say women are the reference, and then the gender coefficient is the difference between the two genders, not the effect for women. The dropped interaction terms work the same way: the omitted combinations are the reference, and the remaining coefficients are differences from them, so it's not a problem at all. OK, I think we should go on, because we want to talk about some non-linear effects in the next section. When we say non-linear, there are different forms of this, so let's take the non-linear models in turn. There are two main
kinds of nonlinear models that we use in management. We have generalized linear models — for example, with the logit function, which gives the logistic model: it's not a line, it's a kind of S-curve that starts near 0, goes up, and converges to 1, where it goes flat again. And then we have Poisson regression models, negative binomial models, and others that, instead of a line, have an exponential curve. The other kind of nonlinearity comes from quadratic models — the U-shape — where we add a second power of an independent variable to the model. So how do we do interactions with generalized linear models? We are going to use logistic regression as an example, but these things work the same way for the others. Instead of a normal regression analysis, we have the logit function. What the logit equation actually is isn't that important to know, but it is important to know what it produces: an S-shape that goes flat first, then goes up, and then goes flat again; as it gets close to 1 it never quite reaches 1, and as it gets close to 0 it never quite reaches 0. That is logistic regression analysis, and to estimate it we use the logistic command in Stata and the glm command in R. When you plot these models, one thing that is very important to understand is that you can no longer calculate two points and connect them with lines, because the model is not a line. I wrote a paper about nonlinear models in which we looked at interactions, and maybe a third of the papers presenting this kind of logistic regression analysis connected two points with straight lines, ignoring the fact that it's a nonlinear model. So you should not have lines, you should have these curves, and there are Excel templates that automatically draw the correct curves. There are templates for logistic models and templates for loglinear models such as Poisson regression and negative binomial regression, two-way and three-way ones, so make sure you choose the right one for the model you're using. The
bottom line with these Excel templates is that they work exactly the same way as before. If we take a look at the Stata syntax, instead of regress you have logistic, and in R the model is specified in the same way with glm — that is the only difference. So: slightly different model specification, a different template, and otherwise it works exactly the same way, and you get a nice curvilinear plot. For the negative binomial regression analysis we don't have a worked example in the slides, but it's in the supplementary materials we provide, and there's another template; these all work exactly the same way. There's the margins command for Stata and the prediction functions in R — exactly the same, no difference whatsoever; the calculation is automatic because the software knows what the model is. The important thing is to check that your plot actually contains the correct functional form. Here we can see that this logistic regression model sits in the part of the logistic curve that is close to zero, and a linear model would tell us a different story, because there is not much of a difference between these two age groups here, and for the high age groups the differences are larger. For a Poisson model you have the log link — it's essentially an exponential version of the regression model — and you will get an exponential curve, plotted the same way. It's probably just worth saying, actually, because we haven't explained what a Poisson model is — we're assuming most of you will know, but you might not: a Poisson model or a negative binomial model is useful when you've got a discrete count outcome, with the caveats Nico has explained. And if you do need to use the simple slope tests on these models, certainly if you're using SPSS you can't do it in quite the same way, and the Excel templates don't give you a way to do that automatically, for that reason,
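To illustrate why connecting predicted points with straight lines misrepresents these models, here is a toy sketch of the inverse-logit function that logistic regression is built on (an editor's illustration in Python, not part of the session materials):

```python
import math

def inv_logit(eta):
    """Inverse logit: maps a linear predictor to a probability strictly between 0 and 1."""
    return 1 / (1 + math.exp(-eta))

# The curve is flat near 0, rises steeply around eta = 0, and flattens again near 1,
# so a straight line between two predicted points does not represent the model.
for eta in (-6, -2, 0, 2, 6):
    print(round(inv_logit(eta), 3))  # 0.002, 0.119, 0.5, 0.881, 0.998
```

The flat regions at both ends are exactly where a straight-line plot would be most misleading.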
but you can do it using the indirect method. If you understood the indirect method we explained an hour or so ago, it's exactly the same thing: you center around the value you want to test at and rerun the analysis. Another important thing to understand here is the idea of the simple slope: what exactly does a simple slope mean in this context? Is the simple slope the direction in which the line goes, or is it more about the curvature of the logistic curve? The idea of a simple slope can be understood in multiple different ways that we won't fully get into, but doing these plots would be my choice over a simple slope test, because a plot is clear, and with a plot I would also not need to explain what I mean by a simple slope. A simple slope can be understood in at least three different ways. It can be understood as how strongly these lines curve up — so not actually a slope at all, but the curvature of the logistic curve. It can be understood as a marginal effect: the tangent at a specific point of the line. And it can be understood as an average marginal effect, which is the average slope along all parts of the line. You can see that this quickly gets complicated. I would not normally talk about simple slopes here at all; average marginal effect is a better concept because it's less ambiguous, but then again that's more technical, rather than just plotting and interpreting: what do you want to know in the context of your research question? These questions around simple slopes and so on are more complex with nonlinear models — I'll talk about quadratic models in a moment, where it's even messier — but the fact that we have to specify exactly what question a test is trying to answer should also give you pause about such tests in the straightforward linear case, because a simple slope test answers a very specific question, not the generic one that people sometimes think it does. So yes,
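The difference between the tangent reading (a marginal effect at one point) and the average marginal effect can be made concrete with a toy logistic model (an editor's sketch in Python; the coefficients are made up for illustration, not from the session):

```python
import math

B0, B1, B2, B3 = -4.0, 0.8, 0.5, 0.3   # made-up coefficients for illustration

def prob(x, z):
    """Toy logistic model: logit(p) = B0 + B1*x + B2*z + B3*x*z."""
    eta = B0 + B1 * x + B2 * z + B3 * x * z
    return 1 / (1 + math.exp(-eta))

def marginal_effect(x, z):
    """Tangent slope dp/dx = (B1 + B3*z) * p * (1 - p): it depends on where you evaluate it."""
    p = prob(x, z)
    return (B1 + B3 * z) * p * (1 - p)

# Reading 2: the tangent at one specific point (here x = 2, z = 1).
point_effect = marginal_effect(2, 1)

# Reading 3: the average marginal effect, averaging the tangent over a grid of x values.
xs = [i / 10 for i in range(0, 101)]
ame = sum(marginal_effect(x, 1) for x in xs) / len(xs)

print(point_effect, ame)  # two different answers to "what is the slope?"
```

The two numbers differ even though both are legitimate readings of "the simple slope at z = 1" — which is exactly why the research question has to pin down which quantity is meant.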
that's what Nico was just talking about, and this is another interesting thing: when you talk about an interaction effect in nonlinear models, is there moderation here — does gender moderate the effect of age on qualifications? Raise your hand if you think that gender moderates the effect of age on qualifications. OK, raise your hand if you think that there is no moderation. Most of you prefer not to have an opinion on this. This is something I discussed in the paper I published last year about nonlinear models: we have to specify clearly what we mean by "moderates" in nonlinear models. If we define moderation as the absolute difference, we can see that the absolute difference in qualifications increases more for men than for women — so in absolute terms, yes, there is clearly a moderation effect: the difference here is smaller and here it is larger, so moderation exists. But if we think about this in relative terms, our exponential model says that the quantity of interest — qualifications — increases not in absolute terms but relative to the current level: every additional year of age gives you a certain percentage increase in qualifications, and that percentage increase is the same for both genders. So the relative increase is the same for both of these curves, but it produces different absolute outcomes. Is that moderation? This is not at all a simple question: is it about the relative effect or the absolute effect? We would have to specify that before we can even ask whether there is a moderation effect or not. It is also possible to have negative moderation for the relative effect and positive moderation for the absolute effect at the same time. It might be that men receive one percent more qualifications every year and women receive two percent more every year — so the relative effect is stronger for women — but that on average a man receives five more qualifications
every year and a woman receives three more every year, so there is a negative moderation effect in absolute terms. So this is ultimately a conceptual question, not a statistical question: what exactly do you mean when you talk about moderation? And this is the paper I was referring to, using a data set of years of education and income for different professions from the Canadian census of the 1970s. In a normal linear regression analysis there is clearly a moderation effect: people in male-dominated professions appear to receive greater absolute returns to additional education than people in female-dominated professions. But if we take a look at this curve — the exponential model — the relative increase for all of these groups is the same; it simply produces different absolute outcomes, because male-dominated professions on average receive more money. So again: what exactly do you mean when you talk about moderation in the context of a nonlinear model? Talking of nonlinear models, let's quickly cover quadratic models. Most of you will be familiar, I expect, with quadratic regression: you put in not just the x term but also the x-squared term, which lets you model a parabola — or part of a parabola — for the relation between x and y. This is reasonably flexible, but it has certain limitations: the curve is always symmetrical about the turning point, which may or may not lie within the range of the data you are looking at. For the types of model we use all the time, certainly in management research, it can be good enough for describing nonlinear effects. But what happens when these are moderated? We get an extra couple of terms, because we've not only got xz, we've also got x-squared-z, so we've got five terms we would need
to model, and there are two of those terms which we might need to interpret. The b4 coefficient, on the xz term: if that is different from zero, it means the location of the turning point differs across values of the moderator. Whereas if b5, on the x-squared-z term, is nonzero, then the curvature itself differs — the parabola is more spread out, or may even turn in the opposite direction. Both of these are worth interpreting. As before, we are not going to go through the syntax now, so we can get through the rest of this in the next few minutes, but you can plot it either using one of the Excel sheets — like this one, which in this case gives us two curves with confidence bands — or you can plot it using Stata or R if you want the intervals. Shall I carry on? So we can calculate the plot with margins: Stata is aware that the model contains a quadratic term, it handles the quadratic part of the model, and the calculation is automatic. The same goes for the predictions in R: if we use the poly function, specifying a second-order polynomial with raw terms — because otherwise it produces orthogonal polynomials, which is not what we want here — then all the lower-order terms are included automatically, and the predictions are also aware that this is a U-shaped polynomial model instead of just a linear model. So this is a nice trick, using the poly function in R, to get things a bit more automated, so you don't have to specify everything by hand when you make predictions. Then, in terms of further interpretation, this is an area where I think there is a bit more confusion. A few years ago I probably got two or three emails a month saying "I've got a significant quadratic moderation; how can I tell whether my simple slope is significant?" — and rarely did the writer realise that that's not a meaningful concept in the same way, as Nico already explained for the
nonlinear models such as Poisson regression. There are different ways in which curves can differ from each other or differ significantly from zero, and there are three particular post-hoc tests that I've seen people often want to do. I'm not saying you shouldn't do these; as before, the question is whether these particular tests help you answer your specific research question — if they do, fine; if they don't, don't bother with them. The first is the test of whether there is a curvilinear effect at a particular value of the moderator: in other words, at that value of the moderator, is there definite curvature in the relation between X and Y, as opposed to a linear effect? The second is testing whether there is any effect at a particular value of the moderator — it might be a curved effect, but it might be a linear, nonzero effect. And the third is asking whether, at a particular value of the moderator and of the independent variable, the relationship is different from zero at that point. All three of these are things we've included syntax for in the resources. We're not going to go through them now, because we're running out of time by this stage, and we haven't included them in the slides. It's also worth saying that you might have further questions to ask about three-way interactions and slope differences and curve differences and so on; we have a paper on exactly that which we're hoping to submit in the next few weeks, and it should answer questions about that, but we're not going to go into it more today. Before we finish, there's just a short final section on other extensions. Before we do that, any questions about the nonlinear effects and nonlinear interactions? Question: Is it the same data that you used for the nonlinear models, just with a different dependent variable? That's right: the examples we've included in the resources use the same data set, but we've used a different outcome variable, so that we've got
a binary outcome for the logistic models and a count variable for the Poisson models. For the quadratic models we might even have used the same outcome variable in the examples — I can't remember; it might be a different one, but in that case it's a continuous one. So for the examples we've included, the outcome is assumed to be continuous, but you can apply the same quadratic models to logistic regression or to Poisson regression, and to the three-way interactions. Basically, you can make it as complicated as you want, but the more complicated it gets, the more complicated it will be to interpret, and the greater the chance of not finding a real result. And here's a question for you: you know that the Poisson curve looks like an exponential function, and a parabola goes up and then comes down. What does a Poisson model that has an x-squared term in it look like? How many of you know? It's the shape of a normal distribution: it goes up and then comes down like a bell. So if you ever want to model something that looks normally distributed — a count of something that rises and then falls — that's a great use for Poisson with a quadratic term. Question: So would you add the quadratic term to the Poisson formula in that case?
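The rise-and-fall claim just made — a Poisson model with an x-squared term traces a bell shape — can be sketched numerically (an editor's toy illustration in Python with made-up coefficients, not from the session materials):

```python
import math

# Toy Poisson-style mean function with a quadratic term (made-up coefficients):
# E[y] = exp(b0 + b1*x + b2*x^2) with b2 < 0. Completing the square shows this is
# proportional to a Gaussian (bell) curve in x, peaking at x = -b1 / (2*b2).
b0, b1, b2 = 0.0, 2.0, -0.25

def mu(x):
    """Predicted count at x under the log link."""
    return math.exp(b0 + b1 * x + b2 * x * x)

peak = -b1 / (2 * b2)                    # turning point of the curve (here 4.0)
counts = [round(mu(x), 1) for x in range(9)]
print(peak, counts)                      # rises up to x = 4, then falls symmetrically
```

With these coefficients the predicted count is symmetric around x = 4, which is the bell shape described above.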
Yes, basically — again, it's just a question of complexity, which, if you've got good enough data and good enough theory, is quite reasonable. Let's go to the final set of extensions so we can cover most of it: multilevel modeling and structural equation modeling. Multilevel modeling works in exactly the same way: the formulas for the plots are the same, so there's nothing you need to change, and the Excel sheets also work for multilevel modeling results. There is one small detail: if you are doing simple slope tests, then instead of a t-test you should be using a z-test. But in most cases, inappropriately using a t-test instead of a z-test will have no consequences, because the results are so similar; it only matters in very small samples. So for multilevel modeling, as far as interactions go, basically not much changes. With SPSS there is a bit of a software limitation, because the multilevel command in SPSS does not give you the covariances between the coefficient estimates in the output, and that limits what you can do — but you can always do tests of simple slopes by re-centering your variables. Then about SEMs. SEMs with latent variables are a bit more difficult to interpret, because the scale of a latent variable is not that clear. If you have an observed variable — say the sum of two 5-point scale items — you understand how it varies, but understanding the variation of a latent variable is more complicated. If you scale the latent variable by its first indicator, then you can assume the latent variable uses the scale of that indicator: so if you have a 5-point scale item as the first indicator of your latent variable, a large value of the latent variable might be +1 and a small value might be -1. But this requires that you understand how the latent variable is scaled, and that's beyond the scope of this session. Otherwise, doing interactions with latent variables is no different. For this
kind of model there are some specialist functions: for example, if you use the R package lavaan for latent variable models, there is the semTools package, which has the probe2WayMC function — I don't remember where the MC comes from — and it gives you plots. But the problem with those functions is that they tend to be built for one specific purpose, and they're a bit of a black box as to what they do. I spent half a day last week helping my wife do plots with the probe2WayMC function, and we were wondering why all the interaction lines always intersect at the same point — at the origin of the coordinates. After contacting the developer of that function, we realised that's a feature: it's not designed to model differences in intercepts in a proper way. Figuring out this kind of thing in these less known packages and less known functions is very hard. Often what we want to report about an interaction is whether the lines start at the same point and then go different ways, or whether they start at different points and intersect — different theories lead to these different patterns — and the fact that the package, by design, always draws lines that intersect is not a desirable feature. So if you ever do SEMs, I would recommend that you use a general-purpose package for plotting interactions, or use the Excel sheets, instead of these specialised functions — unless you are absolutely sure what the package does and what its limitations are. So, we're just about out of time. I'm not going to say much about testing multiple interactions — it's pretty straightforward, so you can read what's on the slide if you download it.
Likewise, moderation and mediation: if you want to combine those, this isn't the place to talk about it, but the PROCESS macro is a very helpful tool — it can't do everything, but it can do a lot of the bootstrap-based models you might want there. And advanced plotting — is there anything you want to say about this in 10 seconds? Yeah, 10 seconds on advanced plotting: I would recommend that you plot confidence bands, as most statistical software supports — it doesn't show well on the projector, but they are on the screen here — and also show the observations, so that you can actually see how well the model fits the data. It might be that there's one point here and one point there, in which case the lines are greatly extrapolated: if there are no observations in a region and the curve runs through it, you should not be interpreting that part of the curve. I show this with the examples here: this one is the Stata code; the next one does the same thing in base R — base graphics, where you do real programming; and the next one does it with a dedicated plotting package, which shows that a good package lets you focus more on what you want to communicate instead of on how to program it. OK, so we are out of time. Rather than take questions now, we're happy to stand around for a few minutes if you want to come up and ask anything, but we'll let the rest of you go. Thank you very much — I hope you've learnt something. There hasn't been a chance to hand the evaluation forms out, but if you get the opportunity later to fill them in, please do tell us what you thought. Thank you very much.