Hi, welcome to week four of Poli Sci 506, Bayesian Nonparametric and Computational Statistics. I'm Dr. Justin Hestery, and this week we're going to start really getting our feet wet with the existing packages that help you with Bayesian modeling without having to program all your own Gibbs samplers, Metropolis-Hastings algorithms, and the like. So first I'm going to show you what packages are out there and how to get them set up on your computer, and then we're going to do some applied analysis with them. You'll see that even though this is a lot easier than doing every single bit of the work yourself, programming your own custom Gibbs sampler and so on, these packages are not foolproof: you still have to have a good understanding of what's going on under the hood in order to ensure that the algorithms you use are working correctly. The things we learned last week carry over into this week in the sense that they'll give you a better sense of what these packages are doing, and help you make sure that what they're doing is what you would like them to be doing. With that, let's go ahead and get started. The first thing you need in order to use a pre-packaged Gibbs sampler is the sampler itself, and there are two I'm going to focus on today. One is WinBUGS, whose page I've got pulled up here, and the other is OpenBUGS, which is a fork of the WinBUGS project; it's the same or very similar people working on the two. I think the original WinBUGS may have been meant to be proprietary or something, I don't know, but the two projects have roots in some of the same people and the same techniques. BUGS, OpenBUGS, and WinBUGS are Gibbs samplers that are pre-programmed for your computer, so you don't have to worry about making sure all the computational bits are working correctly. That makes it a lot easier to build a very complex model and rest assured that the Gibbs sampler you use to attack that complex model is itself working well. Installing WinBUGS is pretty easy if you have a 32-bit machine: you can literally just download WinBUGS14.exe, install it, and that's it; it does it all for you. If you have a 64-bit machine, though, and most modern computers are 64-bit now, it's a little more complicated. You download the already-compiled WinBUGS14 zip file right here, open it up (there's a whole bunch of stuff down here), and unzip it, which takes just a second. Once it's unzipped, you copy the folder and paste it into Program Files, which is where R expects WinBUGS to be when we call it from R. So paste it into your Program Files folder, like so, and it should show up here. One thing you want to check is that the folder is set with the right permissions, otherwise things will not work quite right: go over to Security and ensure that Users have all permissions.
Just to show what I did here: right-click the folder, go to Properties, scroll over to the Security tab, and scroll down to Users. For this particular WinBUGS folder, we want all permissions granted to Users, so I click Edit, go down to Users, give them Full Control, and apply it. Now Users have full control over the WinBUGS folder. Like I said, this is necessary because otherwise R won't be able to access WinBUGS, or permissions errors will come up when you try to access WinBUGS through R. So now WinBUGS is set up; if you double-click on that Program Files folder, you'll see WinBUGS14.exe down here. There are actually one or two more things to do before we're finally ready to use WinBUGS. The first is to download the patch for version 1.4.3. You don't necessarily have to do that, but you certainly can. The patch is a big text file full of what appears to be gibberish, but WinBUGS knows how to interpret it. All you have to do is save the source as a text file, maybe to your desktop, like so, and then open WinBUGS (here's my WinBUGS folder; I tell it I don't want it to ask for permission every time). Open the text file on the desktop; it doesn't appear at first, but if I set the file type to text, you'll see it pops up. Then go to Tools, then Decode, and say Decode All. It will ask for some permissions, which is fine, and once it's done you can just quit. When you open WinBUGS again, I think it tells you that a patch has been applied, and if I check over here, yes, it should say 1.4.3, which is exactly what we wanted. The second thing you have to do with WinBUGS is install a license file. It doesn't cost any money, but for whatever reason the proprietors of the program want you to install a license key separately. You just download the immortality key text file to your desktop, just like the patch, open WinBUGS, open the text file with the license in it, then tell it to decode (Decode All), and that's it. As long as you close WinBUGS and open it again, everything should work fine from that point on. So that's how you get WinBUGS working; pretty simple. Unfortunately, as the name would suggest, WinBUGS is only available on Windows, including Windows 7, Windows Vista, and Windows XP; at least, I remember using WinBUGS on XP. If you're using a Mac or a Linux machine, you have a couple of options available to you. One is to try Wine, whose name is a recursive acronym ("Wine Is Not an Emulator"), which in effect emulates Windows inside your Mac. The other is to install Parallels or some other virtual machine program (I think another one is called VMware) and run Windows inside your Mac or Linux box. That unfortunately requires you to install Windows on your machine in addition to your Mac or Linux OS, but you'll be able to use WinBUGS. OpenBUGS is a lot easier to install: just go to the downloads page, and, here we go, you can see OpenBUGS is available for Linux and for Windows. With Macs, you're still out of luck; it tells you to use Wine, I believe.
So in that case, probably installing Parallels is your best bet. If you go to the OpenBUGS download here, it pulls down a 12-megabyte setup file; all you have to do is double-click it and it does the rest. So if I go to OpenBUGS and run that, accept the licensing agreement, and just take all the default options, once it's done, you're good to go with OpenBUGS. WinBUGS and OpenBUGS, like I said, share some common roots, and for many practical purposes they're identical, and you can call either one from R directly. In order to do that, you're going to have to install some additional packages in R, so let's switch over to R and talk about which packages you need. Here I am in R; this is my lecture file for this week, and you can see there are two libraries that I'm loading in order to execute the commands for this week. One is called arm and one is called BRugs. arm is a very excellent package provided by Jennifer Hill and Andrew Gelman that accompanies their really great multilevel modeling book, which is one of the assigned books for this class; I've used it in the past for my linear models class as well, great book, and it comes with some software that's very helpful. BRugs is a package full of commands for calling BUGS, OpenBUGS, and WinBUGS through R; there's a subsidiary package called R2WinBUGS that the arm or BRugs package also loads. So you're just going to want to install these two packages if you haven't already, and I believe, yes, I've already got them installed. BRugs is going to allow us to run our Gibbs sampler models through WinBUGS or OpenBUGS directly, all through R. WinBUGS and OpenBUGS can actually run on their own as standalone programs, and you saw that before when I opened WinBUGS: a program came open that looks and works like a regular program. But since we're working in R, and all our data sets are coming through R, it will probably be more convenient for you to work through R, and that's how I'm going to show you how to do it. All right, so we've got our packages loaded; now let's start some analysis. I'm going to start off with a Bayesian regression. What I've done here is created a fake regression data set of size 50. The data-generating process is 3 plus 2x plus a normal error, where the normal error has mean 0 and standard deviation 2, and I've created a data frame called dat that has y and x in it. What I want to do is run a regression model using either WinBUGS or OpenBUGS, and what that model is going to do is sample out of the posterior of a regression model with the standard regression likelihood, which you probably learned in one of your maximum likelihood classes, or saw last week when we talked about it, combined with some prior that we've specified. Now, there are a bunch of different ways I can do this. What I'm going to do is specify my model in what's called a BUGS file, a .bug file, which contains the actual model that I'm going to run.
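In case you're following along, the R-side setup so far looks roughly like this. This is just a sketch from my description, not the literal code on my screen: the seed is made up, and I'm assuming x is drawn uniform, as in the later examples.

```r
install.packages(c("arm", "BRugs"))  # once per machine
library(arm)     # Gelman and Hill's package
library(BRugs)   # calls WinBUGS/OpenBUGS; the bugs() function comes via R2WinBUGS

# Fake regression data: y = 3 + 2x + N(0, 2), n = 50
set.seed(90210)                    # hypothetical seed, just for reproducibility
n <- 50
x <- runif(n)                      # assumed uniform(0,1) predictor
y <- 3 + 2*x + rnorm(n, mean = 0, sd = 2)
dat <- data.frame(y, x)
```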
So let's open up the .bug file that corresponds to this regression, example1.bug, and take a look at what components it contains. This is a regression model, and BUGS files have their own special syntax, which you have to learn if you'd like to use WinBUGS or OpenBUGS as your Gibbs sampler. (There's a third option, which we're going to talk about in a few minutes, called MCMCpack, which also implements some Gibbs and Metropolis-Hastings samplers you can use if you like.) But if you want to use WinBUGS or OpenBUGS, you have to learn the BUGS language, and fortunately it's not that hard to learn, and there's some excellent documentation associated with it. So if you go to the WinBUGS page, you'll see, if I click on WinBUGS here... whoops, that did not work. Well, since the WinBUGS website doesn't seem to be working at exactly this moment, let's look at the OpenBUGS documentation, which, as I said, is very similar. There are some nice manuals here that can get you started on how to run BUGS, but most importantly from my perspective, they have a large library of example models that you can look at to see what they've done and how they've done it. Secondly, the manual has appendices: let's see if I can go here, user manual, contents, aha, and in the model specification link right here they've got a nice appendix where all the different distributions you can sample from, and a bunch of mathematical functions that WinBUGS and OpenBUGS know, are all listed. So the online manual and online examples are a really good place to figure out what you're doing and to model your models on; in other words, to take a look at things they've done and literally copy and adapt them to things you would like to do. It works really well, and it's a good way to get going quickly with learning how this works. So what I want to do is show you my regression file, and as we're going through it, talk a little bit about how these BUGS files are generally structured and the kinds of things you can do when you start to build your model. Here's my example1.bug. Every .bug file begins and ends with a model statement: there's "model", then an opening bracket, and then a closing bracket. This tells BUGS that this is the model I'd like to update, the model I'm trying to run, and what you do is treat this as a bit of code that tells the Gibbs sampler in OpenBUGS or WinBUGS what to do. In this case, I've got a data set whose dependent variable is called y, and the length of that data set is called n; BUGS needs to be told those things, which it will be by R. And for every observation in 1 to n, I want the dependent variable to be distributed (the little tilde means "is distributed as") according to the normal density, dnorm, just like in R, with a mean parameter mu.y and a precision parameter tau.y. Now, you'll notice I said precision and not standard deviation: OpenBUGS and WinBUGS parameterize the normal in terms of precision as a convention, for reasons I don't fully understand.
Just remember that tau is going to be one over sigma... actually, I take that back: tau is one over sigma squared, which you can see in the model specification manual, in the appendix on distributions. The normal there is written as a function of tau, with tau appearing in the places where sigma squared would ordinarily appear, as one over sigma squared. So tau is 1/sigma-squared, not 1/sigma. Anyway, I've got a standard normal distribution with mean mu.y and precision tau.y. And what's my mean supposed to be? Well, this is a classical linear normal model, so the mean of this distribution is going to be alpha plus beta times x, where alpha plus beta x is the regression equation. The regression equation tells me where I expect the distribution to be centered, and the normal density part tells me there's going to be error around that expectation. So this first bit basically says: this is how the data should look; the data should be normally distributed around a mean given by this linear regression, with a variance, or precision in this case, of tau.y. Now, I don't know what alpha, beta, and tau.y are, and neither does WinBUGS or OpenBUGS, but I can tell it my prior densities for these things. So in this last bit I'm specifying my prior densities for alpha, beta, and tau.y. For alpha and beta, I'm specifying a normal density with a precision of 0.001; that's one over sigma squared, which means I'm putting a very big standard deviation on these parameters. So this is, in essence, a very diffuse although slightly informative prior density. For tau.y, I'm giving it the gamma density; this is the inverse of the inverse-gamma density we talked about last week, which is why it's appropriate for a precision parameter like tau. The gamma is bounded at zero and goes out to infinity, and actually, let me think about this: sigma squared is bounded at zero, which means one over sigma squared also has to be bounded at zero, so that's the right support for this parameter. Anyway, a shape and scale parameter of 0.001 in the gamma density gives me a very diffuse prior. So I'm specifying diffuse priors for the entire model. This is a very, very simple .bug file that gives me a very simple linear regression and tells OpenBUGS or WinBUGS what the model looks like; it's going to figure out, more or less on its own, how to update the posterior given the model I've told it.
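So, in case you're following along, the whole file plausibly reads like this; this is a sketch assembled from what I just described, so the details may differ slightly from what's on my screen:

```
model {
  for (i in 1:n) {
    y[i] ~ dnorm(mu.y[i], tau.y)      # likelihood: normal with precision tau.y
    mu.y[i] <- alpha + beta*x[i]      # the regression equation gives the mean
  }
  alpha ~ dnorm(0, 0.001)             # diffuse normal priors (precision 0.001)
  beta ~ dnorm(0, 0.001)
  tau.y ~ dgamma(0.001, 0.001)        # diffuse gamma prior on the precision
}
```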
So the next question is: how do I get this to happen in R? Well, it's a relatively simple procedure. The first thing I need to do is specify the data. In this case the data is stored in R's memory: you can see x and y up here, and I see I need to get n in there as well, so I just run this bit. Now I've got x, y, and n in memory. The data object is just a list of the names of the data I need to feed to BUGS, either OpenBUGS or WinBUGS; I don't actually have to feed in the data itself, just the names, and BRugs will take care of feeding the data. The second thing I need to do is create initial values: the start of the Gibbs sampler chain, the initial value of the chain for all the parameters I'm going to update. I'm updating alpha, beta, and tau.y, so I'm going to initialize them at 0, 0, and 1. The format this needs to come in is a function that takes no arguments and returns a list; this, again, is just a convention of OpenBUGS and WinBUGS, so just do it this way. The parameters object is a character vector of the names of the parameters whose posterior density I wish to examine; in this case, I want to look at alpha, beta, and tau.y, so I create a character vector with "alpha", "beta", and "tau.y" separated by commas, in quotation marks. The final piece is the command that actually calls WinBUGS or OpenBUGS: the bugs statement. If I do a help on bugs, you can see how it's called; it comes out of the R2WinBUGS package, which I think is loaded by BRugs. There are a lot of options you can specify in bugs; I've specified the most common, so I'll talk about what I've specified here. First, you have to feed it the data, the initial values, the parameters, and the name of the file that contains your model: those are the data, inits, parameters.to.save, and model.file arguments. I suggest you set your working directory to the source file location and drop all your .bug files in where your R file is; in other words, just keep them all in the same folder, and that's an easy way to make it all work. n.chains tells it how many Markov chain Monte Carlo chains you want WinBUGS or OpenBUGS to run. They're capable of running multiple chains at the same time, and if you have, say, a four-core PC, it might make more sense to run four chains at the same time than to run one chain that's four times as long. I actually have a 12-core PC here; running 12 chains is probably overkill, but I can run maybe four. n.iter is the number of iterations per chain, and that's inclusive of the burn-in. So in this case, I'm going to have it update 5,000 times with a burn-in of 1,000. I've got four chains running, so that's going to give me a total of 20,000 draws, five thousand times four, but I'm going to lose 4,000 of those to burn-in, so I'm really going to get 16,000 draws out of this procedure. And n.thin is the thinning we talked about last week: I could discard every so many draws as a way of reducing the autocorrelation between them. I'm in fact setting n.thin to 1, which means no thinning at all.
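Assembled, the call looks something like this; again, a sketch, and it includes the debug and program arguments I'll explain in a moment:

```r
data <- list("y", "x", "n")            # names of objects already in R's memory
inits <- function() { list(alpha = 0, beta = 0, tau.y = 1) }
parameters <- c("alpha", "beta", "tau.y")

regression.sim <- bugs(data, inits, parameters,
                       model.file = "example1.bug",   # kept next to the R file
                       n.chains = 4, n.iter = 5000, n.burnin = 1000,
                       n.thin = 1, debug = FALSE, program = "winbugs")
```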
debug is basically a way of getting detailed output from the OpenBUGS or WinBUGS program, especially if something goes wrong. In this case, I know nothing's going to go wrong, so I'm setting debug equal to FALSE, but you can set it to TRUE if something's crashing and you need to figure out what's wrong. And finally, the program argument: I'm going to have this update via WinBUGS, and I'm going to save the output of this command in regression.sim. So if I run this command, you can see WinBUGS pops open, it updates the chains, and now it's done. Now that we've got our regression simulation loaded in, what we might want to do first is some diagnostic plots, the way we normally would. In this case it's going to be slightly more complicated, because we've used BUGS as a Gibbs sampler to get our Markov chains, and we've got four of them, more than one. There are some packages, however, that help you do these diagnostics very easily; the mcmcplots package is one of these. The mcmcplots package has several commands that might be of interest, but the one we're going to talk about right now is the aptly named mcmcplot command. All you need to do for mcmcplot is feed it your bugs object, regression.sim in this case, and also tell it to dump the files it's going to create in the current working directory, which you can specify with the getwd command. What's going to happen is it dumps a bunch of PNG files and an HTML file into your current directory. So if I open that file, you can see what it's going to do: it creates diagnostic plots for all of the parameters that were estimated by the Markov chain, alpha, beta, tau.y, and also a deviance parameter, which is minus two times the log-likelihood. You can see right here, for example, each one of our chains is listed in a different color. Here's our posterior plot for each of the four chains, and they're pretty much identical: you can see in the trace that they're mixing pretty well over the space of possible parameter values, and there's pretty much perfect overlap. In other words, we're not getting the case where one chain is exploring the space really well and another one isn't. The same goes for beta: all our chains look pretty much the same, and they're mixing as we would expect. Here's our autocorrelation plot: virtually no autocorrelation.
That's really, really good. This is a partial autocorrelation plot, which I think controls for the other parameters in the model; our autocorrelation plots all look pretty good, pretty normal. So what we've learned here is, first of all: cool command. Secondly, this estimation seems to have gone pretty well; WinBUGS seems to be doing pretty much what we would expect it to. Now what I'm going to do is attach the regression.sim object — I'm actually going to get rid of these plots, I don't need those — and focus on plotting the density for alpha, beta, and tau.y, our three parameters. You can see here that the true value of alpha, the constant, is 3, and our posterior density is pretty well centered over 3. It's shrunk a bit toward zero, which... well, actually, we do have a prior in this particular case; you can see it in the bugs file. It's a very diffuse prior, but it is centered on zero, so we're seeing a bit of shrinkage toward zero, which is what we would expect given that diffuse but zero-centered prior. In fact, I can even add a vertical line at the true value, so we can get a sense of how we're doing here. Makes sense. And maybe also add a quantile calculation of the 95% credible region around alpha: if we take the quantiles of alpha at probabilities 0.025 and 0.975 for a 95% credible interval, we can see that the interval is 1.12 to 3.86, which is, you know, centered on the true value, albeit shifted a little bit toward zero and a little diffuse, but still nothing really wrong. What about beta, our coefficient on x, which we might be most concerned with if we were actually conducting a live analysis? Well, here's my density for beta. The true value is supposed to be... oh, I take that back: the true value for the constant is supposed to be 3, so I should fix that line. There we go; yes, 3 is still in the 95% credible interval, that's good news. The true value for beta is supposed to be 2, and if I do a density plot of beta, you can see that it is centered on 2, shifted a little to the right, which is away from the prior. This is probably just due to the fact that we only have a sample of size 50, so there's some degree of variation in the estimate. But our 95% credible interval still comfortably contains the true value of 2, although again it's a bit diffuse: what I mean by that is that this is a fairly high-variance estimate, probably owing to the fact that we only have 50 observations in this particular data set. tau.y is our estimate of how much error there is in the data-generating process. Remember that tau.y is one over sigma squared.
So if we wanted to plot this as a variance, or rather a standard deviation, we could take the square root of one over tau.y and plot that instead. And it's centered on 2, which makes sense, because that's exactly how much error was in the data-generating process, so we're getting correct answers there. I can add a vertical dashed line there to indicate the true value, and finally add a 95% credible region. I'm going to do this on tau.y... actually, I should really do this on the square root of one over tau.y. Yeah: 2 is well inside that credible region. So far, everything seems to be going pretty well. That's how you run a regression in R using the WinBUGS sampler. Now, if I want to use the OpenBUGS sampler instead, which is very similar, all I need to do is take that program argument and change it from "winbugs" to "openbugs". So I'm going to rerun everything, but now it will use OpenBUGS instead of WinBUGS. And you can see that instead of that WinBUGS window opening up, I get OpenBUGS output printing at my R command line. What I've used here is the print(regression.sim) command instead of the various other commands I used before; this print command also works for WinBUGS output, I just decided this was a good opportunity to show you it. What it tells you is the mean, standard deviation, and various quantiles of each of the estimated parameters: alpha, beta, tau.y, and the deviance, which again is a function of the likelihood. And it tells us, okay, our mean estimate for alpha was about two and a half with a standard deviation of 0.7, and my 95% credible region runs from 1.1 to 3.9, which is pretty much the same answer we got last time, as I recall. We can run the regression diagnostics just like we did before using the mcmcplots package; I suspect they won't be much different for OpenBUGS compared to WinBUGS, but it's worth checking to make sure. So if I go in here and look at my MCMC output... ah, well, in this case you can see I only specified one chain, which is why only one chain is showing up. Other than that, these things look pretty similar, although I'm noticing a bit of autocorrelation in the lags for beta and alpha, more than in the WinBUGS run, which is a bit strange, a little unusual. What happens if I specify four chains for my OpenBUGS command and then rerun? There we go... yeah, again relatively similar results, although I notice the autocorrelation is higher for alpha and beta than it was before; actually, the partial autocorrelation is better and the raw autocorrelation is worse. So this may have something to do — and I'm speaking out of ignorance here — with the way OpenBUGS and WinBUGS are optimized, their default characteristics; they may work on slightly different procedures. Either way, we're getting roughly the same results, and good results.
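For reference, the little post-estimation workflow I've been using looks roughly like this; it's a sketch, using R2WinBUGS's attach.bugs to pull the draws into the workspace, and your exact plotting commands may differ:

```r
library(mcmcplots)
mcmcplot(regression.sim, dir = getwd())    # dump diagnostic PNGs + an HTML index here

attach.bugs(regression.sim)                # makes the alpha, beta, tau.y draws available
plot(density(alpha)); abline(v = 3, lty = 2)        # posterior of the constant; truth at 3
quantile(alpha, probs = c(0.025, 0.975))            # 95% credible interval

sigma <- sqrt(1/tau.y)                     # convert precision draws back to a standard deviation
plot(density(sigma)); abline(v = 2, lty = 2)
quantile(sigma, probs = c(0.025, 0.975))
```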
Now, you may have noticed these diagnostics — maybe it's not clear that they're diagnostics — the Geweke, Heidelberger, and Raftery diagnostic criteria. These are described in your Jackman textbook as ways of checking that your Markov chains look good, that they meet the criteria of a good Markov chain estimate of your target density, in this case the target posterior. If you recall, the things we want out of our chains are: first, that they be irreducible, which is to say they completely cover the parameter space we're interested in estimating and can't be boiled down to a subset of that space. We also want them to be aperiodic, to have a chance of touring the entire space. And, really most importantly, we want them to be an accurate reflection of the space, which means we want the Markov chain, at later points in its run, to not be different from itself: in other words, to settle down into the target density and stick with that density. These three diagnostics are ways of assessing whether you've run your Markov chains long enough. The first one I'm going to start with is the Geweke diagnostic. This diagnostic, as Jackman explains, amounts to a t-test, or in this case a z-test, of whether posterior estimates differ between the first 10 percent of your sample and the first 50 percent... I'm sorry, I take that back: the first 10 percent of your sample and the last 50 percent of your sample (and this discards the burn-in). So it's the very front end versus the back end of your draws. If your Markov chain is doing well, there should be no difference between those two pieces. These are all z-statistics, and if you run the help for geweke.diag, you'll see the z-scores reported for each one of your parameters: alpha, beta, tau.y, and deviance. What you want is for all these z-scores to be statistically insignificant, which is to say less than, let's say, 1.645, the critical value for a two-tailed test at the 10% level. The z-scores are for a test of equality of means: if you get a statistically significant test, the means are not equal. In other words, if I'm estimating the mean of my posterior beliefs about alpha, that mean should be the same in the first 10 percent of my chain and the last 50 percent. If you get a statistically significant z-score above 1.645, what that's telling you is that you need to run a longer chain — maybe more chains, but really longer chains. I've got a chain of length 5,000 here, and the Geweke diagnostic is telling me that's more than sufficient.
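In code, that check is just this (assuming the coda package, which supplies these diagnostics, and an as.mcmc.list coercion for the bugs object, which is what I mean when I say I have to convert it):

```r
library(coda)                               # convergence diagnostics live here
reg.mcmc <- as.mcmc.list(regression.sim)    # coerce the bugs object so coda can read it
geweke.diag(reg.mcmc)                       # z-scores: first 10% vs. last 50% of each chain
```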
That's that's more than sufficient But this go wiki diagnostic is telling me Uh, the next convergence test is the heidelberger diagnostic and you can see i'm going to run that right here Uh, the heidelberger diagnostic runs separately For each a chain you can see i've got four chains and i've got four tests here And actually two tests here One's the stationarity test and one is the half width test um The stationarity test is exactly what it sounds like It's a test of whether this chain for this for each parameter has reached a point where, um The kramer von misstatus that cannot reject the null hypothesis that the distribution of the mcmc chain is stationary And usually you can specify any number of p values here, but um in this case the default is five percent if The null is that the distribution is stationary If we receive a p value on this particular statistic less than 0.05 We reject the null that it's stationary and conclude that it's not stationary If we reject the null that's very bad news because we want our our markup chains to be stationary distributions Because only then can we have any faith that they um are stationary distributions Their stationary distributions match the target density because ultimately what we want them to do is to match the target density Now you can see these p values are all pretty big except for the density on tau y But for that chain, it's pretty there's a pretty low p value But for all other three chains the p values are very high so i would Conclude that probably this is just a fluke And in fact that all the chains really have reached stationary densities. Hopefully stationary densities of the target The second test here is the half-width test. The half-width test is basically a test of Whether the estimate of the mean of the posterior is Sufficiently accurate and more more to the point is sufficiently low noise And so what you can see here in the help file for the heidelberger test Is uh that the half-width test compares The 95 confidence interval for the mean With the estimate of the mean and if the ratio between half the confidence interval and the mean is lower than some pre-defined threshold Eps the half-width test is passed intuitively what this is doing is saying I want to make sure that my estimate of the mean is um Low variance um has a fair has a has a low enough variance that we can conclude that this uh Procedure that we run is a good estimate of that mean In this case, we've uh passed all of these um these tests. There's actually no um Test statistic reported here, but what we do is divide um the uh Half-width divided by the mean and as long as that's sufficiently small We say the test is passed and the test actually reports whether it's passed or not It's passed in all cases both the stationary and half-width tests have been passed in all cases. So that's that's good news Uh, the last diagnostic is the raftery diagnostic um Oh, and I may need to uh Specify this as mcmc here Yep, there we go. 
The last diagnostic is the Raftery diagnostic — oh, and I may need to coerce this as mcmc here; yep, there we go. So, again, this is conducted separately for all four chains. What the Raftery test is looking at is, in essence, an estimate of how long a chain we need to run in order to get an accurate estimate of the density for all of our parameters, and what it does to get that is look at the tails of those estimated densities. And I'm reading straight out of Jackman right now: it assesses the accuracy of the MCMC-based estimate of the 2.5th percentile, the 2.5th quantile, requiring with 95% probability that the estimate be within 0.005 of the truth in quantile terms. So what it's in essence asking is: have I got a very low-variance estimate of the very edges of my density? If I'm estimating the edges well, at least in the sense of having a low-variance estimate, I'm probably in good shape in terms of my estimator. But it goes further and uses this procedure to estimate how long a sample we would need in order to get an accurate estimate. In this case, what it's telling us is that we need a chain length of 3,787, and our chain is currently 5,000, so we're already in good shape. I can see only one of all these tests — maybe one and a half, if you count the deviance — actually telling us we need to run a longer chain than we already have. So in general, the Raftery diagnostic is telling us we're in good shape. You can get more information on the Raftery diagnostic, and all of these diagnostics, by calling up the help function, and most importantly, it tells you what you can actually do with what it outputs. So, for example, let's see: right, it's going to return a 3-d array containing the results; M is the length of burn-in, N is the required sample size, and Nmin is the minimum sample size based on zero autocorrelation. So the N total here is probably the safest bet, in terms of the Raftery diagnostic, for how long you should make your chain in order to get quality estimates of the entire density. In this case, it's telling us we're in pretty good shape, so we don't have to think about it anymore.
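That call, with coda's defaults spelled out, is roughly:

```r
raftery.diag(reg.mcmc, q = 0.025, r = 0.005, s = 0.95)
# q: quantile being estimated; r: tolerated error in quantile terms; s: coverage probability
# reports, per chain: M (burn-in), N (required chain length), Nmin, and the dependence factor I
```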
So you might want to use these diagnostic criteria whenever you're using OpenBUGS or WinBUGS, to get a sense of whether your chains have converged properly. The mcmcplot command gives you some information, but these diagnostics can help you too, and in the case of the Raftery diagnostic, you can even get direct information about how much longer you should make your chain if you need to make it longer. So I think WinBUGS, OpenBUGS, and also JAGS — the acronym is Just Another Gibbs Sampler, and it happens to work on Macintoshes, which is why it's gotten its popularity — those three are probably the most common ways to estimate a model with Bayesian techniques using a Gibbs sampler. But if you don't like creating a separate model .bug file, there are also some packages that run Bayesian updating directly in R, and in particular I want to show you a package called MCMCpack. There were multiple contributors: Martin and Quinn were the primary ones, and I think Dan Pemstein contributed some C programming to make the routines faster. MCMCpack can do some of the work for you, and in particular, for simple models, it has the models built into itself. For example, regression is a model that MCMCpack knows how to do automatically, without programming any model at all. I can show you: if you look at the help for MCMCregress, you just feed it a formula; the data you're using; the burn-in length you want, in this case I've specified a thousand; how many MCMC updates you want to run, I think I'll do around five thousand; the thinning parameter, if you want to thin at all; whether it should spit out a bunch of verbose output; and whether to set the seed to a particular value so you always get the same answer — in fact, I think I'm going to set the seed to a particular value. beta.start is, I believe, yes, the starting values for the beta vector; if you specify nothing, it just runs a simple regression and uses those estimates as the starting values, which I think is perfectly reasonable. Then little b0 and capital B0 are the conjugate prior belief parameters: b0 is the prior mean for, in this case, first the constant and then the coefficient on x, and B0 is the prior precision matrix for those coefficients. If you remember our discussion from last week, b0 and B0 fit into Bayesian regression as conjugate priors when we specify a normal density for our prior beliefs about the beta parameters. So the MCMCpack package is allowing us to use ideas we developed from the closed-form, analytical version of Bayesian regression, without having to think about all those formulas you'd otherwise have to memorize; it will sort of do that for you. c0 and d0 are the shape parameter and the scale parameter for the prior on the degree of noise in the DGP, the variance of the DGP. I had originally written these down as gamma parameters, for the precision — but no, actually, the help file says it's an inverse gamma here, so this is not about tau, this is about sigma squared, and I should change my values to be a little different. I'm going to set the shape equal to 1 and the scale equal to 2, maybe 3, which is a fairly diffuse prior. So if I run this chain, it updates very quickly; it's much faster, at least for some models, than WinBUGS and JAGS. And now I can work with this regression.sim object.
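The call I've been describing, assembled, looks roughly like this; it's a sketch, the seed value is made up, and your prior values may of course differ:

```r
library(MCMCpack)
regression.sim <- MCMCregress(y ~ x, data = dat,
                              burnin = 1000, mcmc = 5000, thin = 1,
                              verbose = 0, seed = 1234,   # hypothetical seed
                              b0 = 0, B0 = 0.001,         # normal prior on coefficients: mean 0, precision 0.001
                              c0 = 1, d0 = 3)             # inverse-gamma prior parameters for sigma^2
```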
For example, I can plot regression.sim, and it shows me — let me expand this up for you here — plots of each of the main parameters I'm estimating: the intercept, the beta on x, and sigma squared. These are posterior densities for each of those parameters. As you can see, my posterior density for the constant is about two and a half, a little less than 3. For the regression coefficient, it's about two and change; it's supposed to be 2, but we're still well inside the credible intervals here. And our sigma squared is centered on 4; since the DGP error had a standard deviation of 2, sigma squared should be 4, so that's about right. If I do a summary of regression.sim, you can see it reports results for each of my parameters; in other words, it basically creates a regression table out of the results for you. The mean of the posterior density for the x coefficient is 2.8; the true value is 2, so it's overshooting by a bit. For sigma squared, it's telling me the mean posterior is 4.2, which is close to that true value of 4. Let's check how sensitive this is to the prior I specified: what happens if I set the shape and scale a bit differently, say 5 and 2? It doesn't really change that much. What happens if I set them to 0.001 and 0.001? It's pretty much the same thing every time, so the prior doesn't seem to be informing the sigma estimate that much. Anyway, it's fundamentally the same way of running these models; you're just using a Markov chain Monte Carlo routine that's built into R and doesn't require you to call OpenBUGS or WinBUGS separately. So that's how to run a regression using Bayesian techniques in R. Now let's move on to some more complicated models, and we'll start with an only slightly more complicated one: the logistic model. All right. The first thing I'm going to do to run a logistic model is create a fake data set. What I'm doing is, again, drawing a data set of size 50: x is out of the uniform between zero and one, and the latent index x-beta is going to be negative 2 plus 2x. Then I sample random numbers out of the logistic distribution using rlogis, which gives me what I've initially called y here. If I plot it against x, there you go: I'm getting a logistically distributed variable. Actually, I really ought to call this y-star, because it's the latent index for y. Then what I want to do is create the y variable itself, which is going to be 1 if y-star is bigger than zero; I do this with as.numeric to turn it into a one-zero variable (and I've got to actually call that y-star). So now if I do a plot of y against x, I'm getting a nice logistic data set, exactly the way I would anticipate.
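In code, that setup is roughly as follows (a sketch; the seed is mine):

```r
set.seed(4090)                      # hypothetical
n <- 50
x <- runif(n)                       # uniform(0,1) predictor
y.star <- -2 + 2*x + rlogis(n)      # latent index: x-beta plus a logistic error (scale 1)
y <- as.numeric(y.star > 0)         # observed binary outcome
dat <- data.frame(y, x)
plot(x, y)                          # the familiar 0/1 scatter against x
```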
To spell that out: the logistic function rlogis gives me values of the latent index y-star, which is x-beta plus a logistically distributed error, in this case with scale 1. Normally the scale parameter in a logit model is unidentified, because it's indistinguishable from x-beta, so I'm setting it equal to 1 here, the same way a maximum likelihood routine would. Now I'm going to load that into a data frame called dat and proceed with my analysis. Just like before, I'm going to run a WinBUGS model — yeah, see, I'm specifying WinBUGS here — but I've got to change my model: example2.bug is a new model that's specific to the logit. So let's open up example2.bug and get a sense of what's going on in there. Okay, so here's our basic logit model, and as you can see, what we've done here is just adapt the regression example, changing a few things to make it a logit. For example, y is no longer distributed normally, as it was before, but Bernoulli: it's a series of Bernoulli trials with probability p[i], which is to say the probability specific to each observation i. And the logit transformation of that probability p[i] is given by alpha plus beta times x[i]. This logit transformation makes it easy for us to run the model in WinBUGS, because, in essence, it's the link that transforms probability into index space, and then we can just write the index as a linear model, like so. I'm going to leave the alpha and beta priors reasonably diffuse, just like I did before; this is a reasonably uninformative prior. Now I can go back to R and run this model on the data I've created. So let's create the data set again, and what you can see is that I'm basically doing the same thing I did before, almost exactly the same code; the only thing I'm really changing is that I'm running example2.bug instead of example1.bug. It takes a second and finishes up. Let's quickly check our diagnostics. It looks like our Geweke z-statistics are all pretty small, so we're good to go there. The Heidelberger diagnostics are all listed as passing, so that's good news. The Raftery diagnostics, though — whoa — are saying that for some of these chains we really ought to have a sample size of as much as 20,000 perhaps. So maybe we should go ahead and rerun this chain somewhat longer, say 25,000. So I'm going to rerun this again and then compute the Raftery diagnostic again. Now, one thing you'll notice is that this takes a very short time to run, only a few seconds really. That's partially because the data set in this particular example is so small, only size 50. If we had a data set of 5,000 or 500,000, or even just 500, this would take a lot longer to do, because you have to compute the likelihood for each one of those observations, just like we did in our build-your-own Gibbs and Metropolis-Hastings samplers last week. So this can take a long time on a big data set, and it's not a trivial thing to just run 25,000 more samples like I just did. But in this case, it's something we can do, so why not?
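Before we look at the results, for reference, example2.bug plausibly reads something like this (again, a sketch from the walkthrough):

```
model {
  for (i in 1:n) {
    y[i] ~ dbern(p[i])                  # Bernoulli trial for each observation
    logit(p[i]) <- alpha + beta*x[i]    # logit link maps the linear index onto a probability
  }
  alpha ~ dnorm(0, 0.001)               # diffuse priors, as in the regression example
  beta ~ dnorm(0, 0.001)
}
```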
And it looks like, in this case, we're now getting required sample sizes right around the sample size we've actually run, so that's pretty good; I'm going to go ahead and call that a win. Now let's print the output and see what we've got here. Well, it looks like it's telling us that alpha is somewhere between, let's see, negative 6 and negative 1.8, and the answer is negative 2, so we're actually inside the 95% credible interval there. Beta is between 1.5 and 5.5; it's 2, so that's good. And if we do an mcmcplot of this — there we go, it's taking a little bit longer because there are so many samples in this particular chain — there we go. The autocorrelation plots are kind of crappy; we might go back in and thin those if you like. But you can see, here are our distributions for alpha and beta: the true values are negative 2 and 2, and we're well within those. The estimates are somewhat high variance because this is such a small sample, but nevertheless it's coming out okay. I'm going to get rid of these plots here. We can attach logit.sim and maybe do a plot of the density for alpha — oops, left out the "a" — there we go. This is exactly the same plot we were just looking at in the mcmcplot output. So I'm going to detach that, and that's it: that's all there is to conducting a logit analysis. Now let's get into some more complicated examples. The next example I'm going to do is regression with unit heterogeneity. You might remember that in a frequentist framework, when you're running Stata or even estimating models in R, you might run a random effects model — maybe a fixed effects model, but say a random effects model — on your data if you suspected there was unit heterogeneity that was inflating or deflating your variance statistics, especially if it was deflating them. Well, we can do that in a Bayesian framework too. How? Let's find out. The first thing I'm going to do here is create a data set that has unit heterogeneity in it. What you can see me doing is constructing three variables: unit, which is the unit you're in, x, and y. For i in 1 to 20, which indexes the unit, I basically set your unit number equal to i, and I draw a unit effect out of the random normal distribution with mean 0 and standard deviation 3; so this is a random effects DGP that I'm creating. Then I construct y as 3 plus 2x, plus a normally distributed error, plus whatever unit effect I drew. So every observation inside a unit is going to have a common shock to its y, and if you ignore that common shock, it's going to work its way into the error term of your regression. In other words, this data set is going to necessitate us running a random effects model. So if I just run this whole thing, I get a data set with unit effects, and if I look at it here, you can see I've got three variables: y, x, and the unit identifier, which is just a number from 1 to 20.
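Here's a sketch of that data-generating setup, along with the example3.bug model file I'm about to walk through. Both are reconstructions from my description: the seed, the within-unit sample size, and the within-unit error standard deviation are my guesses, not values stated on screen.

```r
set.seed(8675309)                     # hypothetical
m <- 20                               # number of units
n.per <- 50                           # observations per unit (assumed)
unit <- rep(1:m, each = n.per)
u <- rnorm(m, mean = 0, sd = 3)       # unit effects: one common shock per unit
x <- runif(m * n.per)
y <- 3 + 2*x + rnorm(m * n.per, sd = 2) + u[unit]   # within-unit error sd of 2 is assumed
dat <- data.frame(y, x, unit)
n <- nrow(dat)
```

```
model {
  for (i in 1:n) {
    y[i] ~ dnorm(mu[i], tau.y)
    mu[i] <- alpha + beta*x[i] + re[unit[i]]   # add the unit's random effect to the mean
  }
  for (j in 1:m) {
    re[j] ~ dnorm(0, tau.re)                   # random effects centered on zero, precision tau.re
  }
  alpha ~ dnorm(0, 0.001)
  beta ~ dnorm(0, 0.001)
  tau.y ~ dgamma(0.001, 0.001)
  tau.re ~ dgamma(0.001, 0.001)                # very diffuse gamma prior on the RE precision
}
```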
Now, over in WinBUGS, I'm going to specify a random effects model, so I'm going to open up example3.bug here. This is a random effects model: it's just a regular regression model, but I've added in a random effects term, adding it right onto mu, which is the mean of y. The random effects term is a vector of length 20, and the particular value any given observation gets is a function of its unit identifier: my unit value, a number from 1 to 20, determines which random effect I draw. Then I just add another loop where, for j in 1 to 20, I draw each unit effect out of the normal distribution with some precision parameter corresponding to how spread out these random effects are. So all I've done is add a couple of additional parameters to my model: the random effects parameter, which is in turn a function of what unit you are, and a precision governing how dispersed this particular random effect is. So this is just a standard regression with a little bit of extra stuff, a common shock to each unit. I'm telling WinBUGS to pick which random effect you get by your unit identifier, unit[i], and then I just tell it to update each unit's random effect by drawing out of the normal distribution with mean zero and common precision tau.re. And I'm drawing tau.re from a very diffuse gamma distribution, on the assumption that I don't know much about the distribution of unit effects in this particular data set. Now, on the R side, all I have to do is create an additional variable called m: m is basically how many units there are. I have to tell WinBUGS about that, so I'm going to feed it m, the number of units; the unit variable; y and x; and n, the total number of observations in the data set. And I'm going to set initial values for alpha, beta, tau.y, tau.re, and also the random effects variable here. What I'm doing is just initializing the random effect at zero for all 20 units; this is rep(0, 20), and if I run that, you'll see it just gives me a vector of twenty zeros. I'm initializing the random effect at nothing for everybody. So that's it: just feed all that into WinBUGS — oh, I'm actually running this in OpenBUGS, just for some variety's sake; it's all the same anyway. It'll take a second to update... yep, and there we go, it's all saved. So let's take a look at the Geweke diagnostics: looks like they're all statistically insignificant, so that's a good thing. I need to coerce this with as.mcmc, and then I'm going to check out my Heidelberger diagnostics here... and they say pass. Now, I think I only ran one chain — yes, that's true, I only ran one chain in this case, which is why we're only getting one set of output here. And let's check out my Raftery diagnostics... whoa. It's saying that for alpha, I'm going to need 80,000 samples. That seems like kind of a lot. But okay, so maybe what we'll do, since I have a 12-core computer, is just up the number of chains to eight; that's going to give me pretty much the same number of draws in almost the same time. So it's going to initialize eight chains, do some stuff, do a thing, blah blah blah... come on, computer, you can do it... all right, pause the video, this is boring... all right, that's done.
So what are the Raftery diagnostics telling us now? Well: 90,000, 140,000, 90,000, 91,000, 74,000, 78,000. Yeah, we need a really big chain. All of the half-width tests and stationarity tests are passed, with what looks like one exception. So even though the Raftery diagnostic is telling us we need a few more draws, given the length of time it took to get this particular sample, and also given the fact that the Raftery test is reasonably conservative, I'm going to go ahead and say this is fine. So now what I want to do is an mcmcplot of re.sim. It takes a second to load up... looks like it's all done... here we go. So here's my diagnostic plot for — wait a minute, there should be more here; that's for logit.sim. Oh, I know what I forgot to do: I forgot to specify the directory, so I need to go up there and say I want the current working directory. Try that again. So: it looks like there's quite a bit of autocorrelation in alpha, which would be problematic, and also it looks like the mixing is not going as well as it could for alpha, which is probably why the Raftery diagnostic is telling us we need such a huge number of draws from the chain. On the good side, beta is looking pretty solid, with decent autocorrelation characteristics and good mixing. So is tau.y. And this is tau.re, which is the degree of precision in the distribution of random effects. The alpha plot is a little troubling; in this case, I'd probably think about thinning that chain a little bit, given its extremely high level of autocorrelation and the fact that the mixing doesn't seem to be going very well. In fact, the mixing on that alpha parameter being so bad might lead us to ask: is there something screwed up about this model? Now, the model as we've written it is, technically speaking, correct. You might think there's an identification problem, because we're estimating a whole bunch of different random effects, one for each unit, and thus eliminating the constant; in other words, that we've set ourselves into a dummy variable trap. But that's not exactly true, because the random effects are constrained to be distributed around zero, and so alpha should be the portion of the constant that's not explainable by these deviations around it. On the other hand, we're still separately estimating the normal draws around zero and alpha. So maybe it would be better, maybe convergence would improve, if we made alpha a part of the normal distribution: in other words, if we said alpha is the center of the random effects distribution. It's the same model — we're still just estimating a constant — but now, in this mu expression, we don't have alpha entering separately; we have it entering as part of the random effects parameter. So what I'm going to do now is just rerun the chains I already had. Let me clear this out here; I'm going to rerun everything with this new version of the program, and I'm actually going to turn the number of chains back down to one, because, as you're going to see, that's going to fix this convergence problem very nicely.
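Concretely, the change to the model file amounts to something like this (a sketch of the reparameterization):

```
for (i in 1:n) {
  y[i] ~ dnorm(mu[i], tau.y)
  mu[i] <- beta*x[i] + re[unit[i]]   # alpha no longer enters the mean separately
}
for (j in 1:m) {
  re[j] ~ dnorm(alpha, tau.re)       # alpha is now the center of the random-effects distribution
}
```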
I'm going to make sure I've got all new data and everything, then load and run the chain. It takes just a second, and it's already done. You can see things are better already: the Raftery diagnostic for this chain says 3,000 draws is perfectly fine, which is an indicator that things have gotten markedly better. And if we go to the MCMC output... wow, this looks much, much better. Alpha is mixing correctly and there's no problem in the autocorrelation. The true constant is three, so alpha should be centered on three, and you can see it's not quite centered there, but this is a fairly small sample. Is three still inside the 95% credible region? We can check with print(re.sim), and in fact three is just outside the 95% credible region. So this may be a case where we got slightly unlucky, but we're talking about only 20 units with 50 observations per unit, a relatively small panel, which might be responsible for the relatively poor estimate of the alpha parameter. Beta is doing a little better: it's centered on two, which is what it should be, correlations look fine, and mixing looks fine. Tau.RE, the precision of the distribution of random effects, looks like it's centered on about 0.15, and its autocorrelation looks good too. So: much better performance out of our model from a relatively small change in how the program was written. Something to keep in mind if you ever see convergence problems; don't just attack them with more iterations or more chains, think about whether you can improve your program.

Now let's look at the plot of sqrt(1/tau.RE). That's sigma, the standard deviation of the random effects distribution. It's telling us that the random effects are distributed around a mean of zero (or around the constant, if you want to fold that in), with a standard deviation somewhere between two and three. And in fact, when we generated the data we set the unit effect standard deviation to three, so this is well within what we would expect. A short snippet for that transformation follows at the end of this discussion.

So random effects models work well in a Bayesian framework. In fact, mathematically speaking, they're even easier to think about in the Bayesian framework than in the frequentist or maximum likelihood framework. In maximum likelihood, computing a random effects model is somewhat complicated because you have to integrate the random effects out as part of the likelihood maximization. In the Bayesian setup, it's just a matter of adding another parameter, the random effects parameter, and estimating it on the fly along with everything else. That's a relatively easy and straightforward thing to do, and it makes random effects a very natural way to proceed when you're doing Bayesian analysis in a panel framework.
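Here's that snippet: one way to turn the saved precision draws into the standard deviation plot, assuming the draws are stored as in the earlier sketch and that the column is labeled tau.RE (both assumptions on my part).

    # sigma = sqrt(1 / tau): convert precision draws to a standard deviation.
    draws <- re.sim$sims.matrix
    sigma.re <- sqrt(1 / draws[, "tau.RE"])             # column name assumed
    plot(density(sigma.re), main = "Random-effects SD") # posterior of sigma
    quantile(sigma.re, c(0.025, 0.975))                 # 95% credible interval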
The last thing I want to show you today is a function from the MCMCpack library that I showed you earlier. MCMCpack has a nice little function called MCMCmetrop1R, which allows you to sample, using a Metropolis-Hastings algorithm, from any arbitrary posterior function that you specify yourself. So I want to set up a little problem where I generate some data out of the Cauchy distribution and then use MCMCmetrop1R to recover the location and scale parameters, which are unknown. I'm drawing a dataset of size 50: x is the predictor, and y, the dependent variable, is distributed Cauchy with location (not mean; the Cauchy doesn't have one) equal to 2x + 1 and scale 0.5. Then I load that into a data frame called dat. What I want to estimate is a Cauchy-distributed model; it's like a regression, except that instead of assuming the regression errors are normally distributed, I assume they're Cauchy distributed.

In a Bayesian analysis I need to specify the likelihood and the prior, combine them into the posterior, and sample out of the posterior. So first I write a likelihood function using the dcauchy density. This cauchy.like function takes three arguments, the beta vector, y, and x; in other words, the parameters and the data. The first length(b) - 1 elements of the b vector are the beta coefficients on x, and the very last element is gamma, the estimated scale of the Cauchy distribution. I calculate the Cauchy density for every observation, which gives me a vector of densities, and then I compute the log likelihood. First, though, I take the maximum of each density and a very, very small number. The reason is that if a density is zero, or very close to zero, its log is negative infinity, and I don't want infinities creeping into my likelihood; so I force every likelihood contribution to be at least a tiny positive number before taking the log. Then I log all of those contributions and add them up. As you may recall from your maximum likelihood class, with raw likelihoods you multiply across observations to get the joint likelihood of the data; with log likelihoods you sum them, because the log of a product is the sum of the logs. A sketch of this setup appears just below.

So there's my likelihood function. For my prior: I've got three parameters in this case, so I've written a custom prior. The first piece is the prior on the constant and the second is the prior on the coefficient on x; going back to our location of 2x + 1, we need a prior for the constant and a prior for the coefficient on x.
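Here's a sketch of the data setup and the likelihood function as just described. The function name, the seed, and the exact floor constant are my guesses rather than the on-screen code.

    set.seed(90210)                        # seed assumed, just for reproducibility
    x <- rnorm(50)
    y <- rcauchy(50, location = 2 * x + 1, scale = 0.5)
    dat <- data.frame(y, x)

    cauchy.like <- function(b, y, X) {
      k <- length(b)
      gamma <- b[k]                        # last element of b: the Cauchy scale
      mu <- X %*% b[1:(k - 1)]             # location: the linear predictor
      out <- dcauchy(y, location = mu, scale = gamma)
      out <- pmax(out, 1e-300)             # floor the densities so log() never hits -Inf
      sum(log(out))                        # log of a product is the sum of the logs
    }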
I'm specifying normal priors with mean zero and standard deviation three, so reasonably diffuse priors. For the third parameter, gamma, I'm saying that as long as gamma is bigger than zero, the prior is given by an inverse gamma with shape 0.001 and scale 5. That's a reasonably diffuse prior, and you can see what it looks like: I create an x as a sequence from 0.01 to 5 by 0.01, set y to the inverse gamma density of x (oh, I've got to load the MCMCpack library first; there we go), and plot y against x. There's what our prior looks like. Actually, I'm not super enthused about that; let me turn the scale down a little, to three, and see what that looks like. Yes, it's moving mass a little closer to zero, which I like better, so I'm going to set the scale at three. Now again, I take the maximum of whatever prior density I compute and a very small number, because I need to log these priors and then add them together, and I can't take the log of zero; the log of zero is negative infinity.

So there's my prior. My posterior is ordinarily my likelihood times my prior, but I've got a log likelihood and a log prior, and the log of a product of two numbers is the sum of their logs; so instead of multiplying them together, I add them. And I'm going to add a return(out) statement just to make that syntactically unambiguous. R would actually do the right thing there anyway, but I want to make sure.

Now I'm ready to simulate. MCMCmetrop1R samples from a user-written R function, which is its first argument; here, that function is cauchy.posterior. I need to give it initial values for the parameter vector it's sampling, and I'm just feeding it a vector of ones. I also have to pass along any arguments the posterior function takes, in this case y and X. Y is just the y data already in memory. X is x, but with a column of ones bound on: I cbind together 1 and x, because I'm estimating two things, a coefficient on the constant and a coefficient on x, and I need those two columns bound together. (Looks like I have to regenerate x first, because I overwrote it a moment ago when plotting the prior. Let's do that again. Now cbind(1, x): yep, that looks right.) Just remember to add in the constant whenever you're writing these user-specified functions. I'm setting a burn-in of 500 (play around with that if you like) and taking 20,000 draws. MCMCmetrop1R is reasonably fast, because the underlying sampler is compiled C++ code. And it's done: the results are stored in cauchy.sim, and it tells me that the Metropolis acceptance rate was about 50%. You typically want a Metropolis acceptance rate between 40% and 60%, so that's the good news. The whole pipeline, prior through sampling call, is sketched below.
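Here's that sketch, collected in one place: the prior, the posterior, and the sampling call. Function names are assumed, and I've added an explicit guard against non-positive scale proposals, since dcauchy returns NaN for a negative scale; the narration handles that case only through the prior.

    library(MCMCpack)   # provides dinvgamma() and MCMCmetrop1R()

    # Eyeball the prior on gamma before committing to it.
    xs <- seq(0.01, 5, by = 0.01)   # using xs avoids overwriting the data x, the mishap noted above
    plot(xs, dinvgamma(xs, shape = 0.001, scale = 3), type = "l")

    cauchy.prior <- function(b) {
      k <- length(b)
      p <- dnorm(b[1:(k - 1)], mean = 0, sd = 3)       # diffuse normals on the coefficients
      g <- if (b[k] > 0) dinvgamma(b[k], shape = 0.001, scale = 3) else 0
      sum(log(pmax(c(p, g), 1e-300)))                  # floor before logging, as in the likelihood
    }

    cauchy.posterior <- function(b, y, X) {
      if (b[length(b)] <= 0) return(-1e10)             # added guard: reject negative scales outright
      out <- cauchy.like(b, y, X) + cauchy.prior(b)    # log posterior = log likelihood + log prior
      return(out)
    }

    cauchy.sim <- MCMCmetrop1R(cauchy.posterior, theta.init = rep(1, 3),
                               y = dat$y, X = cbind(1, dat$x),
                               burnin = 500, mcmc = 20000)
    plot(cauchy.sim)      # traces and densities, as discussed next
    summary(cauchy.sim)   # the "pseudo regression table"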
Now if I plot cauchy.sim, it shows me the traces of my chains and the densities I got out of them. The traces look pretty good; we're getting pretty decent mixing. Variable one is the constant, variable two is the coefficient on x, and the third is the scale parameter. Going back to how we generated the data: we're looking for a constant of about one, and that looks pretty good; a coefficient on x of about two, and that looks pretty good too, not exactly the posterior mode but close; and a scale of about one half, and our scale is about one half. So the model is accurately recovering the things we wanted to recover. I can also print out a sort of pseudo regression table for my Cauchy model. Here's the posterior mean for the constant, 1.15; it should be one, so that's good. Here's the posterior mean for the coefficient on x, which should be two and is about two; that's good. And here's the scale parameter.

MCMCmetrop1R is a very powerful tool, because it allows you to write whatever posterior function you can imagine, and as long as it's a proper posterior, you can estimate coefficients out of that model. It doesn't have to be pre-programmed anywhere, and you don't have to write your own Gibbs sampler or Metropolis-Hastings sampler to do it; you can specify any likelihood you like. If I recall correctly, I used MCMCmetrop1R in some versions of my Political Analysis piece with Wilmore and Bumamucergy. For reasons of time we ended up just approximating the posterior mode with a maximum likelihood model, because we had to run many, many thousands of iterations, and running Bayesian chains at every iteration would have taken forever; so we shortcut with maximum likelihood methods. But my original forays used MCMCmetrop1R while I was exploring how the likelihood for that particular model behaved, and it was really, really helpful. It's especially helpful for models where no pre-packaged sampler exists, where there isn't even a set of standard distributions that would let you sample from the posterior directly. As long as you can write down the posterior, MCMCmetrop1R can take care of quite a bit of the rest of the Metropolis-Hastings details for you.

All right, that's probably enough for one week. Thanks for watching. Next week we'll start talking about some really deep applied material; in particular, hierarchical modeling using the packages you've learned this week. So I'll see you then.