Hi, this is Dr. Justin Essary, and this is PolySci 506, Bayesian Computational Nonparametric Statistics. This is a bonus lecture on how to use JAGS as your Gibbs sampler instead of WinBUGS or OpenBUGS. You may remember from my last lecture that we talked about using various pieces of software, such as WinBUGS, OpenBUGS, or MCMCpack, as your Gibbs sampler when you're estimating some kind of model with Bayesian methods. However, WinBUGS and OpenBUGS are not available on Macintosh systems, at least not without a good deal of extra work, such as installing an emulator or maybe even installing Windows on your Mac. So I decided to create this little tutorial, using the examples from last week's lecture, showing you how to use JAGS if you'd like to. It turns out that, at least in R, the use of JAGS is largely interchangeable with the use of WinBUGS or OpenBUGS. The bug script files that you write for WinBUGS or OpenBUGS can in most cases be used without any modification in JAGS. There are some syntactic differences, but at least for the kinds of models we're running, it'll be roughly equivalent to run these models in JAGS instead. So what you want to do is just Google JAGS, and you'll get Jersey Girls Area Soccer and the Jacksonville Jaguars and Junior Amateur Golf Scholars, but you'll also get Just Another Gibbs Sampler, the SourceForge page here. Just Another Gibbs Sampler is just another Gibbs sampler. You want to download it and install it on your system using the various links here. The Windows page where I got mine is here at the SourceForge page, and it looks like there are binary packages for OS X too. This JAGS-3.2.0.exe at the top is the link for the Windows version, so if you have a Windows PC, you can just download that. Otherwise, you're going to want to go into JAGS-3.x, go into Mac OS X here, and the latest version should be right here, the JAGS-3.1.0.dmg file.
Right there, here we go, the JAGS-3.1.0.dmg. That looks like it's slightly behind the very latest version of the Windows port, 3.2.0, but I'm sure it'll be close enough. So on a Windows PC, you just download the EXE file, double-click it, it more or less installs itself, and you're ready to rock and roll. Moving on to R here, what I'm going to do is run two examples, one with the regression data set, just like last time, and one with the random effects data set, again just like last time, and I'm going to do both of those using a standard JAGS-style model. Then I'm going to do a third example where we look at a parallelized way of running a Gibbs sampler. One nice thing about JAGS is that it allows you to run multiple chains at the same time on a PC that has more than one central processing unit or more than one core. For example, the computer I'm recording this lecture on has 12 cores, so in principle I can run 12 Markov chains at the same time and get many, many more updates, and many more samples from my target distribution, in a much shorter time. That's going to become really important when you run a very complicated model and want a good number of samples in a reasonable time. It's nice to be able to use the full power of your PC, and JAGS allows you to do that. So we'll start with the simple examples and then move on to that parallelized example. All right, I've cleared everything from R's memory, and I'm just going to load in the same regression data set that we used in lecture four. This is a very simple linear relationship between Y and X, where Y is three plus two X plus a randomly distributed error term that's distributed according to the normal with mean zero and standard deviation of two. If I do a plot of Y against X, you can see what the data look like: generally speaking, an upward-sloping line with a lot of noise around it.
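For reference, data of the kind described above can be simulated in a few lines of R. This is just a sketch; the object names N, x, and y are my guesses at what the lecture script uses, and the seed and range of x are illustrative.

```r
# Simulate the lecture-four regression data: y = 3 + 2x + e, with e ~ N(0, sd = 2).
# N, x, and y are illustrative names; the original lecture script may differ.
set.seed(1234)
N <- 50                                  # 50 observations, as in the lecture
x <- runif(N, 0, 10)
y <- 3 + 2 * x + rnorm(N, mean = 0, sd = 2)
plot(y ~ x)                              # an upward-sloping cloud with a lot of noise
```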
What I need to do is load in the packages I'm going to use. As before, I'm going to load the arm package and the coda package, which lets me do some post-hoc analysis on Markov chains, such as the ones we're going to generate with our Gibbs sampler, JAGS. I also need to load the rjags library, which enables us to call the JAGS system from R, just like BRugs allows us to call WinBUGS and OpenBUGS from R. This is exactly the same model I ran last time, except there are subtle differences in how I need to specify some of the arguments. For example, last time when I specified the data argument for the model, I just gave it a list of "N", "Y", and "X" in quotation marks. Now I need to actually say N equals N: I'm setting the element named N equal to the numerical object N, which should be 50 in this case, because I've got 50 observations, and setting the element named Y equal to Y, which is the Y data I have, and so on. If I look at the data list now, it's a named list of numbers. That's the data argument I give JAGS, whereas I would give OpenBUGS or WinBUGS just a list of names; all it needs is the names, because it can pull the relevant objects from R's memory. The inits argument looks pretty much the same: I'm going to give it a list of alpha, beta, and tau.y. And I'm going to specify the names of the parameters I'd like JAGS to keep track of, just like for OpenBUGS. I'm using the exact same example1.bug file. Let's open that up and take a look, so you can see it's exactly the same bug file I was using last week. Here's that bug file, and you can see this is line-for-line the same bug file I used to run a regression in WinBUGS and OpenBUGS. There's no difference, so if you understood the lecture from last time, you'll understand this bug file. Now what I want to do is create a JAGS model and update it. For JAGS, this is a two-step process. First I'm going to create the model with jags.model.
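The named-list difference described above might look something like this sketch. The list names N, y, and x and the parameter names alpha, beta, and tau.y follow the lecture; the specific initial values and the object names jags.data and jags.inits are my own illustrative choices.

```r
library(rjags)   # interface from R to the JAGS sampler
library(coda)    # post-hoc chain diagnostics

# rjags wants the data themselves in a named list, unlike BRugs,
# which takes a character vector of names and looks the objects
# up in R's memory.
jags.data  <- list(N = N, y = y, x = x)

# Initial values and monitored parameters, matching example1.bug.
jags.inits <- list(alpha = 0, beta = 0, tau.y = 1)  # illustrative starting values
parameters <- c("alpha", "beta", "tau.y")
```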
I'm going to specify the bug file that corresponds to this particular model, file equals "example1.bug", give it the data, the initial values, the number of chains I want to run, and this n.adapt argument, which has to do with an adaptive sampling phase JAGS uses; its default is a thousand, and I just leave it at the default. So I'm going to create this JAGS model, and if everything works right, it reports no errors, compiles the model, and it's ready to rock and roll. The next thing I want to do is burn in this model with a thousand samples, so I use the update command: I tell it I want to update the reg.jags model, and the number of iterations I'd like to compute, which in this case is a thousand. I do that, and it computes almost instantaneously. Finally, I want to draw the samples I'm actually going to analyze and use, and to do that I call the function coda.samples. coda.samples wants to know what model I want to run, so reg.jags; what variable names to keep track of, which I specified up here in the parameters argument; and the length of the chain I'd like to draw, which in this case is going to be 5,000. So if I do this, and again it updates almost instantaneously, it now has 5,000 samples from the regression model. And just like before, I can run my Geweke diagnostics. Those look pretty fine, no statistically significant Z scores there. The Heidelberger-Welch diagnostic is passing everything. The Raftery diagnostic, everything's working pretty much like before, is telling me I'm going to need a bigger sample, about 13,000 or 14,000, so I'm going to up the chain length to 15,000. That's pretty quick; run the Raftery diagnostic again, and yeah, pretty good, so I'll go with that. I can do a summary of regression.sim, just like I did before, and you can see it's computing, based on these samples, that alpha is about 2.5, beta is 2.8, and tau.y is 0.25.
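The whole two-step (really three-step) recipe above can be sketched as follows. The object names reg.jags and regression.sim come from the lecture; jags.data, jags.inits, and parameters are the assumed objects from the setup step.

```r
# Step 1: compile the model. n.adapt = 1000 is the adaptation phase
# (also the default).
reg.jags <- jags.model(file = "example1.bug",
                       data = jags.data, inits = jags.inits,
                       n.chains = 1, n.adapt = 1000)

# Step 2: burn in a thousand iterations.
update(reg.jags, n.iter = 1000)

# Step 3: draw the samples to analyze, returned as a coda mcmc.list.
# The lecture starts at 5,000 and bumps it to 15,000 after Raftery's diagnostic.
regression.sim <- coda.samples(model = reg.jags,
                               variable.names = parameters,
                               n.iter = 15000)

# Convergence diagnostics from coda:
geweke.diag(regression.sim)    # Z scores should not be significant
heidel.diag(regression.sim)    # Heidelberger-Welch stationarity tests
raftery.diag(regression.sim)   # suggests the required chain length
summary(regression.sim)
```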
That's pretty close to the actual values, which are 3, 2, and, I think, yes, 0.25 is what tau.y should be in this case. So we're getting pretty close to the right answer. I can use the mcmcplots library, just like I did last time, and it will create an MCMC output HTML file that I can open. If I go to the current working directory, I should be able to find that. All right, here's my current working directory, and you can see the MCMC output HTML is in there. Here are my chain diagnostics; everything looks pretty good. The autocorrelation plots look pretty reasonable, maybe could use a bit of thinning, but not really much of a problem, and my posterior marginal densities look pretty much fine. I can also call this denplot command, which lets me plot the marginal densities for alpha, beta, and tau.y directly in the R window, which you can see right here. denplot also works on WinBUGS objects and OpenBUGS objects, if you'd like to use it that way. And there's also this caterplot function, which creates nice credible intervals for the parameters you specify, in this case alpha and beta, and it gives intervals at two different levels. I believe they're the 50% and 95% levels; actually, let me check that in the help file. Where is it? Quantiles. Yeah, 95% and 68% credible intervals: it's actually not 50%, it's one and two standard deviations. So there we go, 95% and 68% credible intervals for alpha and beta. And I can put in the zero line if I'm interested in rejecting zero as a credible belief after this regression, which it is not. So everything's pretty much the same in JAGS as it is in BUGS, except for these couple of different commands.
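The mcmcplots calls mentioned above look roughly like this sketch; regression.sim is the mcmc.list drawn earlier, and the base-graphics zero line at the end is one possible way to mark zero on the horizontal caterplot axis, not necessarily how the lecture does it.

```r
library(mcmcplots)

mcmcplot(regression.sim)   # writes an HTML report of trace, autocorrelation,
                           # and density plots to the working directory

denplot(regression.sim)    # marginal densities for alpha, beta, tau.y in R itself

# Caterpillar plot: thick lines are 68% (one s.d.) and thin lines
# 95% (two s.d.) credible intervals.
caterplot(regression.sim, parms = c("alpha", "beta"))
abline(v = 0)              # zero reference line, to check whether zero is credible
```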
There are, I should say, also some syntactic differences in the way the bug file is written for certain models. But for most of the models we'll be doing in this class, those differences are not going to come up, so it's pretty much the case that the examples I run in WinBUGS and OpenBUGS will work in JAGS just fine. All right. Now let's do the example with unit heterogeneity, the random effects example from last week. I'm basically going to run the exact model I did before. I've made the changes to the appropriate model initialization bits here, just like I did for the regression model. The only difference is that because this is a somewhat more complicated model, you're going to notice it takes a little longer to run. In particular, instead of one chain, I'm going to initialize four chains, which you can see I'm setting up in my JAGS model. Then when I use the burn-in and coda.samples, I'm going to tell it to update all four of those chains. This is going to take a little longer because it's going to run four chains, but it's not going to run them at the same time; it's going to run one chain, then the next chain, then the next chain, then the next chain. So really what I'm doing is running 80,000 samples in sequence. All right, so let's go ahead and clear out the old stuff and load in the new data. This is exactly the same data we used last week. If you just do a head on this data object here, you can see there's a Y and X relationship, and there's also this unit variable, which tells us what unit we've observed. We've got an N of 20 and a T of 50, so we've observed 20 units 50 times in this data set. And the unit heterogeneity is there, but it's a random effect.
So the unit heterogeneity has mean zero and is distributed with a standard deviation of three around that zero mean. Now I'm going to go ahead and load my model into JAGS. I've got that all done; you can see I'm using example3.bug, just like I did in the lecture four example. Now I'm going to update a thousand iterations for my chains, of which I've got four, and then run 20,000 simulations of each chain, for a total of 80,000 simulations. That's a lot of simulations; you can tell it's taking my computer a little while to crank through all of these chains. It's not actually too long. This is a simple enough model that it's not unreasonable to do it in sequence like this, but for really complicated models this can end up taking a very, very long time. This is why eventually we're going to want to parallelize this process, so that instead of running these four chains one after the other, we can run all four of them, or even more, at the same time. So how do my diagnostics look? Well, my Geweke diagnostics on all four of my chains look pretty good. My Heidelberger-Welch, oops, it can't coerce more than one chain, so I've got an issue with that diagnostic. What do I want to do there? What I want to do is remember that this is not a BUGS object, so I actually don't have to coerce it to MCMC; it's already an MCMC object. There we go. It looks like I'm passing all the tests; hmm, that's kind of weird, the stationarity test failed for one chain of the beta. So we might possibly want to go back and run that simulation again. Oh, and again we can't coerce, because we don't need to; it's already an mcmc.list. If we run the Raftery diagnostic here, it's telling us we need chains of 13,000, which is actually perfectly reasonable.
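The sequential four-chain run described above follows the same recipe as the regression example. In this sketch, re.jags and re.sim are my guesses at the lecture's object names for the random effects model, and jags.data, jags.inits, and parameters stand in for the corresponding objects set up for example3.bug.

```r
# Same two-step recipe, but with four chains run one after the other.
re.jags <- jags.model(file = "example3.bug",
                      data = jags.data, inits = jags.inits,
                      n.chains = 4, n.adapt = 1000)

update(re.jags, n.iter = 1000)   # burn-in applies to all four chains

# 20,000 draws per chain x 4 chains = 80,000 draws, computed in sequence.
re.sim <- coda.samples(model = re.jags,
                       variable.names = parameters,
                       n.iter = 20000)

# coda's diagnostics already handle an mcmc.list chain by chain;
# no coercion with as.mcmc() is needed (or possible).
geweke.diag(re.sim)
heidel.diag(re.sim)
raftery.diag(re.sim)
```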
So I failed one of these tests up here, the Heidelberger-Welch test, but all my other tests were fine, so probably it's fine. If I do a denplot of re.sim, what you see is that it plots each chain separately in a different color, and it's really reassuring in this case that all four of the chains are hitting the same marginal density; the density for alpha, for example, looks the same in every chain. We can treat all four chains as one big long chain by adding collapse equals TRUE to the denplot, and then it just glues the chains together and looks at the posterior marginal density for each parameter separately, and you can see that in this case each of the chains looks alike and together they look alike as well. Our plot here shows that, look at that, beta is about two, and alpha is somewhere between three and five and a half. We know that in fact it's, where is it here? It's three. So that's interesting: the true value is right at the border of the 95% credible interval, so we're actually not getting a terribly great estimate of alpha in this particular example, but we are getting a very accurate estimate of beta. Okay, so that's all there is to using JAGS, at least for these simple examples. The next thing I want to show you is how to use JAGS to run multiple Markov chains in parallel, and to do that we're going to need to load a couple of new libraries, so let's take a look at that. In order to do this parallel updating of Markov chains in JAGS, I need two packages, the dclone package and the snow package. The dclone package has special commands in it that allow me to do parallel updating of JAGS models. The snow package allows me to set up a cluster, which is to say a bunch of subsidiary R processes, each one running on a different CPU on my computer, that I can send programs to and have them execute independently.
You can load these with the usual install.packages command if you don't already have them; I already have them installed. The first thing I'm going to do is load the lecuyer module into JAGS, which tells it to use the L'Ecuyer random number generator. It turns out that having random numbers generated on multiple machines at the same time can be somewhat complicated, because you don't want to just use the same numbers everywhere, in which case you get the same results, and you also don't want to do naive pseudo-random number generation, because you can get overlaps, where the streams start repeating the same numbers after a certain point, or the streams can be correlated with each other; all kinds of crazy things can happen. The L'Ecuyer random number generator is designed to handle random number generation in a parallel setting. The next thing I'm going to do is make a cluster of 10 machines, or actually 10 CPUs in one machine, my machine, and call it cl. The makeCluster command, which is a command in snow, does this: 10 tells it I want 10 workers, and type equals "SOCK" means it's going to set up the simplest kind of cluster, the kind that communicates with itself inside the same PC using simple socket connections. This wouldn't work if, for example, you had five PCs, all different boxes, networked to each other, and you wanted to use them as a supercomputer; then you'd have to load some other package and use a much more complicated parallelization framework. But since I've got a PC with more than one processor in it, all in the same box, the socket-type parallelization works fine.
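The setup steps just described might look like this sketch, with cl as the cluster object named in the lecture:

```r
library(dclone)   # parallel-updating commands for JAGS models
library(snow)     # sets up the cluster of worker R processes

# Switch JAGS to the L'Ecuyer RNG, which is designed to give
# independent random number streams across parallel workers.
load.module("lecuyer")

# Ten workers on this one machine, communicating over simple sockets.
cl <- makeCluster(10, type = "SOCK")
```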
Most of your PCs probably have four or even eight CPUs in them, or if you have a laptop maybe two, and why not take advantage of all of them? Just set the cluster size equal to the number of CPUs you have, maybe minus one if you've got something else going on in the background. This parLoadModule call tells R to load the lecuyer module in each one of the workers; I need each worker to load the lecuyer module into its own memory in order to use the right random number generator. The rest of this, the data and inits and parameters, is the same stuff I loaded before; in fact it's probably still in memory, because I still have the data in memory. This is the same basic problem I had before. One thing I have to do differently is that I now have to generate parallel initial values in addition to the regular JAGS initial values. All you need to do is call parallel.inits, feed in the JAGS inits you have, and tell it how many chains you want; in this case, I actually don't want 12, I want 10. I'm going to save those initial values in re.par.inits, and then feed re.par.inits to my JAGS model, which I'm constructing here with parJagsModel. I have to give it the cluster I'm using and the name this model is going to be called. Notice I'm actually not assigning this to a name with an arrow the way I usually do; I'm putting the name inside the parJagsModel command. That's a peculiarity of this command; it just is the way it is. Here's the bug file I'm using, example3.bug, the data, the inits, which are these parallel inits I created; tell it how many chains you want to run, which is 10; and this n.adapt argument I keep at the default of a thousand. If I do this right and hit return, this list of 10 things it returns is each one of the workers telling me that there were no errors in executing the command I just gave it.
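Those steps, parallel inits and the model constructed by name on the workers, can be sketched like this. re.par.inits and the model name par.re.jags follow the lecture's naming as best I can reconstruct it; jags.data and jags.inits are the assumed data and inits objects.

```r
# Each worker needs the L'Ecuyer module loaded into its own JAGS session.
parLoadModule(cl, "lecuyer")

# One init list per chain, each with its own independent RNG stream.
re.par.inits <- parallel.inits(jags.inits, n.chains = 10)

# Unusually, the model is not assigned with <- ; its name is passed
# as an argument, and the model object lives on the workers under that name.
parJagsModel(cl, name = "par.re.jags", file = "example3.bug",
             data = jags.data, inits = re.par.inits,
             n.chains = 10, n.adapt = 1000)
```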
So I'm going to use parUpdate instead of update to burn in a thousand simulations on each chain. Then I'm going to use parCodaSamples to generate the samples I'm actually going to use. You can see it's the same as coda.samples up here; it's just got a par in front, because there's additional overhead with doing the parallelization. And instead of running 80,000 updates as 20,000 each in four chains, I'm going to do 80,000 updates as 8,000 each in 10 chains, and each one of the 10 chains is going to run in parallel. Now watch how quickly it's able to do this. Done already. That was much, much faster than before, and I have the same number of samples as when I ran four chains one after the other. So for a problem that takes a really long time, that may take a day to run, if you can cut that run time in half, or a quarter, or an eighth, by running eight, ten, or twelve processors, or even as few as four processors instead of one, you're going to get some real advantages in performance in estimating some of these models. If you look at what I've got out of this, here's the head of par.re.sim: I've got 10 chains starting at a thousand and one, because I burned in a thousand samples, and I've got my four variables. And if I do a density plot on these, look at that, it looks the same. All of them pretty much look the same; I've got 10 different chains and they're all pretty much overlapping. If I collapse them into one gigantic chain, yeah, it looks pretty fine; nothing crazy seems to be going on. My tau.y is at 0.25, my beta is at 2, and my alpha is at 4, which is a little high, since three is what it's supposed to be in truth, but nothing really that different from my initial JAGS model. So that's really cool and pretty easy: you can run JAGS models in parallel without a whole lot of hassle and really get some additional speedup in your sampling time.
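The parallel burn-in and sampling just described can be sketched as follows, again with par.re.sim and parameters as the assumed object names:

```r
# Burn in 1,000 iterations on every worker simultaneously.
parUpdate(cl, "par.re.jags", n.iter = 1000)

# 8,000 draws per chain x 10 parallel chains = 80,000 draws total,
# in roughly the wall-clock time of one 8,000-draw chain.
par.re.sim <- parCodaSamples(cl, model = "par.re.jags",
                             variable.names = parameters,
                             n.iter = 8000)

denplot(par.re.sim)                   # ten overlapping marginal densities
denplot(par.re.sim, collapse = TRUE)  # all chains glued into one

stopCluster(cl)                       # shut the workers down when finished
```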
Remember, once you're done with your cluster, use the stopCluster command to turn it off. That will delete it from memory and keep you from having 20 or 40 zombie programs running in the background, eating up all your RAM and taking your precious CPU cycles, though probably mostly eating your RAM. Anyway, that's it. I hope that helps, for you Mac users out there especially, or for those of you interested in running your chains in parallel. And with that, I'll see you later with the regular lecture for this week.