Sorry, everybody, for the delay. I'm not entirely sure what happened, and apologies for any background barking. I'm staying with my mom and she has many dogs. They're all delightful. Quite noisy. Ah, cool. There's a message from the Arkansas team. Thank you for your patience. I apologize for the delay. Cool. Wow, such a full house. So nice to see everybody. Awesome. Cool. I think we should get started since we're a little bit late.

My name is Astrid. Welcome to the session. I would love to know where everybody is joining from. I'm joining from South Africa. If you can't tell, that's where my accent is from. I'm from Johannesburg. So if you feel comfortable, please pop your location in the chat, the city that you're joining from. USA. Portland, Oregon, cool. Charlottesville, Florida, Michigan, Los Angeles. New York City is one of my favorite places in the whole world. Ah, Istanbul. Hello. Awesome. From New Zealand, kind of in London. Sick. Cool. So everybody seems to be experiencing morning right now. It's about 5pm here. Switzerland. Awesome. Chicago. Chicago is the only musical I like. Cool. Welcome everybody. Is there anybody from Africa on the call? Am I the only person in Africa right now? I would love to know. London, Sri Lanka. Awesome. Cool, cool, cool. Yay.

Okay. Let's get the show on the road. I have a few slides and I've got some exercises for us to work through. We have a virtual machine, which is running RStudio, which people will be able to log in to. From Tanzania! Ha ha. Yay. Nice to see you. Nice to see you, Lucas, on the call. Cool. Yeah, so we have a virtual machine that's running RStudio on Posit Workbench. For those of you who don't know, RStudio has gone through a rebranding and they are now called Posit. And this is really about the sort of love story between R and Python. So RStudio Workbench is now called Posit Workbench. And this hosts RStudio in the cloud as well as VS Code and Jupyter Notebooks.
So we'll be working on this Posit Workbench platform today, but we'll be using the RStudio IDE, cloud-based. "UCT alumnus." Yes! That's awesome. Nice to have another UCT person on the call. Sick. Yeah. So we'll be working in the RStudio IDE on a virtual machine. If you would like to just use your own local installation of RStudio, you can do that. Pretty much all the activities today are not reliant on the code being shared, although it will make your life a little bit easier to work in the cloud with me. Either way, that's fine. So I'm going to share that link in a bit and we can all get logged in.

Okay, let's get started. Let me share my screen. I would love to also hear in the chat what you're using R for currently, what are some of the tasks that you're writing code to be able to do. So go ahead and drop that in the chat while I get set up here. What industries are you from? Can you guys see a slide deck that says "renv for reproducible research"? Yes. Thanks, Shirley. Epidemiological research, and pharma. R for meta-analysis, lab data, clinical micro, awesome, epigenetics. Sick. Clinical research. "R for everything." Love that. Awesome. Cleaning messy data. Neuropsychology, occupational health. Wow, guys, it's been so long since I've been in a research environment, so it's so cool to hear what people are doing. I did a PhD in molecular biology on these really cool plants called resurrection plants. Yeah, and I left research a few years ago. It's really nice to get the highlights reel of research without the suffering. Awesome. Cool.

Okay, let me get into this. So we're going to be talking about renv for reproducible research. In the chat, can you also just pop in there if you've ever worked with version control? If you haven't worked with any version control, so Git or Subversion, can you just pop in the chat the word "no". Yes, Git. Git, awesome.
If you haven't worked with it, please let me know as well. I just want to figure out how much emphasis to put on version control. "No." Okay, there are a few people who have not. "My type of version control: document date." Nice. It's something. It's not nothing. Cool. Awesome.

Okay, so a little bit about me and about the company I'm working for. I work for Jumping Rivers. They're a UK-based consultancy, and we have two basic arms of the business, consulting and training, but we also work a little bit in infrastructure. I'm actually on the data engineering team at Jumping Rivers. I'm the token data scientist on the data engineering team, and I help the data engineers interpret what data scientists need from infrastructure. We actually deploy Posit infrastructure, so the Posit Workbench that we'll be working in today, which runs RStudio and VS Code, and other Posit products are things that we deploy in the data engineering team. So we do all things data science, end to end, from the infrastructure side all the way through to training, writing code, writing Shiny apps, and maintaining these products. So yeah, full-stack data science consultancy. Super nice people, really recommend checking us out. We're also sponsoring this event and we're going to be at the job fair. And if you kind of like what you see, then maybe pop us a message.

Okay, so the outline for today's workshop. In case you missed it: what are RStudio projects? The renv concept for reproducibility really builds on top of projects. If you've never worked with a project before, it's kind of tricky to understand why renv is a step up from just an RStudio project. So we'll talk about projects and some of the limitations of just organizing the code into projects. We'll talk about how renv can help with this, and how it works. And then we'll create our first renv project.
We'll also be reconstituting a project with renv. A big component of this new way of working with renv in an isolated environment is about sharing: how do we share our code in a way that makes sense to other people, how do we make it usable for other people, and how do we prevent issues arising? So we will be reconstituting a project. We'll be looking at how renv handles package repositories. Most R users don't even know that there are different repositories where you can access your packages from. Most people just know about CRAN and they're like, "Cool, it comes from CRAN, seems legit, keep going." So we'll be looking into how to handle package repositories from other sources, and we'll do a little activity around handling these package repositories. We'll finish off with some handy renv commands, and then, if we have time, I would really like an open discussion about using renv in your organization. With these little techniques and workflows that we touch on today, it's really nice to get together and brainstorm how we can actually use them in the work that we do out in the real world, with the real data and the real issues and the real environments and the real messiness. So it'd be really cool to brainstorm with some of you and hear how you would integrate this into your own work.

Okay, so in case you missed it: what are RStudio projects? Projects are really the foundation of best practice in R, especially for analysis. Converting a collection of files into a project, or starting a new project, has the following effect. It sets the working directory for that project. So whatever folder you're doing the work in becomes where R reads things from and writes things to; any outputs that it generates are going to go and live in that working directory.
So this is really nice because you don't have to go and setwd() to wherever you're working; it just sets it for you automatically. That's a really nice thing about projects. You can also manage environment variables and objects on the project level rather than on the system level, so you can start to make those environment variables specific to a piece of work. To define an environment variable for a piece of work, you don't have to set it globally; you set it once for the project, and then it's there.

Projects really shine with Git in the RStudio IDE: if you've set your work up as a project, you can control your Git processes within the graphical user interface of RStudio, and that's so helpful. We'll touch on a little bit of this today. We're not going to do a full Git flow exercise. We're just going to clone a remote repository into our session. We're not going to be pushing any code anywhere; we're just going to be pulling code in. And I'll show you a little bit of the layout of using Git in RStudio, especially since some of the humans on the call said that they've never worked with Git in RStudio before. So don't stress, it'll be a very light introduction to Git. But I think it's nice to get a little bit of exposure to what's possible within the RStudio IDE.

Okay, so we're here to talk about reproducibility. Projects are great, but they can only get us so far. Imagine that you have analysis files that you worked on five years ago, and you want to reconstitute this analysis today. There are a few things to consider when you're trying to do this. The first is: what R version was I using? I might have been using R 3.6 five years ago. Was 3.6 even out five years ago? I don't know.
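For the "what R version was I using?" question, one low-tech stopgap is simply to write the versions down next to the analysis. The following is a minimal base R sketch of that idea, not something shown in the session; the package list and the output file name are illustrative assumptions.

```r
# Record the R version and selected package versions alongside an analysis.
# 'pkgs' and the output file name are illustrative choices, not part of renv.
pkgs <- c("stats", "utils")  # base packages, so this sketch runs anywhere

versions <- data.frame(
  package = pkgs,
  version = vapply(pkgs, function(p) as.character(packageVersion(p)), character(1)),
  row.names = NULL
)

# One line for R itself, then one line per package
record <- c(
  R.version.string,
  paste(versions$package, versions$version)
)

out_file <- file.path(tempdir(), "versions.txt")
writeLines(record, out_file)
print(readLines(out_file))
```

A hand-kept record like this only answers the "what was I using" question; it does not help with reinstalling anything.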
I'm too much of a baby to know these things, but I'm sure that there are some of you on the call who've tucked away some work and revisited it and been like, "Oh my God, what was I using?" So, what R version was I using, and what happens when I update R? Will all my code still run? Some packages only work in certain versions of R, so can I still actually use this code in the same way? The second one is: what packages were used, and what versions were they? "Will this code still run?" is the overarching question. And where did the packages come from? If it's an internally developed package, or a package that's hosted on an alternative repository, and you no longer have access to that package, it'll be impossible to actually do the work.

So, renv seeks to address some of these issues. The first one is dependency management. It helps you manage the package dependencies for your project: it tracks the packages that you used in your project as well as the version numbers of those packages, which is just super sick. Not only do I know it's dplyr, I know it's dplyr version 1.2. Packages change in between updates, and they can actually change considerably, so it's super helpful to know exactly the package version that you used.

Next, isolation. You can create isolated, project-specific libraries. This means that any packages installed or modified within a project don't affect the global R environment or other projects, providing a clean and self-contained environment. So it's basically an isolated space for you to do your work; anything you do will be contained within this isolated space. Normally, when you install an R package, you have a global library where that package gets installed to, and any R code can basically access that library.
But with renv, you're basically creating, on the project level, a mini library: a subset of only the packages that you need for a piece of work, at the version numbers that you need for that piece of work.

Package restoration. renv allows you to restore the package environment for a project. So not only do you know, "Okay, I used these packages at these versions to do this analysis," renv also gives you a way to restore those packages automatically. You don't have to then go and manually install every package at its particular version to get your analysis to run. So that's really helpful.

So let's look at the anatomy of an renv project. The usual suspects are there. We have the .gitignore, if you're using version control. This will be very helpful, because it tells Git, the version control system, which files not to keep track of. And it also has an .Rproj file. This is how you create a project in R: within the RStudio IDE you create a new project, and it creates this .Rproj file, which captures some information about that project, including things like the working directory. So this will be myproject.Rproj. These are things that you would see in a typical R project anyway.

We have new friends. We have a project-level .Rprofile. In the chat, please let me know if you've ever heard of a .Rprofile file, because I don't think most people have. .Rprofile? Yes, we're getting a lot of yeses. Excellent. No. Yeah, okay, there are some nos, some yeses. Thank you guys for being a very responsive chat. I really appreciate this. Thank you. "Don't know what it does or how you remove it." Okay, cool. So the .Rprofile, and we'll talk about this in a bit more detail, is essentially a script that runs as your R session starts.
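To make the idea of a startup script concrete, here is a small sketch of what a project-level .Rprofile might contain. The option name and the message text are made up for illustration; the commented-out line is the one renv itself adds to this file when it is enabled for a project.

```r
# Contents of a hypothetical project-level .Rprofile.
# R runs this file at startup when a session starts in the project directory.

# In an renv project, renv adds this line so the project library is activated:
# source("renv/activate.R")

# Set a project-specific option (the option name is invented for this sketch)
options(myproject.greeting = "Hello! Ready for some R-ing.")

# Print a startup message, but only in interactive sessions
if (interactive()) {
  message(getOption("myproject.greeting"))
}
```

Keeping startup messages behind `interactive()` is a design choice here so that scripted or batch runs of the project stay quiet.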
And so if you want your R session to start by saying "Hello Astrid, you look gorgeous today", you can go and put print("Hello Astrid, you look gorgeous today") in your .Rprofile, and when R starts up it'll print that message for you. I should actually really put that in my profile. You can also set options in the .Rprofile. In the context of renv, we're going to be talking about repositories. So if there are repositories where you want your packages to be downloaded from, you could set that information in the project-level .Rprofile. It's basically a place for bits of code to work with your R projects.

"Are the slides online for this talk? I don't see a link." Stephen, I'm so sorry. I was kind of working on the slides until a few minutes before the talk, as usual. So the slides are not on the website yet, but I will upload them after this. "Consider adding praise::praise() on startup." That is cool. I like that. Thank you.

Yeah, so your .Rprofile enables you to set options for your work. Just super helpful. So there's a project-level .Rprofile. Most people have their .Rprofile at the user level; at the user level I'm telling R that it must do these things for me. But you can also set that on the project level, and then it's contained within your project. That information is isolated to that project, and whatever's in my user-level .Rprofile doesn't actually impact my project-level .Rprofile.

Then there's the renv.lock. We're going to dig into this file in a lot more detail. This is the file that captures things like the R version. It captures what packages are installed. It tells you what packages are used in the code of your project.
So you can have lots of random packages installed, but it will only capture the ones that are actually used. It reads your scripts and grabs the library entries, and it captures the versions of the packages that are imported in your code, as well as the dependencies of those packages. Super helpful. And this is what's used to reconstitute your project.

Then we have the renv folder, which has a few subfolders depending on what you're doing. The library folder has that little mini package library that I was talking about earlier: that subset of the packages that you use in the project is actually going to live in the library folder in your renv folder. Then there's a history, which has the history of the project. There are other folders in here. And then finally there's the activate.R file, which is what's used to reconstitute the project, essentially; it's what renv uses to activate itself.

So the renv lockfile is something I'm going to be talking about a lot, and I highly recommend checking out this link over here. This is to the renv documentation. It's all linked in the slides, which you'll get after this. The documentation is really well written and helpful and great. I know renv is not even version one yet, but they're doing great, sweetie. This is a really nicely documented package. Let me just close my slides again, sorry. Did I go and kill my slides as well somehow? Oh, I did. No, that's annoying. Sorry guys, two seconds. "...does a fantastic job with the documentation of renv, a massive improvement over..." I guess a massive improvement over packrat. Yeah, yeah, it's lit, bro. So, thank you. It's jumped back to my slides. Thank you for keeping the comments going. Thank you, Eric, while I fumble with my slides. Appreciate it.

Okay, so the anatomy of the lockfile. This is what it looks like; this is for just two packages, markdown and mime.
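Since the slide itself is not part of this transcript, here is a hand-written sketch of the general shape of an renv.lock file for those two packages. The R version, package versions, and hashes are placeholders rather than real values, the CRAN URL is only an example, and a real lockfile would typically list further dependencies.

```r
# A hand-written sketch of what an renv.lock file for two packages looks like.
# The R version, package versions, and hashes are placeholders, not real values.
lockfile_sketch <- '{
  "R": {
    "Version": "<R version>",
    "Repositories": [
      { "Name": "CRAN", "URL": "https://cloud.r-project.org" }
    ]
  },
  "Packages": {
    "markdown": {
      "Package": "markdown",
      "Version": "<version>",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "<hash>"
    },
    "mime": {
      "Package": "mime",
      "Version": "<version>",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "<hash>"
    }
  }
}'

# Print the sketch so the JSON structure can be read as it would appear on disk
cat(lockfile_sketch, "\n")
```

The "Repository" value under each package refers back to a "Name" in the "Repositories" list at the top of the file.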
And so you can imagine, if you're working on a project that's quite complex and uses many libraries, this file can actually get pretty long, because it lists the packages and their dependencies. So this is the anatomy of this file. It grabs the R version, and the repositories from which the packages were downloaded. By default, it'll have the CRAN repository, and it links to the CRAN project website, which is where the packages will come from. If you have other repos specified in your user-level .Rprofile, or you set options in the project itself to pull packages from other repositories, this information is captured in the lockfile.

And then there's the Packages field. This is a JSON format. Here we have each package as a label, then the version of the package and where the package was downloaded from. These CRAN labels here correspond to the repositories up here at the top, so if you have other repositories, you can expect to see them listed here. And then there's a hash for the package installation. The hash is calculated from the package, so the package version, the source code, and I think the operating system as well. I'm not sure about the operating system; don't take my word on that. But basically what this hash does is it enables renv to reconstitute those packages exactly the way that they were in the first environment in which this lockfile was generated. So the hash is really important for replicating the environment as you expect. "Does not track." Thanks. Thanks, Mo. Awesome. It's so great to have experienced R people on the call so I can delve into the public knowledge. It's awesome. Thank you.

Okay, so let's create our first renv project. We're going to create this from first principles, very basic. It's not going to be super technical. We've got this training VM.
And I'm going to just show you quickly how to log into it, and I'm going to give you five minutes to log yourself in and grab a username and password. So, I'm going to log myself in. Please don't take my username and password. Please. Yes, I'm going to send you the URL, Martin, but after I demonstrate, because I don't want people to steal my username.

Okay, so we're going to go to the training welcome page. And we've got a password here: date-cloudberry is our password. So you need to enter your email address in the welcome page, grab your username, then head to the training environment login, and then launch a session. I'm going to show you how to do that now.

Okay, so it takes you to the training course authentication portal, and you'll pop in your email address, astrid at jumpingrivers.com. I want to say to you all right now, I'm not going to record these email addresses, so don't be worried about me sending you spam. I'm not going to spam you. I will not keep your email address; it just needs to be a unique identifier for the VM. That does not look like the password. Date, hyphen, cloudberry. Right, submit. And then a username and password is generated. This is a unique username and password, so I'm number five. Five is actually my lucky number. So I'm grabbing my username and password, and then I'm heading to this training environment link at the top. And then I'm going to log in as user zero five with my password, and sign in.

When you get to Posit Workbench... Yes, I'm going to send it now. You're going to log in and create a new session over here. Okay. Start session. And you'll get an RStudio IDE opening up. So while that's baking, I'm going to just send this link through. Yeah, this is the welcome app. So when you're ready and you have a username and password, just head over to the training environment and connect there. Oh sorry, yes: date cloudberry, date hyphen cloudberry. Nice. Thanks, Rose. Okay. When you're in, just type in the chat.
So I'm aware. People are logging in okay. It's always interesting when we have many humans on the call. Denise has fast fingers. First, well done. "What do you do when you have a user?" Aha, when you have a user, you go to... sorry, start training... sorry, slash welcome. You want the one that says welcome. You click on this training environment button at the top. You use your email in the welcome app over here: type your email and the password over here. Then, when you get your unique username and password pair, you log into the training environment with the username that you generated with the welcome app, not your email. Cool. People are in. So I'm going to give it one minute.

"Which password do we put in?" So you put in your unique username and password. I'm going to just show this process again. In the welcome app, I'm typing astrid at jumpingrivers.com, my email address. And then I type date-cloudberry, which is the password. I'll just type it in here. Right. And then we get a username and password. Don't take mine. And don't take a random one, because you might end up kicking someone else out of the training environment, which is not kind. It's not how we roll in the Zoom call. So we click on the training environment, and then you can pop your username and password in there, the username being "user" whatever, and the password.

Ready for some R-ing. Someone is like, "Come on, bro. Get it going. Move it along. Let's do some R-ing." Sick. Okay. So this is what your RStudio environment looks like now. You can go ahead and click on one demo. We're going to create a new project, so this screen might end up closing. What I recommend is to download the files so that it's not annoying switching between projects. You can export your files from Workbench over here using the export button. "Please select one of the files to export." So just select all your files and export. Right. Like that.
It'll ask you to export that zip. That'll download, and then you can just copy and paste across these files if you need to. "I didn't get a username and password when I submit." How many people are trying to log in? I only have space for 100. I'm going to just pause on the logins for now, and in the break, if anybody's having an issue, then I'll help you out. Okay. So in the meantime, just kind of watch along.

Okay. So let's create a new project. You'll see here when we log in that we have no project open currently. Let me just zoom in a bit. Right. So project is "none". You can create a new RStudio project in a few different ways. There's a little RStudio project icon over here; it's like an R within a little box. You can go ahead and click on that. Or you can click on this little drop-down over here and go "New Project". So we're going to create a new directory, right? You could create a new directory, use an existing directory, or basically clone a repo in this project generation space. I'm going to go ahead and start with a new directory. So: new project, blank project, I'm not doing anything fancy. And we're going to give it a name. Here I'm going to name mine "test". It doesn't really matter what it's called. And I'm going to just create it in the home directory here. I'm going to go ahead and create a Git repository, so initialize Git within this repository. And I'm going to use renv with this project. So this is one way that you can make an renv project happen. You can also use the renv package and a function called init, which we might dig into a little bit later. But using the UI is also totally fine.

Okay. So you can click "Open in new session", and maybe this would be nice. You can also specify your R version here: 4.2, 4.1.2, or 3.6.3. This instance of Posit Workbench comes loaded with three different versions.
So not all of us have the luxury of multiple R versions when we're working on our own desktop version of RStudio, so it's quite cool to be able to use different R versions. We're going to open in a new session and go "Create Project". Okay. Initializing renv. Since I'm creating a new session, I can give it a name, so I'm going to call this "renv test", same name as the project. It doesn't really matter. I'm going to open that session. And it's telling me that Firefox is blocking my RStudio. Cool.

So now I have this new project, right? It's in its own session, which is quite nice. And if we quickly go back to our initial session: if you want to see what sessions you have open, you can click on this drop-down here. So I have my local RStudio session, and I have this renv test one open as well. That's quite nice, to see what sessions are open. You can also navigate back to your Workbench home page just by clicking on this big R button over here. That takes you back to Posit Workbench, and then you can see here are my two sessions. This is my vanilla non-project session: just R, no project open. And here is my test environment. So this renv test is now its own isolated renv environment that's totally separate to the main R environment on this machine.

Mo says, "Astrid, if you have a Git repo with this stuff, I can work locally instead to free up space." I do not, unfortunately. But I do have a repo for us to clone later for an exercise, so that'll hopefully help. Okay. But I don't think that we have 100 people on the VM. I think people are just struggling a bit with the password and stuff. So that's cool. "Thankfully, this RStudio instance is using binary package formats, which makes bootstrapping the library much faster than compiling packages from source." Yes, exactly, Eric. Yeah, Eric knows what's up. Cool.

Okay. So here we have our R session open. We've got these files that I was speaking about in the slides.
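For anyone following along outside the UI, the checkbox for using renv with a project has a console counterpart in `renv::init()`. The sketch below wraps it in a helper function (the name `set_up_renv` is made up for this example) so that nothing happens until it is called from inside a project directory.

```r
# Console counterpart of ticking the renv checkbox in the new-project dialog.
# 'set_up_renv' is a hypothetical helper name; nothing runs until it is called.
set_up_renv <- function(project = getwd()) {
  if (!requireNamespace("renv", quietly = TRUE)) {
    stop("The renv package is not installed; run install.packages('renv') first.")
  }
  # Creates the renv/ folder and renv.lock, and adds the activation
  # line to the project-level .Rprofile
  renv::init(project = project)
  invisible(project)
}
```

Calling `set_up_renv()` in an existing project folder would be expected to leave behind the same pieces described in the slides: the renv folder, the lockfile, and the project-level .Rprofile.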
This is the basic anatomy of our renv environment. We have our project file, like I mentioned earlier. This basically enables you to click on this .Rproj file and open up your project on your machine. We have our renv folder, which is where our packages are living. You can already see that's populated with some packages. It's got the renv package and not much else, but this will change very soon. Okay. We've got our activate.R file, which is what initializes renv when we start up the project. We've got some additional files like settings.json. I'm not going to go too much into this, but you can set some additional parameters for your project there. And some .gitignore files: there's a .gitignore file at the root of the project, as well as in the renv folder. We'll dig into that as well.

Okay. So I'm going to go back to my other session so that I can flick between the files. Oh, I suppose I could open it here. No, let me not confuse people. Let me go to my original session so we can look at the project files that I shared. We're going to wait for the session to start. Okay. So I'm back on my main page, and this is where I've put all of the files that we're going to be working through. These are R files, but they don't really need to be R files; they could just be plain text files. But yeah, we're R-ing right now. You'll see here is the renv test folder that I just made, which is running in this session over here.

Okay. So I want to go back to my instructions. We created a new project. We initialized that project with renv and Git. And we had a little scroll around the files that are present there. So when an renv project session opens up, what do you notice? How is this different from a normal R session? A normal R session looks kind of like that. And when this one opens, it looks basically the same, right?
Except there's this extra little line over here, which is telling you that renv is being used and that it's loaded. Right. A really nice command to run is status. renv::status() will tell you that the project is already synchronized with the lockfile. So if you're confused about your project, I just want you to run status and have this as an anchor point that you keep coming back to: what is the status of my project? Is everything working as expected? And it will tell you "the project is already synchronized with the lockfile" as a baseline.

Okay. So yes, we explored the project files. The renv folder holds our library, which is where our packages are actually going to install to. And we have our lockfile over here. We went through the basic structure of this in the slides. We've got our R version, our repositories, and our packages. And here, because we have no files and no packages installed in this isolated environment yet, it will only tell us that renv is here. Nothing else is here.

You'll notice that there are two repositories in this lockfile already. There's a CRAN repository and drat. And this CRAN looks a little bit different to the CRAN that I spoke about in the slides. The URL is different. This is pulling from Package Manager. So I'm going to open this link. Oh, no, stop. I'm going to open this link in a new tab. "Invalid request." Sorry. So Package Manager is a Posit product that enables you to host R and Python packages in a repository that is not CRAN, essentially. Here is Posit's actual Package Manager. They're hosting a CRAN mirror, so this repo here is CRAN, basically. The only difference is that this CRAN serves many different binaries. Binaries are pre-built packages, essentially. So you can get a source package; you can also download source packages here.
But when you download a binary, it's already pre-compiled for your operating system. So when you download it, it's a much smaller package, and it downloads much quicker onto your machine. Our training VMs are pulling from Package Manager, not from CRAN, and they're pulling the versions of packages that are built for Ubuntu Focal, which is the Ubuntu operating system that we're running. So I just wanted to give you a little bit of background on where your packages are actually coming from, and they're not coming straight from CRAN. But if you were to set this up on your own machine, with no repositories specified, this would actually come from CRAN, from the real CRAN.

And then we have another one, which is our own internal drat repo. We create packages for every training course that we run, just to make it easier to import data or specific functions. Those packages are internally developed by Jumping Rivers. They have no business being on CRAN, so we created a drat repository to house those packages. And that's what you're seeing here.

Where do these repositories come from? How does RStudio know that these things need to be in the lockfile? How's that whole thing happening? It's happening because these repositories are specified in the R profile, but not the R profile on the user level: the R profile for the whole training VM. So R profiles are also hierarchical. If you have a training VM, or if you have a professional version of RStudio, you can specify what repositories your packages come from at the organizational level, which then feeds into everybody's work, right? So that is where renv is actually pulling this information from. Okay, so any packages that we install are not going to come from CRAN, like straight CRAN; they're going to come from RStudio's, Posit's, version of CRAN, and they're going to be pre-compiled.
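As a rough illustration of how repositories like these get declared, the sketch below sets the `repos` option the way a machine-wide Rprofile.site or a project-level .Rprofile could. Both URLs are placeholders invented for this example, not the addresses used on the training VM.

```r
# Sketch of declaring repositories in an .Rprofile or Rprofile.site.
# Both URLs are placeholders: substitute your package manager and internal repo.
options(repos = c(
  CRAN = "https://packagemanager.example.org/cran/latest",  # hypothetical CRAN mirror serving binaries
  drat = "https://example.org/drat"                          # hypothetical internal drat repository
))

# These named entries are what end up in the "Repositories" field of renv.lock
print(getOption("repos"))
```

Setting this once at the machine level means every project on that machine inherits the same sources unless a project overrides them.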
And any additional packages are not going to come from there either; they're going to come from our own internal repository. Okay, enough rambling on that. So yeah, here we have our renv.lock. I want you to take a mental snapshot of what this looks like right now, because it's going to change very soon. Okay: if you pushed this to GitHub, what files would not be uploaded to the remote repository? So if you went and pushed this repository now, by default, which are the files that are going to be ignored by Git? This information comes from the .gitignore file. On the root level of the project, it's the usual suspects that we would expect to be ignored by Git: .Rproj.user, .Rhistory, .RData, .Ruserdata. There's nothing special, nothing to write home about over here. And then in the renv folder, there's another .gitignore, right? So these are all of the folders in your renv folder that are not going to be uploaded to GitHub or GitLab if you push, and they're not going to be added to your Git commit history. Any changes to these files or folders are not going to show up in your Git history. And one of them is the library. I think that's probably one of the most important ones. The packages that you install on your machine in this environment are not things that your collaborator, or yourself in five years' time, is going to have access to; not those actual package files. So those things are not going to be part of the development of the project. It's very important that your machine has its own packages, isolated in this environment. So those things are not going to be transmitted. Other things: the staging area. This is where, when you download packages, they first go to the staging area and then they get added to your library. So those kinds of things sit in staging before they're added to a library.
And there are a few other folders; I'm not going to get into too much detail. But yes, there are two levels where Git ignoring is happening by default. So that's, I think, just important to flag. Okay. Let's do a little bit of R-ing. We're going to create a .R script and copy over some code, right? We're going to copy over this code from between step three and step four. So I'm just going to copy that into a new R script. So, creating a new R script, and just adding my code in here. I'm going to save this script as just, I don't know, visualization. It's going to create a new R script for me. Because I've created an R project, an RStudio project, when I create a new script and just save it, it saves to the working directory of that project. So it's saving into this renv test folder, not into my home directory by default, because I've created this project. Okay. So RStudio is very kindly telling me that packages flametree and ggplot2 are not installed, right? So I'll need to install these packages to actually create this visualization in this R script. Here I'm using the flametree package, which is a really cool package created by Danielle Navarro. It's a generative art package. It brings me so much joy. I use it for all my examples because it's just really fun and nice. So yeah, I highly recommend, if you have some time to kill and want to make some art in R, exploring the package more. Okay. I'm not going to click install. I'm going to go to my console and chuck in a little renv::status(). Again, I'm just going to click the up arrow and that will give me my previous command. So it's telling me the following packages are used in this project but not installed: ggplot2 and flametree. I think somebody asked here, what would it say if it's not synchronized? That it's not synchronized? Yeah. So it should tell you that the project has not been initialized. If there is no renv lock file and you type in renv::status(), it's going to be like, there's no lock file.
Like, what are we doing here? We need a lock file. And if you need to actually install something, it'll tell you as well. So here, this is really useful, right? It's telling you that we don't have these packages. So we can go in and just type renv::install() and the package names, right? I'm going to just chuck them in here: ggplot2 and flametree. I quite like using the renv::install() method, but you could also just go install.packages(). It's fine either way. "And can you also click the prompt at the top?" Yes, you absolutely can. That's totally fine. Yeah. So where is this actually installing from? Package Manager: linux, focal, latest, src, ggplot2. So it's actually pulling from this repository that we specified here. I must say that although I've just specified CRAN latest, it's redirected based on the operating system that I'm using. I don't know how this magic happens. Maybe somebody who works on this package can tell me, how does it know which one to use? I would love to know. "That's not from renv." Oh, Mo, lay it on us. Spill the tea. "That's a basic R function." Oh, cool. Cool. Great. So it's now gone and downloaded all of the packages and it's installing them into the library. So not yet; it's installing them. We'll get to the caching later. Cool. So I'm getting not just the packages that I wanted, the top-level packages, you know, ggplot2 and flametree, but it's also doing all of the dependencies for me as well. I think if we're all doing this on the machine at the same time, I don't know, it's a pretty beefy VM, so it should be fine. But yeah. Cool. So: installed 38 packages in 1.78 minutes. Not too bad. Okay. So let's have a look at our new library, right? We've just installed all these packages, and where before there was only renv, we now have all the packages that we need for the code to run. And what's really cool is that, you know, it predicted this from looking at my code.
So it's going and checking out these library function calls and telling you what's needed in your project, which I think is really sick. Okay. So: save your script, install the packages required to run it, and check the status. Okay, so let's check the status of our project now. So renv, double colon, status. You could also just, you know, library(renv) and then you don't have to type renv every time. But I'm quite fond of the double colon notation. Yeah. So: the following packages are installed but not recorded in the lock file. It's telling us that it's installed these packages, but they're not captured, you know, for later reconstitution. So we're going to do what it tells us and use renv::snapshot() to grab a snapshot. It gives you a little warning before it goes and updates your lock file: the following packages will be updated in the lock file. Right. All of these, and it's going from nothing to a package version. Do you want to proceed? Yes, I do. And now it tells us that it's written to renv.lock. So let's go back. I've already got it open over here. Cool. So now I have a way longer list of packages than before. You'll notice that some of them have an RSPM repository, and we only have a drat and a CRAN up here. So if anybody on the call knows why this happens, I've been mystified for a while. I'm assuming that it detects that this is a Package Manager URL, an RStudio Package Manager or Posit Package Manager URL. That was my hypothesis. Maybe Mo has an answer for us about why it's giving a different repository name. Yeah. I think it might be something to do with the activate.R script. I'm not sure. But I would love to crowdsource that information. Mo says, "I know because I went down that rabbit hole lately, and you don't want to know. Just be happy about the automagic." Yeah. Yes. Sometimes you just don't want to know the answers to things.
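The install, status, snapshot loop from this part of the demo can be sketched as a small helper. This assumes renv is installed and that the working directory is an already initialised renv project; the function name `record_new_packages` is made up for illustration, and `prompt = FALSE` simply skips the "Do you want to proceed?" question shown in the demo.

```r
# Sketch only: install packages into the project library and record them
# in renv.lock. Assumes an active renv project in the working directory.
record_new_packages <- function(pkgs) {
  renv::install(pkgs)             # installs pkgs plus their dependencies
  renv::status()                  # expected: installed but not recorded
  renv::snapshot(prompt = FALSE)  # writes package versions to renv.lock
  renv::status()                  # expected: project synchronized
}

# Hypothetical usage, matching the packages used in the demo script:
# record_new_packages(c("ggplot2", "flametree"))
```

Typing the three renv calls one at a time in the console, as in the demo, does the same thing and lets you read each message before moving on.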
Is there any link to Git about this? Eric, if you feel like elaborating, I would really appreciate it. Yes, RSPM does stand for RStudio Package Manager, now called Posit Package Manager because it hosts both R and Python packages. Cool. Great. So we've updated our renv lock file. Right. So that's looking cute. Let's just chuck another status in here, just so that we can make sure. The project is already synchronized with the lock file. Right. So the information is captured for this project. Snapshot time. Right. So we skipped ahead a little bit here. Our renv.lock has changed to reflect the packages and the R version. When you want to deactivate renv: so right now we have renv in an activated state. Right. So it's initialized. It's created the environment and we're in that environment right now. We're sort of isolated from our main R install and our operating system generally. We're kind of in this little bubble. So what happens when we deactivate? I'm going to just go renv::deactivate(). It doesn't look like much has changed, but we are now no longer within that renv environment. Oh, yes, this is actually... thanks. The Packages pane: these are all of the packages that come pre-loaded with this VM. So these should be on all of your machines, right? If we look here at our visualization script, we can see that flametree and ggplot2 are required but not installed. So we're getting this message again, right? And that's because we are now outside of that renv bubble. So let's reactivate it.
So we're going to use the activate function here. Let's activate our bubble again, right? This should not be here now. But if we go and have a look, we can see ggplot2 is now installed in our package collection, as well as flametree over here. So if we deactivate, ggplot2 might still be here, because this machine might actually have it. ggplot2 is here, but flametree certainly is not. So it doesn't actually install, you know, flametree; it doesn't install that package into your general R library, right? It installs it into the project-specific library. Okay. So, yes: activating and deactivating our environment. I'm still in the same project, right? If I deactivate now, I'm still in the same project. Deactivate restarts R, but it starts R outside of the project environment. And then when I activate, it starts R in the project environment, right? And any environment variables, for example, that I were to set outside of the environment wouldn't carry through into the isolated environment, and vice versa. So in that sense, it's completely its own universe. Okay. Back to the slides. Oh, let me put it full screen here. I'm going to recap and then we're going to take a little break. Excuse me. Not found. This is what happens when you don't organize yourself. Okay. Let's actually take a break now, and when we come back from the break I'll have all my slides set up and things. Okay. Take a 10 minute break. So we'll come back at quarter past, whatever quarter past is for you, wherever you are in the world. For me, that is 18:15. Okay. I'm going to be on the call if anybody needs help with just setting up, if they can't log into the VM or they're struggling with that. Yes. I'm going to paste the code from demo two onwards into the chat, so that people can follow even if they aren't managing to log into the VM. If anybody needs help, just unmute yourselves and I can help you log in. I'm sorry. Some more questions.
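One way to see the bubble from the console, rather than from the Packages pane, is to look at the library paths before and after deactivating. The sketch below uses base R for the check; `is_visible` is a hypothetical helper name, and the renv calls are left commented out because each one restarts the R session in RStudio and is best run one at a time.

```r
# TRUE if a package can be found on the current library paths (base R only)
is_visible <- function(pkg) nzchar(system.file(package = pkg))

# Where does R look for packages first right now?
.libPaths()[1]
is_visible("flametree")

# Then, one at a time in the console:
# renv::deactivate()        # leave the project library
# .libPaths()[1]            # expected: a user or site library
# is_visible("flametree")   # expected FALSE unless installed there too
# renv::activate()          # step back into the project library
```

Inside an activated project the first library path is expected to sit under the project's renv folder, which is why flametree shows up there and nowhere else.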
Yeah. Yeah. Thanks, Mo. That's a good point. "What is the difference between using conda versus renv in R?" Amy, I have no idea. I hope someone else can answer your question. I've never used conda for R stuff. I'm sure there is a good answer to that. "As far as I know, conda time-capsules the whole environment, including the version of R and Python." Okay. That's cool. "Conda is tailored to Python. renv will only manage R packages." Yeah. I've only used conda with Python. "I know conda is used in tandem with reticulate. I could be wrong." Yeah. I've only used reticulate and wept. I've not quite managed to do reticulate. I actually kind of gave up at some point and just started learning Python. And then Quarto came around and now everything is fine. Everything is good in life. And I was pasting my one. Shoot, my message is too long. That's annoying. Okay. Alan says, "renv will only tell you which version of R you're using, but it doesn't control it. We're using Singularity for that. Introducing renv to fix compatibility." Yeah. It's so interesting, the idea of controlling things and locking down an environment. So we've been doing a little bit of work in validated environments. So package validation, and also using validated packages in the validated environment. And it's complex because, you know, you can specify a global Rprofile, right, to by default install packages from certain locations. But people can just come through with devtools and install using a link, you know, unless they aren't connected to the internet. The way to really lock it down fully is to have your own Package Manager and your own Posit Workbench, and those two things only talk to each other. And, you know, the Package Manager on its own server can talk to the internet, but Workbench can't talk to anyone. "The following link is broken." Oh, Vasiki. That is an astute observation.
That is by design. Someone's skipping ahead in the activities. Yeah, we're going to fix that. Mo says, "conda is an entire thing; renv is a smaller piece. renv makes it easier to track packages and versions used without forcing it, and also redirects library paths to project-specific ones that do not collide with the user's environment. Like, I might have reasons to lock my own computer's version of ggplot2 because I need it for something specific in that version, but a project needs a newer version. I can do that without messing up or even having to have my own [bleep] solutions." Yeah. "Also allowing me to work on my local computer, or the environment I'm used to, without needing things like containers." So, something I haven't explored at all is renv with CI, CI/CD and build automation. That would be interesting to delve into. "I use renv with containers via Docker or Podman. It's my main dev system, actually. Because I don't have R installed, everything I make is with containers." Eric! Eric should be presenting. Eric should be presenting this workshop. "Many people work how they're used to without adding extra stuff." That's true. I really like that strategy. I like kind of meeting people where they're at. It's a good strategy. Okay. "I'm an renv cheerleader." Oh, Eric, you're just here for fun. I'm trying to up my game of teaching renv to others at the end of the day. I think it was serendipitous that I wanted to do this talk, and then we have a client for whom I'm going to develop this further into a nice training. I'd love to know: how has renv helped you in projects? And has it been helpful? As a consultant, we come in, we make recommendations, obviously of things that we use and we understand. But you don't often get to be there for all of the wailing and gnashing of teeth of onboarding users to a new piece of tech and getting them familiar and comfortable with it. That's a whole other thing. I'd love to hear your experience.
"It's so smooth." That's good to know. That's very encouraging. "Can you walk us through the training environment again for those of us just connecting?" Sure, Tessa. Absolutely. Let me just go back. You put in your email address and the password, which is date-cloud, and then you'll get your own username and password: a username followed by a number, and then a password, which is basically the password with the username tacked on. No one has kicked me off yet, which is amazing. But please don't. Then you're going to click on this training environment button at the top, and that will take you to Workbench, which will look like this. It won't look like this. You pop your username and password in, the unique one that you've just generated. Don't put your email address here. Then it will take you to Workbench and you can click on new session to get started. Eric says, "I use renv in my analysis and Shiny application projects. I need full control of dependencies in case I need custom versions of packages that the central package library does not have available. Plus it ensures my collaborators are operating with the same package environment, and that is a huge point to address." Yes. 100%. Awesome. I'm going to go back to my first session, just with my little R script open. While people are joining back on the call, if you've just come back from a break, I would recommend scrolling up in the chat and reading some of this, because people have been laying down some good truths and making some interesting comments. While I get set up here you can go and have a look. I've popped the training welcome app in the chat again. I'm just going to pop in the steps to get logged in again, to make this smooth for people who are trying to get in. I also copied and pasted the actual contents of the R scripts, just the instructions for the exercises. The first one, "let's clone a repo containing an renv project", that's the next one I'm going to work on. The last one ended up being the first one. It's a mess.
"We have a shared library in the lab, but packages have been getting overwritten as others update things." I can imagine. "So far renv seems to be working." That's great. Thank you, lively chatters, I appreciate you so much. Let's proceed. Let's go to the next one. Status: I like it as an anchor point. I find that the messages it gives are really helpful and they give you prompts of other functions to use. Status is your friend. Snapshot uses the R script or the R Markdown associated with the project. Any library calls, basically, it takes those and pops them in the lock file, including the dependencies; it captures the version, the package hash, everything. Beautiful. Snapshot and status. Deactivate deactivates the isolated project environment and takes you back to your main session. Activate activates the isolated project environment. So you could be in your project but it's not isolated; just be careful about activating. The other thing we didn't use is renv::init(), which replaces the GUI point-and-click where we clicked to initialise the project with renv. If you have an existing project you can go renv::init() and it will create all of the renv-specific infrastructure around your project. It will create that when you initialise using init. Super helpful. Those are the two ways you can create a project: the renv project wizard, or renv::init(). The first one is about reproducibility. Imagine your colleague has shared an analysis with you, using renv to capture the state of the machine when they performed this analysis. We want to reconstitute this analysis. I don't want to say most, but many projects are shared using Git version control, which is either internal for your organisation, so it's not accessible to the broad internet, or on the internet itself. So what we're going to be doing is cloning a Git repository. I'm excited for people who have never worked with Git before to gain a little bit of exposure to it. We're going to clone a repo. "Does using renv::init() work?"
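For the renv::init() route mentioned in the recap, a small check like the one below can confirm that the infrastructure was created. It is a sketch under the assumption that init writes renv.lock, a project .Rprofile and renv/activate.R, which matches the files seen in the demo project; `renv_files_present` is a made-up helper name.

```r
# Which of the files that renv::init() is expected to create exist
# in a given project directory? (Base R only; helper name is hypothetical.)
renv_files_present <- function(project_dir = getwd()) {
  files <- c("renv.lock", ".Rprofile", file.path("renv", "activate.R"))
  present <- file.exists(file.path(project_dir, files))
  names(present) <- files
  present
}

# Hypothetical usage inside an existing, not yet initialised project:
# renv_files_present()   # expected: all FALSE
# renv::init()           # creates the infrastructure and activates it
# renv_files_present()   # expected: all TRUE
```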
You need to be in the project, and then you run init and it activates as well. It initialises, gives you the project files and activates the project at the same time. Remember, a way you can tell that the project is activated is when you open your R session and you get that extra line in the console. Let's clone a collaborator's repo. I'm going to go to my first session. I'm going to close all these extra tabs here. Excuse me. I'm going to go to demo two. We're going to clone a repo containing an renv project. We're going to create a new project in RStudio. Again, I would say new project, version control. What I wanted to say is we're going to create a new project, and make sure to click the "open in new session" button again, just to keep it in its own tab in the browser. So, version control. Earlier we just went new directory, and now we're checking out a version control repository. We are cloning a project from a Git repo, and so we're going to stick the URL into this field over here, and I don't think I copied it. No, I definitely did not. So let me copy this. This is the URL for the Git repo. I'm going to just open the link in a new tab. It's a public repository. You don't need any username and password to clone it. You can just clone it. So we're going to clone with HTTPS, and I'm just going to copy this link, and when I create my new project I'm going to stick it into this URL field here. And then someone asks, "How do we start a new session?" Okay, so when we actually clone this, when you click on new project over here and we do this, we're going to check this "open in new session" box down here in the project wizard. Or you can go back to this R button over here, this RStudio button, and that will take you back to your home screen, and then you can create a new session there. But I find this is easier, just to open a new session.
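For anyone following along from the console rather than the project wizard, the collaborator side of this exercise boils down to roughly the steps below, once the repository has been cloned (through the wizard or with git clone). This is a sketch, not the workshop's script: `restore_cloned_project` is a hypothetical helper, it assumes renv is installed, and it assumes the clone contains an renv.lock.

```r
# Sketch only: rebuild a collaborator's package library from renv.lock.
restore_cloned_project <- function(project_dir) {
  # The lock file is the one thing renv needs from the repository
  stopifnot(file.exists(file.path(project_dir, "renv.lock")))
  renv::activate(project = project_dir)                 # recreate renv/ if missing
  renv::restore(project = project_dir, prompt = FALSE)  # install recorded versions
  renv::status(project = project_dir)                   # check library against lock file
}

# Hypothetical usage, with a placeholder path to the cloned project:
# restore_cloned_project("~/renv-example")
```

Running activate, restore and status by hand inside the opened project is equivalent and shows each message as it appears.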
Okay, the project directory name is going to be renv_example; let me go hyphen, because it's hyphenated in the URL: renv-example. And again, I'm just going to create it as a subdirectory of my home directory. So, create project. It's not cloning yet, because it's creating a new session. I need to give the session a name again; I'm going to call this one renv-example. Open session, and now it's creating this new session for me. Got it. Sick. Alan, thanks for the feedback. Eric says, "If you're using R from outside RStudio, say in a terminal or VS Code, make sure your working directory is the project directory before using the renv functions. That can trip people up." Great, thanks for that tip, Eric. That's good to know. It's so interesting thinking about how people use R, because, you know, the way I learned R, I didn't even know that you could just type R in your terminal and an R session pops up. I thought R and RStudio were the same thing. So yeah. Okay, I'm not sure why it's bringing up my history here, but anyway. Okay, so I've created this renv project. Well, I've cloned a project: I've created a project, I've cloned this repository from Git, right? If we navigate to the Git tab here, in the top right-hand window, you know, this is our default tab here, and we're going to the Git one. You can see any changes that happen to the repository will appear here, unless they're part of the .gitignore, right? So if Git is ignoring a file or a folder, we won't see any changes appearing in this window. But I really like this, because if you use Git in the terminal it can be a little bit obscure; sometimes you need to type commands to actually see what's changed in your tree, but in RStudio it's right here. And this is one of the nice things about the RStudio IDE. Here you can push and pull and commit your code or your changes. It's really nice. And we can see that we're on the main branch, so we've pulled the main branch and we're on the
main branch on our local machines. You guys won't be able to push anything to that remote repository, because you don't have credentials to do that. So this is really just an exercise in pulling this bit of code from your collaborator, and that's what we're going to be digging into right now. So we can see here that this code doesn't include an renv folder, right? It's got the renv.lock, it's got the project file, it's got a README, it's got a little R script, it's got its project-level .Rprofile and its .gitignore. And so it's giving me a warning here: in file(filename, "r", encoding = encoding), cannot open file 'renv/activate.R': No such file or directory. So what's happened here is that, as a collaborator, we've gotten the renv.lock, but we haven't gotten the activate.R file that's going to actually initialize renv and make that actually work. Yeah, so we need to provide that infrastructure, essentially, to the project. I'm curious to see what happens when we type renv::status() here. Oh, it's thinking. It's thinking real hard. Cool, so it's giving us something. It says here: the following packages are recorded in the lock file but not installed: ggvis... Use renv::restore() to restore the packages recorded in the lock file. I want to see what happens when we use activate. Great, so now it's created the renv folder for us, and it's created activate.R and our library folder. It's created all of those things using the fact that there's a lock file in the project repository, and the project files, to go and create that. And so here we're being told to use renv::restore() to restore the packages recorded in the lock file. So when you get code including a lock file, obviously you're not getting those packages. You're just getting the instructions; you're getting the recipe to make the packages happen for yourselves. So Mo tried renv::restore() and it actually did all that and started installing things. That's great. So renv is clearly a very clever package. So we're going to follow the instructions and
we're going to restore this package library. And so let's have a quick look at our library. Our library is now empty; it's only got renv in it, because it's just been initialized. So all of the packages that are required for the project to work, all of these packages, are recorded in the lock file. Let's have a look here. The VM is going slow. In the renv.lock file we've got some repositories and we've got all of these packages. So all of these packages are appearing here to be installed. Do we want to proceed? Yes. It's querying repositories for various source packages; it seems to be installing from source. And it's thinking. I'm just going to look at the chat here. Yes, sorry: "How did you get the pull from Git?" So in the instructions over here there is a remote URL, and how I got that was by going to the repo and then going to Code over here, this little drop-down, clone with HTTPS, and then I just copied this URL over here. That's all you need to clone the repository. If you're using SSH, so basically if you have rights to pull from a repository and you're using the SSH protocol to do that, you should rather use the SSH option over here. Because this is a totally public repository and there's no issue with anybody cloning this, HTTPS is fine. We're also not going to put everyone's SSH credentials on our machine. Cool, so it's downloading packages. It's retrieving them from the Package Manager repository. This might take a while. I think it's also maybe taking a bit longer because we're all on the machine, but that's okay. So one thing I want to note here is where the packages are being restored to. This is slow, so I might ask some of you to just kill your sessions at some point. So we can see our packages are being installed to this library, our renv library. This should be updating. Sorry, I think it's just taking its time. "The lock file is the only thing, is the one thing that renv needs. There's some stuff in here that's just amazing, like
renv installing itself, like, how? I don't know, and it's just so cool." Mo, I love how little chill you have. Never change. "So my colleague whose code is on GitHub needs to have been aware of renv? A plain vanilla repo won't follow the workflow just illustrated?" No, Kevin, you can do it with a plain vanilla repo in RStudio, like any repository. If there's an R project file, it'll use the R project file associated with the repository, so if that R project file is actually pushed to the repo. So did we do that? Yes. So the R project file: if it's in the repo it'll use that, and if it's not it'll create one. That would probably be the only difference. But this could be anything; it could be an image, and you could just pull it, clone it and open it up in RStudio. Yeah. So yes, but as Eric and Mo are saying, you need the renv.lock to do the renv part of this, you know. So if you just want to share code, by all means just clone and open it up as a project, and if you want to follow this renv workflow then definitely include the renv.lock too. Oh no, are we crashing? Failed. Oh, my heart is broken. I might ask all of you to just cease and desist on the VM. Everybody kill your sessions. I made it a beefy one, but not beefy enough. Martin, just log out. Sorry. Yes, so Eric is saying here, "I should elaborate: the ideal situation in a multi-team environment is to have one person be the renv admin, to prevent possible merge conflicts." Oh, RStudio Workbench is freaking out. Okay, babes. Yeah, we're going to talk about that later. The top right corner; it looks like a power button. "Mine completed with warnings." So Stephen, do you want to post your warnings in the chat? I'm going to try and log back in now. Sorry, this always happens with live demos. Spicy things go down. Oh no, it's freaking out. Let me try and get this session going again. A new session. If we all had multiple sessions running, that would be quite a lot of stuff. "I think it was successful though: restarting interrupted promise evaluation." So I don't know a lot about
promises in R, but I think it's when something is supposed to appear and be populated at some point. It seems like maybe that broke. Okay, so I'm going to just open up my renv example again. I also just want to quickly kill everything. Okay, so this is important here: you can actually quit these other sessions. I'm just keeping the one going that I need to have going, so these will now not be using any resources on the machine. Okay, I was quite ambitious with having everyone open two new sessions. So I'm going to open my renv-example .Rproj. In that moment I just clicked on the .Rproj file and it's now opening up my session, so that's cool. And, oh, it wants me to restore, so let's try that again. Yes. I'm going to just let this run now and hope that it doesn't die again. Now, after I've restored this environment, basically installed all of these packages, I can run renv::snapshot() again, and that's going to grab all of that information and edit the lock file for me. So we're hopefully going to get to see that happen, but in case we don't, maybe I can ask Steve to run renv::snapshot() on his environment, and he can copy and paste the first few lines of his lock file. Thanks, Steve. Okay, so I'm going to just let this go, and we're going to come back and see what Steve has for us, and maybe mine will be alive again. But let's go back to the slides. Okay, what have we learnt? We've learnt that the renv folder should not be shared between analysts, or the contents of it anyway, and it isn't by default, right? There's a .gitignore covering the contents of the renv folder. In this case we didn't need the renv folder at all to reconstitute our analysis. We didn't need activate.R to reconstitute the analysis. We could do that just with the renv lock file, and renv will actually create any missing files needed for projects that are not active. Oh, nice, Steve. Okay, so Steve, can you post... I'm sorry, I'm calling you Steven
"Steve" like I know you. Steven, can you please post, let's say, the end of the repositories and then the first package entry, so maybe the first 30 lines of the renv.lock, in the chat, please. Cool. Okay, so snapshot captures the R version of your system. One moment, Steve. And this is something to note. What we have in our lock file, yeah, what we have in our lock file is the R version that was on the system that was used to create this lock file. What I wanted to demonstrate, but of course technical issues, is that this updates to the R version that you're running. Aha, Steve just posted in the chat. Nice. I'm just going to copy Steve's message onto my own screen so that I can pretend that I had this as a result. Okay, I'm just going to copy that much. So this is what Steve got after renv::restore() and renv::snapshot(). Here we get the R version, our repositories are the same, and we have some information about the packages. This would obviously have continued for all of the packages, right? He didn't post everything here. But what I want to draw your attention to is the fact that the R version changes to 4.2.2. So when you're creating these lock files, and when you're snapshotting and updating projects after a long time, but also when you share with other people, obviously they should be aware, and you should be aware, that there are differences in R version. I just want to highlight this as a consideration. So this is updated to 4.2.2, right? In Posit Workbench you can change the R version, like I mentioned earlier. So for example, if this code wasn't working, if we updated all the packages and maybe there's some package that's stuck in R version 3, you could change the R version and try to run the analysis that way. So not all analyses need to have all of the packages updated to the latest version. You can also go back if you need to. I'm going to just kill the session as well. Quit. I'm just going to close these ones
Yeah, so snapshot captures the R version, and that's something to note when updating your projects. I think Eric mentioned that in their organization they have one person, or a few people max. He wrote: "An ideal situation in a multi-team environment is to have one person be the renv admin, to prevent possible merge conflicts with multiple users editing the lock file." So a potential workflow could be: I, as the dev, only work on specific files and I don't touch the lock file. When I'm done with my analysis, the last thing before we merge branches, you know, before we do the Git flow of pull request and merge, is that Eric comes in and runs update and runs snapshot. He is responsible for it, and no one else is allowed to touch that file. That's maybe a potential workflow. Okay, so where are the packages downloaded from? By default, packages are downloaded from CRAN, as I mentioned earlier, but it's also possible to download packages from other locations. Why would you want to do that if CRAN is perfectly good? Like I mentioned, sometimes people want private development environments. I think this is especially true in the pharmaceutical industry, where there are frameworks that dictate how validated environments should work. It's often a good idea to have these environments locked down so that they're not able to access the whole internet. In that situation you would have something like Posit Package Manager, like I showed you earlier, for your organization, one that's not accessible by the whole internet. The packages that you download into your RStudio IDE then come from an internally managed package repository, where there's one person making sure that the correct packages are there, and only validated ones. Also, if you develop an internal package that isn't on CRAN, you can serve that to the people in your organization.
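As a rough sketch of pointing R at an internally managed repository, the snippet below sets the repos option for the current session. The INTERNAL URL is hypothetical and stands in for whatever your organization serves; the second entry follows the URL pattern of the public Posit Package Manager binaries for Ubuntu jammy, which is an assumption about your platform.

```r
# Set package repositories for this session, keeping the old value so it can
# be restored. The INTERNAL URL is a hypothetical placeholder.
old <- options(repos = c(
  INTERNAL = "https://packages.example.org/internal/latest",
  CRAN     = "https://packagemanager.posit.co/cran/__linux__/jammy/latest"
))

# install.packages() would now consult these repositories.
print(getOption("repos"))
repo_names <- names(getOption("repos"))

# Put the previous setting back so the sketch has no lasting side effects.
options(old)
```

Placing the same options(repos = ...) call in a user-level or project-level .Rprofile would make the setting persistent rather than per session.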
That's done using these alternative package repositories. If you do have packages that are developed internally, there are pretty much two options as far as I'm aware: one is drat and one is Package Manager. At Jumping Rivers we use both. We use Package Manager for our internally developed packages, and only people in our organization are able to access those; they're hosted on Posit Package Manager. The ones that we use for our training are hosted on drat and are publicly available to anyone, so that's what we're going to be using in the course today, if I can get this to work. So how are these package repositories actually managed? I alluded earlier to our repos coming from the R profile that's managed at the organization level, but there are other levels at which these default package repositories can be set. One is the renv.lock: renv can grab the repositories straight from that lock file, so they're not coming from any R profile, they're coming from the lock file. Another is the user-level or project-level .Rprofile. At the user level you would set your repos option, like so, to a specific source. In my case I'm running Ubuntu 22.04, jammy, on my machine, and there's a repository for jammy packages on Package Manager, so I download binaries from Package Manager by default rather than CRAN source packages. It's a bit quicker and more efficient. This is also the line that you would put in your .Rprofile at the project level. And then there's Rprofile.site, which is system-wide. That's really only relevant if you're using a professional Posit product like Workbench; it sets those parameters for all of the users. I added a little link here with more info on repository management via the R profile. It's a blog post on the Posit website, which I think is helpful if you need more information. So, managing repositories with renv: the lock file captures that information
about where these packages are downloaded from when snapshot was originally called. Earlier I showed you the CRAN environment variable and the drat environment variable, and the packages are coming from those two repositories. Referring back to the last exercise, when we updated the snapshot the repositories did not update. Let me try and log back into the session. Is it going to open up my project? Did I kill it too soon? Let's look at the renv.lock. The original one had jammy and JR training, and the one that Steve posted in the chat also had jammy and JR training. So even though this machine has other repos configured for it, it didn't update the repositories section with the repos on the machine. Editing the repos in this lock file, where the packages are coming from, is a manual process. renv is not doing that automatically. Updating a repository requires manual intervention. And why? I don't know; I'm sure there are reasons, but my hypothesis is that it would be really annoying when you're collaborating with someone and they change the repository on you to whatever is on their machine. Say I'm using jammy and you're using focal: your machine now goes and downloads from jammy, then you change it, and then I'm trying to download things. So I think that's probably why. Okay, so our next demo is to update a repository. I'm going to call for a break now, because I want to confirm that I can still do stuff, and if I can't do stuff on the VM I'm going to revert to my local RStudio installation. So I'm going to ask for a 15 minute break to give myself enough breathing room. Break until, what would that be, seven minutes past the hour, whatever the hour is for you. I'm going to try and get set up on the side. Thank you all for bearing with me and my technical issues, I really appreciate it. Cool, see you guys in 15 minutes. I managed to get my local machine ready for action, but let's see if
I can get this to work on the VM as well. Mo and Eric are basically teaching assistants right now. Thanks, guys, for jumping on the questions. Mo, Amira, inside the environment are you using Posit Package Manager, or are you using something else inside a container image? Mo says their IT system mirrors CRAN daily into the air-gapped environment: "I just got funding to test Posit Package Manager next year, so I can look forward to testing that." Awesome. "Where can I access the Zoom recording and slides afterwards?" About the recording: I will post the slides on the R/Medicine, um, what is that page called? Sorry. I will post them, here we go, in the channel. After the course I will upload the slides. Mo said getting funding for nice things in academia is nigh impossible. I do think that Posit has different pricing for different institutions, but I might be wrong; I am not the salesperson. Yeah, they do. Nice. In industry it can also be challenging to get management to support new funding. If anybody wants to post their Twitter in the chat, I would love to keep up with people doing interesting things. I will post mine so you can follow me, and if anyone wants to put theirs here, it would be cool to see what you are up to. While we are in the break, I can punt a few of Jumping Rivers' services. In the data engineering team we deploy and maintain Package Manager, Workbench and Posit Connect, which is a platform to publish your analyses or your Shiny apps, or to pin data that other people can use. It is a sort of all-in-one platform. We maintain the Linux environment for these bits of infrastructure, and we also do the deployments for people. We do infrastructure-as-code deployments, where we gain access to virtual machines and deploy the infrastructure, and remote deployments, where we support someone in IT through the installation process. I am here but I am more here. Eric has a podcast. Tell us more about your podcast. It is an incredible
podcast. Eric, that is awesome. We have four minutes to advertise ourselves, so please tell us about your podcast. He has a few: the R Weekly podcast, the Shiny Dev Series. "This conference on Thursday, that was fun." That is cool, dude. Talk about the most niche flex ever. That is so cool. Sometimes a Jumping Rivers blog post is featured on the R Weekly page, and it is always exciting. We do a niche flex. Let me see if I can get this to work. Mo is talking about internal R packages on Friday. Awesome. Recurring contributor Colin, I am sure lots of you know Colin. He says if you want to learn a new tech, one of the nice ways to do that is to write a blog post, because then you get to explain it. I love the bros getting together in the chat. We have one minute of our break left. I am going to share my screen. I am going to do this one on my local machine to get around some of the issues. Cool. We are basically going to edit the last project that we made. In this theoretical situation we are trying to use an internally developed package; in this case we are going to use a package called jrIntroduction. Let me do something in the background. I think we are good. The R gods are shining on me today. "My blog is like me: I am afraid to fail." I love that. That is also my ethos around teaching. I don't know everything; we will figure things out. It also helps to engineer mistakes into your teaching material so that you learn together. The thing we want to do is add an internally developed package into this code. I am shifting over to my local RStudio environment because of technical issues; I don't want to crash this again. Back to our example project: we are going to add this package, jrIntroduction, to the code. I am going to the code here and adding library(jrIntroduction). I am going to save that. I am going to add jrIntroduction to my machine; I have added it to my R file, and now I am going to run status. I am coming back to my anchor, my little blanky. I also, just for my own
illustration purposes, want to see what repos are used in this project. I used this function early in the slides: I showed you that you can set options and add these to your .Rprofile, and these can include things like package repositories. I want to see what is available on this machine. I have got jammy and JR training, and if I look at the renv.lock file, that is what I have got in the actual renv.lock. I snapshotted this renv.lock at the end of demo 2; demo 2 ended with us snapshotting and recreating this renv.lock. I am going to try now to install my package. You can try and install this as well. It is internal, but it is in a public repository; it is not a secret. I am checking in renv.lock. I am trying to install it. It is looking for the package, it is loading, it is thinking, it is doing its thing. It is taking forever. Am I not even on the internet? Maybe the issue is my internet. It is both intentional and unintentional, of course. This is taking forever. I am crying inside. It might be taking forever because it is supposed to fail; it is taking forever to fail. Great. If you want to post your error message in the chat, that would be great. I am just going to kill this now, since we know this is not going to work. I am going to stop. I am now going to update this training repository. This is our old training repository, and we have since moved the training repository to something else, so this is not supposed to work. When you are updating a lock file, you don't want to just willy-nilly update these repositories. Like I mentioned earlier, this is the job of someone who is administrating your renv setup, as Eric alluded to. This is the error: "The requested URL returned error: 404". That repository doesn't exist, so it is giving a 404 message. We are going to replace it in the renv.lock file. There are two ways to do this. You can just go straight in and edit the file. I quite like the other way, because it conveys the gravity, the importance, of what
you are doing. That is using the renv::modify function. As Eric says, do not update this lock file in the editor manually if you can help it; you can do this automatically. This is the danger zone. The only time you want to do this is when an old repository doesn't exist anymore; it is not something you want to do every day. You don't want to manually change the version numbers of your packages unless you desperately need to, and I don't see why you would. I am going to grab the new repository. This is the new one over here. I am going to copy that and pop it in as my URL. I am replacing that and saving. When you change a repository link, or when you edit an R profile, you need to restart R, because R doesn't know you have changed the game. It doesn't know you have shifted the ground underneath it. You need to restart R so it reloads that renv.lock and R profile and those changes can take effect. I am going to restart R. It tells us the project is out of sync and to use renv::status for more details. renv::status: I think it is just a complaint about jrIntroduction. It is telling us that jrIntroduction is not installed. Now we can go ahead and install it, so renv::install jrIntroduction, and this should work now because we have updated the repo. But I am not holding my breath, because technology is not cooperating with me today. Once you have updated your repo, you can go ahead and try this command, and let me know in the chat if it works. I have no idea why this is not working. You can restart R and then notice where it installs from; it should be this new repository. "Can I specify the repo in the renv::install command?" That is a very good question, Alan. Can you check the arguments of renv::install, just by using the little help function in R? "Session, Restart R is a good idea after installing packages." Let's jump back in. I had an exercise for you to do this on your own, but with time constraints we'll skip it. It is basically the same thing: you are just going to update the CRAN link to real CRAN and not the CRAN mirror that is on
Package Manager. So what we learnt through doing this is that our packages come from multiple sources and not just CRAN. Site-wide settings dictate where packages will be installed from. These options can also be set at the project level and the user level. The lock file can be edited, but proceed with caution. Snapshot only captures the packages and dependencies that are used in the project files, so if you install a package and don't use it, it has no effect on your lock file. And the lock file doesn't automatically change the repositories. I have different repositories set on my local machine, but renv is not grabbing those; it is using what is in the lock file to install the packages. The repositories that I have set in my global-level R profile would have enabled me to download the package, but because this is an isolated environment it is not taking cues from my R profile; it is using what is in the lock file. If you create a new project, it grabs those repositories from your R profile. So it is an interesting caveat. So where are the packages downloaded to? We have looked at where they are downloaded from, but where do they actually go? Normally, to your own R, OS, R version folder. On my computer this would be home, R, then the linux-gnu-library folder, and then the R version, 4.2, and all of my R packages would live in that folder on my computer. But with renv, does anybody want to guess where these packages are downloaded to? You can drop it in the chat. Okay, no one is feeling it. Okay, people are having their own chat in the background. So the packages aren't actually downloaded to your project-level library. I showed you that folder, renv/library, and there are your packages, but they are not actually downloaded there. They go to a cache, and this cache is in a hidden folder, denoted by the little dot. So it's in .cache, then R, then renv, and then a repository with all of the packages, and the packages that you actually see in your project
are what are called symlinks to this cached repository. The reason for this is basically speed. When I download a package for one project, renv saves that package and its version in this cache, and if another project needs that same package, it pulls the cached version from this common pool that sits between the two projects. It doesn't go and ping Package Manager for the package again; it does that once, saves it in the cache, and then the project gets the package from the cache. So it's pretty magical; it makes installing packages a lot faster. The next thing I want to do is clean up our project; I just wanted to explain a little bit about the background magic first. I'm going to try install.packages for good measure and see if that works. Did anybody manage to install the package? Of course this is working beautifully, with me going on about how fast this is. I can't get it to install either. That's so annoying. This is now modified. It did not work. Let me try again. Have I activated it? I don't think I had it activated. Let's see if that works now. Okay, Stephen says he modified renv.lock directly and now the package is installing. renv::modify... strange. Okay, so if we edit it directly, the way I just tried to, it should install. Anyway, you should likely snapshot after editing. Okay, let's try that: snapshot, yes. Okay, let's restart our session for good measure. Let's check. Looks like we are... I take it back. Yeah, okay, this is annoying. Let's talk about cleaning up our project. Let's clean up our example. Let's see if this will actually work; let's try it on the VM. I live in hope. Okay, I'm going to just add my repo here and save that. Oh no, this was Steve's. Okay, I'm going to add an .R script here. Oh, I can touch it. Okay, so I've got my... oh, so annoying, because I didn't manage to get this to install at all. Let's see, maybe since other people are not on the VM it'll work. Oh, I
think I made a mistake. Cool. Thanks, Eric, for all your input, I really appreciate it. Yeah, I'm just going to back away slowly from this. What I want to say is that we can use a function called clean. I haven't installed any packages yet, so I'm going to install something that I know I'm not going to need: the parsnip package. I know that I don't need parsnip in this project. Oh, I'm having issues with installing packages at all. That's so random. I'm going to try and install from CRAN instead; maybe it's this. I don't need to restart my session. It actually was very slow. I think something is up. Even status is saying it's okay. Yeah, I think this is some bad luck, but I'm also seeing that CRAN is being really slow. Installing parsnip worked for Steve. Thank you, Steve. It was quick. Let me just close RStudio and open it up again and see if that helps. Okay, what I want to talk about is cleaning up your renv project, so just pretend that I've installed some packages here that I don't need. I'm going to run renv::clean. Oh, this will actually work, yay. I've got these packages installed, I mean added to my file. I'm just going to add ggb. I've saved my R file, my script. Basically, I'm ready to share this project, or I'm just on my own machine, and I've got some packages hanging around that I don't need. We can use this renv::clean function, and what it does is remove any packages that are not actually called in the code and are not captured in the lock file. In the course of developing an analysis, you might have installed a bunch of packages. You start, you try this library and that doesn't really work, you try another one and that doesn't really work, and finally you find the one that works. The pieces of the puzzle come together and you finish your analysis, but now you've got all
these additional packages hanging around. The lock file does not capture those, which is great, because it means that your collaborators are not going to see all these random packages that they don't need. That information won't actually be captured in the lock file, so it's not going to end up being shared, but you might still want to remove some of these packages that you don't need, and you can use the clean function for that. So I'm going to proceed, and it's quickly gone and removed all of those packages from the project library, which is awesome. Another one I wanted to share with you is install. I'm not sure if this is going to work on my machine, but I think I have usethis in my package cache. usethis is a really nice package which enables you to easily edit some of the files associated with projects. So there's a function called usethis::edit_r_environ, and it edits your .Renviron file, and likewise for your .Rprofile. But clearly I don't have it in my package cache already, which is annoying. Let's try it here. Actually, no, it won't be on this one either. Okay, anyway, you can quickly and easily install packages if they're already in your cache. Another one I wanted to show was purge. It's a pretty gross-sounding function, but if you have a package that you want to get rid of in your cache as well as in the project itself, say a package that you know is not going to be used in anything else, or a version of a package that you want to remove from your cache, you can do that with purge. So I just wanted to touch on that as well. Let's see what else I have for you in the handy functions. Yeah, and then update. If you want to update your packages, you can do that with renv as well. You could go to
your packages and update them here, but renv comes with a function to do this for you. I have no idea what is going on with my machine. Did anybody manage to try install or update? "You very likely have MASS in your cache." Let's try that: install MASS. I think it's my computer that's the problem. It's working nicely on your side? Awesome, thanks. So when you want to update your packages, you can do that using update, which covers all of the packages associated with the project, all of the packages captured in the lock file, which is really nice. I'm very relieved that these things are working on your side; at least we can talk about it even if it's not working on mine. Let's head back to the slides. So, like I mentioned, update updates your packages. Then upgrade: right now I think we're running the latest version, but the tool that manages all of your packages, renv itself, also gets updates, and you can apply those within projects as well. That updates the renv that's shipped with your project, the little renv install you see when you open the project. I'm not sure whether update also upgrades renv; that would be interesting to find out. install installs packages for you, and diagnostics gets the full context of an renv project, so definitely go and run diagnostics. Mo, you're an angel. Cool, Mo just posted the output of renv::update in the chat for us. So it checks for packages to update and goes forth and updates those for you. Mo, I don't think it automatically updates your lock file, so you can see if you need to run snapshot again. I'm sure status will let you know either way. And if somebody wants to run diagnostics and put the output in the chat... actually, there might be some secrets in the output, so that's at your own discretion. diagnostics can also be really useful because it shows you some of the environment variables that you've established. And status shows that the following packages are out of sync, so there's a lock file version and a library version.
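To make the "lock file version versus library version" comparison concrete, here is a small base-R sketch. It does not call renv; the two version vectors are hypothetical values standing in for what a status report might list, and out_of_sync is a helper invented for this illustration.

```r
# Hypothetical versions, standing in for what a status report might show.
lockfile_versions <- c(jsonlite = "1.8.0", MASS = "7.3-58")
library_versions  <- c(jsonlite = "1.8.4", MASS = "7.3-58")

# Illustrative helper: compare each package's library version with the
# version recorded in the lock file.
out_of_sync <- function(lock, lib) {
  pkgs <- intersect(names(lock), names(lib))
  cmp <- vapply(
    pkgs,
    function(p) utils::compareVersion(lib[[p]], lock[[p]]),
    numeric(1),
    USE.NAMES = FALSE
  )
  data.frame(
    package  = pkgs,
    lockfile = unname(lock[pkgs]),
    library  = unname(lib[pkgs]),
    state    = ifelse(cmp > 0, "library newer",
                      ifelse(cmp < 0, "library older", "in sync")),
    stringsAsFactors = FALSE
  )
}

report <- out_of_sync(lockfile_versions, library_versions)
print(report)

# Inside an renv project, the corresponding calls would be:
# renv::status()    # report differences between library and lock file
# renv::snapshot()  # record the library versions in renv.lock
# renv::restore()   # reinstall the lock file versions into the library
```

The choice between snapshot and restore is the update-or-downgrade decision: snapshot accepts what is in the library, restore goes back to what the lock file says.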
And then you can decide: do you want to update jsonlite, or do you want to go for the library version? So, do you want to update or do you want to downgrade, essentially. Yeah, so with renv I find that the messages you get from the tool are really helpful and give you options. Mo says, "That's brilliant, because you might want to update, test everything, and then decide whether you want to snapshot or not." Exactly. Okay, so now I have a little bit of time at the end of this workshop for open discussion. How do you envision renv fitting into existing workflows? How will you manage collaboration, and how will you manage repositories? We've touched on a little bit of best practice around this; like Eric mentioned, having one person who is the administrator of repositories and also of the lock file itself can be really helpful. usethis allows us to edit the environment file; let me show you that. Let me just deactivate this: deactivate. So, usethis. There are many functions within usethis that I have not actually explored; for the most part I use it for edit_r_environ. Okay, I'm not going to use this on my local machine because you'll see all my secrets; my secrets will be splashed across the internet. But my R profile, I think... let me do this off of the screen share. Yeah, I removed any incriminating things from my R profile. My R profile is where I've set which repo I want my packages to come from, so by default my packages will be coming from Package Manager, the jammy Linux binaries. But edit_r_environ is also great. For example, I use GitLab, which is the tool that we use to manage our code repositories, and I've got an access token for GitLab stored in my environment file. That's what lets me access some of the data on GitLab. The same goes for API tokens: the Clockify API is something that we use internally
that enables us to look at time tracking data. Simply put, my machine connects to the Clockify website and pulls that data in, but because the data is confidential you need something like a password to do it. "My question may have been lost in the past: does this update the DESCRIPTION file?" I don't think so. You'll need use_package for that, but that will not install the package. "You don't want to use renv in a package development project." Yeah, you don't want to use renv in package dev. I don't think that's a good idea; I think it's really built for things like analysis. Okay, so my question is: can you immediately see the value of renv for your own workflows? It's unfortunate that my demos didn't work, but I'm hoping that some of you who ran the code were able to see the benefit. I know Mo was. "Has anyone used renv in standalone R scripts rather than projects? I know that's a bit weird, but sometimes people have quick and dirty scripts to run essentially only once, and then they realize later that they need to run them again." That is literally why projects exist, as far as I know. I'm pretty sure projects and renv go together; I don't think you can separate the two, but I stand to be corrected on that. "You want devtools and usethis for package dev, where every package your package needs is in the DESCRIPTION and is not locked to a version." Yeah. In terms of managing collaboration, I think the suggestion I made earlier, of updating the renv.lock as the last action before an analysis is considered productionised, is probably a good way to go about it. Obviously, if you are collaborating with somebody and going back and forth pushing your changes, you can also push your lock file, but as a routine last step I think it would be a good thing. Someone says they definitely see the use of renv but will struggle teaching it to users, because the use of it is so
different than what people are used to. That's so true. I think also, if you're not in industry, getting people to use Git is already quite a stretch. Martine says you typed renv; well done, Mo. I was in, and Mo's got it. That's great. Yeah, I think there's a learning curve for sure. But maybe it helps for an organisation to have somebody be a champion for a new tech, someone all the other people can refer to. I find that having a person who makes a new piece of technology approachable and workable is helpful for that. "When I tried teaching it to the lab, albeit badly since I didn't really get a lot of the things myself (it will be better now), they were mystified by this concept of running something outside the script." Yeah, I'm glad that it's less opaque for you now, Mo. Thank you for that. Okay, and then managing repositories. I think repository management is a whole demon of its own. Having Package Manager is really helpful for restricting what users are and aren't able to install on their machines; it's basically built for that. If you don't need a locked-down environment it's a lot simpler, really. I think it's just about making sure that the same person who's reviewing the code for the lock file is the person who also keeps the repositories maintained. "So I've been doing orange and hitting escape to keep orange." Nice. Okay, before I let you go, here are some bits of info about us. Jumping Rivers is a consultancy, like I mentioned earlier. We have data science consultants, we have trainers who teach all sorts of things, and we have an infrastructure and data engineering side. Often we straddle all of these categories, because sometimes people need help with bits of everything. So if you need help with something related to data, infrastructure, Linux, R and Python, we can help. These are the people to contact. I popped my Twitter in the chat and I'll also pop my email address, and if you
have any questions or feedback for me following this workshop, please don't hesitate to get in touch. Also, if you are a woman in R or an underrepresented person in the R community and you need some guidance, help or input, please get in touch with me too. I love meeting other people who are looking for support. Okay, I'm going to post the slides here later on. I need to render them to PDF and then I'll pop them in this place over here. "And if you're on GitHub you can have something called a CODEOWNERS file, which is great." Yes, exactly. So you can have multiple people working on the same project, and those are managed in GitHub or GitLab or whatever you're working with. Oh, Martine, you're welcome. Thank you for the feedback. I'm glad that it's starting to compute. That's awesome, that's great to know. I'm so sorry that all my demos just crashed and burned towards the end. Thank you, and please do follow me on Twitter so I can make new friends. Thank you, guys. Live coding is terrifying for this reason. Yeah, I've done enough training at this point that I'm hardened; it's just weird when it happens because it's inexplicable sometimes. Brittany, thank you, Brittany. Awesome. Yeah, I really think that, especially for pharmacometrics and epidemiology and these fields, it's really important that your environment is as standardised and replicable as possible. Alan, the recording will probably be posted by the R Consortium, and I think it will appear on the same page where the slides will appear. That is my hope. Yay, that's great, Todd. Awesome. Sick. Okay guys, I'm going to call it there. Thank you so much for joining me, and I'll see you on the internet. I might be able to pop into a few of these talks. Thank you, thanks for being a great chat, guys, a very lively chat. I really appreciate it. Okay, keep well everybody, have a good evening and have a great conference. I know this is just day one, so really take the
opportunity to connect with other humans doing cool things, even though it's all virtual and weird. Cool. Okay guys, bye.