So let me introduce today's speaker. The webinar is provided by Craig Willis. Craig is responsible for outreach and is co-lead of the Whole Tale project, a project mainly supported by NSF for several years now. Whole Tale makes it possible to make your science reproducible and therefore transparent, and reproducibility is a pillar of science: if somebody else cannot reproduce your science, then your outcome might be questionable. So I think this Whole Tale project is a very important element, and I'm pleased that you can join for this presentation. The Whole Tale developers have created a wonderful gateway, so please visit it; I'm sure Craig will point you to it at some point. The gateway is there for you: you can post your workflow there, and others can follow the steps you have taken to get to the findings you present, for example, in a paper. Some logistics regarding this webinar: we are recording it, so please keep your microphone muted and your camera off for now. If you have questions during the presentation, feel free to put them in the chat; after the presentation, Lynn or I will read them out and hopefully Craig can answer some of them. After the webinar, the recording will be posted on the CSDMS website. It will take a couple of hours, but we will keep it there for you to review or to point colleagues to. So with that, Craig will present today "Publishing Transparent and Reproducible Computational Research with Whole Tale." Craig, the floor is yours.

Great, thank you. Thank you, Albert. Good afternoon from me, and good morning to those of you joining from a time zone that's not shared here. As Albert said, I'm Craig Willis, and I'll be presenting about Whole Tale. I'm at the University of Illinois at Urbana-Champaign, School of Information Sciences. The Whole Tale project is now in its seventh year, and I've been on it for five or six of those, so quite a while. The agenda is very simple: I'll cover a little of Whole Tale's background and some of the concepts that underlie the platform, then spend most of the time walking through the platform and showing you some of its capabilities, and then leave some time for questions. Because I'm going to be looking at another monitor, I'm going to stop my video, if that's okay.

Whole Tale, as Albert mentioned, was funded under NSF's old Data Infrastructure Building Blocks program, now CSSI. As such, the goal of the project was to build usable, reusable infrastructure. What we've developed, and continue to develop, is an open source platform for computational transparency and reproducibility; I'll talk about our definition of that in just a second. The platform is intended to help researchers create, publish, and enable review-style assessment and verification of computational research artifacts. In the platform we call these artifacts tales; you'll see them called capsules by other similar platforms. The project website is there, and so is the dashboard. We operate an instance of Whole Tale hosted on NSF's Jetstream2 cloud, and I'll use it during the demonstration. What I will say is that, very sadly, about an hour before this presentation Jetstream had a network outage, so hopefully the demonstration will go well; I may flip over to another instance that we run for another project if needed. An unfortunate but typical incident. So it is a public dashboard, and I'll walk you through using it.
So what do we mean by computational reproducibility and transparency? We take the narrow view of computational reproducibility from the National Academies' definition, which is getting the same results using the same data and the same computational steps, methods, code, and conditions of analysis; some people would use the term differently. The broader ideas of reproducibility and replicability are things the Whole Tale platform supports, but as a technical solution it's really about enabling people to re-execute your computational workflow or analysis from a point in time, trying to retain as much as we can about the environment, the information about the software that was installed and used, et cetera, and the data that was used. We see transparency as providing enough information so that others can assess your results without necessarily repeating them. We like to recognize that the notion of repeatability or reproducibility isn't always feasible or even desirable: there are a lot of good reasons why someone may not be able to simply re-execute your published code. You might have proprietary or sensitive data, or long-running or large-scale computation, so we see transparency as equally important even if someone never actually re-executes. Oh, thank you; sorry, I was just checking the chat.

One of the motivating use cases for Whole Tale, as the project has evolved, is the increasing adoption of reproducibility policies as part of the peer review process. I understand this isn't really widespread in the geosciences, but in some of the fields that we work with, we see journals not only requiring that authors share computational research artifacts, including the data and the code, but in some cases subjecting those artifacts to review to confirm that they actually reproduce reported results. Two groups that we work with closely are the Odum Institute, which implements the verification policy for the American Journal of Political Science, and the data editor for the American Economic Association, where there is a fairly strict data and code availability policy across the eight or nine journals that publish empirical research.

All right, so what is a tale? This is one of the core concepts of Whole Tale. It is a research object, if you know that concept: it's intended to capture information about all of the artifacts involved in the research process, beyond just the publication or just a dataset. Technically speaking, a Whole Tale tale is a zip archive of a particular format that contains data, code, documentation, the results of the computation, and information about the environment. Tales are intended to be executable; what underlies this is Docker, so they're designed for somebody to re-instantiate that environment, that image, to be able to explore, assess, re-execute, and publish the research. Whole Tale itself is not an archive: tales can be stored on the platform, shared, and used there interactively, but we rely on third-party research repositories for actually minting DOIs and preserving artifacts for the long run. Tales are intended to be publishable objects that can be put into external repositories that have archival guarantees.
And then, in addition to transparent, there is the notion of verifiable, and we've got two definitions of verifiable in this context. One is that the resulting tale object, the zip file that can be moved around and published, is verifiable: it has technical metadata and hashes to ensure that the files in it are the ones that were originally there; we use a standard called BagIt for that. Verifiable also applies in the sense of the journal verification policies: these things are intended to be assessed and verified by re-executing the workflow and confirming reported results.

I'm just going to show you a couple of examples here to highlight this. This first one is not a tale; it is for the American Journal of Political Science. They have a policy that researchers have to share all of the code, all of the data, and sufficient information for a third party, in this case the Odum Institute, to reproduce the results and confirm that any analytical results, figures, tables, et cetera are reproducible. This is put into Dataverse, which is a fairly commonly used research data repository, at least in the social sciences, and linked back to a peer-reviewed publication. Now a tale built on this, just to show you the difference: this one has been published to Zenodo and contains all of that information, plus information about installed packages in the environment, a reference to a container image hosted in Whole Tale's image registry, and in this case a concept I'll go over in a second called a recorded run, which is an independent execution of the workflow that produced the figures and any of the results that would have been reported in the manuscript. The notion of the recorded run serves that end of transparency: if you use that feature of Whole Tale, someone doesn't actually need to re-execute what you've done, because they can trust that the run of this version of the code is what actually produced the results. The other side of that is that this can easily be re-imported back into Whole Tale and re-executed, so I could pull up the environment that the author selected, with their code and their data, and have access to that executed run.

The idea of a recorded run: you have a workspace in Whole Tale, which I'll go over in a second, that's just a directory containing your code; you can reference data externally or your data can be in there. The recorded run creates a version of your artifacts, makes sure it has the correct version of the container image if you've made any changes to, say, packages, executes a workflow specified by you, and captures an immutable copy of all of the outputs. It also captures system and runtime information, so folks downstream can know what types of resources were used as part of the process and what kind of resources they would require to rerun it. So Whole Tale's approach to computational reproducibility and transparency is to allow researchers to run their code on an external system while capturing information about the computational environment: through the container image we get the operating system and the versions of software that were used for a specific execution that produced results, and then we make it easy to publish those artifacts to long-term research archives. It's a web-based platform.
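As a small aside on the BagIt piece: because an exported tale is packaged as a BagIt bag, the same integrity check can be done outside the platform with standard tooling. The snippet below is only a rough sketch of that idea using the Python bagit library; it is not the exact mechanism Whole Tale's own scripts use, and the directory name is hypothetical.

```python
import bagit

# Path to an unpacked tale export (hypothetical location).
bag = bagit.Bag("my-exported-tale")

# validate() walks the BagIt manifests: every listed file must exist
# and its checksum must match what was recorded when the bag was built.
try:
    bag.validate()
    print("Bag is complete and all checksums match.")
except bagit.BagValidationError as err:
    print(f"Verification failed: {err}")
```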
So you authenticate using an institutional identity, you get access to commonly used environments, and we're very happy to add new environments. The primary one is Jupyter; we've also got support for MATLAB, Stata, and Julia, also through the Binder infrastructure. The environment customization is based on Project Jupyter's Binder, so the underlying configuration of which packages are installed uses Binder as a component; we have extended that for MATLAB and Stata, which are not supported there because they require commercial licenses. You can reference data in the system by pulling it in from external repositories, and you can publish out to archival repositories to get an identifier. At this point thousands of people have used the system for various purposes, whether research, exploration, tutorials, or assignments; we've got classes that have used it as well, across a variety of domains. We did a tutorial last May for CSDMS, so this is our second time talking to the CSDMS community.

All right, I'm going to jump into a demonstration that's based on that 2022 tutorial. I can drop a link to this slide deck into the chat if anybody is interested, and I can share it; that Git repository has a fairly detailed tutorial that you can walk through. Actually, I guess you're not opening up questions yet, right, Albert? This is a good breaking point if anybody had any specific questions about this part of the presentation. Yeah, if you're fine with it, it's fine with me if people ask questions. One other thing: I clicked on the link and it asks for access; you need access. Let me fix that then; it should have been open. Yep, try again, it should be shared with everybody now. Yeah, I can get to it now, thank you.

Without further ado, I'll give a quick look at the Whole Tale dashboard. I'm going to do this; is screen sharing still working, can you see my screen? Yes, yes, we can. I'm using an incognito window just so you see the first-time user experience: dashboard.wholetale.org. And I'm also going to look at an example that's been published. There's a change that we implemented after some feedback, including feedback from the last tutorial: when you come into Whole Tale now, it doesn't require authentication initially, so you can directly view anything that's been shared with you or that's been made public within our system. Let's see; I didn't make this Landlab tale, this is Bertram's tale of the CSDMS Landlab tutorials, and you can view its metadata, view files, et cetera without needing to log in. But most other actions require you to actually sign in. This is Globus's authentication service. It's convenient because it allows us to use your institutional login; some people get concerned about the consent and privacy policies. The other reason we use it, if you're familiar with Globus, is that we can transfer data from Globus endpoints into Whole Tale on your behalf, if granted permission. If you've never used the system before, there will be a consent form. You can use ORCID, you can use a Google account; here's the consent form, and I cleared my consent so you can see this. It's allowing us to initiate transfers for you, and, for users of DERIVA services, there is a specific integration that we have.
It uses typical identity information, such as your email address, to show information about your tales in the system. We have the notion of public tales; tales that have been shared with you within the system, so you can collaboratively work on the same tale; and your personal tales, which are things that you own. You can create a new tale, and there are three mechanisms for that. One, it can just be an empty tale, and I'll show you that. Two, we can create one based on a Git repository; Whole Tale doesn't require the use of Git, but it also doesn't preclude it. If you're a Git user, you can use Git from within the interactive environments, but part of Whole Tale's design is intended for people who aren't regular Git users to also be able to publish things, in the way Binder does. And three, you can create a tale from a digital object identifier: if it's one of the supported integrations, we can pull that in either as a referenced dataset or as something that you're going to act on.

So I'll just create a blank webinar tale. You can pick from a variety of environments; I'm going to pick JupyterLab because I feel CSDMS is predominantly a Python-based community. I will note, for MATLAB and Stata users: for MATLAB we're leveraging the network license at Jetstream, which is at Indiana University, so you can use MATLAB under that institutional license, but images from Whole Tale require you to have your own license if you use them locally. Here is just the tale metadata page, with typical information that you would add; this gets translated at publish time for your repository. We can add files and folders to what's called the tale workspace, which is your primary directory. I'm not going to cover this today, but it's in our documentation: if you use external data that you don't want to include, like a published dataset that you're analyzing, we handle it by reference, so your tale will always be connected to it and it can be pulled in (we cache it for you during analysis), but it's not something that you republish.

The main thing here is the interactive environment. What's going to happen is we'll spin up, in this case, a JupyterLab instance; it runs on Whole Tale's infrastructure and just presents that interface to you. If you're familiar with JupyterLab, it's the basic interface, and all actions are still usable here, so you can upload and manage files directly. I'm not going to recreate the Landlab example here, but once you have your tale you can run commands in the terminal, create notebooks, and execute anything interactively that you can do in JupyterLab; you can do it via Whole Tale. The operations that we now have on the tale menu allow you to do a few things. Administratively, you can look at the logs of the running container, in case you have issues like code that produces errors that aren't apparent within the interactive development environment. If you make changes to the configuration, such as adding packages, you can rebuild the container image, which is what Rebuild Tale does, and restart the tale using that rebuilt image. You can save a version of your work, which will show up in the version panel here. You can initiate a recorded run, which lets you specify some master script to execute; I'll demonstrate that with the Landlab example in a second. And we can publish and export.
Publishing would publish to Zenodo (we have to connect to Zenodo to do so), and export would give you a zip file that you can operate on locally. We also have the ability to share with other users within the system. I think Albert is here; here we go, I can share a tale with Albert and give him permission to view it, in which case he would be able to make a copy of it to edit. He can view my tale, but he can't run it or modify files within it; if he wants to run it and execute, say, notebook steps or code that produces output, he'll make a copy. Or he can co-own and co-edit it with me: we don't share container instances, but he would be able to modify the files just as I can.

Here is a published tale that I published to the sandbox yesterday; it has a Landlab tutorial in it, and this is just a demonstration of how that tale can be brought back into Whole Tale. There we go; sorry for the slowness. Once you publish to an archival repository, you can delete the object from Whole Tale, and it can always be re-imported, including by somebody else. That should have taken me into that interface. So here we see my Landlab example. It was published to Zenodo and brought back into our system on March 8. It has a simple Python script that generates some very simple figures based on a tutorial. There was one version of this at the point in time that I published it, and it had a recorded run, which means that run.sh was executed and generated the figures that are in the subdirectory. So I can bring up this tale that was published to Zenodo; it's a JupyterLab tale. It's a very simple master script that just runs a simple Python script, taken directly from the Landlab tutorial materials, that produces a couple of figures.

The interesting piece is this environment.yml. When you want to configure your environment, Whole Tale requires you to use the conventions of repo2docker. These are typically common package manager formats, so environment.yml for conda. This installs a set of dependencies into the container image that allow me to run Landlab. The built image could have any package you want in it, as long as it installs under Linux, because (I should have said this up front) these are Ubuntu-based images. So as long as your workflow can run within a Linux environment, you can declare its dependencies. Once that image is built, anybody who brings the tale back into the system gets that environment, either as a recipe from which the image can be rebuilt or via the image that was actually hosted in Whole Tale. Here I've got a full interactive terminal, so I can run that Landlab example myself. It will generate some figures that are viewable by me in the interactive space, and I could just publish this right now, independent of a recorded run. It's still reproducible, it's still useful, but no one knows that I actually ran the code to produce those figures; in this context, I could have uploaded them from another source. So, back to the idea of a recorded run: the run script is really just going to do that same command I did. There will be some notifications that come up. This is now running in an isolated container using the container image associated with the run. And once it finishes, I have that recorded run.
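To make the moving parts concrete, without claiming this is the actual code in the published tale: in a tale like this, run.sh is typically a one-line call to a short Python script, the environment.yml lists conda packages such as landlab and matplotlib, and the script itself is ordinary Landlab and matplotlib code. Here is a hedged sketch of that kind of figure-generating script; the grid size, the synthetic elevation values, and the output filename are invented for illustration, and it assumes Landlab and matplotlib were installed via the environment.yml.

```python
"""Minimal Landlab-style script of the sort a recorded run might execute.

A tale's run.sh master script would typically just call:
    python make_figure.py
"""
import os

import matplotlib
matplotlib.use("Agg")  # headless backend; a recorded run has no display
import matplotlib.pyplot as plt
import numpy as np
from landlab import RasterModelGrid
from landlab.plot import imshow_grid

# Build a small raster grid and attach a synthetic elevation field.
grid = RasterModelGrid((25, 40), xy_spacing=10.0)
z = grid.add_zeros("topographic__elevation", at="node")
z += np.random.rand(grid.number_of_nodes)  # toy topography

# Plot the field and save the figure into the workspace so the
# recorded run captures it as an immutable output.
os.makedirs("figures", exist_ok=True)
imshow_grid(grid, "topographic__elevation")
plt.savefig("figures/topography.png", dpi=150)
```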
So there's one run from yesterday and one from today, and they are identical. I can go back through the process of publishing. I would publish (this is the sandbox), and it would actually, in this case, update that same record in Zenodo with a new version, since this one already had one. That obviously wouldn't work for someone who isn't me; since it's me, I'm allowed to update it and create a new version. I'm not going to go through the publish process right now.

Let's see. With the Jetstream issues, running things there isn't convenient, but I can take my tale out of Whole Tale as a zip file. I've downloaded that locally, and I'm just going to get a terminal up here. There's a script in here called run-local. I'm on a Mac, and you have to have Docker in order to do this; the script assumes a Mac or Linux-based system right now, so Windows support is something we're weak on. What this is going to do is run verification on the downloaded package using the BagIt format, just making sure everything is as it should be from a checksum standpoint. Then it's going to pull the image from Whole Tale's registry, which is the built environment that I ran to create this package, and it will start that locally (a rough sketch of what this step amounts to appears below). I'm not going to do that, because downloading from the registry was not working for me this morning with the network problems; this is just to show what you get. It is something of a weak point with Whole Tale: the repositories that we publish to tend not to want us to deposit images into their records, so we maintain a registry, but Whole Tale is a non-archival system right now. A step we're moving towards is actually having some archival guarantees behind the images, but for right now we support the built images as best we can, and they can always be rebuilt from the recipe later.

I went faster than I expected, so I'll go ahead and look at the registered external data case, where you rely on externally published data, something that already has an identifier. I'm going to take an example identifier from that tutorial. Oh, that's apparently not working at the moment; one moment. Not something I tested earlier. Okay, I'm not going to do this at this point, sorry about that. With the Jetstream issues, this is one area that I didn't test before getting on for the webinar today. So I'm just going to go over a few planned features, what we're working towards as part of the platform right now. There have actually been a number of requests for Visual Studio Code support, which is something that should be added; it's not implemented yet. As part of a collaboration with Albert and with the CSDMS community, we have a proposal in right now to NSF to expand Whole Tale in a few key areas. One is the ability to create tales locally: a toolkit that you can run on your laptop or on your own server to create these objects without requiring you to use our cloud interface. Another is integration with GitHub through GitHub webhooks: if you're a GitHub-centric user, you can benefit from some of the features of Whole Tale without necessarily having to go into the IDE, since our interface is currently required to use all of the features.
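Coming back briefly to the local execution step mentioned above, for readers who want a mental model of what run-local amounts to: conceptually it verifies the bag, pulls the tale's container image from the registry, and starts the image with the tale workspace mounted. Below is a rough Python sketch of that idea using the Docker SDK for Python, rather than the shell script that actually ships in an exported tale; the registry path, tag, port, and mount paths are all hypothetical placeholders.

```python
import docker

# Hypothetical image reference; a real exported tale records the actual
# registry location and tag of the environment built for it.
IMAGE_REPO = "registry.example.org/tale-images/my-tale"
IMAGE_TAG = "latest"

client = docker.from_env()                    # requires a local Docker daemon
client.images.pull(IMAGE_REPO, tag=IMAGE_TAG)

# Start the environment with the tale workspace mounted and a notebook
# port exposed so the analysis can be re-run interactively.
container = client.containers.run(
    f"{IMAGE_REPO}:{IMAGE_TAG}",
    detach=True,
    ports={"8888/tcp": 8888},
    volumes={
        "/path/to/my-exported-tale/workspace": {
            "bind": "/home/jovyan/work",      # assumed mount point inside the image
            "mode": "rw",
        }
    },
)
print("Container started:", container.short_id)
```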
Back to the planned features: another area is support for increased resources in terms of memory and cores, and specialized resources such as GPU access, which we now have in Jetstream2, along with support for HPC and HTC workflows: the ability to create tale-like objects and things like recorded runs on batch clusters, while recognizing some of the special requirements for large-scale computational work, such as building Singularity images and allowing users to selectively include or exclude intermediate data. Whole Tale is an open source project, and increasingly we're moving towards a community-based model; a lot of the development occurred under the primary NSF funding, but we're moving much more to a community model. If you're ever interested in joining, our Slack is open; the invitation link is here and it's also on the Whole Tale website. Members of the Whole Tale organization are not just Whole Tale project members; there are a lot of people who are very interested in reproducibility, transparency, and replicability in research across a number of domains, so it's a good group. And then there's our GitHub, which has probably 30 contributors and growing. That's the end of my presentation; I'm early, but I'll open it up for Q&A, I guess.

Yeah, thank you, Craig, that was very cool. We have a question from Anna. Anna asks: what are the terms of use for Whole Tale, and can it be used for classes or workshops, even in a non-academic setting? Yeah, I should have gone over this at the very beginning; thank you, great question. It is for academic use only; that's also the policy of Jetstream and of our access to the MATLAB and Stata licenses. It can be used for classes and workshops. I'd have to understand the context of a non-academic one: anything within a commercial context would not be allowed under the terms of use, but if it's outreach or education related, that's in a bit of a gray area; it's not specifically an academic class, but it is outreach to the research community, et cetera, so I think that would be reasonable. Compute resources for a tale: by default, right now we're essentially a laptop in the cloud. We have some limitations on the underlying VM size at Jetstream. I think there's a soft cap of eight gigabytes of RAM for a tale, but that can be extended; if I'm remembering correctly, it can go to 16 gigabytes of RAM right now. So this is not for large work, and that's not even very large, but if you want to use Whole Tale for that, please reach out to me, because that will motivate us to increase some of those caps; those caps can be increased.

Then we have a question from Katie. Katie, can you...? Yes. Thanks so much for the presentation. I have a couple of questions I've written down. I'm at the USGS in Golden, Colorado, where I do a lot of computational work in landslide hazards. I'm going to give a little bit of a preamble, which is that I'm often working in a situation in which I am developing the software that is being used, and on top of that there's some workflow that is being used to run a set of jobs. I'm also in the category of people who may be doing a set of runs, say hundreds of two-day wall-time, 40-node jobs on a Slurm cluster, and then there's a post-processing phase.
And so I'm curious, and I think this piggybacks on your prior answers: some of these are in the future of Whole Tale, but what is the vision for supporting multi-phase analysis, the runs and then the post-processing and so forth? What are the current capabilities, and where is that going? All right, and I'll say it's very nice to meet you, because we used one of your GitHub repositories as an example in some of our work for one of our proposals, because it's exactly the kind of transparent work that's extremely well documented. It's obviously not automated, because you've got a lot of steps in there, so I'm familiar with at least one of your publications where I think some of this comes up; it's the one that has to do with predicting erosion at the nuclear waste repository, maybe, which was a multi-article publication. Yeah, and that's the kind of thing where I think it's a little tenuous to say what is or isn't being demonstrated. And we ran into a lot of issues with data archiving there; it was very hard, and we ended up having to make a sort of pointer into a Globus endpoint, anyway. Yeah, so these are cases where there's a long-running computation, and in Whole Tale this is something we call tales at scale. One piece of tales at scale is that a lot of researchers (we've had this with the astro community, who do massive simulations) can't put their data anywhere. Unfortunately, they're not as concerned about reproducibility; I mean, fortunately for them and not for us. Even in economics, and it's a much smaller percentage for the American Economic Association, work obviously gets done on campus clusters and XSEDE resources, and they can't verify that work. That's one problem: there's no journal policy under which going through and rerunning the code can happen. But often they don't have confidence that the work was run as stated, because if it's multi-phase, with human steps involved and multiple people, there's a lot of risk of potential error in the reproducibility context. So we're very interested in this; it's actually one of the key points of the proposal that we put in for CSSI. We have a secondary project right now that's coming out of economics, where they have the added complication of private data: census data and census resources that are access controlled. If you want to be transparent while running on census resources, where you can't share the data or the compute infrastructure, that gets kind of tenuous. One piece of that is a component we'll be implementing for Slurm; not instrumenting Slurm in a deep way, but providing something like an isolated place where we can trace what happened and report it, providing information from a transparency standpoint: this really happened on this cluster, and these are the outputs. But again, long-running compute and enormous data that doesn't have a home remain big challenges for everybody, and for us as well. This is something we're eager to work on. Cloud-based Whole Tale does some good, but using some of the tooling to create a provenance trace with some guarantee, so people can know that you ran something and can have increasing trust in what your outputs are, is where we're headed.
Does that make sense? Okay, yeah. I don't know whether this is a feasible thing, but it makes me think: I recently started to use Snakemake a lot, because it is flexible enough for my reality. But unless you specify every single input and output file that is being made, you're not truly archiving that this was run on this platform. I can imagine that something like Snakemake generating interrelated tales could be very valuable. That's something Lars Vilhuber, who's the data editor, has long been pushing us toward, and it's part of the TRACE project as well: the chaining of these things, where the outputs of one step become the inputs to another step, but steps might have other dependencies or might be completely isolated jobs. So yes, again, that's part of TRACE, and TRACE is going to be a lot more lightweight, I think. I'd say, if you're ever interested in talking about it as a case, if you've got research that could inform what we're doing, I think we'd be very interested. Yeah, I may follow up on that, because (and I'm sorry I'm dominating this conversation, but hopefully people find this interesting) the kind of thing I'm facing is running hazard assessments for post-fire debris flows for every large fire on public land in the western US every year, and that's a thing that takes substantial computational resources. I think it's important, as a government employee doing science, to demonstrate what it is, where it's going, what went into it, and what came out of it. But it's a real lift to learn the tools to do that well. So I want to make sure I'm on top of that, because at the USGS we have a lot of requirements in terms of archiving our data and, increasingly, archiving our code, and there are internal discussions about how to archive something that isn't really data and isn't really code but is data and code that all go together. So anyway, thank you so much. Thank you; that was a great question. And I'm glad I ended my demo early, then.

We have another question in the chat from Steve. Steve, do you want to elaborate a little bit on this, or shall I read it out? I see you're unmuted, but I can't hear you. I can answer it; I see it here. Okay, it sounds like Steve is having audio problems. Yeah, so computational provenance: there are at least two folks on the Whole Tale project who have been deeply engaged with that. There are different notions of provenance: the artifact chain-of-custody kind, in terms of handing off something that was produced on one system to another system, versus provenance in the sense of what files were accessed and which code was responsible for producing which outputs. So yes, that is a thing, although not as much of it as we wanted is actually in the primary Whole Tale system. We moved away from the strace and ptrace methods of tracking things within the system because that was just too risky and too intensive.
But Tim McPhillips, who led a lot of the provenance work, is looking at eBPF-based tracing methods for tracing things internally, which should be more performant; all technical detail, but we're still working in this area, or he certainly is. So, can I point you to a link that summarizes our current approach to recording provenance information in Whole Tale? Do we have a good one... You can answer this in the chat, Steve: are you talking about computational provenance, something like the strace or eBPF tracing model, or are you talking about something else? Okay. The short answer to your first question is no, but the follow-on answer is that I can make sure we provide you with that information and get our docs into a state that answers the question; I just want to make sure I know which sense you mean. I don't think we have a good summary right now; that's something I'll take away from this session. Most people aren't interested in that part of the system, and you can absolutely feel free to reach out to me. On the provenance implementation: I think what he had implemented used ReproZip as the data collection mechanism, and then he post-processed what was pulled out of that strace-style output to generate triples, an actual graph model, which he's the expert at. But in the end that implementation was pulled because of both security and performance concerns. So now what we do is not really provenance from the computational standpoint; it's really just resource usage information, plus the notion of the recorded run, where we know that something was run independently. That's a kind of assertion we can make, but we don't yet have in the system some of the design and the work that he's done there.

Thank you. Then there's one last follow-up from Anna regarding compute resources: she's wondering how many cores are available if you want to run something. I'd have to look at the underlying VM. We migrated to Jetstream2, so I don't know offhand, but I can pull it up; if I can get into the Horizon interface, I can get the specific core count. I don't know that we put a cap on cores, so you get what's on the underlying VM, which I want to say is probably four, but I can't remember with the migration, so I can answer that in a second. And I'll say, sorry, just a quick follow-up: we're trying to be good stewards of Jetstream's resources, so we don't want to provision machines and resources that aren't used; we are consuming an allocation. The more demand we have for bigger resources, the more we'll provision larger VMs and utilize those as part of Whole Tale. I know that means we're asking people to demand it before we provide it, which is problematic, but the more feedback we get like this, the better. So I guess the follow-on question to you, Anna, is: what would you like to see, and what kind of resources would be useful in your context? Actually, I have to correct myself: the underlying VM, since we went larger with this deployment, is 16 cores for an individual, so your tale instance can go up to 16 cores, and the actual RAM on it is 60 gigabytes. There's a limit that can be increased via an advanced setting in Whole Tale that can let you go up to that, so 16 cores and 60 gigabytes are the actual absolute maximums. I have one last question.
And this is maybe a little bit technical, but say you have published a manuscript and you made a repository through the Whole Tale gateway, so people can rerun your work or go through the code and see how you came to your findings. The publication is out and the link is there. A few months later, somebody points out that there might be a bug in your code. You revisit it, and it doesn't change anything in the results, but yes, indeed, there's a small bug in the workflow or something. Can you still change the instance that you created and save it, so that the link in the paper still points to the updated repository, or are there other ways to guide people to the newer version where the bug is corrected, or something like that? Yes, something like that is correct; I'll just show you quickly here. Publishing is where Whole Tale itself is not the archive: you don't want to give our URL out in your publication; you want to deposit somewhere and rely on Zenodo's infrastructure to host it for the long run. So here's an example; this is just a test, in fact one of my ugly use-case tests that I run against every new release we have. I own it, it's under my key in Zenodo, and I produced it. The tale is in Whole Tale, I've never deleted it, and it's my working environment. Each time I republish (and this is the nice thing about some of the research data infrastructure), as long as that relationship to the original object is maintained in the tale, I get a new version. Zenodo has this idea of what they call the concept DOI, the top-level DOI that references the latest version, and then they give identifiers for each subsequent version. So classically you'd give out the one that's attached to the manuscript, and then later, when you go and change it, people would still link to that, but they would know that there were newer versions in the Zenodo record. Very nice, thank you.

Let's see, I see a few things in the chat, mostly thank-yous and comments rather than questions. If there are no other questions, we're about at the hour, so I think this is it. Thank you very much for your presentation and for walking us through Whole Tale. The recording will be provided in the coming two or three hours on the CSDMS website; maybe Lynn can send out the link to the recording so it can be shared with your colleagues. Thank you. Just one follow-up here for Anna and anybody else who's interested in using the system for a class: please feel free to reach out to me directly if you want to talk about class size and some of the resource issues. It's good for us to know, particularly if it's a lot of students, so that your experience is good. If you intend to use it, or you want to talk more about it, we're totally open to it; just feel free to reach out. That goes for anybody. Thank you, this was great. Thank you, Albert, and thank you, Lynn, for the opportunity. This was great.