Yes, there are no slides for this one, are there? Okay, let's talk about Training Infrastructure as a Service, or TIaaS. This is a really cool feature of Galaxy that we've built. It's something I designed and developed while I was working at usegalaxy.eu to support all of our training needs. We ran a lot of different training events, and we found that sometimes the cluster would be full, or someone would have launched a large number of jobs, and our students would sit there waiting with no way to run their jobs. The teachers would be confused and worried, because they never had any visibility into the cluster or how things were behaving. So the teachers would say: well, let's keep waiting, maybe we take a coffee break, something like that. We just didn't know.

So we built TIaaS to address this problem. TIaaS sequesters all of the training jobs onto separate infrastructure. If you have two compute nodes, you can dedicate one to TIaaS and say: if this person is doing a training, all of their jobs go to this dedicated compute node or compute cluster. We use it for exactly that purpose: all of these jobs can run immediately, and the only people they are queueing behind are the other students in that class, which makes life a lot nicer.

This tutorial obviously requires that you've done a lot of things before: it requires that you have a working Galaxy set up, that you have a job configuration file, that you know how to write dynamic destinations, and perhaps that you've also run jobs in Pulsar.

So let's get started really quickly. This is how TIaaS works. All of your Galaxy users go to the same Galaxy server and see the same interface, but on the back end, some of those jobs are put onto a different queue, a different cluster, or a different node. All of these are options; it's up to you how you configure it. At usegalaxy.eu, what we did was say: all of the training jobs should go primarily onto our dedicated compute nodes, but if those are full, then we'll let them execute on some of the main cluster as well, if there's space. All of the users see the same interface. They have all of their histories, all of their datasets, and all of their normal permissions to access datasets if you're teaching with restricted data. But in the back end, hidden from them, their jobs are sequestered onto this special queue that should be a lot faster.

So let's get TIaaS set up. First we'll add the role to our requirements and install it. Next we need to set up some group variables. I'm going to open up my group vars for the server, and at the bottom I'm going to paste in all of these TIaaS settings. This will set the TIaaS directory to /opt/tiaas, set up a user, pull the latest version of the code base, and set an admin user and admin password.

Additionally, we need to make some more changes: we need to grant some permissions on the database. At the start of this week we set up these PostgreSQL users and objects, and here we add the tiaas user, but now we also need to grant it some privileges. Earlier we set up the Galaxy user and the Galaxy database; now we're adding this tiaas user and granting these permissions. This is a nice thing you can do with this role: you can say that this user only has permission for a very, very limited set of things.
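As a rough sketch, the group variable additions look something like this. The variable names follow my memory of the usegalaxy_eu.tiaas2 and galaxyproject.postgresql_objects roles and may differ from the current training materials, so treat this as an illustration rather than something to copy verbatim:

```yaml
# group_vars for the Galaxy server (excerpt) -- a sketch, not verbatim from the materials.

# TIaaS service settings (names assume the usegalaxy_eu.tiaas2 role's conventions;
# the role also creates a service user and checks out the TIaaS code for you).
tiaas_dir: /opt/tiaas          # where the TIaaS code base lives
tiaas_admin_user: admin        # login for the TIaaS admin interface
tiaas_admin_pass: changeme     # please change this

# PostgreSQL additions: a read-only 'tiaas' database user with SELECT on just
# the user, session, and job tables it needs.
postgresql_objects_users:
  - name: tiaas
postgresql_objects_privileges:
  - database: galaxy
    roles: tiaas
    objs: galaxy_user,galaxy_session,job
    type: table
    privs: SELECT
```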
For instance, here the tiaas role can only run the SELECT command on the user, session, and job tables. It can't change anything in any of those tables, but it can read the data from them, and that's nice because it prevents any errors in the code from becoming bigger problems.

Next we need to add the role to the end of the playbook; I'll add that down there. Then we need to add the configuration for the service to the nginx role, in the nginx templates. I'm just pasting that in before the very last bracket. This says: everything coming into the /tiaas location gets passed to the service, we serve our static files, and we also redirect things from /join-training (there's a rough sketch of these locations a little further down). And that's it, we're ready to run our playbook. We'll let that run in the background.

Now that TIaaS is being set up, there will be a couple of new routes available. There's the TIaaS 'new' route, which lets you register a new training with the server; there's the TIaaS administration page; there's a stats page, so you can see the overall statistics of the server; and there's also a calendar page. I'm going to show you the EU calendar and stats pages quickly. Here's what the EU TIaaS looks like. You can see that most of our trainings are in Europe, but we also support some in the US and elsewhere; that one is bright blue because it's actually part of France. We've provided VMs for a massive number of days and a massive number of compute hours, we've taught a lot of students, and there are six events today. You can see where all the events are from. Over on the calendar page you can see how many events are running on any given day. If you're running a lot of these events, this gives you an at-a-glance idea of when it's a good time to make changes to your server and when it isn't.

So I'm going to pause for a second to wait for the playbook to finish. Okay, it looks like it's finished. You can see that TIaaS got installed, along with all sorts of different configuration for it. If we run systemctl status tiaas, we should see that it's running and working. Let's also have a look at the nginx configuration and see how that looks. It looks like Galaxy is running, and if we access the new TIaaS route we should see an information page about what TIaaS is and how it works, along with an 'apply now' button if you want to register a new training. Some of these are templated variables, so we've just written defaults like admin@localhost; you can customize this to your needs, and there are a lot of different variables that can be set.

So I'm going to register a quick training. My name's Helena, that is my email address, and I'm going to do a test training; let's call it 'gat', since we're at the admin training. Here we can write something about our training that would be helpful to the administrators who are going to judge this application and decide whether they're going to support it or not. Our admin training will start in January and end next year; normally you, or whoever is registering, would specify shorter dates of course. And I'm in the Netherlands currently.

One of the things TIaaS does is simply handle where the jobs should be dispatched and whether the training is enabled or not. If you are an administrator, though, one of the things you sometimes need to know is how many resources you should expect this training to need, and you need to compare that against how many resources you can actually offer.
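Coming back to those nginx locations from a moment ago, they look roughly like this. The backend address, the static path, and the exact redirect are assumptions that depend on how the TIaaS service is deployed on your host, so this is only a sketch:

```nginx
location /tiaas {
    # Hand everything under /tiaas to the TIaaS application server
    # (address and protocol here are illustrative; match your deployment).
    uwsgi_pass 127.0.0.1:5000;
    uwsgi_param UWSGI_SCHEME $scheme;
    include uwsgi_params;
}

location /tiaas/static {
    # Serve the service's static assets directly from disk.
    alias /opt/tiaas/static;
}

location /join-training {
    # Short, memorable URL for students; redirect into the TIaaS app,
    # keeping the training identifier on the end.
    rewrite ^/join-training(.*)$ /tiaas/join-training$1 redirect;
}
```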
Back to the registration form: you can have some questions here for the trainers who will be submitting these forms, like "I'm going to be using this URL for my training material." Then you as an admin will know: okay, if you're using this training material, these are the tools involved, this is how much memory I allocate to each of those tools, and you can start to think about how many resources you might want to offer them. There's the very important question of how many people will attend, we give a training identifier like 'gat', and we have some questions about advertising, which is a common requirement. Again, all of this just defaults to what usegalaxy.eu uses, but you can change it as you like. Then you click submit, and it tells you: congratulations, you've registered a training.

So let's go approve that training now. I'll go to the admin interface of TIaaS. The credentials are the admin username and password we set in the group variables; at the end of the group vars for our Galaxy server we set 'admin' and 'changeme'. Maybe you changed it, which would be a good idea, and then you should be able to log in with that. Okay, that didn't work, I'm going to debug this quickly. Okay, it turns out I'd just mistyped the password. Oh look, it works.

This is the default Django administration interface; TIaaS is built on the Django Python web framework, and we did that just to make life easy for us. You'll find all of your trainings under Trainings. Here you can see the identifier of the training, the keyword that'll be used to reference it; the email address of the submitter, in case you need to contact them with questions; things like how many days since the application was received and how many days until the event, so if you have events in the future you can approve them in advance; and the state: has it been processed yet, has it been approved or rejected. If you click on the training identifier, gat, you're taken to a page where you can see information about the training and change anything for your records, and at the bottom you can decide whether to approve or reject it. At usegalaxy.eu we get a large number of training requests, and we approve or reject them and email people accordingly. So I'm going to mark this one as approved.

Then I'm going to go back to Galaxy to actually join the training. Part of TIaaS is this URL, /join-training followed by an identifier; in our case the training identifier was gat, so I'm going to paste that in, and it gives me a message: congratulations, you're registered for gat. What's happened here is that my user has been put into a group named training-gat in the Galaxy database itself, and we'll be able to make decisions on where jobs should run based on that training group membership. Additionally, we get this really nice status dashboard, which isn't working yet, so let's go look in the admin interface first. Under Groups we'll now see training-gat, and it has one user: my admin user, who joined via the TIaaS join page. We just went to /join-training/gat, and you'll notice we didn't have to log in or anything. TIaaS knows who you are because it's running underneath Galaxy's path, so it has access to the Galaxy session cookie; it takes that session cookie, decodes it, figures out my username, and registers my user in the database with that training.

Next we'll set up my jobs to run somewhere special. Let's get that job configuration sorted out: we're going to create a new dynamic job rule.
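Here's a minimal sketch of the kind of rule this will be. The file and function names, and the destination IDs slurm and slurm-2c, are the ones used in this video; treat it as an illustration rather than the exact usegalaxy.eu code:

```python
# hogwarts.py -- a sketch of a TIaaS-style "sorting hat" dynamic rule.
# Galaxy calls this function for each job and fills in the arguments by name.

def sorting_hat(app, user):
    # Anonymous submissions (no user object) go to the normal destination.
    if user is None:
        return "slurm"

    # Gather the names of every role this user has.
    role_names = [role.name for role in user.all_roles()]

    # Anyone in a training-* group registered through TIaaS,
    # so send their jobs to the dedicated training destination.
    if any(name.startswith("training-") for name in role_names):
        return "slurm-2c"

    # Everyone else stays on the default cluster.
    return "slurm"
```

Returning a destination ID as a string is enough for a python-type dynamic destination; you could just as easily return a Pulsar destination here, or pick a different destination per training group.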
I'm calling mine hogwarts.py; you're welcome to call it whatever you want. This naming convention is just taken from usegalaxy.eu: I called ours the sorting hat, because it decides which house, or rather which compute cluster, the different training jobs should go to. I thought it was very clever at the time.

What happens in the sorting hat is that we get the app as well as the user information. If there's no user, or the user is anonymous, we just send them to the default destination, slurm. If there is a user, we collect all of their roles, and if any of the roles start with the training prefix, so we know they belong to one of the training events, then we send them to the slurm-2c destination, or Pulsar, or something else. This starts to give you an idea of how you can use this: with this dynamic rule you can say, if this is one of the training jobs, send it to a different cluster. You can get fancier, of course, and say, if this is one specific training event, send it to this specific cluster, something like that. At usegalaxy.eu we use HTCondor for our cluster, so we can label individual machines for individual purposes; some of them get labelled, say, training-gat, and then we send specific Condor information in the job configuration saying this job has an extra requirement: it must run on the training-labelled nodes, or on the training cluster. Things like this are possible, but they're also very cluster specific, so you'll have to decide based on how your cluster looks and how you want it to behave.

So I'm going to open up the group variables for our Galaxy servers, and we need to add our new dynamic rule. Again, make sure these minuses match up; they may or may not match the training materials exactly, because those get updated at different times, so just make sure everything is consistent, otherwise you'll have issues. I'm also going to double-check my job configuration to make sure the slurm and slurm-2c destinations are still there and functional. They look good; I didn't change them in a different training.

Now, in the job configuration, we need to add our new destination. I'll just put it down at the end with the other dynamic destinations and indent it properly. This is just another of the python destinations, like the admin-only destination we wrote before; now we have this sorting hat destination. Then I'm going to make it the default, so all of our jobs go through the sorting hat: way back up at the top, the default destination should now be sorting_hat. So we've defined one Python function that decides where all of our jobs go, and all of our tools go through it and it decides: does this need this cluster or that cluster? When you're doing this in production, though, one common thing is that upload jobs are a little bit special: they should stay on a local cluster and shouldn't be sent somewhere remote like Pulsar; that can be an issue sometimes.

So let's run our Galaxy playbook. This will set up Galaxy to start processing these jobs in a different way, and then we can get started running some jobs. I'll queue up some uploads, and whenever the playbook has finished running we'll kick off the jobs and see where they get sent. The upload jobs should be sent to our default Slurm destination with one core, while my other jobs should be sent to another destination, because I'm a special user: I'm in the training group.
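For reference, the job configuration additions described above look roughly like this in the XML job_conf format; the destination names and Slurm parameters are just the ones mentioned in this video, and if your Galaxy uses the YAML job configuration the same pieces apply:

```xml
<!-- job_conf.xml (excerpt) -- a sketch of the additions described above;
     the plugins section and the rest of the file are unchanged. -->
<job_conf>
    <destinations default="sorting_hat">
        <!-- existing static destinations from earlier in the course -->
        <destination id="slurm" runner="slurm">
            <param id="nativeSpecification">--nodes=1 --ntasks=1 --cpus-per-task=1</param>
        </destination>
        <destination id="slurm-2c" runner="slurm">
            <param id="nativeSpecification">--nodes=1 --ntasks=1 --cpus-per-task=2</param>
        </destination>
        <!-- the new dynamic destination that hands each job to our Python rule -->
        <destination id="sorting_hat" runner="dynamic">
            <param id="type">python</param>
            <param id="function">sorting_hat</param>
        </destination>
    </destinations>
</job_conf>
```

The sorting_hat destination works just like the admin-only python destination written earlier in the course: runner 'dynamic', type 'python', and the name of the function to call.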
Back in the playbook output, you'll notice this big error message. This is a consequence of how Django works: in Django there's an easy way to create a new user, you run a script that creates them and sets the password, but if you run it again it fails because the user already exists. We have a task defined in Ansible to create this user, and it tries to create the superuser every time, but because the email address or the username is not unique, it fails. So we allow this task to fail, and you'll notice, with the 'ignoring' note, that Ansible said: okay, something clearly went wrong here, but I'm going to let it fail.

So I've added some jobs. Do we have the digest tool? Yes, so I'm going to run the digest tool on some of our datasets and select all of the hashes. I like to use this tool for testing; it's just an easy way to check whether things are working. Looking at the information for one of my uploads, one core was allocated, so it was definitely sent to the one-core Slurm destination. Then we see some other jobs that are waiting to run. In the background, the Singularity container is being unpacked to run these jobs; that's the delay here. Once that's complete, these jobs should get sent off to the two-core Slurm destination, and we can already go ahead and look at one: you'll see the native specification down at the very bottom saying two cores per task. Oh no, an error occurred, in case you've never seen an error before: "you must provide an input file name". Okay, something is going wrong on some of them; the datasets may be missing or something. Anyway, for some of them it worked, and that's the important thing.

Now I'm going to show you the TIaaS dashboard. We kind of skipped over this earlier, so let's come back to it now. This is the TIaaS dashboard, and it's available to everyone, so the teachers, and the students in the training if they like, can see this dashboard and get information about how the class is doing. For something like a pandemic this is incredibly important and incredibly useful for remote teaching. We actually developed it as a result of a remote teaching project where we were teaching students in classrooms across Europe, and we said: well, we don't know how the students are doing. We'd tell them to run a step, say the upload job, and then as teachers we had no visibility into how they were getting on. So we developed this to work around that.

This gives your teachers visibility into how students are doing. They get this nice dashboard with an overview of how many students are registered, so they can tell if not all of the students have registered with TIaaS yet and say: you need to register so your jobs run in time. You don't have to, but if you don't, your jobs are going to be slower. They can see an overview by tool: okay, for that digest command I just told everyone to run, two are in the ok state but three are in the error state, so maybe I need to discuss this tool a bit more or help students figure out what went wrong, and you discuss that with the class; or, the upload jobs are all in the ok state. They get a really quick overview of how many jobs are in each state. And then they get this job queue view. This shows the teacher a hashed user identifier that changes regularly but is consistent within this interface.
So the teacher can say: oh, these students got this result. Or, if I have ten students, I can see that this tool has been run ten times and maybe one of them failed, or one student is failing consistently, something like that. They can see how long ago each job was created, whether it's in an error state or ok, and the job runner ID, which, if they need help debugging, they can always pass to one of the cluster administrators or the Galaxy administrator to figure out what went wrong.

Additionally, we also show a workflow invocation queue. So if you make a workflow: I'm going to make a really quick workflow with just this digest tool again, because I'm such a fan. I'll give it an input dataset, and maybe it'll run digest twice. Especially for bigger workflows with a lot more steps, it can be a bit harder to see what's going on in the interface, and with the workflow invocation view, for the tutorials that use workflows, you can really easily see how many students have run the workflow they're supposed to run. Okay, let's click run. I'm going to go back to the status page, and you'll see that two jobs have been created; now one's running, one's ok. You can watch in this interface as the new jobs are created, run, and get marked ok, and down here at the bottom you can see the workflow invocation queue. For some of the tutorials on the Galaxy Training Network we tell the students: hey, upload this workflow file, and then you can do part of the training in a faster manner than having to configure all of these tools over and over again.

And with that, you've set up TIaaS. This gives you the TIaaS service at /tiaas, which has an information page; the 'new' page where your teachers can register new trainings; the admin page for going through, reviewing, approving, or cancelling trainings; the join-training page, which lets students join a training; and the status page, which lets teachers see how students are doing. Because that page is public, the user identifier is hashed, of course. We can go back to the join page just to see it again. So yes, this is how TIaaS works. It's one of the best things I did while I was at usegalaxy.eu; it has completely changed how we teach, especially in a pandemic or in remote teaching times, when you need to see what your students are doing and be relatively sure that their jobs will run on time.

So as always, please give us feedback and let us know what you thought of this content, if there's anything we need to change, and so on. Thank you so much, and enjoy your weekend.