All right, so release testing. First of all, thank you very much to a wonderful team: Alireza, Brian, Cameron, Jennifer, John Chilton, and Lila. First, a brief overview of how we do release testing. It changes a little every release cycle, but roughly the process stays the same. We test against usegalaxy.org and use local instances whenever necessary. We have a formal, detailed release testing plan which outlines a protocol specifying what we test, how we test, how we determine whether an issue is indeed an item of concern, how we open issues, and how we verify that something stems from the current release rather than something which has been around for ages. This helps us stay on track with testing everything we initially planned. We communicate via a Matrix channel for release testing. The channel is public, and there is a link in the slides, so please feel free to join if you are interested. The channel is more or less inactive between releases, but during the release testing cycle it is quite active.

The release testing team really has two goals: one for the manual testing, and one for whatever automated testing activities we come up with, and we have one of those this time. The goal of manual testing is essentially to test everything we can: stray off the happy path, try whatever a user might accidentally do, and discover bugs which are very difficult or impossible to discover by automated means. Whenever we find one, we open an issue on GitHub, and at the end we summarize the results and present them to the community.

For manual testing we focus on two areas: requirements-based testing and scenario-based testing. For requirements-based testing we use the highlights for the release notes. These are the key items which make the current release, the things we put forward and advertise, so they are our highest priority; we try to make sure the items which describe the release actually work as described. We also use pull requests which contain instructions for manual testing, as well as pull requests labeled UI/UX. It is a curated list: out of that master list we select the PRs which are actually testable and which make sense to test manually. That is our, loosely speaking, requirements-based testing. Then we have the GTN tutorials, which we use for scenario-based testing: sets of paths a user might take when using Galaxy in the wild. These are the paths we take, and the paths we creatively stray from while trying to break Galaxy.

The timeline is usually one week, typically the week right after the freeze. We have four to five team members and it takes about a week, three to five days. And a special thanks goes to our volunteer testers. Thank you, Volskan, by the way; the help was most appreciated.

What was manually tested? The release notes highlights are the areas we paid special attention to: the tool search and advanced search updates, history drag and drop, tool form improvements, theming support (which is not enabled by default on usegalaxy.org, so we tested it using a local setup), global drag and drop, et cetera.
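As a concrete illustration of how such a candidate list can be generated, here is a minimal sketch that queries GitHub for merged PRs carrying a label. The repository, milestone, and label names are illustrative assumptions, not the exact query the team uses, and the real curation also involves the manual-testing checkbox and a manual pass over the results.

```python
# Minimal sketch: list merged PRs carrying a given label via GitHub's search API.
# The milestone and label values ("23.0", "area/UI-UX") are assumptions used for
# illustration; adjust the query to match the actual release and labels of interest.
import requests

QUERY = 'repo:galaxyproject/galaxy is:pr is:merged milestone:"23.0" label:area/UI-UX'

resp = requests.get(
    "https://api.github.com/search/issues",
    params={"q": QUERY, "per_page": 100},
    headers={"Accept": "application/vnd.github+json"},
)
resp.raise_for_status()
for pr in resp.json()["items"]:
    print(f'#{pr["number"]}  {pr["title"]}')
```

Unauthenticated requests to the search API are rate-limited, so a token would be needed for anything beyond a quick listing.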
First of all, as a disclaimer: we are never able to test all the items on the list of items to test. That said, the proportion of items tested is really a function of what we decide to put on the list, because we generate the lists ourselves. This time we had a list of 64 PRs with manual testing instructions and 65 PRs with the UI/UX label, curated manually down to what was most likely a reasonable subset to test by hand. Of those, we tested 49 out of 64 and 17 out of 65. For the GTN tutorials, we covered 21; again, it is not about the quantity, they are just suggested scenarios which we follow. We also opened issues. The next slide gives more detail about how many and of what types; this is a rough overview, and the screenshot includes not only issues opened by the team but also issues opened by the volunteers who took part in the testing. Again, thank you for that. And now, moving on to Alireza.

Hi everyone, and thanks, John. I am going to give you some information about the opened issues and the proportions in each area. In total we have 37 opened issues across different repositories (the training material, Galaxy, and the tools), mostly in Galaxy. Next slide, please. If we compare them by technical category, more than 80% of the opened issues are in the UI/UX area, with API next, then tools and accessibility. That means we have to focus more on UI/UX bugs and fix them in the future. Next slide, please. If we categorize the opened issues by area, the biggest area was the history, then workflows, then tools, et cetera. The history is the core of Galaxy, so it collects the most issues; workflows come next because we made great changes to workflows during this release, and because of that we found numerous workflow-related bugs.

To give some examples of why these issues matter: we found an issue when using drag and drop and the multi-history view together. As you saw in the previous slides, we have two great features in this release, the multi-history view and global drag and drop, but they conflict when you are in the multi-history view and want to use global drag and drop at the same time. That is one of the important issues we found in the UI/UX area. The next example relates to the workflow editor: while testing it we found, for example, that some steps could not be connected, or not connected correctly. The next example is again workflow-related: the workflow inputs and outputs are not visible when you want to filter history items by input and output in imported histories. We should address this by, for example, disabling the filter or explaining to the user why it is not available for imported histories, since it is based on jobs. And the last example was about the new datatypes page rendering an empty table, which may leave the user confused about what is going on with that page. Okay, that is it from my side. Thank you, Alireza. Now a few observations and thoughts about these bugs. Brian?
Yeah, I'll just go over a couple of things I noticed while testing. There are of course some new additions to the UI, and one of the things I spent time on was going over some of the older tutorials that a lot of first-time Galaxy users might use. Right off the bat I noticed that some of the screenshots were outdated. In most cases this isn't vital, but if you're new to Galaxy it might be confusing when you're trying to follow things step by step. Among other things, some of the icons look different and sit in different places, for example the refresh-history button and how you edit your history name: before, you would just click the name and type what you wanted; now you click the edit button and edit the name along with other attributes. Generally minor things, but for new users it'd be good to have more recent pictures, and the tutorial owners went ahead and updated these images, so now everything looks the way it does in the new release. Next slide, please.

I also found a couple of cool new additions, and one of them was the second Run Tool button. For one, it now says "Run Tool" where it used to say "Execute", which I think is more direct about what you're doing; but there's also now an additional way to run a tool. Before, there was just the one button at the bottom, so you had to scroll through all the tool options and then run it. Now you can do it right at the top. This is really useful if you're trying to rerun a job exactly the way it was before, or perhaps only change an initial setting like your input file, and do it quickly. Some of the tools on here have loads of options and settings, so it's really neat to quickly rerun something. A plus to whoever developed that. Next slide, please.

I also spent some time with the interactive tools, particularly JupyterLab, which is really neat to have. We're actually working a good bit with one form of this tool here in the Blankenberg group; we're trying to import Qiskit, a software stack for quantum computing, using JupyterLab. But I did notice a couple of things, and again I don't know if some of these issues are on the server side, a problem with main, or on the release itself, or on the tool itself. I noticed that you can't save the notebook back into your Galaxy history, and you're also unable to import history items. You can give it the command get and then your dataset number from your history, but it doesn't actually import it; it just echoes "import" followed by whatever number you gave it. The screenshot resolution isn't great, so I apologize for that. One more thing I found is that notebooks won't stop running; they're actually still running right now. Even after you kill the interactive tool, it still shows up in your history as running, as you can see in the image on the left. Again, I don't know if this is an issue on the server itself; things are a little bogged down and running slowly, but it's something worth noting, I suppose.
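For context on the notebook commands being referenced, below is a rough sketch of the intended get/put workflow. The explicit import is an assumption made for illustration (the Galaxy interactive tool normally pre-loads these helpers in the notebook), so treat this as a sketch rather than a confirmed API.

```python
# Rough sketch of the notebook workflow described above. The import line is an
# assumption; inside Galaxy's JupyterLab interactive tool the helpers are usually
# already available in the notebook environment.
from galaxy_ie_helpers import get, put

get(1)                  # expected: copy history dataset #1 into the notebook
put("my_analysis.png")  # expected: send a generated file back to the history
```

The bug being described is that, on the release under test, the first call did not actually import the dataset.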
Certain things did work, though: if you use matplotlib in the notebook, generate a graph, and save it as a PNG file, that file could be saved and sent back into your history, no problem. So that was successful. Next slide, please.

So, the new history layout. I have a couple of points on this. To be honest, I found it a little more difficult and less intuitive, particularly when showing multiple histories. There's an extra step you have to take, and your current history isn't automatically shown. In all the previous releases, there was one icon you clicked to show histories side by side, and then it showed you everything: your current history stayed on the left, you could scroll through all your other histories, and you could just drag and drop items, no problem. This adds an extra step and just didn't seem very intuitive: you open the side-by-side view, nothing shows up, and then you have to start picking what you want to see, rather than seeing everything at once with your current history at the front. This is just my opinion, but I feel like how it was in 22.05 was a little more intuitive. Also, like Alireza mentioned earlier, I had issues dragging and dropping history items. In this setup, when I showed histories side by side and tried to drag an item from an old history into the current one, the screen would show "drop here", but when I dropped it, nothing actually happened; it wasn't copying the datasets via drag and drop. So that's what I have to say on the history side of things. Next slide. Yeah, that's it.

Thank you, Brian. And John is going to talk about deployment testing, super exciting stuff. John, are you there? I see him drawing. Oh, sorry, technical difficulties; I drew on the screen as I turned on my microphone for some reason. All right. So one thing we did this release cycle was make some progress on a task we've talked about for as long as I've been on the project, and that's deployment testing. Most of our tests run in CI as a result of pull requests; deployment testing is the act of running tests against a running server, in this case usegalaxy.org. So it's running our existing tests against usegalaxy.org. In theory, in the future, we can develop tests specifically for our public servers, and that would be great, but for right now it's mostly about running the tests we already have, the ones that make sense. That includes a lot of our API tests and a lot of our end-to-end tests, which are currently implemented in Selenium. The idea is to run those tests and see how the deployed Galaxy is working, and hopefully this will be a valuable resource for other people deploying Galaxy as well. Now, we have hundreds, if not thousands, of tests, and there are some serious limitations to what we can do with deployment testing, because a lot of our tests depend on highly specific Galaxy configurations.
We have two whole sections of integration tests, for instance, and all of our unit tests, that don't make sense to run against usegalaxy.org because it's just not in the right configuration. And of the appropriate tests, a lot of them are currently factored in such a way that we wouldn't want to run them on main: they require an admin key, or they assume you get a new history every time you start a test, or they log in a new user. But there's a bunch of tests that do run, and I did get them running against usegalaxy.org, and I have a note that might turn into a GitHub issue, because there are a number of things we can do to make the existing tests more compatible. Hopefully this process will also mean that in the future, as we review PRs, we think about tests in terms of how to get the most bang for our buck when running them against main. Do you want to head to the next page?

This is a little GitHub Action I wrote. These tests have to be launched by hand; they don't run against every PR. You click Actions and then the deployment tests. It's probably only something repository owners can do, but if you've got a clone of Galaxy, you can certainly modify that file and run it against whatever server from your own branch. Up here there's a little dropdown where GitHub Actions lets you parameterize the run: right now you can run the API tests, the Selenium tests, or all of them, and I have a few different servers plugged in there. You can also select which branch of code the workflow runs from (that's the option GitHub provides at the top) and, within the action, which set of tests you're cloning out. Usually both of these will just be dev. So that's where that's going. Right now the output is just pytest output: you have to scroll through it, figure out which tests are failing, and do the debugging. Hopefully in future releases we'll improve that, for example with the HTML plugin, and make the process better documented and easier, but this was the initial experiment to get the plumbing down. Next page.

So that's what we've accomplished. In the last couple of releases we did a better job of annotating which tests require admin users, or create too many new histories, that sort of thing. This release, we got the GitHub Action working and built a list of useful tests that I think we can basically run against main in the future. We caught one of the issues the manual team had already caught: zip files not downloading for collections. Since these tests run on PRs, you wouldn't expect a lot of failures, but even this first pass caught something, and that has to do with the fact that how that zip file is served depends heavily on Galaxy's proxy and backend configuration, and those are very different between main and the test environment. So over the next couple of releases, hopefully for the next release, I'm going to continue this process.
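To give a flavor of what such a deployment test looks like in practice, here is a minimal pytest-style sketch pointed at a live server. The environment variable name is an assumption made for illustration, not the parameterization the GitHub Action actually uses; /api/version is a standard, unauthenticated Galaxy API route.

```python
# Minimal sketch of a deployment-style test: hit a running Galaxy server instead of
# one spawned by CI. GALAXY_TEST_SERVER is an illustrative variable name only.
import os

import requests

GALAXY_URL = os.environ.get("GALAXY_TEST_SERVER", "https://usegalaxy.org")


def test_server_reports_version():
    resp = requests.get(f"{GALAXY_URL}/api/version", timeout=30)
    resp.raise_for_status()
    assert "version_major" in resp.json()
```

Run with pytest against whichever server the variable points at; anything much beyond a smoke check like this quickly runs into the configuration differences described above.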
So hopefully I can run the deployment tests before the release and then again after the release is deployed, so we can start looking for regressions. As I mentioned, improving the reporting and the documentation is another goal, and then growing the number of tests is going to be a key thing: improving the existing tests and growing the number of tests we run against main will be helpful. Past that, as the process solidifies and gets better, it would be nice to develop a plan for integrating this into the deployment process. If, even before the manual testers get involved, we've run all the automated tests and either found a bunch of bugs we can fix or found no bugs, then it's easier to hand things off and say: we think this is at least in a state where the golden path, the main path, works. Hopefully that will ease the process and make things more stable.

Thank you, John. This won't just ease the process; it solves about half of the problem we keep going over with manual testing. We keep saying that we need to automate this one way or another, and finally, thanks to John, we're actually doing it. Well, John is actually doing it.

Now a few notes on the release testing process itself. We try to learn each time, and the lesson learned this time is that we won't plan to start in mid-December again; Merry Christmas and release testing don't work well at the same time. We're not going to start on January 1st either, and January 2nd is out, so we'll do better with that. We'll try to form the team well in advance, two to four months ahead, and of course we'll stay flexible: if something unexpected comes up, we'll find a replacement, not to worry. But I expect it will be much easier, both for the release testing process and, most importantly, for the people doing the release testing, to plan ahead and know that there will be a week, roughly three months away, when they'll be among the people responsible for making sure the release works. So those are the changes to the planning process.

We look back each time we plan a new release testing cycle and try to learn from previous experience, so here are a few points. First, last release we introduced the release-testing label on GitHub to make sure things did not get left behind and lost, and it works pretty well: at the beginning of this cycle there were only two issues still carrying the release-testing tag from the previous release. One of them was already closed, and the other is still open; it's a wish-list kind of item, so I wouldn't call it a problem. In that regard we're doing well, so we're going to keep the release testing 23.0 label on GitHub, and hopefully by the next release testing cycle there will be very few, if any, issues still tagged with it.

What we've implemented works well. This is not just from the previous release cycle; it's a combination of recommendations and feedback from all the previous cycles. One recurring piece of feedback was that we need a more structured testing plan, so each time we make it more structured, and that always helps. It helped this time, I think. We also open issues aggressively.
That means we don't try to fix the problems we find, because, as practice shows, when you do that you get stuck on one issue; it takes you down the garden path and you miss opening 10, 20, 30 other issues. We also keep trying to reduce the list of PRs to those relevant to manual testing, keeping it limited to the PRs which are actually worth spending time on, and in that regard, what John just went over with deployment testing is going to help this particular problem hugely. And of course we prioritize the release notes, because those are the key features which must just work.

What we still need to address: there are still too many items to test. Even though we manually curate the list, it's still too much for a team to handle, and it's not a problem of team size, because if we doubled the number of people there would still be too many items. If we spent very little time on each item we could test them all, but the quality would go down. The key is to keep narrowing the list down to the items which are definitely not automatable, the ones which genuinely require human attention. Also, it's often not clear how to test an item: sometimes a feature is not available on usegalaxy.org for whatever reason and we resort to local instances; sometimes the instructions are not clear (I'll get to that); sometimes it requires admin access; sometimes it's a backend issue or requires a configuration change. We need to handle those better than we currently do. And finally, we cannot do side-by-side testing of selected items when we want to check whether a bug also occurs on the previous release. We've discussed this forever; it's not easy to implement, but if we finally get to setting up test alongside main with a similar setup, within reason, that would be nice. Then we could run things on test when we need to double-check what's going on in the previous release.

How can the community help? First of all, manual testing instructions. They're used for two things: we use them to automatically generate the list of PRs worth testing (the full list of PRs is very long, so the checkbox we added about a year ago helps a ton), and they're used to conduct the actual testing. So, first, if you think your new feature or your fix needs, or could benefit from, manual testing during the release testing cycle, do check that box. Otherwise there's no way it's going to be manually tested, because the selection process is automated. Second, if the box is checked, please, please, please include the instructions and make them clear. One observation from a previous release testing cycle was that many of those instructions were written by domain experts, for domain experts. They need to be clear enough that any member of the testing team can follow them: a developer who works only on the frontend, or only on the backend, and who isn't familiar with the particular item described in the PR, or maybe a user. I hope we'll eventually get diversified tester teams with developers, admins, and users. So these instructions need to be clear and very specific.
Again, we have a lot of PRs to test, and a clear set of instructions helps very much. A few more points. Release notes: writing the release notes is a chore; it takes a lot of time and it's not easy to generate them automatically. We can work without them: we look at all the PRs tagged with the highlight label and, based on those, make a list of key features which will probably make it into the release notes and which probably deserve our primary attention. That said, having an actual text describing each feature simplifies the work of the release testing team tremendously, because the testers can focus on figuring out how to test a given feature instead of doing Galaxy codebase archaeology, digging through PRs and issues trying to figure out what exactly the feature is supposed to do. There's no easy answer here; writing the release notes takes a lot of time, so maybe we can figure out a way for other teams to help with it. But if we can have the release notes before we start testing, I think the testers will be able to test more and be more productive.

Next, improving the list of items for manual testing: differentiate between user-facing and admin-facing items. Some items are clearly UI/UX-related and can be tested against usegalaxy.org; some require backend access; some require admin access. So maybe it makes sense to put more thought into specific labels we could use to tag the PRs which need manual testing. I don't have specific suggestions at the moment, but it's something we could think about improving. And one more point: Cameron made this suggestion, and I don't know how workable it is, but it sounds pretty neat, so I wanted to share it with the community. Can we generate a component dependency graph for the UI, so we can automatically see which features should be tested when a particular component has been updated, especially for abstract components that expose a variety of use cases, like a form text view component? I don't know how feasible that is; my guess is not very, because we don't have a mapping between the set of view components and the set of things a user or an admin actually sees. That would be a challenge, but the idea is neat, so I wanted to throw it out there. And that's it from us. Again, thank you to the team, thank you for bearing with us, and questions: what can we do better?

Thanks so much for that presentation, John, and thanks so much to the release testing team. Do you want me to stop sharing? Oh, sure, that'd be fine. One thing you noted among the things that are going well is that having a good structure for the release testing is really beneficial. Can you describe a little what that structure looks like? Yes. In fact, how about I share the screen real quick, if you don't mind, and I'll show something. So: we have a release testing plan, and it outlines all the steps of the release testing phase, what goes into it, and how we do things. We have a summary of tasks which explains what we do and why. Then, for example, we have the GTN tutorial scenario-based testing, and the plan explains why we call it scenario-based testing.
We have a section which explains when to open an issue and when not to. I remember when we started with this approach about two years ago, we would open issues for absolutely everything, and as a result we spent hours upon hours testing a single tutorial and finding every possible fault, literally going as far as "the background color on the screenshot is outdated and might be confusing because the current color scheme in Galaxy has changed". That took up about 90% of our time. So we tried to map out what is an issue and what maybe is not. These examples of straying off the happy path are there to remind the testers that simply following the tutorial duplicates what a script can do. Maybe we don't have all the scripts we want yet, but it's still an automatable task; the value of a human tester is in doing the crazy, silly things a user will do by mistake, or just because that's what users do. So we map that out here. Then we have the release notes requirements-based testing, and again it's only loosely requirements-based, because we don't have a preset list of requirements which must work, but a list of PRs and highlights is as close as it gets, so we explain how to handle those. Then we go over how to record bugs. This release testing plan is actually a little less detailed than the previous one; we've trimmed it down so it doesn't look as boring or as scary, but it still provides the set of steps we try to follow. So that's how we do it, and that's only for manual testing; we also have the automated testing component, and that's a different story.

Awesome. And I assume there's a sheet with all the specific tutorials, which gets divided up among the group of testers? Yes. Let me show it so the community knows how we handle this. We have a folder on Google Drive with several documents, and probably the most important one is this Google Sheet; just a moment, I'm trying to get rid of Zoom. The sheet lists the highlights we're testing, with all kinds of relevant data next to each highlight. We have a list of PRs with each PR's description, status, and labels, so that whenever testers assign a PR to themselves, they know what the PR is about and which areas of Galaxy it touches. We have a list of tutorials, which we treat as suggestions; if someone wants to use a different tutorial, that's fine. For example, we used to require specific science tutorials, and now any science tutorial will do, as long as there are three. Then we have a list of opened issues. This is where we record the issues we open, and whenever we can link one to a PR currently in progress which fixes it, we do that. We also keep semi-structured notes, with structured notes for each opened issue here, and in addition we have a couple of Google Docs with notes, which we use for discussing things that don't logically fit into the release testing channel or the final slides. So that's how we discuss things and keep track of what's being done.

Thanks. Any other questions or comments? So this was really great; thanks for doing all this.
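Coming back to the component-dependency-graph suggestion mentioned a few minutes earlier, here is a very rough sketch of how such a mapping could be approximated by scanning the client's Vue components for imports. The directory path and the regex are simplifying assumptions for illustration; dedicated JavaScript tooling would do this far more robustly.

```python
# Rough sketch: build a "who imports whom" map over .vue files, so a change to one
# component can be traced to the components that use it. Paths, file naming, and
# the import regex are simplifying assumptions, not a description of Galaxy tooling.
import re
from collections import defaultdict
from pathlib import Path

IMPORT_RE = re.compile(r"""import\s+\w+\s+from\s+['"].*/(\w+\.vue)['"]""")


def build_dependents(client_src: str) -> dict:
    dependents = defaultdict(set)
    for vue_file in Path(client_src).rglob("*.vue"):
        text = vue_file.read_text(errors="ignore")
        for match in IMPORT_RE.finditer(text):
            dependents[match.group(1)].add(vue_file.name)
    return dependents


if __name__ == "__main__":
    deps = build_dependents("client/src/components")
    for component, users in sorted(deps.items()):
        print(f"{component}: used by {len(users)} other components")
```

Mapping components onward to user-visible features is the part for which, as noted above, no existing mapping exists; this only covers the component-to-component half.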
I guess, if the plan is to start arranging the groups about three months ahead, is now the time to start the next group? Yes. Awesome. We have a spreadsheet, and the selection of candidate testers is semi-random; we try to keep it fair. It's a super useful task, and it can be fun, but it's also a chore: we all have our favorite coding projects we're working on, which are super important, and which are more fun than sitting and testing the next PR in a list of a hundred. So, to be fair, we keep a spreadsheet listing all the release testing cycles, starting with 20.09, I believe, and the names of everyone who has tested each cycle. When the next release testing cycle comes up, we look at that spreadsheet and try to find who has not been a tester yet. I think the next release is actually the first time we'll have to roll over and someone will do release testing again, even with as many team members as we have. Once everyone has done release testing, we go back to the pool of those who did it longest ago, make a random selection, and then adjust based on availability.

And then, just from a coordination standpoint, for the release and the testing and the deployment, is the plan to have that out before the GCC? Yes. Okay, that's going to be a couple of busy months, I guess, right? Yep. Yes. Awesome, thanks a lot. I guess no more questions. Thank you again.

One more comment, maybe. You mentioned adding specific tags about what needs to be tested manually, or what is user-facing versus admin-facing. Is there a particular process for adding a new label? Does it depend on folks adopting those labels? How can we move that into reality? Well, I suppose first we need to give it some thought, and then the committers need to agree that a new label indeed makes sense. We probably don't want too many labels, because as soon as there are too many they stop being useful, so we try to be careful about adding one. Then we add the label; there's no formal process for introducing a label to the community and having the community adopt it. But since in this case we'd be using them to tag PRs which should be tested manually during the release testing phase, I believe we'd need to somehow convey that to the community of contributors so they start using the tags. And I don't see this as many labels; I'm thinking one or two, to differentiate between "needs manual testing, user-facing" and "needs manual testing, admin-facing", or maybe "needs manual testing, but it's backend-related". An example of the latter would be our recent work on changing the migration system: lots of manual testing steps there, and it was an important manual testing task during release testing to ensure there were no regression bugs, but it's not user-facing, and it's not even admin-facing; it's 100% backend. Well, we found issues with it while deploying, so it turned out to be 100% admin-facing. I suppose you're right; that would be 100% admin-facing. Maybe I'm wrong, and maybe there are no pure backend items, backend meaning not admin-facing, which would require manual testing.

With a lot of the bugs found being UI/UX, do we need to invest more time in integration testing? I mean, we only have Selenium; what's the feeling on that?
Would it help to have more tests that extend beyond testing components in isolation? As John pointed out during his talk on deployment testing, if we can automate the happy path, that would be super. I'd like to do a retrospective down the road, categorize the issues we did create from this, and think about it; obviously more testing is always good, right? But just quickly glancing at this, I don't think a lot of the UI/UX-type issues we found are something we'd catch in testing, right? Most of these... Global drag and drop turned out to be a regression, and it happened not when the feature was merged but after a different feature was merged on top of it. That could be automated; that type of thing could be automated, I think. Yeah, you'd definitely catch a few, but generally a lot of these are more look-and-feel kinds of issues, the kind of thing we catch this way. I mean, the goal, the holy grail, is for the happy path to be completely automated, so that manual testing is only used for discovering things which are not scripted or scriptable. Yeah. Presumably the tutorials are all potentially scriptable; right now we don't have the resources to get there, but if each of them had a tour and we figured out different ways to automate that, et cetera, one could imagine getting there someday, and then hopefully the manual testing would be just like A/B testing on main that we give Jen and a couple of power users access to. But yeah, we're obviously a ways off from that. For the bugs that I fixed for the release, they were all in the workflow editor, and I think all of them resulted in me adding mostly integration tests and Selenium tests. So I think the answer is that probably at least half of them could be covered by integration tests. But yeah, I agree; we should go back and see which ones could have been caught.

Yeah, tests are a little like airport security, right? You can always have more and more tests, but you don't know which test you need until something breaks, and then you say, oh, I can add a test for that, some of the time. Yeah, but I think there's a pattern where the component works fine, just not with the data that's coming in, and if our default way to write tests is only component tests, that's bound to cause some problems. Yeah, agreed; we can definitely do better.

So yeah, John, you mentioned that; that was the thing I wanted to follow up on. Where is the time best spent? If you spend X percent of your time retesting PRs and Y percent just testing the tutorials, where are the issues actually coming from? Are you seeing the most value in retesting PRs or in the broader tests? That would be a question for Alireza and Brian. What do you guys think? Could you repeat that? Yes: tutorials, versus retesting PRs, versus going over the release-note highlights. Where do you think you spent most of your time, and where do you think it would be most useful to focus it? I ended up spending most of my time on tutorials, just because I wanted to go at it as a first-time Galaxy user and see if I ran into any problems there. But I'd say all of them would be equally useful. Having the highlights was good just because it gives you a sense of, okay, what is actually most important here?
Because, like you were discussing, there are definitely a lot of items every release that need to be looked at, and perhaps not enough time or enough hands and eyes. So I'd say a combination of the tutorials, then the highlights, and then just the most relevant PRs within those highlights, I suppose. At one point it felt a little scattered: okay, what exactly should I focus on here? I've got three different types of approaches or issues to look at. So yeah, that'd be my take on it. Jen, and I just noticed you as well, Nicola, I'm sorry; if you have anything to comment on, please just interject. And as for the tutorials, in my experience, as soon as you start treating them as the item you are testing, you start dedicating all your time to them. But as long as you treat them as a suggested path and start trying to break Galaxy along that path, they're much more valuable in my opinion, because in the first case you're basically doing what a script can do for you. Yeah.

Following on that, what I was thinking ahead to was this: if we can get more of this per-topic testing done during the release, so if you're looking at a specific PR and you want to make sure it's well tested, how can we shift some of the burden of testing those individual PRs out of the end-of-cycle release testing, and have the more end-to-end, topic-based testing be your real focus? Or maybe that's not a good idea; that's what I was trying to dig into. It's an idea. I don't have an immediate response, not because it isn't feasible; it might be. You're thinking about the main issue of release testing, which is the core concern: how do we distribute our time more efficiently? Yeah, exactly. You said in one of the earlier slides that there were, you know, 2,000 things you wanted to test and you could only do half of them, or whatever. Right. So if we can somehow help out, or not help out necessarily, but do more of the individual highlight or PR testing during the release cycle somehow, that might help. Maybe the labels, assigned to PRs at the time of merge, even though, I know, Maria has pointed out that they're not available to everyone, only to maintainers. But when we approve or merge a PR, we could check whether it was merged with or without a kind/area label, and then of course Nicola will add it about half a second later. So maybe that's what we could do: tag a PR if it needs to be manually tested, assuming we can tell that it should be manually tested. Do you have specific PRs in mind? I mean, could you make a list of, say, five PRs where you think that would be the case? Because the PRs should all come with tests; we don't want to duplicate automated tests. Well, global drag and drop, for example; that was broken after another feature was merged in. So global drag and drop would probably be something the submitter would expect needs to be tested manually when it's added to the release, and adding a label would help with that. But again, that can be integration tested. I think as reviewers we should probably pay a little more attention to these.
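As a point of reference for the integration-versus-component-test discussion above, the smallest possible browser-level check looks roughly like the sketch below. The selector and target URL are assumptions for illustration; Galaxy's real end-to-end tests are written against its own Selenium test framework rather than raw WebDriver calls.

```python
# Minimal end-to-end sketch: drive a real browser against a running Galaxy and make
# one trivial assertion about the rendered page. The "#masthead" selector is an
# illustrative assumption; real tests use Galaxy's Selenium helpers and selectors.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
try:
    driver.get("https://usegalaxy.org")
    driver.implicitly_wait(30)
    assert driver.find_elements(By.CSS_SELECTOR, "#masthead"), "masthead did not render"
finally:
    driver.quit()
```

Checks at this level exercise the full stack (proxy, API, and client together), which is exactly the class of regression, like the global drag and drop example, that component tests in isolation tend to miss.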
I mean, I feel bad asking for Selenium tests from people who usually don't code in Python. We need some solution there, I think. Yeah, to address that, we talked about 23.1 potentially including the Playwright work, so maybe that's something to consider.

Well, keeping an eye on the time: I think this has been a really awesome discussion, and unfortunately I have to cut it off a little. But again, thank you so much to the release testing team for all the time you've put into helping smooth out the release, and for your presentation today; really appreciate it. Our next community call is going to be on March 2nd, where the PIs will present the 2023 roadmap, which will take in some of the feedback from the working group progress meeting happening on the 23rd of February. So stay tuned for that. Thanks, everybody, for calling in today. Thank you. Thanks, everyone.