Okay, you can see my screen? Yes. Okay, great. So I'm going to talk about the bioscience DeVL and RDC projects. I should point out this is molecular bioscience, not imaging bioscience or other types of bioscience. I'm based at the University of Queensland. I work at QCIF and the University of Queensland RCC, and I'm also at EMBL-ABR. There are quite a few development partners on these projects: QCIF, QFAB and RCC here in Queensland, Melbourne Bioinformatics in Victoria, and the Centre for Comparative Genomics in Perth. And the projects are supported in one way or another by Bioplatforms Australia, the EMBL Australia Bioinformatics Resource, the Atlas of Living Australia, and with funding from ARDC. So just to provide a little bit of context, I guess molecular bioscience may be different to some other disciplines in that its birth as a data science has happened in a relatively short period. You've probably all seen graphs like this. This shows in blue the number of sequences that are held in a large international repository in the States called GenBank. And just to provide, I guess, context in a timeline, that arrow is when I finished my PhD. So basically we've gone from a non-data science to a data science in a relatively short time. So that's all well and good, but how many life scientists are really taking full advantage of what analysis of this data allows you to do? So this is just a graph showing estimates made by a working group looking at an Australian bioscience data capability. And it classifies biologists into four broad groups. On the right, we have biology-focused bioscience researchers. So these are people that are working in the lab, you know, doing wet experiments at the bench. And this group still forms the large majority of biologists.
So those people might go and use a web service once a month or something like that, go to the NCBI and look up a sequence or do a BLAST search. But they don't necessarily engage more than that. So this next group here is what we call data-intensive bioscience researchers. And this is the group that is growing. So these are people who may have decided to do a gene sequencing experiment. They've sent their sample away to a sequencing facility, they've got the data back, and now they need to know what to do with it. Then we have a couple of other groups here which are smaller. So the bioinformatics-intensive bioscience researchers effectively use bioinformatics on a day-to-day basis in their research and probably don't do many wet experiments. And then we have bioinformaticians here on the left, and this is the group that is developing new algorithms. So what I'm really going to talk to you about today is that the audience we're focusing on for the DeVL and RDC projects is these groups on the right. So they're the groups that don't have it all sorted out, and they might be easing into this data science world and need some help and tools to be able to do that. So the DeVL project builds on the previous Genomics Virtual Lab (GVL) work. That was a Nectar-funded project, and what the GVL is, essentially, is a server image. This server image contains a number of standard tools. So Galaxy, I'll talk quite a bit about that, but it also has RStudio, JupyterHub, command-line access, a virtual desktop and some administrative tools on it. And then there are a number of optional bioinformatics pipelines and analysis tools that can also be launched when a GVL virtual machine is launched. So the other part of it is a virtual machine that's running the server image, and it's been built so that it can run on an OpenStack cloud such as Nectar or on an EC2 cloud such as Amazon. So that effectively is the GVL.
So the first option, which still exists, is for people to fire up and self-manage their own GVL instance; there's a URL here. But the steps for doing that, if one was to use the research cloud here in Australia, are that you need to access the Nectar dashboard. The user has to get their Nectar allocation. It's probably just worth mentioning that a trial project is sufficient for launching a GVL. Then the user has to obtain their cloud credentials, launch their personal GVL instance, access it, manage it and use it, and then shut it down. So that actually is still quite a technically challenging set of tasks for our intended audience. So the second option for using a GVL has been to use a public managed GVL service. And until the beginning of this year, there were actually a few of these. So there was an RStudio service and three Galaxy services: one hosted by RCC at the University of Queensland, one hosted by Melbourne Bioinformatics, and a training instance. So the DeVL project has a few broad aims. I'm not going to talk about all the aims, but the main ones are to rationalise and re-architect the public managed GVL services. So that's effectively taking these four public services and developing one single service, which is called Galaxy Australia. And that includes RStudio and JupyterHub. That's now public. The URL is there: usegalaxy.org.au. The proposed architecture that underlies this federated, I guess, model is that there's a head node that resides here at the University of Queensland in the Research Computing Centre. And there's a Slurm queuing system that submits jobs to worker nodes. So that's pretty much what the Galaxy instances were like previously. What's being done to, I guess, speed things up and make the service more efficient is separating off a database service. But then there's also this new, I guess, component of Galaxy, and they're called interactive environments.
And it's possible to just launch a single-use virtual machine for a number of these interactive environments, such as RStudio or Jupyter. So they run on this Docker Swarm. The idea is that it's just a VM that's fired up for a particular session and then it's shut down again. So that's what's here at UQ RCC. But obviously we need to think about how we can increase the computational resources that sit under a national service. So one way is submitting jobs, not just to worker nodes on the Nectar cloud, but also to HPC. Galaxy is pretty good in that it can submit jobs over a Slurm queue, a PBS queue and some other queuing methods. So one thing we're working on now is also submitting jobs to the HPC machines here at UQ. Now we're also going to be submitting jobs from the head node using a Condor queue to a Condor head node sitting at the University of Melbourne. And because we can submit jobs from the head node to either cloud or HPC via PBS, Slurm or Condor, we can also submit jobs to other sites, and we're currently having some discussions with the University of Sydney, who are interested in supporting Galaxy Australia as well. The second main aim is to harmonise the look and feel with other global Galaxy services. So Galaxy Australia is not the only one; there are actually over 90 Galaxy servers around the world. But the two other main ones are usegalaxy.eu, which is hosted in Freiburg, and usegalaxy.org, which is hosted in the US. Novoitec talked about a consistent user experience. So this project between the three Galaxies listed here is all about having a consistent user experience, with similar tools, similar training material, similar look and feel and layout, and similar reference data sets as well. I should say, there's a global Galaxy Tool Shed. This is kind of like an app store for Galaxy. So when command-line tools are wrapped and enabled for use in Galaxy, they can then be downloaded from the Galaxy Tool Shed.
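As a rough illustration of the job routing described above, where one head node hands work to Slurm, PBS or Condor destinations at different sites, here is a minimal Python sketch. It is not how Galaxy itself is configured: the `Destination` class, the `choose_destination` function, the destination names and the core counts are all hypothetical, and pairing the UQ HPC with PBS is an assumption made only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    """One place the head node could send a job (hypothetical model)."""
    name: str
    scheduler: str   # "slurm", "pbs" or "condor"
    site: str
    max_cores: int   # made-up capacity figure, for illustration only


# Hypothetical destinations loosely based on the sites named in the talk.
DESTINATIONS = [
    Destination("cloud_workers", "slurm", "UQ RCC (Nectar cloud)", 16),
    Destination("uq_hpc", "pbs", "UQ HPC", 64),  # scheduler is an assumption
    Destination("melbourne_pool", "condor", "University of Melbourne", 32),
]


def choose_destination(cores_needed, destinations=DESTINATIONS):
    """Return the smallest destination that can still fit the job."""
    candidates = [d for d in destinations if d.max_cores >= cores_needed]
    if not candidates:
        raise ValueError(f"no destination offers {cores_needed} cores")
    return min(candidates, key=lambda d: d.max_cores)


if __name__ == "__main__":
    for cores in (4, 20, 40):
        dest = choose_destination(cores)
        print(f"{cores} cores -> {dest.name} via {dest.scheduler} at {dest.site}")
```

The point of the sketch is only that adding another site, such as the one under discussion with the University of Sydney, means adding one more entry to the list, with no change to how jobs are chosen.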
And we have a policy now on Galaxy Australia that all of the tools that are installed have to be installed from the Galaxy Tool Shed. And to get into the Galaxy Tool Shed, there's a core set of tools that have undergone extensive QC. The third aim of the DeVL project is to rationalise and expand our existing training efforts. So one of the environments I mentioned previously was Galaxy-tut. That was just for training. It will eventually be decommissioned over the next few months, and we'll use the Galaxy Australia service for all of the training. We have previously developed training material here in Australia, so that's being rationalised. And there's also a global Galaxy training material registry, so all of the Australian material is going into that training registry as well. So again, the idea of all of this is that it should be possible to go to any of these global Galaxy resources and be able to use that material on our Australian instance. The project is also establishing a national network of trainers in Galaxy. This is happening through the EMBL-ABR network. There's a two-day train-the-facilitator workshop that's happening in Melbourne. There are about 10 people going from around the country to be given the same training, and so they'll be able to go away and deliver Galaxy training locally. And then we'll be undertaking at least three virtual, well, we call them virtual-physical, national training events. So we have a lead trainer that's based in one location, and then simultaneously around the country we can hold training events in multiple places. So we'll be holding three of those on different topics. So I also wanted to talk about the sister RDC project, the research data cloud. This one is extending Galaxy Australia so that it can support other national infrastructures that require bioinformatics analysis functionality. And in this particular project, we're going to be supporting BPA's data portal.
So BPA's data portal is used to store and share framework data sets during the period when the team are working on them. It's based on the CKAN framework and it's accessible to consortium members. So it's primarily a data repository and a storage and sharing mechanism. It doesn't have raw data, I should say. It doesn't have an analysis functionality. So what we are doing in this particular project is linking up the two, so that Galaxy Australia can provide that analysis functionality. We're using it to support a group of researchers that are interested in metagenomics. So metagenomics is a methodology where you take a sample, which might be something from the environment like soil or water, and then you extract DNA from that mixed population. You sequence that DNA and then you kind of work backwards to identify what species were present in that original sample. So to do that, we are installing a couple of tools onto Galaxy Australia that are, I guess, commonly used in this kind of metagenomics analysis. These are called QIIME and mothur. So they're both being installed onto the system. Now, the second part, I guess, of a metagenomics analysis is not just determining what microbes are present in one particular place, but doing statistical analyses across different places. So saying, okay, what are the commonalities between sites A, B and C? And there's a whole plethora of analyses that people want to do: see things on a map, do clustering, look at box plots, ordination, and so on and so forth. And there are a number of R packages out there that do this. Two of them are called phyloseq and Ria, and they're pretty well known in the community. What we're doing in the project is wrapping these so that they are available for use through the Galaxy graphical user interface. We're QA-ing those wrapped tools, depositing them into the Tool Shed, and then installing them from the Tool Shed onto Galaxy Australia.
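To make the cross-site comparison described above a bit more concrete, here is a small Python sketch of two common measures: the taxa shared by every site, and the Bray-Curtis dissimilarity between two sites. It is an illustration only, not the method used by the R packages mentioned in the talk. The site names, taxa and read counts are invented, and the function names `shared_taxa` and `bray_curtis` are hypothetical.

```python
def shared_taxa(*sites):
    """Return the set of taxa with a non-zero count at every site."""
    present = [{taxon for taxon, count in site.items() if count > 0}
               for site in sites]
    return set.intersection(*present)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 0.0 for identical counts, 1.0 if disjoint."""
    taxa = set(a) | set(b)
    difference = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return difference / total if total else 0.0


# Invented read counts per taxon at three hypothetical sampling sites.
site_a = {"Bacillus": 10, "Pseudomonas": 5}
site_b = {"Bacillus": 5, "Pseudomonas": 5, "Nitrosomonas": 5}
site_c = {"Bacillus": 2, "Nitrosomonas": 8}

if __name__ == "__main__":
    print("Shared by all sites:", shared_taxa(site_a, site_b, site_c))
    print("Bray-Curtis A vs B:", round(bray_curtis(site_a, site_b), 3))
    print("Bray-Curtis A vs C:", round(bray_curtis(site_a, site_c), 3))
```

A matrix of pairwise values like these is the usual starting point for the clustering and ordination plots mentioned in the talk.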
There's also a training component. So we're developing material for primary and secondary metagenomic analysis, and depositing this material both into the global Galaxy training portal I talked about and into the EcoEd training portal, which is part of the ecoscience DeVL project that's under way. And then we're also delivering workshops across the EMBL-ABR nodes. So I guess for the intended outcomes, over time we would like to see that we're enabling people to move from one of these groups to the next one to the left. So over time we would expect that there'll be more biology-focused bioscience researchers moving into the group that is enabled to actually undertake analysis, and then that group also moving across. I should say that they're both pretty short projects, March to December 2018. Both are on track and going pretty well, and there are a lot of people involved, and they are listed here. Okay, thank you.