First, why we're doing this. In our field, neuroimaging, we have a huge amount of scientific software. It's usually written by scientists, who are not software developers, and it's difficult to run this software. Why is that? The first problem is that most tools require Linux. Now you might say, well, that can't be a problem — Linux is a great operating system. But for clinical researchers it is a problem: they often can't install Linux on their machines, because their system administrators tightly restrict what they can administer. That alone is often a bottleneck. The next problem is that even if you get Linux running on your box, the packages we need are not in the standard package repositories. If you just run apt install or yum install, you don't get these packages; they simply don't exist there. And then very wise people say, oh, it's trivial, just compile it from source, there's nothing to it. The problem is that a lot of our software is a pain to compile. One of my examples here is a toolkit I've worked with regularly that takes about eight hours to compile — and that's after you've fixed all the problems in the makefiles, because a lot of the dependencies don't exist anymore and you need exactly the right compiler versions. It's just not easy. And then, even once you get it working, scientists, as I said, are not software developers, so often they don't update their libraries. So what can happen is that your scientific software needs a very specific library, and when the operating system updates, this library disappears. libpng12 is a very good example: it simply got removed from the current packages, and you just can't run our software anymore. And because of all this, it takes a long time.
Let's say you get a new computer and you need to reinstall all the tools; it takes a long time, and in the end it's just not reproducible. We get different results between different software versions, which is really annoying for researchers. So this is the problem, and we asked ourselves, well, how can we help with this mess? Here I chose Python as an example, from XKCD, but it's pretty much the same problem with all our software. We don't want to create yet another package manager. We don't want to say: there are 14 different package managers and none of them work, let's build a real package manager that works — and then we have a 15th one that also isn't good. So we tried to avoid that. Instead we said, let's see what we can do with the existing technology that's out there: existing repositories and software distribution methods, combined with containers like Docker and Singularity, and with the platforms we have, like the CVL and the Nectar platform — how can we run our analyses on these platforms? Our design principles were that whatever we build has to run on Linux, Mac and Windows. That's why we said, okay, we want to use Docker, because that gives us an easy way of running on all of these platforms. We also need to run on high-performance computing clusters, where Docker is not the right way of doing it; that's why we need Singularity containers right now. I'm not saying there aren't better technologies out there, but this is what currently works — as I said, we need technology that is available now and that actually runs on the platforms we have access to. We also wanted everything fully interactive: a full Linux desktop interface, so people can do what they want. We don't want a website where people click through and upload things and it restricts their workflow. And it has to be lightweight.
We don't want users to compile software or download software by hand; it has to run out of the box. And, as I said, reuse existing tech as much as possible, not reinventing the wheel, because we don't have time for that. So that's what we did. We built an automated container-building architecture on GitHub that runs with GitHub Actions. People submit recipes to this repository, they get automatically built and tested, and then we push them out to Docker Hub, and we build a Singularity container that we host on object storage. Then we built a lightweight Linux desktop in a Docker container around it, so people don't have to worry about all the complexity of installing Singularity and downloading Singularity containers. We wrapped all of that nicely in a graphical user interface. This is how it currently looks. A user just starts our container, which starts the whole desktop environment, then opens a browser and goes to localhost on port 6080 in this case — we're running a local noVNC client inside the container. Inside that container we have all the software installed that people need, and they can just run it: the container gets downloaded, unpacked, and people can start using it. They don't have to worry about singularity exec or singularity run; it's all wrapped and nicely hidden from them. One of our biggest problems, and why we got involved with ARCOS, is that we need a place to store our container images. Currently we store them, as I said, on Docker Hub, and we've had a lot of issues with pulling containers from Docker Hub, because it is not in Australia, so we get low download speeds — and we're limited in container size there. We also need Singularity storage, because we use a lot of Singularity containers.
So we basically need something that offers long-term storage of containers, because we want everything to be reproducible, ideally worldwide, not just in Australia. We need fast access to local copies of this, and ideally we don't even want to download the full containers to the client — I'll show a little of what we did in this space. We also want deduplication of container content, because a lot of our containers overlap: by updating a container we usually just change a couple of binaries in there, but it's still a 15-gigabyte image, and we can't keep storing that. Also, a lot of the software we use is quite outdated, so we would like to work on automated vulnerability scanning, so we understand what problematic software is in old containers. And one idea I would like to pursue is how we can limit what containers can do with data: basically, only give a container access to what it really needs, and figure out the minimum privileges for that container, so that it can't, for example, steal data — that's a simple case. So this is a little proof of concept that I worked on in the last weeks with the ARCOS group, where we said, okay, let's think about how we could build a container registry that does everything we need as a project — this is very much the use case of our project. What we did is look at what CERN does: how does CERN distribute software? They use something called CVMFS, the CernVM File System, which is basically a read-only file system in user space, hosted on web servers. It just needs a standard web server, and it uses content-addressable storage and a Merkle tree — so basically like Git, every file gets committed to a repository, which means deduplication actually works. And the data is transferred on demand to the client.
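The content-addressable, Merkle-tree layout Stefan describes is easy to sketch: blobs are stored under the hash of their content, so two containers that share a binary store it only once. This is a toy illustration of the idea, not CVMFS's actual on-disk format; the file paths and contents are invented.

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""
    def __init__(self):
        self.blobs = {}  # digest -> bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is stored once
        return digest

store = ContentStore()
# Two toy "container images" that share one binary (hypothetical paths/contents):
image_a = {"/usr/bin/tool": b"ELF...v1", "/etc/os-release": b"Ubuntu 16.04"}
image_b = {"/usr/bin/tool": b"ELF...v1", "/etc/os-release": b"Ubuntu 20.04"}
# An image manifest just maps paths to digests, Merkle-style.
manifest_a = {path: store.put(data) for path, data in image_a.items()}
manifest_b = {path: store.put(data) for path, data in image_b.items()}
# Four files are referenced, but only three unique blobs are stored.
```

Updating an image that changes a couple of binaries then adds only the new blobs, which is exactly the deduplication property wanted above.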
So this means we don't have to download the whole container; we can actually open the container directly on the CVMFS storage, and whenever something is needed within that container, it gets pulled via the web. And as you see, there's a hierarchy of multiple servers and Squid proxies, so we can actually handle the load. The system is supposed to scale very well, and CERN runs it at quite a large scale. That's how we currently build it. Again, as I showed in the beginning, we use GitHub to build and test everything, we push to Docker Hub and the GitHub registry, and then we build this cascade of servers. Stratum zero is what I call the main server; it then has mirror servers, the stratum ones, that replicate it in different regions. We currently run that on Oracle Cloud, which supports us in this project, with servers in the US, in Europe, and in Australia. We pull the containers down to the stratum zero, and they go through the hierarchy to the stratum one servers. At the end we have our users on laptops, desktops, or high-performance computing systems. On an HPC, we recommend that people set up a local Squid proxy, because otherwise every HPC node will talk to our stratum one server, which is problematic — so a local Squid is a good idea. Laptops and desktops we allow to talk directly to the stratum ones at the moment; we haven't seen any scalability issues there. And we use a GeoIP service in the middle to identify the nearest server for each client. The nice thing is that we can also easily scale up how many stratum one servers we have — it could be ten or more servers without any problem. So yeah — what's needed, what are the pain points? I think that was one of the questions.
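The GeoIP "nearest stratum one" routing just described can be sketched as a simple distance lookup. This is a toy stand-in for a real GeoIP service; the mirror names and coordinates below are invented for illustration.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical stratum-one mirrors with (latitude, longitude) coordinates.
STRATUM_ONES = {
    "au": (-27.5, 153.0),   # e.g. a Brisbane mirror
    "eu": (50.1, 8.7),      # e.g. a Frankfurt mirror
    "us": (37.8, -122.4),   # e.g. a US west-coast mirror
}

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) pairs, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def closest_mirror(client_latlon):
    """Pick the mirror with the smallest great-circle distance to the client."""
    return min(STRATUM_ONES, key=lambda k: haversine_km(client_latlon, STRATUM_ONES[k]))

# A client in Sydney would be routed to the Australian stratum one.
nearest = closest_mirror((-33.9, 151.2))
```

A production GeoIP service resolves the client's IP to coordinates first; the selection step afterwards is essentially this.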
So currently, for this project at UQ, we have a fractional-FTE Level A postdoc position available, and recruiting can start anytime. If anyone is keen on working on this project and helping us, or has co-funding to turn this into a full position, get in touch with us — that would be amazing. Also, it's an open source project, so if you think this is useful for you, definitely have a look at it and help us with the software and container testing that we're doing. We're working on simplifying the installation; we still have issues, for example, in making the desktop size adjust to the window size. So if there's a system administrator out there who has done a lot of work in this space and knows how to hack Linux to behave the way we want, that would be really cool. We're also working on interesting things where we want to run a Singularity container inside a Singularity container inside a Docker container — multiple hierarchies of containerization. Because when we run on an HPC, as I said, we don't have Docker, but we basically want to run Singularity, and we need a couple of tricks there. I think it would be possible and would enable a lot of things. So if anyone has played with containers in containers in containers, that would be really cool. That's pretty much it. As I said, it's an open source project, there are lots of people involved, I'm just one of the developers, and I presented it on behalf of a lot of people. So — questions. Any questions for Stefan? Obviously, you know, he's a total amateur at all of this and is just learning. Yes — as I said, I'm not a software developer, I'm just a researcher, so I have actually no idea about this; I'm just using this stuff. I was being facetious, Stefan. So, CVMFS — are you setting up your own CVMFS service and everything to do that, or are you piggybacking off CERN's? No.
I was thinking about piggybacking off CERN's. CERN runs it, and you can actually submit containers to their registry via a wish list — they say it only takes about 20 minutes to integrate our containers into their library. But I wanted to see how it works, so I set it up on my own. I also wanted some security scanning and a bit more of a cybersecurity focus in this whole system, so that's why I wanted to play with my own setup for now. One of the things we've been thinking about is that there are a few groups now in Australia who are using CVMFS, and a few others who have expressed interest in it. So providing a sort of national CVMFS service that everyone can use, rather than everyone having to do it themselves, is probably a good thing we should look at doing. Yeah, that would be amazing — and especially together with AARNet, right? Because this is something AARNet could just do for us: they could put the Squid proxies everywhere and it would just work. That's the biggest thing where people told me you will run into problems — how do you want to run this at the scale you're running at? And I don't have an answer to this yet; I think the answer is, exactly, we need help from other people. Okay, thanks. Can I say something? Go ahead. Yeah, not a question, just a mention — I'm already on the list here on the slide. I just want to mention that we ran this NeuroDesk environment successfully in a Nectar instance, so that's one way to run it. Anybody who would like to run it in a Nectar instance can contact us — but equally well, you can run it on your laptop or desktop. Steve Crowley here. When you talk about security scanning, are you talking about external vulnerability scanning, or internal sorts of things like malware and trojans? Yes, exactly — very good question.
The problem is that our containers run on potentially sensitive data, right? We have potentially human data in there. Let's say there is a tool in there that takes this data, steals it and sends it somewhere else — that's something we definitely want to avoid. So we want to see if we can scan for patterns of tools that could do that. We would also like to scan for things like crypto miners, where people misuse resources — exactly, malware — but also just outdated libraries. The problem is that a lot of our software only runs on, for example, Ubuntu 16.04 and doesn't run on anything newer, and we know that Ubuntu 16.04 has a lot of security issues by now. I just want to be aware of what we're running on the HPCs, because the big problem I see is that when we build this for everyone and people run it on an HPC and something goes wrong, people will come screaming at us. That's why I want to build security in from the start, to make sure these containers can't do anything evil — don't steal data, don't mine cryptocurrency. That's the plan, but yes, it's tricky, and we haven't made a lot of progress on that side yet. So what software are you using to do that — for the vulnerability scanning? Currently we're using what Docker provides: Docker Hub has a feature that scans for vulnerabilities. Otherwise, we scan the flat files that we have in the containers for signatures. That's pretty much how far we've got with this; it's not yet well developed, so we'd definitely welcome help there. Stefan, interesting talk, and hi to a bunch of people I haven't seen for a while. I missed what you said about the base system that you're building on — have you chosen a base OS or anything that you're layering up on as a standard? Yeah, okay — there are multiple answers to this.
The desktop environment that everything runs in is based on Ubuntu, the latest version — we try to keep this quite current, because this is probably where a lot of the attack surface would be. The problem is that the internal containers we build really depend on the software we're packaging, so we build with whatever that tool runs best in. So we have CentOS — quite old versions of CentOS — quite old versions of Ubuntu, old versions of Debian, basically whatever the tool works well in. And that's the problem, I think: it's in these sub-containers that the security problems will happen. I wonder whether there's a similar project — similar in some ways, but with a slightly different scope, and you've gone a bit further up the stack, I think. The European one, what's it called — EESSI? Yes, EESSI — the European Environment for Scientific Software Installations. Yes. So I saw that talk and it's really cool. What they basically do is say, we don't need containers: we build everything from source. They basically use Gentoo, build every package fresh for Gentoo, and then have wrappers for the different systems they run on. I think it's a really, really cool approach and I'm following that project. The problem I see is the pain they currently go through to actually build all their software. Yeah — I hadn't seen that; I assume you're talking about the FOSDEM talk or something like that? Yes — that was a very recent talk where they presented it. I think it's a super cool project and I want to see it take off, but I looked at their builds of some of the tools that we would need, and it takes days to get a tool built into their environment, because you basically start from scratch, right? You compile everything.
So for every tool they have, they compile everything from source, which is just a massive task, and I'm not sure it will scale. That's what I'm looking at right now. That's cool, thank you. Yeah, thanks for that, Stefan — I think we've got to move on. Very interesting, and thanks for sharing that. Thank you for all the feedback. We might move on to Jake now, on the Nectar registry. Hi — just trying to share my screen now. Can everyone see this? Yep. All right, thank you. Hi, my name is Jake. I'm with the ARDC, in core services. We run the Nectar Research Cloud, and it's a cloud much like AWS and GCP are clouds, but we are of course much smaller. Today I'm just going to talk about what we are doing in the containerization space for Nectar. The Nectar Research Cloud runs on OpenStack, and OpenStack itself has many services, like Nova, Cinder and Swift — the functional equivalents of EC2, EBS and S3. Other services are essentially just Python processes. OpenStack has a half-yearly release, and the releases are named alphabetically — Train, Ussuri, and so on. What we do at Nectar is take the upstream code of every release, test it, patch it, and then upgrade the production cloud. There are a few reasons for the patches. Some of them are for what we term Nectarisms — business requirements, things Nectar needs to do that no other cloud in the world needs to do. The other reason for patches is bug fixes and features that we contribute back upstream. Our production architecture is basically a bunch of Ubuntu VMs, and we have a different bunch for each release version. Why do we have that? It's because of this problem: when we patch upstream code — upstream builds binary packages, Red Hat and Ubuntu build binary packages — once we have our patches in, we can't just use their binary packages.
So we need to build the packages ourselves, and building Debian packages is really a pain. The other problem is that Nectar runs on these Ubuntu VMs, but different OpenStack versions require different versions of libraries, so we have to spin up a new bunch of VMs every time we have a new release. Luckily, there's a technology out there that solves both the packaging problem and the library problem: containers. The other thing we are doing with containers is in a service called Magnum. Magnum is container orchestration as a service — it's a long name, sometimes I forget what it means too — but what it does is build Kubernetes clusters. Generally, people want to use Kubernetes, but nobody wants to build or manage a Kubernetes cluster, because — well, I wouldn't want to if I had the choice. So what Magnum allows a user to do is, simply through a dashboard interface, say: spin up a Kubernetes cluster for me and give me kubectl — and from there you get your cluster and can control it. It uses OpenStack native resources: it builds the cluster out of Nova instances, and uses Cinder volumes and Octavia load balancers for the cluster. Each cluster is made out of two instances — VMs — and when the cluster starts up, everything in there is basically a container. So you have almost 10 different types of containers: the Kubernetes family of containers, like CoreDNS and Flannel and things like that, and also the OpenStack containers, like the OpenStack Cloud Controller Manager and CSI Cinder. These containers are the layer between the Kubernetes cluster and the OpenStack cloud, so that when you use a kubectl command to say, give me a Kubernetes volume, it can actually talk to OpenStack and say, give me an OpenStack volume — they're actually the same thing, and they work; they can be attached to your pods and things like that.
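The translation layer just described — kubectl asks Kubernetes for a volume, and a driver provisions a Cinder volume behind it — can be caricatured in a few lines. Every class and field name below is invented for illustration; this is not the real CSI Cinder code, just the shape of the idea.

```python
class FakeCinder:
    """Stand-in for an OpenStack Cinder API (names are hypothetical)."""
    def __init__(self):
        self.volumes = {}
        self._next_id = 0

    def create_volume(self, size_gb: int) -> str:
        self._next_id += 1
        vol_id = f"cinder-vol-{self._next_id}"
        self.volumes[vol_id] = {"size_gb": size_gb, "attached_to": None}
        return vol_id

def provision_pvc(cinder: FakeCinder, pvc: dict) -> dict:
    """Turn a PVC-like request into a Cinder volume and report the binding."""
    size_gb = int(pvc["resources"]["requests"]["storage"].rstrip("Gi"))
    vol_id = cinder.create_volume(size_gb)
    return {"pvc": pvc["name"], "cinder_volume": vol_id, "size_gb": size_gb}

cinder = FakeCinder()
bound = provision_pvc(cinder, {
    "name": "data-pvc",
    "resources": {"requests": {"storage": "20Gi"}},
})
```

The real driver does the same conceptual mapping, plus attachment, topology and error handling — "a Kubernetes volume and an OpenStack volume are actually the same thing" is this mapping.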
So with two Nova instances of 10 containers each, that's about 20 image pulls every time someone spins up a cluster. And that's the problem. All these containers that make up the Magnum cluster come from different repositories, because different organizations work on different parts of it. So we needed a way to mirror these containers — to have a static mirror — so that if somebody deletes a container from their repo, it doesn't break our service for a user trying to spin up a new cluster. Which actually happened a while back. And then there's the very famous incident of Docker Hub limiting pulls, which caused intermittent failures for users — not a good thing. So one of our solutions was to create our own registry, and we have started work on creating a new Nectar registry — that's the next part. So, the Nectar registry. Just to recap the problems: first, we needed a place to host the images that we use to build up the Nectar cloud; second, we needed a place to host the images for users' clusters. We have started with a solution called Harbor. We run it on Kubernetes, and the Kubernetes cluster is spun up by Magnum on the Nectar cloud — so we are dogfooding our own Magnum service. The engineers among us will have seen: hey, that's a circular dependency. Whether this is brave or stupid, time will tell — and for my bosses here, don't worry, we have a backup plan. For storage we use the Nectar Swift object storage, via the S3 plugin, so it's just like S3 object storage. That gives us virtually unlimited storage — with an asterisk, because "virtually unlimited" is marketing talk, not technical talk. But yes, we can grow and store a lot of images if we need to.
The Harbor registry allows us to replicate repos: we can just point it at a repo and say, pull everything in and keep a mirror of it, which allowed us to easily migrate from Docker Hub when they started implementing the limits. It also has proxy-cache functionality, so you can say: if someone accesses this repo or container in Harbor, check if you have a local copy, and if you do not, fetch it from Docker Hub. In the newer version, whenever you do a pull it actually makes a HEAD request to Docker Hub to see whether the container has changed there, and I believe HEAD requests do not count against the rate limits — so it intelligently tries to consume as few resources as possible. Some future work for us: we would like to extend the registry to Nectar users. It would be cool if a Nectar project on the cloud, which already has an allocation quota, could be told: yes, you can also use this quota to host your container images, and here's the URL for all your repos. But this needs a bunch of work to integrate into the Nectar cloud, mainly around allocations and, what's it called, roles — which user belongs to which project, how much quota they have; Harbor needs to be able to see this information. Because it's a bunch of work, we don't know whether users are interested, and we would really like to know if users want this. If you're interested, do let us know, so we can work on it — and justify our work on it. Thank you, that's my presentation — any questions? Questions for Jake, maybe two minutes. Hey Jake, Carmel here. A question for you, and a bit more broadly as well: with the registry working group, how does this information feed in, and do we have other use-case examples of people building registries? So the registry working group has sort of two things we want to do. One of them would be the CVMFS-based registry that we've seen in Stefan's talk. The other one is a mirror — a static mirror.
This is the functionality the Nectar registry can provide, but we have to see what we want to mirror and whether we have enough resources for it, and things like that. So yeah, the registry working group has a few things we can work on; this Nectar registry is one of them, and Stefan's registry is another. One more question? Okay, we might move on — thanks for that, Jake. We'll move on now to Anthony, with QUT Ecoacoustics. Are you ready to go, Anthony? Let's see — can everyone see that? Yes, we can. Cool. I work for a research group that deploys acoustic sensors out in the bush. With those sensors we record audio, and from that audio we try to understand what's going on in the landscape. We listen for geophony — environmental sounds like wind and rain — anthropophony — sounds from humans and machines, that sort of stuff — and most importantly biophony, sounds from animals: bird calls, koala grunts, cow moos, frog choruses, you name it. And we collect a lot of data. We have multiple instances of our software running for different projects. One of our instances has about 139 terabytes of audio, which is about 68 years' worth of listening. Audio is interesting in that it's not like video: the data is much smaller in size relative to the temporal span you capture. And we've started another project, the A2O — the Australian Acoustic Observatory — where we're going to collect about three more petabytes over the next five years. There are 400 of the sensors you see there on the left, recording 24 hours a day, deployed all across Australia. We're in Brisbane; we have a few developers, of which I am one, and about half my time is spent doing server administration and DevOps and such. Our primary product is a web application that now runs in containers. It's pretty old and needs a lot of special care to keep going. We used to run on Nectar, with just raw VMs provisioned, and we automated that through Ansible.
What they didn't do for us, though, was address developer concerns. I had a student developer working on our website, and they had to be able to spin up the same stack that was in production and test it locally. So we started using containers for the local development experience — particularly Docker Compose, which is really nice — and eventually those containers migrated their way to production. So now we're running all our stuff at QUT on private VMs, and the extent of our container orchestration is running docker run via a bunch of Ansible — basically just shell commands in the end. It's just docker run on 16 different servers, and so far that kind of works for us. There are obviously ways we could make it better. Because of our scale and who we are, we don't want super fancy infrastructure: it has to be small, has to be implemented in small steps, has to be understandable for people who don't have a lot of background knowledge, and versionable. I've done a lot of reading on Kubernetes, because the advantages look interesting, but I've never actually used it, and from what I can tell it's probably not the right fit for us. So the question is, what orchestration should we use? That's something I'm still investigating. We had the opportunity to move to AWS, and we decided against it, because we have what we're calling an asymmetrical scale: we collect a lot of data, but we don't actually have that many compute requirements in terms of the website. Analysis is done on the university's PBS system. And a lot of these managed Kubernetes offerings, for example, only exist in the cloud — so I'm not sure, even if there was a managed Kubernetes instance we had access to, how I would use it. It's just stuff I don't know. This is pretty much what we use at the moment for our DevOps. In some sense we've made peace with where we are with our servers, and that's just because of cost.
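"Ansible templating docker run commands" boils down to generating one shell command per service from a declarative spec. A rough sketch of that generation step — the spec format and image name here are hypothetical, not Anthony's actual setup:

```python
def docker_run_command(service: dict) -> str:
    """Build one `docker run` invocation from a service spec, roughly the way
    an Ansible template would. The spec fields below are invented."""
    parts = ["docker", "run", "-d", "--restart=unless-stopped",
             "--name", service["name"]]
    for host_port, container_port in service.get("ports", []):
        parts += ["-p", f"{host_port}:{container_port}"]
    for key, value in sorted(service.get("env", {}).items()):
        parts += ["-e", f"{key}={value}"]
    parts.append(service["image"])
    return " ".join(parts)

cmd = docker_run_command({
    "name": "website",
    "image": "example/ecoacoustics-web:latest",  # hypothetical image name
    "ports": [(80, 3000)],
    "env": {"RAILS_ENV": "production"},
})
```

Running this across 16 hosts is then just a loop over an inventory — simple, versionable, and understandable without Kubernetes-level background knowledge, which matches the design constraints above.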
We'd like to make more use of AWS in the future. There are some really good things about using Ansible and just docker run — things actually get really simple. Ansible is a single source of truth for provisioning our servers and deploying our apps; the config and secrets are all in one spot. And moving our apps to be compiled into Docker containers on CI servers as code changes is really much better in terms of maintenance: our containers are built more often, and we have fewer breakages that go unnoticed for long, because the CI is building the containers. The other good advantage of moving most of our apps to containers is that we don't have an OS monoculture anymore, and we're able to take advantage of things like Alpine for services that really don't need any dependencies on the file system. One thing I'm really cautious about is how much knowledge a team member needs to have to be able to deploy new things or maintain the system, especially if I leave — I don't want a bus factor of one. And I'm not sure how to deal with container orchestration in what I would call our hybrid cloud scenario. And yeah, that's pretty much me and what I do. Excellent — thanks, Anthony. Any questions for Anthony? I can't see everyone, so you might just need to pipe up if you've got a question. A question from me: how do you transport the captured sounds to your infrastructure — what transport method do you use? Sneakernet? Exactly — we post SD cards. It's actually fairly high bandwidth in the end. We have considered doing remote uploads via AARNet — a lot of our deployments are through universities, so it should be feasible — but many of the sensors are deployed in the middle of the desert, and the easiest thing for people to do is put the SD cards in a post bag; they arrive back on our doorstep and we upload them at some point later. Fair enough — there would be no mobile network coverage out there.
Yeah, and we have investigated remotely streaming the data, and it's becoming more feasible as new technologies emerge, but when you're recording 24 hours a day, it's just easier to send back an SD card. Thank you. I think we've got time for one more question. Anthony, you said you weren't sure how effective Kubernetes would be for you and your group. I think that's one of the things we're trying to do with ARCOS: provide resources so that you can go to the expertise and find out those answers easily. Say in a hacky hour or something like that, where you can go and ask an expert, and they can say, well, you can do this, this and this, and it might be beneficial — or maybe it's not. So hopefully in the not-too-distant future we'll be able to offer that service. That's exactly what I'm looking for. Okay, if there are no more questions, we might move on to our last presentation. Oh no, sorry, we've got two more. We've got Ryan — are you ready to go, Ryan? Thanks, let me just share my screen. Is that full screen for everyone? Yep, looks good. Cool. So I'm going to talk about the Australian Imaging Service. At a high level, roughly one third of the population of Australia has a medical imaging scan per year — and that's just clinical ones, not including NHMRC- and MRFF-funded research scans. So there's a lot of value to be had in handling patients' data in a private and respectful way, for the new diagnoses, new treatments and new basic understanding that imaging gives us. What we are doing is building a federation to securely handle this imaging data and allow collaboration across both Australia and internationally. We have a federated set of nodes where we can integrate directly with clinical devices and viewers, and provide in-browser medical annotation, AI segmentation, and a number of analysis capabilities in a ring-fenced, secure environment.
So, just to talk about the scale: this is our federation currently, either in production or in negotiation with different local health districts around the country. In blue are the institutional sites — so the University of Queensland or the University of Sydney would run a node over which they have governance, and they cover the cost of their storage and compute. And then there are a number of pink sites, which are the data sources: private and public hospitals and clinics that researchers or affiliates work closely with for their research and for patient treatment. Importantly, one of the things that dictated the approach we take is that we're not just Australia-based — we need to deploy internationally. So we have nodes that we are deploying in the US and Europe, as well as a number of clinical sites that we'll be integrating with, and we are in conversation to build a sister federation across the European Union. So we really needed something with that international scale. There are a number of academic partners, quite a bit of funding from the Australian Research Data Commons through their platforms scheme, funding through the Australian Cancer Research Foundation, and a couple of other grants that all merged together to help build this platform. And of course the team: I helped bring together the idea, the vision and the funding, but these are the people who do the work to really make it feasible, and their names are here. So how does this meet our group's needs? As I mentioned, we need to be able to deploy on many different infrastructures — some people run on-premises, some people use commercial cloud, many people use Nectar OpenStack. So we needed a way to build consistent nodes from a single code base across all of those, and Kubernetes is that approach. So AIS consists of two major things.
Each institution has a node, which is built using Helm and Kustomize to handle the peculiarities of the different Kubernetes services — managed or, as it may be, not managed — with Prometheus for the reporting layer, and we'll be looking at Istio for the service mesh. Istio sits as a sidecar inside every container to give you — I think it's layer-seven — security, controlling the traffic between every single container that we deploy on the node. We're also looking at potentially using Kubernetes for a fleet of edge devices. Best practice is that you de-identify data before it ever leaves the hospital — you don't want patient data hitting a university system and then having to deal with it afterwards. So we are looking at a couple of technologies — Clinical Trial Processor, the XNAT upload tool, Gadgetron and Clara endpoints — and at being able to manage a fleet of edge devices, either one-to-one with an instrument or one-to-one with a clinical site, depending on whether they're using the standard DICOM protocol or something proprietary. So that's the high-level approach we're using Kubernetes for. As for what's actually in a node: the core technology is XNAT, which has a built-in pipeline engine. This means that when data lands, you can run quality assurance and any containers or analyses you want, have that standardized, and have the data management system do the orchestration — so you're capturing the direct acquisition from the instrument plus every step of what was done for QA and what analysis was done, with a full provenance and audit trail. You can see who has touched the data and whether someone signed off to allow it to be changed, because in many cases we want to hold the sole source of truth for this patient data, so we have to make sure we always keep an unaltered original copy. And then we're expanding that with a couple of other capabilities.
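As an aside, the edge de-identification step Ryan describes — stripping patient identifiers before data leaves the clinical site — can be sketched in miniature. This is a toy illustration only: a plain dict stands in for a real DICOM header, where a production deployment would use Clinical Trial Processor or a DICOM library, and the tag list and salt are illustrative, not a complete de-identification profile.

```python
# Toy sketch of edge de-identification before data leaves the hospital.
# A plain dict stands in for the DICOM header; tag names follow the
# DICOM standard, but this list is illustrative, not exhaustive.
import hashlib

# Direct identifiers that must never leave the clinical site.
REMOVE = {"PatientName", "PatientBirthDate", "PatientAddress",
          "ReferringPhysicianName"}

def deidentify(header, site_salt):
    """Strip direct identifiers; replace PatientID with a salted hash so
    the same patient still links across scans without being identifiable."""
    clean = {k: v for k, v in header.items() if k not in REMOVE}
    if "PatientID" in clean:
        digest = hashlib.sha256((site_salt + clean["PatientID"]).encode()).hexdigest()
        clean["PatientID"] = digest[:16]  # stable pseudonym per site
    return clean

scan = {"PatientName": "DOE^JANE", "PatientID": "MRN0012345",
        "StudyDate": "20240601", "Modality": "MR"}
print(deidentify(scan, site_salt="example-site"))
```

The salted hash is one common design choice: it keeps longitudinal linkage (same patient, same pseudonym) while the salt stays inside the hospital, so the pseudonym can't be reversed from outside.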
The first is Clara for machine learning, which we'll be looking at in Q4 to Q1 next year. NeuroDesk, which was discussed earlier, we'll be looking at provisioning directly on AIS nodes through the ADEPT project — production in Q4 or Q1 next year, similarly. Gadgetron for MRI reconstruction, particularly around cardiac and respiratory imaging. And then there are two projects around RStudio/R Shiny and Jupyter, to be able to launch those directly from XNAT — the value being that they sit inside the same security net, and the XNAT patient, session and subject hierarchy is exposed directly as Python and R objects, which makes it easier to integrate your workflow and put the data back using a RESTful API. And lastly, we'll be integrating with REDCap as a, I guess, more trusted source. And we expose front-end REST APIs and a browser interface, for users or for integration with other platforms such as the Characterisation Virtual Laboratory. Drilling down slightly: what does a node look like? We have the core XNAT data management and a series of components there, which is what the end user interacts with for all their visualization, annotation, launching the pipelines, and so on. That then has a number of plugins, which coordinate a number of things. The first is XSYNC, which allows you to securely transfer data between two of the federated nodes; we're looking at building Globus endpoints as the back end for this. The second is the viewer plugin, which will get a lot of work as part of the ADEPT project, to run containerized pipelines or lightweight desktops and have them viewed directly from the browser through XNAT — and that's what the Jupyter and RStudio integrations are as well. We have a machine learning plugin.
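The project → subject → session hierarchy Ryan mentions is exposed over XNAT's REST API under its documented /data/... path scheme; a minimal sketch of building those URLs is below. The host name and project/subject identifiers are hypothetical, and authentication and error handling are omitted.

```python
# Sketch of addressing the XNAT project -> subject -> session hierarchy
# over its REST API. Paths follow XNAT's /data/... convention; the host
# and identifiers here are made up for illustration.
from urllib.parse import quote

BASE = "https://ais-node.example.org"  # hypothetical federation node

def xnat_url(project, subject=None, experiment=None):
    """Build the REST URL for one level of the XNAT hierarchy."""
    parts = ["data", "projects", quote(project)]
    if subject:
        parts += ["subjects", quote(subject)]
        if experiment:
            parts += ["experiments", quote(experiment)]
    return BASE + "/" + "/".join(parts) + "?format=json"

# A client (urllib.request, requests, ...) would GET these with a session
# token; PUT/POST against the same paths writes derived data back.
print(xnat_url("CARDIAC01"))
print(xnat_url("CARDIAC01", subject="SUBJ007", experiment="MR_SESSION_1"))
```

This URL-per-level structure is what makes it natural to wrap the hierarchy as Python or R objects, as Ryan describes for the Jupyter and RStudio integrations.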
That's particularly for Clara Train, so that clinicians and researchers can do AI-assisted annotation and segmentation directly in the browser and get real-time feedback on their data set and their training. And at the core of it all is the container service — you'll see several things integrate with the ML and container services. The container service is the part that monitors the health of all the individual pods, does the spinning up and down, assigns things to different projects, and exposes the XNAT data to different containers. So it is the base workhorse that does the orchestration for a number of tools. Those tools may talk back and be exposed to the user through the ML or viewer plugin; otherwise, they go and do their thing, dump the results back into XNAT, and notify the user — text message, email — and then they have their results. We're also, as I mentioned, looking at integrating a number of edge devices, having all of our containers as part of the AIS repo, and looking at upstream sources, such as what NeuroDesk is doing, potentially for secure containers. Two things that aren't necessarily in scope for the ARDC funding, but that we're looking at, are: one, integrating with edge devices not at the data management layer but directly at the computational layer. What this allows you to do is real-time or near-real-time reconstruction directly from the clinical site, using research systems with a much higher fidelity than is possible at any hospital in Australia currently. And two, with Clara, looking at federated learning between nodes, so that you never actually have to move the data — you just move your models. That's it. Thanks, Ryan. Any questions for Ryan? Nothing for Ryan? A quick one on Istio: what was your experience with having to deal with Istio, and what benefits did you get out of it? So I should clarify that we are looking at Istio, but we don't have it in a production environment at this point.
We have a number of nodes running on the old architecture. We're looking at going live — having the 1.0 of this Kubernetes Helm/Kustomize setup — by the end of quarter three or the beginning of quarter four, at which point we'd start moving nodes from the current architecture to the fully Kubernetes-based approach. Okay, thanks, Ryan. We might move on then; we're getting a bit short on time now. And the last presentation is from Gordon. Okay, can you hear me all right? Yes. I'll just try and share my screen. So let me find it on this. While Gordon's working that out: after his presentation we'll stop for about a 10-minute break, and then we'll come back for some more breakout rooms, where you'll be able to go through some of the things you're working on in your own teams in a more informal environment. But over to you, Gordon. Okay, can you see that? Yes, we can. It's not in presentation mode. How's that? I don't think it's in presentation mode still. Well, I may just go with it as it is. Yep, we'll be right. It's large enough to see, right? Absolutely. Okay, so I'm from the AusSRC in Perth. If we expand the acronym, it's the Australian Square Kilometre Array Regional Centre, and as the name suggests, we're working on setting up a regional centre to handle the data that will come off the Square Kilometre Array telescope. At this point in time we are working with the precursor telescopes — smaller telescopes already running: ASKAP and the MWA. From the development point of view we're a team of four, but we are hopefully expanding. We have minimal experience with container orchestration, but we have run containers — Docker and Singularity in particular — and we have a resource at Pawsey running on the Nimbus OpenStack there. So, the Square Kilometre Array telescope is expected to produce around 600 petabytes of data per year.
And science teams around the world will need to access the advanced data products that we will produce from the semi-raw data coming off the telescope. We've developed a system called RADIC — Radio Astronomy Data Enhancement Cloud — which provides these advanced data products. We have three types of users that we've identified: the scientists themselves; pipeline developers, who will develop the pipelines that produce the advanced data products from the telescope data; and system administrators — or we will hopefully have system administrators — to administer the system once it's up and running. So we're very much in the design phase at the moment, and it is very much a big data problem. We are bringing the computation to the data, as the data comes directly into the Pawsey Centre. We have been using traditional HPC methodologies — we have a Slurm cluster running, for instance, with a couple of projects up and running on the precursor telescopes. As you'd traditionally expect, it's a stable, mature technology; persistent data is easily catered for; and we have MPI and InfiniBand interconnects for parallelization of code. However, we have fairly inefficient resource usage, because the cluster is up all the time whether it's being used or not, and we have poor failover and redundancy — if a head node goes down on the Slurm cluster, we have to get in there, rescue it and restart the system. So we've been looking at containerization, and we realized that the touted advantages — high availability, high scalability, and especially resilience — are going to be very useful for at least part of our workflow. Abstraction over cloud infrastructure, given that we are running on the cloud, is very useful, but it does come at a cost, and that cost is the complexity of setting up and running your containers on specific clouds. There are some disadvantages.
MPI is something of an issue. We tend to focus on using Singularity, and we have anecdotal evidence that most of the codes coming into our pipelines that use MPI aren't actually using the MPI libraries for real inter-process communication — they're normally being used for more trivial tasks such as data partitioning, which we can achieve at the container level and, in most instances, remove the need for MPI in that environment, because most of our problems are embarrassingly parallel. Configuration can be very daunting — we are doing this all by ourselves and pretty much from scratch — and upgrades do seem to be quite an issue if you want to upgrade any part of the cluster. So, from identifying users from the precursor projects, the sort of architecture we're looking at is like this. The boxes in yellow we already have — and can you actually see my cursor? Yes. So the idea was to use container technology to provide JupyterHub and some Nextflow pathways for running pipelines. We have considered looking at OpenShift to sit on top of OpenStack, and that's possibly a path we'll look at a little further down the track, when we get administrative access on some of these clouds. But this was the vision we came up with — I might return to that in a tick. So, for the proposed architecture, we're restricted by the environment we have. We have the Nimbus OpenStack private cloud, essentially on Ubuntu images, but there's no LBaaS, Octavia or Magnum on this particular installation, and it was originally only a partial Heat installation. Without those additional plugins — much of which I found out a little further down the track, halfway through trying to install various versions of Kubernetes — things get slightly problematic. The requirements there are what came out of the use case analysis.
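The "partition at the container level" idea Gordon describes — replacing MPI with independent per-chunk jobs for embarrassingly parallel work — can be sketched as below. The container image and script names are hypothetical, and in practice each command would be submitted to a scheduler rather than printed.

```python
# Sketch: instead of MPI, split the input files into N near-equal chunks
# and hand each chunk to its own independent Singularity container.
# "pipeline.sif" and "process.sh" are made-up names for illustration.
def partition(items, n_chunks):
    """Split items into n_chunks near-equal contiguous chunks."""
    base, rem = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < rem else 0)  # spread the remainder
        chunks.append(items[start:start + size])
        start += size
    return chunks

files = [f"obs_{i:04d}.ms" for i in range(10)]
for i, chunk in enumerate(partition(files, 4)):
    # Each chunk is one independent, embarrassingly parallel job:
    cmd = ["singularity", "exec", "pipeline.sif", "process.sh", *chunk]
    print(f"job {i}:", " ".join(cmd))
```

Because the jobs share no state, there is no inter-process communication to orchestrate — which is exactly why MPI can be dropped for this class of workload.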
And some of the things further down here — the load balancers, high availability, user spaces and persistent storage — make it a rather more difficult task than just setting up a basic Kubernetes cluster. Originally we weren't wedded to any particular technology, so we had several things to look at. We did look at Docker Swarm, Kubernetes and Mesos/Marathon, but we settled on a closer comparison of advantages and disadvantages between Mesos and Kubernetes. So I went through the differences between them, and for what we actually wanted to do, there didn't seem to be much of an advantage one way or the other. We weren't likely to be scaling up to 10,000 instances at a time, so that particular strength of Mesos didn't really apply to us. The quote on the right is from Amr Abdelrazik of Mesosphere, the company behind Mesos commercially. He was asked what he considered the key differences and which environments suited the two technologies, and as he says there, if you're dedicated exclusively to Docker/Singularity orchestration and you're willing to get your hands dirty, Kubernetes is a good technology to consider. I put a reasonable amount of weight on that quote. I possibly should have looked harder at the second part of his statement — which, I suppose, you would expect, because he does come from Mesosphere, but they're all salient points: Mesos does make it quite easy to move these things across cloud providers and data centres, although "easy" is a relative term. However, we chose Kubernetes, mainly because, as we found out, we had no Magnum or Heat set up on our OpenStack, and we don't have administrative access to that infrastructure.
We already had a Spark/YARN/Hadoop environment running, so the additional features in Mesos were thought unnecessary, and without admin access we probably couldn't viably install it on the resource — or the level of resource — that we had. So this is what I attempted to build: a high-availability Kubernetes production cluster, with the usual HAProxy and Keepalived for failover and resilience, and persistent volumes coming via Cinder from OpenStack. We got a fair way with setting this up, but reading through the blogs of developers who had experience setting up Kubernetes from scratch bore, to me, a striking similarity to reading posts from survivors of bus crashes. Tales of woe and frustration; time to production was anywhere from a few months to a couple of years; and it did sound rather daunting. However, we thought, well, how hard can it really be? So we decided to give it a go — and the hard bits fairly quickly surfaced. There are dozens of tool pathways for installing Kubernetes, and depending on what you're installing it on, what versions of firmware you're running, and what operating systems you expect to face the cluster with, there are many combinations of these things to get Kubernetes up and running. I've used all of those, and for one reason or another had to drop them because they wouldn't quite do what I wanted — or at least I couldn't get them to do exactly what I wanted. In the end I went with Kubespray, with a version of Ansible and my own scripts, to try to produce what we wanted. I think I read somewhere that out-of-the-box Kubernetes is almost never enough for anybody — and that was before trying to add things like load balancing, ingress, HA proxies, persistent storage, shared volumes, multiple network interfaces, and especially VRRP on OpenStack, the Virtual Router Redundancy Protocol you need for Keepalived and HAProxy.
And on top of that, you've got metrics, service discovery, secret management, and a whole swathe of other things that you actually need before you can reasonably claim to have a production-level cluster running and available. Some of the guides I went through reinforced that, with interesting names such as "Kubernetes the Hard Way". I particularly liked "Zero to JupyterHub", because they didn't say how long it took — I'm estimating around 10^8 seconds. Sorry, Gordon, we might need to push ahead a little bit. Okay — well, I think this is the last slide. Most of that original diagram I've now got running. I still need to organize user spaces so that I can install JupyterHub, add an ingress controller, and integrate the autoscaling that Kubernetes offers with OpenStack itself. So what I've learned in the two to three months I've been at this is that setting up Kubernetes clusters is hard, especially for a production environment. Helm charts are extremely difficult things to be throwing at your developers and saying, well, this is how you're going to develop and deploy. Most of the research you have to do yourself, because of the many specific parts that make up a working Kubernetes cluster and the interrelated versions of all those parts. I found version creep to be quite a killer, and debugging — when it takes over half an hour to configure and spin up a cloud every time you want to change one configuration aspect — makes it a very time-consuming process. I'll leave it there. Thank you. Thanks for that, Gordon. We'll take maybe one question for Gordon, and if we get time later on we can come back. Any questions for Gordon? I'll go if no one else has one. What was your biggest hurdle with kubeadm, do you think, Gordon? Kubeadm?
My issue with kubeadm was that we were running on Ubuntu on OpenStack without LBaaS, and I was trying to set up a load balancer that would effectively run a public interface but still be secured correctly. I didn't get any further with it than that, because I switched back to Kubespray after hitting that block. Yeah, fair enough. Thank you. Cheers.