In my part of the talk I'm going to give, at least briefly, a quick introduction to Singularity, which is a piece of software I'm a developer on; here's our website. Singularity is developed as a container solution for high-performance scientific computing, as opposed to Docker, which was mainly developed for industry use cases. There are three main reasons we've found containers useful in scientific computing. One, you escape dependency hell: nobody has to go around managing everybody's versions of every library for every single scientist. Two, you can ensure that your code works on your laptop the same way it works on the cluster, every time; that's a huge benefit. A third, also very important point for scientific computing, is that we can take one file and send it everywhere, and that's really great for reproducibility of your research. You can take that file, send it to some other supercomputer, and have somebody verify your results.

I'm sure many people who have worked on HPC systems are familiar with this scenario: on the left, you're working on your computer, you run some code and it works; you send it over to run on the HPC system, and all of a sudden nothing works anymore. That's what we developed Singularity for.

With Singularity, any user on the cluster can run a container without any special privileges, as opposed to Docker, which always has a root-level daemon running. Singularity integrates right into HPC infrastructure: wherever you would normally run whatever code you've built, you can instead run a Singularity container as an executable, except now you're executing inside a container. Singularity is portable between systems; you can run your Singularity container on essentially any system that can support Singularity, and we actually have users running on incredibly old kernels. We don't require any new kernel features, so you don't have to make use of namespaces if you don't want to, and it will run even back on 2.x kernels. And any user can bring any container onto the HPC system, so you don't have to worry about screening users' containers for malicious content, or about the security implications of people bringing whatever container they want onto your HPC system. That makes the job a lot easier for administrators and IT security people who worry about malicious code being run.

Here I reiterate the same points. Something really important to note is the concept of one single image file; for scientific work we've found that's really important. One of the things we discuss in a paper we submitted for publication about Singularity is that we can now use just an image file to distribute not only the code you used to run your experiments, but also the environment you ran it in and the data you generated, so you can ensure that somebody who wants to reproduce your results has the means to do so.

Here's an incomplete list of some of the places that have installed Singularity. There are a couple of systems, I think, in the top 10 of the Top500 list, like Stampede, and GSI is on there, where they're still working on their cluster.

The basic usage of Singularity falls into three main parts of the workflow. In the first part, you create an image file.
That's usually done with sudo singularity create, giving it a name, which creates a physical .img file on disk. The second part is to bootstrap it, which is the process of installing whatever software you want in your image and configuring the operating system inside it. Then you run it, and we can run it in three separate ways: singularity shell opens an interactive shell inside the container; singularity exec executes any file you want inside the container; and singularity run is a special command. During the bootstrapping process we can generate a script inside the container that does anything you want when you do singularity run. That's also what happens when you execute the image file directly: it runs that run script, which lets you treat the image as an executable.

This is just a small comparison and contrast with a couple of other container solutions. As far as HPC goes, Singularity fulfills what we had determined were some very important points, and you can see that Docker doesn't fulfill a lot of them, since Docker was developed with a totally different goal in mind. Shifter and Charliecloud are similar to Singularity in that they were also developed for HPC environments; however, there are some slight differences between them, and this is just an overview of a couple of the commands and of which container formats each one can run. So now, I'll turn it over to you for the full-size part, please.

Good morning, everyone. I hope you enjoyed last night. I'm standing in for this guy, Cesar Gomed, who is the one that appears on the website and everywhere else; he had a medical issue. He's well, don't worry about him. But we'll have a little secret between everyone in this room and me: this talk was given by Cesar, so if anyone asks you, I'm Cesar. Following the introduction Michael gave us on Singularity, I'll be talking about a specific HPC use case for it.

For those of you who haven't worked as a systems administrator at a supercomputing center, the real pain is when a user opens a ticket pushing you to upgrade some kind of library. It's hell; there's nothing worse than that. You have a very stable, very performant configuration, and the trickiest part is maintaining it over time. As long as you don't upgrade anything, everything will be fine. So the moment you ask someone to install, say, OpenSSL 1.0.0p4 because you need it for that Python library that connects somewhere else, you are totally screwing them up. So please don't.

Given that, we have Singularity to rescue us. I won't be talking about the creation of a portable container, because Michael did that better than I would. Instead, I'll be talking about two use cases for Singularity in the HPC environment. One of them is having access to the InfiniBand interconnect and actually using it. The other is having very expensive GPGPU cards installed in the machines and actually using them with the same software stack you are used to on your Ubuntu or Debian laptop.

Well, the text is very small because I didn't make the presentation; I'd have done it better. But it's uploaded to the website; one of my slaves uploaded it about five minutes ago. The definition files will be available to all of you, so you don't go blind having to look at this small text. This part shows the basic commands for creating a container.
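A minimal sketch of that create / bootstrap / run workflow, using the 2.x-style commands described in the talk (the image name, size, and definition file name are placeholders):

    # Build steps: these need root on the build machine.
    sudo singularity create --size 4096 analysis.img       # creates a sparse .img file
    sudo singularity bootstrap analysis.img analysis.def   # installs the OS and software

    # Use steps: run as an ordinary, unprivileged user.
    singularity shell analysis.img                       # interactive shell in the container
    singularity exec analysis.img cat /etc/os-release    # run any program inside it
    singularity run analysis.img                         # runs the %runscript
    ./analysis.img                                        # same as "singularity run"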
This parameter is important: it's where you select the size of the container. You can make the container as big as you need, because it creates a sparse file on the file system. Let's say you need 32 gigabytes, but you won't be using all of them from the beginning of your development: you create a nominal 32-gigabyte file, but it only occupies maybe 200 megabytes on the file system, and it grows as you put things inside the container.

The bootstrap definition is the secret sauce of Singularity; all of the work with Singularity is done in the definition files. It's where you set up a minimal installation of the Linux distribution you're going to use within the container, and then you add as much software as you need, and as many stages as you need for building or installing whatever code has to go inside the container. One of the options is expanding the container: if you're running out of space because you have to download those NVIDIA drivers that take one gigabyte compressed and something like five uncompressed, you can expand your container.

The simplest definition file has this form. The first thing to specify is the bootstrap method you're going to use; almost all of our work has been done with yum for CentOS or with debootstrap for Debian or Ubuntu. You select the operating system version and the mirror you're going to download it from. There's a very clever trick here: you can install apt-cacher-ng on the machine where you're creating the containers, so you only download the RPM or deb files once and container creation speeds up over time. In the runscript section, you write the code that will be executed when you call singularity run or ./container-name. And the post section is the secret-sauce part for reproducibility, as Michael was telling us.

Let's have a look at the example. These guys must have been working with an association for the blind in Spain, because the text is so tiny it will push you towards joining one. We only have to concentrate on this part: we are creating a bunch of directories for the paths we are going to bind between the outside world, meaning the machine, and the inside world, meaning the container. Let's say we have a very high-speed parallel file system for scratch; we can bind that scratch inside the container, so when our application running inside our environment needs to write local files per node, they can be written to the fanciest hardware available. And if you need to dump some of your partial results to a shared file system, you can point them to a scratch file system that is shared among the machines. That gives you access to the actually enjoyable part of running code on a supercomputer, which is that everything is faster than your laptop.

In this part of the definition file, you install every dependency you're going to need. This step runs on your local machine, so the root privileges you need are more or less a given, or it can be run inside a virtual machine you've installed on a machine where you don't have root access, say. And in the runscript, you load as many of the environment variables you're going to need as possible. In the case of these examples, you're going to use OpenMPI libraries with the local InfiniBand drivers bound into the image, and you're going to use the OpenMPI version that best suits your code, so you're not limited to the OpenMPI or Intel MPI version available on the supercomputer.
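As a rough illustration of the kind of definition file being described here (not the exact file from the talk; the distribution version, mirror, packages, and bind paths are placeholders):

    BootStrap: debootstrap
    OSVersion: xenial
    MirrorURL: http://archive.ubuntu.com/ubuntu/

    %post
        # Install whatever the application needs inside the container.
        apt-get update
        apt-get -y install build-essential openmpi-bin libopenmpi-dev

        # Create the directories that will serve as bind points for host file systems,
        # e.g. fast node-local scratch and a shared parallel scratch.
        mkdir -p /scratch /gpfs/scratch /host-libs

    %runscript
        # Pick up host libraries bound into the container (e.g. InfiniBand drivers)
        # before handing control to whatever command was passed in.
        export LD_LIBRARY_PATH=/host-libs:$LD_LIBRARY_PATH
        exec "$@"

At run time an unprivileged user would then bind the host paths onto those directories, along the lines of: singularity run -B /gpfs/scratch:/gpfs/scratch analysis.img mpirun -np 4 ./my_app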
Given that, we have an almost magical example, as Greg told us yesterday, of what can be done with containers. This is actual code on a Spanish supercomputer. This is the native code; it's a latency value. This is the same code run inside a CentOS Singularity container, and the same in an Ubuntu Singularity container. As you can see, there's an improvement just from updating the version of the operating system. On the other hand, if you follow the path of CentOS 7.3 within Singularity, you're not going to get as much bandwidth, and the scale is misleading, because there isn't that much difference between the parts. But the general idea is that a container performs better than native code for this specific use case, and in terms of bandwidth a container performs almost equal to native code within the cluster. This is the setup for the benchmarks; it will be uploaded with the presentation to the web page.

Here's another example of trying to make you blind. Since you cannot see it, I will tell you what it says. We are installing the CUDA version we are going to need within the container, and the only thing we have to have is the driver available on the external file system. Given that it is loaded as a module into the kernel, we can run this code with the libraries we need, take advantage of having a very powerful GPU card in the host, and keep the rest of the driver stack outside, on the host.

Another example, this time with Chainer; it will be hard to read. If you thought the other slides were the difficult ones, this one is Paolo's fault. Here we are bootstrapping the latest LTS version of Ubuntu, and in the post part you can see we are compiling OpenMPI and installing CUDA inside the container, so it will have matching versions of the CUDA stack inside and the CUDA drivers outside the container. And you load those libraries when running the container, so you don't have to worry about running some script just before calling sbatch or whatever other queue manager you use, because the container takes care of setting up the correct environment itself.

More crappy small text. Here you are actually downloading and compiling the source code for Chainer, and when you need to run it, you call the Python executable inside the container with the Chainer example code that lives inside the container. In case you need to write, your home directory is bound directly inside the container, so anything you write to the home directory inside the container will be there outside as well. And if you need to bind special folders for, let's say, drivers or libraries for the InfiniBand stack, you can bind them by specifying the outside path and the inside path, and the container runscript will set up the environment for using those libraries.
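As a hedged sketch of what the GPU side of such a definition file might look like (the Ubuntu release, CUDA toolkit and driver versions, package names, repository setup, and bind paths are placeholders, not the exact ones used for these benchmarks):

    BootStrap: debootstrap
    OSVersion: xenial
    MirrorURL: http://archive.ubuntu.com/ubuntu/

    %post
        apt-get update
        apt-get -y install build-essential git python-pip
        # Install a CUDA toolkit whose version matches the driver series on the host
        # (the package name is a placeholder for whatever repository/version you use).
        apt-get -y install cuda-toolkit-8-0
        # Download and install the framework from source inside the container.
        pip install numpy
        git clone https://github.com/chainer/chainer.git /opt/chainer
        pip install /opt/chainer
        # Bind point for the host's driver libraries (placeholder path).
        mkdir -p /nvidia-host

    %runscript
        # The kernel driver and its user-space libraries come from the host and are
        # bound in at run time; only the toolkit lives inside the container.
        export LD_LIBRARY_PATH=/nvidia-host:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
        exec python "$@"

It could then be run by binding the host's driver libraries in, along the lines of: singularity run -B /usr/lib/nvidia-375:/nvidia-host chainer.img /opt/chainer/examples/mnist/train_mnist.py --gpu 0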
I'm a bit long on time, so I will give you some time for questions. And thank you very much for the presentation. Somebody has to ask something; we have a couple of minutes for questions. Any questions?

Could you provide some intuition as to why running it in a container would be faster? We were guessing it's the newer version of GCC compiling the libraries inside the container, but we didn't profile that; it's just something we found. It's likely that Ubuntu 16 is using GCC 5 and CentOS is using a somewhat older version, so it could be that, but we don't have data; it's just a guess. Yeah?

From what I've seen, you can also bind things into the container; couldn't you do some sort of privilege escalation when you run it? Oh, you will have the same permissions. Well, can you repeat the question? He was asking whether, when you bind host directories inside the container, you can escalate privileges inside the container against the actual file system outside it. For the creation part you need root privileges on the machine, but by the time you're running the container, you run it with the actual privileges of the user running it, and in a shared supercomputing environment that is likely an unprivileged user. Let's say you have to bind some /etc library paths for the InfiniBand libraries, and those files are owned by root and only writable by root: inside the container, you will have the same permissions on those files. So you cannot do anything inside the container that wouldn't be allowed to you outside the container.

I didn't understand you. Environment modules? Environment modules. But you have to install the module. Yeah, you have to have the module installed inside the container.

A bit louder, please. You mentioned that Singularity does not require root permission, right? But my understanding is that it requires either user namespaces or a setuid-root helper. Yeah, that's correct. So the question was that the understanding is that Singularity requires either user namespaces or a setuid binary, and the answer is essentially yes. That's the entirety of how Singularity works: either we have a setuid binary, in which case all privileges are dropped before any user code is run and we only use it for the certain things that require root privileges, or, on some newer kernels, you can do it entirely in user space, in which case you're using user namespaces instead.

Yeah, that's all we have time for. Let's thank the speakers again.