So hello, I'm Daniel, and this is Gregor. We are going to present this brief talk on HPC unikernels and OpenStack. So let's see how this works.

So, here we have the talk structure. It's in two parts. The first part is about our project, MIKELANGELO; this is necessary because we are promoting the work from there. Then we're going to dive into unikernels, and that part will be done by Gregor. I'm going to be sitting somewhere over here. Later on there's a Q&A section, of course, or if you need to ask any questions, just find us, hunt us down, no problem. Okay, let's start.

Okay, so this is good motivation, this gets you out of bed each morning. Basically, good old Chuck Norris always says something, and this time he says that we as a project want to make your computing more efficient.

Let's go into details. What is the problem we are trying to solve? We have HPC centers, right? These are big things: usually a lot of cores, a lot of big clients, a lot of big, fat simulations, like this one here. Oh, maybe not like this one here. For that, of course, you need this kind of machinery. But on the other hand, you have a lot of small companies who do a lot of simulation on their own, so basically small simulations, small things, which actually build up to these bigger things. Think Boeing and its subcontractors, this kind of stuff. These guys, of course, are here.

Now, HPC centers want to cater for these guys; they see a huge market. There are companies addressing this market, like UberCloud, and even European projects like Fortissimo, Fortissimo 2, and so on. Also, big software packages will provide you with the capacity to run your workload in the cloud or on an HPC cluster, something like that. And these big guys have a problem: they're not flexible enough.
They have an issue with changing their nodes, with being able to take the load from some workstation and just run it without the queue, something like that. So they know they have a problem, they know they have a market, and they're trying to address that. On the other side of the coin, we have these small companies, small to medium enterprises. What do they want? They want maximum performance on their already-bought infrastructure. Basically, they don't care; they have the machines, and they just want to use them completely, to the last cycle. Of course, only if it makes sense for them in terms of total cost of ownership and so on. They would also like to avoid vendor lock-in, obviously, so they would like to go for something that's open.

What's our offer? This is what we are offering to these guys: improved virtual infrastructure. For the HPC guys, we want to give them virtualized infrastructure, so software, in a way that a sysadmin at, I don't know, HLRS, an HPC center in Europe, won't just start jumping around about the last cycles. For the SMEs, of course, we want to give them really fast virtual infrastructure by offering something that's already standardized, and so on.

Okay, so now you know what problem we are trying to solve. Let's go a bit deeper into the overall approach, and then of course into unikernels. When we started the project, we set out a lot of big, lofty goals, like improved responsiveness, agility, and security of virtual infrastructure, big things. When we looked into this matter, we saw that, poof, a lot of stuff had already been solved.
Okay. Then we said, okay, let's try and see what has not been solved, and virtual I/O hasn't really been solved. When I say virtual I/O, I mean in the guest, in the host, and in between, and also when you look at things like general efficiency, like the size of images and so on. The second thing we said was: when we are building this, we don't want to go just for clouds, or HPC clouds, whatever that is, or just HPC. We want to build something that you can deploy everywhere, just take some components out of it, and so on. With these two goals in hand, we then had additional aims: we want to have stuff upstream, so it doesn't just die two years after the project is finished; we want to have it standardized, using standard cloud management software, like OpenStack, for example; and of course the packaging of applications, streamlining of this, adding some monitoring and security, which is really important, and so on.

Just to remind everybody, this idea is not really new. You had lightweight kernels back in the day. The timeline here is taken from one of the Kitten presentations, so you can see it's already quite old.

Now, if you look at the overall approach: I said, okay, we have lofty goals; now let's see what the layers actually are. On the lowest layer we have the guests. We chose OSv as our unikernel. Here we have virtual I/O optimization, of course, and application deployment, because the original one was so-so. Then we have KVM; everybody knows it. We added components like IOcm and ZeCoRx, coming from IBM, that improve the I/O between these two, and of course virtual RDMA and so on. If you go up to OpenStack, what we are actually doing is supporting all the image workflows of Glance, just working with OSv.
We have Heat templates, cloud-init support, and so on. And then we have some nice demonstrators, like on the right: typical HPC applications, we're talking about Fortran, OpenFOAM of course, then you have cloud-bursting examples, also some big-data stuff, and so on. We built up the security a bit on the host side; it's called SCAM. And of course we added monitoring from Intel, Snap.

What do we want out of this? Well, XLAB is a company, and all the other partners have this kind of stuff in mind when they say "expectations": adding or getting some metrics and analytics, inclusion of these components in IoT products or data centers, and of course an HPC cloud for the biggest HPC providers. The consortium is here; I think you know most of the names. For the smaller ones which you don't know, come back to us; we don't want to lose any time with that. And okay, we are done with this introduction. Take it away.

Thank you, Daniel. Let's see if it's working. Hopefully it will. So, the majority of this talk will deal with unikernels. Basically, we'll start with an introduction so that everybody knows what we mean when we say "unikernel". Then we will discuss how these are actually used in the real world, so the real applications that are the foundations of the MIKELANGELO project. Then we'll show case studies of how this was done for these applications, and also briefly discuss what the plans for the project are in the future. So basically, this is the diagram that shows what the purpose of the next 20-30 minutes will be. At the bottom layer, we are going to discuss the unikernel, the application packaging, the application management, and of course also the integration and use of OpenStack resources to launch unikernel applications there.
So, to get started. Contrary to the ordinary operating systems that we are accustomed to right now, so mostly Linux, a unikernel is comprised of just three main components. We have a bootloader that will boot the kernel, which is rather small, including just the major components, the major drivers that are needed to run in the target environment. And then we have the application, its libraries, and the data. So there are no redundant tools, no redundant drivers that are not needed in the target architectures or deployment scenarios for the applications. The kernel will only contain the following few components: basically the drivers for the block devices and the network devices; it will contain some standard libraries, some standard tools, and of course the file system support, so that things can be read from the disk. And the application itself: the image will just contain what the application needs to run. All this is wrapped into a bootable machine image. We are not explicitly saying that it's a virtual machine. Although, at least in the scope of the project, we are mostly interested in virtual environments, in the end it's not necessary that these bootable machine images are virtual, so they can be deployed on bare metal as well.

We have decided to split the unikernels into three main types. The first one is basically the new era of unikernels. They are language-specific, meaning that they require a specific language to be used, and they require specific toolchains to be used in this context.
So, for example, MirageOS will require OCaml; then there is HaLVM for Haskell, Clive is for Go, and IncludeOS is for C++ applications. In order to use these four unikernels, for example, you have to rewrite at least parts of your application, so that the unikernel will actually launch the source code of your application. We've added one special-purpose unikernel, which is HermitCore. As the authors say on their website, it's not yet production-ready, but its mission is to support HPC in particular, so HPC applications in particular. And then there is the last type, which we call general purpose. The two representatives here are Rumprun and OSv. They are completely different in their architecture and concepts, but the main idea that they share is that most applications that are available on current systems should be executable on these two types of unikernels.

As Daniel said before, our unikernel of choice is OSv. As the creators say, it's a new operating system designed for the cloud, and here are some reasons why we decided to use OSv. First, our goal is to support as many applications as possible out of the box. OSv is unique because it really provides a general-purpose baseline that is solid, it works, and it's mature, and it can also run different types of applications written in various languages, running on various types of runtime engines. Next, I won't say it's fully standards-compliant, but it's probably the most compliant of those that I've mentioned before. So, in a sense, if you have an application that runs on Linux, on POSIX systems, it will most probably run on OSv as well. It has KVM support, as well as some other hypervisors of course, and KVM was our choice in the project. It has demonstrated superior network performance in the past. And one of the main reasons as well is that numerous existing applications are already there. Basically, we don't just have some simple applications that are
there to demonstrate that it works; real applications have been used with OSv. And the last one: it has a nice community. It's collaborative, it's really nice to work with them, and they're also a member of the project.

Just a quick view of the overall architecture. The OSv kernel is at the top left. As you can see, there are just a few components shown, of course not all of them, but we have the driver for the network card, we have the driver for block devices, and then there are some modules; in particular, the ELF linker is one of the important ones, as you'll see later on. We have memory and thread management and so on. On the right-hand side, we have an OSv image instance, which is composed of this kernel and some other modules that we'll see later on.

Okay, so some of the advantages of a unikernel. First of all, they are really lightweight: the images that we get from unikernels range from hundreds of kilobytes to a few megabytes. There's none of the unnecessary overhead that typical operating systems, typical cloud images, carry in their base images. This is of course efficient when we need to launch multiple applications, multiple images, on distributed nodes, and it's also more energy-efficient. It boots extremely fast: in particular, for the language-specific unikernels we are talking below half a second, and for some others we are talking about one second, two seconds, something like that. So most of them are really fast to boot. Most of them require, or I would say support, a single address space, so we have a single user running everything in a single address space: the kernel and the user application.
They are all in the same address space; there are no switches between kernel and user modes, so we also don't need any permission checks, because everything that runs within the OSv or unikernel image is owned by that single user. The next one is that there's no, or at least little, legacy code that would prevent enhancing the kernel itself. And last, they are somewhat more secure. In a way they are secure because they are running in a virtual machine, and also because they are so small and contain only the things that they need, the attack surface is rather small.

Of course, there are plenty of disadvantages, even though we are listing fewer than there were advantages. I will discuss each of these disadvantages next. In a way, we said that the single address space is an advantage, because we don't need any permission checks or any additional constraints and limitations, but there is also a problem with that. Then there are missing functionalities in the core kernel and core libraries; not everything is provided there. It's rather complex, or it used to be rather complex, to build applications for unikernels; these things are improving. And there are some performance concerns.

So now we come to unikernels in the real world, so what it means to use a unikernel for your applications. The first one, as we said before, is the single address space, which means, as I said, that the kernel, the libraries, and the user application are all sharing the same address space. Essentially, this means that we are not able to fork in a unikernel.
So we have a single process running, but we can run multiple threads. Whenever a fork occurs, the kernel will most probably die. This talk is about HPC as well, and most of you know MPI, of course, and you know that MPI uses processes; whatever you do with MPI, it uses processes. So this was the first hurdle that we had to overcome when we were trying to use real applications with OSv, and for this to work we actually had to change something in the kernel itself. It's not always possible to simply replace a process with a thread; sometimes it is, but most of the time it's not feasible. For example, just look at the fact that MPI will always pass different parameters to each process through environment variables, and this is something that we cannot do with threads, because they share the same environment. This is the first thing that we had to change in OSv, so that we are able to pass a different environment to each of the threads running in the same unikernel.

The second thing that we had to do was to change MPI itself; in this case we took the Open MPI implementation, which actually uses SSH. Whenever you request work using MPI, it will connect to each of the target nodes through SSH and request the workload to be launched. What we did is change the component responsible for spawning these processes to launch threads in different OSv instances instead, and we now have a working version of this. Basically, we are able to spawn as many OSv instances as needed and launch MPI applications there.

Nope, this is not what I... what was that? Sorry about that. So, this moves us to the second limitation, missing functionality. Frequently you will find in a unikernel that there are some missing system calls, some missing library functions in the standard libraries.
The good part is that these calls are rarely used in the context of a unikernel; these functions are typically used in multiprocess environments, which make no sense in a unikernel. Frequently, when we were dealing with this kind of problem, stubbing those functions, so basically just adding them to the library without doing anything there, solved the problem. Not always, of course, and this is where the community comes in, because the community around the OSv kernel will resolve most of these issues rather quickly.

Okay, the next thing is the applications. If we are to use a unikernel, we will not be able to use unmodified binaries in any way; we will have to do something with the application, with its source code, in order to use it. I'm talking here about native applications, of course, so those written in C or C++. If you use Java or Node.js or whatever, you will be able to just deploy that onto the framework; but if you have a C, C++, or Fortran application, you will have to at least recompile the application. So that's the first typical change. The other one: sometimes you will need to actually change the source code of your application to make it work. And of course the process is there: compile, compose, run, debug, and so on. So basically you really have to add this layer to your workflow: you're not just compiling the application to run on your hosts, you also have another step, the compose step, and then a run step with the unikernel, to make sure that it really works in there.

Okay, and then comes what we've been doing in the project to support this last step, so basically composing and running applications in a unikernel. We've decided that we would like to have support for application packages that are basically composable building blocks. So we take the kernel itself; that's one
module that we would like to include in our image. Then we would like to have the command-line interface for the kernel, basically allowing us to run some simple shell commands in the unikernel. Then I would like to have the OpenFOAM application, which is comprised of different modules and also requires the Open MPI package. I'll try to show you this in the demo after this part. So the package itself should be self-sufficient: it should contain everything it needs to run, and it can of course do this by itself or by including some dependencies, so basically declaring what it needs to run in there. The structure of a package is rather simple. On the left-hand side, the only mandatory file is actually the one called the manifest; it's basically a package manifest file that defines some basic information about the package. Everything else is optional. The right-hand side of this image is basically a verbatim copy of the tree structure that you have in a package directory. Whatever you need to run the application, you just put in a folder; you can use an arbitrary directory structure. In this case, we've used usr/lib and usr/bin to indicate that these are binaries and these are some libraries.

Okay, when we have a package, we need tools, and there are two tools in particular that we are working on in the project. The first is called Capstan, a previously developed tool that we extended to give it this notion of an application package. What does it support? It will allow you to manage your packages and your applications: it will facilitate the initialization of a package, the collection of the data, the files that are stored in each package, and the composition of virtual machine images. So we have a tool that will automatically build a runnable virtual machine image, and also facilitate the execution of these images, either in a local environment or, in this case, in OpenStack.
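As an illustration, a package manifest of the kind just described might look roughly like this. The field names are indicative only; the demo later shows a name, title, vendor, and required packages, but the exact schema depends on the Capstan version:

```yaml
# Illustrative package manifest (the only mandatory file in a package).
# Field names are indicative and may differ between Capstan versions.
name: demo.openfoam
title: OpenFOAM demo application
author: example-vendor
require:
  - osv.cli        # simple command-line interface for the unikernel
  - osv.openmpi    # MPI runtime required by the solvers
```

The require list is what lets the compose step pull in the kernel CLI and Open MPI modules alongside the application's own files.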
There are also some run configurations, which simplify the execution by providing standard ways to run your applications, so that you don't have to type long commands. Because a unikernel doesn't have a shell in the way we know it in typical Linux, it has a rather simple shell, a shell-like command-line interface that allows you to do something with the unikernel, but it doesn't have a complete shell. The Capstan tool also has support for pushing the virtual machine images that we've just built onto OpenStack, and also for running these images there. We also provide a package hub, which is a set of packages that we are working on in this project.

Then there is another tool called UniK, a rather new tool. It's being developed by a group from EMC. It's a really nice tool; even though I think it was first released in May this year, it already supports plenty of the unikernels that I've mentioned before, and also plenty of providers, so the back-end infrastructure providers that can be used to deploy these applications. What we added to UniK is the integration of the Capstan tool that I mentioned before, so that we have the notion of an application package in UniK as well, and we added more advanced image and instance management in UniK for OpenStack, for the networking itself. The next step is to support volumes as well.

Okay. I think this is the last slide before the demo. The nice thing about what we've been doing is that it mostly works out of the box with OpenStack. These are the services that we are currently using: we add the images that are composed from application packages into Glance, and we use Nova to launch the instances there. Of course there's some networking, but it's not really related to the kernel itself; it's port orchestration. And we also
added some extensions, not for the unikernel, but for the application itself, for one of the use cases that we are using there.

Okay, so I think that we have 10 minutes. Let's see if we can get this demo running. So, it's working. Can everybody see it? Okay. This is what we call a package directory. As you can see, we have the meta file, which I can display. It contains some information about the package: the name, the title, the vendor, and also the required packages, the packages that we need in order to launch such an application. In order to build, to compose, the application from this package, we can just use the Capstan tool: capstan compose, like this. What this will do is collect all the information from the current directory, also include everything that it finds in the required packages, and build a runnable machine image. And it has already done that, so we can just run the application now with its name. I will add the command there: demos/demo-one, I think it's that one. Let's see if that will work. What was it called? Demos... thank you, demos/demo-one, like this. So we now have a simple shell in this OSv instance, and that's running in a virtual machine on this machine. As you saw before, this one also has a simple OpenFOAM application there. Those of you that know OpenFOAM will also notice that there is a case, so basically the input data used by OpenFOAM whenever you would like to launch the simulation. In order to show this, let me try to run demos/demo-one, and in a few seconds it should start running the simulation. It's using KVM here, so basically we have a new virtual machine that's running the simulation on this computer, like this. Okay, I think we're running out of time, so I will skip the other demos.
So, we can discuss more about that later on. Okay, I will go quickly. The use cases, or the case studies, in the project: we have four of them; I'll present three, because the fourth has not been started yet. The first one is the aerodynamic simulations that are used by a company from Slovenia. They're using OpenFOAM. Again, if you know OpenFOAM, you know that it's a large source code repository containing a lot of different applications and different libraries. But the good part was that no changes to the code itself were required to support a unikernel. There were some changes necessary in the build system, mostly because OSv needs shared libraries in order to launch applications in a unikernel, and there were also some build dependencies missing in OpenFOAM itself. So this is what we had to change so that we were able to automate the process. There were a few missing functions in OSv and its standard library that were added, rather simple calls, not critical for the simulation, but of course necessary in order to launch it there. The good part is that, as you saw before, we can run unmodified commands, so we don't need to change anything in order to launch applications in a unikernel. And we also have multiple pre-built application packages for OpenFOAM. OpenFOAM is comprised of multiple applications, multiple solvers, and all of these just work using these packages.

The other simulation, also an HPC application, is the bone tissue simulation. It's proprietary simulation source code. There were some modifications to the code, but again irrelevant for the simulation itself. There were some missing functions in OSv, but in this case we simply removed those calls from the source code itself. Recompilation was necessary, and again, we can run unmodified commands.

The third use case, or case study, is big data. There are three tools mentioned on this slide: Apache HDFS, Apache
Storm, and Spark. We have HDFS implemented already. There were just two forks in the code, in the Java code, which are again completely unnecessary for HDFS to run there, and again, we can run unmodified commands. For Storm and Spark it's, let's say, slightly trickier, but we have plans to implement this support, so that we'll be able to launch Storm applications, Storm workers and Spark workers, inside unikernels.

Okay, the evaluation here is rather simple; it's preliminary. We have plans to add more complex evaluations by the end of this year, but just to give you some idea of what we can achieve at this point with the unikernels. The project is ongoing, and we are investing a lot of effort to make this work in this kind of environment. To give you an idea of image size: perhaps it's not a fair comparison, but it's nice because it shows that we are smaller. We had an Ubuntu-based image, a cloud image; we added OpenFOAM there, made a snapshot, and we got an image, let's say an image snapshot, that consumed two gigabytes of disk space. On the other side, you have OSv, where the entire image, containing the kernel, the OpenFOAM solvers, and the Open MPI library, without the input case of course, took only 65 megabytes. This is of course much easier to distribute over the network.

The provisioning times: we have two measurements for Linux. On first boot, you can see that the spawn time is rather long; it took more than two minutes to launch the application. On the next instantiations, provisioning was much faster, but again, OSv seems to outperform it due to its small footprint and small initialization step; basically, the booting is trivial.

The runtime: again, a rather small use case showing two or three interesting parts. The first one is that we are getting one to two percent slower execution in OSv.
We're still investigating where the time is lost there. The other one is on the right-hand side. This is one VM running on a single node with two NUMA domains, and as you can see, Linux quite clearly outperforms OSv; the problem here is simply that OSv is not NUMA-aware at the moment. The middle case shows two VMs, each running on its own NUMA domain, and of course the communication between the VMs is using TCP/IP. The reason why, in the last case, OSv is even slower than in the second case, where we are actually using TCP/IP to communicate, is that in the last case OSv, or rather MPI itself, doesn't know that there are different types of processes. In the second case, MPI knows what it costs to communicate between the first four workers, between the other four, and between the two groups; but in the last case this information is missing, so MPI thinks that they are all equivalent and distributes the workload in a worse scenario, making it slower.

Okay, quickly, the future. For the application packaging and the application management, this is something that we would like to get to: whatever you saw today, we would like to have it integrated with OpenStack applications. We would like to have applications not as packages composed with the Capstan tool, but as something that a user can grab from the application catalog, compose automatically, and deploy on the underlying infrastructure. So this is one thing. From the perspective of the entire project, there are, let's say, two high-level goals. One is compatibility.
So basically, improving what can be run on the OSv unikernel. And then we have the performance, where we are working on this as well. One is the dynamic workload manager that will dynamically adapt, or fix, the I/O cores necessary to execute your applications. Then there is a paravirtual RDMA driver that will be built into OSv, and into Linux as well, security in the hypervisor, a full HPC cloud, and integration. Okay, I think that concludes it. That's it.