Thanks again, everybody, thanks for coming back in. We will now begin the exciting porting portion of the afternoon. Actually, I think it's already been fairly exciting, but this is another thing where there's a theme in this room: it's all stuff I didn't know three months ago. All of it. I thought I was pretty smart, but apparently I was wrong. So this is a talk about unikernels. It turns out that a unikernel is a very interesting thing with a lot of cool uses that I had no idea about. Our distinguished guest Uli is going to start talking about them, and then our intern Ali Raza, who has been working with us all summer and is working on a PhD in this subject, will continue from there. Thanks, Uli.

All right, I want to keep it short. Ali is the one who's doing the work and he deserves to speak about it. I just want to say that unikernels have been of interest in academia for a while, and Orran will expand on this if necessary. So when we are talking about something in this area which is also useful for Red Hat specifically, or for the industry in general, we don't like to start from scratch and do something else, expanding a code base which we then have to maintain and support tremendously. So when we discussed this a couple of months ago, we came to the conclusion: let's try something else. Let's try to do something where we are not expanding our code base that much, where we are trying to use exactly what we are already doing at Red Hat, the Linux kernel plus its userland environment, and build something which provides the benefits of a unikernel, which Ali will explain in a bit for those who are not aware of them, but at the same time is something we can actually support. What Ali will present is: yes, this is actually possible. For us this means that perhaps we will be able to base products on it, and we will be able to do more research in exactly that area going forward.
So I might say a couple more words at the end, but for now, let's talk.

Yes, so I don't have a display over here, so I'll have to just point there. Hi everybody, I'm Ali, and I'll be talking about the project that I did over the summer: unikernels based on Linux. My advisors for the project are Uli and Richard Jones from Red Hat, and my PhD advisor, Professor Orran Krieger, from BU. Also from Boston University are my fellow PhD students Jim and Tommy, who have also been involved in this project and have been helping with it.

So, first questions first: what is a unikernel? This is what a normal kernel looks like: there's a separation between kernel space and user space. The kernel contains whatever the applications might need: file systems, device drivers, and the core OS functionality, for example the scheduler or memory management, and multiple applications run on top of that. But in a unikernel, there's just one application, which includes only the required functionality as a library. Whatever is needed is contained in this one binary, which then runs on top of the hardware. That is what a unikernel is.

Now the next big question: why unikernels? The answer is really simple: because they're great. Why are they great? First of all, they're lightweight, because a unikernel only contains the bare minimum functionality that the application needs and nothing else. That means, for example, that if an application does not need file system support, the unikernel will not have file system support in it, which means a smaller attack surface, because there's a lot less code to exploit. Also, the unikernel, when booting up, will only initialize the devices and other things that the application needs, so that results in faster boot times. Unikernels also give us improved performance, because since there's no separation between a user space and a kernel space,
we do not incur the ring transition overhead. And with a unikernel we can do application-specific optimizations: for example, in a unikernel you can have a very efficient, extremely stripped-down network stack, because there's only one application that uses it. There are many unikernels out there, for example EbbRT, ClickOS, MirageOS and many others, and as I already said, they give us huge performance benefits. For example, EbbRT gives more than two times the memcached throughput compared to the same thing running on plain Linux; LING's own website, which runs on the LING unikernel, takes only 25 MB of memory; and network processing software built on the ClickOS unikernel processes more than five million packets per second, with a boot time under 30 milliseconds.

So, as we see, unikernels are great, but they've mostly been limited to academic and research circles; they're not as widely used. One of the reasons is that they are not general purpose, which means that you cannot just take any application and compile it into a unikernel without modification. So what we're going to do is look at why that is the case and what we can do about it.

Starting off with the development model of unikernels: this is one of the main reasons that unikernels are not as widely used. There are two approaches that developers of different unikernels have taken to build one. First is the clean-slate approach, where you start fresh, write the entire unikernel yourself, and build it from the ground up. Second is that you fork an existing code base, for example Linux or NetBSD, change it, modify it, do all the optimizations that you want, strip it down, and make a unikernel out of it. As we discussed, this approach has nice advantages.
For example, the developers have total control over the code; they can do the different optimizations that they want, and this results in great performance benefits. But it also has its drawbacks. For example, you cannot just run any legacy application on these unikernels without modification. And there's no community around these projects: compared to Linux or glibc, where there's an entire huge community which keeps maintaining those projects, these unikernel projects don't have that community around them, so they are a maintenance and testing nightmare.

So what can the future be? What questions can we ask? The main question we ask is: how can we change the development model? Can a unikernel be part of the Linux code base, the Linux project, and the glibc project? In this development model, the main thing would be that you just need a unikernel that works. It might not be the best unikernel or the most optimized unikernel out there; it just has to work. And just as Linux has undergone incremental improvements over the years, this unikernel, once it is part of the Linux code base or the glibc code base, will also undergo incremental optimizations. Also, since this is a huge community, they will keep maintaining this unikernel as part of the Linux and glibc code bases.

What are the advantages? First of all, we don't have to reinvent the wheel; we can simply use the entire Linux code base, which means that we have an unchanged API for developers and for legacy applications, and we can support all the different device drivers and file systems, which results in a unikernel that can run in a virtual machine as well as on bare metal. That's an important thing. So imagine, if this were true, if we could do this: imagine a unikernel with GPU support. How cool would that be?

Right. So now the big question, the big elephant in the room: is it even possible?
Yes. Over the summer we built such a unikernel, based on Linux. What I'm going to show you is that unikernel booting up, with a simple TCP server running on it. As soon as I can figure it out... you will see the kernel booting up, and then you will see the server come up. So, if you can all see: this is what we've done over the summer, a unikernel based on Linux which runs a simple server, and it works.

Now what we'll discuss is what the goals for this project were and how we actually did it. The goals are based on pretty much what I've discussed before. We want upstream acceptance; we want to make as few changes to the Linux code base and the glibc code as possible, so that we have a higher chance of getting this accepted upstream. We want an unchanged Linux API. We want to be able to deploy this unikernel on virtual machines and on bare metal, and we want unikernels which are application-specific.

Talking about the architecture: this is a normal Linux architecture. You know how a normal Linux stack looks: there's kernel space and user space; there are applications, user libraries, and then the C library. Normally what happens is that the application makes function calls into the C library and function calls into the user libraries; the user libraries then make function calls into the C library; and normally it is the C library which then makes system calls into the Linux kernel. Now, since this is a unikernel:
We don't want this system call functionality. What we have in its place is a very small UKL library: the C library makes function calls, not system calls, into the UKL library, which then calls the appropriate code in the kernel. All of this together is our Linux-based unikernel, and when we compile it, it all becomes one big binary. There's no separation between different libraries or different address spaces; it is just one thing which runs on the hardware.

Now, this is what we've done currently; this is where we are now. But in the future, we don't want this UKL library to be there: we want some of its functionality to be part of the C library and some of it to be part of the Linux kernel. That is what we want our final Linux-based unikernel to look like.

So here are some of the implementation details. It might look as if, in order to build this unikernel based on Linux, we had to make many changes to the Linux code base and glibc, but that is not the case. Here we have the main.c file from the Linux source tree. In it, just before user space would be started, we add a call to our own function, ukl_main, which comes from the application. So just before user space would be started, we call our own function instead.
This is some application code; this ukl_main, as you can see, is provided by the application. Continuing with the example of the TCP server that I showed you booting up: this is simple TCP server code, and as you can see, the headers are the same as any application's headers would be. The main difference is the initialization code. Normally the initializations are done by systemd, but in our case we have to do them ourselves. The initialization code runs before the application code and does the application-specific initialization; in our case, for this TCP server example, we are bringing up the network interface, so that code goes in the initialization. After that, the rest of the code remains the same. An interesting thing, if you look at the bottom, is that this function does not return, because this is the only process in the kernel right now; if it returned, the kernel would be rendered unusable. So this function does not return.

Now, coming to how we compile, build, and run this. Compared to how you would compile and build a normal application, what we do in our case is: we compile the application code, we also compile the UKL library, and we package both of them together into an archive, ukl.a. After making the archive, we borrow a lot of functionality from the Linux kernel build process: while we're making the kernel, in the linking phase, we add our ukl.a archive, so it gets linked at link time, and that becomes one binary which you can then deploy. Right now we're testing it on QEMU.
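The server slide just described might look roughly like the following sketch. This is not runnable userspace code, since ukl_main is invoked from inside the kernel; the entry-point name and the interface-setup helper here are illustrative guesses, not the real UKL API:

```c
/* Hypothetical sketch of a UKL application: the kernel calls
 * ukl_main() just before it would normally start PID 1. */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustrative stand-in for the initialization normally done by
 * systemd (bringing up the NIC); not a real UKL function. */
extern void ukl_bring_up_network(const char *ifname);

void ukl_main(void)
{
    ukl_bring_up_network("eth0");   /* application-specific init */

    /* Ordinary BSD-socket code; under UKL each call is a plain
     * function call into kernel code, not a syscall. */
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 16);

    for (;;) {                       /* must never return: this is */
        int c = accept(srv, 0, 0);   /* the only "process" there is */
        if (c >= 0) {
            const char msg[] = "hello from UKL\n";
            write(c, msg, sizeof(msg) - 1);
            close(c);
        }
    }
    /* returning here would leave the kernel unusable */
}
```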
So the video that I showed you was this same unikernel running on QEMU. In the future, what we want is to just call make on a user application: you say, OK, use this compiler, ukl-gcc, and that's it. You run make on your application and it gives you a unikernel, which can then be deployed on bare metal, on a virtual machine, or on an emulator like QEMU.

There were a lot of positive outcomes over the summer, things that we can build on. We added just one line to the Linux code base. In glibc, the redirection code, which instead of making system calls into the kernel makes function calls into the UKL library, lives in a separate subtree; the API remained unchanged. Because of all this, most user libraries will not have to be recompiled for our unikernel. For the ones which do make direct system calls, we just need to change the name of the function, and then they can also be used with our unikernel. So we believe that, with these modest changes, our chances of eventually getting accepted upstream are very high.

What are the next steps? There's lots to do. First of all, as I said, we need to take care of the initializations that are normally done by systemd, so that before any application code runs, the application-specific initializations run. We also need config options, so that users can say, based on their application: these are the things I need, turn on this functionality and turn off that, depending on what the application needs and does not need, so that our unikernel is extremely application-specific and trimmed down. We want support for pthreads and C++. Also, we need to clean everything up: as I showed you, right now we're borrowing a lot from the Linux build process, and we want to clean this up in a nice and tidy way so that it can finally be accepted upstream. We also want automatic optimization, so that
instead of the user telling us what the application needs, we can just look at the code and only include the things that application actually needs. So I think I have my work cut out for my PhD.

Based on what we've done and what we plan to do, there are a lot of interesting research questions we can ask. For example, what are the performance and start-time benefits we can get? If we don't do any optimizations at all and just run it as it is, what performance and start-time benefit do we get compared to a plain Linux kernel? If we only do link-time optimizations, what are the benefits then? And after that, building on what a lot of other research unikernels have done, for example customized code paths and non-preemptible threads, things like that: how can we get those optimizations into our unikernel, and what benefits do we get? Also, as we keep adding all those optimizations to our unikernel, can we still keep the changes to a minimum, so that this can still be acceptable upstream? And with this unikernel model, how can we benefit from the security guarantees it can give us? We want to look at the performance and security benefits across the different use cases in use today. For example, normal cloud workloads such as memcached: how can we improve the performance of memcached and give better security guarantees? How can faster boot times help functions as a service? Can we explore these benefits in other use cases, for example HPC and embedded systems? And also storage: if there's an application which is extremely I/O intensive, what can we do to optimize that in our case?
So yes, as I said, based on what we did and what we're going to do, there are a lot of interesting questions out there, and we'll be looking at them. Now I'll ask Uli to give his comments and his input on where this project is going.

Is this back on yet? Okay. So hopefully you're a little bit more excited about the topic than before; if you didn't know anything about it before, you should now be a little bit more excited. The main benefit is that we don't have to change much code, and that is a prerequisite for Red Hat actually using this at some point in the future. If we actually get to the point where we can do this, we just have to recompile the kernel and some of the low-level libraries with a special option, and we get something usable as a unikernel. You saw a couple of slides back what we call, at the moment, ukl-gcc; what we are going to call this in the future remains to be seen. But the goal is really that, in the build process, the only thing a programmer has to change is to use a different compiler. We will get from the kernel build process all the code which normally makes up the kernel, as an archive. Then there's the startup code, some of it perhaps dynamically generated when the program is built, based on the configuration the user is providing. Then the user code itself, plus perhaps some third-party archives of some sort; and as Ali mentioned, we don't even have to recompile them in many cases: if these archives are not doing any direct system calls, the ABI hasn't changed and we can just reuse them. All of this gets linked together, and we have something which can boot, regardless of what it is and what it is running.
Part of the work still to do is to make this more generic. We might have to add initialization pieces, things like DHCP clients and so on, but we know about this and we're on track to do that. What you see here on the slides, for instance, is something which is a little bit alien to me; this is not my world, things like function as a service. So I talked with Orran about this, and he said: no problem, we can port a JVM on top of a unikernel in a couple of days, because he has done that in the past. So I'm looking forward to that.

There is also some Python in there. Does anyone here know MicroPython? MicroPython is a Python implementation for microcontrollers. I want to have, basically, normal Python in the unikernel, which is the equivalent of that for my larger machines. And as Ali said, this has implications not only for performance; it also means that some of the attacks which exist out there will have no effect, because all of a sudden these programs look completely different.

Back to function as a service: the main draw there is that you don't waste resources running the code if there's no user around. You only spin up the code when it is necessary, and for this to be useful, the hysteresis of the control system which decides when you shut the process down and when you start it up again has to be as narrow as possible. The shorter the startup time of the process, the narrower you can make it, and the more energy you save and the fewer resources you use. A normal kernel boots up, if you tune it, in let's say 30 seconds; this thing we can perhaps bring up in a millisecond. So it's much more efficient: we can just kill it and bring it back up again and have the same function running. So that is one part, and the other part which I
want to mention as well: in my career here, I'm mostly interested not in Java services and those kinds of things; I'm mostly interested in high-performance, low-latency computing and so on, among them some machine learning now, which is just high-performance computing. What has been bothering the HPC community, at the very least, for a long time is that they don't have as much control over the machine as they want. So they are doing things like core isolation, moving interrupts to specific cores, and all these kinds of things. Imagine if they could compile the application as a unikernel and have 100% control over how the different cores are used.

But even in that use case, you're still saying that this would have to be invocable through something like QEMU or otherwise? And even in an HPC cluster, and for functions as a service, the thing would likely be running as non-root?

Could be. For function as a service, yes, you would run this in a container or a virtual machine and bring it up there. But this can boot on naked hardware; it can boot on a virtual machine. You don't need QEMU under this; QEMU is just for development. This is just like a normal Linux kernel, which you would have instead; it just doesn't spawn PID 1, it does the work internally. That's the only difference. This is what Ali showed there on the slide: normally, in init, the execve is what spawns init as PID 1. We took this out; we run the code directly. So this is actual code which can boot on bare metal or on virtual machines.

Can I jump in for a second?
Two points. First of all, I never said two days, but a couple. One thing I want to emphasize here, because it's really important to me: there's all this work going on now because we're dealing with really high-speed devices, SSDs, networks going up to 100 gigabits, and people are using DPDK and SPDK, doing things at user level and bypassing the operating system. This approach gives you a totally different, alternative way of addressing those problems, with hugely less complexity: recompile your program in this way, have it be part of the kernel, and then we can start seeing incrementally how to get rid of copies and things like that. So I think it's going to be a much, much more efficient way. And the second thing: because the application and the kernel are bundled together, it's an encapsulated object, so you can snapshot this thing, you can move this state around, you can replicate it. It becomes a manageable object in a much easier way than what we have today, and that's what these guys are actually looking at with function as a service.

One question on security: when you go to compile it, do you only pull in the pieces of the Linux kernel that you care about, so, for example, file system drivers and so on?

Yeah, that's the goal. For the time being, as Ali mentioned, we are going to do this based on config options, but later on there's the link-time optimization that was mentioned. In that path, the compiler actually knows a lot of global information about the code, because it sees all the code. If everything is compiled with LTO enabled, we not only have the source for the binary code we generate, we have all the meta information still available, and you can run information discovery based on that. You can find out: oh, there's never going to be a file system access, so we compile this out.
There's never going to be the SCTP protocol, so we compile that out. We never use TCP, compile that out. So imagine how small the attack surface all of a sudden becomes.

You said there are going to be few changes needed for the user-space or kernel program, but I'm curious if there is any sort of programming construct that would be encouraged or avoided, such as floating point, or mutexes and semaphores and the like?

Yeah, so jump in if I'm wrong. The thing is that the only thing you cannot do is call fork, because this is a one-process environment. Everything else should work. We have our work cut out: part of the slides was that we have to enable TLS, we have to enable C++ initialization, all these kinds of things, constructors; that remains work to be done. But we're expecting that we can actually have the ABI and API be 100% compatible with what they are today. As for mutexes, the synchronization parts and so on: multiple threads we will support, of course; we have to support them. The synchronization uses futexes, and the kernel internally is already using futexes, so there is nothing special there. The only thing is that instead of having a system call, we are using function calls internally.

Okay, I just had a quick question. You talked about running Java, and I see all kinds of alligators there. If you have dynamic class loading, then all of a sudden you weren't using, say, TCP, and now you are, so you can't know ahead of time.

That's one of the limitations you have there. You need to have not a JVM binary; you need to have the archive which makes it up, and everything is statically linked. Dynamic linking will not be, well, at least not initially, supported.
Theoretically we could do those kinds of things, but then we would have to rely not on whole-program analysis but on other analyses to decide what to leave out, and it would become user-configurable.

The other thing you said: you're limited to one thread, right? No, one process. One process, so I can start up several GC threads? Yes, just no other processes. Okay. That's the big difference: POSIX threads versus POSIX processes. In other words, Orran already did this work in the past, and this is why I claimed he said it was two days of work.

Have there been any upstream comments yet? We haven't taken this upstream yet, but as I described, so far the only real change in the kernel is the single line of code we had to add. There will unquestionably be a little bit more, but we can isolate these kinds of things, and we will have discussions with the folks. The C library part is even less controversial: after fixing a couple of bugs in there, the only thing I had to do was add a subtree to the source code, and it's not touched by any other configuration. So I see no objections.

Do you have a git repo or GitHub link for that? Not yet.
It's all private at the moment, but we are going to make it available at some point soon. We want to clean up a lot of the build process before we go upstream. He only really got the server to work a couple of days ago; there were a couple of challenges, and he had many papers to write for his professors.

From a security perspective: if the application and the kernel are separate, then when the application is compromised, the attacker still has to find a fault in the kernel to get control of the whole system. But when you combine them all into the same memory space, then when the application is compromised, the whole system is compromised, right?

Yes, but the problem with compromising a kernel is only that you can then use it to leverage compromising the other applications running on it. If there are no other applications, what do you gain?