Hi everyone, I'm excited to be here at QCon North America. Today we're going to tell you about one of the CNCF SIGs, SIG Runtime. We hope you learn about the SIG, and maybe you get excited about contributing to the SIG or to one of the projects in it.

SIG Runtime is there to enable the successful and widespread execution of all kinds of workloads: latency-sensitive workloads, batch workloads, and even some really specialized workloads tied to specific hardware like GPUs or FPGAs, all of them with cloud-native environments in mind. We have two CNCF TOC liaisons, currently three chairs, and a tech lead. Our meetings are on the first and third Thursday of every month at 8 a.m. Pacific time. Our communication channels are the mailing list and a Slack channel.

So, what is it? SIG Runtime is at the forefront of the cloud-native space. We want to reach out to new and exciting projects within the SIG Runtime scope, help increase contributions to those projects and to the CNCF, and support existing projects, as well as help people navigate the CNCF universe. There are many projects in the CNCF, and it can be quite confusing which projects are needed for what, or what the capabilities of each one are. We also interact with other relevant SIGs: SIGs in different technology spaces like SIG Observability or SIG App Delivery, and SIGs like SIG Contributor Strategy, which helps projects set up a framework for how to grow, become more popular, and get more contributions. Another one of our goals is to educate end users and the community on the projects, the technologies, the other SIGs, and the CNCF itself.

In the scope of the SIG, we have several projects, some at different stages in the CNCF, and some not in the CNCF at all. Here are some examples: containerd and CRI-O for runtimes, K3s and Kubernetes for workload orchestration, KubeEdge to run workloads at the edge, and so forth. We also have different areas where these projects fit. There is the general workload orchestration area; Kubernetes, K3s, and Volcano are in this area, and they allow you to run workloads in cloud-native environments. Metal3, or metal3.io, is another project in that space; it allows you to provision bare-metal machines. A different area of the SIG's scope is containers, runtimes, and VMs. In this area we have runtimes like CRI-O and containerd, and we also have WebAssembly projects; for example, Wasm3, a WebAssembly runtime, and waSCC, the WebAssembly Secure Capabilities Connector. We also have container image registry projects like Harbor. Another space is operating systems for containers; two examples are Flatcar and Talos, very lightweight operating systems that are meant just for running containers. Then the SIG also has an interest in the MLOps, edge, and AI space. Some projects in that area are Seldon Core, which allows you to serve your machine-learning models, and KubeEdge, which is currently in incubation in the CNCF. And we have something like FogFlow, which is similar to KubeEdge and lets you run workloads at the edge.

We're also interested in workgroups. We currently have one workgroup, and as we get more contributors, we would like to have more. Our current workgroup is the Container Orchestrated Devices workgroup, and we'll talk about it later in the presentation.

So, in the general workload orchestration area, we have a few projects. One of them is Volcano, currently in Sandbox. This project allows you to run Kubernetes-native batch workloads with different kinds of scheduling mechanisms; for example, it provides gang scheduling, and it helps you with TensorFlow training. It has custom resource definitions in Kubernetes that let you substitute the standard Kubernetes Job mechanism with its own Job mechanism, which is richer and has more capabilities. You can also use this project to run batch processing on specialized hardware like GPUs or FPGAs, and it helps you handle errors for these batch workloads.
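To give a feel for Volcano's job mechanism, here is a hedged sketch of a Volcano Job that gang-schedules a small TensorFlow-style training job. The job name, task sizes, and image are hypothetical, and field details may vary across Volcano releases:

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tf-training                 # hypothetical job name
spec:
  schedulerName: volcano
  minAvailable: 3                   # gang scheduling: start only when all 3 pods can run
  policies:
    - event: PodEvicted             # example of Volcano's error handling for batch jobs
      action: RestartJob
  tasks:
    - name: ps                      # parameter-server task
      replicas: 1
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: ps
              image: example.com/tf-train:latest   # hypothetical image
    - name: worker                  # worker tasks
      replicas: 2
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: worker
              image: example.com/tf-train:latest
```

The key difference from a standard Kubernetes Job is that the whole group of pods is scheduled as one unit, which is what gang scheduling means in practice.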
Another interesting project is KEDA, which stands for Kubernetes Event-Driven Autoscaling. This project allows you to autoscale your pods, containers, and resources in your Kubernetes cluster based on events. These events can be almost anything: maybe a cloud-provider service like an AWS queue, or a Kafka topic that triggers events, or a database trigger when some data gets stored or some message gets passed through the database. Multiple event sources can be consumed, and you can then autoscale your Kubernetes pods based on those events. Typically it helps scale function pods, so it works well in the serverless space. It is vendor-agnostic and supports multiple cloud providers through plugins (a configuration sketch follows below, after the other projects in this area).

Another project currently in Sandbox in the CNCF is Metal3 (metal3.io). Essentially, this project allows you to provision bare-metal machines, whether in a cloud provider like Amazon, where you may have bare-metal instances, or in your own data center or colo, where you have your own machines. It uses Kubernetes itself to provision these machines: using Cluster API, another Kubernetes project, it provisions all these nodes as Kubernetes nodes running on bare metal, setting up components like the kubelet and everything else you need to run on a Kubernetes node.

Another project that presented in the SIG is the Node Resource Interface (NRI). This project gives you a standard interface for managing resources on Kubernetes nodes. And what are these resources? At the container level, we're talking about CPU slices, memory slices, or even devices on those nodes. It builds on what the Container Network Interface does: CNI has a plugin mechanism, and NRI uses the same idea of plugins for the different resources on each node. It's a containerd sub-project, and it's still in the works, with a lot going on; it was started by folks currently working at Apple. It's early, but it will soon see more development.
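Returning to KEDA: to make the event-driven model concrete, here is a hedged sketch of a KEDA ScaledObject that scales a deployment based on Kafka consumer lag. The deployment name, topic, and thresholds are hypothetical, and the field names follow KEDA's v2-style API; earlier versions used slightly different ones:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler      # hypothetical name
spec:
  scaleTargetRef:
    name: order-processor           # the Deployment to scale (hypothetical)
  minReplicaCount: 0                # scale to zero when no events arrive
  maxReplicaCount: 20
  triggers:
    - type: kafka                   # one of many event-source plugins
      metadata:
        bootstrapServers: kafka.svc.cluster.local:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "50"          # scale out when consumer lag exceeds this
```

Swapping the trigger for an AWS SQS queue or a database-backed source is a matter of changing the trigger type and metadata, which is what makes it vendor-agnostic.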
So, in the runtimes, VMs, and containers space, there are some other projects.

You have Harbor, which graduated a few months ago. It's a container image registry, and not just a regular single-process container registry: it's a fault-tolerant, multi-node container registry. It lets you store your metadata in high-availability systems, gives you a caching layer with Redis, and lets you store images across multiple nodes using persistent volumes, so you can put your container registry behind a load balancer. A full-scale container registry, now graduated; a very mature, interesting project.

Another interesting project that presented in our meeting is Lupine. This is an operating system, or a kernel, a mini-kernel I would say, that sits between what a unikernel is and what a regular Linux kernel is. The idea behind unikernels is that you package everything together with your application: your system calls, your kernel, everything together, and you just run it. That can be very costly in terms of tooling, and the tooling is often just not available out there. This project is a different take on that: it strips down a regular Linux kernel and makes it really lightweight, so when you run it, it doesn't take a lot of time to boot up. At the same time, it's compatible with the most popular types of workloads, like databases and language runtimes, so you can run most of your workloads without prepackaging everything, while staying very lightweight. As opposed to a regular kernel, which can just run everything, you will have some limitations here; but again, it's addressing the more popular applications you may run. It's targeted more toward serverless workloads, where you may want to instantiate a VM with a very lightweight kernel just to run a function. Cloud providers have some of this with something like Firecracker, but they don't necessarily have something like Lupine, a very lightweight kernel that lets you run, for example, an AWS Lambda function very quickly, and not just one, but many of them at the same time, instantiating them and then tearing them down.

Another interesting project that presented is waSCC, the WebAssembly Secure Capabilities Connector. There's been a lot of buzz about WebAssembly and how it's going to be used in cloud-native environments in the future. The reason is that WebAssembly is a very portable format: it can run on the web, in browsers, but it can also run on servers, provided there's a runtime. It's a binary format, kind of like a Java binary, that is interchangeable between what you run on the web and what you run on regular systems. This project allows you to connect all the WebAssembly modules you may have, with each module getting very granular capabilities. You may have one module that talks directly to your database, and another module that implements your business logic, and you can couple these WebAssembly modules together with something like waSCC. That's just an example with two WebAssembly modules, but you may want to connect hundreds of different WebAssembly modules in cloud-native environments. The idea is to facilitate the connection between these modules and let you run a full-blown cloud-native application as a full set of multiple microservices.
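As a rough illustration of how that wiring could look, here is a hypothetical host manifest in the spirit of what the waSCC host loads: two actor modules, the capability providers they can use, and bindings that grant each actor one granular capability. Treat the file layout, field names, and values as illustrative assumptions rather than waSCC's exact format:

```yaml
# Hypothetical waSCC-style host manifest (illustrative only).
actors:
  - business_logic_actor.wasm      # module implementing business logic
  - db_access_actor.wasm           # module that talks to the database
capabilities:
  - path: libwascc_httpsrv.so      # HTTP server capability provider
  - path: libwascc_keyvalue.so     # key-value store capability provider
bindings:
  - actor: business_logic_actor
    capability: wascc:http_server  # this actor may only serve HTTP
    values:
      PORT: "8080"
  - actor: db_access_actor
    capability: wascc:keyvalue     # this actor may only touch the store
    values:
      URL: redis://localhost:6379
```

The point of the design is that each module only gets the capabilities explicitly bound to it, which is what "secure capabilities connector" refers to.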
The other scope the SIG has had presentations for is the operating systems for containers space. Talos is one project that presented. It's a modern operating system that is very lightweight and is meant just to run containers. It's minimal and immutable, and it only lets you interact with it through regular APIs, rather than by making an SSH connection to the operating system. The philosophy of this external, API-only way of interacting is that you get more security and more control over what can happen inside that operating system. You can do things like mTLS, mutual TLS authentication, and you reduce unknown factors with immutable infrastructure. The idea is to simplify your infrastructure for running containers.

Another project that presented is Flatcar. Flatcar is an evolution of CoreOS: CoreOS just reached end of life, and this is a continuation of it with more capabilities. They've actually gotten a lot of adoption precisely because CoreOS reached end of life. They're not necessarily an API-based operating system, but the idea is the same as CoreOS: a lightweight operating system just for running containers, where you reduce your attack surface by having fewer components, just the ones necessary for running your container images. It keeps getting more popular and keeps growing.

The other space where the SIG has had presentations is the edge, machine learning or MLOps, and AI type of workloads. Here we have a project that is actually in incubation, called KubeEdge. KubeEdge allows you to run workloads at the edge while managing them from a centralized location using Kubernetes; everything runs on top of Kubernetes. You have the cloudcore component at the central Kubernetes location, and the edgecore component running at the edge, where a lightweight kubelet equivalent runs and where you actually run your workloads. That edge component also talks to all the different devices that may be gathering information at the edge: transmitters, sensors, cameras, Bluetooth devices, anything that may be collecting data out there.

Seldon Core, another project, is not in the CNCF per se, but they gave a presentation in our meeting. It allows you to serve machine-learning models using either a REST API or a gRPC API. You build your model with some kind of tool, either their own tooling or something like Kubeflow, and once you've built that model, you need to serve it. Serving means creating an inference graph and running inference on the fly. It supports very simple inference graphs, but it also supports very complex ones if you have really complex machine-learning modules. An interesting project.
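To illustrate what serving a model through Seldon Core looks like, here is a hedged sketch of a SeldonDeployment with the simplest possible inference graph, a single model node. The name, prepackaged server, and model location are placeholders, and fields may vary by version:

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris-classifier             # hypothetical name
spec:
  predictors:
    - name: default
      replicas: 1
      graph:                        # single-node inference graph
        name: classifier
        implementation: SKLEARN_SERVER                 # prepackaged model server
        modelUri: gs://example-bucket/sklearn/iris     # placeholder model location
```

Once applied, Seldon exposes the model behind REST and gRPC endpoints, matching the talk's description; a complex graph would simply add children to the graph node (transformers, routers, combiners).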
We also have some other projects coming up that have presented or that we've reached out to. We have Wasm3, a WebAssembly runtime. We have Crow, a container image registry. We have Kubeflow, for running the whole pipeline of machine-learning workloads. Beringnet lets you run Raspberry Pi workloads at the edge using AI. We've also reached out to Cloud Kernels, a team working on different cloud kernels, the minimal-kernel type of projects. And there's Krustlet, another project, from Microsoft, that helps you run WebAssembly modules on Kubernetes: it substitutes the kubelet with one that runs WebAssembly modules, which can then talk to Kubernetes and be managed by Kubernetes. So, lots of interesting stuff, and I think that's about it for the SIG itself. Now I'll hand it off to Renaud, who will talk about the container orchestrated devices workgroup.

Thank you, Ricardo. Let's talk about the Container Orchestrated Devices workgroup, or COD workgroup. We are a small group of device vendors, container runtime maintainers and contributors, as well as SIG members. The problem we're trying to solve relates to the fact that we've seen exponential usage of devices in the past five years. From AI, machine learning, and deep learning to network data plane or encryption/decryption acceleration, devices such as FPGAs, GPUs, NICs, or even ASICs have become quite ubiquitous in the data center. The charter of the workgroup is really to enable device support across the cloud-native space. What that means is that we're trying to enable these new workloads, and we're trying to make it so that you, as a user or cluster administrator, have a production-grade experience. Being users as well as vendors and cluster administrators ourselves, we have a feel for what the problems are, and we've set up a roadmap that is layered so that you can build on top of each brick.

The first problem we're trying to solve is the ability to expose a device to a container runtime. The reason we're trying to solve that problem first is that the space is very fragmented. Kubernetes has a concept of device plugins, while Nomad has its own, very different, device plugin concept. Docker has its own plugin mechanism, Podman has a concept of hooks, and LXC has yet another concept of hooks. All these differences across the space make it very difficult for vendors to provide a uniform experience for users across these different projects, and very difficult to provide the same features across these projects, meaning that some projects have different capabilities than others.

The next problem we want to be able to solve: when you offload a compute task to a device, one of the main reasons you do it is speed. Choosing the right CPU, the right memory, the right NIC, the right device is very important. If you pair the wrong CPU with the wrong device, it can sometimes destroy your performance and nullify the benefit of offloading the compute to another device in the first place.

The last problem: when you're in a data center and you have, for example, tasks that need to be spread across multiple nodes and need to talk to each other, most of the time you want those nodes to be close to each other, for example on the same rack. Figuring out the right knobs, the right policies, the right extension points is very important. It's a hard problem, but it's a very exciting problem.

And so the first problem we tackled, as mentioned on our roadmap, is the ability to expose a device to a container, and our answer to that is CDI, the Container Device Interface. It's a unified plugin architecture for runtimes, based on CNI, the Container Network Interface. It basically tells the runtime which devices are available on the node, and which operations the runtime needs to perform to expose a device. Here on the slide, we have one device vendor, or one device kind, vendor.com/device, and it exposes one single device, mydevice; the operation that must be performed here is to expose a device node under /dev. From a user perspective, what this really means is that instead of typing something like "mycontainer run --device /dev/mycard --device /dev/mycard-render", you'll be typing just "docker run --device mydevice my-image my-command", with Docker or whatever your container runtime is.
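To make the slide's example concrete, here is a hedged sketch of what such a CDI spec file can look like. The field names follow the published CDI schema (cdiVersion, kind, devices, containerEdits), written here as YAML even though CDI specs are commonly JSON files; the draft discussed in the talk may have differed, and the device paths are the slide's hypothetical ones:

```yaml
# Hypothetical CDI spec for the slide's example device.
cdiVersion: "0.3.0"
kind: vendor.com/device             # one device vendor / device kind
devices:
  - name: mydevice                  # one logical device the user can request
    containerEdits:
      deviceNodes:                  # operations the runtime performs:
        - path: /dev/mycard         # expose these device nodes under /dev
        - path: /dev/mycard-render
```

The runtime reads specs like this off the node, so the user asks for one logical device name and the runtime takes care of every node, mount, or environment edit the vendor declared.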
From a roadmap perspective, we've created a roadmap to give you a feel for where we are and where we're going. We are still in stage zero: we are building proofs of concept (we have two POCs, in Podman and in containerd), we are still writing the formal spec, both in Go and in prose, and we are talking about the idea with different people. What we're hoping to achieve, probably and hopefully by the end of the year, is integration into runtimes, maybe as an alpha feature, the Kubernetes KEP, and conversations about CDI and how it can enable other powerful ideas such as networking or storage. The direction we're going is integration with many runtimes, and having multiple plugins built on top of us. And we've decided to plant a 1.0 flag: CDI will be stable, 1.0 as an API, when two runtimes are using it, when there are more than three plugins, and when we are vendored into Kubernetes, giving users a feeling that this is a feature that is actually heading toward adoption.

So that's CDI. But CDI isn't everything: the COD workgroup still has a ton of big, exciting ideas that haven't been thought through completely, and we'd be super happy to have people contribute, just to have a different opinion and a different view of how these ideas should be addressed. We'd also be super happy if you want to give us some feedback on what we've built, and if you think there's an intersection with your idea, we'd be happy to hear from you. If you want to integrate with or build on top of CDI or our ideas, feel free to just drop by.

And that's it; that was the SIG Runtime presentation. Go to cncf.io to see SIG Runtime's description. You can join the Slack, and you can also look at the GitHub repo, which has a lot of information. We have bi-weekly meetings on Thursdays; feel free to join the SIG Runtime bi-weekly meetings, join our community, and give us some feedback. Thanks a lot.