Hi, good morning. Good morning everyone. I think let's wait maybe a minute or two. Sounds good. Do we want to start with a little bit of a welcome slash introduction? Yes, we want to do that. Let me post the link in the chat for the notes and we can do stand-up, right? So yeah, please add yourself to the attendance. We can start. So who wants to go first? Do we want Virtual Kubelet to present? Yeah, that's great.

Okay, I'll go ahead and start then. Hi guys, my name is Ria Bhatia. I'm part of Microsoft and I've been part of Virtual Kubelet since the inception of the project. So yeah, we're happy to be presenting to you guys. It's also nice because I know the TOC has so many things to do, so it's cool that now there's a thing called SIG Runtime and we can filter our discussions through you guys, because we really want to see this project move forward. It's been in sandbox since November 2018, so we're ready to move it forward and see it grow into incubation. I have a little PowerPoint that I'm going to present. Let me go ahead and share my screen, if it works.

Yeah, I think first we want to ask everybody else to just say hi. Yeah, sorry, that's me. So, Ricardo, I'm a co-chair, so I don't have any updates. We have a roadmap to talk about a little bit later, after your presentation. And I think, Quinton, you want to say anything? Yeah, hi, am I unmuted? Yes, I am. I'm Quinton, I'm also a co-chair here, and I've been involved in the TOC, or rather the CNCF, pretty much since the beginning, and was on the TOC for a while. Just here to help where I can. I might actually have cosponsored Virtual Kubelet back in the day, if I remember correctly. I think so, you might still be listed; Eric also cosponsored. Jeff? Yep, Jeff Hahn from Microsoft. I'm one of the K2 maintainers. Just dialing in to listen and see what's new. No agenda items for me.
Okay, great. Eric, Eric Cardi? Okay, sorry, can you hear me now? Yeah. So my name is Eric Cardi. I'm somebody who's been, I guess, lurking around the Kubernetes ecosystem for a bit. I saw this new SIG and was interested in container runtimes, following some of the other projects as well, from Kata to Nabla. Sorry about my kids in the background. But yeah, just excited to learn more and see what the roadmap is for the SIG.

Great. And Brendan? Yeah, hi, my name is Brendan. I work at a company called Elotl. We're currently using Virtual Kubelet in one of our products and have been working in the conformance space with Ria, seeing how conformant we can get our provider. So that's pretty much it.

Great. And Vilmos? Yeah. Hi. I'm also with Elotl. I work with Brendan on the same project, so right now we are working on a Virtual Kubelet provider.

Awesome. And Philippe? Philippe? Oh, sorry, you cannot hear me, I was on mute. Hi, so, Philippe from Arm. I'm also joining because I'm interested in tracking the activity and roadmap here, coming from the Arm side. I'm also interested in multi-architecture support generally, for all the projects involved here.

Perfect. Klaus, you want to say anything? Any updates? Hello, everyone. This is Klaus. I work on Volcano; so far so good. Yeah, so you got a sponsor for Volcano, thank you. Yes, we got one sponsor for Volcano right now and we are trying to get the other two sponsors.

Yeah, great. And Diane? Hi, I'm Diane. I work for Red Hat in the CTO office, primarily on AI and ML, and I volunteered to be a co-chair. So I'm mostly interested in runtimes that involve AI and ML, but I'm here to just help out. Great. I can never spell your last name correctly, so sometimes I do two D's and sometimes I do two M's. Yeah, it's a funny Dutch name. Okay, so anybody else?
I think that we got everybody. Okay, cool. So I think the next item on the agenda is just the Virtual Kubelet presentation. Okay, yeah, here we go. All right, let me present my screen. Can you see my screen? Yeah. Okay, perfect.

So hi guys, I'm a Virtual Kubelet core maintainer. I have a lightweight PowerPoint; a lot of our links are in the PR that we have in the CNCF projects repo. But I just want to start the discussion and see what we need to do to get to incubation status. That's the goal for us today.

Okay, so, an introduction to Virtual Kubelet for folks that don't really know about it. It was created in December 2017, and we moved to the CNCF as a sandbox project in November 2018. The way that Virtual Kubelet works is we basically decided to create a kubelet that doesn't actually need a node to survive. The main point was that we wanted an abstraction that didn't rely on a node, so we could get closer and closer to packaging up workloads within a pod, rather than thinking about pods and then how you fit them within nodes. We had some customers within Azure that were particularly interested in that, because we had a product called Azure Container Instances, but they were also interested in how to scale, how to load balance, and how to do a bunch of the other things that Kubernetes already did. So instead of reinventing the wheel, we decided to make a simpler version of what the kubelet could be. It gave us a lot more flexibility in Kubernetes and allowed us to really decide what it means when you get a pod in Kubernetes, when you get the pod status. There were a couple of things we didn't really need, like capacity for a node, but we did need capacity, in a sense, for the workload, because that's how you understood how much the pod really spun out to. So that's why we decided to create it.
After that, we were also working with a company called Hyper.sh. That company no longer exists, but a lot of those folks are still around, whether at Alibaba or other places or Microsoft, and some of them are still working on the project. Past that, we got more people involved, so there are a lot more providers today. When we first released, we really just created this Virtual Kubelet interface, and then we created the concept of a provider. So any person or company or organization that felt like they could create a pod, get a pod, delete it, possibly update it, could be part of Virtual Kubelet. And it might only be a few lines of code. That was pretty powerful for people getting their abstractions into Virtual Kubelet. Around the time we released 1.0, we also split our providers out from the core Virtual Kubelet project. That's important because now providers can live on their own while Virtual Kubelet continues to improve on its own; before, they were stuck together. When we released, we released everything, so every provider went out at once. We've changed that as of 1.0, which is really important, because now we don't have to keep track of providers, and they're allowed to do what they need to on their own. But we created a governance policy for providers so we can support them within the project; it's more like us supporting them as an organization rather than through a release process.

So, some stats. We have five core maintainers for Virtual Kubelet, plus one core maintainer who has, I guess, graduated or not; he's there, but he's not part of the core maintainer stats anymore. We have folks from Netflix, Microsoft, VMware, and someone from another company that doesn't want to be named. That's cool. And then we have 10 project maintainers.
This is that provider situation I was talking about before. Every project or provider has their own repo within the virtual-kubelet org, and we ask that one to two people from that provider list out their names and give us their emails, and they're the point of contact for it. So we call them project maintainers. We've done 33 releases; we're currently on v1.1.1, which released on December 2. We have monthly office hours. It used to be weekly, then bi-weekly, but now that the Virtual Kubelet interface itself has stabilized, we don't see a lot of new feature work; it's more maintenance work and working with the community through GitHub. We decided monthly office hours worked better, especially since we have Slack and GitHub, and this project has been going on for like three years now. And then there are eight official providers in the repo, and five unofficial providers that we know about. These are things like the IoT Edge provider, Elotl's provider, Netflix's provider, VMware's provider, and then I think there was one more which I'm forgetting, but these are just the ones we know about. People have the ability to go in, if they want, and create their own provider. A lot of people are just using this privately in their own companies, and it's sometimes difficult for us to know about it; even if they create an issue, we don't exactly know what company they're coming from. But this is what we know so far, and we're always open to finding out more.

So, progress since our sandbox entry: we released 1.0 about, I think, not quite a year ago now, around the new year. We've gained momentum and contributors across the industry. Netflix is a pretty huge one; we worked really closely with them to get to 1.0.
Because there were a lot of things they needed for their project to move forward. They're actually using it for Titus. Netflix built Virtual Kubelet on top of Mesos: they have an entire Mesos infrastructure, but they wanted their developers, and basically their entire company's notion of compute, to move with Kubernetes. So to get that API, they used Virtual Kubelet, and anytime they need a Virtual Kubelet node, it's actually spinning out to their Mesos underlay. It's really interesting, the way they created that flexibility. It's not open sourced today, but there's a really good talk we did at the last KubeCon around what Netflix is doing with Titus. So if you want to learn more and understand what they're doing, they went really in depth in the Virtual Kubelet intro session at the last KubeCon; go ahead and check that out. So: we split out the providers, restructured the tree, we're now in maintenance mode, and we've done a lot of talks, with HashiCorp, VMware, Netflix, any partner that we've had, and any maintainer really, which has been awesome.

Okay, so I really want to set the stage to focus on providers, because I think that's the greatest point of tension for us, at least: every provider has the flexibility to do what they want with the interface that we provide in Virtual Kubelet core. And that's why things like conformance, and even understanding what the deployment would look like for customers, matter. We have an understanding within Azure, for example, of what the customer experience is, and we hold that to a high standard; everybody else has the ability, and the flexibility, to figure out what that experience looks like.
That's why it's hard when we talk about the Virtual Kubelet project in general: we've enabled a couple of things, and it's up to the providers to go further, like enabling networking, volume support, and things like that. So this is our definition for them: to provide the back-end plumbing necessary to support the lifecycle management of pods, containers, and supporting resources in the context of Kubernetes. They must conform to the current API provided by Virtual Kubelet. And they must write their provider in a way that doesn't have direct access to the API server; there's a well-defined callback mechanism for getting data back, like secrets and config maps. So we put a lot of responsibility on the provider this way. We create the methods for them — there's a place where they create a pod, update a pod, get pod status — and we help reconcile all of that. But they're the ones that have to figure out how to spin out to whatever abstraction they're targeting, whether it's ACI, whether it's Mesos, whether it's, I don't know, Pizza Hut. You can really create a pod and make that pod do whatever you want; it could turn on your lights, it could really do anything. So they must figure out what it means to create a pod, update a pod, get pod status, get pods, get node conditions — it gets a little wonky there — the operating system, whether it's Linux, Windows, et cetera. These are things that Kubernetes exposes, and we allow providers to expose the same things within Virtual Kubelet: node daemon endpoints, environment variables, config maps. And there are things no one's done yet, like node daemon endpoints with DNS names, and more. For example, in ACI, we created our own notion of networking.
It doesn't go through Kubernetes networking — I mean, it works through Kubernetes, but in our back end, the way we do it is completely different from what you'd expect from a node in a cluster. For our customers, though, it looks basically seamless, which was our goal within Azure. These are the kinds of things every provider has to go through to build out that experience. This is what the interface looks like — super simple. This is what we hand off to providers.

And then, adopters that are in production usage. These are the ones we just know about, and there are tenfold more, just because a lot of these are cloud providers, right? We have a lot of customers we're tracking internally that we can't exactly talk about, but they're there. And Amazon also has one — a Virtual Kubelet provider that's a kind of working experimental project — and I'm sure there are some people using it. So: Netflix, Alibaba, Azure, VMware, and Elotl, and I put Amazon in; I don't know if their stuff is done yet, but they're working on it. And then there are so many more that are not public, just because our relationship is with providers, and it's a two-degree process to get to end users specifically; that's the way we've seen Virtual Kubelet grow. And then I'm going to hand it off to Brendan to talk through conformance testing.

Yeah. So, we're Elotl, like I said before. When we presented our provider at conferences, we got a lot of questions of, well, how conformant is it with Kubernetes? So we started looking into this. At the beginning I thought, well, since Virtual Kubelet is essentially a library that we use to build providers, conformance is something that is very much provider-specific. So these numbers really only apply to our provider.
Azure's ACI provider's conformance might be different, because they implement things on the back end very differently. But what we wanted to show is that you can create a provider that is close to conformant, and we're hoping to create a provider that gets as close as possible to 100% conformance. We've been running the conformance tests using Sonobuoy, the official conformance framework, and you can see from these numbers that we're doing pretty well: we're about 73, 74% conformant. We can use the conformance suites to identify areas where we need to apply engineering effort and either implement that in our provider or push it up into Virtual Kubelet, which we're looking forward to doing. One of the places where our provider is weak, for example, is support for the downward API; there are a lot of downward API end-to-end tests. We're hoping to get those changes pushed back up into Virtual Kubelet for anyone to use, and help out the other providers, essentially, in supporting that functionality. So yeah, Virtual Kubelet made it really easy to actually get our product out. And right now, just as a note, we are not open source as of today; we hope that changes literally tomorrow, so we're looking forward to that. Like I said, conformance is going up every week or so for us, and we're able to use the conformance tests to really push those numbers up and ensure that in most areas your pods will work exactly like they work on a regular kubelet. I think that touches on everything. I'll hand it back to Ria.

Cool, thank you. And we're getting towards the end here. So here's an example of what people have to add to their pod deployments, or their deployments in general, to spin out to Virtual Kubelet, at least for Azure. So we use tolerations and taints.
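For illustration — this snippet is not from the presentation — a toleration of the shape Ria describes looks roughly like the following. The taint key shown, `virtual-kubelet.io/provider`, is Virtual Kubelet's default node taint key, but the exact key and effect are provider- and cluster-specific:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: burst-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: burst-app
  template:
    metadata:
      labels:
        app: burst-app
    spec:
      containers:
      - name: web
        image: nginx
      # Without this toleration the scheduler never places pods on the
      # virtual node, because Virtual Kubelet taints its node by default.
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Exists
        effect: NoSchedule
```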
If you have the right key, you're able to spin out to Virtual Kubelet; we don't automatically spin anybody out to Virtual Kubelet. So when you create a deployment, whether you're on AKS or aks-engine or something else, and you have Virtual Kubelet, we will never automatically spin you out, just because Virtual Kubelet is different. If you haven't tailored your deployment for what works — like, you don't have daemon sets, you don't have maybe specific metrics and things like that — we're never going to do that. It's always a user choice, and that's very explicit. That's the recommendation we give to everybody: it's user-specific, and they need to understand what they're doing here. And then, if people want to fill up their nodes first, and they do have a deployment that wants to spin out to Virtual Kubelet eventually, for a burst scenario, that's when they can go in and use node affinities; basically, they're able to spill over to Virtual Kubelet if the other nodes fill up. That's the way we end up doing it.

So yeah, that's basically it from my side. We didn't go too in-depth on use cases, and that's because we've probably talked about use cases for, like, two to three years now. I'm happy to go through use cases in this forum again, but a lot of it boils down to burst use cases, flexibility, and billing easily by the resources that customers use rather than by a big bucket of resources. And for providers, it's a lot about that flexibility: a lot of people are using Virtual Kubelet to mesh their current infrastructure with Kubernetes, because they have all this old infrastructure and they want to continue to use it. It works well, but they want their organization to move forward in terms of how they develop and deploy workloads in that life cycle, and they see the industry moving towards Kubernetes.
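For illustration — again, not from the presentation — the spillover pattern just described (prefer regular nodes, fall back to the virtual node for bursts) can be expressed with preferred node affinity. The `type: virtual-kubelet` label here is an assumption for the example; the actual label depends on how the virtual node registers itself:

```yaml
affinity:
  nodeAffinity:
    # "Preferred during scheduling, ignored during execution": the scheduler
    # tries real nodes first, but can still place pods on the virtual node
    # once the preferred nodes are full.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
        - key: type
          operator: NotIn
          values:
          - virtual-kubelet
```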
And so Virtual Kubelet gave them a nice middle ground where they don't have to completely overthrow everything, but they're still able to use Kubernetes with the stuff that they like, that they've built. We've seen a lot of that too. And with that, I'll hand it back over to y'all, SIG Runtime.

Great. So, any questions? I have a question, because this is the first time I've really seen this; I need to go watch those videos from KubeCon that were recommended. For node affinity with this, is that mostly for debugging, so people can see what node they actually ran on if there's a problem? Or how is it used?

It's not specifically for a problem. Wow, it's been so long since I've deployed node affinities. The way I remember using it was for when I wanted to spill over to Virtual Kubelet. So basically we said: only if the other nodes reach capacity are you allowed to spill over to the Virtual Kubelet node. That's the way we used it; I'm sure there are other ways folks can use it.

Okay, so node affinity doesn't mean — my idea of node affinity is, I say I need this to run on this particular physical hardware. Yeah, we did preferred during scheduling, but ignored during execution. That is usually what it's used for, but we have a different tag underneath that, like: don't do it unless there's this exception. Okay. Yeah, it's weird. It's a weird thing to get your head wrapped around, but it's kind of the way we've been able to bend Kubernetes for us. But this is just for spillover situations.

Okay, so if you do want to know which hardware you actually ran on, is there a way of finding that out? So, this is up to the provider. Virtual Kubelet will expose the node, so we will expose the CPU; we will expose some sort of node capacity, but usually it's ridiculous.
Usually it's like three terabytes, and X whatever for CPU. So it's not very realistic, but there is a place where we've been able to bend Kubernetes again: where you talk about the OS, that's where you can put Linux or Windows, et cetera. You could expose the actual capacity through Kubernetes if you wanted to, but usually on the back end we've seen a lot of providers have larger pools of resources. If you wanted to specifically tie your back end to your front end, you could do that; Kubernetes already lets you, and Virtual Kubelet would too. But it's up to the provider: a lot of providers have multiple machines behind that one Virtual Kubelet node, so they make it more vague. They have to implement that if they want it. Yeah. Thank you. No problem. Thanks.

I have a question. You mentioned that people migrating to Kubernetes are using this to replicate what it would look like running Kubernetes, right? But are there any other use cases for people who have already decided to move to Kubernetes and already have all of their infrastructure in Kubernetes, going forward?

Yeah. So for folks that already have all their infrastructure — and we can talk about a smaller use case too — when we think about larger enterprises, a lot of folks have their own compute notion within their team. These are the folks, and we've seen this all over, where people are building out teams specifically just for Kubernetes. They're building out their infrastructure ops teams or whatever, and those are the folks that touch Kubernetes. They're looking at lots of ways to abstract Kubernetes away for their users, their users being their internal developers and app operators, et cetera.
And using Virtual Kubelet as an abstraction reduces the overhead of trying to understand all the intricacies of their nodes, how they're scheduling out, and fixing node errors, when you could just abstract it with Virtual Kubelet. We've seen this even at Alibaba: what they do — I don't want to talk about the specific version — is basically give out one Virtual Kubelet per team. So instead of each team holding X amount of nodes where, if nothing's running, nothing's running, they don't allocate X amount of capacity per team. Instead, they give out a Virtual Kubelet and say: spin out to Virtual Kubelet when you have workloads. And this gets distributed across their entire company, because now they don't have to represent a cluster per team; they're representing a Virtual Kubelet per team, and that cluster could live anywhere and resemble X amount of capacity, but the teams don't care about that. They don't need to care about that. They don't have to think about node errors or cluster errors; that's up to another team to figure out. And if something happens, workloads automatically move over to a different Virtual Kubelet, but behind a Virtual Kubelet there are usually X amount of nodes. So they have the flexibility to do some really cool stuff and get their clusters to crazy amounts of utilization. Great, thanks.

Any other questions? Yeah, I still have kind of a gap in my understanding here. Baked into all sorts of parts of the Kubernetes APIs is the concept of a node. You can, for example, request that containers be on different nodes or the same node, and that has specific meanings; all the load balancing is across pods, but there's routing between these things. And all of the downward API stuff you mentioned is all sort of downward to the node.
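For reference — this snippet is an editor's illustration, not from the call — the "downward API" in question injects pod- and node-scoped fields into containers, for example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: downward-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "env; sleep 3600"]
    env:
    # On a Virtual Kubelet node, spec.nodeName reports the virtual node's
    # name, which is where node-scoped fields get interesting.
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
```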
So I'm curious: this concept of a node is so baked into the Kubernetes API that you can't just wave your hands and make it go away, as far as I can tell. I'm curious how all of that stuff actually works. For example, auto-scaling presumably deals with the concept of nodes — you're running out of nodes and you need to scale them up — so I presume all that stuff doesn't work either.

Yeah. So I wish Brian were here; he's the engineer on the project who would be able to go through all of that with you. I don't know where he is right now, but I'm going to give it my best shot from a user and more PM-centric view of what's going on. What we do is we default towards: if things work in Kubernetes, we want it to look like Kubernetes as much as possible. So when you're thinking about things like auto-scaling and load balancing, as much as possible, at the surface layer they look the same to the user, in the way the user would describe these things in a pod deployment. The way you describe auto-scaling, et cetera, is the same. But — like when you have multiple kinds of nodes in your cluster, that's why we have node affinities, that's why we have taints and tolerations, so folks are able to pick which nodes they want things to run on. With that notion, we do the same thing. We treat Virtual Kubelet as a node from the user's standpoint; what's happening in the background doesn't get exposed to the user at all. So it does look like just a node in Kubernetes. The how — you're asking how we do this — that's a very good question. I didn't write any of that code, and I would love it if anyone — Brendan, if you want to jump in, or if someone else wants to jump in on exactly how — go for it. Yeah.
It's an expansive question, and one we've actually worked through in all of our work on our provider, at least. Just to give you a quick background: with our provider, when a pod gets dispatched to the Virtual Kubelet, our provider spins up a virtual machine in the cloud in the background, and your container actually runs on that virtual machine. So to make the downward API work, for example, once the machine comes up, when dispatching the pod to the machine, we overwrite or create environment variables that have the specs of that machine, so your pod running on that virtual machine —

I just want to interrupt you quickly. I understand all that stuff. What I don't understand is — I mean, there are lots and lots of examples I can give, but here's a basic one. I launch a deployment and I say that I want each of the pods in the deployment on a different node. The Kubernetes scheduler then tries to do that. But actually, there's no way the Kubernetes scheduler can do that, and there's no way in the API that the Virtual Kubelet can do that either. So that very basic function doesn't appear to work.

Yeah, well, we have a taint on Virtual Kubelet, so the scheduler will understand not to schedule that to Virtual Kubelet unless there's that toleration in your pod spec, and then it will see that toleration.

So let's assume what I want as a user is: I have a bunch of capacity to run containers, right? And right now, unbeknownst to me — and I don't really care — it's living in a Virtual Kubelet, so it's living in Azure Container Instances or whatever it's called. And all I want to do is run my pretty standard deployment that says: don't run these pods on the same machine, run them all on different nodes.
And yes, my administrator told me that I have to put in this magic incantation to allow my stuff to run on Virtual Kubelet. But actually, I can't see how that request can actually be fulfilled, because the Kubernetes scheduler can't schedule it onto different nodes. It only knows about one Virtual Kubelet node, so it can't actually schedule them onto different nodes, and it has no way of passing that information through to the Virtual Kubelet to make sure they end up on different nodes. And like I said, there seem to be hundreds of other examples I can conjure up where stuff that basic just can't work.

So that's basic in the mindset of using Kubernetes with multiple nodes; it depends on your cluster. If you wanted to have a multiple-node cluster with a Virtual Kubelet, you could technically do that, if that's what you want for your organization. But the way users will use Virtual Kubelet, and do use it, is very specific, and they have a really good reason for why they want to spin out to Virtual Kubelet. It's usually not because they want to run multiple things on multiple nodes; that notion goes away completely. That's not even a thing you would think about in this world of Virtual Kubelet, that I need to run multiple workloads across X nodes. If you want to be fault tolerant, if you want to make sure you have high availability, et cetera, that's really up to the back end and the provider to provide in another way. And those are still use cases we're working towards. For example, ACI would probably already do that.
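For context — an editor's illustration, not from the call — the "each pod on a different node" request in this exchange is normally written as pod anti-affinity keyed on hostname, which is exactly the construct that collapses when the scheduler sees only a single virtual node:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web
      # One matching pod per distinct value of this label; since every pod
      # on the virtual node shares one hostname, replicas beyond the first
      # remain unschedulable there under the required form of this rule.
      topologyKey: kubernetes.io/hostname
```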
But in this scenario, why would you want to run things on multiple nodes? I guess that's the crux of the question: can we solve that, rather than solving the problem of running the same pod on multiple nodes in Kubernetes? Instead, let's look at the use case. And that's perfectly valid.

So ideally, what one should tell Kubernetes is: run this deployment in HA mode, for example. Yeah, exactly. It would then translate that into: run it on multiple nodes. But even if that is the case, there's no way in Kubernetes to do that today.

So actually, there's no way to do it in Kubernetes, but we could finagle it within Virtual Kubelet, because we're able to take in different parameters and be really wonky — and we haven't done that. We haven't done that; we're just talking about theoretical use. My point is, we would be able to do that, actually. If we did that in ACI or Elotl's provider, I think we could use an environment variable or something like that to express that you want a high-availability set for this — you want it to be, I don't know, times three, to make sure it goes on three different machines. We would do that in the back end, and in the front end — we could also finagle this — you could see three pods running, and maybe each one is tied to a different machine. There's a lot we could do in this world. It is a weird place, half serverless and half not, but people are working through it. And basically, my standpoint, just being a core maintainer:
I care about all the providers getting towards the goals they have for their end users, and that's why virtual kubelet is so flexible and so simple at its core: to provide for any of these kinds of use cases. Every time we give a talk at KubeCon we get a million questions, not really like yours, but more like: what if I wanted to put a virtual kubelet on a satellite? What if I wanted to use Kubernetes as my control plane for all these different kinds of machines across my home or my business? The creativity of people just gets up-leveled with this project, and I think that's something really amazing. And yeah, it doesn't exactly work like Kubernetes, but that's why we're not going towards Kubernetes; this is within the CNCF, we're not...

Hold on, sorry, let me interrupt you quickly. Do I have the fundamental premise correct? Correct me if I'm wrong; I'm just trying to clarify my confusion here. The fundamental premise is that you have some other container orchestrator, be it Mesos or Azure Container Service or anything else, and users want a Kubernetes interface to it: they want the functionality of Kubernetes and the API. That seems to be a fundamental premise of this project, and you're saying we're not moving towards Kubernetes. We've also identified that quite a large swath of the Kubernetes API does not actually work, and that you need to do things differently than they're done in Kubernetes in order to achieve them when you're running on a virtual kubelet. For example, HA: the mechanisms Kubernetes provides for HA, like pod anti-affinity, saying I don't want my containers on the same node because I don't want them to fail at the same time, don't work. So I guess: what are we left with? If we set out to create a Kubernetes API on top of a different container orchestrator, and the Kubernetes API
substantially doesn't work, what do we have?

That's a strange way to look at it, though. Virtual kubelet isn't trying to boil the ocean with what it can do. It understands who it is: it's providing burst workloads, it's providing abstraction. Providers can implement those things if they feel their end users need them. Netflix can go ahead and implement whatever they need for the API, and it's working for them today, making sure their workloads are spun out the way they need. Netflix is currently using this: they have virtual kubelet, they're using Kubernetes, they're using the API, because this project gave them the flexibility they needed for their infrastructure. Saying that Kubernetes, and even the entire API, is the end-all for everybody's infrastructure just isn't true; maybe it's not what all end users need. But we're doing our best to make sure that if a customer, a user, or a provider needs a certain functionality, we will work through it and figure it out, on a day-by-day basis. We're not claiming we do everything Kubernetes can do. But the things you're talking about, like node tolerations, we can still do. For example, if you have a special GPU-enabled node, there are some workloads you don't want to run on that node; you're going to save it for the specialized workloads. That's exactly what we're saying virtual kubelet is. Think of it as a different kind of node, or a different kind of operating system, where it's only for special things, and you're not going to schedule Linux workloads on it if it's a Windows node. Think of it in that frame, and I think things become a little clearer: you can still use all of those notions that we have in Kubernetes. And yeah, the downward API doesn't work, but if we feel a need to do it, and
we've only seen a need so far within conformance, we will go and implement it. But I guess what I'm asking is: are we debating the core premise of virtual kubelet, or are we debating what we need to get virtual kubelet towards incubation? Because we're already in the CNCF. So I'm trying to figure out how we can get to incubation, and whether that means making those specific use cases we hold as virtual kubelet more clear. We leave it super flexible for a provider, so it's really for the providers to go and provide those use cases. But yeah, I'm just trying to figure out what we can do.

So I was the one who sponsored, or one of the ones who sponsored, virtual kubelet into the sandbox initially, and I did that in consultation with Brendan Burns at the time, as I recall. The reason I thought it was a good idea was to explore the concept of how we remove the concept of node from Kubernetes. Virtual kubelet is not the only project that wants to do away with the concept of nodes, because nodes are problematic. And as you pointed out, there are other reasons: because you provide nodes, people say "put my stuff on different nodes" when actually what they mean is "make it highly available," and so on. There are myriad reasons why it would be a good idea to remove the concept of nodes from the Kubernetes API and try to achieve the same aims with different mechanisms. So that was the intention of putting virtual kubelet into the sandbox: to come up with an answer to the question, can we remove the node concept from Kubernetes and still have a useful abstraction? That seemed like a very useful exploration to be done in
the sandbox. But superficially, I'm still not clear what part of Kubernetes does work and what part doesn't. We've got some conformance percentages, and those are useful, but we also know that conformance only covers a relatively small portion of the Kubernetes API. So we really need to understand, as a normal user of Kubernetes, how much of it actually works if I run it on a virtual kubelet, and how much simply doesn't work anymore.

I mean, that's going to be up to each provider, right? Depending on the provider and what functionality they offer, it's very much up to them, and up to how they explain that to their users. A lot of users understand it's slightly different from what they're going to get from a normal Kubernetes node. So my question is, how do we do that? Conformance was the way to figure out what's in Kubernetes; that was our programmatic way to figure it out. How do we go back and answer the question of what works? Because today we're saying, this is the subset that works for each provider, this is our functionality, but we're not going through and saying all of this doesn't work, and we haven't even seen a need for that. It's definitely a question I want to answer; it just seems like it's going to take... Has anyone else done that, that we can go off of? Done what, sorry? Gone through everything in Kubernetes, past conformance, to figure out what does work and what doesn't. I don't know how to express that.

And I think the Windows folks did. So, the comment Brian Grant made in the PR was that the Windows people had similar problems. They wanted to support Windows containers, and Patrick, I think, was the main person there, and so they had to go through the whole of Kubernetes and figure out which parts of it work when I'm using a Windows container and which parts of it
don't. Okay, so would you want us to do that per provider? Say again? Would you want us to do that per provider, then?

I mean, I understand that you can't vouch for any given provider's implementation of the interface, but I think the interface itself fundamentally limits a certain amount of stuff. Presumably you have a kind of reference implementation of a provider at the moment, and there are also certain limitations the interface itself creates. If there's no way to allow the Kubernetes scheduler to do anything other than schedule a pod onto this large group of nodes hidden underneath the virtual kubelet, there's a whole class of things in Kubernetes that doesn't work. It doesn't matter who the provider is; that functionality is not available through the virtual kubelet API. Okay. So I think that's what Brian Grant asked for a long time ago, maybe in October, I don't remember exactly when his comment came in. What he wants us to do is understand what works and what doesn't work, to the same level of detail as the Windows container group did.

That's the thing: we can't do it to the same level of detail unless we do it per provider. Our interface is so simplistic that we could copy and paste the interface and describe all the methods we have, but that wouldn't be super useful for anybody, because you can go to a page, look at our interface, and see what we support. I think understanding one provider, right, you could do it for your reference provider, perhaps the Azure one; I'm not sure which one is the reference, the one that exercises the API the best. Right now we're doing it through Elotal's provider, and we were going to do a write-up on what that meant, in the format that Windows wanted, that
Windows had before. And that's something we can definitely do. Okay, and if that's it, then that makes sense. It's just that doing it per provider doesn't make a lot of sense from the virtual kubelet perspective, because it doesn't make sense for end users: every end user is using a different provider. Going through Elotal's, or just Azure's, will give you some sense that almost everything can kind of work, but it's so specific per provider that it won't give an end user what they're looking for. I understand, yeah.

And that's why I would ask us to look at this from a different perspective: look at virtual kubelet's core and what we offer there, but think of it in a different way than conformance. Can we do that? Because, and I'm going to get back to this, we're not part of the Kubernetes project, and there was a reason for that: we were so different from Kubernetes. So is there a different set of ideals that virtual kubelet can hold as a project, understanding that we're meshing a different infrastructure, whether it's Mesos, ACI, or something else, into Kubernetes? We're bringing the value of Kubernetes to everybody, but the way we do it is slightly different for end users. So I'm just asking if we can, because we can write that documentation; I have no problem doing it, I just don't think it's going to be useful for anybody other than us.

What about doing what Quinton suggested? There are certain things that just couldn't be implemented; what if you just documented what those are? Yeah, and we actually have a lot of that in our repo and in our Google Drive and things like that. I can put that up in a README in virtual kubelet; we've done a lot of it. You can't do daemon sets; again, if a
provider decides to implement them, you can do daemon sets, but the notion will be different from what a Kubernetes user would expect. You can't do private networking in some of the providers; you can in ACI. I'd like to say that most providers do point out their shortcomings in their repos. I regret that we're about 24 hours late in open-sourcing ours, but we can point to exact specifications of what we don't support at this time and what we do support, so users can know that.

Also, I just want to say that virtual kubelet is something that fundamentally redefines what a node is in Kubernetes, and if you're redefining what a node is and how a node behaves, then by the nature of doing that you're breaking down the abstractions. Certain things like node affinity: well, what does that mean? It means something in a regular Kubernetes cluster, but you're doing something specialized here, because you couldn't get it out of Kubernetes originally. So our whole point is: yes, you can say node affinity doesn't work, but why would you even want to mess with node affinity in this type of system? You're building a specialized system using Kubernetes, and this is a building block that allows you to do that. If there are other examples, I'd love to hear about them, and we can talk more about that.

It seems like node affinity would be useful for reproducibility: if you had some system you wanted to troubleshoot, you would want to reproduce what happened, and you would want to know which node it actually ran on. Yeah, but much like a container, a lot of the backend implementations are ephemeral, so as for reproducing that: you can run the same thing again, but you're never going to have things that persist after your container or pod goes away. It's a different way of looking at it, much like in serverless, for
example; I think there are a lot of parallels there. You might run a Lambda function in Amazon, and you'll probably be able to run it again and reproduce something, but you can't get onto whatever backing instance that is and look around for what went wrong. Does that make sense? Somewhat, yeah. I mean, ultimately, though, if something's broken, you're going to have to try to reproduce it. Yeah, in our two and a half years of running a similar system, I would say you run the pod again and troubleshoot it that way. Maybe you don't let it end; maybe you stop it before the failure, or after the failure, and inspect things. Reproducibility comes out a little bit different. Fundamentally, a lot of what we're building is serverless systems, whatever that might mean to you, and so a lot of the old guarantees, where we own the infrastructure and can SSH into nodes, just go away. Fundamentally, that's a good thing, but it's a new way of working.

Yeah, I think this thing up on the screen in front of us summarizes my question, and then I'll leave it at that. As a user, and I'm trying to put myself in the position of someone considering using this thing, either to build a backend or to use somebody else's backend behind this API, my fundamental reason for coming here is that I want to use the Kubernetes API against a non-Kubernetes container orchestrator. That's the fundamental premise. And my first question is going to be: of the functionality and the API in Kubernetes, which parts are going to work correctly, and which parts should I either not use, or relearn how to use because they work differently? And if I look at that page that was up on the screen a moment ago, it
doesn't cover any of that. It covers a bunch of ACI features that are or are not accessible through this, if I understood correctly. So that's the fundamental question, and I'm not suggesting that the answer has to be that 100% of Kubernetes works, but I need to be able to find out what works and what doesn't, and I don't seem to have a way of doing that yet. The conformance tests go some way towards that, but, and this is not the fault of this project, it's the fault of the conformance tests, their coverage is so low that they don't actually give you a good answer. And the documentation we had in front of us doesn't begin to answer that question either. So that's what I'm suggesting we need here.

It sounds to me like you want more stringent documentation: exact API-level documentation, just so I understand, of which features don't work when you create a pod, create a daemon set, or specify node affinity or something like that. Is that correct? I'm not arguing that we're not detailed enough; I'm arguing that we don't have any. If you pull up that page that was in front of us a few minutes ago... Can I share my screen, actually? Would that be okay? Because I can show you the README from our...

I think, unfortunately, we're out of time now, so we might have to do some offline follow-up here. And yeah, we didn't get to the rest of the agenda, unfortunately, but I think this was important, and hopefully it was educational for everybody. Let's perhaps take this offline. I don't know whether we want to wait two weeks before we continue the conversation, or should we schedule something separate from the next meeting? Ricardo, what do you think? Yeah, we might be able to schedule something next week around the same time: Thursday at 8 a.m. Pacific. Does that sound good to you guys?
Yeah. Do you want us to start trying to write up that documentation within the week? Because I think that's the biggest point of discussion. Yeah. Okay, definitely, we will start on that, see if it looks like what you're expecting, and go from there. And just to be clear, this is the same thing Brian Grant requested, I believe back in October; I think it's just the same request over again. Yeah, I agree with that. I mean, that's why we're going through conformance testing right now for Elotal. And the bigger question for me, and I didn't reply in that thread, was: does this align with every single provider, and how is this going to be useful for anybody else? If we can answer those questions, I'm 100% happy to do the documentation, and I'm sure it will be useful. It's just the confusing part: we have a core, and we have all these providers with the ability to be so flexible, so it's hard to figure out how conformance fits. And he asked for conformance specifically; that's why we were going through conformance.

Yeah. In my mind, you can conceptually create a table where down the left-hand side are all the Kubernetes features and across the top are all the existing providers, and you can say: these providers correctly support these features of the Kubernetes API. There will probably be huge bands across that table where none of the providers do, and those are arguably cases where the API cannot support those things today. Maybe in future we could address those, or maybe we just make it explicit that if you use a virtual kubelet backend, no matter which one, it isn't going to do the following things. I think that's conceptually what we're
looking for. Okay. And then, ideally, you'd be able to dig down into more detail and find out exactly what does and doesn't work, but at least at a high level one needs to be able to understand what that table looks like. Okay. It doesn't have to cover every possible provider; it might just have three or five of the primary ones as an illustration. All right, we'll pick out three. Thank you so much for your time, everybody; we really appreciate it.

All right, guys, time check: it's 9:03 a.m. So do you want to have this discussion two weeks from now, or next week? It's up to you guys. We could do next week; I've got no plans. I would suggest we wait until we have that documentation, or at least a table. If you can have it done in a week, that's great, but I don't want us to have the whole conversation over again without it, so let us know. Okay, yeah. And then we didn't have time to do the roadmap, so if we schedule a meeting next week we might be able to talk about that, and if not, we'll talk about it at the next meeting, on whatever two-week cadence we have. Does that make sense? Yeah, I think let's push that out to two weeks; I don't want to exclude people from that conversation just because they couldn't make an unusually scheduled one. The one in a week's time we can focus specifically on virtual kubelet, I would suggest. Got it, got it. All right, anybody else have anything else? Okay, well, thank you, everybody. We'll see you next time. Thank you for joining.