Hey everyone, welcome to Linkerd Day. Good to see everyone here. So as Flynn said, I'm going to be talking about the negatives of a per-host service mesh model, why the sidecar model — the Linkerd way — is a more ideal solution, and how it provides a more robust security boundary. So we're focusing on security.

My name is Chad Crowell. I'm a DevSecOps engineer at Raft, and I'm also the author of the book *Acing the Certified Kubernetes Administrator Exam*. I'm a CNCF ambassador and also a SIG contributor. At Raft we work with federal and public agencies in the US to scale cloud native ecosystems with a security-first approach, so this is a topic of real interest for us.

Let's go through the objectives — what we'll talk about in this session. First, we'll talk about the sidecar-versus-per-host debate and how it came about. Then we'll talk about sidecar proxies and how they work in both scenarios, at the pod level and at the host level. Then we'll cover the security implications of a per-host model, and we'll wrap up with a summary of how to move forward with improving the service mesh overall.

So why have we been talking about getting rid of the sidecar proxy? It seems to be a pretty big debate — you've probably heard of it. What's the big deal? Well, in certain service meshes the overhead of sidecars can be severe. There are the resource requirements of the proxy itself, and the difficulty of scaling the proxies up and down quickly. Sometimes there are issues with jobs and apps that start before the sidecar does — init containers, for example. Sidecars also split every TCP connection into multiple segments, each requiring its own three-way handshake. And when you're doing TLS, the additional handshakes introduce extra latency as the repeated back-and-forth happens — sometimes up to two milliseconds or more to encode and re-encode at L7. Proxies also don't support non-TCP and multicast transports — UDP, ICMP, and other protocols carried over IP — and that could be an argument against them too. So these are some of the problems that have been presented, and probably why this debate is so popular now.

eBPF has been proposed as the solution to this. eBPF, as I'm sure you've heard of it, is an in-kernel sandboxed virtual machine. Instead of writing a loadable kernel module, we can write a set of instructions in user space and send it to the kernel, to have it do things in the kernel directly. For example, let's say you want your application to process network packets: you can give the kernel a program that executes at the kernel level, gets access to the machine's network buffer directly, and skips the passing back and forth between user space and kernel space.
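To make that concrete, here is a minimal sketch — a hypothetical example, not something from the talk — of what such a program looks like in libbpf-style C. It attaches at the XDP hook point, so the kernel invokes it for every packet arriving on an interface, before the normal network stack sees the data; the map is the program's only channel for sharing state with user space. The file and function names are mine:

```c
// xdp_count.c -- hypothetical sketch of a minimal eBPF program.
// It runs inside the kernel at the XDP hook point, so packets are
// handled without any round trip to user space.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

// A BPF map: the only way the program shares state with user space.
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} pkt_count SEC(".maps");

SEC("xdp")
int count_packets(struct xdp_md *ctx)
{
    __u32 key = 0;
    __u64 *count = bpf_map_lookup_elem(&pkt_count, &key);
    if (count)                      // the verifier requires this NULL check
        __sync_fetch_and_add(count, 1);
    return XDP_PASS;                // hand the packet on unmodified
}

char _license[] SEC("license") = "GPL";
```

You would compile something like this with `clang -target bpf` and load it with bpftool or libbpf; the point is simply that the packet-handling logic runs inside the kernel, at a hook point, with direct access to the network data.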
So now that we have eBPF, many have announced to the world that we can go sidecarless. But — and this is a big but — eBPF is just not able to take the place of proxies, and here's why. eBPF is invoked at specific kernel hook points; it doesn't just arbitrarily execute code. eBPF programs are also not Turing complete, and for good reason: for the safety of the kernel, it has to operate this way. eBPF programs have to be verified to terminate — the kernel has to know that the program will finish within a bounded amount of time. In addition, all eBPF programs must pass through a verifier, and if the verifier rejects the program, the kernel won't run it. Naturally, this verifier has to err on the side of being restrictive — after all, we're talking about the kernel.

So eBPF programs can't just start up and execute arbitrary logic. Like I said, they're Turing incomplete, so they have to execute in finite time. As a result, eBPF programs are very limited. They cannot block, they cannot have unbounded loops, and they cannot exceed a predefined size. They're also limited in their complexity: the verifier evaluates all the possible execution paths, and if it can't complete that analysis within its own limits, or if it can't prove that every loop has an exit condition, the program doesn't pass.
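To make those limits concrete, here is a hedged sketch — again a hypothetical example, not from the talk — of the kinds of patterns the verifier accepts and rejects. The hook point is arbitrary; the same rules apply everywhere:

```c
// verifier_limits.c -- hypothetical sketch of eBPF verifier constraints.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int verifier_demo(struct xdp_md *ctx)
{
    __u64 sum = 0;

    // OK: a constant-bounded loop. The verifier can walk every execution
    // path and prove this terminates.
    for (int i = 0; i < 16; i++)
        sum += i;

    // NOT OK -- the verifier rejects programs containing patterns like:
    //
    //   while (1) { }     // unbounded loop: no provable exit condition
    //   sleep(1);         // blocking: eBPF programs cannot block
    //
    // Programs that exceed the instruction/complexity budget are also
    // rejected: the verifier explores all execution paths, and if it
    // cannot finish its analysis, the kernel refuses to load the program.

    bpf_printk("sum=%llu", sum);    // use the result so it isn't optimized out
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```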
So now that we know we can't just simply apply eBPF and have sidecars disappear, let's talk about the proxy at the sidecar versus per host. Because of the problems with sidecars that we just discussed — resource requirements, latency, support for other protocols, and so on — some have devised a solution: move the proxy to the host instead of running it at the pod level. And as you may have guessed, this is the crux of my talk today.

So let's talk about some of the problems I've discovered when moving the proxy from the pod level to the node level. The first problem we run into is overhead. Even though you have just one proxy at the node level instead of a proxy for each pod, you still have a lot of overhead issues. This comes in the form of an application consuming a high amount of resources on the host, which in turn drives the proxy to also consume a high amount of resources on the host. That impacts the other applications running on that node — a classic noisy-neighbor problem.

Another issue we run into is resource management. When you have a proxy per host, a single proxy is now managing traffic for a seemingly random set of pods. The proxy is completely decoupled from the application, so application failures are hard to track, and the implications of performing maintenance tasks become less predictable.

You also widen the failure domain. When you have a proxy at the host level, the blast radius increases: if a proxy fails, there's a greater impact on all the other apps running on that same node. Because that proxy is servicing many applications all at once, a proxy malfunction is a single point of failure.

And then, most importantly, there's the security aspect. When you have a single proxy per host, performing mTLS becomes far more complex and leaves you vulnerable to the confused deputy problem, if you've ever heard of that. The confused deputy problem means the proxy is vulnerable to being tricked into misusing its authority. As a result, any CVE in the proxy introduces a vulnerability for the entire host, and because it sits at the host level, that vulnerability could allow bad actors to impersonate any workload on that node.

So, moving on to the alternative — the Linkerd way. With sidecar proxies we have a much clearer security boundary: the security boundary is at the pod level, which is a much smaller boundary. The sidecar is within the same security context as the application. Therefore it gets the same IP address, it enforces the same policy, it applies mTLS to traffic to and from that same pod, and it only needs the key material for that pod.

Another good thing is that it mirrors the app's consumption: the proxy's resource consumption scales with the load of the application. This is a good thing because, as we talked about on the last slide, we don't have that noisy-neighbor problem — and beyond that, if the application is handling very little traffic, the proxy doesn't need to consume many resources either. And another benefit of the sidecar model is the ease of maintenance: maintenance of the proxy is handled in the same fashion as maintenance of the app, whether that's rolling updates, blue-green, canary, or what have you.

So now that we know the downsides of a per-host proxy and some of the benefits of a sidecar proxy, how do we move forward? How do we take all of that and improve the service mesh overall? After all, it's supposed to be enterprise-grade. My argument is that we should make the sidecar model better, as opposed to trying to get rid of it entirely or move it to a per-host model.

Since we've discovered that eBPF will never be a replacement for the L7 proxy, let's let the proxies do what they do best at layer 7, and let the lower-level protocols do what they do best at L3 and L4, instead of trying to combine L3, L4, and L7. For example, when the sidecar proxy is deployed to a pod, instead of using an iptables redirect, we can use sockmap and eBPF to talk to the proxy directly from within the pod. Instead of traveling through the TCP stack multiple times — which is where that latency comes from — we'd use socket maps to intercept traffic to the application and send it to the proxy directly, accelerating communication and eliminating those trips back and forth through the TCP stack. (There's a rough sketch of this pattern right after the talk, below.)

Another way to think about using eBPF to improve the service mesh is observability. Today we can only observe the metrics that are part of the service mesh; we can't see other, possibly critical issues in our environment — things at the Linux kernel level, for example — and that's where eBPF really shines. In this case we could use eBPF to observe applications whether they're part of the mesh or not — everything in our environment, which is nice. For example, in a previous talk, Brendan Gregg at Netflix used biosnoop to find a log rotation service that was causing spikes in disk utilization — which is not necessarily something a service mesh would catch.

So looking ahead: eBPF has a long way to go before it can replace the sidecar proxy, and maybe that's not even where we need to be focusing our efforts — maybe that's a lost cause. We've definitely realized that a per-host model introduces a whole host of other issues and makes my job as a sysadmin more challenging. So overall, let's start to focus on augmenting the service mesh: focus on improving the CNI with eBPF at layers 3 and 4, and let's keep focusing on improving the proxy at layer 7 — improving the sidecar model, as I spoke to before.

And that's it. Thank you all for listening.
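Here is a minimal sketch of the sockmap acceleration mentioned in the talk, in the same hypothetical style as the earlier examples. The idea: once the application's socket and the proxy's socket are both registered in a sockmap, an sk_msg program can splice data between them directly, skipping the iptables redirect and the repeated trips through the TCP stack. The slot convention (0 = app, 1 = proxy) and populating the map from user space are assumptions of this sketch:

```c
// sockmap_redirect.c -- hypothetical sketch of sockmap-based redirection
// between an application socket and a local sidecar proxy socket.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

// Sockets are inserted into this map (from user space, or from a
// sock_ops program) when connections are established.
struct {
    __uint(type, BPF_MAP_TYPE_SOCKMAP);
    __uint(max_entries, 2);
    __type(key, __u32);
    __type(value, __u64);
} sock_map SEC(".maps");

// Assumed convention: slot 0 = the app's socket, slot 1 = the proxy's.
SEC("sk_msg")
int redirect_to_proxy(struct sk_msg_md *msg)
{
    // Splice every message written by the app straight onto the proxy's
    // socket -- no iptables redirect, no extra pass through the TCP stack.
    return bpf_msg_redirect_map(msg, &sock_map, 1, BPF_F_INGRESS);
}

char _license[] SEC("license") = "GPL";
```

Real implementations typically use a sockhash keyed by the connection 5-tuple, with a sock_ops program populating the map as connections are established — Cilium's socket-level acceleration works along these lines.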
Thank you! Oh, I already see questions — yep, we have a few minutes for questions. Hang on a moment, please.

Hi. I guess this was more of a proposal, so I was wondering: are there any actual production user stories of an eBPF per-host model failing? Or is this more a proposal for how to make things better? And then, what would you say are some good examples of pairing the sidecar with an eBPF CNI — would you recommend something like Cilium or Calico for that?

Yeah, so to answer your first question: Linkerd 1.x actually used a per-host model, and William had a good article on the history of Linkerd 1 — some of the user stories you're looking for will be in there, so look that up. And to answer your second question: yeah, we can use tools like Cilium, and that's probably the best use case for improving the CNI.

Anyone else? One moment.

So I'm not really familiar with it, but I've heard of ambient mesh, which I think is in the Istio space. How is this related to that?

Yeah, so ambient is a DaemonSet, which is essentially a per-host model as well. I've heard that ambient has some efficiencies when used with GKE specifically, in terms of merging together L3, L4, and L7, as we talked about before. But I think the complexities are still there, especially outside of GKE, when you're trying to merge all those things together. It's not a silver bullet — you still start to recognize some of the downsides we talked about and run into some of those same problems. So even though ambient is a DaemonSet, and technically a per-host model, I think you still have those problems, especially outside of GKE.

Anybody else?

Hey, this is William. I can add a little color to the first bit, about the failure modes of per-host proxies in practice. Like Chad said, the original version of Linkerd, the 1.x branch, was all per-host, and Monzo was one of the big early adopters. If you go back in time and read some of their early blog posts about their struggles with Linkerd, I think that's probably the clearest production story we have around this. A lot of those were operational challenges more than security challenges — the security one is more of a theoretical one: okay, in a per-host model we're mixing all of our TLS certificates into the memory of one process. Do we have an actual story of someone breaking in and stealing those certificates? No, not that anyone has talked about. But the operational ones are in some of those Monzo blog posts. It's things like: if a proxy fails, or if we're trying to upgrade an individual proxy, then all the pods on that host are affected — and those pods are a random set of pods from random applications, whatever Kubernetes decided to schedule on that node. So it's not even correlated with "we're going to do maintenance on this service" — you get a random cross-section that gets affected by either a failure or an upgrade. So yeah, check out the early Monzo blog posts,
I think, for some examples of per-host failures. Obviously it wasn't with eBPF — eBPF was still a dream back then, or certainly hadn't made its way to Kubernetes — but the fundamentals are the same.

Yeah, and you can also find me on the Linkerd Slack, and I can share with you all the research I did for this talk. I'd love to talk to you.

Anyone else? Questions? All right, well in that case — thank you, sir.

Thank you. We have a—