Should we get started? Yeah, let's do it. Okay, I'm trying to think if there's anything else... Okay. Looks like Chris wanted to mention that EnvoyCon is almost sold out, so buy your tickets; it's going to be awesome. Do we know which maintainers are going? Are you going, Greg? Yeah, I'm going to be there. Great. Dan, I think you're going. Harvey? Alyssa, obviously. I'm not sure who else offhand. All right. I'm going to start working on my slides, probably for the last section. I talked to Richard from Datawire, and I think we decided it's mostly going to be a structured Q&A, but it would be fun to figure out which maintainers are going to be there and get some photos up, so people know who to go talk to afterwards. Hopefully we can organize that. That'll be great. Awesome, okay.

Great. Harvey, did you want to talk about the event manager stuff?

Yeah. I don't think anything new is going to be said on this call; I just want to raise general community awareness that I'm currently looking at event manager replacements for Envoy. It won't be an either/or thing; the idea is that you could turn on an alternative event manager such as libuv. There are multiple motivations, actually. The first is that I would like better control of the loop so we can gather better statistics; libevent doesn't really provide that. There's also performance: as you scale up to large numbers of events, and really we're talking about thousands or tens of thousands of active events, you start to see real differences. That might occur in proxies at scale with, say, lots of idle connections, but it's nothing most of the overall community will face. And the final one is just having a code base to work with. I feel both libevent and libev are ancient code bases which are never updated. People make the argument, well, what's new in the epoll handling, nothing has changed in the last five years, and that's probably true. But there are various small features, like adding hooks for additional flags to control connection management, which we have an open issue for, and also these additional statistics, that would motivate working on an active code base. In particular, if I'm going to upstream features, I would like to work on a code base where they will actually make it into a release sometime soon, because we work with distributions who might want to, for example, dynamically link system libraries such as libevent or libev, and they're unlikely to want a patched version of a library, one cut at some random point in its version control history. That's sort of the case for libevent today: it has very infrequent releases, and it's very... stable. libev is not much better; libev's canonical source control is, I think, CVS, which... it's been a long time since I've heard that word, and that alone would cross it off the list for me. I don't think I've used CVS in like 25 years. I was using it only 11 years ago, but that was on a code base that was itself a couple of decades old.
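To make the loop-control motivation concrete, here is a minimal sketch, assuming libuv ends up being the alternative event manager, of the kind of per-iteration statistics hook that libevent doesn't readily expose. libuv's prepare and check handles run immediately before and after the loop blocks for I/O, which is enough to measure time spent in the poll phase; everything beyond those two handle types is illustrative.

    #include <uv.h>
    #include <cstdio>

    static uint64_t poll_start_ns = 0;

    // Runs just before the loop blocks waiting for I/O.
    static void on_prepare(uv_prepare_t*) { poll_start_ns = uv_hrtime(); }

    // Runs just after the loop wakes up; the delta is time spent polling.
    static void on_check(uv_check_t*) {
      std::printf("poll phase: %llu ns\n",
                  static_cast<unsigned long long>(uv_hrtime() - poll_start_ns));
    }

    int main() {
      uv_loop_t* loop = uv_default_loop();
      uv_prepare_t prepare;
      uv_check_t check;
      uv_prepare_init(loop, &prepare);
      uv_prepare_start(&prepare, on_prepare);
      uv_check_init(loop, &check);
      uv_check_start(&check, on_check);
      return uv_run(loop, UV_RUN_DEFAULT);  // real code would register I/O too
    }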
I mean, yeah. Okay, so basically libuv is where all the cool kids are these days. Yeah, libuv does a lot more than we actually need it to do, but if we just take the event loop from it, at least we'll be on an actively developed code base, and one which has shared fate with Node.js. So we know at least one other pretty massive open source project also considers it a key dependency, and we have other nice overlap with Node.js, like nghttp2 and so on.

Yeah. I think my two comments here are: first, it's worth doing some investigation, just playing around and hacking at it, because I think Greg had some comments that it might not be so easy; I just don't know.

Yeah, I've done this before, and libuv completely abstracts the notion of a file descriptor, so you just have operations on an opaque thing. I think the code would have to change substantially. It might be a good change, but I don't think it's going to be trivial or straightforward. The lifetimes of everything also work differently.

I thought that abstraction was inevitable anyway. It is, yeah, I agree. But I think it would be somewhat challenging to swap between libevent and libuv in the same code base. You could do that, but you're building an abstraction layer so that you can use multiple abstraction layers. Yeah. That was my other comment: I personally think that if we do this, we should just change over. So, to be clear, I'm not actually opposed to this investigation, but I guess I'm of the camp that libevent doesn't change because it's done.

At the same time, you're suggesting we go in and modify libevent?

No, no, sorry. Just for people out there to be aware of our private conversation: I don't disagree with any of the things you want to do; I wanted to do them too. I would argue they're likely very small changes to libevent. So I guess what I'm saying is that this seems like a very worthwhile thing to explore, but I feel like we should do some investigation and come back with a little more of a firmed-up proposal for what it would look like. Then we can make an educated decision: do we just switch to libuv, or do we want to make some targeted patches to libevent? And I suspect we can of course have a conversation about whether that's worthwhile for a point release, or whether we'd have to fork, and so on. But knowing what I think we need to do, these are very small hooks we could add, very few lines of code. I agree.

Okay. Yeah, and the nice thing about libuv is it would pave the way to better Windows support in the long term. For sure. Yeah.

So, to try to understand: are you committed to just doing some hacking and then coming back with a bit more of an informed proposal? Is that basically your plan? Yeah, that's pretty much it. You know, I was actually going to look at libuv this afternoon. I was treating it as essentially something much closer to libev than what Greg tells me it is.
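As a point of reference for the discussion below, a small hedged sketch of the libuv handle model Greg is describing: there is no raw file descriptor in sight, just operations on an opaque uv_tcp_t with callbacks that fire on completion. The address and port here are placeholders.

    #include <uv.h>

    // Completion callback for the write; may be invoked immediately or later.
    static void on_write(uv_write_t* req, int status) { delete req; }

    static void on_connect(uv_connect_t* req, int status) {
      if (status == 0) {
        static char data[] = "hello";  // must outlive the write
        uv_buf_t buf = uv_buf_init(data, sizeof(data) - 1);
        uv_write(new uv_write_t, req->handle, &buf, 1, on_write);
      }
      delete req;
    }

    int main() {
      uv_loop_t* loop = uv_default_loop();
      uv_tcp_t tcp;  // opaque stream handle, not an fd
      uv_tcp_init(loop, &tcp);
      struct sockaddr_in addr;
      uv_ip4_addr("127.0.0.1", 8080, &addr);  // placeholder address
      uv_tcp_connect(new uv_connect_t, &tcp,
                     reinterpret_cast<const sockaddr*>(&addr), on_connect);
      return uv_run(loop, UV_RUN_DEFAULT);
    }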
Greg, do you know if you can actually get to the lower level? Does it have a lower-level library abstraction which allows you to essentially stick with something closer to the libev model?

If you did such a thing, you wouldn't support Windows at all anymore. It really doesn't want you to do that. Yeah. Okay, I was just curious. I mean, I assume you would still be able to support Windows in the same way that libev supports Windows, through a giant select loop or something, right? Or not? You know, I'd have to look at whether that would actually support Windows, because on Windows the APIs for files versus sockets are completely different; you can't just switch between them like you can on Unix systems. So it may or may not work at all if you treated everything as file watchers. Okay, I'll go in there and take a look at what needs to be done.

I'm kind of wondering if there's going to be any overlap between this and the ongoing work from, um... Yeah, there very likely will be. Probably not in v1 of the Cisco fd work, but I think eventually they're going to have to replace the event loop with DPDK or something similar, so there is parallel work here. It's just hard for me, without the looking into the libuv API that you're going to do, to understand what this means. So I feel like even if you were to go off, spend four hours, and just report back, that would give us a ton more information. I'll do that, and hopefully I'll come back with some idea of it. I mean, basically the API looks like: you create a connection, say a stream connection, and then to do a write on it, you tell it "I want to write" and give it a completion callback, which may or may not be called back immediately. And the same thing for reads. So, yeah, that is a very different model. Okay. Yeah, I mean, that's a better model in my opinion. Yeah, the IOCP model is right; I think it's superior. Me too. Yeah.

So this might be the better long-term direction, because it would support Windows better. Whether the library natively supports Windows or not, we would eventually want to move Windows over to using I/O completion ports anyway, so this might make it easier. But, per Greg, and again without me knowing anything, I'm worried this will turn into a large and probably scary project. That might be the right thing to do, but it's definitely worth analyzing. It'll also be very challenging to do this on the live patient while other people are making changes. Yeah. Okay, let me just get some information and we can circle back. Yeah. Okay. Cool.

I was going to briefly talk about QUIC. Any other stuff that folks wanted to bring up?

Hey, this is Kyle LaRose. Can you hear me? Yep. I've been poking around with the source transparency idea that was submitted as an issue a while back, and I figured I could talk a bit about that. Sure, go for it. All right. So, for those who don't know: we would like the ability to connect to upstreams with the original source IP, and maybe port, that came into the system. You would determine that either through something like proxy protocol, or by just assuming that what arrives at Envoy from the downstream is the correct source port and IP.
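For context on what "connect with the original source IP and port" requires at the socket level, a hedged sketch of the usual Linux primitive: bind the upstream socket to the downstream's source address before connecting. This assumes the IP_TRANSPARENT approach (which needs CAP_NET_ADMIN and matching tproxy/routing configuration) and shows only the underlying mechanism, not what the design doc itself specifies.

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    // Open an upstream connection that appears to come from the downstream's
    // own source address. Returns the connected fd, or -1 on failure.
    int connect_with_downstream_source(const sockaddr_in& downstream_src,
                                       const sockaddr_in& upstream_dst) {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      if (fd < 0) return -1;
      int one = 1;
      // IP_TRANSPARENT allows binding to an address that is not local.
      if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0 ||
          bind(fd, reinterpret_cast<const sockaddr*>(&downstream_src),
               sizeof(downstream_src)) < 0 ||
          connect(fd, reinterpret_cast<const sockaddr*>(&upstream_dst),
                  sizeof(upstream_dst)) < 0) {
        close(fd);
        return -1;
      }
      return fd;
    }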
So, to do that, some work needs to happen internally in the connection pool so that we can actually reflect that address information in the connection, and we also need a bit of work to get that connection information into the connection pool. We had a bit of a back-and-forth on this. I put up a design document, and I think the approach we're going to take is to create a connection pool which wraps other connection pools and handles a lot of the complexity of ensuring that the source port and IP match what came into the system. The biggest complexity is that you can't just open a connection to the upstream and expect it to work: you need to make sure any incoming request maps to the correct upstream connection for its source IP and port, in case there are multiple requests in flight for a given connection. I've been moving slowly on it for the past bit, since I've been pretty busy with other things, but I'm hoping in the next week to sort of buckle down and get to work on it, so we can have something to show soon. Anyway, if anyone has comments on that, let me know.

Yeah, if folks have not seen the design doc, it's super detailed, there are lots of comments on it, and it's worth reading. The one thing I wanted to point out is that the reason I'm really excited about this work is that it won't just solve source port. There have been repeated requests for other features like this, for example routing to a host with the same SNI information as was set from downstream, and this wrapped connection pool would also let us do things like that, which is nice.

All right, that's all I have. Cool. Did anyone have any questions or comments on that?
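Purely for illustration, a rough sketch of the shape of the wrapping pool described above, using hypothetical names rather than Envoy's actual connection pool interfaces: an outer pool that partitions by downstream source address and delegates to a per-address inner pool, so an upstream connection is never reused across different sources.

    #include <functional>
    #include <map>
    #include <memory>
    #include <string>

    // Stand-in for a real per-upstream connection pool.
    struct InnerPool {
      void newStream() { /* open or reuse an upstream connection */ }
    };

    class SourceKeyedPool {
     public:
      using Factory =
          std::function<std::unique_ptr<InnerPool>(const std::string&)>;
      explicit SourceKeyedPool(Factory factory) : factory_(std::move(factory)) {}

      // Requests in flight with different downstream sources must not share a
      // connection, so each source address gets its own inner pool.
      void newStream(const std::string& downstream_src) {
        auto& pool = pools_[downstream_src];
        if (!pool) pool = factory_(downstream_src);
        pool->newStream();
      }

     private:
      Factory factory_;
      std::map<std::string, std::unique_ptr<InnerPool>> pools_;
    };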
Anyone have any other things to bring up? Was somebody going to talk about QUIC? Yeah, I was just asking if anyone had anything else. Did I see stat memory reduction? Well, yeah, I was actually going to say: do you want to talk briefly about our email thread? Because I think it's interesting.

Yeah. So I kind of did this pattern where I just went hog wild on something I'd never try to do a PR for: eliminating as much stat memory as possible, and I was able to cut about half of it. Actually, as background: there are cases where we're scaling Envoy to a surprising number of clusters, tens of thousands, maybe more, and in that scenario almost all the memory is stat names. That seems like not a great use of memory, and most of those names are just different combinations of the same strings put together in different ways. So I've done, I think, most of what I can do along that line without doing the more radical thing of introducing a symbol table, and that's in flight now.

Along the way I found various things that might make that work easier, which Matt and I have been chatting about. One of them is to simplify, or maybe even eliminate, shared-memory stats. With elimination there would still be hot restart, but it would involve a transfer of control and a transfer of data from the old process to the new process. That was Matt's idea, actually; I haven't looked at it at all beyond talking with him about it. I also thought it would simplify shared-memory stats to not store the stat name in shared memory at all, but instead store something like a SHA hash, and use that for deterministically unifying stat names with a fixed size. That would eliminate a lot of complexity and reduce the amount of shared memory needed. But mostly what I'm looking at now is a flow where all the stat name memory is held in a symbol table without taking a lock in the hot path. That last part is the tricky one, because the symbol table itself requires locks. I think I have a solution to that, but I don't know if I have a solution that doesn't involve changing a whole bunch of lines of code. So I'm messing around with that now, trying to get something that's reviewable.
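An illustrative-only sketch of the symbol-table idea, with hypothetical structure rather than the in-flight change: since most stat names are different combinations of the same dot-separated tokens, intern each token once and represent a full name as a short vector of integer symbols. As noted above, the table needs a lock once it's shared across threads, which is exactly the hot-path problem being worked on.

    #include <cstdint>
    #include <map>
    #include <sstream>
    #include <string>
    #include <vector>

    class SymbolTable {
     public:
      using Symbol = uint32_t;

      // "cluster.foo.upstream_rq_total" becomes three small integers; tokens
      // repeated across thousands of clusters are stored only once.
      std::vector<Symbol> encode(const std::string& name) {
        std::vector<Symbol> symbols;
        std::istringstream tokens(name);
        std::string token;
        while (std::getline(tokens, token, '.')) {
          auto it = table_.find(token);
          if (it == table_.end()) {
            it = table_.emplace(token, next_++).first;
            reverse_.push_back(it->first);
          }
          symbols.push_back(it->second);
        }
        return symbols;
      }

      std::string decode(const std::vector<Symbol>& symbols) const {
        std::string out;
        for (Symbol s : symbols) {
          if (!out.empty()) out += '.';
          out += reverse_[s];
        }
        return out;
      }

     private:
      std::map<std::string, Symbol> table_;  // token -> id; needs a lock if shared
      std::vector<std::string> reverse_;     // id -> token
      Symbol next_ = 0;
    };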
Cool. Yeah, one thing to throw out there, and it's something I've been thinking about for a while and I'm curious if people have thoughts on it. There's been a bunch of work already done with the symbol table and other stuff, and, just for historical reasons, stats started out way simpler than they are now; things have gotten a lot more complicated. So some of the design decisions we started with, say, three and a half years ago may not apply anymore, and one decision I'm becoming increasingly convinced is not worth it anymore is keeping the stats themselves in shared memory. Very roughly, the way I think it would work is: we would move each process individually over to the symbol table, so each could use the heap and minimize memory as much as possible. Then, in the hot restart protocol, we would have some type of pagination API where the new process can ask the old process, "give me all your counter and gauge values," and it would do that until the old process shuts down. So, say, every flush interval, every five seconds or every 60 seconds, the new process would say to the old process, "hey, give me all your stats." The old process would start sending RPC frames to the new process with blocks of stats, and when the new process outputs stats it would add the old process's counters and gauges to its own. It's not going to be as real-time or as accurate, but honestly I think it will work perfectly well and people won't even know the difference. And if we did that, it would simplify so much: we could get rid of all the truncation stuff, and we could start using the same code path for symbol tables. Again, just for historical context, the reason I didn't do it that way back in the day is that things used to be way, way simpler, so shoving the stats into shared memory was a lot simpler than writing this pagination thing. But now, writing the pagination thing seems trivial compared to all the other work we're doing. So my current thinking is actually that we should rip stats out of shared memory, and I'm curious if people have thoughts on that.

I think that's worth investigating. It sounds like a good change if we can pull it off. I'm curious how it'll scale with large numbers of stats: how much data we have to send between the processes, and how much CPU time that consumes. Yeah, I mean, it's on the order of a hundred stats per cluster, and each one takes about a hundred bytes.

But it's only the gauges that need this sort of real-time responsiveness, right? Yeah, and as part of this we could rethink a bunch of things. I'd say gauges are the most important; you could argue we could just do this for gauges. In a perfect world you would have counters too, because the old process is still doing things, right? It would be nice to be able to add in the things the old process is doing, like closing connections, draining connections, etc. But I do agree that if we go down this road we can have a larger conversation about the optimal way of doing it. And it's a coherent argument to say it's too much work to do both counters and gauges, and there are way fewer gauges, so let's just do gauges. But I think the optimal solution would continue to send both counters and gauges, so my opinion would be: let's start there. If it looks like, per Greg's concern, it's using too much CPU or something like that, we could back it off. But my gut tells me it won't be an issue, because if we only send the data from the old process to the new process every 30 seconds or so, I just don't see it being a big deal. Right, and if it's only the delta, right, it'll be much smaller. Even if you have 10,000 clusters, you may only be actively using, say, 50 of them. Yes, right. We already keep track of which stats have been, quote, "used," so we would only send used stats; I think the number of stats we send would be small.

Yeah. I wonder if, rather than developing a new API, you could somehow make use of the new process as a stats sink for the old one. I haven't quite thought that through.
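A hedged sketch, with hypothetical names, of the merge step in the pagination idea above: the old process periodically ships batches of its used counters and gauges, and the new process simply folds those values into its own at output time. The transport (hot restart domain socket, RPC framing, proto encoding) is deliberately elided.

    #include <cstdint>
    #include <map>
    #include <string>

    struct StatsBatch {
      std::map<std::string, uint64_t> counters;
      std::map<std::string, int64_t> gauges;
    };

    class MergingStatsExporter {
     public:
      // Called whenever a batch of parent-process stats arrives; once the old
      // process exits, parent_ simply stops updating.
      void onParentBatch(const StatsBatch& batch) { parent_ = batch; }

      // At flush time, report child values plus the parent's last-known values.
      StatsBatch merged(const StatsBatch& child) const {
        StatsBatch out = child;
        for (const auto& [name, value] : parent_.counters) {
          out.counters[name] += value;
        }
        for (const auto& [name, value] : parent_.gauges) {
          out.gauges[name] += value;
        }
        return out;
      }

     private:
      StatsBatch parent_;
    };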
Yeah, actually, maybe. That's worth thinking about, for sure. It'll be a little tricky, because when you send stats and when you request them is a hot restart implementation detail, since the new process is shutting down the old process, so there might be some timing issues. But yeah. This might also be the opportunity we've been looking for to change the hot restart RPC API over to protos. So there are some real benefits here that might make this worth investing in. I'm not sure if you're up for it, Joshua, but it might be worth at least opening an issue listing out some of these options, and then we can talk about it from there. The more I think about it, the more I think we should just rip it out.

That's cool. I will say that what you talked about has a bunch of benefits, but the existence of the hot restart path and the alternate way of allocating stat memory is not really getting in my way at all, because I managed to find the right points to virtualize it; even the truncation is not really a big deal. But I understand there are a bunch of benefits to this, and it sounds more portable. Well, yeah. As this code has gotten more and more complicated, I feel like this would let us vastly simplify it and make it easier to understand again. Whereas now, with the different code paths for the heap allocator and the shared allocator, it's so complicated that it's very hard to wrap one's head around. I'm deep in it, so I get it, but I agree it's definitely harder to understand. Okay, are you willing to potentially open an issue on this whole conversation? Then we can decide, and we'll find someone to look into it. Yeah. I'm not sure I'll have time to dig into it, but I'll certainly open the issue and try to record the stuff we've already discussed. Okay, great. If it turns out you don't have time to open the issue, let me know and I can do it. Great. Awesome.

So, just three minutes about QUIC: I sent some emails to the list. We're going to be kicking off a cross-company effort to get QUIC going, so check out those emails if you're interested in QUIC. I think over the next couple of months we'll increasingly be looking for help. Right now there isn't as much to do, but there will be more soon, once some of the initial UDP stuff lands, some of the VPP fd refactors land, and some of the Google work lands. That'll be great. Cool. Anyone else have anything they wanted to bring up?

All right, just out of curiosity: what's the rough model for mapping UDP onto Envoy's worker model? Do you receive every packet and then forward it to what you calculate as the correct worker? Yeah, exactly. So the way we've been thinking about this is that in newer kernels, with UDP, if you use SO_REUSEPORT, the kernel will actually hash for you, so forwarding wouldn't necessarily be necessary; packets should wind up at the right place. That's how Google does it at scale. So in the common case I don't think there will have to be cross-worker sends; it should just work.
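A small sketch of that per-worker model, under the assumptions just described: each worker opens its own UDP socket on the same port with SO_REUSEPORT set, and newer Linux kernels hash incoming packets across the sockets so a given flow keeps landing on the same worker, with no cross-worker forwarding in the common case.

    #include <arpa/inet.h>
    #include <cstdint>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    // Each worker thread calls this with the same port; the kernel then
    // load-balances incoming UDP packets across the workers' sockets.
    int open_worker_udp_socket(uint16_t port) {
      int fd = socket(AF_INET, SOCK_DGRAM, 0);
      if (fd < 0) return -1;
      int one = 1;
      if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
      }
      sockaddr_in addr{};
      addr.sin_family = AF_INET;
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      addr.sin_port = htons(port);
      if (bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(fd);
        return -1;
      }
      return fd;
    }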
I don't think that there will there will have to be cross worker sense It should just work Okay, but we'll have a fallback path for yeah, right, right do that or whatever. Yeah, exactly Yeah, so we'll have to keep track of of the map and then basically set set packets over. All right. That sounds reasonable. Yeah Cool anything else had a quick question on uh What we were deprecating some of the tcp stuff we had uh kind of a blocker with the source ip is not being implemented fully in b2 For filter matching Yep And then there was kind of another parallel pr for adding a source type Which seemed like it was potentially also going to do this. I had kind of a parallel pr If you add source ip and source ports, but then we decided source ports is probably something you want to deprecate entirely anyone have any more follow-up on if that source Hype pr is something that we should just kind of To go to where we actually ended up going with that and may have just been sitting there and it's got this fail the problem really there was that it was an incomplete solution like it was Definitely an interesting feature which you couldn't emulate with the existing Um matches this idea of trying to make sure to determine whether an incoming ip was essentially coming from the local host or not Uh, but yeah, the claim was I'm being kicked out That's you could do that just by trying to match ip addresses was Not really complete for more complex systems. So My take was that it was fine to add what they were after but they needed to Rephrase what it actually was doing versus versus claiming there was matching from same host to at least, you know, since you just say my p So yeah, but anyway, I'm being Out so I can continue this online Yeah, let's let's attempt to find some resolution to that. So maybe christopher if you could ping those prs. We could figure it out Yeah Yep, okay All right. Thanks everyone. Have a great day Thanks