So, okay, anyone who's used streams knows that it's, you know, a mess. Not to be mean or anything, but that's how it works. The short version is that the streams model in Node is essentially push-based. You have the data provider and the data consumer, and the data provider is basically just shoving data at the consumer, right? There is a version of backpressure, but the state model involved is rather complicated, and we can't separate the state model from the implementation. There are three versions of streams that all overlap each other in the API, so there's a lot of cruft that has come into this. And it's not only three versions at the JavaScript level; there's also, at the native level, the StreamBase class that underlies stuff, and there's complexity in terms of how that works. The flow that goes back and forth between native and JavaScript makes for more work there. There's the streams pipe method, right? We can pipe the data down through a series of stream transforms. But that doesn't work at the native layer. I know Anna has been looking at that a little bit, to see if we can get a native pipe going to keep the data flowing natively and make it even more efficient. Over the years, the streams API has just gotten crufty, right? Was it two years ago? Yeah, that's where it started. A few of us got together and had lunch at Interactive in Vancouver two years ago, and we started tossing around ideas for this new streams API. Somebody asked, what are we going to call it? And somebody suggested Bob. So it's Bob, the streams API. Jeremiah has been the one primarily pushing it forward, because the rest of us that were talking about it just got distracted, and he was the only one that, you know, did the work, and he's done fantastically. But we're reaching a point now where, in order to move it forward, we need to expand beyond those of us that were originally talking about it, and I can help.
And then if anyone else wants to help out with it, we just want to give a basic idea of how it works. So, you ready to go? Not really. I'm going to keep talking. Sure. I have a presentation that I modified. The big difference with Bob is that it's entirely pull-based, not push. You have a sink and a source. The source is providing the data, but the sink has to pull it, right? It has to say when it's ready. And the state model is greatly simplified: you only have the one read, right? If you read and there's no data available, it will tell you right away. If there's an error reading, it will tell you right away. There are very few state transitions as you're actually reading the data. Part of it is that it emphasizes being memory-stable. So rather than the source allocating memory, pushing it, and then the receiver having to copy the data, or notify the source when it's done, the puller allocates the memory for the source, and the source basically writes into that storage. The intent is to hopefully reduce the number of memory copies that actually happen. I can't get the other microphone to work, so I guess I'm just going to go over this. So, James kind of gave an intro, and this is kind of just the rest. These are slides that I modified from Vancouver last time. Yeah, how do I share this on the screen? I don't know how. You can just look in the Zoom group. Zoom, where? You start Zoom. Or you go to issue number 171 on the subject. I think that's easier than you think. So yeah, James kind of covered some of this a little bit. This is a modified presentation from last time, because somewhat limited progress has been made since then, but it's still very interesting, and we can still continue to discuss it regardless. So I'm going to go over the slides that I do have, perhaps a little bit more quickly than last time.
I think we have a little bit more time to spend, but we'll try to discuss some things around them. So, Bob. Yeah, we went over why it's called that, and it's about future questions with streams, or stream-like things. So I have some stuff here: kind of the why, which I think James covered a little bit, a bit of a lay of the land, the API, the status, and stuff to discuss. So, why reinvent streams? You might know why here, but: the user experience is really bad. People run into problems all the time; we hear about it all the time. I don't think this is a surprise to very many of us at this point. And, three, performance. We know that too. So I have some goals that I'm going to go over, and just like last time, there are some terms. The thing that I'm calling a consumer, or sink, is the API endpoint where data goes to, and a producer, or source, is the API endpoint where the data comes from. And the protocol is the combination. So the goal of this effort is to make a protocol, really. It's pull-based. It is binary only, so you don't need to worry about object mode. Where's Jane? Thank you. It's as stateless as we can get it, so any state is in the protocol, rather than this wild state machine we have. It is as one-to-one as possible. There are no event emitters, because those also... it's very important. There's also timing: as much as we can get it, so that we don't need to worry about next ticks and events like you currently do. No buffering in the protocol. We would like the underlying classes to not have buffering logic, and any buffering that is done should be done in components where it actually needs to be done, rather than all the time. Also, ends, or end-of-files, should be able to be in-band, and errors should also be able to be in-band messages. So I have a current proposed API, and unfortunately it's late in the day and I'm still going to show you code, which I regret, but that's what it is. So the sink API essentially looks like this.
Thankfully it's rather simple code. There's a method to kind of bind things together: the sink to the source. So you get the protocol, and then on the sink, which is the thing that is going to be receiving the data, you have a next method, which is what the source is going to call. And that can be filled out a bit more. So this actually more or less works: bind the sink and the source together, and then in your next, where you get the data, once you are done receiving and processing it, you are going to pull again, because these are pull-based. So you always request the data. And the source API is kind of like this. There are two methods that are the other side of the protocol: you have bindSink, and pull, which is called. And that can be filled out so that once you get the data, you send it back to the sink with next. In practice, obviously, there's a bit more filler, but that's really all the protocol is. So a bare pass-through can be constructed just as this. This works as a bare pass-through completely; there is no extra code needed, and it fits on the slide. It's pretty nice. Now, composing these things into something that you can actually stream data through is kind of not the most perfectly pleasant right now. A helper would be appreciated, or rather, writing a helper is probably going to be the ideal thing to do, or to just change the binding mechanism. I left it rough like that; it's left raw for discussion. And if you want to put a transform in between, you basically have your transform bind to your source, and then your sink would bind to that transform. And the way most of it is set up right now is so that it would kind of automatically start. There's some stuff... I basically wrote this very minimal thing on what we may want to add as extensions.
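Based on the description above, here is a minimal sketch of what such a pull-based sink/source protocol might look like. The method names (bindSink, bindSource, pull, next) and the status values are taken from the talk's general shape, but this is an illustration, not the published bob module API; exact signatures may differ.

```javascript
// Hypothetical sketch of the pull-based protocol described above.

class ArraySource {
  constructor (chunks) {
    this.chunks = chunks.slice()
    this.sink = null
  }

  bindSink (sink) {
    this.sink = sink
    sink.bindSource(this) // give the sink a reference back to us
  }

  // The sink calls pull() when it is ready for more data, passing in the
  // buffer it allocated -- the source only writes into sink-owned memory.
  pull (error, buffer) {
    if (error) return this.sink.next('error', error, buffer, 0)
    if (this.chunks.length === 0) {
      return this.sink.next('end', null, buffer, 0) // in-band end-of-stream
    }
    const bytes = buffer.write(this.chunks.shift())
    this.sink.next('continue', null, buffer, bytes)
  }
}

class CollectSink {
  constructor () {
    this.received = ''
    this.done = false
  }

  bindSource (source) {
    this.source = source
  }

  start () {
    // The sink allocates the memory the source will write into.
    this.source.pull(null, Buffer.alloc(16))
  }

  // Called by the source with each result; pull again once done processing.
  // The same buffer is reused, which is what keeps this memory-stable.
  next (status, error, buffer, bytes) {
    if (status === 'error') throw error
    if (status === 'end') {
      this.done = true
      return
    }
    this.received += buffer.toString('utf8', 0, bytes)
    this.source.pull(null, buffer)
  }
}

const sink = new CollectSink()
const source = new ArraySource(['hello, ', 'bob'])
source.bindSink(sink)
sink.start()
// sink.received is now 'hello, bob' and sink.done is true
```

Note how the state model stays tiny: the only transitions a sink ever sees are 'continue', 'end', and 'error', delivered directly in the next call rather than through events.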
I'm not sure if that's in the slides, but in cases where you have a network socket, it's nice to be able to start it explicitly and not have it running right at the start. So, all of this is found in my repos on GitHub, under Fishrock123/bob. It links to everything, and if you go to /diagrams you can find these diagrams. So this shows you the entire flow where there is no error. I don't have a pointer or anything, but you start down where the bind, et cetera, kind of thing is; that's where your setup is. And then you go and you pull up the stream. So in this case you may, from a network socket or something, pull from a compression transform, which is going to pull from, say, a file source. And then once your file source has that data, it's going to send it back down. It's going to send it back through your transform, which may need to do some buffering, so it may do that. And then it's going to get down to the bottom, and it's going to repeat as many times as necessary. When that's done, you will get a message there saying that everything is done, or you get an error, in which case the error flow looks very similar. The error can be emitted anywhere. It bubbles up so that you can close anything above it, and then it comes back down so that you can close anything below it, and also the thing that emitted it. I don't know if that makes sense, but we can go over it again and hopefully discuss it. The current status is that I have various modules published. They're published on npm, and there are GitHub sources for these things: bob-status, an fs source, an fs sink, a zlib transform. They do work. And then there's socket, which kind of attempts to make a network socket out of this. Requests with that socket work, but the server leaks memory, because I'm not a very good C/C++ programmer and I have limited time to spend on this.
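The no-error flow just described, where the sink pulls from a transform, the transform pulls from the source, and the data flows back down the chain, can be sketched like this. All names here are illustrative object shapes, not the published bob-* module APIs.

```javascript
// Sketch: sink -> transform -> source, wired with bindSink/bindSource.

const source = {
  chunks: ['abc', 'def'],
  bindSink (sink) { this.sink = sink; sink.bindSource(this) },
  pull (error, buffer) {
    if (this.chunks.length === 0) return this.sink.next('end', null, buffer, 0)
    const bytes = buffer.write(this.chunks.shift())
    this.sink.next('continue', null, buffer, bytes)
  }
}

// A transform is a sink to its source and a source to its sink.
const upperCase = {
  bindSink (sink) { this.sink = sink; sink.bindSource(this) },
  bindSource (src) { this.source = src },
  pull (error, buffer) { this.source.pull(error, buffer) }, // pulls go up
  next (status, error, buffer, bytes) {                     // data comes down
    if (status === 'continue') {
      // Transform in place: no extra allocation or copy.
      buffer.write(buffer.toString('utf8', 0, bytes).toUpperCase())
    }
    this.sink.next(status, error, buffer, bytes)
  }
}

const result = []
const sink = {
  bindSource (src) { this.source = src },
  start () { this.source.pull(null, Buffer.alloc(8)) },
  next (status, error, buffer, bytes) {
    if (status === 'end') return
    result.push(buffer.toString('utf8', 0, bytes))
    this.source.pull(null, buffer) // ready for the next chunk
  }
}

// Wiring, as described in the talk: the transform binds to the source,
// and then the sink binds to that transform.
source.bindSink(upperCase)
upperCase.bindSink(sink)
sink.start()
// result is now ['ABC', 'DEF']
```

A single buffer, allocated by the final sink, travels up on every pull and comes back down filled, which is the memory-copy reduction the protocol is after.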
There are some numbers on API performance, if someone wants to know them and doesn't already, but this is much cheaper to do than using current streams, and maybe we can discuss that. Any questions on the basic API? Any questions on the benefits or concerns? What will we have for streams? How do you want to support the new API in existing things like net or HTTP/2? And does it have any overlap with the promises-friendly API for streams? So, I kind of left that section out of the slides, but the plan is to change all of the insides of Node, basically, so that streams internally would use this, because streams is mappable on top of this, including all the events stuff, as far as I can tell. So the path for introducing this is basically just to add it as a separate, new streams implementation, right? In parallel to what's there. The existing streams API would continue to exist, untouched, but there would be a mapping layer on top of this that would give a legacy streams mapping on top of Bob. Yeah, the idea was to do this in a way where streams could be implemented on top of it, so that we could then prototype this essentially also in Node, by switching out the internals to use this. And then once we were happy with how that works, we could make it public. One of the other nice things about this is that it's the same protocol everywhere. This is essentially just a protocol, and it works at both the C++ and JavaScript layers using the same model, so we don't have two completely different streams models operating at those two layers. I may have missed it because I just came in, but the question was: how does this relate to some existing and ongoing standardization efforts on streams right now? So, one of the other intents of this was to give a new low-level stream primitive upon which something like WHATWG streams could be built.
Right now, with the existing streams in Node, if we tried to do the WHATWG implementation on top of that, the performance would be rather bad, and the models don't quite fit. With this, it should be rather straightforward to build WHATWG streams on top; that part of the system would be fairly trivial. I guess then one question would be: does this need to be exposed? So if you put this in, and you built the existing streams on it, and then you built WHATWG streams on it, would you have to expose it? Does that make any sense? Yeah. We'd have to decide whether it makes sense to expose this. The idea is that we would potentially, eventually, expose it, because it would be much cheaper to use and it is much simpler than either. And it is much more performant than WHATWG streams are. WHATWG streams can be built on top of it without any real problem. WHATWG streams are push-based when you are in the stream, but the actual endpoints of consuming something or giving data somewhere are actually pull-based, so that interacts well with this. Also, it's a lot easier to build push-based streams on top of pull-based streams: all you do is have something that immediately grabs data and pushes it along. Whereas the opposite way, trying to build pull-based streams on top of push-based streams, you just have to buffer a ton of data all the time. This is probably just exposing my ignorance about how streams work, but if you don't have any buffering and it's pull-based entirely, what happens when the source has data and nothing is pulling? So, this is intended to reduce buffering as much as possible. When we do have a source that isn't flow-controlled, then it will have to buffer. Right now, streams require buffering at every layer; every readable has this buffering. The goal is to isolate the buffering to where it's absolutely needed. And fortunately, a lot of the newer wire protocols are now giving us options for flow control in the protocol.
With HTTP/2 and QUIC, for instance, if we're not reading, we can actually tell the sender to stop. Some of the other protocols will just have to buffer inside of that one layer. So it's up to whatever the implementation of the source is. Yeah, exactly. Something will always be buffering somewhere, whether in your program or in the operating system, but the goal is to try to avoid buffering at every step along the way. Another thing that's not in here that I mentioned is kind of the libuv layer of things. In my prototype work, this is also implemented in C++, and it should be implementable in C also. At the system level, a lot of the calls you're going to make are going to end up, one way or another, being like a request to the operating system, which maps a lot better to pull-based things than push-based streams do. libuv currently exposes push-based streams, but could potentially expose pull-based streams in a way that may make more sense. So that is kind of a thing that's open. And one thing that we want to touch on: the current streams API has both the readable and writable interfaces, and those are two very different things. With this, there's only one protocol that works for both. So the writable side, when implementing it, is a source: rather than pushing data into it, you're essentially just waiting for your destination to pull the data out. It would be a slightly different model from what we have now, with its pushing stuff in, but we will be able to support that writable side as well. So, any other questions in terms of the basic model? In terms of what's needed here, it's just more validation of the model. This is something I'm going to take a look at again, what Jeremiah has done. He's been working on this for two years and, you know, needs some help.
So if anybody is interested in pushing this forward... I would imagine the next significant step is actually landing support for this. The next step is getting some sort of conversion between this and streams; that is required to do the stuff that you just said. I just wondered, it's still a separate project. Is it time, or reasonable, to move that project under the Node.js org or something, so that it's more easily discoverable and other people can get involved? I'm not really convinced how much that matters currently. So yeah, the existing streams API is rather complicated; it requires somebody that has a good understanding of streams. But, you know, if you want to get involved in that, it is definitely something that we can do, and I would love to see this actually get in fairly quickly. It's something we can introduce, and we should be able to introduce it in a semver-minor. Yeah, I mean, I just want to add one thing. Once we have the transform layer, one thing that would be really, really nice to check is whether this mechanism is more performant than our current StreamBase approach. I mean, if you like StreamBase, or know what StreamBase even is... Basically, if we can switch from that model to this by providing better performance with the transform layer on streams, that would be very promising for maintainability overall. So yeah, that would be very, very good. Unfortunately, I have no time. I have seen this. [unintelligible] There's a bunch of example code in the socket that I was working on, the TCP socket that's implemented with this.
That transform is also from JavaScript, and I mean, that's a decent amount of code, especially because of how N-API works. But the overall pattern, once you've got across the barrier of needing to talk between JavaScript and C++, is pretty much the same, so there isn't some wildly different layer that's needed, like in the case of StreamBase. And a couple of other things I'm going to mention, since James pointed at some of them. There are other things I could probably list somewhere that could still use prototyping. Like, if you do want events with this, because for some reason you want to see when data is going through a pipe or something, I would more favorably do that by putting in a transform that emits events off of it, and you use that wherever possible, if you just want to listen to errors or something for some reason. Another thing: okay, this isn't promise-based, and it's not really callback-based either. It's kind of just objects that talk to each other, that call each other's methods. But consuming a pull-based thing with an async iterator also maps very well, because async iterators are essentially a pull mechanism as it is. On that note, the reason why it isn't async/await-based, or async-iterator-based, is because there isn't really a way to pass buffers up from the consumer to the thing that is giving you data, and that is the memory-copy reduction that we want. The one thing I will say about the current API, which is basically just pull and next: there is one limitation there that we have with QUIC, for instance, which is that we might also need an ack, because with QUIC we have to hold on to the buffered data until it's explicitly acknowledged, or we have to write it again. So as far as the API is concerned, we might need one additional bit on there. Benjamin also told me to look at Deno's streaming kind of stuff recently. The one thing it has that this doesn't really inherently have is seeking, although I guess
you could probably implement a thing in front of the rest of your pipeline that does some kind of continuing until you get to a certain point, or your source could just do that. So yeah. So on those lines, you know, just validation of the API, right? Just think through the model, how it works, how that would go, and if for some reason this current model just doesn't work for a particular case, we can surface that. Yeah, we can create issues that we can start having discussions on, so we can move this forward. Okay, any other questions, comments? You mentioned performance improvements. Do you have numbers on how much faster TCP gets with this implementation compared to master? If there are numbers about that, that you might recall. I don't quite remember, but I know that for doing file reads and stuff, even in bad cases you get like a 30% decrease in the amount of CPU that you're using, and for good cases it's several times. A bad case, I believe, was done with zlib transforms, and it still takes like 30% less CPU. And just doing a file copy, a streaming file copy, is a several-times decrease in the amount of CPU that's taken. With the current StreamBase model, I don't have the numbers, but when you write data to the stream on the JavaScript side and it passes that through to the C++, you can't write another chunk until that callback is invoked. And because of the way I have to buffer the data for QUIC, to get to the maximum throughput I essentially have to do a memcpy on every write, then call the callback right away so I don't have to wait for the data to be acknowledged, which would free it. This would completely eliminate the need for a memcpy at any point, all the way down to the actual write. Just that alone will be a significant performance improvement. So just looking at the model, we can see that there are some improvements.
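Earlier it was mentioned that consuming a pull-based source with an async iterator maps very well, since async iterators are themselves a pull mechanism. A hypothetical bridge might look like the sketch below. Note the trade-off also mentioned in the discussion: the for-await consumer cannot hand its own buffer up to the source, so this convenience layer gives up some of the memory-copy reduction. All names are illustrative.

```javascript
// Sketch: wrapping a pull-based source in an async iterator.

const source = {
  chunks: ['x', 'y'],
  bindSink (sink) { this.sink = sink; sink.bindSource(this) },
  pull (error, buffer) {
    if (this.chunks.length === 0) return this.sink.next('end', null, buffer, 0)
    const bytes = buffer.write(this.chunks.shift())
    this.sink.next('continue', null, buffer, bytes)
  }
}

async function * iterate (src) {
  let settle
  const sink = {
    bindSource (s) { this.source = s },
    // Each next() call from the source settles the promise the
    // iterator loop is currently awaiting.
    next (status, error, buffer, bytes) {
      if (status === 'error') settle({ error })
      else if (status === 'end') settle({ done: true })
      else settle({ value: buffer.toString('utf8', 0, bytes) })
    }
  }
  src.bindSink(sink)
  const buffer = Buffer.alloc(4) // allocated once, reused on every pull
  while (true) {
    const promise = new Promise((resolve) => { settle = resolve })
    sink.source.pull(null, buffer)
    const result = await promise
    if (result.error) throw result.error
    if (result.done) return
    yield result.value
  }
}

async function main () {
  const out = []
  for await (const chunk of iterate(source)) out.push(chunk)
  return out // resolves to ['x', 'y']
}
```

Each turn of the for-await loop triggers exactly one pull, so backpressure falls out of the iterator protocol for free.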
Thank you for dealing with my sluggishness. I blame James for dragging me into this. As far as I know, that's the agenda, so thank you. I would just like to do a quick wrap-up of the last few days. We have lost a bunch of people along the way; I don't know where they all ended up. It has been a fantastic experience, I think, for everybody, and I will open an issue to collect feedback on the repo so that we can improve for next time: things that have worked, things that didn't work, and things that we might do better next time. I've seen that a bunch of people would like to have name tags, which is a fantastic thing, because it means that more people that don't know each other are here. So I will put that on the list of things that we have to do for next time. Name tags, great, we're going to do that; that seems a very simple thing to do. Thank you. I also would like to thank Dori, who is not here, I don't know where, but thank you, Dori, for picking up a lot of work. And also Manil, who picked up a lot of work. You know, I was not alone; it was a big team working on this. Also [unintelligible] helped a lot, and Eva as well. You have seen Eva; she's been fantastic helping with the logistics, with the rooms and everything. It's been great. So I would also like to say thank you for coming. That's it.