Jake's camera. So it has, like... Open the door. Open the door. All right. So what I basically did is I wrote a blog post. Right. And it was interesting. Was it? Good. Excellent. Nobody else seemed to agree with that. Brilliant. So I'm going to shove it down everyone's throats by putting it into a 203 episode. Boom. Good tactic. Good tactic. Excellent. What are we talking about? Your article. Yes, we're talking once again — I'm talking about workers. Yay. I'm never going to stop. Brilliant. I've been doing some research, figuring out the rough edges with workers: what actually is the performance benefit and the performance impact? Where's the cost? Where's the gain? Yep. Because, at the very least, I want most apps to use workers to manage their state, because state, most of the time, is completely decoupled from the DOM or most main-thread APIs. It's just a state object. You can see that with Redux stores, that kind of stuff — there's very little in there that is DOM-specific. So why isn't Redux running in a worker, for example? Right. That would be one example. So let's talk about a fictional app — another to-do list app, because we don't have enough of those. Brilliant. Let's assume we put our state in a class, TodoState. It has all the methods you would want: you can add a to-do, you can toggle a to-do. You can even subscribe to changes, so you get a callback that gets called whenever something changes in the to-do list. toJSON basically turns this class into a small JSON-compatible object that you can serialize or send over the wire. And then we have an internal notify function that notifies all subscribers about the new state. Right. So if you pass a callback into subscribe, it's going to add it to a subscribers array, presumably. Right. Right. Okay. So what we're doing here is, on the other side, I add Comlink, because it makes things easier.
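The worker-side class described here might look roughly like this — a minimal sketch, where the exact names (TodoState, addTodo, toggleTodo) are assumptions rather than the code from the slides:

```javascript
// Sketch of the worker-side state class: plain methods, a subscriber list,
// and a toJSON() that produces a structurally-cloneable snapshot.
class TodoState {
  constructor() {
    this.todos = [];
    this.subscribers = [];
  }
  addTodo(title) {
    this.todos.push({ title, done: false });
    this._notify();
  }
  toggleTodo(index) {
    this.todos[index].done = !this.todos[index].done;
    this._notify();
  }
  subscribe(callback) {
    // Callbacks get invoked whenever something changes.
    this.subscribers.push(callback);
  }
  toJSON() {
    // A plain object (not a string) that can be sent over the wire.
    return { todos: this.todos };
  }
  _notify() {
    const snapshot = this.toJSON();
    for (const cb of this.subscribers) cb(snapshot);
  }
}

// In the worker, an instance would then be exposed via Comlink:
//   importScripts("https://unpkg.com/comlink/dist/umd/comlink.js");
//   Comlink.expose(new TodoState());
```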
So in this sense, we're just exposing an instance of TodoState, which means we can use it from other threads without having to worry about postMessage. Makes sense? All right. So that means on the main thread, it would look like this: we create a worker, we wrap it with Comlink, and now what we have is an instance of the class, even though it actually lives somewhere else. Comlink is magic. Woo! So what we can do is call subscribe on the to-do state and pass in our callback. This callback will be called every time the state changes. We could call render, which could be your React render or your lit-html or whatnot — doesn't really matter right now. Really inconsistent use of semicolons in this slide. I know. It's very much bothering me. I'm only missing one. One, two... Oh. Fine. I'm missing two. So, yeah, once they invented Prettier, like, half of my skills were ruined. Legit, same. I just didn't run it on this for some reason. I should have. Yep. You can have an "add task" button, add a click listener and, you know, call addTodo — because that's the benefit that Comlink gives you. Yeah. Okay. Gotcha. That's just so everybody's on the same page: roughly how I imagine people using a worker — literally just write a class that has your state and call the methods on it. And suddenly you have your logic off the main thread, which has loads of benefits, but I'm not going to go into them here; there are whole blog posts about them. We can link them in the description. More reads for me. All right. Yes. Cool. But I want to take a closer look at this notify function, because the fact that we're calling a function that lives on the main thread while we are in the worker — that's actually the core magic of Comlink. Right. And that is kind of comparable to this.
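The main-thread side might be wired up something like this. A sketch under assumptions: in real Comlink every call on the wrapped object returns a promise, and a callback crossing the thread boundary needs to be wrapped with `Comlink.proxy()`; the function and argument names here are illustrative, not the slide's exact code:

```javascript
// Sketch of the main-thread wiring. In the real app, `todos` would come from:
//   const todos = Comlink.wrap(new Worker("worker.js"));
// and the subscribe callback would be passed as Comlink.proxy(callback).
async function wireUp(todos, addTaskButton, render) {
  // Re-render whenever the worker-side state notifies us of a change.
  await todos.subscribe((state) => render(state));
  // UI events call straight into the (proxied) state object.
  addTaskButton.addEventListener("click", () => todos.addTodo("new task"));
}
```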
You know, there are more details happening under the hood, but really what Comlink is doing is just sending a message via postMessage to the other thread, saying: you know what, invoke this function with these arguments and let me know what the return value is. A lot of Comlink is just knowing, for the thing that called it, where to send the reply back to. Exactly. That's most of the core logic of Comlink. Cool. What most people worry about — for some reason they have this assumption that this is slow. Well, you're thread-hopping. There's the serialization. Something about serialization, but nobody really, I think, knows what they mean specifically when they say it. So I thought: I want to dig into this, I want to measure it, and I want to see at what point we're actually entering a problem zone. Okay. Okay. So I looked at the spec, and I kept looking at the spec, and I was like, this is a bit overwhelming. Yeah. It's the HTML spec. I can tell by the colors. Exactly. So I dug through it — I kind of get it now — but for the sake of this episode... I didn't have time to read that. ...I'm going to turn it into pseudocode. Oh. It captures the spirit. Right. Okay. So when you call postMessage, the parameter is the message that you want to send. The target is kind of implicit: if it's worker.postMessage, you're obviously sending it to the worker; in the worker, it's self.postMessage, which implicitly means the main thread that spawned you will receive the message. What happens is that the parameter — which I call data here, the message that you want to send — gets serialized with a function called StructuredSerialize. The function is not real, at least not in the sense that it's exposed to JavaScript, but it exists in the spec. So I guess we should say that your toJSON function that we used before was returning an object, not a string. Not a string, no — an object. Right. It could also be a string.
StructuredSerialize doesn't care either way. But it wasn't doing its own serialization. No, not really. It was just turning the class into a JSON-compatible object. Yeah. Right. I see. So StructuredSerialize will turn this into a serialized format. That serialized format is not specced; it's JavaScript-engine internal. Right. Gotcha. Okay. But it basically just means it is some form of binary representation that is not a JavaScript object in its current state, but encapsulates all the keys and values and all the things that the object could contain. The next step is to queue a task in the target realm. The target realm is basically the thread that will receive the message. And we're now putting something into that task queue that will run some code once the task gets scheduled — and we don't necessarily know when that is going to be. So our code could currently be running in the HTML document, but the message needs to be handled on the worker thread, and this is how it gets onto that thread. Exactly. Right. And then the first step of the task is to turn that serialized data back into a new instance of the data. So this is effectively a deep copy of our original message object. Yes. This is kind of important, because in JavaScript you can't share objects. One of the basic assumptions of JavaScript is that everything is single-threaded, so there is no synchronized or parallel access to the same data object or the same memory in JavaScript. So we can't just send the same object over; it has to be a copy. Right. And these two steps are how that is achieved. Gotcha. And this is different to JSON serialization because it supports more formats, right? It handles stuff like cyclic data structures. Yes. It can do Blobs, Maps, Sets, ArrayBuffers — all these things that JSON cannot do. Yes. And then the last step here is basically dispatching the actual message event.
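The steps just described can be sketched as runnable, JS-flavored pseudocode. StructuredSerialize and StructuredDeserialize are spec-internal, so this sketch fakes the pair with the real `structuredClone()` (available in Node 17+ and modern browsers) and fakes the task queue with a plain array — it illustrates the sequence of steps, not the actual spec algorithm:

```javascript
// JS-flavored pseudocode of what happens on worker.postMessage(data).
function fakePostMessage(data, targetRealm) {
  // Step 1: serialize in the *sending* realm (this blocks the sender).
  // structuredClone() stands in for serialize + deserialize in one go.
  const serialized = structuredClone(data);
  // Step 2: queue a task in the *target* realm.
  targetRealm.taskQueue.push(() => {
    // Steps 3 & 4: deserialization yields a deep copy owned by the
    // receiver, which is then dispatched as a "message" event.
    targetRealm.onmessage({ data: serialized });
  });
}
```

Note how nothing is delivered until the target realm actually runs the queued task — which is exactly why you don't know when the message arrives.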
So at that point, your message event handler will get called and will have the data on the event object, which you can just access — and now you own it and it's in your realm. And all is good. Cool. Brilliant. Yes. Right. So this is how postMessage works. StructuredSerialize and StructuredDeserialize are the functions that are most likely the expensive ones. Yes, of course. But even more importantly — and something I didn't realize before I wrote my article — StructuredSerialize is a function that will block the sending realm, while StructuredDeserialize will block the receiving realm. And I guess, okay, for the serialize, that makes sense, because it doesn't want to be doing that work while you're editing the object. Right. Okay. Even with deserialize it makes sense, because it is using the objects of your realm, like the arrays and whatnot, so it can't really run those necessarily while your code is still running. Right. So if you changed, like, I don't know, the prototype or something halfway through, you wouldn't expect that to be reflected. Exactly. But could this be done without calling stuff on the global, like the Array constructor? Not sure. I feel like that could be... I think I've heard they have considered doing that, but as of now, at least, it will block the receiving realm. Gotcha. So that's actually interesting, because it means that so far I've always been measuring from the moment I start sending an object to the moment I receive it. But the number I get out is actually two parts: one part is the serializing and one part is the deserializing, and they happen in different realms. And so I would like to measure them separately, but I haven't found a good way to do that. So I'm still measuring it end to end in the benchmark we're going to talk about. But it's just something to keep in mind for the numbers we are going to talk about while I do the dreaded micro-benchmark. Oh, no. Okay.
And these numbers will represent the sum of serialization and deserialization, so the actual cost on each thread will be something lower than that number. So how are you measuring that? Basically, you make a marker just before you call postMessage, and another at the other end? More or less. Okay. Yeah. When you've got the message. Pretty much — that's what I'm measuring. Right. I decided to do that because I found a couple of ways to maybe measure just deserialize in isolation; I found no way to measure just serialize in isolation. And I could only use those tricks in Chrome and Safari, so I wouldn't have been able to do Firefox. So I thought the end-to-end test gives me an upper bound. Wouldn't timing either side of postMessage give you the serialize time? Not sure. Okay. Maybe. It might measure other stuff as well. Yeah. Okay, okay. So I'd rather do end to end and say: this is the upper bound — it will definitely be less expensive than this. So it gives you something, a worst case to reason about. Which I think is probably better if we're talking about resilience and similar issues. The first thing that I wanted to find out is not necessarily hard numbers, but just to figure out what shape of message makes postMessage slow. Is it just a really complex object with lots of nested objects, or can it also be a very simple object with very big strings as values? Right. And so what I did is I basically wrote a function that generates very different objects: sometimes very flat ones with long keys and values, between like two to four kilobytes, and sometimes very complex graphs of objects with just very short keys and short values — something along those lines. It feels like deep would be more effort, because it's going to spin around these serialize functions more. The way I think about it, both functions will have to somehow traverse the entire object either way.
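One way to sketch the end-to-end measurement (not necessarily the exact harness from the blog post) is a round trip: stamp the clock just before `postMessage` and stop it when the echo comes back. This assumes an echo worker along the lines of `self.onmessage = (e) => self.postMessage(e.data)`, and the result includes serialize plus deserialize in each direction plus scheduling — an upper bound, not a per-thread cost:

```javascript
// Round-trip timing sketch: measures how long it takes for a payload to go
// to the worker and come back. The worker is assumed to echo messages.
function measureRoundTrip(worker, payload) {
  return new Promise((resolve) => {
    const start = performance.now();
    worker.onmessage = () => resolve(performance.now() - start);
    worker.postMessage(payload);
  });
}
```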
So my hunch was, as well, that a simple object with longer values would be faster to copy than a complex object with simple values. It turns out it's actually fairly linear with the serialized payload size. So if you have an object like — hang on, what's going on here? All right, I'm about to explain that. So if you basically JSON.stringify your payload and look at the length of that string... Yes. ...as a size, that's a very, very strong indicator of how long it's going to take. Really? So even just, like, a thing with one massive string? Yes. Even long strings take a long time to copy, it seems. Or, the other way around: it's just as fast to copy a complex object as a simple object of the same size. So hang on — while we're talking... You got it. Keep in mind both scales are logarithmic, because otherwise... Of course they are. So this correlation kind of holds mathematically, but only really for objects above 10 kilobytes, because if you look at it, you can see it's curving inwards, and there are a couple of outliers. And that's for a couple of reasons. One is that we still have reduced-precision timers due to Spectre and Meltdown. Browsers even add some jitter to the timings to make them less useful for the high-precision timing attacks those exploits involved. Of course. But also, at the lower end there are just some weird fluctuations, and static overhead skews the numbers more. Gotcha. So this was run on my MacBook Pro, where I ended up with about 5 microseconds per kilobyte. This number is not really useful — it will be different on any other device, so it's not worth measuring. Because you're targeting a low-end phone, and you're not on one. So I definitely don't measure these numbers and make decisions off them; I just found it interesting to see which kind of scale we're talking about. Gotcha. So that's what I did. So now I knew that, basically, the shape doesn't matter — the stringified size is a good indicator of how long it takes. All right.
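The payload generator described a moment ago might look roughly like this — a sketch where the function and parameter names are assumptions, not the code from the blog post:

```javascript
// Builds an object tree with a given breadth (keys per node) and depth,
// with string leaves of a given size — so you can vary "many small values"
// against "few big values" and compare transfer times.
function generatePayload(breadth, depth, valueSize = 16) {
  if (depth === 0) return "x".repeat(valueSize);
  const node = {};
  for (let i = 0; i < breadth; i++) {
    node["key" + i] = generatePayload(breadth, depth - 1, valueSize);
  }
  return node;
}

// The stringified length turned out to be the interesting metric:
const payload = generatePayload(3, 2);
const size = JSON.stringify(payload).length;
```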
I just wanted to use that across a couple of devices and see at what point we run into trouble in terms of RAIL budgets. Okay. So we're talking about when it starts getting over 16 milliseconds, for example. Yeah. That's basically what I did. I started on my MacBook in Chrome, did a thousand runs for each configuration here, and basically wanted to see at what point we run into RAIL budget problems. So what we're looking for is numbers around or bigger than 16 milliseconds. So if we look at this over here, the green area — which means anything between 100 kilobytes and one megabyte of payload size — that's where, on a MacBook Pro in Chrome, we are in the trouble zone. Everything lower than that, we're absolutely fine. When you say payload breadth, what are the units here? It's not just, like, three objects deep? I mean, it's basically: every node in the breadth has three values, or four values, or five values, and the depth is how deep we go — so, how complex the tree is that I'm building here. I have more examples in my blog post, but basically, because we know now that object size is a good indicator, we just look at the colors, really. So hang on — six, six. Are we talking 36? Six to the six. Six to the six? Of course. And that's a lot, a lot of data. Got you. Right, okay, I'm following. So, yeah, we can see if we send 10 megabytes, we're blocking for 74 milliseconds, which is quite a lot, and if we have animations running, that will be too much. 47. 47 milliseconds. Is it 74, mate? German. Is that how numbers work in Germany? Yes. Amazing. I did not know. So 32 in German is two-and-thirty. It still, like, after four years here, still screws me up. Yeah, three-and-twenty, isn't it? Like, for 23. I don't know why. Everything else in German is logical, mostly.
So it takes seven-and-forty milliseconds. Yes, exactly — 47. But anything below 100 kilobytes, we will be absolutely fine; we won't block the main thread long enough to make our animations jank. However — and this is the most important bit — this is on a MacBook Pro. This is not representative of the average device, especially. So you're going to run this on a phone, right? Right, so I ran it on a Nokia 2, which is actually pretty representative of the 50th-percentile device across the world. Yeah, so it's pretty much in the middle, despite the hardware being stuck in, like, 2014, roughly speaking. But yeah, it performs worse — most of the numbers got bigger. Yeah, so if we now look for numbers around the 16-millisecond mark, we can see that, you know what, to be safe, we can't go much bigger than 10 kilobytes, right? Because up here we have a 12, which might be a bit of a problem. But if we look at the transition between blue and turquoise — yeah, we can't go much bigger than 10 kilobytes without risking our RAIL budget. But the RAIL boundary hasn't shifted that much, it feels like. Not that much. Like, this number has got bigger. Right, but if you send 10 megabytes, you're bonkers. Yes, it's going to take a long time — over half a second. But generally, it's still okay. And that actually made me really happy, because, I mean, if you have animations running you're limited to 10 kilobytes, but actually 10 kilobytes is quite a lot. You can put a lot of stuff into 10 kilobytes. Yeah, if you're just shifting around booleans and numbers. That being said, do you remember this game that we built? Oh, my word. It's PROXX. Yeah. So this is the maximum possible field — it doesn't even actually fit on the screen. It's 40 by 40; that's the biggest field we currently allow. Yes. So that's 1,600 cells. Right, yes. And each of these cells has a couple of flags to store.
And these are actually the flags that we have in the code. Yes, of course. And that is basically our game state: we have this 40-by-40 two-dimensional array, and each cell has this data stored in it, so we know what the game field looks like. Now, it turns out that, JSON.stringify'd, that actually adds up to about 130 kilobytes of JSON. So we are way too big to send the entire state over. Yes. And I would say that we're being somewhat — well, not lazy, but... Yeah, we could just have one array buffer, for example, or something. Yeah, it could be an ArrayBuffer. Because, yeah — well, how many bits of information do we need here? I mean, maybe 32 would be fine. I mean, because the touching-mines flag only goes up to eight. Yeah, which is three bits. Which is three bits. And then you've got... nine bits. We need nine bits. Nine bits. Right, okay. So this could just be an ArrayBuffer. It doesn't need to be — or one number. It doesn't need to be two-dimensional, because as long as we know the width... Yeah, yeah. That being said — at the start, we were just saying: whenever somebody taps something, we have to send the updated state to the main thread so we can re-render. Turns out that took too long on the low-end phones. Like, we couldn't do that. And we saw in the performance panel in DevTools that it was taking too long. So now we do something else. This is, again, a game field, but we said that, instead of sending the entire state, we are sending only the fields that changed. Yes. So we basically did a diff. And that's kind of cool, because it means the amount that we have to send is now not proportional to the game state anymore, but only to the amount of changes that we make to the game state. Yes. Now, even that, in some situations — especially on the first tap — could add up to a lot of data, because you can end up with a big reveal at the start, right?
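The diff idea can be sketched like this. The cell fields (`revealed`, `flagged`) are assumptions about the shape of the state, not PROXX's actual code:

```javascript
// Instead of posting the whole 40×40 grid after every tap, collect only
// the cells whose state changed since the last render.
function diffGrid(oldGrid, newGrid) {
  const changes = [];
  for (let y = 0; y < newGrid.length; y++) {
    for (let x = 0; x < newGrid[y].length; x++) {
      const before = oldGrid[y][x];
      const after = newGrid[y][x];
      if (before.revealed !== after.revealed || before.flagged !== after.flagged) {
        // Coordinates plus the new cell state is all the main thread needs.
        changes.push({ x, y, cell: after });
      }
    }
  }
  return changes;
}
```

The message size is now proportional to the number of changed cells, not the board size.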
So that could add up to, in theory, something around 70 kilobytes, if we assume that, like, 80% of the field gets revealed — which is unlikely, but, you know, assume the worst case here. Right, right. So we did another thing, and that's actually the thing I want to talk about, because I find it really smart. We've gotten to the thing you want to talk about! Mate, we have been recording for quite a while. And now we get to the meat of the episode. So, when somebody taps a field, our game logic basically traverses the game field and figures out which fields need to change state. And we record those changes, because we want to send over the changes. But whenever we have found 10 changes or more... Why 10? We just, like... why not? Pretty much — it's a number we pulled out of thin air. ...we send those 10 changes immediately, so the main thread can start rendering and doing stuff while the worker keeps going and keeps traversing. Because it's not just the serialize and deserialize cost; it's the cost of crawling the grid as well, right? Exactly. And then we realized that the way we wrote the algorithm to traverse the field actually looks really nice. So on slow phones — I have a recording here — it looks like a really nice reveal animation. On faster devices this will be instant; the animation is disabled, so it will just be there immediately. But on slower devices, you get a nice animation. And this is how we made this huge state object actually work on the lowest of low-end devices. I was quite proud of us for doing that. Well, to the point where, when we saw this happening, we made it happen on desktop as well — just artificially, with an animation. Yeah, we artificially did it on desktop to get this animation. And this was the second pass of the algorithm, because the first time we were doing depth-first traversal and ran into stack problems, and so we switched to being queue-based.
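The batching trick described above can be sketched as follows. The batch size of 10 is from the episode; the function name and the `send` callback (standing in for postMessage) are assumed shapes for illustration:

```javascript
// While the traversal finds cells to change, flush every 10 changes so the
// main thread can start rendering while the worker keeps crawling the grid.
function streamChanges(changeIterator, send, batchSize = 10) {
  let batch = [];
  for (const change of changeIterator) {
    batch.push(change);
    if (batch.length >= batchSize) {
      send(batch); // partial result: the main thread renders these right away
      batch = [];
    }
  }
  if (batch.length > 0) send(batch); // flush whatever is left at the end
}
```

In the worker, `send` would be `postMessage` (or a Comlink-proxied callback), so rendering overlaps with the remaining traversal instead of waiting for it.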
And that's when we ended up with that. The point is that for managing your state in a worker, there are so many little tricks you can do without bending over backwards that I think there is no good excuse not to put your state in a worker. It helps you so much on these low-end devices. We actually did a test where we ran PROXX all on the main thread, and it performed horribly. So we chose to use a worker right from the start — especially because we have animation happening all the time, right? It's difficult to see, probably, from this angle, because this is the non-animated view, but usually these little squares have got an animation — a rotate. Like, we actually have WebGL running. We need all the budget on the main thread we can get for the WebGL. So everything else, as much as we could, we moved somewhere else, and we're quite happy we did that. So I think postMessage has a cost, but not to the point where it makes off-main-thread completely unviable. So I'm hoping that with this, people will be kind of inspired to try it out and give off-main-thread a shot for managing their state. Yeah. We have some jitter that is artificially added. And of course we have... Artificially added. Artificially added jitter. Sorry. Say "it's artificially added" again. Because I interrupted, and also you said "artificially added". Did I? Pretty sure you did. I didn't even notice.