So first of all, thank you for coming. I've been seeing a lot of old faces as well as new faces. Today I will be telling you several stories. No demo, so stay comfortable. And yes, that was me eight years ago; hopefully I still look the same.

I work for PocketMesh. We are one of the biggest mobile DSPs, and what we do is programmatic mobile advertising. Let me give you a little background on what programmatic mobile advertising actually is; the standard industry term is OpenRTB, real-time bidding. I believe all of you have, in some way, used an application that shows an advertisement banner. That's the area we work in. Whenever you start such an application, the application sends a request to what we call the exchange, which you can think of as an auction house. The exchange then propagates the request to several different DSPs (demand-side platforms), and we are one of the DSPs participating in this auction. We have to answer one single question: do we want this opportunity to show an advertisement or not? And if we do, how much money do we want to pay for it? The price itself is not fixed; it depends on a lot of factors, so we need to do some computation. After that, the price, together with the creative itself, is sent back to the exchange. Now we are participating in the auction. The exchange picks the highest-paying bid and eventually delivers the winning ad into the application. So that's more or less how RTB works.

Audience: What is the lag of this whole thing?

Yes, feel free to ask me any question; I'll try to answer as far as I know. Good question, and we'll cover it here. So, the challenge. First of all, this whole thing looks pretty straightforward, like we're not doing very much. But once you talk about the numbers, it becomes a little bit tricky. From the exchange to our servers and back, the whole round trip has to respect a 100-millisecond deadline. Beyond that, the exchange simply cuts off the connection and treats us as if we are not bidding. And in terms of throughput, we get bombarded with about 60 billion requests a day. If you do a little translation, that's roughly 700,000 requests a second. On top of that, we have thousands of customers and thousands of creatives. So we have to make a very quick decision, and the throughput is huge, and that poses a unique challenge to our server infrastructure.

Let's start with the legacy setup. By the way, all our servers are on AWS; we don't host our own dedicated servers. Back in the old days, all the exchange traffic first hit an Elastic Load Balancer, which balanced the load across the different bidders, the machines where we do the bidding itself. The load balancer operates at the transport layer, not the application layer, so it does nothing but shuffle bytes. On each instance we had Apache httpd on the same machine as JBoss. The reason for picking Apache was SSL termination: doing SSL termination in JBoss is less robust and less efficient.
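Before we get into the stories, let me make the shape of the problem concrete. This is roughly what a bid endpoint has to do per request; a minimal sketch with hypothetical names, not our actual code:

```java
// Minimal sketch of a bid endpoint (hypothetical names, not PocketMesh's
// actual code). The contract: for every bid request, answer either
// "no bid" (HTTP 204) or a price, well inside the 100 ms round trip.
import java.io.IOException;
import java.io.InputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BidServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        BidRequest bid = BidRequest.parse(req.getInputStream());
        double price = priceFor(bid); // the interesting computation
        if (price <= 0) {
            resp.setStatus(HttpServletResponse.SC_NO_CONTENT); // no bid
            return;
        }
        resp.setContentType("application/json");
        // Simplified: a real OpenRTB response carries far more than a price.
        resp.getWriter().write("{\"price\": " + price + "}");
    }

    private double priceFor(BidRequest bid) {
        return 0.0; // stub: pricing depends on many factors
    }

    // Stand-in for a real OpenRTB request parser.
    static final class BidRequest {
        static BidRequest parse(InputStream in) { return new BidRequest(); }
    }
}
```

Everything in the stories that follow is about making that round trip fast enough, roughly 700,000 times a second.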
So that's why we had Apache in front, and that's the reason we picked this setup. From Apache into JBoss it's the AJP protocol. Pretty standard setup. And along the way, a lot of things started to bother us. Obviously, if you look at this diagram, since these two components are hosted on a single EC2 instance, we basically have to parse the request twice: first at the Apache layer, and then, once the request is propagated to JBoss, we have to process it one more time. That makes the processing less efficient. We are burning more CPU cycles, of course, and Apache and JBoss are also competing for the same CPU cycles. And on top of that, it adds a noticeable overhead to the latency itself. Remember, we have that 100-millisecond deadline to respect, and that includes the network round trip, so the time we really have for processing each request is not that much. That gave us some room to think.

I think it all started with Java 7 reaching end of life. JBoss was on Java 7, so we said: since we are going to migrate to Java 8 anyway, why not do an experiment and pick a lighter-weight server? That's the reason we picked Tomcat. And by doing this alone, suddenly our 99th-percentile latency dropped dramatically. That's basically a pure win: we have one less component to care about, Apache, and in terms of latency it's a pure win for us. By the way, I'm only talking about 99th-percentile latency here; the average was still very, very fast, way faster than this. But always aim for the slowest.

The setup for Tomcat is actually very straightforward. We picked APR, the Apache Portable Runtime, as our means to terminate SSL. And since the server holds a lot of TCP connections at any given time, we also set a relatively high maxConnections. That, plus a pretty standard SSL connector configuration. The rest we just took as it is, no heavy optimization, no heavy tuning, just Tomcat. And we got the latency improvement. So everybody was like, yeah, let's go party; everything looks fine.

This taught us a lesson. If you are using Apache as a load balancer, it probably starts to make sense. But if you just use Apache as a front end for your Tomcat, you probably don't need it. If we can process 65 billion requests a day using Tomcat alone, I think you can do that too.

Let's continue with the latency story. The easiest way for us to implement a timeout mechanism that respects the 100-millisecond deadline was to throw the request into an executor service and wait for the executor to finish. For those of you who are not so familiar with executor services, here's a very brief introduction. First of all, we have a set of servlet threads, which handle the real HTTP requests. A servlet thread encapsulates the request into a task, which is basically the processing pipeline, and submits it into the executor's queue. The executor itself has a set of worker threads, which pick up tasks from the queue and carry out the actual computation.
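In code, the pattern we started with looks roughly like this. A minimal sketch, assuming a fixed pool and an 80 ms internal budget; not our production code:

```java
// Sketch of the executor-based timeout. The servlet thread submits the
// bid computation and waits at most 80 ms, leaving a margin inside the
// exchange's 100 ms round-trip deadline.
import java.util.concurrent.*;

public class BidExecutor {
    // The pool size doubles as a throttle on how much load reaches the
    // backends (databases, external lookups, and so on).
    private final ExecutorService workers = Executors.newFixedThreadPool(64);

    public String bid(Callable<String> task) {
        Future<String> future = workers.submit(task);
        try {
            return future.get(80, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // the auction is gone; stop wasting cycles
            return null;         // caller replies with a no-bid
        } catch (InterruptedException | ExecutionException e) {
            return null;
        }
    }
}
```

It's simple and it's correct, but as you'll see in a moment, every submit costs context switches.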
This gave us a nice side effect, too: here we can also specify how many worker threads are alive. That means we have a built-in way to throttle how much load we put onto the backend, say the database, or whatever infrastructure has to support these activities. So again, the task propagates to a worker, and the worker carries out the action itself. Occasionally, occasionally, one of the requests runs relatively long; not seconds, but maybe 200 or 300 milliseconds. By that time we don't want it anymore, because from the auction's point of view, the auction is gone. So we just reply with a no-bid. We are not going to buy that.

So that's the executor part. Now let's talk about the task itself. The task, OK, sorry, I skipped ahead too fast. The task itself is just an abstraction, an encapsulation of the computation, and the computation is a pipeline with several stages. Several stages are in-memory only, so those operations are very, very fast; those are the blue stages. You basically look into RAM, check some parameters, and if they don't match, you say no bid. If a stage passes, you move on to the next stage, and as you go deeper and deeper into the pipeline, it gets heavier and heavier. Eventually we do some external lookups. So we want to optimize in such a way that if, at some stage, we can already detect that we have gone past the timeout deadline, there's no need to carry on with the operation anymore; we just reply. So the pipeline has a built-in way to cut off the request early.

And here's what we actually learned from the Apache story. Because we removed one process from the operating system, there are far fewer context switches: Apache no longer has to wait for Tomcat to produce the HTTP response before it sends the reply back to the wild internet. So we learned that by cutting down the number of context switches, we win on latency. Now, looking at it this way, the executor itself is just another queue built into the JVM, and every time we use it, we pay multiple context switches. The calling thread, normally the servlet thread, has to wait after it submits the task. Then the executor wakes up a worker thread (remember the slide I showed?). And once the worker thread finishes, it signals back to the original servlet thread. That's several context switches which, as the Apache story taught us, stand in the way of reducing latency.

So what we really did is: why not just remove the executor and call the task directly? Since the pipeline already has a built-in mechanism to terminate the request early, there's no big risk of carrying the request all the way to the external resources if we have already passed the deadline somewhere up the pipeline. And then, yes, you have already seen this slide: our 99th percentile now goes even faster, to within 10 milliseconds. Now it's single-digit-millisecond latency, which is a pure win for us. Of course, we can't really do much about the network stack, because that's what AWS provides; we take it as a given.
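Concretely, the direct-call pattern with a deadline check between stages looks something like this. Again, just a sketch with illustrative names:

```java
// Sketch of the direct-call replacement: no executor, no hand-offs.
// The servlet thread runs the pipeline itself and checks the deadline
// between stages, bailing out before the expensive external lookups.
public class BidPipeline {
    interface Stage { boolean apply(BidContext ctx); } // false means no bid

    // Ordered from cheap in-memory checks to expensive external lookups.
    private final Stage[] stages;

    BidPipeline(Stage... stages) { this.stages = stages; }

    public boolean run(BidContext ctx, long deadlineNanos) {
        for (Stage stage : stages) {
            if (System.nanoTime() >= deadlineNanos) {
                return false; // past the budget: the auction is gone, no bid
            }
            if (!stage.apply(ctx)) {
                return false; // this stage filtered the request out
            }
        }
        return true; // all stages passed: place a bid
    }

    static final class BidContext { /* request fields, lookup results */ }
}
```

The design point is simply that the thread that parsed the request also runs the pipeline, so nothing sits in a queue and no other thread has to be woken up.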
What we can really do, though, is optimize the internal latencies. So again, a pure win. No additional overhead, and actually less code, too. So far, any questions? Am I going too fast, or too slow? Questions? Yeah.

Audience: Why don't you terminate SSL at the ELB itself?

If you terminate SSL at the ELB, you have to go for application-layer load balancing. In that mode, ELB doesn't maintain steady backend TCP connections; it has to set up a new connection for each and every request it receives. That means we would do nothing but set up a TCP connection and tear it down, set up another TCP connection and tear it down. We did some benchmarking, and that doesn't work for us. So the only solution left to us is transport-layer load balancing. Any other questions? No? Good, you guys are very smart.

So let's carry on to the next story. Along the way, we identified a very, very strange pattern, which you can probably see here. This is the server CPU. We have many, many servers, and occasionally one of them goes haywire: 100% CPU, shooting through the roof. And what we really observed is that during this period, if you happened to look into the server log, everything looked fine. No excessive exceptions, basically nothing. Only the server hitting 100% CPU. And then it auto-heals itself. We can phrase that as the best kind of bug a developer could want, the bug that heals itself automatically. The only question for us was: what is happening here? What is happening here? We checked all these things. No clue. And it auto-heals itself. It just didn't make sense.

Until one day, until one day, we ran the command perf top. For those of you who don't know, perf is a profiling tool for Linux. It's not tied to the JVM; you can also profile native applications. I mean, we were in a state where we had no clue, so we might as well just run the command and see. And what we really saw in the output of perf top was one very distinct line item pointing to the garbage collector in the JDK native code. OK, that started to give us a clue: something is going on with the GC on that machine. Then one of our developers, James, who is the handsome boy here, said: oh man, what if we attach a profiler to that machine and see what the JVM is actually doing? Attaching the profiler itself didn't solve the problem, but it had a good side effect: attaching it kicked off a full GC, and this full GC was actually the clue to solving the problem. Once the full GC kicked in, you could see the server's fever go away. So now we had more information.

What it eventually turned out to be was the server setup we had been running with. We had only set the maximum heap size; we did not specify the distribution between the young generation and the old generation. For those who are not so familiar with the JVM memory layout: most of your allocation is actually done in the young generation, and when GC kicks in, there are different kinds of GC. A young-generation GC, a minor GC, only works on the young generation; it doesn't touch the old one. The reason is that there is research showing that something like 80% of your allocations die very, very young, within a short period of time.
They become eligible for garbage collection almost immediately. But there is also a lot of data that lives quite a long time, so the JVM just pushes that data into the old generation, forgets about it, and postpones its collection as long as possible. What really happened for us is this: since we did not specify the distribution of these two partitions of the heap, the JVM was free to resize the young generation and the old generation. If you happen to read any of the performance tuning guides, they always tell you: do not specify the distribution between the young and old generations; the JVM has a lot of built-in heuristics to adjust the split by itself, so that it converges to an optimal setup. However, our use case is very, very unique, I would say. What really happened is that the old generation kept on growing and the young generation kept on shrinking. And the consequence: if the young generation is very, very small and you are doing a lot of allocation, the JVM has no choice but to run a lot of minor GCs, young-generation collections. This is what was actually driving the servers to 100% CPU usage.

So now we knew the problem, and the solution was, again, very straightforward. First of all, we explicitly gave the JVM a flag telling it: do not auto-resize the young generation and the old generation. Which, again, goes against all the guides, but you probably have no other way to go when the server is on fire, production is on fire, and the project manager is beside you saying: man, can you fix the problem? Yes, you have to fix the problem. The other side effect we discovered: since attaching the profiler had triggered a full GC, if we could send a full-GC command to the JVM ourselves, clearing out all of the old generation, it would cure the server fever. By the way, "server fever" is the cute name we gave to this state where the server goes to 100% CPU usage but isn't really dying; it's just not really that healthy either. So we decided to call the cure for this thing a Panadol. It turns out there's a very neat tool, jcmd, which can talk to any JVM and run additional diagnostic operations. You can check the manual or run jcmd's built-in help to see all the options; one of them, GC.run, triggers a full GC (see the sketch below). The other thing that made this workable: our main traffic comes from the US, which gives us a daily traffic curve shaped like a sine wave. There is a peak and there is a trough. So that gave us the perfect opportunity to kick in the panadol.sh script during the trough. We still get hit by the GC a little bit; when you do a full collection, obviously you can't process anything, so a few requests time out. But at least it fights off the server fever for us. And this is the graph after we applied the Panadol; that's the point where we applied it. The server was starting to run a fever, say 38 degrees, not as high as 40, and we just gave it the Panadol. You can also see a trend along the day (this is a daily graph): the servers' CPU usage starts to converge.
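A hedged sketch of the sort of flags and commands involved; the exact values here are illustrative, not our real settings:

```sh
# 1. Pin the young generation so the JVM stops resizing it adaptively
#    (an explicit -Xmn disables the adaptive young-generation sizing):
java -Xmx28g -Xmn8g ... -jar bidder.jar

# 2. The daily "Panadol": ask the JVM for a full GC during the traffic trough.
jcmd <pid> help      # lists every diagnostic command the JVM accepts
jcmd <pid> GC.run    # triggers a full GC (the GC.run option mentioned above)
```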
Why the daily drift? It's the build-up effect: the old generation getting larger and larger, the young generation getting smaller. But once we apply the Panadol, things get much better; the servers start to converge again. So at this point we had more or less solved the problem. But to do it, we explicitly went against all the conventional wisdom that says do not specify the young generation size, and we had this pretty ugly thing where we have to apply the Panadol daily. Yeah, not elegant, but yeah. How are we doing on time? Good. So, any questions?

Audience: When you do the full GC, when you trigger that, it stops the world; don't you lose lots of requests?

Yes, yes. That's the trade-off we have to make. Because otherwise, when the server runs a fever, we'd be losing more bids and more money. So we might as well lose a little money rather than big money.

Audience: Does garbage collection eat up all your CPU resources even if you give it more heap?

Yes, because the young generation can keep on shrinking, keep on shrinking, until it gets to a point where almost every request that comes in forces a minor GC, and that is the main thing driving the CPU usage up so high.

Audience: Did you consider the GC from Azul?

Right now we lean towards open source, so we don't really want to pay for a commercial offering for now. That could probably be a good solution for us, though.

Audience: But then doesn't the minor GC take longer, or a lot longer?

Because we are using G1 GC, a minor GC doesn't necessarily collect the entire young generation. Depending on how much time it has, it collects as much as possible, but it doesn't have to collect every region. So that gives us a little bit of flexibility.

Audience: Is there any memory leak in the application?

I would say no. I would say no, because after we applied the panadol.sh, the heap went back to a level similar to the previous day's. A leak could be a possibility, but we have more or less ruled it out.

Audience: In the server chart, only one server's CPU goes high; the other servers don't. Is something missing there?

Sorry? Ah, in the chart, only one of the servers is high. During that server fever, the throughput of that particular server is comparable to the rest of the servers. It's not processing less data or more data; it processes more or less the same. It's only using more CPU.

Audience: It's not always the same server, right?

Yeah, it's random. We couldn't find any strong correlation for which server gets the fever, or when. Okay? Okay, let's move on.

As we approached the end of the year, our traffic started to grow. Early last year we started with 30 billion requests a day; by this time it had grown to 40 billion. And as you know, any time something starts to grow, some new problem will arise. So, okay, this is the problem. This graph comes from the exchange, so it's the external view: how is PocketMesh performing? The blue line shows how much traffic the exchange sent to us. The green line, on the other hand, shows the percentage of throttling; this gridline is 50%.
Okay, let me give you a very brief explanation of what throttling is. The exchanges mostly build their infrastructure to be very smart. Obviously, there's no way for them to know how PocketMesh is doing internally, what our CPU usage is. All we expose to them is a black box: just send me the bid request via HTTP. So they build something heuristic. First of all, they keep sending as many requests as possible to a DSP, so to PocketMesh. And if, for some reason, they observe some slowness in the form of timeouts (in this example, 100 requests sent to us and only 50 responses come back in time), that gives them the idea: huh, these guys are probably having problems. Let's not send that fast, let's not send that many; let's send a little bit less. So the next round, they send less. That, basically, is throttling. And for us this is a really bad situation, because, looking at this graph, we are losing a large share of the requests on which we could have made a purchase, made some money. These are missed opportunities. So obviously, throttling is bad, okay? We want the green line to always stay as close to zero as possible.

The other observation: since we use ELB, ELB has built-in metrics as well. This is something I export from ELB into New Relic. Here you can see the unique TCP connections ELB saw during that period of time. It kept going up, and so did the spillover. Spillover refers to the case where ELB has already received a request from the outside world but hasn't been able to make an internal connection to our bidders; after waiting too long, it says, okay, I'll just drop the TCP connection. As you can see, there's a steady upward trend. So again, this started to bother us: more and more TCP connections being established with the ELB and with the backend servers, and more and more spillovers.

One observation: because we had already configured ELB to keep connections alive as long as possible, we ruled out the possibility that ELB was terminating the TCP connections from the exchange early. So the only possibility left was that Tomcat itself was doing something wrong, cutting off connections way too early when it was supposed to keep them alive as long as possible. So we said, let's just revise the keep-alive timeout. Remember I told you earlier that we had taken most of the parameters at their defaults? We didn't change anything. So this was another parameter we changed: we set it to five minutes for Tomcat. We want the connection to stay alive for five minutes so that it isn't torn down that early. But this made no difference. So again: okay, what now? Then the conventional wisdom kicked in again: when in doubt, read the docs. Okay, so let's go back to the Tomcat documentation and see what else we can do. It turns out, it turns out, there is not only a keep-alive timeout; there is also a keep-alive request counter. And that's what really makes the difference: the connection is kept alive for up to five minutes, or for this counter, which defaults to 100. 100, right here. So it's either five minutes or 100 requests. After 100 requests have been served over that TCP connection, it's gone. That was the catch for us. So how do we solve the problem?
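Here's that corner of the connector configuration as a sketch, with the eventual fix applied. The attribute names are from Tomcat's HTTP connector; the values are illustrative, not our exact config:

```xml
<!-- keepAliveTimeout is in milliseconds; maxKeepAliveRequests defaults
     to 100, and -1 removes the per-connection request cap entirely. -->
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
           maxConnections="20000"
           keepAliveTimeout="300000"
           maxKeepAliveRequests="-1" />
```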
Easy: just make sure there is no counter, no upper limit, so the connection really stays alive for the full five minutes. Okay, so quickly: the blue line here represents the deployment. Let's just deploy everything together and solve the problem. Of course, during the deployment itself the throttle rate goes to 100%, because no server is responding while everything restarts. But after that, what really matters: everything goes to zero. A single line of configuration change made all the difference, yeah?

So, time to party again. But as we were pushing toward Christmas, the traffic kept on growing and growing. Nothing could stop the traffic from being sent to us. And so, basically, the throttling came back. This time, we knew for sure it was not going to be the max-keep-alive-requests counter, because we had already fixed that. And yeah, you have already seen this graph before. The other thing we started to notice: at any given time (this is during the low peak of our traffic) you'd see about 6,000 established TCP connections per server; at peak, this number can go up to 10 or 12 thousand. And there was always this pattern: stable for a while, then suddenly it takes a dive, from 6,000 down to about 2,000. The data points are five seconds apart, so within a minute we had lost about 60% of our connections. This started to bother us. And after that, this guy auto-heals itself again. Oh no, another one of these auto-healing bugs. Now, we knew we had more or less solved the garbage collector problem, right? We knew we had solved the maximum keep-alive requests. We went and looked at the logs: everything normal. We looked at the New Relic monitoring: everything normal, except this.

We were literally in this state for a while, until one day. Until one day, while we were doing garbage collector tuning, this flag was accidentally added, for no good reason, just because we were told to. So we just added the flag. The flag enables logging like this line. Most of the time, most of the time, the stop time is in the range of milliseconds: five milliseconds, ten milliseconds, maybe 50 to 100 milliseconds at most, which is when a minor GC kicks in. But then we started to see some very strange outliers, several orders of magnitude larger than a normal one. That started to look not so usual. And the other thing we observed: whenever we saw one of these four-second stops, we got the dive. We got the deep dive you see here. So that started to give us a clue. There's something related; there's a very high correlation between these two events. Whenever you see the four seconds, there will be a dive. That gave us something to start with. And again, the conventional wisdom says: when you have a problem, go and ask Stack Overflow. So okay, let's do some Google searching, and the search led us to one of these questions, which doesn't have many answers. There are only two. And the second answer, the second answer, says you probably want to enable the safepoint statistics flag. And then the question is: what actually is a safepoint?
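For reference, the flags in play here were presumably along these lines; a hedged sketch for HotSpot in the Java 8 era:

```sh
# PrintGCApplicationStoppedTime logs lines like "Total time for which
# application threads were stopped: ... seconds"; these were the outliers
# that lined up with the connection dives. The safepoint statistics flags
# then break down *why* the JVM stopped.
java -XX:+PrintGCApplicationStoppedTime \
     -XX:+PrintSafepointStatistics \
     -XX:PrintSafepointStatisticsCount=1 \
     ... -jar bidder.jar
```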
Okay, so again, go ask Google, and when in doubt, read the docs. The docs say: whenever the JVM wants to do something tricky, as in, let's say, a garbage collection, it obviously has to stop all the threads. Otherwise, what happens is that one of the pointers on your stack still points to memory that has already been reclaimed by the garbage collector. And what happens when you access that pointer again? Segfault. Obviously that shouldn't happen, right? So for a safepoint, the JVM basically coordinates all the threads to stop at the same time, so that it can carry out the tricky stuff. Garbage collection is one case, but there are many other possibilities: code reloading, for instance, or compilation. When the JIT has just compiled your interpreted code into native code, it has to patch the dispatch so that it goes to the native routine instead of through the interpreter. All of these require the JVM to go to a safepoint, and in the HotSpot implementation (the Oracle implementation, which all of you are using) it requires all the threads to agree on the same safepoint. So it basically acts like a global lock. If you happen to use Azul Zing, their implementation of safepoints is more advanced; it has several tiers and can work at the per-thread level. But since we are not using Zing, this is something we have to live with. So when can the JVM stop a thread at a safepoint? There are a lot of potential opportunities, right? Blocked on a lock, doing blocking I/O, all these things.

So, let's just enable the flag. What we really saw is that there are a lot of possible reasons for a safepoint, and there was no correlation with which type of event happened. So again, we were getting closer to the problem, but that still wasn't the clue to what was really happening. What really caught our eye is that every time we saw this stall in the safepoint log, there was a distinct log line linked to one of our own components. Internally, internally, the bidder is designed so that we have a central repository of all the bidding instructions, and the bidding instructions are passed down to a compiler which is built into the bidder itself. The compiler does some pre-computation to make matching fast and feeds the result into the matching engine, and the matching engine then takes the requests from the outside world. What really happened is that one of the steps of this compilation went haywire. As designed, it was working pretty okay, until the point where we had more customers, and the cost of that step is not linear; it's more like exponential. So it grew pretty crazily. Concretely, we had an array of 200 million long elements, which consumes about one to two gigabytes, and we were doing a sort on top of that array. When the JVM sees this kind of thing happening in a very tight loop, it tends to optimize the code to forgo the safepoint checks. So the thread just dives into that very tight loop, does the computation, and never reaches a safepoint. And when that happens, when that happens, all the other threads have to wait for that single thread to finish, because everyone has to agree on the same safepoint for the JVM to stop. So this, actually, was the real problem.
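To illustrate the failure mode, here's a reconstruction, not our actual compiler code. In the HotSpot of that era, a JIT-compiled counted loop (an int index with a known bound) carries no safepoint poll inside the loop body, so a thread stuck in such a loop can hold up every other thread:

```java
// Run with a large heap, e.g. -Xmx4g. While the fill and the sort run,
// the JIT-compiled counted loops contain no safepoint polls, so
// time-to-safepoint (and therefore every stop-the-world pause) can
// stretch to seconds for all threads in the process.
import java.util.Arrays;

public class SafepointStall {
    public static void main(String[] args) {
        // ~1.6 GB; in G1 a humongous allocation like this goes straight
        // to the old generation, never to the young generation.
        long[] huge = new long[200_000_000];
        for (int i = 0; i < huge.length; i++) {
            huge[i] = (long) (Math.random() * Long.MAX_VALUE);
        }
        Arrays.sort(huge); // a long-running, tight, counted loop inside
    }
}
```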
So then we said, okay, let's just do a quick experiment. Let's stop the compiler from doing any compilation, and this is what we really saw: for a very long period of time, the TCP connections stay very, very stable. Okay, so now we know the problem; let's just deploy the fix. After we deployed the fix, you can see on the graph that something clearly took effect. And there was another side effect, a good side effect. Remember the panadol.sh? Because now we started to connect the dots, right? We started to realize: it's because of this compilation stage taking up too much RAM. In G1 GC, whenever you have a huge array, a huge object allocation, it always goes by default to the old generation. It doesn't even stay in the young generation; that's one of the characteristics of G1. They call these humongous objects, and such an object goes directly into the old generation. So now we knew the culprit behind the old generation growing and the young generation shrinking. And after we applied the fix for this problem, it actually solved two problems. The other problem it solved for us: we no longer need to apply the Panadol daily. The distribution, basically the old generation, now stays more or less stable across the day. And yeah, this time one solution solved two problems. We definitely needed to go to a party.

And just before I end the talk, here are some numbers I want to show off. Starting off, we were talking about 30 billion requests a day. Now we are handling 65, sometimes even close to 70. That's more than twice the amount of traffic. In terms of the number of servers, we started with around 60. Well, not really 60; 55, 58, 59, something like that. Now we are talking about 47. That was the day before yesterday. Yesterday, this number went to 45. Today, today, this number went to 41, okay? So if I come to deliver this presentation one more time, you will see a much smaller number over here. And in terms of 99th-percentile latency, you can see it's about 10 times better. Down here is the daily report one of our exchanges sends us; I think this one was sent to us the day before yesterday. You can see the total number of requests sent to us: two billion. Unfortunately, we still have a few timeouts, about 6,000. And unfortunately, we still have a few transfer errors, about 5,000. But if you do the math, that's negligible, right? And one more graph I really, really want to show you. I know that when you talk about latency, people tell you: do not talk about the average, do not talk about anything except the 99th percentile and the max, okay? So I just want to show you anyway: what is our average time? If you look at the average, it's about 600 microseconds. It's not even a millisecond. The median is even smaller, so it's not even funny anymore. And the 99th percentile stays comfortably below 10 milliseconds. So that's today. And, yep. Any questions?

Audience: Do you have a solution in mind where you would separate the requests, suppose you have some...?

This is something we would like to have; we don't have it now. The reason is that simulating these bid requests is quite difficult, because you need to simulate the traffic pattern and simulate all the different parameters we have to compute with. So yeah, that could definitely be something to be done, but we haven't done anything yet on top of that.
We'd really like to have that. And by the way, PocketMesh is also hiring. Yeah. Yeah, like PayPal. Okay, so if you like, ping me. I will probably stay around for a while, and if you guys have any other questions, you can ask me personally. Yeah. Thank you.