 Let's start with our TOEFL friend here. So hello everyone. Good afternoon. I want to tell you a little bit about my favorite machines, windmills. I think they're actually amazing, though disclaimer: I'm Dutch, so it's mandatory. But I'm going to talk about windmills, about websites, and about using automation to do the boring part for us. After all, we want to let the machines do the boring part, so that we can spend human minds on stuff like poetry and conferences and watching cat GIFs, you know, like important stuff.
And sadly, for my Dutch sensibilities, my favorite windmill is actually not one of these, which are at the Zaanse Schans windmill museum. It's actually this one in Herne, England. And what I really like about this one is that, even though it's very old (it was constructed in 1781), it already has so much automation. So I mean, obviously it uses wind power to do the actual milling of the grain, but it also has wind-powered elevators to lift all the heavy bags of grain and flour. You can see there is a tiny windmill on the back that will actually rotate the entire windmill around to make sure that it keeps pointing into the wind.
But actually my favorite bit might be this. So you can see that the blades of this windmill are actually not solid. They are made up of dozens of tiny slats that can be rotated open and closed to catch more or less wind. Because, other than what you might think, the biggest problem for a windmill is not actually that there is too little wind, though that can be a problem as well. It turns out that the amount of energy in wind goes up as the cube of the wind speed. So it's perfectly possible that if there's a storm, it will just tear the inside of such a mill to shreds. And that is obviously a problem. So it uses these slats: they can be opened when there's a storm, and then it will catch less wind and do less work, but keep the inside of the windmill in one piece.
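As a short aside on the cube law mentioned above: the standard expression for the power carried by wind through the area swept by the sails is shown below. The symbols are not from the talk: \rho is the air density, A the swept area and v the wind speed.

```latex
P = \tfrac{1}{2}\,\rho\,A\,v^{3}
\qquad\Longrightarrow\qquad
\frac{P(3v)}{P(v)} = \frac{(3v)^{3}}{v^{3}} = 27
```

So a storm with three times the usual wind speed offers 27 times the power, which is why shedding wind matters more to the mill than catching it.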
And the amazing part is that it doesn't do this by any human intervention; it actually does this automatically. So when the blades are turning too fast, it will pull on a string and open these slats so that it will slow down again. And this is actually not so different from the tools that I use in my normal work to protect my services from the sudden storms that our users can put up.
So work, yeah, I work for a company that is called WeTransfer. It works very simply. Anyone here could probably build it in a few days for 10 users. You can upload files and you get a link where you can download them again. Now this is a tweet by Ms. Iggy Azalea. She's an Australian rapper who uses us to share music with her fans. And actually, we're used by a lot of musicians and people who have very large files. So it's mostly meant for sending large files or large amounts of smaller files. We do about a billion files per month. And some of them just go absolutely viral. Ms. Azalea, sometimes she has a hit and then it gets downloaded 90 million times. And that, yeah, that's a load spike, no joke.
And one of the main features of the service is that after seven days, your files get deleted again and afterwards they can't be downloaded. But at scale, you can imagine that if you have a billion files coming in every month, that averages out to 23 transfers per second. And some of them actually contain thousands of files. So deleting all that stuff again has to happen in the background, and it's decidedly not trivial anymore. And of course my boss also wants me to do some actual work myself. So we need to improve the service, and sometimes that requires refactoring our backend or adding tables to our database. And when these database tables get very large, it can literally take a week to add a column to a table with a billion rows in it. Also, we broke Amazon S3 once, and that got us some very angry emails.
So also there, we actually want to slow down what we do in order to make sure the internals of our service don't break. Now sadly, it's not always predictable. So, yeah, most people are asleep at night and that's usually when they're not uploading files. But that's not always the case. Some people are just downloading stuff throughout the night. Some people live in some weird time zone and they do want to download when we think it's a quiet period. Sadly, not everyone is nice on the internet. So for example, DDoS attacks do happen. And some transfers have very fanatical fans. So another example is a guy called Zayn. I think he was part of One Direction once. And yeah, if he puts something up, you can see that in the metrics as well. And this really limits what we can do and when, and we really heavily depend on automation to make sure that nothing breaks.
So how does it actually work? Well, let's build our own, because it's actually pretty simple conceptually. And the best way to do it is to just look at a simple example and build it up as we go. So I abstracted our background services as far as I could. It basically looks like this. So, yeah, we do some stuff and then we loop around. And if our do stuff method takes one millisecond, then this loop is obviously gonna run about a thousand times per second. But this might make some of our backing services, like Amazon, very sad. So we need to slow it down a bit. Now, the obvious way to do that would be like this. We add some sleep to the loop and suddenly everything is slower. But now we have made another group sad. So the developer who wanted to add this column to his database, which was already gonna take multiple days, now has to wait almost a thousand days for his column to be added. And that is, yeah, we try to be agile, but a thousand days is too long. So the next iteration you might think of is like: most of our users come during the day, so during the night we have a quiet period.
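The slides are not part of this transcript, so here is a minimal sketch of the two loops described so far, written in Python rather than whatever language was on screen. All names (`run_unthrottled`, `run_with_fixed_sleep`, `slowdown_factor`) are hypothetical, the loops are bounded so they can be run and tested, and the one-second pause is an assumption, since the talk does not say how long the sleep was.

```python
import time
from typing import Callable


def run_unthrottled(work: Callable[[], None], iterations: int) -> None:
    """Call work() back to back, as fast as the loop allows."""
    for _ in range(iterations):
        work()


def run_with_fixed_sleep(
    work: Callable[[], None],
    iterations: int,
    pause_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call work(), then always pause, whether or not the system is busy."""
    for _ in range(iterations):
        work()
        sleep(pause_seconds)


def slowdown_factor(work_seconds: float, pause_seconds: float) -> float:
    """How many times longer one iteration takes once the pause is added."""
    return (work_seconds + pause_seconds) / work_seconds
```

With one millisecond of work and an assumed one-second pause, `slowdown_factor(0.001, 1.0)` comes out at roughly a thousand, which is the kind of blow-up that turns a job of days into one of years.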
So we do stuff as quickly as we can during the night. And then if it's during the day, or "not at night", as some people might call it, then we slow it down. That already works, but right now we are lying to ourselves, because we don't actually care whether it's night or not. Because if Beyoncé posts something, even if it's the middle of the night, you know it's gonna be busy. So what we actually want to check is whether the system is busy. And this is already the entire thing that I want to show you. And I think it's actually super remarkable that these six lines of code really embody what you want to do. So you only need to know two things. You need to know: how can you tell if it's very busy? And what can you do to remedy that? And for our service, and I bet for most services, just waiting a bit is okay. So that's the second question answered already.
And I have some code here that I copy-pasted from our production software. You can see we had to cope with the real world for a bit and add some logging and stuff. But actually, there are only two things in the conditional and one sleep statement. So for our particular problem, we wanted to make sure that the replication lag of the database wasn't too large. And we wanted to make sure that the checkpoint age, which is a metric that describes database IO, also wasn't too large. And that's it.
And so the obvious question now is, does it actually work? Well, I wouldn't be standing here otherwise. So yes, yes, it does work. At peak times, we really used to have a lot of problems where we would start a migration, and if it was during the middle of the day, that was just bringing some of our background services to a crashing halt. One particular problem was the database: it would overload its connection to the underlying disk, and it would overload its replication capacity. Running this during the night or during a weekend was sadly not possible, because it just took longer than a single weekend. So what we did, we went back to this thing.
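The production code itself is not reproduced in this transcript. As a rough sketch of the idea only (two checks in one conditional, one sleep), it might look like the Python below. The threshold values, the function names and the `read_metrics` callback are all assumptions made for illustration, and the loop is bounded so it can be exercised in a test.

```python
import time
from typing import Callable, Tuple

# Hypothetical limits; the talk does not give the real values.
MAX_REPLICATION_LAG_SECONDS = 5.0
MAX_CHECKPOINT_AGE_BYTES = 500_000_000


def system_is_busy(replication_lag: float, checkpoint_age: int) -> bool:
    """Busy means either metric is above its limit."""
    return (
        replication_lag > MAX_REPLICATION_LAG_SECONDS
        or checkpoint_age > MAX_CHECKPOINT_AGE_BYTES
    )


def run_throttled(
    work: Callable[[], None],
    read_metrics: Callable[[], Tuple[float, int]],
    iterations: int,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Do work only while the system is calm; otherwise wait and re-check."""
    completed = 0
    while completed < iterations:
        lag, age = read_metrics()
        if system_is_busy(lag, age):
            sleep(pause_seconds)  # back off and look again later
        else:
            work()
            completed += 1
```

In this sketch the loop runs at full speed whenever both metrics are under their limits, at any hour, and it stalls for as long as either one stays high, which also covers the case where replication is broken outright.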
We said, what do we actually need to watch? What do we need to monitor? And this turned out to be the replication lag and the checkpoint age. And you can see how it works. This is a graph of our checkpoint age. At the dotted red line, we released our new version with the throttling enabled. And, well, you can see pretty clearly where we started the migration. And what is actually pretty nice is that without our throttling code in place, it would easily have hit the red line, at which point the database says nope. But the throttling catches it easily and then just sleeps until the database is ready again. And actually, what we got for free, and what's super nice, is that later that month, someone (it was me) actually broke the replication database, but the script automatically caught it and paused all our migrations.
And so, just like in the case of our windmills, we can effectively protect our systems from storms, whether they're caused by the weather or by our users. And I find it really amazing that this innovation that helped millers in the 1700s is still so relevant today. Thank you very much.