Hello, everyone. My name's Joel Potishman. I'm going to be talking about "What the heck time is it? How NTP does the impossible, and why I care." Very briefly about me: Joel Potishman. There's my email, Twitter, GitHub, Facebook. It's all the same. I design and build systems and APIs. If it shows up in a browser, I probably didn't do it. I'm not a systems engineer. I just like NTP, and I submitted a talk, and things got out of hand. Who has two thumbs and is available for work? This guy. So talk to me afterwards. And I'd like to thank Bang Bang Con, and thank Mirabai for doing the stenography. You should come talk to her afterwards. It's really, really cool technology, and she's really cool and interesting.

So what is NTP? NTP stands for Network Time Protocol. It was created by David Mills back in 1981. Put simply, it sets a clock to match another clock. My computer would like to know what time it is. There's a clock up on the internet that's connected to an atomic clock. NTP is the way that clock tells my clock what time it is and gracefully adjusts mine to match it. That's all it is.

It's been around since 1981, version zero, I'm calling it. There was a formal spec in 1988. In 1989, they added some authentication because you didn't want people impersonating NTP servers and doing bad things. 1992 is really modern NTP, which adds formal correctness principles, revised algorithms, more modes, and mainly higher precision on faster networks. I mean, think about the advances they'd had in network speed between 1981 and 1992, and that's still a long time ago. But what's amazing to me about NTP is that it's worked the same way fundamentally since 1981. 1994 is version four. It's dotted because it's still a proposed standard. We're not there just yet. I don't think you want to rush into these things. 23 years is good, but let's just think it through.

You might be saying that sounds kind of boring, which would be a mean thing to say.
But if so, sorry, I'll be done soon. But I would respectfully disagree. I think it's really important. If you think about NTP or you think about a clock, if I'm sitting at home on my computer and I'm writing a document in Microsoft Word and I print it and hand it to someone, no one cares if my clock is off by two years. But as soon as you get online, as soon as we're talking to each other, as soon as you have distributed systems, it becomes really important to have some common reality about what time it is.

And it can't be done perfectly. Let's say I connect to an atomic clock that is exactly right to a billionth of a second. It takes some amount of time for that response to get back to me. So I can't ever really know what time it is. I can only know what time it was. And it was sort of fascinating to me that you can't do the thing that this does. And it has to be done carefully, because if you run your clock in reverse, as I'll show, some bad things can happen. So for all those reasons, it's super important. You can't actually do it right. And you can cause terrible problems if you do it wrong. It's a pretty interesting thing to me.

So why is it important? Let's dive into that. The classic example is going to be something like a bank transfer. I deposit some money into my account at 12 o'clock. And then some naive network time-syncing protocol that's not as smart as NTP says, adjust your clock back by a tenth of a second, which it does. And then I withdraw $100. On that slide, things happen in the correct order, linearly. But if you look at the timestamps, uh-oh, I withdrew before I deposited. And depending on the implementation of the banking software, if it processes transactions in timestamp order, I'm going to be overdrawn. So that's bad.

Another reason why it's important is leap seconds. I don't want to alarm anyone, but the earth is slowing down. And it doesn't take exactly 24 hours to rotate. It takes very slightly more than that.
So every now and then they have to add a leap second to keep us in sync with our motion in space. And NTP is how that is done. NTP tells everybody, you've got to add a second today.

SSL certificates: if one is revoked at a particular date and an NTP server can force everyone to believe it's prior to that revocation date, you can have a whole class of attacks there. iOS had a bug in a previous version where if you set your clock to 1970, you could brick your phone. So imagine if an NTP server was able to convince 100 million iPhones it was 1970, how much destruction you could wreak. And lastly, sometimes it's really important to agree on the order in which things happen.

So I'm gonna speed up a little bit because I'm behind on time, but: NTP in 64 seconds. That isn't a joke, that's just me. So how does NTP work?

Ask someone with a good watch what time it is. What is a good watch? A good watch is gonna be an atomic clock. That's considered a stratum zero NTP server, but you can't talk to a stratum zero server, you're not allowed to. You could talk to a stratum one server, which synchronizes to that, or a stratum two, which synchronizes to the stratum one, and so on up to stratum 15. So try to get in touch with the best clock you can. Typically you're gonna be talking to a stratum one or two server.

Find out how long it took them to answer. They tell you this is the time, you figure out that latency, and then you have to calculate how much to offset by that.

Then decide if you believe them. If they told you it was 1776 or 1 million BC, you probably don't wanna set your clock to match that.

But if you do believe them, you wanna adjust your clock safely. And what does safely mean? Safely means don't run in reverse. Sometimes you have to, but you try to avoid that.

And then you repeat that every 64 seconds. That's the default interval on a Mac and probably on Windows. It's powers of two.
Every 64 seconds, forever, you synchronize to check whether you are drifting, and you can do very gentle corrections that way.

So let's go into a little more depth. Here's an NTP client, my computer, and an NTP server up on the internet. At 9:15, I don't know if you can see that, but at 9:15 on the dot, I make a request to the NTP server. And the NTP server says, I received that at 9:15 and four hundredths of a second. And remember, my clock may be wrong. These may not be the same time. I may be off by three years. But it says 9:15 and four hundredths of a second. It sends a response back two hundredths of a second later, and then I mark the arrival of the response there.

So what do we know? What we know is that the destination timestamp, according to my clock, minus the origination timestamp on my clock is the round trip latency. And it doesn't even matter what the NTP server latency really is. So by doing that, we have the round trip latency. Cut that in half and we estimate the one-way latency. And the asterisk represents that we can't really deal with asymmetry in latency. NTP doesn't deal with that. No one really deals with that. You just hope for the best. And if you have a low latency connection, the maximum error is not that big anyway.

So from this, if I take the NTP authoritative timestamp of 9:15 and six hundredths of a second, plus five hundredths of a second of latency, that tells me right now it's 9:15 and eleven hundredths of a second. But look, I said it's ten hundredths of a second. I'm behind by 0.01 seconds. So now we have to correct that.

How do we fix it? You fix it by slewing your clock, which is varying your clock speed, but staying positive. If you're on the highway and you're separated from your friends in the other car, you might step on the gas a little harder. You might lay off of it. You would not go in reverse. It is not recommended.

So this graph shows you that. The blue line represents a perfectly accurate clock, one that is exactly in time with reality.
The purple line represents my clock running a little bit fast. So what slewing is gonna do is say, let's just slow the clock down a little bit until the blue line catches up, and then we're good. The green line, similarly: my clock is too slow. Speed my clock up a little bit until we catch up. Now we're back in sync with reality.

But there are limits to slewing. And the limits are because NTP is really designed for clock maintenance, not the initial setting. If I buy a brand new computer and I plug it in, it has no idea what time it is. And if it was off by a lot, it's gonna take decades to catch up, because the limit of slewing is 500 parts per million. The design is to keep changes really small so you don't have weird, crazy transactions happening. So if it's off by more than 128 milliseconds, it just jumps to the correct time. Even if it has to go in reverse, the thinking is, this is an initial step. It's better to do that than have your clock be off for 40 minutes or two weeks or three years. And if you're off by more than a thousand seconds, NTP just says, no. You can override that manually. That would be for that initial setting, when I just plug in my computer.

Last step is don't eat garbage. This is a good rule of thumb in general. But NTP, remember, was from 1981. The premise that in 1981 you'd have reliable hardware, a reliable network, and reliable everything would be insane. They knew this was gonna happen. So NTP says hit multiple servers multiple times, favor the responses with the lowest latencies, and discard statistical outliers. So if one of the servers or one of the responses says it's a million AD, you probably don't wanna go with that one. On average, it takes about five minutes to trust a server. A single response could be bad. The whole server could be bad. NTP expects those failures. And there's all kinds of reasons for them out there. It could be malice. It could be that it's hot in the server room. That will affect the time.
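The "don't eat garbage" rule can be illustrated with a small Python sketch. This is only a toy version of the idea described in the talk (favor the lowest-latency response, then discard outliers), not the real NTP clock filter or selection algorithm. The function names, the sample values, and the tolerance are all made up for illustration.

```python
from statistics import median

def pick_sample(samples):
    """From several (offset, delay) responses from one server, keep the one
    with the lowest round trip delay, since it has the least room for error."""
    return min(samples, key=lambda sample: sample[1])

def drop_outliers(offsets, tolerance):
    """Keep only the offsets that lie within `tolerance` seconds of the median.
    A crude stand-in for separating trusted servers from untrusted ones."""
    middle = median(offsets)
    return [offset for offset in offsets if abs(offset - middle) <= tolerance]

# Three responses from one hypothetical server: (offset seconds, delay seconds).
responses = [(0.012, 0.080), (0.010, 0.020), (0.030, 0.150)]
best = pick_sample(responses)

# Best offsets from four hypothetical servers; the last one is a year off.
server_offsets = [0.010, 0.011, 0.009, 31536000.0]
trusted = drop_outliers(server_offsets, tolerance=0.1)
```

Here `best` is the response with the 0.020 second delay, and `trusted` keeps the three servers that roughly agree with each other while dropping the one that is a year off.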
And then NTP, and there's no way to do this in 10 minutes, but it performs statistical analysis to separate what are called the true chimers, which are the NTP responses and servers that it trusts, from the false tickers. And I don't know why they didn't go with true tickers and false tickers, but whatever. They could take that up with David Mills. So my made-up graph here shows my impression of what's happening there: you perform your statistical analysis. There's a clock filter algorithm. I had the vector scattergram, the wedge scattergram, which I couldn't fit in there. But basically you say, these results I trust, these are the ones I'm going to use, and these I'm gonna discard.

So in conclusion, you can account for latency, not perfectly, but well enough. And that is sufficient. Math protects you from bad responses. If you are on guard against terrible things being told to you on the internet, also a good rule in general, you can protect yourself from that. Don't drive in reverse unless you really have to. And you can see NTP on your own computer. I have Julia Evans to thank for this bullet, because of her zine showing all the cool tools. I'm like, wait, I wonder if I can look at NTP, and you can do this. So: sudo tcpdump -vv port 123, and you can watch your computer, every 64 seconds, make these tiny little adjustments.

And remember, who's available to help you ship products? This guy. Thank you very much.
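The 9:15 request and response example from the talk can be worked through in a few lines of Python. This is a sketch under stated assumptions: the timestamp names T1 to T4 and the function names are mine, and times are written as whole hundredths of a second after 9:15:00 to keep the arithmetic exact. The first function follows the simplified estimate described in the talk. The second uses the offset and delay formulas as I understand them from RFC 5905, which also subtract the time the server spent handling the request, so on these same numbers it reports a delay of eight hundredths and no offset at all. The last function shows how long a slew takes at the 500 parts per million limit.

```python
# Timestamps in hundredths of a second after 9:15:00.
T1 = 0    # origination: client sends the request (client clock)
T2 = 4    # receive: server gets the request (server clock)
T3 = 6    # transmit: server sends the response (server clock)
T4 = 10   # destination: client gets the response (client clock)

def talk_estimate(t1, t3, t4):
    """Simplified estimate from the talk: halve the whole round trip."""
    round_trip = t4 - t1
    one_way = round_trip / 2
    server_now = t3 + one_way
    # A positive result means the client clock is behind.
    return round_trip, server_now - t4

def rfc5905_estimate(t1, t2, t3, t4):
    """Delay and offset with the server's processing time removed."""
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return delay, offset

def slew_seconds(offset_microseconds, max_slew_ppm=500):
    """Seconds needed to slew away an offset. One part per million is one
    microsecond of correction per second of running time."""
    return abs(offset_microseconds) / max_slew_ppm

simple = talk_estimate(T1, T3, T4)
standard = rfc5905_estimate(T1, T2, T3, T4)
catch_up = slew_seconds(10_000)   # the talk's 0.01 second offset
```

With the talk's numbers, `simple` comes out as a round trip of ten hundredths with the client one hundredth behind, and slewing away that 0.01 seconds at the maximum rate takes 20 seconds.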