Jon's talk is in the big room, where you had the keynote. If you run, you can still catch the good bits. I don't need to introduce him. Well, you were all here for the beginning. Rusty Russell, welcome. Thank you. Who's been to one of my talks before? I knew I should have put something new in there. OK. Right. This is my talk, Advanced C Coding for Fun. It's about C coding, so there will be C code. If you are not a C coder, the exits are that way and that way. That one is the best: go straight across, straight back into the venue, and happiness ensues as you watch Jon Corbet's talk. You will actually be watching C code. It is advanced. If you do not know C, this is not the time to learn it. It will probably scar you for life if this is your starting point. It's certainly not for profit. That's why that's there. And this is me. Now, unfortunately, my slides go about 10 minutes too long. So I'm not actually going to do it like this at all. We're going to do it like this. OK. No, that's not a good thing. I was going to cut a few things out and polish it a little bit more, but for some reason I felt inspired last night to write a CCAN module that allows you to simultaneously connect to IPv4 and IPv6. And so, yeah, this will be a little bit less polished. What we're going to do today is write a server called oserver. I'm actually going to write it as a CCAN module. The reason I'm doing that is that most of my tips and tricks have gone into other CCAN modules, so I get to wave at them as we go past. This is going to be very, very quick. And there are going to be lots of hooks in there, like URLs on the side and stuff. And I expect you all to be furiously typing away and missing what I say next. OK. So, yeah, I do. How annoying. OK, let's also switch to silent mode. OK. Right. So, we go to that. OK. So this is what our server looks like to start with. It has two functions: a setup function and a serve function.
I'm not going to spend too much time on the mechanics of the server itself, except to note that it exists and it's fairly normal C code. The first thing you'll notice is that I've already pulled in another CCAN module, in order to get the GCC wrapper for the noreturn stuff. It does exactly what you'd expect, using the attributes in a more portable way. And here's what it does. Setup basically sets up a socket, binds and listens. Anyone notice that it's using IPv4? Yeah. I didn't have time to rewrite it between the keynote yesterday and today, so it's still using IPv4. It returns the listening socket, waiting for connections. Once you have a connection, you call oserver_serve, which basically reads in a string until it hits a newline, uppercases the whole thing, and just writes out some stupid string with the whole thing uppercased. Another CCAN module there, write_all: a very simple write loop that just repeats the write until you've written the whole thing out. Really, really dumb. This is an incredibly simple server. Now, by itself, this could actually be submitted to CCAN today, because we accept anything, in the whole junkcode style. There's a whole junkcode section, and I just take anything that is C code or resembles C code and put it in there. But if you really want to be a CCAN module, then you should have some other things. And if we run ccanlint, it immediately complains that we don't have an _info file. Should I create one? Yes, create one for me. The _info file looks like this. It's a little C program. You type a little bit and you populate it like so, okay. Very, very straightforward. Okay, there we go. It's a little thing. It has an example in it. When you execute it, it gives you the dependencies, in this case the two other modules. Okay, cool. So run ccanlint again. Now it complains about not having any tests.
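The "repeat the write until it's all out" loop described above is short enough to sketch. This is a minimal illustration of the idea, not the CCAN module's actual source; the EINTR retry and the treatment of a zero-length write as an error are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling write() until all len bytes are out or a real error occurs.
 * Returns true on success; on failure errno describes what went wrong. */
static bool write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    assert(data != NULL || len == 0);
    while (len > 0) {
        ssize_t done = write(fd, p, len);

        if (done < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return false;
        }
        if (done == 0) {        /* no progress: report it as out of space */
            errno = ENOSPC;
            return false;
        }
        p += done;              /* short write: carry on from where it stopped */
        len -= (size_t)done;
    }
    return true;
}
```

The point of the wrapper is that callers never have to think about short writes: they get all or nothing.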
It actually gives a very, very long, verbose complaint, which I don't think I can fit on the screen, all about how you should write tests. The whole ccanlint idea is that you take your code, turn it into a CCAN module, and just run ccanlint again and again and again until it stops whining at you. It even offers to create a template file for you. You create a run test. You'll notice that it doesn't get linked; it actually has to include the C file directly. We use the tap module, which is based on Perl's Test Anything Protocol. It's very, very simplistic testing. Some would say overly simplistic, but if we're talking about C coders, getting them to test anything at all is an achievement. You basically say how many tests you're going to have, and then you do some tests, so we type, type, type, and away we go. Okay, now, in my test, I'm going to cheat. I create a temporary file. The tests get run in a temporary directory, so you can just drop turds everywhere. I open a file called run-fd. I write the input to it, seek back to the beginning, and then I run that oserver_serve function in a child. Then I wait for the child to exit, because oserver_serve will actually exit. Check that it exited okay, and check that it appended the string we expected to the end. Okay, all good. Now, this will make ccanlint happy. ccanlint, and it no longer complains. Okay, good. But our score is still pretty low. If we run it with verbose, it will tell us about the things that we passed but didn't get 100% in. And I'm first going to concentrate on the fact that there are no examples in oserver.h. In fact, there's no documentation at all in oserver.h. So we should write some documentation. It looks like this. We're really going to have to go fast; keep up, people. Okay, so, right. I'm not a fan of this whole document-extracting phase.
The documentation should stay with the code, but it should have some vague structure, so that you can actually do some sane parsing on it to check it. In this case, you can see this example here, and the example for serve, and we've inverted the two, so we've put setup at the top, as you'd expect, because I believe your header should be readable. And we've actually put these dot, dot, dots here. You see that this second example won't compile by itself. ccanlint is smart enough to figure out that it has to sew them together to get them to compile. Okay, ccanlint -v. It will now no longer complain about no examples. The thing that it's complaining about is our coverage. We're getting a zero for our coverage test, because ccanlint uses the heuristic that if you get 50% coverage you get one point, 75% you get two points, et cetera, up to five points, and you get a bonus point if you get 100%. We don't even get to 50%; that would probably be down to the function that we don't test at all. So, let's look back at our testing and see what we can do. Okay, well, the first thing we could do is actually test more than one string. The standard way to do this in C, old school, would be like this. You'd declare an array of all the things you want to test, and you'd iterate through the array. You would use the ARRAY_SIZE macro, which basically takes the size of the array and divides it by the size of a single element. This one has a few twists, in that if you hand it something that's not an array, under GCC it will spit out this rather bizarre warning, but you will get a warning. So, when you refactor things from an array to a pointer, at least you'll find out the places where you've used ARRAY_SIZE, rather than randomly getting one. Very, very important. The other thing you may have noted, which I actually had in an example before, is this. I'd been coding in C professionally a good 10 years when I made that mistake again and got strcmp around the wrong way.
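The two helpers just mentioned can be sketched as follows. This is an illustration of the technique rather than CCAN's exact code: the `MUST_BE_ARRAY` name is made up for this sketch, and it turns a pointer argument into a hard compile error (a negative array size) instead of a warning. `streq` is the kind of wrapper that stops you getting the sense of `strcmp` backwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__)
/* Evaluates to 0 for a real array. For a pointer, typeof(arr) and
 * typeof(&arr[0]) are the same type, the array size below becomes -1,
 * and the compiler refuses to build. MUST_BE_ARRAY is a hypothetical name. */
#define MUST_BE_ARRAY(arr) \
    (sizeof(char [1 - 2 * __builtin_types_compatible_p( \
        __typeof__(arr), __typeof__(&(arr)[0]))]) - 1)
#else
#define MUST_BE_ARRAY(arr) 0    /* no check available: plain division only */
#endif

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + MUST_BE_ARRAY(arr))

/* True if the two strings are equal; no "== 0" or "!" to get wrong. */
static bool streq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Old-school table-driven loop: count how many test inputs match needle. */
static size_t count_matches(const char *needle)
{
    static const char *const inputs[] = { "hello", "world", "hello" };
    size_t i, n = 0;

    for (i = 0; i < ARRAY_SIZE(inputs); i++)
        if (streq(inputs[i], needle))
            n++;
    return n;
}
```

If `inputs` were later refactored into a `const char **`, every `ARRAY_SIZE(inputs)` would stop compiling under GCC, which is exactly the early warning described in the talk.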
And so, since then, I've decided I'm going to write a wrapper and use it everywhere. Of course, I've never made the mistake since either, but most of the time you want to use something like that. That is in the CCAN str module, I might add. Okay, so that's all good, but it kind of sucks. I mean, basically anyone who uses scripting languages goes, this whole array-so-you-can-iterate-through thing is a little primitive, and they're right. There's a CCAN module for that, called foreach. foreach_ptr, in fact, will iterate as you expect: it iterates input through those various constants. So input will first be set to that one, around the loop, and then that one, and that one, and that one; exactly what you would expect a foreach loop to do. There's a BoF on this CCAN package tomorrow morning, because I don't have time to cover it here. I do not recommend you look at the code without holding someone's hand. And it's a little bit like that. I'm very, very proud of it. Actually, that's the wrong URL. Sorry, it's actually that one. Did anyone get rickrolled? Too quick. Okay, so where are we? Okay, so that's foreach, but of course we still haven't actually improved our coverage very much, because ccanlint will still tell us that we score zero for our test coverage. Okay, if you look at what it's actually doing, we run it with double -v, and we will see that for oserver.c it will actually print out the gcov output, and it's not testing the read failure case here, where we exit. There are ways of actually testing that, the classic being to override exit with something that does a longjmp back to the original code, and I've done that before when you really can't modify the code. That's why we use the #include rather than linking against the module, so you can do ugly macro tricks like that.
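The "override exit with a longjmp" trick mentioned above looks roughly like this. It is a sketch under stated assumptions: `test_exit`, `must_be_positive` and `status_of` are hypothetical names, and the stand-in function takes the place of the `#include` of the real .c file that a run test would have at that point. Redefining `exit` as a macro is strictly outside what the C standard promises, which is part of why it counts as an ugly trick.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf exit_jump;
static int exit_status = -1;

/* Replacement for exit(): remember the status, jump back into the test. */
static void test_exit(int status)
{
    exit_status = status;
    longjmp(exit_jump, 1);
}

/* Anything compiled below this line calls test_exit() instead of exit().
 * In a real run test, the next thing would be #include of the .c file. */
#define exit(status) test_exit(status)

/* Stand-in for the code under test: bails out on bad input. */
static int must_be_positive(int n)
{
    if (n <= 0)
        exit(2);
    return n;
}

#undef exit

/* Returns the status the code tried to exit with, or -1 if it returned. */
static int status_of(int n)
{
    exit_status = -1;
    if (setjmp(exit_jump) == 0)
        must_be_positive(n);
    return exit_status;
}
```

This only works because the test includes the source directly; code linked in from an object file would still call the real `exit`.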
But really, the better thing to do in this case is to change this function to actually return a bool, and just return false when it fails, and leave the exiting up to the caller. Now, we've changed our function a little bit in order to make it easy to test. I feel that people who do not consider testing when they write their code are a bit like people who build a building and go, I'm not going to bother putting in loading bays and maintenance corridors. It's going to be really messy, and at some point someone will point at the messiness and go, those lifeboats make the whole thing look cluttered, and you'll end up with problems. Okay, so it's definitely worth making minor, and sometimes not so minor, changes to make it maintainable, assuming that you are actually ever going to have to maintain it. There was a joke in there about the necessity of having Jefferies tubes, but I left that one on the floor for lack of time. Okay, so at this point you're going, wow, he's going on about testing a lot, but we will get through this. Now, if we're going to do that one, we might as well do oserver_setup, even though for lack of time we don't actually test it. What's interesting here is that instead of exiting, we now just return from this function. If we fail the bind, we have to close the socket before we return minus one. As all of you should know, libc is quite happy to destroy errno even when there's no error. So you have to protect it, and there's a CCAN module which provides you with a whole heap of wrappers that just do that for you, because it's such a common thing to want to do. So we do have to fix that up. Other than that, we're pretty good. Our coverage, of course: we haven't actually changed anything, so it doesn't help. So the next thing to do is to go and add another CCAN module. It's a bit of a silly example for this, but this is the failtest module. Was anyone at OLS the year that Jeremy and I gave our talk on nfsim? I was there. Oh, okay, you were there, yeah.
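The errno-protecting wrappers described above all follow one pattern: save errno, do the cleanup, put errno back. A minimal sketch of the close variant, with the name `close_noerr` assumed here for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Close fd without disturbing errno, so the caller can still report the
 * error that made it give up. Returns 0, or the errno from close(). */
static int close_noerr(int fd)
{
    int saved_errno = errno;
    int ret;

    if (close(fd) != 0)
        ret = errno;
    else
        ret = 0;
    errno = saved_errno;    /* whatever failed earlier stays visible */
    return ret;
}
```

In the setup function this would sit on the failure path: if bind fails, call the wrapper on the socket and return -1, and the caller still sees the errno from bind rather than whatever close left behind.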
It's the same trick, basically, in new clothing. This failtest override, which goes before you include the main C file, basically macro-overrides malloc, open, pipe, read, write. We've only got a few so far. And every time you call one of those, it basically forks off a child, and in the child the call will fail. And all you have to modify in your program is that you call failtest_init at the beginning, and then instead of exit you call failtest_exit to clean up. That's it. And that will basically add failure testing to your program for free. Now, there is one other thing that I did, and that is I used the tap fail callback, because normally tap will quite happily fail one test and keep going. That leads you to combinatorial explosion. So there's a callback you can set in tap, at least as of two weeks ago, which in this case I just make exit immediately as soon as we fail. That way we don't get our children exploding as well. Okay, so now if we run this and look at our coverage, we'll see that it takes a little bit longer. We look back at our coverage, and we can see we've actually covered that read failure case here. In fact, we covered it 90 times. So you didn't even have to write anything in your test or anything else; you just added failtest and boom, you got your testing. There are all sorts of hooks you can put into failtest, and we're working on ways to make it extensible and everything else. Because, if you notice, we didn't test the write_all fail case, because write_all isn't overridden by failtest. So failtest is awesome. If you're pretending to write a program that can handle malloc failures and you're not using something like failtest to actually check that you are, you're not. And frankly, in a lot of cases, you're so much better off just giving up. Because, I mean, every time, you will find bugs. Coming soon, yeah; I'm still thinking about it, but I had a talk to write. Okay, so now let's get back to adding some actual functionality, because it's so much more interesting than testing.
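The fork-per-call idea can be shown in miniature. This is a toy sketch of the technique only, not the failtest module's API: `failing_malloc`, `dup_string` and `check_dup_string` are hypothetical names, and unlike the real thing it injects at most one failure per child rather than letting children fork again.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int in_child;        /* set in the process that sees the failure */
static int children_run;    /* parent: failure paths explored */
static int children_bad;    /* parent: paths that crashed or misbehaved */

/* At each call, fork: the child sees the allocation fail and runs the
 * error path; the parent waits, then carries on down the success path. */
static void *failing_malloc(size_t size)
{
    pid_t child;
    int status;

    if (in_child)
        return malloc(size);        /* one injected failure per path */

    child = fork();
    if (child == 0) {
        in_child = 1;
        errno = ENOMEM;
        return NULL;
    }
    if (child > 0) {
        if (waitpid(child, &status, 0) != child
            || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            children_bad++;
        children_run++;
    }
    return malloc(size);
}

/* Code under test: must return NULL cleanly if allocation fails. */
static char *dup_string(const char *s)
{
    char *copy = failing_malloc(strlen(s) + 1);

    if (copy)
        strcpy(copy, s);
    return copy;
}

/* Returns 1 if both the success path and the forked failure path behave. */
static int check_dup_string(void)
{
    char *copy = dup_string("hello");
    int ok;

    if (in_child)
        _exit(copy == NULL ? 0 : 1);    /* child reports via exit status */

    ok = copy != NULL && strcmp(copy, "hello") == 0
        && children_run == 1 && children_bad == 0;
    free(copy);
    return ok;
}
```

Because every overridden call forks, a program with many such calls explores many paths without the test author writing any of them, which is where the "covered it 90 times" figure in the talk comes from.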
Let's add some options. Who here has used getopt? If you haven't put your hand up, you shouldn't have been in the room. Okay, who here has used popt? The, yeah, the Samba one. Yeah, popt's not bad. This is like popt, only, I think, it's what I wanted popt to be. I actually had some interesting discussions with someone who's looking at rewriting popt, and eventually the discussion stalled because he didn't have time, and I eventually thought, I'll write him a little example version of how I'd like the interface to look, and next thing you know, I had another CCAN module, called opt. I did send him a link. If you want to call that popt 2, go nuts. It is different from popt in that it's extensible and it's type safe. Type safe is good. You can register tables of options the same way you can in popt and getopt_long, but you can also just chuck them in at runtime. For example, like this; that's the way you do it. Two options here: help or usage or -h, which uses the standard opt_usage_and_exit, which generates the usage table and all that stuff. And we have a port option. And we use another built-in, opt_set_uintval, and in fact opt_show_uintval. I can actually run this. Now, what's interesting here, and I want to divert a bit into the usage of opt, is that we use the typesafe callback module, which avoids the whole classic void star thing, where you're handing your callback some random pointer, you change the type, and the compiler still treats it as a void star, so it has no idea that it should give you a warning about that having changed. Typesafe callback actually checks that this function here takes the same thing as this thing here. So if I actually change that unsigned int port to something else, I will get a compiler warning. A really odd compiler warning, but I'll get a warning. So, look in typesafe callback. A heap of GCC extensions. That has to have the right type to match that. Just believe me, it works.
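The idea behind a type-checked callback registration can be sketched without the heap of GCC extensions. This is a simplified illustration, not the CCAN macro: `set_callback` and the `struct callback` type are hypothetical, and the check relies on an unevaluated `sizeof` so the compiler type-checks the call `fn(arg)` without ever running it.

```c
#include <assert.h>
#include <stddef.h>

struct callback {
    void (*fn)(void *);
    void *arg;
};

/* The untyped registration function that the macro wraps. */
static void set_callback_(struct callback *cb, void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    cb->fn = fn;
    cb->arg = arg;
}

/* The operand of sizeof is never evaluated, but the compiler still checks
 * that fn can be called with arg. Change the type of one and not the
 * other, and the build complains at the registration site. */
#define set_callback(cb, fn, arg) \
    ((void)sizeof(((fn)(arg), 0)), \
     set_callback_((cb), (void (*)(void *))(fn), (arg)))

static void run_callback(const struct callback *cb)
{
    cb->fn(cb->arg);
}

/* A callback that takes a real type rather than void *. */
static void bump_port(unsigned int *port)
{
    (*port)++;
}
```

One caveat: calling the function through the cast `void (*)(void *)` type works on mainstream platforms but is not something ISO C guarantees, so treat this as the same kind of pragmatic compromise the talk describes.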
This is in fact a macro that does the macro magic around the real one. Yeah, look in the module; there are lots of examples in there. So, okay, typesafe callback is really important. Now, because ccanlint actually builds that example, we can actually run it. And there you go, it prints out help as you'd expect, okay. Now, a little bit of divergence here. Oh yeah, then we call opt_parse to actually do the parsing, and it basically leaves in argv the arguments that weren't options, and so in this case we just go blah if it's wrong. So, a very, very straightforward change. Two things to note. One is that it's much more grep-friendly: rather than grepping for port to find out where someone's put the stuff that sets the port, you can actually grep for --port and you will find the place that registered the option. And I'm going to divert a little bit, and this is the only diversion I can afford, to the header here. When you build up an opt table, you'd normally do the standard thing in C, where you build up a table and the last entry is a row of nulls, so it knows it's at the end of the table. We don't do that. OPT_ENDTABLE is in fact not all zeros. It's in fact OPT_END, which is eight. Zero is in fact not a valid value for that field, which is the type field. The reason for that is that one of the most common sources of bugs is that you forget to terminate your table. And it works beautifully, because the next thing in memory happens to look all zero-ish, and it works fine for you until something else unrelated changes. So don't do that. It makes it slightly easier to use, having that, but it makes it much easier to misuse. Hard to misuse should always be the aim of library design, and so OPT_END is not zero. Okay, so this, however, is still not a real server. It is actually kind of pathetic. If we have a real server, it will need to do something radical like have multiple connections at once. People have these expectations.
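The non-zero terminator can be illustrated with a small table walker. This is a sketch of the design point only: the `struct opt_entry` layout and `count_options` are hypothetical, and only the idea that the end marker is 8 while 0 is invalid comes from the talk.

```c
#include <assert.h>
#include <stddef.h>

enum opt_type {
    OPT_NOARG = 1,
    OPT_HASARG = 2,
    OPT_END = 8     /* terminator; 0 is deliberately not a valid type */
};

struct opt_entry {
    const char *name;
    enum opt_type type;
};

#define OPT_ENDTABLE { NULL, OPT_END }

/* Returns the number of entries before the terminator, or -1 if it meets
 * an entry with an invalid type, such as the zeroed memory you run into
 * when someone forgot OPT_ENDTABLE. */
static int count_options(const struct opt_entry *table)
{
    int n;

    assert(table != NULL);
    for (n = 0; table[n].type != OPT_END; n++)
        if (table[n].type != OPT_NOARG && table[n].type != OPT_HASARG)
            return -1;
    return n;
}
```

With an all-zero terminator, the forgotten-terminator case would silently look like a correctly ended table whenever the following memory happened to be zero; with this scheme it is reported the first time the table is walked.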
And there are two ways to do this, when we think about how we handle multiple things. You basically have some kind of select loop thing, or you use threads. Those who know me knew that that was never going to happen, so I did briefly consider turning this into an antithread tutorial, and spending the time that I actually spent writing this code on turning antithread into something that I wasn't embarrassed to show to people. But in the end it was just simpler to go, no, let's go for an event loop style thing. I chose to use tevent. There are a whole heap of libraries out there. tevent is a little bit baroque these days, but it's what Samba uses. And it's very straightforward: here's my file descriptor, here's my callback, throw all those in, and then call the event loop and it will call all the callbacks. Very, very standard stuff for this kind of programming. Of course, at this point we're basically talking about a complete rewrite of oserver, which would look like this. And when we look through it, we have a structure for each client, because now we have more than one. It has some state: first it's receiving the user's question, then it's sending an answer, and then it's finished. For each state we have whether we're interested in reading or writing events. We have our event struct that we've registered, the file descriptor itself, the question that we're reading or sending out, and the number of bytes sent. We have an array of five clients, and we have the listening file descriptor and the event struct for that. Yeah, well, you know. So our interface has changed radically. We now just have a setup function, because it registers all the callbacks. Now, tevent uses talloc. Is anyone not familiar with talloc? Okay, you're in the wrong room again, but we will go through that very, very quickly.
So we do our socket and everything and listen, but then we basically throw it into the event context that they've handed us, with the callback being add_client. We set it to auto-close as well. So when we get a connection, we call add_client, we set up our client state, we do an accept, we register that file descriptor with our event context, and we say, call service_client when that's ready. We go through our clients array, we find a free entry, we go, okay, here's my slot. We set our pointer up, so our client's in the array. We set up a destructor. So, talloc is a hierarchical allocator. Every time you allocate something in talloc, you can then allocate other things off it. And when you free the parent, you free the rest of it. It makes a lot of memory handling much, much simpler and much nicer. One of the things that it also gives you is the ability to go, when you free up this memory, here's the destructor. So things can clean themselves up, which turns out to be incredibly useful. So that's what the destructor's about. And as you'd expect, cleanup_client just goes through the array, finds itself, and nulls itself out. The other twist is that if our array is full, we actually tell tevent we're not interested in any more events on the socket that we're listening on. So it sets that event flag to zero. So in the destructor, when we zero ourselves out, we have to say we're interested in read events again, so new connections can come in, just in case we were the one that made it full. Okay. Our state machine looks like this. We read a string. If something goes wrong, we go to fail. If the input's finished, we uppercase the whole thing and set our state to sending answer. Sending answer calls send_string, which does exactly what you expect. If it's finished, it bumps the state by one. We should never get to the default state in here. If we're not finished, we return. Otherwise, we talloc_free.
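The receive-then-send state machine described above can be sketched independently of tevent and talloc. In this illustration the socket reads and writes are replaced by plain buffers so the transitions are easy to follow; the struct layout and the names `receive_chunk` and `send_chunk` are assumptions of the sketch, not the talk's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum state { RECEIVING_QUESTION, SENDING_ANSWER, FINISHED };

struct client {
    enum state state;
    char question[64];
    size_t len;         /* bytes of question held so far */
    size_t bytes_sent;  /* bytes of answer already written */
};

/* Stands in for a readable event: append one chunk of input. Once a
 * newline arrives, uppercase the line and move on to sending.
 * Returns false if the question does not fit. */
static bool receive_chunk(struct client *c, const char *data, size_t n)
{
    char *nl;
    size_t i;

    assert(c->state == RECEIVING_QUESTION);
    if (n > sizeof(c->question) - c->len)
        return false;
    memcpy(c->question + c->len, data, n);
    c->len += n;

    nl = memchr(c->question, '\n', c->len);
    if (nl) {
        c->len = (size_t)(nl - c->question) + 1;
        for (i = 0; i < c->len; i++)
            c->question[i] = (char)toupper((unsigned char)c->question[i]);
        c->state = SENDING_ANSWER;
    }
    return true;
}

/* Stands in for a writable event: emit up to max bytes of the answer,
 * bumping the state once everything has gone out. */
static size_t send_chunk(struct client *c, char *out, size_t max)
{
    size_t n = c->len - c->bytes_sent;

    assert(c->state == SENDING_ANSWER);
    if (n > max)
        n = max;
    memcpy(out, c->question + c->bytes_sent, n);
    c->bytes_sent += n;
    if (c->bytes_sent == c->len)
        c->state = FINISHED;
    return n;
}
```

The real server does the same bookkeeping, but each step runs from an event callback, and reaching the finished state is what triggers freeing the client.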
We just free up the client, which will call the destructor and have it remove itself. Okay. Cool. So that is our simple program. And that is what the talloc hierarchy looks like. We have this struct tevent_fd thing that we register. We hang the client off it. And off that, we hang its things, et cetera, et cetera. Okay. Now there are two things that suck. Oh. No. That's thrown me. There are two things that suck about this library. One is the use of void star. This is not an appropriate usage for void star. This should be a struct oserver, because you can quite happily in C have declarations without definitions. So the user doesn't need to know anything about it, just that it exists. That makes the library user's life much, much simpler, because it's a lot harder for them to misuse it. And the other thing, which happens to involve the same solution, is this global here. We should put that inside the struct oserver. We'll obviously need a back pointer from our client to say which oserver it belongs to. Even if you have no expectation of anyone actually ever creating multiple oservers in the same process, there's nothing in particular to stop them, so you should make it more robust. And while we're doing this oserver trick: our talloc hierarchy will now look like this. We have this struct oserver, which contains everything else. And we told them in the header that they could simply talloc_free the thing that's returned, and it would all get freed. In fact, that will happen correctly. When we free up the oserver, it will destroy that. It'll destroy that. It'll destroy all the clients that hang off it. It'll all clean up. They will call their destructors, all good. Okay, ah, yes. So, having done that conversion, we run ccanlint. Right, there we go. Okay, valgrind. I cannot say enough good things about valgrind. Of course, ccanlint runs your tests under valgrind. I actually heard a story of a busload of orphans that were heading towards a cliff, who installed valgrind.
You know, it's just amazing. I don't understand the logo, but it's amazing stuff. So, it tells us the problem is in oserver.c. We can actually have it attach the debugger, but if we just go to line 172, it becomes pretty obvious. When we moved this clients array into the oserver, it used to be a static, so it was implicitly initialized to nulls. We didn't explicitly initialize it when I moved it into the oserver struct. So when we allocate the oserver, I just talloc'd it and didn't initialize it. That's what valgrind's complaining about. So we fix that up. We add a clear of the clients array, which is a bit gratuitous and marginally non-portable, and we call it when we set up the oserver. And now valgrind should be happy. Let's run ccanlint again. And this time it will be happy. Okay. So we're still pretty pathetic. We basically receive a question and spit it back out in uppercase. We should at least get back to where we were, where we send a greeting. So we add a few states. When we send a greeting, it looks like: welcome, please ask your question. And then it says, our answer is, and of course then spits out the same string in uppercase. That's still pretty pathetic. Slightly better is to actually change it just a little bit and keep the last answer that they gave. So, the last question that they gave, and use that as the answer. So we change it slightly: you ask the question, and then it says, I believe a better question is, which is a great political stunt. Okay. Actually, I have a script that does this for me. It's called RunServer. I think it should work. Okay. It doesn't work. I wrote that in the keynote this morning, so no huge surprise. -v, okay. Well, so ccanlint creates a temporary directory and then blows it away afterwards, unless you tell it to keep it. You can tell it to run a particular test too, but I can't remember the syntax off the top of my head. So, 26, example, I'm just going for oserver.
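Two points from this stretch of the talk, the opaque `struct oserver` handle and the array that stopped being implicitly zeroed once it left static storage, fit in one sketch. It uses plain malloc rather than talloc to stay self-contained, and the function names (`oserver_new`, `oserver_port`, `oserver_num_clients`, `oserver_free`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- What the header would expose: a declaration with no definition. --- */
struct oserver;
struct oserver *oserver_new(unsigned int port);
unsigned int oserver_port(const struct oserver *os);
size_t oserver_num_clients(const struct oserver *os);
void oserver_free(struct oserver *os);

/* --- What stays private to the .c file. --- */
#define MAX_CLIENTS 5

struct client;

struct oserver {
    unsigned int port;
    struct client *clients[MAX_CLIENTS];    /* formerly a static global */
};

struct oserver *oserver_new(unsigned int port)
{
    struct oserver *os = malloc(sizeof(*os));
    size_t i;

    if (!os)
        return NULL;
    os->port = port;
    /* A static array starts out as NULLs for free; heap memory does not,
     * so clear the slots explicitly (a loop is portable, unlike memset). */
    for (i = 0; i < MAX_CLIENTS; i++)
        os->clients[i] = NULL;
    return os;
}

unsigned int oserver_port(const struct oserver *os)
{
    assert(os != NULL);
    return os->port;
}

size_t oserver_num_clients(const struct oserver *os)
{
    size_t i, n = 0;

    for (i = 0; i < MAX_CLIENTS; i++)
        if (os->clients[i])
            n++;
    return n;
}

void oserver_free(struct oserver *os)
{
    free(os);
}
```

Users of the header can hold and pass around a `struct oserver *` but cannot poke inside it, and nothing stops them creating two servers in one process because there is no global state left.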
So then we go telnet localhost 2727, and we go, how much wood would a woodchuck chuck? And it says, you know, it had a pre-canned question. So if you telnet in again: if a woodchuck could chuck wood. Do you know how long it took me to come up with that? So, you know, it basically just gives you the previous question. Okay. So that's dumb. Now, my plan was to basically leave that running so that you guys could telnet in, but because the wireless tends to go up and down and I wanted to fix the IP address, I plugged myself in, the downside of that being you guys can't reach me anyway. But, you know, let's pretend that you guys are all furiously hacking, you know, trying to crash my flawless server, which could never be crashed, and that I'm equally relieved that I'll never have to put that statement to the test. So, local bin parrot, and let's run parrot. Okay. So there's our server off and running. Cool. So if any of you managed to get onto my IP address there, then you would be able to telnet in and ask yourself stupid questions. Okay. Now, let's turn this into a bit of a real project by adding a few more states. In this case, I will give you, what did that actually say? Okay, cool. I'll actually give you a demo by blowing away all the ccanlint stuff that I've had temporarily before, and running ccanlint verbosely, keeping the results, on the example in _info. So basically I'm just using the example that was in the _info file; that is my server, which is a little bit odd. But okay. So now, for this, I would normally use someone else from the audience, but I'll do it myself. Since we've changed the implementation, we've changed the port number we're running on to 2828. Please ask your question. Since this is supposed to be an oracle, we'll say, now we can ask it other questions like this one. And of course it uses the old trick where it goes, oh, well, while I'm pondering that, why don't you answer?
Here's another question for you, and you say something like, I don't know if there are risks, or whatever. Okay. And, sorry, no. For those of you who remember the early nineties, this is of course the Usenet Oracle. For those of you who don't, it's like Chatroulette without the nudity. So, the state machine now looks like this. I'm not going to go through it in too much detail, but we've basically just added a few states. In particular, this is an example where we've got two clients set up, and at some point in the state machine you go, oh, I'm trying to find my oracle, which is another client, and I'm going to be someone else's oracle. So we have a sub-client pointer and an oracle pointer. So at some time in the state machine, when it's going to ask you the question, it waits till it can find one. If necessary, it jumps into that state that says wait for client. So someone else looks through and goes, oh, you're in wait for client. Okay, I'm ready. So let's swap. So that's a pretty straightforward code transformation. Let's see, we've enhanced our state machine now. It's getting a little bit more serious, but it's still kind of dumb. Now, when our input's finished, we go and get a sub-client. If we've got one, then we move on. Otherwise, we go into the waiting state, and similarly, we look for an oracle for ourselves, and then when we've answered the question, we bump the other person out of the waiting state. All very, very simple stuff. There's just one problem, and that is: what if a client disconnects? You go, well, while I'm doing that, ponder this question, and they just leave. Well, I didn't quite know the answer, so I wrote a test, which looks like this, and of course it just forks off a client which then disconnects halfway through. In fact, it starts two clients for this case, tells one of them to disconnect halfway through, and we see what happens. ccanlint; rerun ccanlint.
And of course it goes away and runs all the tests, and of course hits valgrind, and valgrind goes: use after free. Okay, I'm not going to go into the details of how you should actually fix that. I had a real moral here about how just because you fixed your test doesn't mean your code doesn't suck. The way I do it is that in the client destructor I now go, oh, if I'm someone's oracle, let's just set that pointer back to null. And then, every time they need an oracle, or they're about to do something, they check: do I have one? Let's just find another one if we don't have one. That means you get reconnected randomly with someone else, so if you were halfway through answering someone's question and they vanish, your answer will go to someone else. The true believers would consider this a sign of the mysticism of the oracle. I consider it a feature and refuse to fix it. So, okay. So, we hack that, all good. Where did I, did I actually hit that? Test, just doing it, okay. So I fixed that, hack, hack, hack, and we basically just reconnect as required, on demand. It almost works. Now, a serious problem with this kind of server is that valgrind is great for giving you a report of memory leaks when you exit, but while it's running, that's not so useful, particularly if it does actually clean up at the end. So, a really, really cute trick is to hook up a signal handler to do something like this. tevent, so hold on, in oserver.c. tevent, of course, does signal handlers as well. So you can say, on SIGUSR1, call this callback, talloc_dump, and hand it the oserver. talloc_dump basically opens a file and does talloc_report_full, which basically dumps that whole tree for you. And if you want, there's a cute utility, I think, out there somewhere, which can convert that to a dot file, so you can graph it in Graphviz and put it on your wall. But you could run that every hour and see, why is this growing over time?
And it will show you all your memory allocations and how they're related, which is incredibly useful. One of the problems with this implementation, of course, is that your server stops while you're doing this potentially large dump, particularly if you do have a memory leak and it's writing out megabytes and megabytes. So, one clever trick is to actually do the whole thing in a child, so the main server can go off and do stuff, using the operating system's copy-on-write for all that stuff. So that's kind of cute. Okay, so that's kind of useful. I'm not going to demo it. Does anyone remember something Tridge wrote in like '98, I think, called genparser.pl? Yeah, Tridge remembers, Tridge remembers. It's basically a code generator that looks through your headers and generates bundle and unbundle functions for your C code. It is a Perl monstrosity, and so I spent a day converting it to CCAN. I planned to add all these features. I just got it to work, and then I was happy and backed away. But what we can do is, if we pull out the types that we're interested in into a separate header, rather than leaving them internal, we then include cdump, and we annotate them slightly to say which ones we're interested in saving. For any structures that it doesn't see the definitions for, we can either write our own bundle and unbundle functions, expose the implementation so it can actually bundle and unbundle them itself, or just say cdump-ignore on stuff. So we do that in a couple of places where we're just too lazy. It doesn't really matter that we're not dumping that. It's not particularly informative for us. Okay, and now, sorry, I did have a cheat sheet, but I left it in my bag. Okay, now we have a tool, a little helper file, that calls the C code parser and everything else. And the first time I ran this, of course, yeah, it chewed up all my memory and then fell over.
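The dump-in-a-child trick from earlier in this passage relies on fork giving the child a copy-on-write snapshot of the server's memory. A minimal sketch follows, with the state to dump reduced to an array of strings; `dump_in_child` and `wait_for_dump` are hypothetical names, and a real server would reap the child from its event loop instead of blocking.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes the lines to path while the parent goes
 * straight back to serving. Returns the child's pid, or -1 on failure. */
static pid_t dump_in_child(const char *path,
                           const char *const *lines, size_t n)
{
    pid_t child;
    FILE *f;
    size_t i;

    fflush(NULL);       /* don't let the child inherit unflushed output */
    child = fork();
    if (child != 0)
        return child;   /* parent, or fork() failed */

    /* Child: memory is a snapshot of the parent as of the fork(). */
    f = fopen(path, "w");
    if (!f)
        _exit(1);
    for (i = 0; i < n; i++)
        fprintf(f, "%s\n", lines[i]);
    _exit(fclose(f) == 0 ? 0 : 1);
}

/* Blocking reap, for illustration: returns the child's exit status,
 * or -1 if it did not exit normally. */
static int wait_for_dump(pid_t child)
{
    int status;

    assert(child > 0);
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

However long the write takes, the parent's view of its own state keeps changing freely; the kernel copies pages only as the parent modifies them.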
cdump's kind of dumb in the same way that genparser was, and that is, yeah, it doesn't handle loops. So we had a pointer from the oserver to the client and a pointer from the client to the oserver, and it just went around and around and around. Well, that's kind of easy to solve. We just set the oserver pointer to cdump ignore, because we know what it is; no point in having the back pointers. But for the oracle pointers between the clients, that gets a bit harder. Oh, I really should hack it so that it remembers what it has already dumped and doesn't replicate it. Or I could just change them to integers and use an index; that works too. So that was an easy fix. So we did that, and then of course we hook it up to a signal handler and call it dump. Let's look at dump, here's our dump. We call cdump bundle, we open up a dump file, we write the whole thing out, and then we close it and free the string. So we add SIGHUP. Basically, we've added a dump file parameter to the oserver setup function. We add a dump file. If they set it to non-NULL, then we initialize the signal handler to dump out to the dump file. Straightforward. Now, that's probably worth showing. So I'm going to do, let's see if this works this time. Cool. Okay, so it's actually now running our server. We telnet to 2828, and, I'll look at those, 2828. "You guys", that's an interesting question. Okay, and then we killall -HUP oserver. Oh, sorry, it's called the example underscore info oserver. And we cat the oserver dump in /var/run, and we actually see, there, we've actually dumped out the state in a nice readable form. So in fact, you can edit it if you wanted to. Now that's kind of cute. You actually get more visibility into what your server's doing, more than just out of the straight talloc dump. But the fun bit is really not dump, it's more restore. cdump can also unbundle it back into your structure. So let's do that. Nine, let me apply that patch. Now we have an oserver restore.
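To make the "change them to integers and use an index" fix concrete, here is a small sketch. The struct layout, field names, and text format are assumptions for illustration and are not cdump's actual output. Two clients that are each other's oracle would be a pointer cycle; stored as indices into the clients array, a flat loop over the array dumps them without ever following a link.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CLIENTS 5
#define NO_ORACLE (-1)

struct client {
    int in_use;
    int oracle;        /* index into clients[], or NO_ORACLE */
    char question[64];
};

struct oserver {
    struct client clients[MAX_CLIENTS];
};

/* Zero everything and mark every slot as having no oracle. */
void oserver_init(struct oserver *os)
{
    assert(os != NULL);
    memset(os, 0, sizeof(*os));
    for (int i = 0; i < MAX_CLIENTS; i++)
        os->clients[i].oracle = NO_ORACLE;
}

/* Write one line per live client into buf.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if buf is too small. Cycles between clients are harmless
 * because we never follow the oracle link, we only print it. */
int dump_clients(const struct oserver *os, char *buf, size_t len)
{
    size_t used = 0;

    assert(os != NULL && buf != NULL && len > 0);
    buf[0] = '\0';
    for (int i = 0; i < MAX_CLIENTS; i++) {
        const struct client *c = &os->clients[i];
        if (!c->in_use)
            continue;
        int n = snprintf(buf + used, len - used,
                         "client %d oracle %d question \"%s\"\n",
                         i, c->oracle, c->question);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

A side benefit of indices is that they stay meaningful after a restore, whereas raw pointer values would not.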
So you basically hand it a dump file and an event context, and it returns a successfully reconstructed oserver: it sucks in the file and calls cdump restore on it. Hopefully things work. Gets it out. Well, that's reasonably straightforward. Restore, here we go. We wrap all the stuff that we need in both branches into a separate function, so we can just call oserver restore. We load in the file. If it works, we complete the server. Okay, remember that stuff we ignored? We kind of need to put it back at this point. So we basically iterate through, and we reconnect those oserver pointers and recreate the file descriptor events. So basically we re-register everything with the event context, and that works. But the other thing is that cdump doesn't understand talloc at all. So instead of this nice talloc hierarchy, we end up with this flat row of crap. Our talloc hierarchy is simple, so rather than fix cdump, I just talloc_steal things back to where they're supposed to be and rearrange my tree. All done. Okay, so that gives us a nice restore, but that is completely useless, because how do you restore all the connections and everything else? I mean, you know, there's no point other than the cuteness, and this is where you get to the joy of exec. Right, everyone see where we're going now? Okay, so oserver.h. If they hand in an array of arguments, it means: when I get SIGHUP, I want you to just dump it out and then exec those arguments. Now, those arguments are the same ones we had before, with --restore put in the middle. When we hit --restore, we actually try to restore from the file. Okay, now this is my first risky demo. Other than the other ones, this is my first really risky demo. Okay, so we run up the server, it builds it and everything else. Still getting 11 out of 11, that's good. We telnet in and go "this is my question", killall, telnet in again, another question, whoa! Okay, it's still alive. That's a good sign. /var/run, oserver dump. Indeed, it actually restored the whole question.
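The dump-then-exec step can be sketched like this. The function names and the exact argument layout are assumptions: the talk only says the original arguments are reused with `--restore` "put in the middle", so passing the dump path right after it is a guess made for the sake of a complete example. What makes the trick work is standard POSIX behaviour: open file descriptors that are not marked close-on-exec stay open across `exec`, so the new binary inherits the listening socket and every client connection, and only needs the dump to learn what they mean.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Build a new NULL-terminated argv:
 *   argv[0] --restore <dump_path> argv[1] ... argv[argc-1]
 * The strings are shared with the caller; only the array is allocated.
 * Returns NULL on allocation failure. Caller frees the array. */
char **make_restore_argv(int argc, char *const argv[], const char *dump_path)
{
    char **out;

    assert(argc >= 1 && argv != NULL && dump_path != NULL);
    out = malloc(((size_t)argc + 3) * sizeof(*out));
    if (!out)
        return NULL;

    out[0] = argv[0];
    out[1] = (char *)"--restore";
    out[2] = (char *)dump_path;
    for (int i = 1; i < argc; i++)
        out[i + 2] = argv[i];
    out[argc + 2] = NULL;
    return out;
}

/* Replace the running program with a fresh copy of itself.
 * The state must already be written to dump_path.
 * Only returns (with -1) if something failed. */
int reexec_with_restore(int argc, char *const argv[], const char *dump_path)
{
    char **args = make_restore_argv(argc, argv, dump_path);

    if (!args)
        return -1;
    execvp(args[0], args); /* open fds without FD_CLOEXEC survive this */
    free(args);
    return -1;
}
```

For the restored process to pick up its connections, the dump has to record the file descriptor numbers alongside each client's state.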
Well, that's actually really good. Oh, wow, okay, so that worked, good. The real point is that this means you can do live upgrades. If you need to fix a bug or whatever, you can basically just invoke this code and upgrade. The cute thing is that cdump, when it reads from this dump file, if there's something that is not in the dump file, it sets it all to zero. So this means that you can actually do extensions as well. You can add a new flag or a new pointer in there, knowing that it'll come out as NULL when it comes out of a dump file. So you can either fix it up or just have your code naturally handle that case as NULL. That means you can do incremental enhancements. Tridge wrote this so that he could do it with his chess server without ever taking it down. He could actually upgrade it. Now, you can do slightly more complicated upgrades, and oh yeah, you should also probably not do it that way. You should probably do it inside a child, and test that the dump works, because it has a tendency to segfault and get into infinite loops. And if it works, then have a --restore-check and try exec'ing that first. If that works, then the second time do it for real. But I had perfect faith in this working demo. So, okay, let's leave our successful victim running. Running, we'll move the ccanlint example from /tmp to /usr/local/bin/oserver and run oserver --port 2829. Okay, so the idea was that you guys would be able to kind of telnet in and everything. Yeah, no, that's... we're in trouble. One minute warning. Okay, now the other thing that we hit is because of this clients array, brackets five. Okay, so what we'd really like to do is change that, introduce like a max clients rather than that. The problem is that if we went from here to here and we did a dump, max clients wouldn't be in the dump, so it would come out as zero at the other end. There are several ways of handling that. One is to write the new one to go: oh, zero, you could never be zero, we'll assume that you're five or whatever.
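The "zero means it came from an old dump" fix-up can be illustrated with a toy restore routine. Everything here is hypothetical: the `name value` text format, the struct, and the function name are invented for the example and are not cdump's format or API. The only behaviour borrowed from the talk is that fields missing from the dump come out as zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LEGACY_MAX_CLIENTS 5 /* the old hard-coded array size */

struct oserver_state {
    unsigned int port;
    unsigned int max_clients; /* new field, absent from old dumps */
};

/* Parse "name value" pairs separated by whitespace.
 * Unknown names are ignored; fields not mentioned stay zero.
 * A zero max_clients can only mean "dump written by the old version",
 * so it is patched up to the old fixed limit. */
void restore_state(const char *dump, struct oserver_state *st)
{
    char name[32];
    unsigned int value;
    int consumed;

    assert(dump != NULL && st != NULL);
    memset(st, 0, sizeof(*st));

    while (sscanf(dump, " %31s %u%n", name, &value, &consumed) == 2) {
        if (strcmp(name, "port") == 0)
            st->port = value;
        else if (strcmp(name, "max_clients") == 0)
            st->max_clients = value;
        dump += consumed;
    }

    if (st->max_clients == 0)
        st->max_clients = LEGACY_MAX_CLIENTS;
}
```

This only works because zero is never a legitimate value for the field; where zero is meaningful, a fix-up like this cannot tell an old dump from a new one.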
The better way to do it is to do it indirectly. You do an upgrade like this. You put in max clients, you set it to the maximum number of clients, but you don't actually use it yet. You run that. So you upgrade, that's what I called it. This will build it, move it over, and tell the thing to re-exec. So hopefully that has got us to this middle stage here, and then we actually start using it. We actually tell cdump that's the length of the array, using that annotation. So now we've actually got a variable number of clients, and then we upgrade again, still running. Now, you're supposed to all be furiously typing away, and suddenly more than five of you could get on the server at this point. It's gonna be so cool. Okay, of course, what if you don't have cdump? What about that old parrot program that we have, for example? What if that were a mission-critical piece of software to upgrade? Well, that's infeasible. To do it, you'd have to do something like, you'd have to write some kind of script in your favorite scripting language. And it would have to, you know, fire off GDB and attach to it. I mean, when you think about it, though, in that original example, you might not remember, we didn't actually save the oserver pointer. So you'd actually have to grovel through the event structures to find the one with the right callback, and know that the private pointer was our oserver pointer. Suck that out, dump it out, grab the file descriptor, walk the whole client array, and repopulate that, setting this, parsing the states with some kind of really, really hacky regular expression parser, and then bundle the whole thing up, tell GDB, please exec the new version with --restore, and hope the whole thing worked. Now, I would not recommend anybody ever do this. Oh, yeah. You'd have to make, you'd have to bridge in that. That, too. And, okay. Wow, my parrot has turned into an oracle. Okay. And we have no time for questions. Thank you. He does, mate.
He's magnificent. When he worked in the large room, you know, the larger auditorium that holds about 400 people, we gave him this huge, great bowl made from macadamia nutshell. Here he's only working in a little tiny auditorium with half the number of people, so we've given him a little tiny bowl. A tiny bowl.