All right. First of all, I have a pretty bad headache right now, so I hope I won't bore you too much, but I'll try to stay on top of things. What I want to do in this session is show you a bit of the code and answer any questions you may have. Has any of you looked at the code? All right, so I think it's good if we start from the high level again — start from this graph, talk a little about it, and recap what you may have forgotten from yesterday. Oh man, these arrows are beautiful. And then we'll get to the code. Let me start with our webpage; I'm sure you've seen it. All the links you need are on there, specifically how to clone the Git repository. Are you all familiar with Git? Okay, so that won't be a problem — great. Then I assume you have the code, or you can get it with this command, and we'll look at it later.

The main component of NetConf at the moment is a reactor loop, and that loop is centered around a select() call. There is a little extra we have to do around select() because we have timeouts and signals and that kind of thing, and Python doesn't make it very easy — we'll see that later. But basically the core is a select loop over file descriptors: whenever a file descriptor has data on it to read, a reactor runs. Right now it's only reading; I initially thought we might have to do writing at some point, but it's only reading. So in the code you will often see FD reactors — they're actually called FDR reactors, file descriptor read reactors. We might rename that later, because I don't think we need any of the other select possibilities, like "can we write to this file descriptor?" or "is there an error on it?"
So far all the file descriptor handling has been unproblematic, so I didn't see any need to do anything there. In this loop you have a number of reactors, and that's the core approach to plugging anything into NetConf: you register it as a reactor. For instance the control socket, which is what I used yesterday in the demo where you issued commands and let NetConf do something; but also dhclient and wpa_supplicant and so on, each of which has a control socket. So everything goes over file descriptors, and we're using a lot of them, which may or may not be a problem — I don't think it will be. I don't actually know the maximum number of file descriptors a process can open, but I'd assume it's somewhere around 64,000, and I don't see us reaching that.

What happens when you want to issue a command on the control socket is this: there will be data on the file descriptor associated with the listening socket, and the reactor for that socket will then create another reactor to handle the client. This is the usual SOCK_STREAM accept model, which allows multiple clients on a single listening socket — it's not one-to-one; for every client that connects we spawn a reactor of its own, and that reactor then does the event creation and passing. So the reactor creates an event, and it goes through the authorizer, whose job is simply to figure out whether this request should be allowed. There are going to be other event sources here, too.
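As a concrete illustration, here is a minimal sketch of such a select-based reactor loop. The names (Core, register_fdr_reactor, run_once) are my own, not necessarily NetConf's exact API, and a plain pipe stands in for the control socket:

```python
import os
import select

# Sketch only: reactors are callables indexed by file descriptor; the core
# loop select()s on all registered fds and invokes whichever has data.
class Core:
    def __init__(self):
        self.fdr_reactors = {}

    def register_fdr_reactor(self, fd, reactor):
        self.fdr_reactors[fd] = reactor

    def unregister_fdr_reactor(self, fd):
        del self.fdr_reactors[fd]

    def run_once(self, timeout=None):
        fds = list(self.fdr_reactors)
        readable, _, _ = select.select(fds, [], [], timeout)
        for fd in readable:
            # the reactor is responsible for shaving the data off the fd
            self.fdr_reactors[fd](fd, self)

# Usage: a pipe stands in for a control socket.
received = []
r, w = os.pipe()
core = Core()
core.register_fdr_reactor(r, lambda fd, c: received.append(os.read(fd, 4096)))
os.write(w, b"if-up eth0\n")
core.run_once(timeout=0)
print(received)   # [b'if-up eth0\n']
```

The real loop additionally has to deal with timeouts and signals, which the talk comes back to later.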
One of them that I didn't include here is Netlink, which listens to kernel events: if you plug in a new network interface, or there's a link-status change on an interface, an event is created here as well. These also go through the authorizer, and the authorizer uses both the data from the request and the source it comes from to authorize it. Because Netlink doesn't carry UIDs — it's always the kernel speaking — the authorizer won't look at UIDs at all when a request comes from the Netlink socket. This is actually implemented such that the authorizer asks the control socket again; I'll show you that in the code later.

Then we come to the interface policy. It's an event-based system, and there are two classes of events: the command event and the result event. Now, we only recently re-added all this event stuff. Until January, NetConf was multi-threaded and had events — everything went via events — and then I decided to kick out the multi-threading, and it was single-threaded from that point on; and for some reason I can't really explain, I also got rid of the events. I honestly don't know why I did that, because events seem to be a very good fit for this sort of asynchronous approach.

A question? Why did I want to go single-threaded? If you're very interested in the technical reasons, I won't be able to discuss all of them here, but if you look into the archive — I think in REF there is an IRC log from January 16th, and another from January 22nd; this was on debian-devel — about all the things that threads do that are bad, and so on and so forth.
The main reason that first made me think about it is polling. The initial implementation was a multi-threaded event-passing structure, and each event had an event lock — a mutex, basically. You would issue the event and then wait for it to be completed in some other thread, and this waiting was polling. Polling wakes up the processor all the time, so if you ran powertop with NetConf running at the same time, it showed something like 99.8% wakeups. It was very, very bad — that's not how you write code. There were two ways out. One was to drop the polling in favour of a select-based approach: marking an event done happens via a pipe that is created for each event; you block in select() on that pipe, and when the event is done, the other side writes some data to the pipe and select() returns. Same semantics, no polling.
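A sketch of that pipe-per-event idea — hypothetical names, not the old NetConf code — might look like this:

```python
import os
import select

# Sketch: marking an event "done" via a pipe instead of polling a flag.
# The waiter blocks in select() on the read end; whoever completes the
# event writes one byte to the write end to wake the waiter up.
class PipeEvent:
    def __init__(self):
        self.r, self.w = os.pipe()

    def set(self):
        os.write(self.w, b"x")          # wake up any waiter

    def wait(self, timeout=None):
        ready, _, _ = select.select([self.r], [], [], timeout)
        if ready:
            os.read(self.r, 1)          # drain the wake-up byte
            return True
        return False                    # timed out, event not done yet

ev = PipeEvent()
ev.set()
print(ev.wait(0))   # True: the event was already signalled
```

Note the cost mentioned in the talk: every event consumes two file descriptors, one for each end of its pipe.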
That was one possibility, though it would also have consumed a lot of file descriptors — two for each pipe. But by that point I also didn't like the multi-threaded approach at all. It was horrible to debug; it was very, very difficult. As soon as you're in the debugger stepping through code, it's not real-time anymore, it's absolutely non-deterministic, so I found it really hard to figure out what was going on at what point in time — and I guess that's a problem everyone has with multi-threaded development. So I decided to go completely single-threaded, and that seems to be a common thing to do. Twisted was named yesterday, so I looked at it today: it's a reactor framework for Python that allows you to write servers and clients for the internet very easily, and it is also single-threaded — well, its main loop is. It has an API for the main loop so you can swap in another one, GTK and so on, but the default approach is single-threaded. I also looked at implementations like Apache, where you have a model with multiple threads, but each of those threads is in and of itself single-threaded. It's really an optimization of the single-threaded model: you have, say, five threads so you can react faster, but each of them is single-threaded — it's not that Apache just creates threads all over the place. I think this is the preferred approach now.

This decision did cost us a lot of problems, though. Is there a pen for this? Do we have one? Well, I can use the text editor here quickly. Imagine you have this function — it's Python, so no braces — and the function does something and then says event.wait(), and then it would
continue to do something else. That's what was in use in the beginning, when it was multi-threaded, and it was very simple: you had a function, you could wait, and then you could continue with the function, and every single variable you declared up here was still available down here. Now, as soon as you go single-threaded, you don't have that anymore, because none of the languages I know make it easy to write something like coroutines: you have no way to jump into the execution of a function from the outside. You can't say "stop executing here, and when you're done, continue right where you were, with all the context and all the state information."

So with a single-threaded approach, your function would do something like issue an event — whatever, an if-up — and that would be it; the function would be done. Now, in order to react once this event has been processed — to know that the event has been processed — you have to start using a callback. We call these things cb_, for instance cb_result, and cb_result is just another function where you do your stuff. But you don't actually have x anymore: if you say y = x in there, that fails — there is no x in scope. So now you suddenly need arguments here — you want x to be passed in — but how are you going to do that? You can only hand over a function reference, and it will be called without your arguments. So we started doing things like this, using a Python closure: handle_result, or something like that, and handle_result would simply call cb_result with x, because x is actually in scope there. And so now, instead of
cb_result, we passed handle_result, and now this works, because cb_result was being called with an x — I could pass arguments around. This is very Pythonic; you can't port it directly to C. It is portable to C++, though: first of all, newer standards allow you to do this kind of thing, and in any case, any time you use a closure function like this, you can also create a class that is callable and store the data in there. In very many cases it would have been a lot better if we had used a class for this, simply because it's cleaner. Twisted, which I looked at today — and saw James also looking at — does it like that: it uses a class called a Deferred, and the Deferred manages your callbacks. Whenever a function like this is done, it returns a Deferred object, and that Deferred can then be used to register callbacks. You still have the callback situation, but I think it does all the argument passing for you.

If you compare the function up here, which is very simple and linear, to this, you can see that things get complicated very quickly. By changing from multi-threaded to single-threaded we had to completely rethink the way the main loop works and the way we do things in the process, and it gets horribly disgusting — if I show you the /etc/network/interfaces handler later, it actually uses a work queue to do what multi-threading would otherwise do for you. So in very many ways I'm emulating or simulating multi-threaded programming in a single-threaded design. This decision had many repercussions, but I still maintain it's a good approach, and if I look at Twisted, and I look at Apache, that seems to confirm that this is actually the way to go.
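To make the closure-versus-class point concrete, here is a small sketch; the names (if_up, handle_result, HandleResult) are illustrative only, not NetConf's actual code:

```python
# Single-threaded style: the function issues its event and returns; a
# closure captures x so the callback still has it later.
def if_up(x, register_callback):
    def handle_result(result):
        # x is captured from the enclosing scope -- the closure trick
        return (x, result)
    register_callback(handle_result)

# The portable equivalent: a callable class with the state made explicit.
# This is the form that maps onto C++ (a functor, or a lambda in newer
# standards), and roughly what Twisted's Deferred formalizes.
class HandleResult:
    def __init__(self, x):
        self.x = x
    def __call__(self, result):
        return (self.x, result)

callbacks = []
if_up("eth0", callbacks.append)
print(callbacks[0]("bound"))          # ('eth0', 'bound')
print(HandleResult("eth0")("bound"))  # ('eth0', 'bound')
```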
So, where were we? We have these events, and we have the interface policy. I like to think of the interface policy as just a giant case statement that looks at: the command, its arguments, the source it's coming from, any previously used handlers for this command, and their results. Initially, an event is just the command, its arguments, the source, and then None and None, because nothing has been tried and nothing has been returned — that's how you get the initial handler. Once we've tried "if-up eth0" on the control socket, and ENI was returned and failed, the interface policy is asked again: "I want if-up on eth0 to happen, it's coming from the control socket, and we previously tried ENI, which said it failed — what should we do now?" And then it returns DHCP. I'll get into the code in just a second.

The way we ended up realizing this is with an event hierarchy: a command event is a parent, and every other event that is issued because that command came around has a parent pointer to the previous event. So rather than a shallow tree where every single event has one and the same parent pointer to the command event, each event points to the previous event. If DHCP also fails and we do link-local, the link-local result event has a pointer to the DHCP result event, which was a failure, which has a pointer to the ENI result event, which was a failure, which has a pointer to the initial if-up command. By going up and down that ancestry we can figure out exactly what was tried, and we also have all the information available when passing the commands around. So I think that's the core. I'll spare you dhclient — I won't even talk about the implementation of the DHCP handler; I'll just show you the code briefly.
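The parent-pointer ancestry can be sketched like this (class and payload names are mine; the real events carry more than a string payload):

```python
# Sketch of the event ancestry: each result event points at the previous
# event, not directly at the command, so walking .parent replays every
# attempt back to the original command.
class Event:
    def __init__(self, payload, parent=None):
        self.payload = payload
        self.parent = parent

    def ancestry(self):
        ev, chain = self, []
        while ev is not None:
            chain.append(ev.payload)
            ev = ev.parent
        return chain

cmd = Event("ifup eth0")                         # the initial command event
r1 = Event("ENI: failed", parent=cmd)            # first handler's result
r2 = Event("DHCP: failed", parent=r1)            # second handler's result
r3 = Event("link-local: bound", parent=r2)       # third handler's result
print(r3.ancestry())
# ['link-local: bound', 'DHCP: failed', 'ENI: failed', 'ifup eth0']
```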
And I guess we should also look at this fd-proxy stuff. Let me pull up the code — but before I do, two caveats. First, for the last three hours I had network problems, so I only just managed to pull the changes my student made and haven't looked at them yet; I might be a little surprised here and there when the code comes up, although we've been talking about what he was doing, so it's probably correct. Second, a lot of the code is very hackish at the moment. There are still functions in there that aren't used anymore — we just haven't deleted them — and there are certain constructs that are absolutely horrid. The reason is that we said: at this point in time we're unable to make informed design decisions, because we don't know the requirements in these small cases. So we'd say, let's just do it the quick and dirty way, see if it gets us anywhere, and if it does, then we conceptualize. That's pretty much where we are. There are a few of those cases, and because they exist we refused to — well, I don't actually know if my student refused; I think he would have liked to have NetConf 1.0 out for the talk yesterday, but I was the one who decided we weren't going to do it, because it would have scared people off, I'm pretty sure. I hope it doesn't scare you off. Let's have a look.

So there's daemon.py — you don't have to care about that. It basically does all the setup and argument parsing, detaches you from the pty, puts you in your own process group, and so on. Pretty boring. The daemon instantiates a core object and then calls, specifically, the function
run() on it — so that's how the program starts. There's a flag, is_running; we need that for signal processing so we can take the daemon down cleanly. You can see here that we do the signal-listening setup. Signal listening requires us to have a pipe: we register a signal-listening pipe, and we register a dictionary saying that if we receive SIGINT on that pipe, call self.quit. The signal handler is the same for all of them: whenever we register a signal like this, SIGINT is added to the list of signals handled by sig_handler, and sig_handler currently just writes the string representation of the signal to the pipe, followed by two newlines. Obviously this is brittle — there's no guarantee the whole string will have arrived on the other end of the pipe by the time you read it, or by the time you return to the select loop. The single-threaded approach makes it very likely that you do get it, but if we implement this in C or C++ — and at some point we will — it has to be done differently, and one easy way is to simply write a single byte identifying the signal. Python kind of makes you want to be explicit, and it works in Python at all times because the buffering is handled properly, so you don't actually have to worry about it. Also, all the reactors that shave data off the file descriptors are engineered so they can be called multiple times: if the data you got on the reactor is incomplete, it's buffered, and the reactor expects to be called again when more data is available on the file descriptor; then it appends to the buffer and fires off the request.
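The mechanism described — a signal handler that only writes to a pipe, which the select loop then reads — can be sketched as follows, using the more robust one-byte variant mentioned above rather than NetConf's current string-plus-newlines encoding (POSIX assumed):

```python
import os
import select
import signal

# Sketch of the self-pipe trick: the handler does nothing but write one
# byte naming the signal; the main loop sees the pipe become readable.
sig_r, sig_w = os.pipe()

def sig_handler(signum, frame):
    os.write(sig_w, bytes([signum]))   # one byte per signal, not a string

signal.signal(signal.SIGUSR1, sig_handler)
os.kill(os.getpid(), signal.SIGUSR1)   # deliver a signal to ourselves

# In the real daemon this is the central select loop; here we just check
# that the byte arrived on the pipe.
ready, _, _ = select.select([sig_r], [], [], 1)
if ready:
    received = os.read(sig_r, 64)
    print(received == bytes([signal.SIGUSR1]))   # True
```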
It kind of works — at least in Python. Then we create the control socket listener here, and you can see register_fdr_reactor — register a file descriptor read reactor — with sock_fd being the file descriptor integer, and then the reactor, the callable that gets called and is responsible for taking the data off the file descriptor. And then we start the main loop.

The main loop, sorry, looks rather awful at the moment, but a lot of it is debugging information, plus this part that says "temporary". There's this weird bug that I totally can't figure out, where at some point a file descriptor integer is left in the list of file descriptors to check even though I removed it previously, because EOF had been received. I have no idea why. But Python is lovely: it lets me fire up the debugger when this happens, and then I can go ahead and inspect things. So this is definitely temporary, and if you factor out everything from here down to there, and take out all the debugging information, it actually looks okay again.

There are usually three things a program like this has to worry about: file descriptors, signals, and things that are not event-based — timed callbacks is what we call them. Since signals are now handled via file descriptors, only two remain: file descriptors and timed callbacks. Timed callbacks are very simple: you register a function together with a timestamp, and up here is the calculation of whether that timestamp has been reached. The list is sorted — we bisect-insert every single time — so the list always has the next timed callback up front, and then we just look whether there are any timed callbacks and whether the first one has to be fired now,
and if one does — the next one could be, say, 15 seconds in the future — we set a timeout, so that the select call further down returns even though no data has been received on any of the file descriptors; because as soon as we actually block in select() without a timeout, there's nothing else we can do. So the timeout is assembled up here, and this is the select call — the important one, where everything blocks and where the program spends most of its time. We pass it all the file descriptors that have been registered as reactors and get back a list; well, I explained the timeout. This blocks, and we only get here if there is data on one of the file descriptors, or there's a timeout. Then: debugging info, then we invoke the reactors for all the ready file descriptors — shave off all the data — then we do all the due timed callbacks, and then we need this last bit because Python is weird at times.

Let's have a look at the invocation function; it's very, very simple, and I'll show it to you in a second. It's a dictionary indexed by file descriptor that gives you a callable, and you call that callable. We pass in the operation — read, write, or error, which we get back from select(); we only need read — the file descriptor, and also the core object, because the reactor needs it later to unregister itself and to add timed callbacks. So we pass the core object around a lot, which I don't like very much, but the alternative seems to be a global object, and I don't like that either.
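The timed-callback bookkeeping just described might be sketched like this (names are illustrative; in the real code this lives inline in the main loop):

```python
import bisect

# Sketch: callbacks are kept sorted by timestamp via bisect-insert, so the
# next one due is always at the front, and its deadline becomes the
# timeout handed to select().
class TimedCallbacks:
    def __init__(self):
        self.pending = []                  # sorted list of (when, callback)

    def register(self, when, callback):
        bisect.insort(self.pending, (when, callback))

    def next_timeout(self, now):
        if not self.pending:
            return None                    # nothing due: block indefinitely
        return max(0.0, self.pending[0][0] - now)

    def fire_due(self, now):
        while self.pending and self.pending[0][0] <= now:
            _, cb = self.pending.pop(0)
            cb()

fired = []
tcb = TimedCallbacks()
tcb.register(5.0, lambda: fired.append("later"))
tcb.register(1.0, lambda: fired.append("soon"))
print(tcb.next_timeout(now=0.0))   # 1.0 -- the earlier deadline wins
tcb.fire_due(now=2.0)
print(fired)                       # ['soon']
```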
Almost everything we test has a dependency on core, so this is where test-driven development gets very interesting: we have to mock core up all the time. I actually like mock objects by now — my student has convinced me of them, they're pretty good — but it adds complexity, and I wish there was a better way to do it; I don't think there is.

All right, let's have a look at one of the reactors — I guess I'll start with the control socket. Are there any questions so far? This one is particularly hackish at the moment, simply because we haven't exactly figured out what events we need — I call it a language of events — and there's still stuff we have to work on. We'll probably want something like an event hierarchy: an Event class, then a CommandEvent subclass and a ResultEvent subclass, and then, for instance, a success result such as "DHCP bound" is a child of SuccessResult, which is a child of ResultEvent, which is a child of Event. That way, at any point you can decide whether you want all events, or only successful result events, and so on — basically polymorphism. In C++ we'd be doing it with virtuals — sorry, not overloading, virtual functions — though they're all related forms of polymorphism. So virtuals come into play.

Further down you'll see isinstance, a lovely Pythonism that you don't have in other, more statically typed or compiled languages — and this has to go. You'll also see that we currently use 500 a lot, and then 000 for things we totally didn't expect, and 999 for things we expected even less than that. This is in flux. Basically the reactor is a callable — Python gives you this function-call protocol.
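The proposed hierarchy could look like this — class names are my guess at the scheme, not current NetConf code:

```python
# Sketch of the event hierarchy: subscribers select by class instead of by
# magic numbers like 500/000/999.
class Event: pass
class CommandEvent(Event): pass
class ResultEvent(Event): pass
class SuccessResult(ResultEvent): pass
class DHCPBound(SuccessResult): pass       # "DHCP bound" as a success result

def wants_successes_only(event):
    # one isinstance check covers DHCPBound and any future success class
    return isinstance(event, SuccessResult)

print(wants_successes_only(DHCPBound()))     # True
print(wants_successes_only(CommandEvent()))  # False
```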
When I was writing it, I was still assuming we might have to do writes as well — I didn't know. I actually check whether the operation is read, but I think all of this could be factored out so that only handle_read remains. (It took James one hour to tell me that I can just press the pound key to jump between searches — I just remembered that.)

So this is how a read is handled. There's a base class, BaseBufferedFDReactor, which does most of the ugly work for you: all you do here is shave data from the fd into a buffer that you don't even have to expose. If you get zero bytes off, that means there was an EOF on the file descriptor, so at that point we do whatever cleanup the reactor needs, and we also unregister from the core, because we never want to be called again with that file descriptor — it's now invalid. Somehow that doesn't always work; that's why I have that temporary ugly stuff in the main loop.

Eventually we'd like the control socket to support a protocol like IMAP, where you can have multiple commands in flight at the same time, with multiple results: you prefix each command with a unique ID, and results come back prefixed with that same ID, so you can associate them. At the moment we don't do that — you can only do command, response, command, response — so we check whether a dispatch is currently in progress, and if there is, we simply discard any further data from the control socket for now. Then it does some checking and calls the dispatcher, which I'll show you. This is pretty much what any reactor looks like: it checks the data for basic consistency, and if that passes, it does whatever it has to do with it — in this case, hand off to a dispatcher. The dispatcher is also a callable, and I use something very Pythonic there as well.
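A guess at the shape of such a buffered reactor base class — the real BaseBufferedFDReactor surely differs in detail, but this shows the partial-data and EOF handling described above:

```python
import os

# Sketch: the base class buffers partial data between calls, hands the
# subclass complete newline-terminated requests, and gives it an EOF hook.
class BufferedFDReactor:
    def __init__(self):
        self.buf = b""

    def __call__(self, fd, core):
        data = os.read(fd, 4096)
        if not data:                     # zero bytes read: EOF on the fd
            self.handle_eof(fd, core)    # clean up, unregister from core
            return
        self.buf += data                 # append; may still be incomplete
        while b"\n" in self.buf:
            line, self.buf = self.buf.split(b"\n", 1)
            self.handle_request(line)    # fire off each complete request

    def handle_request(self, line): ...
    def handle_eof(self, fd, core): ...

class Echo(BufferedFDReactor):
    def __init__(self):
        super().__init__()
        self.lines = []
    def handle_request(self, line):
        self.lines.append(line)
    def handle_eof(self, fd, core):
        self.lines.append(b"<EOF>")

r, w = os.pipe()
e = Echo()
os.write(w, b"if-up et")     # incomplete request: just gets buffered
e(r, None)
os.write(w, b"h0\nping\n")   # completes the first request, adds a second
e(r, None)
os.close(w)
e(r, None)                   # zero-byte read: EOF
print(e.lines)               # [b'if-up eth0', b'ping', b'<EOF>']
```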
In C++ you could obviously implement it with dynamic loading, or with a map to callables, or in some other way — I just didn't want to have everything in this one file. When the dispatcher gets called, it gets a request object, which is basically a struct with everything we received: peer is the UID, GID and so on, and it also holds the file descriptor we need to write back on the socket; cmd is the command; args is what follows the command; and after the first line — it's HTTP-like — there can be any number of "parameter: value" pairs, which become the params. Very simple: the request object contains them all, and then we use request.cmd to find out how to handle that command.

There are a couple of simple handlers down here — ping, hello, whoami and so on are handled directly in the control socket. For everything else, if there is no such function, we currently try to dynamically load a command handler. netconf.commands is the namespace for that: in netconf.commands there are classes that represent the individual commands that can be issued. They are not control-socket-specific anymore — these are also what Netlink uses. If we get a factory back, a command class, then — here you see the authorization happening — we create an event. It's a command event; we're unsure how this should look: right now it's a string that says "COMMAND", capitalized, which I don't like either, but that's the route we went for now. As a matter of fact, we could probably get rid of this and make cmd a child of an Event class, so you could use polymorphism to determine what kind of event it is, but at the moment this seemed simpler, because Python doesn't actually do overloading properly — you have to use isinstance all the time — which is why we use this more explicit approach.
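A rough sketch of that dispatch scheme — direct methods for trivial commands, dynamic import for the rest. Only the netconf.commands namespace comes from the talk; the method naming, the request format, and the 500 reply are my assumptions:

```python
import importlib

# Sketch: simple commands (ping, hello, whoami) are methods on the
# dispatcher; anything else is looked up as a class in the
# netconf.commands namespace.
class Dispatcher:
    def __call__(self, request):
        handler = getattr(self, "cmd_" + request["cmd"], None)
        if handler is not None:
            return handler(request)           # built-in simple handler
        try:
            mod = importlib.import_module("netconf.commands." + request["cmd"])
        except ImportError:
            return "500 unknown command"
        return mod.Command()(request)         # dynamically loaded handler

    def cmd_ping(self, request):
        return "pong"

d = Dispatcher()
print(d({"cmd": "ping"}))     # 'pong'
print(d({"cmd": "nosuch"}))   # '500 unknown command'
```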
So we create the event: basically the type, the payload, and a pointer to the source, which is self; and because this is an initial event, there is no parent. Then we publish the event. Once published, it goes to this pub/sub object, which simply associates event types with subscribers. The control socket can say "I'd like to subscribe to all result events", or "to all events that have this event as a parent", and later we'd like to be able to say things like "subscribe to all events relating to eth0", or any other form of selection. It's a very simple publish/subscribe mechanism — very little code, as you can see. Each subscription entails a callback, so when an event is received, for each subscriber we fire off the callback in series and pass it the event, and the subscribers can do whatever they want with it.

Now we can get to the policy, because the policies are the central event handlers. The policy is also callable, as you can see: it gets an event and is expected to do something with it. This is where it gets slightly ugly, because it's completely work in progress — let me quickly find the ugly part... oh man, we don't have it anymore. This part is slightly ugly: switching on whether parent is None or not. There might be a better way, but it's what we have right now. If there is a parent, then this is a result event related to some command event, so we obtain the command that actually led to this event, the handler that was responsible for it, and the result it yielded. In our earlier example with ENI, DHCP and link-local: suppose ENI returns "couldn't find an iface stanza for this interface".
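Stepping back for a moment, the pub/sub object just described could be sketched like this (method names are mine, not the actual code):

```python
# Sketch: the pub/sub object associates event types with subscribers;
# publishing fires each subscriber's callback in series with the event.
class PubSub:
    def __init__(self):
        self.subscribers = {}     # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        self.subscribers.setdefault(event_type, []).append(callback)

    def publish(self, event_type, event):
        for cb in self.subscribers.get(event_type, []):
            cb(event)             # subscribers do whatever they want

seen = []
bus = PubSub()
bus.subscribe("result", seen.append)     # "all result events, please"
bus.publish("result", "DHCP bound")
bus.publish("command", "ifup eth0")      # no subscriber: silently dropped
print(seen)                              # ['DHCP bound']
```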
That's a result event — it has a parent pointer — so cmd becomes the original ifup command object. I guess I didn't show you that: the ifup class is actually nearly empty; it just inherits from a command base class that does basic command checks — the commands know what they expect, what sort of syntax there is — and stores the data. So everything related to the if-up is contained in this cmd object, which we get by traversing up the tree and obtaining the payload from the original command. In our ENI case, where the interface wasn't found, the ENI handler instance will be the source of the result event, and the result will be something like "cannot handle" or "interface not found". If the event is a command in and of itself, we don't have to do all this traversing, and there is no handler and no result.

This is the manipulator being created, which does the actual work on the interface later on. Neither of us likes the fact that we create it at this point; we feel this should be a configuration thing — you should be able to choose a different manipulator, for instance one that only logs what it would be doing and doesn't actually do anything. That was the original idea behind manipulators. Anyway, manip gets created here, and then we look: is there already a handler that tried to do this? If not, we obtain an initial handler and call it. If there was already a handler, we use the command, the handler and the result to obtain the next one to try, and if we get a next one, we call it. Then there's this "go away" stuff, which we don't like either, but so far I haven't found a better way to do it.
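The handler-selection logic the policy implements can be illustrated with a toy table — the ENI → DHCP → link-local order comes from the talk, but the real policy is hard-coded Python logic, not a lookup table:

```python
# Sketch: given the previous handler and its result, the policy decides
# which handler to try next. (None, None) is the initial query, before
# anything has been tried.
CHAIN = {
    (None, None): "ENI",
    ("ENI", "failed"): "DHCP",
    ("DHCP", "failed"): "link-local",
}

def next_handler(prev_handler, prev_result):
    if prev_result == "success":
        return None                    # policy complete: a handler succeeded
    # None also means: all handlers exhausted, policy complete with failure
    return CHAIN.get((prev_handler, prev_result))

print(next_handler(None, None))              # 'ENI'  -- the initial handler
print(next_handler("ENI", "failed"))         # 'DHCP'
print(next_handler("DHCP", "success"))       # None   -- done, success
print(next_handler("link-local", "failed"))  # None   -- nothing left to try
```

Either terminal outcome is what the policy-complete event discussed next would announce to subscribers.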
So after a handler has notified the policy, and the policy has said "okay, nice try, now go and sleep again", the handler needs a chance to react one more time, because it has to be the policy that decides whether the handler is done or not. We could also conceivably just leave dhclient running, which is what's happening at the moment, and ignore the fact that it failed; maybe that's a policy decision you'd like to take as an administrator. So we need to go by the policy here, and that makes it kind of ugly. But I haven't looked at this code in a long time, and I have to say my student did a lot of good work, because I was very afraid of showing you the policy and it looks a lot better than I thought.

Anyway, there's a third type of event coming up here, which is a policy-complete event. As I said before, the event language is not complete yet, and a policy-complete event smells kind of fishy to me, but we needed it in order to tell any subscribers that the policy is done with everything it's trying to do: it has tried all the handlers, either one of them succeeded or none did, and now we're done. We can't deduce this information from the events we get from the handlers. If ENI says "I couldn't find any interface definition for this interface", that doesn't mean anything to the source, because maybe the policy is happy with that and simply wants to try something else, or maybe we're done at this point. So the policy actually has to issue an event as well. It makes sense; really I'm just not happy that we have capital letters here.

So, just quickly: I showed this off yesterday as well; this is how we try to make the decisions here. Currently it's all hard-coded, and I'm trying to think of a way to do the decisions in a
configuration-file format. The new PAM format is actually very nice: for each thing to try, each requisite and so on, PAM lets you say what to do in the event of which result. You can say: if the user is not found, do this; if authorization failed, do something else. So it's kind of a three-dimensional thing we're dealing with here, and the new PAM format expresses that pretty nicely. But so far we haven't settled on it; my student actually misunderstood me on this and wrote the new-PAM-format parser, but we're not using it at the moment. At the moment we're hard-coding this, and you can see it's all very ugly. It needs to be expressed in a class that can be handled much better, because what we're basically doing is mapping a triplet to a callable, which doesn't allow me to do any wildcard matching. I'd really like to be able to do that: obviously you want to be able to say that you don't care about the source of the event (netlink, control socket, who cares): if DHCP fails, go to power-off. So you need to be able to have that asterisk in there, and it needs to be evaluated at decision time, not at parsing time. But that's as good as we have it right now; you can see it's very simple at the moment. Let's figure out what we need before anything else. This here just does the ugly parsing; you can ignore that.

All right, now let's have a quick look at the picture and see where we are. I've shown you the core; I've shown you the control socket a bit; I've shown you how it authorizes and dispatches events, and how they arrive at the interface policy via the pub-sub object and then get processed by the policy, which is actually just a subscriber. I don't actually know how we implemented it at this point, but I think the policy is simply a subscriber to all events.
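The wished-for version of that mapping, a (source, event, result) triplet resolved to a callable with an asterisk wildcard evaluated at decision time rather than parse time, could look roughly like this. `DecisionTable` and all the rule names are made up for illustration:

```python
# Illustrative sketch of a (source, event, result) -> callable decision table
# with "*" wildcards; not the actual netconf code.

WILDCARD = "*"

class DecisionTable:
    def __init__(self):
        self._rules = []   # list of ((source, event, result), callable)

    def add(self, source, event, result, action):
        self._rules.append(((source, event, result), action))

    def lookup(self, source, event, result):
        """Match at decision time, so '*' compares against the live triplet."""
        for (s, e, r), action in self._rules:
            if all(pat in (WILDCARD, val)
                   for pat, val in ((s, source), (e, event), (r, result))):
                return action
        return None

table = DecisionTable()
# "if DHCP fails, go to power-off, whatever the source was":
table.add(WILDCARD, "dhcp", "failed", lambda: "power-off")

action = table.lookup("netlink", "dhcp", "failed")
print(action())   # → power-off
```

Because the wildcard is checked in `lookup`, the same rule fires for any source, which is exactly what a parse-time expansion could not do for sources that appear later.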
So the event also arrives at the policy, and here the policy is responsible for finding out what to do and then delegating to a handler. Let's have a look at the DHCP handler; I promised I wouldn't tell you about dhclient, but let's look at the DHCP handler anyway, because I'm scared of the ENI handler, though I'll show you that as well. Ten minutes? Oh god.

So the handler gets a command and a manipulator, and it's basically responsible for doing whatever it should be doing. In our case it's very simple: we either do ifup or we do ifdown, so we either spawn a dhclient process or we kill it. This dhclient3 proxy is actually how we interface with dhclient, and all I want to show you is that it's a reactor in and of itself. dhclient will be publishing events on the file descriptor; you can see up_fd and core here again, every reactor gets called like this, and there's a handle_read here. You can basically see how it shaves off the environment, does its DHCP stuff, and then delegates to one of the functions down there, if callable; they're called react_to. And here you can see what happens: react_to PREINIT, for instance (we don't do medium and IP aliasing at the moment), basically just says "manipulator, bring the interface up". Or BOUND, which does this: we create a new IP-address object and say "manipulator, add that IP address to the interface, please". Now in all of these cases, if the IP address is already added to the interface, we still return success, because we are declaring what we want, not being imperative about it. Routes and so on work the same way. And then there's EXPIRE and FAIL.

There's also a lot that isn't implemented yet, like TIMEOUT; we simply haven't implemented that yet, and that's largely because dhclient is actually so broken that we can't test timeouts properly. I'm not kidding; I wrote about it in my last post to the planet.
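The declarative success semantics described above (re-adding an address that is already configured still counts as success) can be sketched like this; `Manipulator` and `add_ip_address` are illustrative names only, and the real manipulator of course talks to the kernel rather than to a dictionary:

```python
# Illustrative sketch only: a manipulator that declares desired state, so
# adding an address that is already present is still a success, not an error.

class Manipulator:
    def __init__(self):
        self.addresses = {}                    # iface name -> set of addresses

    def add_ip_address(self, iface, address):
        configured = self.addresses.setdefault(iface, set())
        if address in configured:
            return True                        # already in the desired state
        configured.add(address)
        return True

manip = Manipulator()
first = manip.add_ip_address("eth0", "192.0.2.10/24")
again = manip.add_ip_address("eth0", "192.0.2.10/24")   # idempotent: still True
```

This is also what makes a logging-only manipulator plausible as a drop-in: callers only ever ask for a state, so a no-op implementation with the same interface stays honest.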
With dhclient, if you have two interfaces, one for testing and one for Wi-Fi, and a DHCP packet arrives on an interface, it is undetermined which of the two dhclient processes gets that packet. Not both of them; only one of them. That makes it very, very difficult to do timeout testing.

All right, that was DHCP. Let me do the ENI handler, and I also have the fd-proxy stuff; that's actually something I'll do at the very end, because maybe one of you has a good idea of how to handle it. I'm very frustrated with that stuff. ENI used to be very modular and nicely spread out and everything, and then we tried to be very smart about it, and I think it was on the flight to Argentina when I got thoroughly fed up, deleted the entire modularity, and implemented it all in one function. So it's a long function. The reason I did that is that we have something I call a cooperative work-queue runner. The work-queue runner is basically something where you register events, sorry, jobs, and then you call the work-queue runner and it calls the next job in the queue; when the job is done, you mark it as done, which means it gets popped off the queue, and you call the work-queue runner again and the next job gets executed. The reason I did this (and in hindsight this granularity was very stupid) is that I wanted to make sure that when we actually ifup the interface, we don't block the entire program while the ifup is executing. We want to come back to the select loop, so that you can press Ctrl-C and kill that netconf process without having to wait until your dhclient has decided, after 60 seconds, that it can't get a lease. That's a good approach, but I did it at a very, very granular level, so that even between hooks we come back to the select loop, and that just made things very ugly. But you'll see the reason in a second. The reason I did this all in one function was simply that I hated passing arguments
around at that point in time; with all the callbacks and stuff, I was very annoyed. This is a very short line; we have other lines in there with functions that take like twelve arguments and all that kind of stuff, and I really just don't like it.

So what we do here is create a cooperative work-queue runner, and inform_caller_success is the last thing that gets called, when the queue is empty: if the queue is empty, we simply raise an event that is a success result. This handled event (the fact that it's called "handled", and that there's also an issue-event function) is one of those hackish things where we haven't found a clean way to channel both of these events through the same function. So that's a little hackish at this point.

We create this work-queue runner and then simply add jobs to it. If we do an ifup, then... oh yeah, here's the ifstate file. I couldn't do without it; ifupdown's /etc/network/interfaces doesn't allow you to do without it, because you don't know what hooks have been run, or whether hooks have been run at all. You can't make it stateless; it doesn't work, by design. I tried, but I gave up. Then (let me see, just making sure I'm not losing it) we actually delegate to a method handler, so inet dhcp or inet6 static; those things are all handled outside, it's still modular in that sense. And then we populate the work-queue runner: first we run the pre-up hooks (the hooks proxy simply makes sure that the hooks and the results from the hooks get processed properly), so we add this job; then we add the handler we instantiated up here, which does the actual configuration; then we add another job that does the state file; and then we add a job that runs the post-up hooks. Very simple, right? Same thing for ifdown; I won't go through that. And the last thing we do is call the work-queue runner and let it do its job. I think that's pretty much all I have to say about this.
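A minimal sketch of such a cooperative work-queue runner, register jobs, run the head of the queue, pop it once it reports itself done, and fire a success callback when the queue empties, might look like this. The real netconf version reschedules itself through a zero-timeout callback in the select loop; here that is modelled as a plain `while` loop, and all the names are illustrative:

```python
# Sketch of a cooperative work-queue runner; each call runs the job at the
# head of the queue, pops it when the job says it is done, and calls an
# on_empty callback (e.g. inform_caller_success) once the queue is drained.

class WorkQueueRunner:
    def __init__(self, on_empty):
        self.queue = []
        self.on_empty = on_empty

    def add(self, job):
        self.queue.append(job)

    def run_next(self):
        if not self.queue:
            self.on_empty()          # queue drained: raise the success event
            return
        job = self.queue[0]
        if job():                    # a job returns True once it is done
            self.queue.pop(0)
        # in netconf: schedule run_next via a zero-timeout callback, so
        # control returns to the select loop between jobs

log = []
runner = WorkQueueRunner(on_empty=lambda: log.append("success event"))
runner.add(lambda: log.append("pre-up hooks") or True)
runner.add(lambda: log.append("configure iface") or True)
while log[-1:] != ["success event"]:
    runner.run_next()
print(log)   # → ['pre-up hooks', 'configure iface', 'success event']
```

A job that isn't finished yet (say, waiting on dhclient) just returns False and stays at the head of the queue until a later callback marks it done.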
Except that, if you look at the handler proxy for instance, it's this stuff that throws me off at times. This is a closure again: we have to have a callback inside the function so that we actually have all the data available. This could be done better, but it works for now. We call the function, and we call it with a callback, wait_for_handler, so this function gets called as soon as the handler is done, and it gets the result. If it's a success result, we mark the job as done and add a timed callback with timeout zero (so immediately, as soon as possible) to the core loop, and the work-queue runner gets called again. Otherwise we process the event.

Let's have a quick look at inet static ifup: we simply shave off all the data from the liface, the logical interface, which the parser returned to us, and then set_ip_address instead of add_ip_address, because we're actually doing what ifupdown does, which overrides all your addresses and so on when you bring an interface up. No rocket science here.

So I think that's pretty much the whole picture and the code behind it. The code is in flux, but we're getting to a point where... well, I didn't think I was going to be able to get up here and explain this code, and I hope I managed to do it a little bit. I hope you see that it still needs work, but I hope you can make some more sense of it now. Are there any questions? I'm of course happy to discuss more after this if anyone is interested, but let me finish off by showing you this file-descriptor stuff. Actually, in the root directory of the checkout there's a doc sub-directory with the design document, and everything I've told you now is essentially explained there: the main loop, events, the policy, what they do, event sources. This is the point where I'm currently a little bit helpless, I should say. Part of the helplessness is
this: remember those command objects, the command that has all the data relating to an ifup command, which gets authorized and passed to the policy and then causes a handler to respond? One of the attributes of this object is the source it came from. So we have something that is a command source; basically the control socket is a command source, netlink is a command source, and so on. And look at this stuff: I made this a proxy where you can register what I call output readers and output reactors. The difference is simply that the output reactor gets a file descriptor and reads the data off it, while the output reader writes directly to the target file descriptor. If you look at the control-socket reactor's constructor, you'll see that what we're doing here is adding to our command-source base class an output reader named "stdout", with a callback, forward_stdout. So whenever data is written to the output reader by the name of "stdout", we end up calling this function, forward_stdout, which prefixes the data with an "o" or an "e", because we only have one channel for the control socket.

One of the reasons I don't like this is that we're sometimes passing around the entire command object when, as a matter of fact, we only need the command-source object. That's easily fixed; we just haven't done it yet. On the other hand, I'm not entirely sure whether this is the right way to do it at all. If you spawn a dhclient, you need to give it a stdout or some such file descriptor, so it can dump all the debug info that we've come to love so much. Initially, about a year ago when I started, there was simply a file descriptor to the socket being passed around, and you would just link that file descriptor up and dump everything on there.
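The prefixing idea described above, one control-socket channel carrying both streams with each chunk tagged "o " or "e ", could be sketched roughly like this. The class and method names (`CommandSource`, `register_output_reader`) are made up for illustration; the real netconf interfaces may differ:

```python
# Illustrative sketch: named output readers multiplexed onto one channel,
# with stdout data prefixed "o " and stderr data prefixed "e ".

class CommandSource:
    def __init__(self, send):
        self._send = send            # writes to the control socket
        self._readers = {}           # name -> callback

    def register_output_reader(self, name, callback):
        self._readers[name] = callback

    def write(self, name, data):
        self._readers[name](data)    # dispatch to the named reader

wire = []                            # stands in for the control socket
src = CommandSource(wire.append)
src.register_output_reader("stdout", lambda d: src._send("o " + d))
src.register_output_reader("stderr", lambda d: src._send("e " + d))

src.write("stdout", "bound to 192.0.2.10")
src.write("stderr", "lease expired")
print(wire)   # → ['o bound to 192.0.2.10', 'e lease expired']
```

Because the spawned process only ever talks to the named readers, the callbacks behind them can be swapped when a client reconnects, which is the redirection point the proxy exists for.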
But with that, if the control socket disconnected, you couldn't reattach to the running dhclient. This is why I wanted to have that proxy in there: now, if the control socket reconnects (assuming that you can; there's no authentication going on, so basically everybody can see everyone's DHCP output), you can shift those functions around and redirect at this level, because the actual reacting is still happening in the same file-descriptor proxy. I'm unhappy with it even though it feels like there's no other way to do it, but maybe some of you know a better way, or when you look at it you'll come up with one. Or maybe you'll just take the clue bat and say I'm trying to solve problems that don't exist. So far, I'd prefer to support this sort of reconnecting and make sure that we don't end up with data on file descriptors that isn't being read. Because (I'm almost done) with this select-loop model you always need to read the data off the file descriptors; otherwise your file descriptors do not get closed. The kernel will keep the file descriptor open, and that means you can't reap your sub-processes, and that means you get zombie entries in your process table all over the place.

This is one of the points where, as I said a couple of times yesterday, I'm not a Unix systems programmer, and nothing could make me realize that more. When I was doing this, I realized that I actually think Unix is not always the best thing: this file-descriptor stuff, and reaping children, and all of it is very, very annoying. Reaping children is another topic in there: there's a function in the core that allows you to register child reapers, so that whenever you get a SIGCHLD, all of them are called.
There's no other way to properly finish a sub-process than going via signals, which is kind of annoying. But anyway, I hope this wasn't too short, and I hope it wasn't too confusing. I would love to have questions now, or on the mailing list, or at any point in time, and of course I'd like to see a few more of you on the mailing list and everywhere, getting involved in this. I'm totally happy to answer any form of question. And (my student needs to hear this as well) I'd like to make sure everyone knows that I don't claim to know the right way to do things like this; I just kind of move forward. But I'm very happy if any one of you says that something is complete bonkers; tell me, I'm very, very interested in hearing that. So thanks for your attention.