Right, so let's hope that's it and that we can finally start. I'm Marc Espie. I'm with the OpenBSD project, and I've been working on the ports tree, the package tools, and lots of other things related to that over the past ten years. I wanted to make this talk about quite recent advances, stuff we did very recently; some of it was last week. So obviously, if you're looking into the OpenBSD tree, you won't find everything that I'm talking about in it yet, but with luck it should be committed in about a month or so. Those slides are not up yet, obviously, because I want you to listen to the talk first, but I should be able to upload them probably tomorrow, I expect. Yeah, very shortly.

This is not the first talk I've given about this topic; some of you were here two years ago for the dpb part. And it's fine if you don't know anything about it: here is a very quick summary. dpb is our distributed ports builder. It has been in production use for about two and a half years, I'd say, and it's basically used by everyone in OpenBSD who builds packages, for almost every architecture. You have at least one ports builder here who is doing sparc and maybe some other stuff, I don't remember. He's a masochist, you see, because sparc, that's a really old machine; I'm not talking sparc64, right? It distributes work over whole clusters, and it has lots of very simple mechanisms, actually, that manage to spread the load quite evenly and give very nice results in the end. So I think I have an example. Oh, yeah. Two years ago,
I said that there was still a lot of stuff to do with dpb, so I wanted to come back so that I could tell you that most of that stuff has been done and works now, and that we have done even more, and that now we are obviously running into new problems. First I'm going to talk about the stuff we managed to solve: the critical-path problems, which I'm going to explain; architecture dependence, which is very important for stuff like sparc, because you don't want to have errors all over your logs, for instance; fetching stuff; making things simpler for some other people who are not here today, like Bob, for instance; and yeah, I'm going to talk a bit about LibreOffice.

Okay, this projector is a bit shit, but it should be enough to see the graphs. This is what dpb looks like when it finishes running. You have several curves here. When it starts up, first it's scanning the ports tree, and it's already building stuff right from the start. That's one very big difference from what we had before: as soon as it finds something that it can build without any dependencies, it starts it right away. So you basically have one scanning job on one machine and everything else is already building. You see here you have two interesting queues: the first queue is the stuff that has been scanned but cannot be built yet because it's missing dependencies, and the blue one here is the stuff that we can actually build. So in normal usage, it goes up very, very quickly.
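As a rough illustration of the two queues just described, here is a toy sketch in Python. This is not dpb's actual code (dpb is written in Perl); the class and port names are made up for illustration.

```python
# Toy model of dpb's two queues: ports whose dependencies are not yet
# built wait in 'blocked'; as soon as all their deps are done they move
# to 'buildable', so building starts long before scanning finishes.
class Engine:
    def __init__(self):
        self.blocked = {}      # port -> set of dependencies not built yet
        self.buildable = []    # ports we could start right now
        self.built = set()

    def scanned(self, port, deps):
        """Called as the scanner discovers each port."""
        missing = {d for d in deps if d not in self.built}
        if missing:
            self.blocked[port] = missing
        else:
            self.buildable.append(port)

    def finished(self, port):
        """Called when a build completes; may unblock other ports."""
        self.built.add(port)
        for p in list(self.blocked):
            self.blocked[p].discard(port)
            if not self.blocked[p]:
                del self.blocked[p]
                self.buildable.append(p)

engine = Engine()
engine.scanned("gmake", [])            # no deps: buildable immediately
engine.scanned("glib2", ["gmake"])     # waits in the blocked queue
engine.finished("gmake")               # glib2 now moves to buildable
```

The point of the sketch is only the data flow: scanning feeds both queues continuously, and every finished build can promote waiting ports, which is why the blue "buildable" curve climbs so fast at the start of a run.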
So here we have about 7,000 ports, and when it's finished scanning the ports tree it basically already has almost 3,000 ports that it can build, to choose from, to try to find a very good build order. Then it goes very slowly for a while. There's a very good reason for that, actually: since it knows what it's going to build and what's going to take a lot of time, in all of this part you see those machines crawling, because they're building huge things like Qt, the GCC compilers, LibreOffice, stuff like that. And then, near the end, on this part, it accelerates quite quickly. You see the queue going down, and you notice that the stuff that cannot be built yet is almost down to zero, so by this point dpb has the whole ports tree at its disposal; it can choose whatever it wishes to build next. And so the end of the queue almost always looks like this. The green line, that's the number of built packages that you can install. And on every run of dpb now, you will see that in the end
it's an exponential, usually. In all common situations, you will see dpb building packages on every machine in the cluster in parallel, and everything is going to end at exactly the same time. I should have some footage of synchronized dancing or something like that, because everything ends within five seconds of each other, usually.

So this graph was taken from OP, which is a cluster with six machines with two cores each. It's almost an optimal setup for dpb these days, at least until we get better SMP. I could have put up graphs of dpb runs on other machines, but it would be very boring, because the shape of the graph, whatever you see here, is mostly dependent on the actual ports tree and how much time it takes to build stuff relative to everything else, and not that much on which machines you are going to use, actually. So this graph is what I see on OP; it's also what I see on my laptop when I do the same thing, and it's also what I see on other machines, like Naddy's new box. The only difference, of course, is going to be time. If you have a slower machine, it's going to be stretched over three or four days. On OP it usually takes something like 11 or 12 hours each time, and of course on the sparc cluster it takes, what, three weeks, but it looks exactly the same everywhere.

So how does it manage to do that? The very simple idea was that each time you build the whole ports tree, you store how much time it took to build each given component. So when you go into the next run, you have build times for each and every port, and you just use that to steer dpb in the right direction, so it can build the biggest dependencies first. We don't store the dependencies; that's a very important point. You only have to store the build times for each individual port, and dpb will figure out the most current set of dependencies if something changes. Like, for instance, we have lots of ports
that depend on GCC 4.6. Then, knowing how much time GCC 4.6 takes to build, how much time Boost takes, and stuff like that, it's going to figure out how to build things in parallel so that everything is built at about the same time. It's very strange, because I wrote this code, and sometimes I throw some new workload at it and I see it handle it in a very efficient way and figure out the critical path without me having to do anything.

So this is a slightly different graph, actually. Do you notice the difference from the previous one? It's a bit tricky. Yeah: you have a green line that goes on for quite some time, actually. I know, it's a shitty projector, so you probably can't see it, but on the previous graph the build ended here; here it goes on for about two more hours. So this is a problem. What happened, actually? Well, you can talk all you want about dpb balancing the load correctly so that things end at the same time, but we actually have an elephant in our ports tree. Strangely enough, the elephant, of course, is LibreOffice. If you look at the whole ports tree on OpenBSD, you'll notice that something like LibreOffice is going to take about, I think, one fourth of the full time it takes to build most of the rest. So in that case, which was a new machine, sorry, a new cluster with two machines of 8 cores each, there was the critical path, which was very easy to see, which led through GCC 4.2 and on to LibreOffice, and at the same time you also had to build GCC 4.6 suddenly. And it didn't go fast enough. In the end you saw the whole cluster finishing everything else while it was still building LibreOffice, and with two hours to go, LibreOffice was not yet done. So this wouldn't do, and theoretical computer science is not going to save us here, so we're going to have to do practical stuff. Interestingly enough, we had a very similar problem in make, actually, so since I was working on make at the time, I decided to maybe reuse
some of those ideas. So let me open a small parenthesis. If you run builds using make, you probably know about the problem you might run into if you do recursive make: you have make in a directory, you try to invoke it in parallel, and it goes into a subdirectory and another subdirectory, and if you start with make -j4, for instance, you end up with 64 processes, usually. Your poor machine is not going to like that. So what we did, actually, is simply try to figure out whether what we were running was a recursive make. It's very easy: you just look at the command, and if you see anything that looks like make, with a few exceptions, because you want to avoid /usr/include/make or make.c, for instance, you decide: okay, I'm running a recursive make. And then you're not going to run into several of these, because you start your processes normally, and as soon as you've started something that looks like make, you stop starting new processes. You just wait for everything to finish, finish that one, and then keep going. So if you do a very simple computation: you're going to have four processes at the first level, but at most one of those processes is going to be a make, so you are going to add four more processes at the next level, and four more processes at the third level.
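The process-count argument above can be written out as a small worked example. The `looks_like_make` heuristic here is purely illustrative, not make's real test; it just shows the shape of the idea (treat a command as a recursive make unless it is something like a path under /usr/include or a file such as make.c).

```python
import os

def naive_processes(jobs, depth):
    """Naive recursive make -jN: N new processes at every level."""
    return jobs ** depth

def limited_processes(jobs, depth):
    """With the rule 'stop starting new jobs once one of them looks like
    make': N processes at the first level, at most one of which is a
    make, then at most N more per deeper level."""
    return jobs + (depth - 1) * jobs

def looks_like_make(cmd):
    """Illustrative heuristic: any word whose basename is 'make',
    skipping obvious non-commands like headers under /usr/include."""
    for word in cmd.split():
        if word.startswith("/usr/include/"):
            continue
        if os.path.basename(word) in ("make", "gmake"):
            return True
    return False

assert naive_processes(4, 3) == 64    # the machine-melting case
assert limited_processes(4, 3) == 12  # the figure quoted in the talk
assert looks_like_make("cd sub && make all")
assert not looks_like_make("cc -c make.c")
```

So with -j4 and three levels of recursion, the naive scheme explodes to 64 processes while the limited scheme tops out at 12, which is the whole point of the trick.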
So for a simple recursive make three levels deep, you end up with only 12 processes. And even so, that's probably a transient, because the make at the first level, which started running first, is going to block other stuff from starting until it finishes. So you are probably going to start with 12 processes, which are going to dwindle back down, so at most a handful of processes, plus a make which is running something else in a different directory and doesn't consume much CPU. So in practice we end up with recursive make making efficient use of our CPUs without any problem.

So let's go back to dpb. What we did was very simple: since we don't quite know what a recursive job is in that context, we just mark some very expensive ports as being buildable in parallel. And of course, those ports are going to start on one core, but run with make -j whatever. And at some point your dpb is going to be running more jobs than it's supposed to, right, because you just started something that's expensive while you have other stuff that is still running, so you have maybe six or seven processes on a four-core machine. But since those ports are very long-lived, eventually the rest is going to die, and instead of starting new stuff on the same machine, you're just going to steal those CPUs for your LibreOffice. And as soon as you've started something big on one machine, you don't start anything else on that machine until you've gone back down to the number of cores that you actually have. There's a very small trick to it, which is that if you have eight cores on your computer, you are not going to go fully parallel and start make -j8 to build LibreOffice; by default, you're just going to use half of that. So transiently you're going to have eight plus three processes, and very quickly it's going to drill back down to eight processes. The main reason why this works is that you only have a few critical paths, so you only need to mark the big
ports as buildable in parallel using this method, and this just works. I could have put up a new graph of a new run on the same machine that you saw before, but it would now again look the same as OP's, because with just this small tweak... well, it took a few tries to fix it, because each time you have to run the patch, check that it works, and change it again. And now, again, we're able to build the full ports tree on a 16-core cluster, more or less, without having any stragglers like LibreOffice hanging around for one or two hours at the end.

So this is a bit of what I said: what are we going to mark as parallel? Only critical stuff, right, and also only things that parallelize well, because if you build software, you know that some of that stuff, crappy GNU stuff for instance, is going to spend most of its time running configure, and then building part of it, and then configuring again, and then trying to build other stuff, things like that. Actually, in what we are doing, a little bit of configure is going to help, because if you try to build a port using parallel make, the extract, patch, and configure parts are going to be sequential anyway, even if you start it with make -j4. In our case, that means that you have a few critical minutes during which your big port is not actively stealing CPUs, and in which other stuff is going to be able to finish. So that helps. So yeah, again, what I said. We did some more on this, because just marking a few ports does help full bulk builds, but we have a few not-so-critical ports marked as well, so that for instance if you only want to rebuild 100 or 300 ports from the OpenBSD tree, you will benefit from it as well. Stuff like glib is marked in my tree, for instance.
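The CPU-stealing rule described above can be sketched as a toy model. Everything here, the class, the half-the-cores default, the port names, is illustrative of the talk's description, not dpb's actual implementation.

```python
# Sketch of the rule for parallel-marked ports: a big port starts with
# make -j cores/2, the host may transiently exceed its core count, and
# no new job starts until the load drops back below the core count.
class Host:
    def __init__(self, cores):
        self.cores = cores
        self.running = []          # (port, process count) pairs

    @property
    def load(self):
        return sum(n for _, n in self.running)

    def can_start(self):
        return self.load < self.cores

    def start(self, port, parallel=False):
        if not self.can_start():
            return False
        # parallel-marked ports get half the cores as their -j value
        n = max(self.cores // 2, 1) if parallel else 1
        self.running.append((port, n))
        return True

host = Host(cores=8)
for p in ["a", "b", "c", "d", "e", "f", "g"]:
    host.start(p)                      # seven ordinary one-core jobs
host.start("libreoffice", parallel=True)  # adds a -j4 build on top
assert host.load == 11                 # the "eight plus three" transient
assert not host.can_start()            # nothing new until load < 8
```

As the seven small jobs finish, the load falls back to the big port's four processes, which is the "stealing CPUs for your LibreOffice" effect.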
I'm not sure it's made it to the general tree yet. And in practice, most of the time, if you have 16 cores and you have stuff marked as parallel, you'll notice that you really have 16 processes running and not more; the transient part, where you have just started a big thing and have more processes running, is going to be very small compared to the normal work. Any questions about that part? Sorry? You mean, what does that mean? Ah, right. I mean that with this change, instead of having LibreOffice finishing two hours after everything else, LibreOffice finishes and then you still have two hours of very small stuff building before the end of the work. So you have plenty of room if it gets bigger or something like that; it will still work.

Some other technical stuff: running fetches. This calls back to what I was doing two years ago with dpb, because at that point dpb only knew about building stuff, and sometimes that could be frustrating, because you'd have your machine sitting around doing nothing, just grabbing stuff over FTP. It was something that had to be done; it was not very complicated to do. Basically, since this is object-oriented Perl, I just needed to tweak the engine a bit, and now I have a variation of the engine that, instead of building stuff, simply fetches stuff, grabs stuff from FTP, and runs it through another queue. Priorities are important here as well, since, of course, fetching stuff won't take any CPU bandwidth, but you don't want to be fetching 20 distribution files at the same time, or some people are really not going to like you. And one very good thing about this is that it made life simpler for some of our developers, like Bob Beck and Todd Miller, well, the guys running the distfiles mirrors, because prior to that it required a lot of handling and looking at error logs and shit like that, and at some point we reached a point where the distfiles mirrors were almost never up to date.
That's usually a bad thing. So instead of that, the new dpb method just works. I don't know what more I can say about it, but it gives us shitloads of logs, error reports that are readable and that you can fix, and a method to actually go through the FTP list quite quickly and figure out which mirror is going to work, and stuff like that. I would have liked to push this further, to maybe even classify whether some mirror is good or not good and preferentially go to the SourceForge mirror that works, for instance, but that hasn't been necessary so far. It's good enough for what we do. Oh yeah, sorry, last thing to say about that: it also handles all kinds of shit concerning checksums, so that we can build a little bit faster, because once we've fetched a file and discovered that it has the right checksum, we won't go check it again. Some people are going to say it's bad for security, but you have to realize why you are using checksums, right? You have your ports machine, you're building stuff on it, and if anything gets tampered with on that machine, you're toast anyway. The only reason we have checksums, of course, is to check whether the FTP upstream is good or not, right?

This brings me to the next part of this talk: our beloved users. Well, this projector is a little bad, but maybe you've seen the stupid movie called Dumb and Dumber. That's basically some of our users; in that case, that would probably be Antoine and Theo, who are the two most annoying guys in existence: Theo, who is never happy with how things work, and Antoine, who basically said, no, I can't use dpb, I don't want to have to pass any options to it. So you write a man page, you write a complex program, you give it lots of options, and you have this guy who is going to tell you, no, I'm not going to use it, it's too complicated for me. Guess what? He's right.
It doesn't have to be complicated. Well, there are some cases in which those options are useful, but that shouldn't always be so, and for the simple case of just building stuff, you should be able to just run dpb, and that's it. Two years ago, that wasn't the case, because all that stuff I was talking about, for instance being able to use the information from previous builds to prime the current one so that everything ends at the right time, was an option. So he never used it, and he never benefited from it. Instead, we switched to a different approach. It was a little bit of code to write, and it's a little bit more complicated for me, but my users are happy with it now. The idea being, quite simply, that instead of having to keep saying where the log of a previous build is, I just store a journal of what's been going on before, in a place that's not going to go away when you start dpb twice. It's just a rolling log, so if an entry gets too old, it just falls off, but usually, I think, I keep the information for the last 10 builds of a given port, and on average I get pretty good build information as far as the time it takes to build one piece, so that I get percentages and so that I get optimal builds.

It turned out to be surprisingly useful for distfiles too. There was one problem with mirrors, which is that over the lifetime of a ports tree, distfiles are going to accumulate. Each time you get a new version, you're going to fetch it, so for stuff that changes quite often, like LibreOffice or Mozilla, which have big distfiles, it's soon going to be a problem. So at some point you want to remove the old stuff. But how do you decide what's old stuff in the ports tree?
Especially for a mirror which has to cater to users who are using versions from two years ago, or I should say from one year ago, snapshots and all that shit. You can't rely on the timestamps of a file; that doesn't work, because the timestamp just tells you, sorry, the first time the file was fetched, not the last time it was used. Nope, just a log. And access time is not really reliable in many cases either. So each time you start dpb for a full build, it will scan for distfiles and it will mark: okay, this one we've seen. And if it doesn't see a given distfile, it will say: okay, this is the first time we haven't seen this one. So afterwards you actually have timestamps that tell you: this distfile, with that specific checksum, disappeared at that time. It can come back, for instance, if it later turns out it was a mistake, or we upgraded and then got back to a previous version; then the timestamp saying this file is no longer used will vanish. And so you keep exactly the files that you want, and you can even say: okay, I just want to purge the distfiles from two years ago, for instance.

Same thing, same thing: most of the stuff that's currently in dpb, you don't need options to use. For instance, if you want to have extra parallelism, it's on by default now. I think that it took maybe a week between the time I added the option and the time I decided to make it the default: just make it work, and trash the option, no, keep it around, but only for a very specific case. By default, half your cores on one given machine are going to be available to build the big stuff. So it's very good for Antoine, because now, if he wants to build the world, he just has to say dpb and it works. No options. And if you want to use dpb on a cluster, again, it's the same story.
It's very simple. All you need is to have your ports tree under NFS, and to set things up so that on the local machine your work directory is local; you definitely don't want to build things over NFS, you would have to be crazy. Make sure the base system is the same; that's the current tricky part, and it's going to get better. And that's all, folks. You just have to tell dpb: okay, I want you to run on this machine, that machine, that machine, and that machine, and that's it. There were a few issues to fix, because in order to work, it has to have access to NFS on a timely basis, and NFS is a pile of crap, so it doesn't always work: sometimes files show up with a small time lag. So there's special casing inside dpb to say: okay, I'm supposed to have finished this package; it hasn't shown up on this machine yet; it's not a problem; I'm going to wait for a bit.

Speaking of sparc and really old crap: we still have the VAX, at least one. We have a guy who is stupid enough to do builds on it, and it only has 32 megabytes of memory usable for a given process's data; that's its datasize limit. So I had to give dpb some special love so that it fits. And unfortunately it has to avoid the fetch part, which isn't really a big problem, considering that all those machines usually run on the same network and so they actually share their distribution files. And right now it fits: I think that the last time I've seen it, dpb was running in something between 28 and 30 megabytes. So if we grow too much, we'll have a problem again. But it's not much of a problem really, because the only reason we are still running dpb on the VAX itself is that it's cool, well, some people think it's cool, to be able to run modern stuff on old shit. But it's quite possible to drive the whole build process from a machine which is of a different architecture. Absolutely no problem with that.

Errors. How much time do I have left? I probably need to cut some stuff. There's so much shit
I wanted to talk about, but I decided to commit it instead, so I probably need to cut some of it. Yeah, our package system is a bit complicated with this, because we also have multi-packages and flavors and everything. It all works most of the time, but dpb has to deal with all the strange stuff. In particular, we have something which is called pseudo-flavors, which is that sometimes, on some architectures, you have to make some stuff vanish. For instance, some ports normally have a mono component, but mono only builds on amd64 and i386, so you can't build that on sparc, even though the rest builds fine. So what we did was simplify stuff so that, one, it's easy to write a Makefile that works for specific architectures, and two, dpb is aware of it. So if you are running on something which isn't a mainstream architecture, and there is some stuff that you can't build because it requires assembly hacks and that shit, it won't even try to build it and won't flag it as an error. This is big progress compared to what we were doing before, because now you can run dpb on sparc and actually see only the packages that built, and not shitloads of errors. How many errors? 500? Maybe more, I don't know. Thousands and thousands.

Oh yeah, if you want to start playing with OpenBSD, building ports on our machines is very simple. No, I was not thinking about you, actually, but you can take that for yourself. More seriously, what's important is that in the end we have snapshots almost all the time for almost every architecture. Right now, these days, Naddy is building the amd64 snapshots.
I think each day, or every two days. So that's how fast we can crank things out with fast machines. And even with sparc, well, I think we could have snapshots every month, for instance, without any problem. Oh yeah, and another important part is that usually kernel people don't talk to ports people, and vice versa. But what's true now is that you have no excuse, if you're hacking on the kernel, not to try dpb, to try to figure out why our SMP is not very fast, and shit like that.

Last thing about production, and I'm going to talk about this very quickly: for people who are actually using dpb in production, you don't have to stop it. You can just fix errors, remove a lock file, and it keeps going. Yeah, there's a part I'm not going to talk about much: basically, just that we have much better ways to control dependencies these days, so that we are able to remove stuff that's not marked as a dependency of a port from a given machine. This has two advantages. One is that when you end up building the whole ports tree, you don't end up having two thousand packages installed on your machine. And the second part is that we catch hidden dependencies: stuff that hasn't been declared, but that autoconf crap is going to pick up on anyway.

Another thing that's been happening, very simply, is that I took what I did in dpb and realized: wait a second, what we are doing in make is completely wrong; it's way too complicated. So instead of forking one process for each target in make, we now do it the same way that we do it in dpb, by using one single process for each command we're going to run. Which means much better control, which means that we don't need a pipe, since make is doing most of the printing itself, and so all kinds of benefits in parallel mode.
We have tty output, we have better errors, etc., etc. Yeah, thread stuff and VFS locking, those were our two main problems with dpb.

I have plans. I should be German in that case, because I'm going to take over the world, you know. I've started playing with using dpb to build not only the ports tree but xenocara as well, and src is next on the list. And it works, at least for X: this is a full build of xenocara done by dpb, and on the graphs you see here, this is the critical path; this part is x11, this part is GL, and this part is the X server. Any questions?

Yeah, one point I've forgotten to mention: all I've talked about concerning parallelism is in dpb now; in the current tree, junking works as well. All the make stuff has been committed. The part that's still in progress is using dpb to build X, and eventually src as well. For now, it only works over NFS. There's one note about speed, but that was already done two years ago, which is that you can say that this machine has a smaller speed factor, for instance, compared to another one, and then dpb will take the whole queue of stuff it can build and split the workload into parts, so that the slowest machine gets the smaller part to build and the fastest machine has access to the whole queue. Right now, it assumes that it's on a local cluster and that every distfile is accessible over NFS. We haven't had any practical use for the WAN part yet.

Just 10 seconds; I'm probably still connected. Yeah, what you're seeing here is the OP cluster, which is located elsewhere, and what you're seeing is about six machines running. I think that Landry wanted to try it with three jobs per CPU.
So you see here 18 ports building in parallel. And I don't know if you can read it, but each port actually has a percentage number right at the end of the line, like for instance here, which is telling you that, compared to the previous build, it's about 38% of the way through the build. And the lines at the top are the stuff that's been building the longest, so you'll see that it's managed to start Chromium fairly early, and LibreOffice is still building; it's going to be here for at least three hours more, I guess, maybe a bit more. And the end of the list is going to be moving fairly quickly really soon, because we are actually reaching smaller ports here. At the end you probably can't read the numbers, but I can tell you that we have built eight hundred and two packages, the queue is already at five thousand seven hundred, and there are still one thousand three hundred packages that we don't have all the dependencies for. This is what goes on about once a day, with almost no human intervention. Thank you.