So yeah, the title is slightly different, and there's a funny story there: usually you send your conference proposal months in advance, and sometimes you forget a little bit what you were going to talk about, which is exactly what happened. I figured it out, though, because more or less I know the plan. This talk is a bit special, because it's more about plans, about trying to see ahead, rather than the usual details. That's why it's mostly the same title: the never-ending story of the package tools is, more or less, that we have a system that is in production, and we have to keep evolving it without breaking things too much, so that users keep using OpenBSD.

So, there and back again. The idea is that sometimes you try to change stuff, and sometimes it fails. In one specific case, about a year ago, I broke pkg_add intentionally. At some point in time, you would try to use pkg_add, and either you had old package tools and new packages and it didn't work, or you had the new package tools and old packages and it didn't work either. This is what you got around 6.0: when you tried to install an old package, it told you that what you had was a completely unsigned package, which was completely false, because it was signed, but with the old tools.

This doesn't happen very often, and there is a very good reason for that. I'm going to go into the details of that particular instance, and then I'm going to talk about why this had to happen, and why I'm trying very hard for it never to happen again. What actually happened in that case is that we completely changed the way signatures worked, in an incompatible way, so the old tools couldn't cope with the new ones. That part is always expected: you add some functionality, and then you discover it doesn't work with the old tools. Yeah, sure. But the new tools didn't work with the old packages either, which is a bit more unusual: old-style signatures were gone.
If you want to look at the details, you're going to find them in the history of OpenBSD, since this is no longer the case. The basic idea is that we want to be able to install packages on the fly. So we checksum every file, store the checksums inside a manifest file, the packing-list, and the packing-list itself is signed. I have a picture here on that one. The benefits are that you get on-the-fly checks: you can stop extracting an archive before the end without losing any security.

There are some drawbacks as well. One of them is that you have to rewrite the package for signing, and the way we chose was really slow, because you had to unpack everything, sign the packing-list, and put everything back together again. I'll talk about that more later, if I don't forget. The main drawback is that you have to pass everything to zlib, or some kind of decompressor, before you check anything, and once you start looking at the unpacking code, you realize that it's almost impossible for this code to be safe: there have to be bugs in the z library, in any version of the z library. Well, you could be thorough and audit it and try to fix it, but you can also be lazy and say, okay, who gives a fuck, we're going to get rid of that.

So, new deal. New style: we're going to store the signature itself outside, so that we get rid of that issue, because then we are never going to pass any unchecked information to the unpacker. Of course, we still have to trust some people: we have to trust our fellow developers to write packages that aren't full of trojans and everything. But assuming that holds, then we don't care if the gzip code is completely broken, because we are only passing trusted data to it.

Side note, which is important for the rest of the talk: we are not actually extending anything. If you look at the gzip format, there is actually
a comment field, which we use to hold our signatures, so that the result still appears to be a perfectly normal gzip file from the outside. If you don't have any package tools, you can still download an OpenBSD package, and it will look like a perfectly normal gzipped tar file, even though there are some very interesting things going on inside it.

So here is a typical signature under the new scheme. If you look in the comment field of any OpenBSD package on the mirrors these days, you're going to see something like this. The "untrusted comment" part is the default; it doesn't really matter, it's just the way signify produces signatures. It's going to tell you that it's using that specific key. We don't really have any certificate chain of trust in OpenBSD; we don't really care about that, it's complicated and it goes wrong all the time. Basically, any key that's installed in the right directory is going to be considered a valid key. That's it.

Then we have a list of keywords, because we know that things are going to change at some point, so we have to provide for the future. Here, we say which algorithm we are using for now. There was some discussion about which algorithm is best for us: we want something reasonably small. SHA-256 is enough, but SHA-512/256 is apparently the best way to do checksums in a standard way these days on modern architectures. Then we have a block size, because we still want to be able to extract stuff on the fly; looking at packages, we decided that 64 kilobytes was probably the right size for each block. This is a major departure from what was going on before, as we no longer checksum individual files.
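The trick of stashing data where ordinary tools ignore it can be sketched in Python. This is only an illustration of the gzip comment mechanism (the FCOMMENT header flag), not OpenBSD's actual signing code, and the signature line below is a made-up placeholder:

```python
import gzip, struct, zlib

def gzip_with_comment(data: bytes, comment: bytes) -> bytes:
    """Build a single gzip member whose header carries a COMMENT field.

    Ordinary gunzip skips the comment, so from the outside the result
    is a perfectly normal gzip file."""
    FCOMMENT = 0x10
    # RFC 1952 header: magic, CM=8 (deflate), FLG, MTIME, XFL, OS
    header = struct.pack("<2sBBIBB", b"\x1f\x8b", 8, FCOMMENT, 0, 0, 255)
    header += comment + b"\x00"          # NUL-terminated comment field
    raw = zlib.compress(data, 9)[2:-4]   # strip zlib wrapper -> raw deflate
    trailer = struct.pack("<II", zlib.crc32(data) & 0xffffffff,
                          len(data) & 0xffffffff)
    return header + raw + trailer

payload = b"pkg contents"
sig = b"untrusted comment: signature placeholder"  # made-up signature line
blob = gzip_with_comment(payload, sig)

# Standard decompressors ignore the comment entirely...
assert gzip.decompress(blob) == payload
# ...but anyone who parses the header can read it back:
assert sig in blob
```

Tools that don't know about the scheme see an ordinary gzip file; tools that do can pull the signature straight out of the header without decompressing anything.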
We just checksum parts of the compressed gzip stream, and as soon as a part checks out against the signature, we can pass it to the unpacker. As far as I know, this is perfectly safe, unless someone manages to break one of our cryptographic primitives.

So, why does it break pkg_add? The former version of signatures was much more friendly from a coding point of view, because everything was done inside pkg_add proper, so you could have decent checks and decent error messages: you unpack the packing-list, you see whether it's signed or not, and you can even ask the user whether they want to add an unsigned package or not. With the new one, pkg_add no longer knows anything, or almost anything, about signatures. It passes everything to an outside program, signify, which checks the archive, and it's all or nothing: signify gives you an error message that says "I haven't found any signature", or "I haven't found any valid signature", and in that case, that's all pkg_add has to say about it. That's the message. Well, this was the first version of the message; it's a bit of a stupid message. I stumbled upon something similar during my last vacation. I don't know if you can really see it in that picture, but this is concrete, right? And there's this sign that says "forbidden grass". It's the kind of error message that doesn't mean anything in context.

So breaking things was a conscious decision. Usually we try not to break things, but in that case we decided, even though there is no known security hole in gzip for now, that it was best to thoroughly deprecate the old signature process and only support the new one. After all, this is OpenBSD, so when we have a choice between more security and more usability, we err on the side of security, as usual. This kind of stuff actually happens all the time. Yeah, that's the one. What did I do? Oh shit, sorry.
We change internal details of packages and ports about every two months, and this is the one case in about ten years where users noticed anything, where we actually had to break compatibility in that manner. As you probably know, if you're in this room: when we do releases in OpenBSD, they are usually supported up until two releases afterwards, which means that for a year you are going to be able to work with a given OpenBSD release, and then you have to upgrade. Actually, this is not true for packages in general: with the package tools, you're usually going to be able to play with old packages that date back five or six years. We have keywords that change, and most of the time I've tried to keep supporting them for a much longer period. The idea is that, if we can, we shouldn't break backward compatibility, as long as it's not too complex. This started maybe 12 years ago, back when I was getting started on the package tools, more or less, and at that point things were evolving really fast, so it became a case of meta-programming: you have to have some infrastructure so that you are resilient to change, to unwanted change.
So at that point I added a specific class for old keywords in the packing-list, very intuitively named "Old", and I've seen no reason to get rid of that structure. Usually I clean it up every five or six years: I look at every keyword in that list, and if it's been there for over five years, I usually get rid of it completely, because everybody whose old packages date back that far has usually had time to upgrade, and has seen repeated messages saying "this is an old compatibility keyword, blah blah blah, you should update".

So, I'm going to talk a bit about the development process and the design guidelines in general. As some of you probably know, when we do OpenBSD stuff, we do it remotely most of the time, but we meet once or twice a year to do stuff in person. You have this one: this was the last hackathon that we had in Paris. Well, it was so long ago; time passes, probably 10 years ago. It looks like this: we put a lot of developers in the same room and we hack together, we talk together, we do lots of stuff together. This is also from a hackathon, but a more private one, with only a few select people invited to a secret place in the middle of France, with mostly cheese, stuff to drink, and a bit of other stuff. Sometimes it's cool.

Well, actually, I have another way to look at hackathons, specifically ports hackathons, where I meet my fellow developers, which looks more like this: I have my colleagues and friends, and I just observe them trying to make their way through the ports tree, trying to fix things, trying to make things work. Landry is somewhere in this room, yeah. Like that. He's a prime example of the guy who sometimes acts dumb on purpose, so that I don't get to have too much of a hero complex, and so that I have to fix things in a way they can understand. And this is very important, guys; don't think of it as a joke.
It's really important to have some feedback about what you're doing, so that you don't go around in a crazy circle, and so that you don't write stuff that is impossible for anybody else to understand. There's also the fact that real ports work is hard to do: you are solving hard problems. Updating stuff can be difficult; we're dealing with massive amounts of code these days, we have something like 40 gigabytes of packages, and everything has to keep working, and there are exceptions all the time. Some of them might be avoidable; when you've got Edd Barrett maintaining TeX Live, for instance, you know that something is going to go wrong. But apart from that, well, it's very good for keeping me in check, and for making sure that my design is sound and can be changed by other people.

So let's talk about more personal stuff, like my actual work environment. For real, this is my preferred work environment. This is Budapest, obviously, for those of you who have been there. If you haven't, and you go through Budapest, you definitely have to go to the baths; they're completely amazing, out of this world. This place, not Budapest proper but the bath itself, is probably where I do my best work. When you've been coding for a while, I think you sometimes have to dive deep into the code, and then you have to step back and think about what you're going to do next. Because these days, writing more code for the package tools is very easy; I know the code mostly by heart. But I'm lazy, I don't want to write new code. Also, I know that every time I add some new code, I'm going to introduce some new bugs; the only code with no bugs is no code at all. So being away from the keyboard, not writing code, just thinking about stuff: this is something that's really important for every aspect of a major project. If you lose yourself too much in the details of the code,
you're not going to see the big picture, and you're going to miss opportunities for optimization and new stuff. Like those signatures: the new ones happened mostly by accident. At some point I was looking at gzip and we were trying to fix something. What was I trying to do? Yeah, I was trying to see whether I could use some specific streaming gzip variant that would compress better and work with rsync, and I realized that there were about five different patches for different versions of the z library, and none of them were applicable to us. Then I realized what a mess this was, and that gzip was basically broken for our purposes, so we had to put signatures outside, and then everything followed. The code is trivial. The main idea is that you realize there is a problem somewhere, and then you fix it.

Actually, most of the core developers of OpenBSD follow that precept, some of them better than others. For instance, our fearless leader spends a lot of time outdoors instead of at his keyboard, and he is a top-notch developer because of that. You have to keep giving money to OpenBSD so that we can keep having hikes and hackathons and everything, so that we can think beyond code and actually make some good designs. If you want to hear more about that, he has a talk tomorrow, I guess; he's going to talk about pledge. If you don't know about pledge, you should go and see that talk, because it's really awesome for lots of reasons.

Back to me. I can't always be in the baths; I guess Robert would kill me at some point. So, of course, I have reproduced the same environment at home. Yeah, it's a bit easier, for various reasons. There's actually a very interesting detail here; it's a bit shy, so I wanted it to be fuzzy on the picture, and I couldn't get it on a better picture. This is my inspiration.
I keep this in the bathtub, so that when I take a bath, I have to think about OpenBSD and fix more stuff. I'm saying all this as a bit of a joke, but it's for real: when I say that some of the best ideas and some of the best things that we have in our package tools have been worked out in the bath, that's true. The last one, which I'm going to talk about at the end, came to me while I was taking a bath.

Let's talk about some specific topics in pkg_add. This is all about change; this is the part about the never-ending story of the package tools. Sometimes we have to revisit decisions, and some of them were not mine at all: when I decided to rewrite the package tools, we already had the package format, which is mostly a gzipped tarball. There are some pros and cons to it. The most interesting thing about it is that it's a perfectly standard format, so if you take a package from OpenBSD and you have to look at it on another operating system, you will be able to, for the most part. It might be a bit complicated on Windows, but you can probably find tar for it as well. Good luck, I don't give a fuck.
The last time I ran Windows was about five years ago, I guess. But still, every time we do something to our tools, we have to question whether or not we should change stuff, and sometimes some details change while the outside appearance stays the same. I already talked about signatures: there is actually a signature sitting in the comment field of the gzipped tarball.

Let's talk a bit about chunked gzip, which is something I was aware of about 15 years ago, but which someone reminded me of back when we did the first version of signing. You can actually take a file and gzip it, take a second file and gzip it as well, concatenate both files together, and you end up with a valid gzip file. If you pass it to gunzip, it will just unpack the first stream, then the second stream, and unless you specifically ask for it, you won't see the difference; it's exactly as if you had one single gzip file.

This helped a lot at first, for old-style signatures. Instead of having your whole package as one gzip file, needing to extract everything, sign the packing-list, and then repackage everything, you can just build your package in two parts: you create the beginning of a tar archive with just the packing-list, and you compress that; then you take the rest of the stream and compress it separately. So you have one gzip stream with just the packing-list, and another gzip stream with everything that comes after it. When it comes time for signing, you just unpack the first part, sign it, compress it again, and then you copy the rest of the file as-is; you don't have to gunzip and gzip everything again. With this technique, we went from, I guess, four hours for signing a full package snapshot for OpenBSD to about 40 minutes on the same machine.

What's the relevance these days? As you know, we no longer put the signature inside the gzip file, but we still use this technique, for good reasons. When we want to transfer packages, we actually put them
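The concatenation property described above is easy to verify with Python's gzip module; the file contents here are made up, but the mechanism is exactly the one the old signing scheme exploited:

```python
import gzip

# First stream: just the packing-list (contents are placeholders).
packing_list = b"@name demo-1.0\n"
# Second stream: everything that comes after it in the tar archive.
contents = b"...the rest of the tar archive..."

# Two independently compressed gzip members, glued together:
two_chunks = gzip.compress(packing_list) + gzip.compress(contents)

# gunzip-compatible decompressors walk member after member, so the
# result is indistinguishable from compressing everything at once:
assert gzip.decompress(two_chunks) == packing_list + contents
```

Because the members are independent, you can recompress just the first one (after signing the packing-list inside it) and copy the second one untouched, which is where the big speedup came from.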
inside chunked gzip, with separate streams for each chunk, so that rsync is happy with it. What's going on here is that a few years ago, and again because we already have checksums for every file inside an archive, we decided to depart a bit from the usual format. It's still a perfectly normal tarball, but these days the order no longer matches the packing-list. We have the packing-list, which lists every file inside the archive, and we have the archive proper, which lists files out of order, with the most recently changed files first and the files that didn't change at the end. So when you upgrade your machine, instead of unpacking everything from every tarball, you are only going to extract the files that did change inside your package, and as soon as pkg_add finds that the remaining checksums are all the same, it is perfectly happy to skip the end of the archive and tell you that the upgrade is complete.

For most packages, this is not a significant win; sometimes you are going to get 20% or 25%. But if you look at really big packages, where there are lots of very small details that change but are completely insignificant compared to the sheer size of the package, it's another story. I think the TeX Live package, when you put everything together, is about a gigabyte of files, and on each update, with this technique, you get to extract maybe 5% of the TeX Live tarball. So, a couple of hundred megabytes at most, instead of gigabytes. That's a huge win.

And it also works with rsync these days, because we chunk files from the end, and we no longer store any timestamps inside the tarball itself; obviously, they're in the packing-list. So in the end, you are just going to transfer the new chunks at the start of the archive that changed, and for the rest, rsync is going to say, okay, it didn't change,
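The tail-skipping idea can be sketched as follows. This is not pkg_add's actual code (the helper names and file names are invented, and plain SHA-256 stands in for the real checksum), just the logic: changed files are placed first, and extraction stops as soon as everything remaining already matches what is installed.

```python
import hashlib

def blocks_to_extract(archive_order, installed_sums):
    """Walk archive entries (changed files first); stop as soon as every
    remaining entry already matches what's on disk.

    archive_order:  list of (name, data) in on-archive order
    installed_sums: {name: sha256 hexdigest} of what's installed"""
    extract = []
    for i, (name, data) in enumerate(archive_order):
        if installed_sums.get(name) == hashlib.sha256(data).hexdigest():
            # If the whole tail matches, we can skip the rest entirely.
            if all(installed_sums.get(n) == hashlib.sha256(d).hexdigest()
                   for n, d in archive_order[i:]):
                break
        extract.append((name, data))
    return extract

installed = {"doc/README": hashlib.sha256(b"old docs").hexdigest(),
             "bin/tool":   hashlib.sha256(b"same binary").hexdigest(),
             "share/data": hashlib.sha256(b"same data").hexdigest()}
archive = [("doc/README", b"new docs"),     # changed: placed first
           ("bin/tool",   b"same binary"),  # unchanged: at the end
           ("share/data", b"same data")]

assert [n for n, _ in blocks_to_extract(archive, installed)] == ["doc/README"]
```

Only the changed file gets extracted; the unchanged tail of the archive is never unpacked at all.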
so let's not transfer it.

The only reason this kind of stuff keeps happening in OpenBSD is because we look at the stuff we're doing, and we design things to be fast. If we hadn't had old-style signatures, we wouldn't have thought about chunked gzip, and none of this would have happened.

Next case study: the opposite of the format itself, the index. One peculiarity of OpenBSD is that we don't have any index for packages. Well, we do have tools that you can use, for instance, to look up files: we have pkglocatedb, which gives you a specific package that is basically a locate database of every file in every package. But the package tools themselves do not rely on that. At first it was just a game: I did the first version of pkg_add, which was only doing addition, then I did pkg_delete and everything, and I wanted to know how far I could get without needing any index. And at some point, maybe ten years ago, I realized: okay, I've done enough, I can manage to do everything without a global index. And this is good, because it's one less thing that can get out of sync.

On your local machine as well: if you look at an OpenBSD machine, you'll notice that there is one major difference with other operating systems, which is that we don't have an actual database of everything that's installed on the machine. We have a directory, /var/db/pkg, in which you have one subdirectory for each package installed on the machine, but there is no global database that says, okay, we have this, that depends on that, and so on. Everything is very fine-grained. First of all, it's very resilient, because it can't get out of sync. There is a small price to pay: sometimes things could be slightly faster.
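The layout just described can be mimicked in a scratch directory. The directory and file names below follow the convention mentioned in the talk (/var/db/pkg, one subdirectory per package, a +CONTENTS packing-list inside each), but the package names are made up and this is only an illustration of the design, not OpenBSD code:

```python
import os, tempfile

# Mimic the /var/db/pkg layout: one subdirectory per installed
# package, each holding its own packing-list; no global database.
root = tempfile.mkdtemp()
for pkg in ("vim-9.0", "zsh-5.9"):
    d = os.path.join(root, pkg)
    os.makedirs(d)
    with open(os.path.join(d, "+CONTENTS"), "w") as f:
        f.write(f"@name {pkg}\n")

def installed_packages(dbdir):
    """The 'database' is just the directory listing, so it can
    never get out of sync with itself."""
    return sorted(e for e in os.listdir(dbdir)
                  if os.path.isdir(os.path.join(dbdir, e)))

assert installed_packages(root) == ["vim-9.0", "zsh-5.9"]
```

Deleting a package's subdirectory removes it from the "database" atomically, which is exactly the resilience property the talk is pointing at, at the cost of some queries being slower than a real index.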
We are working on that, and there are some solutions. This decision has some consequences, some very practical consequences actually. We have snapshots, which are about 30 gigabytes. With the tools we have these days, you get one new snapshot every two or three days for our major architectures. Of course, you're not going to get a new package snapshot for sparc64 every three days; that would mean buying probably 200 sparc machines or something like that, and that's not efficient. So when you upload a new snapshot, you are going to get into a shearing problem: at some point, you're going to have some of the new packages already transferred, and some of the old packages which are still there and are in the process of being updated. This means that the solution we currently have, where each package is independently signed, is the only possible solution. Otherwise, if you had a global signature, like a manifest with the checksum of every package recorded at some point, it wouldn't work, because you would constantly end up with discrepancies, with some of your packages not matching the signature. There is no other solution.

Like I said, there are some drawbacks to this way of doing things, some limitations to the model. Everything we do is based on package names, because when you do an update, you're not going to open every single package, for something like 9,000 packages on the FTP site these days, I think. So we have to have consistent package names. Sometimes this doesn't happen. So yeah, we don't have an index, but we have an escape mechanism called quirks, because sometimes we want to rename stuff, not too often hopefully, and then we have to open each package to check whether it is indeed the right update, whether there's some weird stuff going on or not. pkg_add itself doesn't do all of it, because it can't do everything.
It's still lazy, so we use the ftp command these days, which, contrary to what its name says, also handles HTTP and HTTPS. And that becomes a problem, because it can be slow. For instance, these days we have a very minor problem: you are supposed to use pkg_add with an HTTP repository. You can use an HTTPS repository, but it won't be as good, because each connection to the repository is going to be a separate connection, using a separate ftp command. Due to limitations of our libtls and some design decisions in ftp, it means that every single ftp invocation has to do a full HTTPS handshake with the distant repository, and that costs a lot. I said this is a minor problem because we have signatures which are completely independent from HTTPS, so whatever happens, if you connect to a fully compromised repository, well, you will get packages with bad signatures, and that's it; pkg_add will stop right away. We shouldn't need to trust ftp itself, and we don't: at about the same time that we changed signatures, we also added some level of privilege separation to pkg_add, so that ftp actually runs as its own user, which can't do anything on the system. So that's perfectly fine; it's just a minor problem if you are using HTTPS repositories on your OpenBSD system.

But there is still one small security implication. Yeah, I think you're falling asleep, so: what's the security implication, right? [Audience] We are leaking the packages we're installing. Yeah, that's also a valid problem. The one I had in mind is that if an attacker decides to keep a copy of an old repository, from before there was a security update, potentially there could be some trouble. But we took care of it. You have the quirks package, and every package, if you remember one of the first slides, has got a timestamp, right?
Whenever you do an update using pkg_add, it will display the timestamp for quirks, which is the main package carrying every useful piece of meta-information for the whole repository. You have to read it, because we don't know at which rate snapshots are updated, and it depends on the architecture, but you will see that your quirks dates back to some point in time. If you have a really old quirks, it usually means that something wrong is going on. And then, inside quirks itself, you've got a list of packages with security holes that have been updated since then, and during the upgrade process, if pkg_add can't find a newer package for one of those with security implications, it will tell you that there is a problem. So yeah, that's definitely a theoretical problem, but we thought about that one.

[Audience] You're still relying on having a recent enough quirks. Yes, I'm still relying on having a recent enough quirks, but I can't do anything about that. And what's recent? For amd64, three or four days is recent; for sparc64, six months is recent. But yeah, you have to engage your brain; we are talking about security. It's not Windows, where everything is insecure by default, but if you don't look at what the system tells you, you're going to run into trouble.

Oops. Okay, I'm running long, as usual. I'm just going to finish with some very recent stuff, which became part of the OpenBSD package system about a week ago, and which is called "version". I was talking about package names being an integral part of OpenBSD, and sometimes we do system-wide changes. This happened a few years ago, when we changed time_t to a 64-bit type across all architectures, and this broke C++, because of name mangling and everything: if you change the actual type name, every function name changes, and you have to bump the specific patch version of every package depending on C++. And then, more recently, we switched some architectures from GCC to clang, and this breaks C++ again. Hmm.
Maybe there is something wrong with C++. So we did bump every package the first time. The people who helped with that are still scarred by it; we don't want to do it again. So this time we decided: okay, we're not going to do it, we're going to tell people, well, fuck you, you have to update everything by hand, force an update. And then Stuart, who is a very smart guy, convinced me that maybe I should try to find something else to do. At first I thought, I'm not going to do that, because basically I would need to put a base-system dependency somewhere, and I hate that, because I'd end up with a file in the source tree that I would have to bump from time to time, and I'd run into Theo all over again, which is a problem, because when you get two heroes in front of each other, we usually fight.

And in the bath, I realized I had it all wrong: I don't need a dependency on the base system at all. I can put it directly inside the package. And so "version" was born. What we have right now is that each package has got a kind of silent marker that gets bumped inside the package system proper. We avoid any reference to the source system, and we avoid any dependency on external stuff. It's very simple: if you have an installed package that says version zero, and then you look at the snapshot and you see a package that says version one, okay, that's a new package, I just update, even though it's otherwise built the same.
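The rule just described can be sketched in a few lines. This is only an illustration of the idea, not the actual pkg_add code; the helper names are invented, and in practice the counter is simply built by adding parts together (a system-wide part plus optional per-architecture parts):

```python
def effective_version(parts):
    """The hidden counter is just the sum of every part supplied:
    one machine-independent part, plus optional per-arch parts."""
    return sum(parts)

def needs_update(installed, snapshot):
    """A package with a higher counter in the snapshot counts as new,
    even when nothing else about its name or contents changed."""
    return snapshot > installed

# Whole system at 0; one architecture adds a second bump:
assert effective_version([0]) == 0
assert effective_version([0, 1]) == 1
assert needs_update(effective_version([0]), effective_version([0, 1]))
```

The comparison is entirely internal to the package system: nothing about the base system or the compiler ever has to appear as an explicit dependency.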
We have a flag to pkg_create, which is -V, and you can use it multiple times to build a version number by adding the numbers together. The idea is that it's very simple to do that without needing any make magic, so that you can have machine-independent parts and machine-dependent parts in the same makefile. Currently, the whole OpenBSD system is at version zero, and then on amd64 and i386 it goes up to one, because of a second -V. It turned out to be incredibly simple: if you look at the patch, the amount of code I had to change is about 20 lines, all considered. And it changes everything. Now we have a mechanism that was used to move from GCC to clang seamlessly: if you're running OpenBSD-stable and you move from 6.1 to 6.2, you won't notice, hopefully, that we changed compilers, at least as far as the package system goes. Well, you will notice that you will update almost every package, but that's it. And it can be used again in the future without needing anything more. So that's really cool.

Yeah, well, I ran a bit long, and I was also going to talk about the future, but I don't think I have any time for that. So, maybe another talk next year. Thank you. And of course, I'm going to take questions. No? But I can talk into this one, maybe, for once.

[Audience] A question, for those of us running snapshots, obviously, and anything that would make our lives easier: sometimes I'm in the situation where I've installed a version of userland, and it turns out the package snapshots are not yet updated.
So I have to wait 12 or 24 hours. Is there any way I can know beforehand, so that I'm not blowing my system away the next day?

I'm afraid we don't have enough human resources to help you on this one. The most we could do is make the turnover faster, which we already did. Basically, being able to inform you of when the snapshots are going to be online is just extra communication, which takes more time: if you do this stuff, you're going to have to wait a few more hours each time, probably. Also, I think there is some paranoia involved: some details of the signing process are not completely public, like who's signing and on which machine it's done, and we don't want to leak information concerning that. Sorry.

[Audience] Maybe I can reply to this question. Everyone has their own hacks: you can just follow the CVS commits, see when a library gets bumped, for example, and compare with what's on the mirror right now, because, as Marc told you, you can extract the package and look at what's inside, and you will see whether what's in the package matches what you have on your running snapshot. So maybe you can download the snapshot and look at the last library, when it was bumped, in the case of libraries, or when you have a breaking change in the kernel, and compare whether the snapshot matches the libraries which are in the packages. Everyone has their own hacks this way.

Actually, there might be a solution, assuming the kernel doesn't change too much, because then you would be fucked, obviously. You could use the tools that we use to build ports, proot: you can use it to populate an area of a disk with a snapshot, for instance, and then you chroot into that area and you could try a dry-run pkg_add there, to check whether or not it would work. That's a bit resource-intensive, but at least it needs almost no human intervention.
You could probably do that with a shell script about 20 lines long, but it would be heavy. Anyone else?

[Audience] About these package versions: do you bump them manually, and do you bump the version for all packages, or only for those that are affected by some GCC or C++ update?

The idea is to bump everything, system-wide, more or less. We have this mechanism that some people objected to, but that we are very happy to have these days, which is that you can have arch-independent packages, with PKG_ARCH set to star, and basically I use that to not put any version in those packages, because I know that those packages are just text files, more or less, and they're independent of the architecture. Everything else got bumped. It's a question of better safe than sorry: you already have dependencies where you more or less know that most packages that depend on clang have been updated; if you depend on the C library or on the C++ library, of course you're going to get bumped. But that didn't catch everything, so we just decided to bump it all. We don't have enough human resources to check every single package and verify compatibility.

[Audience] So, if I got it right, the packages with arch star are not touched by the version? Yeah, there is no version in those packages. And this mechanism is actually somewhat more generic: if we want to add another part to the version, we could. Any other questions?

[Audience] Perhaps it doesn't really fit the model, but if you have reproducible builds, then you might be able to say, well, I'm rebuilding this package, but the result is the same, so there's no need to upgrade.

Well, we try to have reproducible builds for lots of stuff, but actually, it hurts security somewhat. If you look at a recent OpenBSD machine, you'll notice that it relinks its kernel in random order,
and it links some libraries in random order, and we try to add randomization so that some kinds of exploits are more complex. So reproducible builds, yeah, that's a nice goal when you are trying to debug stuff, but it adds complexity, because in an OpenBSD setting you would have to have different switches, so that sometimes you build reproducibly and other times you get more security. And that's where there are bugs, and you're going to run into them.

I have a very specific example of a recent problem that we had to fix after we switched to clang. Almost every OpenBSD executable has an OpenBSD-specific section, called .openbsd.randomdata, which is mostly used to seed the canary for stack protection and everything. And clang, which is a bit stupid, just decided that because it's a section it doesn't know about, it ought to be filled with zeros. We had to fix that. This was completely out of the blue; nobody who worked with clang noticed, until, I think, one of our own developers noticed that it was the case. If you do reproducible builds, then you lose some of the security features that we have, and if you make that a switch, then maybe you will break something and only notice six months later, like what has happened for some other switches. Let's leave it at that.

All right, if there are no further questions, I'd like to thank Marc for his presentation. Thank you for listening, guys.