[Host] Good afternoon everybody, welcome back from lunch. I hope you're not going to be in a food coma in a second. Welcome back to track one. I'm going to introduce Marc Espie. For the people who don't know him, he's one of the package and ports gods from OpenBSD, and he's going to do a talk about it. So, Marc.

[Marc Espie] Thank you very much, Michelle. Well, I'm very happy to be doing this talk. For once I get access to an electrical outlet, so I can use my phone and my computer, which is complicated here. And I'm going to talk about architecture issues in the ports tree.

First I wanted to address what I would call the elephant in the room. No, I'm not going to talk about the pkg_add speedup. For those of you using OpenBSD: recently it's going much faster. It's also an interesting topic, but I had already decided I wanted to talk about something else. So: architecture support in the OpenBSD ports tree.

The first thing I want to say is that we have some very strict policies about things. Our chief, let's say our leader, maybe doesn't believe in cross-compilation for anything except bootstrapping systems. So we have a very strong tendency to refuse to cross-compile anything, including packages, which has advantages and also some drawbacks. The main advantage, obviously, is that it's almost the best stress test you can find for general-purpose computation. You have some people like Alexander, who is working a lot with networks and who is giving us lots of grief about network bandwidth, but whenever you want to use a machine to do something else, be it compilation, or using web clients, or anything, you really need it to work. And when you are building packages, you are stress-testing the memory, the swap, the disk, NFS, to its breaking point and beyond, especially with respect to NFS most of the time. So it's still a good idea. I'm not going to pile crap on the other BSDs, who sometimes had native compilation broken for some time, but that's another story.
Let's hope it doesn't happen again.

One nice thing about having lots of different architectures is that you have variations, and this is not specific to the ports tree; some parts are not even related to ports. Strict alignment is something which is fun, like trying to work with a Sun and suddenly discovering that your code won't work because it does shit with its pointers and everything. Big-endian versus little-endian, obviously; that's mostly a headache for cross-compilation, and most of those issues have been fixed years ago. Char signedness: I'm not sure this is even interesting, actually; it's just a poor choice on the part of the PPC people and we have to live with it. We have stack direction; that's a curiosity: if you are running HPPA, you have to realize that the stack grows up and not down, which means that you're mostly exempt from buffer overflows, and the underflows don't happen that often. Yeah, and there are other issues with architecture stuff, like register windows on SPARC and things like that. And also some constraints, like for instance some architectures where, when you start stressing the system, you will run out of kernel memory, and that might be an issue.

The actual biggest issue with architectures in the ports tree, and we are going to talk about that a bit more later, is compiler bugs. Well, obviously, for anybody who's tried to do ports and who's tried to compile them on anything that's not Intel 64-bit.

I wanted to talk about this because it's a topic I've never talked about before, and there are lots of small bits and pieces that slowly came together over the years. And it's not finished; it's never finished. Anyway, there are some interesting points that might be useful to know if you want to play with OpenBSD ports, and might also be of use for people working on the other BSDs, who might get some cool ideas to improve our stuff.

So, what's an architecture?
Architecture 101, I would say. I always get confused because I mostly work on Intel, but you have two variables on the system: ARCH is the exact machine that you have, for instance macppc, and MACHINE_ARCH is the CPU model. Most of the time packages are specific to a given architecture, meaning MACHINE_ARCH, and we don't care about the specific model except for very, very specific reasons.

And the first issue that we are going to run into is that there are variations on some architectures, most notably Intel again, which means that if you don't set up your compiler correctly, you're going to end up with packages that won't run everywhere. And that's a big issue. For instance, you have -march=native, which you should never, ever use when you are building packages, because you don't know what it's going to do on anything except your computer, and usually when you build packages, it's for using them elsewhere. So the idea in OpenBSD, and I guess behind most operating systems these days, is that you are going to target a given baseline, and you're going to compile everything to work at least on that specific make of CPU, possibly slightly more optimized for more frequent iterations of the architecture, but it should run everywhere. There were some exceptions in the past because of performance issues: for instance there was something called AltiVec on PPC a long while ago that was mostly used for multimedia ports, because without those extensions those processors were really way too slow.

Over the years we have been trimming architectures: at some point in the past you could run OpenBSD on any Intel processor with an MMU, and that's no longer the case. I think that by now the bar is at least at Pentium level and maybe higher; I don't really give a shit, actually. A somewhat modern machine, as long as it works, is fine.
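To make the two variables concrete, here is a hypothetical port Makefile fragment; the values shown are what you would see on a macppc machine, and the configure option is invented for illustration:

```make
# ARCH is the exact machine (e.g. macppc); MACHINE_ARCH is the CPU family
# (e.g. powerpc) that binary packages are actually keyed on.
# Packages must target the architecture baseline, so never do this:
#CFLAGS += -march=native    # only guaranteed to run on the build machine
# Tuning per CPU family, against the baseline, is fine:
.if ${MACHINE_ARCH} == "powerpc"
CONFIGURE_ARGS += --disable-altivec   # invented option: stick to the baseline
.endif
```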
And we get very, very few bug reports from people trying to run stuff on old machines with only eight megabytes of memory or so.

So again, we are not Gentoo. We try to build stuff so that it works everywhere. It means that we have to talk to upstream, and this is something really important: if you know people who are writing software, make sure that they do stuff that can be packaged. Trying to auto-detect everything on a machine and build specifically for that machine is a very bad idea. It can be occasionally useful when you want lots of performance, but make sure that at least in your configure script, or CMake stuff, or whatever, you get a way to say: okay, I'm going to target this whole family of computers, and it should work everywhere. You can do tests during compilation to try to optimize, obviously. And there's the specific case of, usually, multimedia ports that need to decode video and stuff like that, where you might want to have some code targeted to a specific architecture, and in that case you must have a set of runtime modules that you are going to plug in depending on what you discovered. And again, it shouldn't be hard-compiled into your program. It's much easier to do this when you are designing the software than to retrofit it later. So if you know anybody who is trying to write multimedia code, tell them before they start, before it becomes an unholy mess that we have to patch for years.

Compilers. Most people who are writing software usually don't know how software distribution works, especially on OpenBSD, and so they try to do stuff with -O3 or some other crazy options that only work well on some compilers, on some versions of the compilers, and we have to patch that away. Please, as fast as possible, make it possible to have CFLAGS, CXXFLAGS; if possible use the same options: look at how CMake sets compiler flags, include flags, and so on, so that it is standard.

Yeah — how not to design configure stuff. If
you want to write the next Meson, for instance, you have to make sure that this stuff is somewhat standardized and easy to replace for the people who are going to port software.

Even compilers change defaults. This is something really funny: a while back, people in Clang, or GCC, I don't know, decided that they wanted to change the name of some option and add some new stuff, and they decided: okay, we don't want to break old systems, so for the options that we have deprecated, we are just going to display a warning and still accept them, because, you know, some people used to specify those, and it should still work. Combine that with CMake auto-detection of options, and you end up with build logs of 100,000 lines of warnings, because: hey, I have this option. Yeah, it works. It's deprecated, but it worked. Yeah, I told you there was a warning, but no, we didn't crash, so I'm going to use this option. Great work, was fun.

So let's talk a bit more about ports now. Over the years we have mostly moved away from tests hardcoded directly in Makefiles, towards meta-information that describes what's going on. Based on experience, I can say that writing directly: on this machine architecture I'm going to do this, on that machine architecture I'm going to do that, etc., is usually a very bad idea, especially when you have some ports that don't work on some architectures, or with some components that don't work on some architectures. Specifically, when we try to compute meta-information, which is what dpb does, and other components on the system do, each special case is going to take a toll. For instance, if a port vanishes somewhere, it's going to be invisible in the logs, or you're going to get one error message at some point, and if it's a port that everything depends on, suddenly half your ports tree has vanished. Why is it?
It's hidden behind a test. So instead, we have used and abused two mechanisms, ONLY_FOR_ARCHS and NOT_FOR_ARCHS, to say: okay, this stuff breaks with one compiler on that architecture. That's not perfect, we would like it to work, but it's still okay: we're still going to get meta-information for it, and ports that depend on it are going to be able to take an informed decision based on that. The port is still around, the meta-information is still around. Which do you choose, ONLY_FOR_ARCHS or NOT_FOR_ARCHS? It's mostly a question of laziness: whichever is the shorter list. And sometimes you are also going to use an explicit BROKEN, because it's usually something that should work, but somehow a compiler broke something somewhere, and we have a bug report upstream, and we hope that the guys at clang, mostly clang, are going to be able to figure out what's going wrong and how to fix it.

So we have tools. Like I said, this meta-information is mostly fed through two tools. We have dpb, which gets a very large subset of all the meta-information that's available in every port through make dump-vars. dpb is supposed to be resilient to errors: if somehow you can't obtain information for a given package path, it's going to give you an error, but it's still going to be able to complete. On the other side, we also have sqlports, which is an incredibly powerful way to check the whole ports tree for any kind of information, like: which ports are using autoconf 2.50, for instance. By contrast, sqlports will error out as soon as there is a port where there is a problem. It's actually very useful, because people new to the ports tree are usually going to make mistakes. Like saying: okay, it's not for this architecture, so I'm not going to define anything — and so you end up with errors like: what's the package name?
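As a sketch, the mechanisms just described look like this in a port Makefile; the port and architecture choices here are invented, and BROKEN-arch is the per-architecture marker alluded to above:

```make
# Whichever list is shorter is the one you write:
ONLY_FOR_ARCHS = amd64 aarch64 i386     # e.g. the port needs a JIT
# ...or, for a port expected to work everywhere except one place:
#NOT_FOR_ARCHS = m88k
# For "should work, but a compiler bug is filed upstream":
#BROKEN-sparc64 = clang crashes building this, reported upstream
```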
What are the dependencies? Or: this path doesn't exist. And we find out very quickly through sqlports, because it's built about once every three days, on at least amd64, and fairly frequently on other architectures, and if it breaks, it usually means that somebody fucked up. Sometimes it's also me; I sometimes break the Perl behind sqlports. Not very often. Most often it's individual porters who committed something that's not actually the way it should be done.

So this is an example of what dump-vars outputs. I just started it at the start of the ports tree, and I stopped it before the end, because we have something like 10,000 ports and it would be something like, I don't know, 20 or 30,000 lines, something like that. So for each package path, for instance here archivers/arc, you get all the variables that actually make sense within the ports tree. Some stuff is actually written directly in the Makefile, and some of the stuff is just defaults. For instance, archivers/arc is a very simple port, so there is no multi-package to speak of; you have just one single subpackage, which is called dash in that case. You have DISTFILES, MASTER_SITES, the stuff that's used, obviously, to fetch the files. FETCH_MANUALLY: we have maybe three or four instances left in the ports tree, I think, something like that — stuff which has an incredibly stupid license,
so we won't even try to grab it by ourselves. And you have some more information that we are going to talk about, like COMPILER for instance, and you also have the full package name, which is the end result. So basically you've got a package path with options — we're going to talk about flavors and multi-packages — and for each full package path you end up with a full package name, line 31 on this slide. Okay so far?

And there is a very small part which is dependent on the architecture. When you run it, you're going to get package architectures for everything that's actually dependent on the machine you're building it on. You have a few ports here and there which don't have any architecture, like documentation usually, and also some rather big stuff, TeX Live for instance. I talked about this last year: it's useful when you try to produce debug packages, because then you know that you shouldn't even try to build debug packages for stuff that doesn't contain binaries.

So, the naming game. The unique full package path can be parsed automatically. We have code in the ports tree that does that, and we also have a similar piece of code in pkg_add, obviously. For instance archivers/arc, or lang/python/3.10,-tests, which is the tests subpackage of lang/python/3.10. And it's unique, which means that whenever you change the options of whatever you build, normally you should end up with a new full package path. But something that's very deliberate in OpenBSD: we don't support having other options elsewhere. If you want to have official packages, if you really want to build packages with some specific options, you have to make them available as a flavor. This is more or less needed because OpenBSD is still a relatively small project with a very small number of developers; if we had many more options, that would be a complete nightmare for bug reports and debugging.
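Summarizing the naming game with the talk's own examples, plus one invented flavored path:

```make
# A full package path = pkgpath [+ flavors] [+ subpackage], and it is unique:
#   archivers/arc             plain pkgpath, single "-" subpackage
#   lang/python/3.10,-tests   the tests subpackage of lang/python/3.10
#   editors/foo,no_x11        invented example: pkgpath with a "no_x11" flavor
# Each distinct full package path resolves to exactly one full package name.
```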
It's already difficult enough as it is; we can't take on more.

So, flavors and multi-packages. Like I said, flavors, that's just options; you will see it in the middle of your package name. And multi-packages, that's just a way to not lose time packaging stuff: you build stuff once and you create several packages out of one build.

So how does this relate to having several architectures? In many, many, many cases a port will build on a specific architecture, but somewhat crippled. Like, for instance, you will have some parts that depend on some stuff that's not available — like, I don't know, some plugins for instance. In that case, what we do is we say: okay, normally we are going to build the full port, but if we are on an architecture where this is not possible, we are going to pass some options to say: don't build this, it would fail, and it's not a good idea. And this specific part will end up in a different subpackage. So the usual setup is: you have multi-packages, where the architecture-dependent part, or the architecture-broken part, is in a separate subpackage, and we have some glue that goes from having the full list of multi-packages to the stuff that will actually be built on the given architecture.

It actually looks like this. I took a real port. I didn't even take the smallest one, because I thought it would be more interesting to show you how this works on some big shit. Yeah, it's a port from Rafael,
so you can expect some KDE as usual, because he's only working on stuff that's difficult to port. So that's openCV. And as you can see, the first line here tells you that the Java part is only going to build on architectures where we have Java support on OpenBSD, so amd64, aarch64 and also the Intel 32-bit one. Then you have every variable as usual for the main port, and stuff that's specific to each subpackage gets appended with -main, or -java in that case. So obviously the subpackage names are going to be completely different, because you want to be able to install openCV and openCV-java at the same time — assuming you want to install openCV-java, which is, yeah, I don't know, maybe some people use it. You have lots of glue. Some of this glue we will probably talk about later, like COMPILER, or libc++ for instance: the libraries that the main package depends on, the libraries that the Java subpackage depends on, a choice of compiler again.

MULTI_PACKAGES is always going to be -main and -java on every architecture: we are going to generate the meta-information for both subpackages. You've got a manual way to disable the Java part; that's a bonus for our infrastructure, for when you have dependent ports — dependent parts, sorry — that aren't built all the time. There's also a way to say: okay, I have a CPU that could build the Java part, but I really don't want to. How can I avoid that? You can set FLAVOR=no_java, and then you won't build Java. And also, because of the ONLY_FOR_ARCHS-java line, it's going to disappear on anything that's not in that list.

So after that, you've got this line which has most of the magic, line 98: .include <bsd.port.arch.mk>, which will look at all the variables defined before it, and it will create BUILD_PACKAGES from MULTI_PACKAGES. And afterwards, for the configure part, what you do is you check: okay, do I actually want to build Java or not?
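Condensed into a sketch, the setup described above looks roughly like this; the variable values are illustrative, not copied from the real openCV port:

```make
ONLY_FOR_ARCHS-java = amd64 aarch64 i386   # Java subpackage: only where a JDK exists

MULTI_PACKAGES =    -main -java            # declared on every architecture
PSEUDO_FLAVORS =    no_java                # manual opt-out on capable machines
FLAVOR ?=

.include <bsd.port.arch.mk>                # computes BUILD_PACKAGES from the above

# From here on, only one non-architecture test remains:
.if ${BUILD_PACKAGES:M-java}
MODULES +=          java
CONFIGURE_ARGS +=   -DBUILD_opencv_java=ON    # CMake flag name: illustrative
.else
CONFIGURE_ARGS +=   -DBUILD_opencv_java=OFF
.endif
```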
And you get some configure differences: an extra module for Java, you choose a version as well, you get the build depends. And if you don't want Java, all you have to do is tell CMake — well, all you have to do is figure out which flag does what in CMake, which isn't easy — to say: okay, I won't even try to build Java, and that's it.

The big, big thing here is that all the glue — sorry, saying that the Java subpackage will only build on these architectures — is declarative. You have one include file that will do all the magic, and after that you have a specific test which isn't really architecture-dependent; it's only dependent on whether or not you want to build Java. Right, is that clear so far?

So, the specific module is a small part of the general glue for building ports. How it works internally is that it is included as part of the normal ports infrastructure, so when you include it manually, you just do things in advance; it will always be run. And like I said, yeah, it's going to set BUILD_PACKAGES, stripping down MULTI_PACKAGES depending on pseudo-flavors and depending on availability on the architecture, and that's it, and it works.

It also contains some more information, because at some point we realized that there is some information which is difficult to encode as specific architecture lists, because you're going to have a whole set of ports that follow the same lines. So we used the opportunity, because it's a natural place to put this information, to have a smallish database of every property of every architecture on OpenBSD. Well, not every property, but most of the properties that are relevant to the ports tree, actually. So you have the full list of architectures that we can support; you have architectures with APM support, because there are some ports that don't make any sense if you don't even have APM working; big-endian architectures, little-endian architectures — that's usually not there to decide whether or not
you want to build something, but to set up some specific compiler flags in some cases; 64-bit architectures; and also compilers again, as usual. And we also have some other languages in the ports tree which have their own support, like mono for instance, like OCaml, like Go, like Rust, and instead of putting the list of supported architectures directly in those specific modules, we decided to lift it into a single location, quite simply because of performance issues. We could leave the list of OCaml architectures in the OCaml module, but then you would have to include the OCaml module each and every time, and that's not sane: the idea of having modules is specifically to be able to have a somewhat smaller infrastructure.

Yeah, this line, line 37, debug-info architectures. That's not really true: we could have debug packages on more architectures. It's a deliberate choice; right now we only provide debug packages for those two architectures. There are reasons for that, related to not actually killing old architectures: namely, building debug packages on 32-bit architectures is usually unreasonable. You want to have debug packages for stuff that's large, so if you include debug info, it's going to be even larger when linking, and we only have 4 gigabytes of memory, so usually it doesn't fit. And maybe we had some more at some point, I don't know. There's also the concern that the debug packages have to be on every mirror. Oh, they take about as much space as the full package set, I guess, for a given architecture. I don't want to kill the mirrors. I don't actually have the exact numbers, but the debug packages tend to be rather large.

Some numbers. I did a quick grep in the ports tree: we currently have roughly 10,000 Makefiles and fragments. Out of these, bsd.port.arch.mk is only used 200 times, and out of those, about half are really tests on BUILD_PACKAGES. So we have something like 100 ports in OpenBSD that have parts that do not build depending on the architecture. And the rest, obviously, is using the properties.
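A sketch of how a port consumes that properties database: the list names follow the ports infrastructure (RUST_ARCHS, LP64_ARCHS, and so on), but their exact membership drifts over time, and the port fragment itself is invented. It also shows the make subtlety discussed below: a list can be referenced before it is defined, because `=` assignments are expanded lazily.

```make
# Fine to use RUST_ARCHS *before* it is defined: make only expands the
# right-hand side when the value is actually needed, by which point the
# include below has defined it.
ONLY_FOR_ARCHS-main = ${RUST_ARCHS}

.include <bsd.port.arch.mk>              # brings in the architecture properties

# Property tests usually tweak flags rather than disable builds:
.if !empty(LP64_ARCHS:M${MACHINE_ARCH})
CONFIGURE_ARGS += --enable-64bit         # invented configure option
.endif
```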
It's saying: okay, if I am on this kind of architecture, I'm going to add this flag to configure, or whatever.

One fun thing for people not familiar with make is that you can actually use these variables before they are defined. Like, for instance, it's perfectly okay to say that a given subpackage will only build on Rust architectures, and then include the part that defines the Rust architectures, because the ONLY_FOR_ARCHS for that subpackage will only be expanded when it's needed. So by that point you already have RUST_ARCHS defined correctly, and then you can do your tests as usual.

So let's go back to the main subject, which is what's going on with respect to architectures in OpenBSD. I believe that one thing we are doing right is that we managed to industrialize everything, which means that when things do not build, we usually know about it very early. Then there's obviously the other issue of having people who care enough to fix it, but at least we've got the logs, we've got everything.

So the timeframe is: we got binary packages very early. It was mostly because of Theo, who leaned on me until I folded, and I decided to go binary-packages-only if you update the ports tree, somewhere around 2.something. dpb came into existence thanks to Nikolai Sturm, who wrote something that was so horrible that I had to write my own. Well, actually it's something that worked, but it was a bit slow: you started his dpb, and you waited for an hour, and it told you: ah, there is an error, try again. So I tried to do something better. These days we have dedicated build farms for most architectures. There's still the issue that our leader is very paranoid.
So all those build farms are located in his basement, which might be an issue with some architectures which are not very sturdy. You can ask the bulk builders how many times we have had to talk to Theo to tell him: hey, you need to reset sparc64-2 or something like that, because it's hung again.

The fastest architecture is, I think, still amd64. I don't know how many machines we have, probably two or three, and it takes 24 hours to build the full ports tree. The slowest — I don't know which one it is these days, but if you look at a release, usually we see four or five architectures show up on time, and a few others will show up after one or two weeks, because it took that long to build, to try to build, everything.

And we have regular build stats for everything, thanks to you, landry, because I think it's still your script that creates the logs. And yeah, this is the point where I get out of this and show you the most recent build stats for some architectures. For instance, you can see that on riscv64 we built 8,000 packages recently. PowerPC is much better, apparently, at 9,700 packages. sparc64 is somewhere in the middle. On amd64, apparently almost everything built. Whoa. Yeah, and this one manages to build some stuff from time to time, but it's still a small architecture, so it doesn't build all that much. You can find all those reports, as you can see, on the marc.info mailing list archive; you just look for — what's the title again — yeah, "bulk build report", and you will see the status of every OpenBSD architecture, and you have the build failures for anything that failed. So if you feel like fixing something and you have the right machine, you can try. Good luck.

Let's go back to the presentation again. How much time do I have, 10 minutes probably?
So, the way I see things is that amd64 is the bellwether — might be a bit of colorful English. You have this thing that goes really fast, and if something breaks on amd64, it's really bad for us, because it's really close to what muggles actually use. So yeah, we have to fix things first on amd64, and then on the other architectures some trivial stuff gets fixed really easily. Some stuff doesn't get fixed, because one thing you have to remember is that just because you have a binary package doesn't really mean anything: you have to try it out and use it. If you have some stuff with a graphical interface, sometimes it can stay broken for really long. For instance, architectures with the wrong endianness will not talk correctly to the graphics card, and you will end up with stuff with really fun colors, for instance. And also there is some stuff where we decided: okay, we put a comment there and we do not build this on that machine, because, for instance, it doesn't have enough memory. I think that all machine learning software, for instance, does not make any sense on 32-bit architectures anymore, for the most part.

Compilers. I said I wanted to come back to this. We have an infrastructure that was really painful to create.
I think it's the sixth or seventh iteration, and the usual suspects helped a lot in telling me when it didn't work properly. You have a variable, COMPILER, that you set to choose the best compiler for that specific port. Currently in OpenBSD we still have GCC 3 for m88k, we have an older GCC 4.2 because of licensing issues, and then we have a reasonably modern clang, with libc++, on quite a few architectures these days, and we also have a more modern GCC version and a more modern LLVM version in ports. So what you do is you set COMPILER to the list of compilers that you prefer to use, between base-gcc, base-clang, gcc3, ports-gcc, ports-clang. It will take the first compiler available in that list to try to compile your port. For normal ports you do not have to set this variable, but as soon as you have anything that uses recent C++, or some newfangled extensions like in C++20 or so, you might need to switch to base-clang or ports-clang, or even ports-gcc, and hope that it will build. The port itself won't see anything, because it will try to auto-detect and call cc and c++, and we have symlinks under the working directory of the port, which is at the start of the PATH, and that's it; it's really transparent. There are also some details: for instance, we switched the linker on at least Intel 64-bit, so you have to pass some options because it doesn't have the default paths wired in.

Other languages, yeah. I'm not a fan of Go, I'm not a fan of Rust, mostly because it means that you have to build lots of stuff to have them supported on architectures, and the fashionable people who are writing Go and Rust do not care about old stuff, so they're contributing quite a lot to killing old machines. So I don't like that. We have some small optimizations in dpb, but yeah, I don't really have the time.
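The COMPILER line described above is just an ordered preference list in the port's Makefile. A typical setting for a C++ port that needs a modern compiler might look like this (a sketch based on the talk's description):

```make
# The first compiler in this list that is available on the build
# architecture is the one used; the port just calls cc / c++ as usual.
COMPILER =       base-clang ports-gcc base-gcc
COMPILER_LANGS = c c++        # declare only the languages the port really uses
```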
I won't talk about the dpb optimizations; it's not really important.

One small detail if you really want to look at how things work: make is a really odd program, because it tries to be lazy, except when it is not. So early on, before 2000, I took apart the main Makefile framework and reorganized it so that variables came before any targets and before any tests, and now, each time we try to add to that, it's a bit complicated, especially when you try to take some parts of that big Makefile and put them into a separate file, because you are, by nature, going to mix up variable definitions and tests and targets, and if you don't do it in exactly the right order, you're almost certain to break some stuff. It's much easier these days, specifically because we've got lots of tests: the fact that sqlports and dpb report on anything that goes wrong means that, usually, if you try to make a change that was not a very good idea, you will notice.

So, to answer the question of what's going to happen with old-style architectures: in my opinion, language support is the main issue, by far. Ah, youngsters don't care about old stuff, and we have compilers that need more and more resources. I'm not sure we're going to be able to support 32-bit architectures for long. I don't know if landry is going to talk about that in his Firefox talk — no, you have given up, right? Yeah. We are no longer really trying to compile anything big on 32-bit architectures: it does not link, quite simply, or it takes forever, or it takes forever and it works. I've tried to mitigate the problem slightly: you get a dpb annotation you can put into a Makefile, which is called lonesome, which is very specific. It's a Lucky Luke thing: I'm a poor lonesome cowboy, right?
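In a port Makefile the annotation is a single line; `lonesome` is the actual name given in the talk:

```make
# Tell dpb to schedule nothing else on this machine while the port builds,
# so a giant link step gets all the available RAM to itself:
DPB_PROPERTIES = lonesome
```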
The lonesome annotation means that when you are trying to build this port, dpb won't try to schedule anything else on the same machine, to give stuff like Firefox a fighting chance when trying to link on a poor machine that doesn't have enough memory and is going to go swapping.

But in the end, I think that we have to choose. Specifically, we can still work on lots of architectures, but they have to be somewhat recent ones. Stuff that has enough memory is probably still going to be supported for at least five or ten years, I hope. But 32-bit architectures, really, for mainstream Unix systems — I'm pretty sure that ship has sailed. We should maybe have spoken up very loudly against big languages and stuff that doesn't fit on small machines, but that was five or ten years ago; we can't do anything more now. Okay, sorry to be depressing, but yeah, still, we've got lots of packages, building almost everything with this, and natively. That's it, guys. Questions? No questions, really?

[Audience] In pkgsrc, what we are doing for some of the older and most popular architectures, where for example a port of LLVM is not available, is we have a secondary version of packages, like librsvg, which has a Rust dependency, and we're still shipping an older version that was still written in C for these specific architectures. So there's kind of a seamless fallback. But in general this stuff is not easy to maintain, and there are a lot of problems with drifting from what upstream is doing. I'm wondering what your take is on this, and how you're approaching these big dependencies that are being rewritten in Rust or other programming languages that are not as portable as C.

[Marc Espie] Yeah, so as far as librsvg goes, I think we still have the old one as well, and we run into the same issues. One big problem is how many people you're going to have that actually use this, which means: how many tests are you going to get? If something breaks, when are you going to notice? Is there actually somebody that uses this? And also it
means that you have a big variation of some software, and sooner or later upstream is going to make things completely end-of-life. I don't know if you still have any supported version of Firefox that actually works with the old librsvg, or whether you need lots of patches for that to happen.

Audience: I don't think we're currently patching lots of software to work with the old librsvg; it's more an issue of the dependency chain. The old version of librsvg has some slightly different requirements to the modern version, and build systems need to be aware of that. We are shipping an old version of Firefox for these architectures as well. What we're increasingly using is a fork of Firefox called Arctic Fox, which specifically targets as many CPU architectures as possible, and the upstream author is very BSD-friendly, so I recommend it.

Marc: So the big question in the end is: is this old version of Firefox still receiving security updates?

Audience: It's getting lots of security patches backported to it from the current version, but some parts are being rewritten entirely, like the CSS library, and those won't get any patches in the future.

Marc: This is the issue: usually you need lots of work put into getting that working, so assuming you've got a limited workforce, you have to choose your battles. I can talk again about something that's really dear to me, which is that we have another browser in our ports tree which isn't really open source, which is called Chromium. It looks like open source, but Google doesn't accept any patches from irrelevant operating systems like us. Right now the ports tree of OpenBSD, and I guess FreeBSD has got the same code these days, carries something like 900 patches for Chromium. We tried to upstream them in the past, and basically Google told us to get lost. So this is something that's really important if you want to make things go forward and make sure that the BSDs are not really dying: maybe try
to publicize the fact that Google doesn't work with us, that they don't care about whether things work. Which is something of a paradox, because if you remember, at first Chromium was built using the privilege separation model of OpenBSD.

Audience: You've mentioned runtime detection of features, and I know the story for detecting on x86: you have cpuid. But for all the other architectures it's like, well, hopefully there's this syscall, and it's slightly different on every OS. Is there some kind of standardization on non-x86 stuff that OpenBSD uses?

Marc: I'm not aware of any, but that might be a good idea. You should talk to people working on the kernel about that.

Audience: Yeah, that sounds like a good idea. I have a small comment and a question. The small comment, on Rust, is that whatever speaks to the network will likely use encryption, which likely means it will use the ring crate, which doesn't support all Rust architectures, only a small subset. So this knocks out quite a lot of Rust ports. And the question: I was a bit surprised about your statement that using MACHINE_ARCH is an extraordinarily bad idea. Was it just for selecting port names or package name stuff? Because we have packages and Makefiles that branch on MACHINE_ARCH all the time.

Marc: I'm not sure I get what you mean.

Audience: You had a point that said using MACHINE_ARCH is an extraordinarily bad idea.

Marc: Yeah, using it directly for tests usually makes things more complicated. If you can use the more generic properties stuff, or ONLY_FOR_ARCHS for a given architecture, it makes for things that are easier to read. This is something that we noticed over the life of the ports tree: each time I tried to add complicated features which relied on tests, people, everybody including me, would sometimes miss something and create ports that would work on amd64 but break horribly on something else. So as far as possible we try to lift most tests inside the infrastructure proper, and keep the
Makefile junk to a minimum.
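A hedged sketch of the contrast being described (`ONLY_FOR_ARCHS` and `MACHINE_ARCH` are real ports-tree variables; the `LP64_ARCHS` convenience list and the `--enable-jit` flag are illustrative, so check bsd.port.arch.mk for the actual names in the tree):

```make
# Preferred: declare the property the port needs, and let the ports
# infrastructure decide which architectures qualify.
ONLY_FOR_ARCHS =	${LP64_ARCHS}		# 64-bit platforms only

# Discouraged: branching on MACHINE_ARCH directly in each port.
# It is easy to write a test that works on amd64 and silently
# breaks on an architecture nobody tried.
#.if ${MACHINE_ARCH} == "amd64" || ${MACHINE_ARCH} == "aarch64"
#CONFIGURE_ARGS +=	--enable-jit
#.endif
```

The design point is the one made above: the per-architecture knowledge lives once in the infrastructure, instead of being re-derived, and mis-derived, in every individual port Makefile.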