Welcome, everyone. This is "Designing a distribution from scratch", part 2. I did part 1 at ELC San Diego. You can view the slides from that link or from the ELC site, and I think also from the eLinux wiki.

The initial abstract was about init systems and C libraries, and I wanted to give a talk about that, but then at work I had to deal with BSPs and integrating the Mali binary-only driver, so a large portion of the presentation will deal with that. If you have any questions, just raise your hand and start speaking; I'll repeat the question. The room is small enough that we don't have to pass mics around, and if we do, Sean volunteered to be the mic carrier.

I always start an OpenEmbedded presentation with the naming confusion, and [unclear] actually had a shirt printed. It's hard to see, but it has "YOP" on it: that is for Yocto, OpenEmbedded, Poky, because people use the names interchangeably when they shouldn't. Most recently, someone in our enterprise group explained it perfectly to me. He said the Yocto Project is like TianoCore and OpenEmbedded is like EDK II, and now everything makes perfect sense. No.

So, OpenEmbedded is a build system, which is part of the Yocto Project, and the closest equivalent is Buildroot. It's not a distribution, so your device does not run OpenEmbedded, and it most certainly does not run Yocto. You are allowed to call it "Yocto Linux", because that pisses off the project.

OpenEmbedded consists of recipes, config files and a task executor called BitBake. For the BSP part of the talk you have to remember that we have three orthogonal concepts in OpenEmbedded. One is the machine, which describes the target hardware and its features: it says it's a PowerPC CPU, it has a screen, it has Wi-Fi, etc.
You have a distribution config, which sets policies: it selects the init system, selects the C library and does some other things. For example, in the distribution you can turn off all graphics for a headless setup, so even if your machine says "I have a screen", in the distro you can say "yeah, that's nice for you, we're still not going to use it". And we have image recipes. This is where you have a collection of software packages in an output format like ext4 or a tarball, and OpenEmbedded combines the distro, the machine and your image recipe to output that. In theory you should be able to use any image with any distro and any machine, but there might be something going on if you have an image written specifically for a certain machine.

So, init systems. In theory you can use any init system you want in an OE build. In practice you're limited to the init systems that OE-Core and/or the layers support, so that basically is SysV init or systemd. We tried to integrate Upstart, but we ran into the fundamental design problems that Upstart suffers from, and we hit them in OE because of the reverse dependency issue. So the choice is between SysV init and systemd. The default is SysV init, and selecting systemd is sadly not a one-line thing. You have to add systemd to your DISTRO_FEATURES, then you have to remove sysvinit from DISTRO_FEATURES, then you have to give a hint to all the image recipes that you want systemd. Once you've done that, you notice that it doesn't really do a lot, so then you have to change the systemd PACKAGECONFIG to add resolved and networkd. Then you notice that it still doesn't really work, and you have to add pam to your DISTRO_FEATURES. So if you pick a different init system, know what you're getting into and do a lot of testing, because it's not as easy as flipping a switch and saying "I want systemd"; it takes some configuration.

C libraries: similar story. glibc and uClibc are supported in OE-Core, musl is... Yes, Richard? Richard says uClibc has been dropped. Yay. So the
issue with uClibc support in OE was that by default it turned on NLS, which included libiconv, so if you actually built an image, it would be larger than the glibc equivalent for small images. You had to do a lot of tweaking, like on the previous slide with systemd, to get it down, and then you noticed that uClibc is not the right choice. meta-musl... musl seems to be a better choice nowadays, and using it is: adding the layer, setting TCLIBC to musl, and bitbake my-image. So musl is in core; I had looked at an old checkout. It's in core now, so you don't even have to add the layer, and since it's integrated into core, all the patching has been done for you, at least for core and for the layers that Khem cared about. You still might have to patch some other software, because musl doesn't provide the same things as glibc. It is getting a bit better, because Intel's [unclear] distribution is using it and they are sending patches all over the internet, so maybe next year we'll have a talk about deleting glibc from OE-Core. Like I said, in the uClibc case people would say it's a lot smaller, but in a default configuration in OE it wasn't: NLS was turned on by default and libiconv was included. And musl seems to be displacing it; as Richard just said, it has actually displaced it. So, good.

Now to the part that [unclear]. So, BSPs. You might know Jon Masters. He is a big fan of standards and he hates cute embedded nonsense hacks, so I wore the shirt that says I do cute embedded nonsense hacks, and he keeps talking about the embedded zoo. If you think about it, the Linux kernel supports all different types of machines and it supports runtime configuration, be it device tree or ACPI or, God forbid, SFI. So why do we have BSPs? Well, because people think their product is special, and if you're an ARM or an x86 BSP, you think you are really special. That's why in the OpenEmbedded world we really have to deal with BSPs. And there is no quality control, because everyone
can start their own layer. There are no standards to live up to, and BSPs in general have very low metadata quality, because people create one to scratch their own itch. If you buy the product, the company will support it; if you haven't bought it, they have no incentive to support it. So this is not a complaint against OpenEmbedded or the Yocto Project; this is squarely on the shoulders of the maintainers. To make it worse, because of the low quality, using multiple BSPs in your distribution is actively discouraged by, quote unquote, Yocto. If you read the documentation, or go on IRC or a mailing list, they basically say "yeah, don't do that".

Lots of people try to fix the problem from different angles. A few years ago Darren Hart said: "Hey, I wrote a tool to automatically enable one BSP and disable all the others, and if you select a different machine it does all that, and that avoids all the problems." And you're like: you're not avoiding the problem, because one BSP might change this config option and the other might change that one, so you still have differences between your builds that aren't machine related, just because everyone is snowflake-special with their BSP. I would recommend integrating them one by one and then running bitbake-diffsigs and, I forgot the name, Chris Larson's tool to check what the BSP is actually changing. You'll be surprised at what a BSP changes that you'd think is not related to the machine.

If you send patches, the maintainer will generally not care or not understand the interaction issues. He will say "it builds for my distro and my BSP layer, so everything works", and you say "yes, but if you add this other BSP layer...", and generally the response will be "so fix the other BSP". So say you are a maintainer and suddenly you get an email or patches from me, and you ask: what is actually the problem? Most of the ARM BSPs poke at the floating-point ABI, which is a distro setting. The net result is that I have two devices on my desk, both using a
Cortex-A8 CPU in their SoC. One is by Freescale, one is by TI. You think: same CPU. I build for one machine, build an application, move the application to the other machine, and suddenly the application doesn't run. You're like: but I used OE and they're the same CPU, why didn't it work? Well, it turns out that Freescale says "the OE default is soft float, but we don't want that, we just force hard float for our very special machine", and the maintainer doesn't care; he refuses to apply the patches to fix it. He forces it by setting DEFAULTTUNE, and the side effect of setting DEFAULTTUNE is that it also changes PACKAGE_ARCH. So you build for your TI machine and you get a package architecture called something like armv7at2-vfp-neon, and for your Freescale one you get something called cortexa8-[unclear]. Oh my god. So it even breaks package management. There has been some work done in OE-Core, mainly by Mark Hatle, to automatically generate all the options so all the packages will be compatible, but from a distribution point of view you now have a 10-gigabyte feed for this one and a 10-gigabyte feed for that one, and they're largely the same. That's just wasting space, among other things.

Another thing you will see in BSPs: they have a bbappend for libdrm that carries patches that apply to a single libdrm version. So if it gets updated in a stable update, or you have another layer with a more recent libdrm, it fails to build, because you have 2.4.67 and the patch is against 2.4.66. Most of that specific problem has gone away: the patches were accepted, both for the BSPs and upstream, so 2.4.67 has most of the offending patches we had in the BSPs. Luckily that is now a problem of the past, but it might happen with other recipes. So if you have a bbappend, please, please, please version it. If you need to have a floating bbappend, be very clear about why it needs to be floating, and check
if the patch is version specific. And there are BSPs that have, I kid you not, a glibc recipe that changes a single option, but they include a complete copy of the recipe, and if your layer stack is arranged in a certain way, that recipe takes over from the OE-Core recipe, and that doesn't have good results.

Another one that is harder to track down is the linux-libc-headers bbappend, which means that suddenly the C library you build will have different features, like missing syscalls and things like that. That's something you don't want, because it's really hard to track down: you trace it to your C library and you cannot find any changes in the glibc recipe, except you find a BSP that changed it. You remove that, you still have the problem, and then you have to track it all the way down to linux-libc-headers. In this case it took me two months to realize where the problem actually was, because you think: these are just a bunch of headers, why would this be a problem?

And, as you will see later in the talk about the GPU blobs, there are mesa bbappends that delete all the libraries in do_install without any override safeguards. This happens because people need the Khronos OpenGL and EGL headers. Mesa provides them, their binary blobs for some reason do not, and so they bbappend mesa to delete all the libraries. So you build for your ARM machine, Mali comes up, 3D works, you think "awesome". Then you take out your MinnowBoard, you boot it up and it doesn't work.

And if you have a more recent linux-yocto recipe than OE-Core, really bad things will happen, because BitBake will tend to pick your version, the one you customized for your machine, and all the other BSPs using the linux-yocto recipe will try to build your recipe. It works just well enough that the build succeeds, but it won't boot.

And like I said, especially in the ARM ecosystem, people are really fond of picking the DEFAULTTUNE. They say: I have a Cortex-A9, I have to pick the Cortex-A9
tune, not the Cortex-A8 or the generic ARM one, because it's a lot faster. Then I usually say "show me the benchmark", and they go "oh, but there was this article on LWN by some Debian people". Yes, but they compared hard float to soft float, and not softfp to hard float. Have you done that? And they go "no, I have not". Then, when they actually benchmark it, it shows no real-life difference, and usually they can be persuaded to go back to not poking at DEFAULTTUNE and just saying "I'm an ARMv7", "I'm an ARMv8", "I'm a PowerPC", and then things get a lot easier.

If they don't, you have to include this bit of Python in your distribution, which basically finds the ARMv7 machines, looks at the features and then resets DEFAULTTUNE. We have this in Ångström, my pet project, and now the Linaro OpenEmbedded reference distribution also includes it, because without it, building for more than one ARMv7 BSP is a maintenance nightmare. This saves you a lot of heartache, because it automatically fixes things and you don't have to bother with all those maintainers. So it's a bit of a mixed blessing, and that's why I was surprised when things started failing while I was working on the Linaro distribution for my day job. Like: this works for Ångström, why doesn't it work here? And then you look at what the BSP is doing and you go: oh, yeah. So for ARMv7 this takes away most of the [unclear]. It is a magical "unbreak me" thing, and if you look through the git log, it keeps getting bigger and bigger, because the newer, lower-power ARMv7 cores support virtualization, so OE-Core added armv7ve, which I had to add two weeks ago to fix the Allwinner BSPs.

Ideally a BSP would be a single git repository with multiple layers. A base layer would have the kernel, bootloader and firmware: the bare minimum that you need to get your machine to boot. A second layer would have all the nice-to-have things like
codecs, Wi-Fi, DSP, media, whatever. And a third layer is where you bbappend other recipes, and that's where the problems will be. These really happened: a BSP was bbappending busybox, and it only added rfkill, but it did it in such a way, with immediate expansion, that it killed other busybox bbappends that were in the distro layer. It turned out they just needed rfkill to turn on their Bluetooth, and in the end it turned out that they actually needed a ConnMan config file to do the same, so the whole rfkill binary problem went away after they realized that. Then they had a different problem: their machine needed a config file for ConnMan, while that is a distro thing. Eventually they went with their own distribution to fix things like that. But that shows how you have one problem, you solve it, and the problem moves elsewhere.

And this one was a real gem: a BSP disabled pixman support everywhere. They spent a lot of time doing that and patching all the software to add their 2D engine, and it was awesome, because their 2D engine was actually an FPGA, but suddenly pixman broke everywhere else. So, yeah.

GPU blobs are a bit difficult, because they're needed to make the 3D on the machine work. So should you enable them in the BSP, or is it distro policy? For some things, like NVIDIA cards, you have an open source driver and the vendor-provided binary. Where do you make the choice? Do you make it a distro policy, or do you do that in the machine? I would tend to say it's a distro policy, but you need to be able to set a default in the machine. Otherwise you will have BSP layers that have a README saying "this is what you need to do to make it actually useful", and we should try to avoid that, because if you have a layer, it should do the best it can instead of making you do everything yourself. I've looked around and there aren't really any best practices, and especially in the ARM Mali situation, everybody just did things
their own way. Luckily, for my day job I had the opportunity to look at a system that might improve things. I think it's an improvement on the current situation; it isn't the best solution.

The problem was, as I described, that people were deleting the mesa libraries because they needed the headers. Okay, fine, we can do that. And they needed a way to inject the Mali libraries as a provider for the virtual OpenGL, EGL and GLES targets. That's okay, but there was really not a good way to do it. So every BSP had a bbappend for mesa saying do_install_append_machine, etc., and for the second machine they copied that block and appended the second machine. And the distribution layer had "for this machine do this, for the other machine do exactly the same", and that led to a lot of duplication.

What we came up with was a distro include that looks at a machine feature. Your machine says in MACHINE_FEATURES that it has a Mali-450, and if this Python code finds that machine feature, it will automatically set the Mali binary blob as the provider for virtual/egl. The downside of this is that if you don't have a Mali machine, it will automatically set the provider to mesa, so now your distribution supports mesa or Mali and not any of the others. That isn't a problem for the distribution at work yet, but it needs some improving. The issue was that you can look up what the provider is set to, but if you try to restore it, you run into a recursion loop. That's probably easily fixable, but we ran out of time because we had a September release to do. So we need to look into what the problem is, see if it's a bug in BitBake or in our code, and try to work around it, because just forcing mesa is just as bad as forcing Mali.

The blob recipe includes a bit of code that we lifted from the TI BSP, which basically checks for features and bails out if you don't meet them. What TI does in their BSP: their binary blob is hard float, so they check if you have hard float, and if you
don't, they throw an error saying "this only supports hard float". Which is good, because the Freescale BSP forces hard float to make the binary blob work, so if you have a headless i.MX6 machine, you go "I don't care about that", and that's where the magic Python comes in.

A while ago, someone in a comment on a Linux Weekly News article had a simple list of things that a project, be it an application, a library or something like OpenStack, which is a collection of projects, could do to make things easier. I forgot what the article was about, maybe it was something about ownCloud, but what that person said is: try to use a standard build system, like automake, CMake, setuptools, whatever, something that is widely used. Don't roll your own unless you know what you're doing. In all the years that I've been doing this, there has been only one project that knew what it was doing, and that is FFmpeg. They rolled their own configure, their own makefiles; it works, and it supports cross, Canadian cross, anything you can think of. So if you're not FFmpeg, please stick to a standard build system and don't try to roll your own.

Samba switched to Waf. They were like: "this is so much better than autotools, it rocks, it's Waf, it's Python, it's awesome". And then you go "and cross compilation?" and there's silence. With 4.1: "yeah, we support cross compilation", and what they mean is they just fire off QEMU. You're like: "yes, but my architecture doesn't support QEMU". "Oh, yeah, you can have a file where we cache the results", not some results but the results. And you're like: "okay, can I use both?" With Samba 4.3 you can use both. So it is an iterative process, and you can now kind of cross-compile Samba, but the switch from autoconf to Waf wasn't a good thing for distributions.

Have a clear license declaration. A COPYING file is standard. You might think "of course we have a license declaration", but that isn't the case for a lot of projects. You have to look into the source to see what the license is, and sometimes it has multiple licenses. So please
have a COPYING file. In the past month GitHub updated their UI, and it will now look for licensing in the repository and actually show the license in the top right-hand corner, which is a big, big improvement on the previous state. So you can now see that 90 percent of the GitHub projects don't have any licensing.

Include unit tests. This is a bit weird in the OE context. I think ptest executes make check; I haven't actually used it yet, so please do. OE goes a long way to be able to execute them, and it really helps for doing validation and spotting problems early on.

Use pkg-config for dependencies. That goes back to using a standard build system, but even in autotools, and especially in CMake, you have a lot of freedom to detect things on your own. Please use pkg-config. For example, Kodi is moving from autotools to CMake, and their FindSSE method to look for SSE in Intel CPUs looks at /proc/cpuinfo. So, people, please don't do that.

Have regular releases, at least for bug fixes and security fixes. We really like it in OpenEmbedded if you have a clear stable branch. That means that when there's a release branch, it is easier to update it to a stable release, because you know the ABI won't change. It still needs a lot of testing and the blessing from the maintainers, but if you have a project, please have at least a tag for your release. And know what an ABI break is. If you're using C, that is fairly obvious. C++ makes it a bit harder, not because of C++ but mostly because of GCC.

The bad things. Custom makefile hackery that doesn't include a DESTDIR. For OE builds we really like a DESTDIR: you as a project decide what files you install, and we as OpenEmbedded like to decide where they get installed. So if you have a makefile, please have both of these things: what to install and where to install. And this is a nice one: you have unit tests and they
always fail.

No clear license. Someone will say "it doesn't have a license, so you can use it as well". No, that's not it. That person usually means "this is public domain". So if you encounter such a project, please point them to, usually, the Creative Commons website, which has a good explanation of why no license is bad, and that if you say that, you probably mean public domain. Usually people go "oh, thank you for explaining it to me", they update their license to public domain and the problem goes away. So if you see a project without a license, please contact them and ask them to clarify the situation.

Related to this: having creative ideas about where files go. Libraries in bin, binaries in /lib, etc. When in doubt, please follow the FHS, and by that I mean a subset of the FHS, because the FHS allows /opt, which is a free-for-all. So ignore /opt and please stick to /usr, /etc, /var and things like that.

Not using the system CFLAGS. That one is really noticeable in the OpenEmbedded context for cross compiling, because we have to pass in the flags, especially with the compiler OE builds, because you can reuse that compiler with different flags, so a built-in default might not make sense, and we actually don't build a default into the cross compiler. So if your machine and your distro agree that you should use hard float, a compiler gets built and the compiler libraries are hard float, but the compiler itself has no built-in default for hard float, so you have to specify the CFLAGS. In the hard-float case it goes horribly wrong if your project overrides CFLAGS, because it will fail to link.

And this one is for project maintainers, and they don't like this suggestion: don't add -Werror to CFLAGS. It seems like a good idea, because we said you need to have unit tests, make check, etc., but -Werror does not only catch errors in your project, it catches errors in all the libraries and includes your project uses. So, going back a bit: your BSP patches the Linux
libc headers and does it badly, and anything that needs a C library and uses -Werror will suddenly fail to build. So you're like "GStreamer, why do you fail to build?" and they go "we don't know", and it turns out to be the C library. So if you want to use -Werror, please only do that in maintainer mode or in your make distcheck, so that things get checked automatically but it is turned off in the release, because it catches way too much outside of your application. And especially if you work on weird architectures that have no official support upstream, like AVR32, -Werror will just error out everywhere.

So those were my slides, and I usually reserve a large portion of my talk for questions. So, Matt?

So, yeah, it was a two-part question. Part one: with BSPs you see a lot of duplicates. For example, for Allwinner you have meta-sunxi, you have meta-chip, you have meta-chip, you have meta-chip; you're seeing a pattern there. What's the best practice to deal with duplicates?

Taking the specific meta-sunxi example: meta-sunxi has a bug filed that they need to add CHIP support, and someone showed the patches that he did, and, yeah, anything you can do wrong, he did wrong. They keep working on that bug, and you basically go "no, you need to delete all that and start from scratch", and they didn't really like that. So multiple people started a meta-chip on their own, because they needed support. In the long term, if meta-sunxi wants to support the CHIP, we should all work together and merge the different meta-chips into meta-sunxi. There's work being done to merge all the meta-chips into a single one.

For the other ones, it depends on whether the big BSP wants to support your product. For example, with the TI BSPs, TI says: "we care about these boards, these are our internal EVMs, these are our blessed products like BeagleBone, we support that, but your little TI-based device, we aren't going to support that as a silicon vendor". So then you'll probably need to do your own BSP, but
it's always a conversation, and the big BSPs can make it really easy to base your BSP on top of theirs. For Freescale, they use SOC_FAMILY and things like that, so you can include it and derive from that, and the BSP for your favorite board can be really small, just a thin layer on top of the other BSP. But you also see, like we currently have with meta-chip, that the three CHIP BSPs just duplicate everything from each other, because they either didn't know or didn't agree with what was happening. The best you can do is get the maintainers to talk to each other and send patches. There isn't really a one-size-fits-all; it depends on the big BSP. Ideally there shouldn't be a lot of duplication, but like we saw with GCC and EGCS, duplication and forks can be a good thing. And what was the second part?

So Khem was saying you can look at the number of stars the project has on GitHub as an indication. Yeah, but the rants are a bit hard to spot, because some people file issues on the project's bug tracker, but you also have tons of blog posts and social media posts, and they can be hard to find. But, yeah. So, Matt? [unclear]

Yeah, so the question is: where do you draw the line between layers one, two and three, and what goes into number two? That line is a bit vague. Like I said with the binary blob issue, there is some overlap with the distro and things like that, and you need to look at it on a case-by-case basis. If your codec does bad things to other recipes, then you might need to move it down to the third layer. But if it's just bad code that doesn't affect anything else, and you need it to make your device useful, then you can put it in the second layer. I try to see it this way: if something influences other things and can break other machines, you probably need to move it into number three. So one should always be safe to include, number two should be mostly safe, and number three should just not be included by default. That is
what I think would be a good line to draw. You cannot draw a hard line, because... yeah, special vendor drivers. Yeah, so user-space Wi-Fi drivers, etc., go into the second layer, but if you have a horrible vendor driver that you need in order to boot, it sadly goes into the base layer. The third layer would be for specific config: if you would ask ten OE developers "is this a distro thing or a machine thing?" and you get ten different answers, then that's the third layer. And sadly there are a lot of those things. For most things you can just say "don't poke at the ABI", but then there are things like GPU blobs, where you want something that works, and not "here's my BSP layer and here's the twelve-step program to make it work". No one likes that.

Any more questions or remarks? Well, thank you.