Good morning. My name is Amit Pundir. Welcome to this ELC session on the maintenance of development boards in the Android Open Source Project (AOSP) repo. Before we proceed, a quick disclaimer on trademarks and copyrights: I've used and referred to many names and logos in this document, so this is just to avoid any copyright and trademark issues.

I work as a senior engineer at Linaro. My primary role is AOSP bring-up and maintenance on development boards, and I've been doing this for the last 10-plus years. You can find me on IRC on these channels.

The agenda of this session is to look into the realities of AOSP development and maintenance on development boards. We will start with a quick overview of dev board usage in AOSP and the role Linaro plays in that. Then we will dig into the technical pain points of maintaining AOSP on dev boards, be it dealing with subtle AOSP breakages or keeping up with upstream projects. And then a quick slide on the different device build configurations that we play around with.

So this is what the AOSP reference board page looks like today. Reference boards, or development boards, are hardware platforms that are used as a reference for developing and testing AOSP. These reference boards provide a starting point for device manufacturers to build their own devices. The current list of AOSP reference boards includes the Qualcomm DragonBoard 845c and the Robotics Board RB5, both of which are maintained and supported by Linaro with the help of AOSP developers and Linaro's Qualcomm Landing Team.

Now, what role do development boards play in AOSP? And why do we, or why should upstream developers, care? This is our standard template, and if you have attended or watched similar talks from my Linaro teammates at previous ELCs or Plumbers, then this will look familiar to you.
Now, the short answer to this question is that development boards serve as a test platform for AOSP developers to smoke test or benchmark their features, not just with upstream projects, but across the multiple associated boards that they support. Similarly, they let upstream developers test their changes with AOSP and catch and fix regressions promptly. Over the years, we have successfully demonstrated that dev boards are essential tools in the AOSP development process, be it facilitating hardware compatibility or system integration, or just serving as a reference platform for Android device manufacturers.

To begin with, it is crucial to thoroughly research and choose a dev board for AOSP development. It can be chosen on a number of factors. One is relevance to AOSP: that the board can keep up with the AOSP requirements and the AOSP development pace. Here I'm not comparing development boards with production devices, because dev boards may lack support for certain hardware components, which limits AOSP functionality, and it is not a fair comparison anyway. But near-flagship SoCs help run AOSP smoothly over a number of years, with less chance of performance issues or difficulties in running resource-intensive applications. Other factors include upstream support, which helps deliver long-term software and security updates. Next up is an active community and documentation: a limited user base or outdated documentation makes it hard to set up or configure AOSP properly. So the key takeaways are: choose a dev board that is well supported upstream, has an active community, and provides up-to-date documentation and software support.

On to the next topic: Linaro, dev boards and AOSP. We will take a quick look at the role that Linaro plays as an Android ecosystem influencer. I'm guessing most of us here are already familiar with Linaro.
We are an upstream-first organization and focus on the development of open source software on the Arm architecture, and that includes AOSP. Linaro supports AOSP on a variety of member development boards and provides up-to-date software support, ensuring that the latest version of the Android Open Source Project and the latest Android release can be installed and run on them. I'll circle back to the difference between an Android release and the Android Open Source Project later in my slides.

To ensure that the Android operating system runs smoothly and efficiently on these dev boards, we perform extensive testing on the dev boards we support. It includes functional testing and performance testing, as well as CTS and VTS compatibility test runs. And to ensure that the dev boards meet the requirements of the Android Compatibility Definition Document, we follow all the vendor guidelines. Again, I can't stress enough the extensive testing that we do on our dev boards. These numbers are Android or AOSP specific, and they do not include the non-Android testing that we do on these dev boards. Overall, we test 13 Android kernel combinations across four user spaces in LKFT. You can take a closer and more detailed look at the LKFT test combinations at android.linaro.org.

Now moving on to the next set of topics: keeping up with the AOSP code base. AOSP is constantly evolving, with new features, bug fixes and security patches being added regularly. Keeping up with these updates, and integrating them with any local modifications you may have, can be challenging at times. So I'll be touching on a few common pain points that we usually encounter. But first: just as we talked about choosing the right hardware for your device or project, choosing the right source code base is equally important. In the Android ecosystem, an Android release and the Android Open Source Project are totally different concepts.
They're related but totally different concepts. An Android release, or release tag, refers to a specific version of the Android operating system. It represents major updates to the Android platform: new features, new bug fixes. We're all hopefully familiar with how Android releases work. They serve as a stable software reference tag for application developers. Now, this is something which may or may not be guaranteed with AOSP. I'll tell you how.

AOSP, on the other hand, is an active development branch. Unlike Android release code drops, it serves as the foundation upon which future Android releases are built. Usually new features land in AOSP first, before they get shipped in releases, but it's not a hard rule. So AOSP builds for your devices help make the device future-ready and avoid surprises during the next code drop. But being an active development branch also means that AOSP builds are more prone to regressions. An equally important point is that AOSP does not come with the vendor-specific optimizations, drivers or software enhancements that you get on production or commercial-grade devices. This lack of support can result in limited functionality on your device, reduced performance and difficulty in obtaining software updates. For our dev boards, we support AOSP and the relevant Android GSI and common kernel builds.

Now back to the main topic at hand, which is keeping up with the AOSP code base. In the next few slides, I'll share the common pitfalls that we run into. I'll start with the most common one, which is related to repo sync, or code synchronization. At times we run into random build failures or random runtime failures that are totally unrelated to what you are working on, and most of the time it is the repo sync that has gone bad: you were syncing the code with upstream at an unfortunate time.
So it is advisable to re-sync the sources after some time, or wait for the internal pre-submit tests to kick in; if there are any obvious issues, they will get rectified promptly within AOSP. If the issue still persists, then it is in your device configuration, or it is something your device is not doing correctly. For example, some core framework changes need device configuration tweaks. The most recent such breakage on our devices happened when AOSP was moving from HIDL health to AIDL health, so we had to fix our device manifests accordingly.

Speaking of device manifests, the truth is that even after so many years of tinkering with Android devices, I don't think I fully understand how these FCM device manifests and compatibility matrices work. So maybe I'll pay more attention to it next time. But honestly, it is just easier to see how Cuttlefish is dealing with those core framework changes. That's what we do: just copy those changes over to your device config and see if that helps. If it does, you keep tinkering with those configurations according to your hardware.

Keeping up with ever-changing build configuration tools can be a pain too. The moment you learn Android.mk, they have Android.bp, and now that Android.bp Blueprint files have started to make sense, everyone is moving to Bazel. So the latest is that the whole of AOSP is now moving towards Bazel, and it brings certain benefits, like lower incremental build times and builds that are more hermetic in nature.

Next up is the feature dependency on out-of-tree Android common kernel patches. Over the years this dependency has come down significantly, but it is still there. A quick note here: it is still possible to boot vanilla Linux, or maybe LTS, on your hardware platform if it is fully supported upstream.
But that holds only if you are fine with booting with SELinux in permissive mode and can live without a couple of features, like adb remount and metadata encryption, because those patches are still outstanding.

Forever-changing boot image header version requirements are, again, something I want to bring up here. Changing partition layouts and boot image header versions requires some deep bootloader changes, and from a development board or legacy device point of view, vendors rarely care about bootloaders. They have shipped with one set of bootloaders, and now it is up to you how you want to deal with that. If you are lucky, you have the source code, like ABL or U-Boot, and you can play around and develop the feature. If you are not, then it's a painful exercise. There are workarounds, and we have worked around this on a few of our devices, but it would be good to have fewer changes in bootloaders and boot image header versions. They love to throw in a new partition with every new release now.

Troubleshooting GKI boot failures is next on my list. Almost all of these are vendor kernel issues and have nothing to do with GKI as such, but I thought it would be a good idea to share the most common GKI boot failures that we run into, especially during the GKI development cycle, when the KMI is still in development and has not been frozen.
Top of the list is symbol list modifications. Out-of-tree drivers are only allowed to use a subset of the exported kernel symbols; not all kernel symbols exported upstream are allowed to be used by vendors. This subset is maintained as a symbol list, and every out-of-tree driver has to stick to that list, or, if it adds a new symbol to the list, come with a proper justification, just to keep the attack surface in check. But while the KMI is not frozen, some vendor symbols can get removed from the list, and if your out-of-tree driver depends on them, you can run into boot regressions. Again, this is device specific, and GKI, which is not pure upstream but close to upstream, does not care about that.

Next up is protected kernel modules. Vendors shipping protected kernel modules run into boot failures as well. Protected kernel modules are upstream driver modules that are signed by GKI, and vendors are not allowed to ship their own version of that stack. It's an upstream stack, and to make sure that vendors do not play with it, GKI adds signatures to the modules. If you are not using that particular module and use some other module instead, you will run into boot issues.

Then there is one instance of GKI config breakage worth mentioning here. It has happened only once with me, so I don't know how frequent it is. While transitioning from GKI 1.0 to 2.0, certain configs got removed, again to keep the attack surface in check, and the user space firmware loader support got removed. So the drivers that depend on the user space firmware loader broke, because that config was missing. The workaround we have for that problem is this: those drivers, I won't say they are broken, still behave the same way, they invoke the user space firmware loader and load the firmware,
but since this config is removed in GKI, we ship those firmware binaries in the ramdisk, so the moment the driver loads, it finds the firmware in the ramdisk.

Other than these technical challenges, there can be non-technical hurdles while dealing with AOSP as well. One worth mentioning here is the policy change around how firmware binary blobs are handled in AOSP. It kicked in last year, when AOSP told all development board vendors that it will not host any vendor firmware binary blobs in AOSP because of licensing constraints, even though those binaries are redistributable in nature and are already part of linux-firmware. The policy is that as long as they cannot rebuild the binaries, they cannot host them in AOSP, to avoid any legal issues. That has made AOSP dev board vendors like Linaro switch to their own firmware hosting and provide an additional step, with additional build scripts, to download the firmware binaries before building AOSP for the devices.

Now, keeping up with the relevant upstream projects. One of the most important aspects of supporting a device in a project as big as AOSP is keeping up with the constant churn of changes. We have already seen how changes in AOSP affect dev board functionality. Keeping track of external projects is equally important. Everyone uses Linux, so you have to keep track of that project. In the context of this talk I will only mention Linux and the open graphics stack, Mesa and Freedreno, but the list can go on and on depending on the features being supported on your device.

Working with the upstream Linux kernel, or LTS, and AOSP brings many long-term benefits, but it can be a pain too. I'll start with the one we frequently get bitten by: upstream does not count device tree node names and sysfs entries as stable interfaces, and these names and entries get renamed more often than not, and hence
breaking AOSP. For example, Mesa has dependencies on these device nodes, and specifically reads these sysfs entries too, for its functionality. And AOSP uses SELinux, which has device-level access controls, so every time a device node or sysfs entry gets renamed, we have to change our SELinux policies to match that change. The most common examples are GPU and remoteproc.

The other important thing is convincing upstream kernel developers that the bug you have reported is a valid bug, and that can get tricky at times. This is one of the frequent points of contention between AOSP developers and upstream kernel developers. Sometimes we get lucky: you can convince the developers that it's an upstream breakage, and you can reproduce it on AOSP or any other regular distribution. But sometimes we have to reproduce the bug on a regular Linux distribution and show that this is definitely broken, and this is how we reproduced it, to get some attention. It gets more difficult when a bug is tied to a firmware binary. If it is in the linux-firmware.git project, maybe you can talk to upstream developers and figure out what is going wrong. It gets tricky if it is a signed firmware binary that has come from a vendor. And at times we have run into cases where some non-fatal regressions just go unanswered. I've listed one example, which I think I reported a couple of years ago; it's not a big deal, so I just skipped it.

The other big project, or set of projects, that we integrate on our Qualcomm development boards is the Freedreno open graphics stack. Integrating Freedreno in AOSP brings many benefits: it enhances stability and compatibility with newer code bases, and, most importantly, it helps increase the lifetime of the device beyond the original device manufacturer's support cycle. But like upstream projects of any nature, Freedreno has its own
limitations and pain points, starting with fairly regular build breakages with AOSP. Over the years, Roman from GloDroid has done some fantastic work keeping Mesa up to date with the AOSP build stack; he got rid of the prebuilts that we used to use in the AOSP Mesa project. But the breakages tend to follow fast, because it's an upstream project. Some recent ones I have run into are due to a host build tools version mismatch, and some errors are due to deprecation of a supported library in AOSP.

One upstream issue worth mentioning here is a particular Mesa runtime crash on AOSP. It happens if you use Mesa binaries built with AOSP, but if you use Mesa binaries built with the Android GSI branches, they boot fine. I have reported the issue upstream, we did some follow-up with the upstream developers, and there are to-do items on my plate. Right now it looks like LLVM 17 has some role to play in that breakage, but we'll see.

Updating an AOSP external project, if you ever get to do that, is again a vicious cycle. The Mesa 3D project in AOSP is awfully outdated right now, and updating it to the latest upstream version or release is not a trivial exercise. We have done it in the past, updating the Mesa project and other projects, but the last time I tried to update this particular project I gave up, because of a lot of dependencies and internal patches, and I thought that instead of fixing that I could just publish my own set of upstream binaries. In one particular use case, fixing the upstream project was easy, but merging the changes back into AOSP got tricky, because it introduced a new set of SELinux denials and warnings in Cuttlefish. So we had to fix the Cuttlefish warnings first, and then go back and do the merging. It can go this way too.

Last but not least, although Freedreno aims to provide comprehensive support
for Adreno GPUs, there may be cases where certain features, advancements or optimizations are not fully implemented. This was certainly the case with Vulkan support on Freedreno. I'm not up to date on that; maybe it is there now, maybe it is not. It's just one point I wanted to make.

Coming to the last set of slides. AOSP provides a platform for developers to explore new features, new functionality and design concepts, in addition to their own set of customizations and modifications that suit their specific needs and preferences. So other than chasing and fixing upstream or AOSP regressions, we also get to work with different device build configurations for different use cases. This is just a summary of the kind of work we do: bring-up on newer or resource-constrained SoCs, which gets challenging at times; booting a minimal AOSP rootfs with only console access; booting with software rendering while GPU support is still in progress; or experimenting with unified AOSP boot images, so that one set of AOSP boot images can boot on a number of dev boards, mostly from the same vendor. It can be a great experience and can help you gain a deeper understanding of the Android operating system.

That brings us to the end of this session. The key takeaway from this talk is that running AOSP on dev boards, or maintaining dev boards inside AOSP, can be a complex and challenging process, but it also brings a wide range of benefits. Dev boards have often proved to be an essential tool in the AOSP development process, serving as a reference platform for Android device manufacturers. Do we have any questions?

Thank you so much for the presentation. I have a question regarding WebView and Chromium: what is your strategy for supporting and updating them, or keeping them inside your builds? Do you just take the binary that's part of upstream, or are you also focusing on some
regressions, and then aligning to Chromium releases, which may not necessarily align with the AOSP pace of doing things? Specifically, there are lots of hardware interdependencies between WebView and encoders and decoders, so things can naturally break there as well.

Right, so we tend to go with what is already provided, the binary that is already in the AOSP external project. But there have been cases. The recent one is memfd. If you don't want to use ashmem, which is deprecated upstream, and want to use the memfd interface instead, then when we started looking into that, Chromium was broken. Or rather, Chromium itself was not broken; the binary that was shipped as part of AOSP was broken. So we raised the concern upstream that the Chromium being shipped in AOSP was broken, and that I could not reproduce the breakage with the upstream Chromium version. They were glad to re-spin the binaries and update the ones in the external project. So I hope that answers your question: if we run into breakages like this, we just report them upstream. AOSP is a huge set of projects, and my understanding of the code base is limited to what I have touched so far. The moment I get to work on a different subsystem, my first response is to ask the developer what is going on, instead of burning my own hands and finding out the hard way. It's easier to get help.

Thank you so much.

So you mentioned that the DT nodes and the sysfs nodes are not stable, I mean, not considered stable ABI. Do you mean that the device names in sysfs are not stable?

So I can give you one specific example, a Qualcomm board specific one. I think it was in sysfs, under /sys/devices/platform, the change from soc to soc@0 some time back, I think in 5.4. That broke all our SELinux policies, because of the path you are specifying. SELinux provides access control at
the device file level. You have to specify that if one particular binary is trying to access this path, then allow it or don't allow it. Now if that path changes when you update the kernel version, that breaks SELinux, and it's a hard dependency, because you have to boot in enforcing mode.

Yeah, indeed, this is not an ABI.

Yeah, definitely. We have discussed it over the years, and we understand that it's not an ABI, so if it changes, then you have to change too. It's just like adding one more line to the SELinux rules.

Thanks for the insights. So what dev board would you currently recommend?

I'm wearing a Linaro hat right now, so I don't know if I should be recommending any dev boards. As for dev board vendors: Amlogic does a fantastic job, sorry, BayLibre does a fantastic job maintaining the Amlogic devices in AOSP, and a few BeagleBoards nowadays; they are supporting the new ones. We support dev boards as well, the Qualcomm DragonBoard 845c and RB5. The newer ones are hard to come by, and we are not officially supporting the newer dev kits. But the problem is that these dev boards are getting expensive. It's not like playing with Raspberry Pis. I don't know the price point of BeagleBoards nowadays, but they used to be cheaper 10 or 11 years ago. So there are dev board vendors: BayLibre, I don't know if Bootlin does that or not, so BayLibre and Linaro. There may be others; I'm outdated on that information.

If I could just do a quick follow-up on that: in Android 14 there's been a bit of a tidy-up of the dev boards supported in AOSP, so I think quite a few have been kicked out. Do you have a view as to which dev boards will be in 14? DragonBoard will be one, I think.

I know that DragonBoard and RB5 are still there. I don't know if I'm the right person to talk about this right now.

Thank you. So I have a question regarding Android.bp
files, and I don't have the right answer; I've been struggling with this. Moving from Android.mk to Android.bp, we are quite limited when it comes to hacking all kinds of build flags into the Android.bp files. One of the most challenging things I found is that if you support multiple boards, in some of the sub-projects you want to enable certain features, like "I want to have this feature on this board", but you don't want to hard-code that into the Android.bp files. So I ended up writing plugins for the bp subsystem to be able to propagate certain flags to sub-projects and maintain a nice structure. Do you have a take on how to support a board-specific feature with Android.bp without having too many hard-coded makefile configurations in the bp files?

Right, I totally get your point that supporting multiple devices was a lot easier with mk files. You add a couple of build flags in your device config and Android.mk takes care of that. But the pain point was namespaces: if you have two projects that use the same project name, the build fails with Android.mk files. With Blueprint files you can specify the namespaces in your config file, and you can import whichever project you want to import.

Now, coming back to hacking Blueprint files, I've been through the same thing recently. Say I have to add a C flag, just a C flag: if there's a config I'm updating and I'm running on a specific board, then just enable this flag. I ended up looking into the Soong config module properties, I think that's what they are called. You have to import the Android.bp module and add the property you want; it could be anything, properties are a project-specific thing. So I added a C flag property, and it was working fine, but then I realized that the project, the upstream project, was minigbm. So
external/minigbm doesn't define a namespace. And the first thing is that you cannot import a project's bp file if it doesn't define a namespace, so that was the first blocker. So I said, okay, let's define a namespace. I hacked the external/minigbm project and added a namespace there, but the moment I did that, the other projects that were using minigbm but were not importing the namespace started failing. And these are just the AOSP projects; I don't know how many projects outside AOSP are doing that. So if I update minigbm and add a namespace, I can take care of the AOSP projects, but I cannot take care of the other N projects that might be using minigbm the way it is right now. So I totally get your point. Whatever you're trying to do, I did the same thing, but I got stuck at the namespaces. I can modify that, and if I fix it in AOSP, then AOSP will be happy to take that patch as well, but I did not want to break other non-AOSP projects.

Yeah, we ended up writing a custom plugin, because adding that namespace just kind of went out of control.

Yes. How many projects do you have to modify just for a single flag?

Yes, exactly. Thank you so much.

Thank you, Amit. Unfortunately we are out of time, but the questions can be carried forward to the hallway.

Sure, thank you.
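
As a rough illustration of the namespace problem described in the last answer, the Python sketch below models how a build system with namespaces might resolve a dependency name. It is a toy model and not Soong's actual implementation. The `ToyBuild` class, the assumed lookup order (own namespace, then imported namespaces, then the global namespace), the `libexample` module name and the `device/foo` and `device/bar` paths are all hypothetical; only `external/minigbm` comes from the discussion.

```python
from typing import Dict, Iterable, List, Optional, Set

ROOT = "."  # stand-in for the global namespace holding modules without an explicit namespace


class ToyBuild:
    """Toy model of namespace-scoped module lookup (not Soong's real code)."""

    def __init__(self) -> None:
        self.modules: Dict[str, Set[str]] = {ROOT: set()}
        self.imports: Dict[str, List[str]] = {ROOT: []}

    def add_namespace(self, path: str, imports: Iterable[str] = ()) -> None:
        # Declare a namespace and the other namespaces it imports.
        self.modules.setdefault(path, set())
        self.imports[path] = list(imports)

    def add_module(self, namespace: str, name: str) -> None:
        self.modules.setdefault(namespace, set()).add(name)

    def resolve(self, consumer_ns: str, dep: str) -> Optional[str]:
        # Assumed lookup order: own namespace, imported namespaces, then the root.
        search = [consumer_ns, *self.imports.get(consumer_ns, []), ROOT]
        for ns in search:
            if dep in self.modules.get(ns, set()):
                return ns
        return None  # dependency is not visible to this consumer


def build_tree(minigbm_has_namespace: bool) -> ToyBuild:
    """Build a tiny tree with one library and two hypothetical consumers."""
    tree = ToyBuild()
    if minigbm_has_namespace:
        tree.add_namespace("external/minigbm")
        tree.add_module("external/minigbm", "libexample")
    else:
        tree.add_module(ROOT, "libexample")
    tree.add_namespace("device/foo")  # does not import the new namespace
    tree.add_namespace("device/bar", imports=["external/minigbm"])
    return tree


if __name__ == "__main__":
    for flag in (False, True):
        tree = build_tree(flag)
        print("namespace added to minigbm:", flag)
        print("  device/foo resolves libexample in:", tree.resolve("device/foo", "libexample"))
        print("  device/bar resolves libexample in:", tree.resolve("device/bar", "libexample"))
```

In this model, moving the library into its own namespace hides it from every consumer that does not import that namespace, which mirrors the concern about breaking projects outside AOSP that cannot be updated in the same change.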