Hello, and welcome to this session on upstreaming a base port for a Qualcomm SoC. My name is Vinod Koul. I have been doing Linux kernel work since 2007. Previously, I worked for Intel in the audio team. In the kernel, I am the maintainer of the DMA engine and SoundWire subsystems, and I also look after compressed audio. Along with this, I co-maintain the generic PHY subsystem. Nowadays I work for Linaro, in the Linaro Qualcomm Landing Team.
The Qualcomm Landing Team is a specific team inside Linaro whose job is to upstream support for Qualcomm chipsets. We try to solve upstream problems for Qualcomm chipsets and upstream various drivers. Now, upstreaming and Qualcomm are two terms we typically do not use in the same sentence, and if we do, it is usually in a negative fashion. When I joined Linaro a couple of years back, I tried to learn about device tree, Qualcomm chipsets and the ARM architecture. During the course of my work I picked up a few things, and then I was given the task of upstreaming the base port for a premium-tier chipset. This talk essentially documents the journey I undertook while upstreaming base port support for a Qualcomm chipset. As we go through this journey, we will learn how easy or difficult it is to upstream for a Qualcomm chipset, and this documentation may help other people when they try to do similar upstreaming for the Qualcomm chipsets they have.
With this, let's get started. First we will cover how to go about base port upstreaming, then how easy or difficult it is to get serial console access and what steps we need to take. Once we have serial console access, we can start to look into other subsystems and enable them. First we will work on pin control and clocks. The rest of the devices also need regulators, so we will work on those next.
For a decent device we also need storage, so we will discuss UFS briefly, and at the end we will discuss USB. In this talk we will not be talking about modem or multimedia. These are big enough topics to warrant their own talks, so we will not deal with those agenda items here.
How does one work on a Qualcomm chipset? Qualcomm is a member of the Code Aurora Forum. The Code Aurora Forum is a Linux Foundation project, and Qualcomm is one of the leading members of that project. What Qualcomm does is open up all of its kernel source code on the CAF website, which is listed here. For a particular SoC, you will find a specific kernel version available on Code Aurora Forum. When I started this work, msm-4.14 was the latest version available. These kernels are typically derived from the last LTS: they take the LTS and start enabling their premium-tier chipsets on it, and this is what they release to their vendors and partners. So msm-4.14 was released last year, and this year, in April, they have already released msm-4.19. Hopefully we will have further revisions of this kernel in the future.
Once we have the downstream kernel source, we can look at how the various drivers are written and what modifications have been made. Some of them are based on upstream drivers; some do not use the upstream drivers at all and just use the downstream versions. If you have the board schematics available, they will tell you how the various devices and peripherals are connected. That is important when you try to check which clocks are required for a device, how the regulators are connected and so forth. And of course, you need the board. Once you have all this material available, we will try to enable the serial console.
As I was saying, I was given the task of enabling support for the premium-tier chipset, and that happened to be SM8150. This was released in July 2019. Last year it was one of the leading mobile SoCs around: the Pixel 4, the Poco F1 and other premium phones released last year featured this chipset.
Here is a brief look at the block diagram of this chipset. At the center is the SM8150 SoC. From the display point of view it features DSI ports. It also has an I2C port for sensors, and CSI ports for cameras. It supports both high-speed and super-speed USB. It supports UFS and SDIO. It connects to the PMICs: PM8150, and the PM8150B and L variants as well. Then we have a WCN chip and a QCA chip for connectivity. It also supports SLIMbus, SoundWire and PDM for audio.
How do we go about booting to console? The serial driver is upstream, so no changes are required there. If you look at the serial driver, which is Qualcomm GENI, it gives us two compatibles: "qcom,geni-debug-uart" and "qcom,geni-uart". The plain UART compatible is used for general UART functionality; for the serial console we need to use "qcom,geni-debug-uart". Along with the UART, you need a simple, reduced clock driver which describes only the UART clocks and nothing else. With this, you should be able to boot to console. These are the only two things you need for the console to be enabled.
Of course, you also need the basic DT description, which we will talk about next. Once you have the basic clock driver and the GENI driver for the serial console, you need to describe the particular SoC in very simple steps: the SoC itself, the timer and so forth. What we do is take the downstream device tree description. It has a bunch of additional fields which may or may not be relevant when we try to upstream things. So first, we go and describe the CPUs.
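As a rough illustration of the console node discussed above, a minimal sketch might look like the following. Only the two compatible strings come from the talk; the node label, unit address, register size and clock specifier are assumptions for illustration and should be checked against the downstream device tree and the clock bindings for the SoC.

```dts
/*
 * Sketch only, not a complete or verified SM8150 description.
 * Assumed: label "uart2", the address/size, and the clock macro name.
 * Interrupts and pin configuration are omitted for brevity.
 */
uart2: serial@a90000 {
	/* Console variant; general-purpose ports use "qcom,geni-uart" */
	compatible = "qcom,geni-debug-uart";
	reg = <0x0 0x00a90000 0x0 0x4000>;		/* placeholder window */
	clock-names = "se";
	clocks = <&gcc GCC_QUPV3_WRAP1_S4_CLK>;	/* from the reduced clock driver */
};
```

The board DTS would then typically point the console at this port, for example through a serial alias and the chosen node's stdout-path property.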
In the downstream DT, you can find that SM8150 has Kryo 485 cores with these frequencies: one gold core at 3.6 gigahertz, three gold cores at 2.7 gigahertz and four cores at 2.3 gigahertz. So we add the new compatible for this CPU and add the eight CPUs to the device tree. Then comes GCC, which is the global clock controller, not to be confused with our GNU C compiler. We add the GCC driver and its compatible. We already have the driver written for the UART, so that should work. Then we also need to describe the timer. The timer driver is upstream, so we do not need to worry about that. We just need to look up the address in the downstream device tree, use that, and add the compatible "arm,armv7-timer-mem". So we add that description to the device tree and try to boot. These were the basic steps I needed to boot SM8150 on an upstream kernel; I did not need any additional changes.
With this, we have the serial console done. Next we go to pin control. Downstream already has a pin control driver, so we take that and start to clean it up. One of the things Bjorn Andersson, who also works in the Qualcomm Landing Team, has done is add support for disjoint tiles. What I found is that Qualcomm SoCs have a bunch of tiles which together provide the pin control functionality. So we need to describe not just one controller instance but a bunch of tiles for the whole pin controller. Traditionally the tiles were joint, so it was one big description, but on some platforms they turned out to be disjoint, so Bjorn added support for disjoint tiles. The tip here is to use the tile support even when the tiles are joint, because it gives us free handling of the XPUs: we do not map them, we map only the specific tile areas.
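To make the tile idea above concrete, here is a hedged sketch of a pin controller node that maps one register window per tile rather than one large region. The compatible string, the tile names, and every address and size are assumptions for illustration; the real values come from the downstream device tree and the upstream binding.

```dts
/*
 * Sketch only. Assumed: compatible string, tile names, addresses, sizes.
 * Mapping each tile separately leaves the gaps between tiles unmapped.
 */
tlmm: pinctrl@3100000 {
	compatible = "qcom,sm8150-pinctrl";
	reg = <0x0 0x03100000 0x0 0x300000>,	/* tile 1 */
	      <0x0 0x03500000 0x0 0x300000>,	/* tile 2 */
	      <0x0 0x03900000 0x0 0x300000>,	/* tile 3 */
	      <0x0 0x03d00000 0x0 0x300000>;	/* tile 4 */
	reg-names = "west", "east", "north", "south";
	gpio-controller;
	#gpio-cells = <2>;
	interrupt-controller;
	#interrupt-cells = <2>;
};
```

Each entry in reg-names pairs with the register window at the same position, which is how a driver can look up a tile by name instead of by index.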
In the case of SM8150, we found that we had four tiles, and we add all four tile descriptions. Then we add the UFS reset after the pin control pins, and the SDC pins at the end. This is typically a requirement of the driver, which walks all the pin control pins and expects the last ones to be UFS and SDC. That is hard-coded in the driver for these chipsets.
We have already described the reduced clock driver for the serial clocks, but in order to get other devices up and running we need the complete clock driver. Downstream already has a clock driver available, so we take that. Unfortunately, the downstream clock driver is written in the older style, and upstream, Stephen Boyd has already migrated the clock drivers to use the parent data scheme. This is one of the new things which happened while I was doing the implementation for SM8150. The parent data scheme describes the parents of a clock not as arrays of name strings but as direct references to the clocks. This helps resolve namespace issues in clocks, because you can have multiple clock controllers, and everybody referring to the same names might cause collisions. Instead of that, we use direct references.
On a platform you will typically have external clocks such as the crystal oscillator or the sleep clock, and in the case of Qualcomm we have something called RPMHCC, which we will talk about later. We describe these as parents in the device tree and refer to these clocks. This is one of the new steps you have to do on recent kernels. On older kernels you would not need to do this, but we are upstream now, so we need to work on it.
So to port the driver, we first need to convert it to the parent data scheme. Then we describe the parents for that particular platform in the device tree. In the downstream driver, there are a bunch of fields which are added for their own handling in downstream kernels.
We do not have equivalent handling of those upstream, so we start to remove those bits. One of these in the clock driver is the vdd fields, so we remove them. Then for clock ops, upstream uses different ops compared to downstream, so I created a simple lookup for the clock ops: where downstream uses clk_branch2_hw_ctl_ops we have to use clk_branch_simple_ops, and where downstream uses clk_gate2_ops we have to use clk_branch2_ops. With this, your clock driver is ported, and you can enable it and boot with it.
One of the debug tips for clocks is to look at debugfs. Debugfs is a wonderful thing while enabling devices. It has lots of information to help you find out what is going on in the system and what is going wrong. For clocks, we have /sys/kernel/debug/clk, and in there is a file called clk_summary. The clock summary tells you how many clocks there are on that particular platform, what the parents of a specific clock are, how many children a clock has, what frequency they are running at, and how many times a clock has been enabled, or whether it is not enabled at all. This gives you a good view. For example, if you are expecting a clock to be used by a particular device and that device is enabled, but the clock is not in use or the rate is wrong, it will help you find out what is going on in the system.
When you look at the clock summary on a Qualcomm chipset, you might be surprised that some clocks do not have parents. If you look at the description in the upstream clock driver, you will see the same thing. The reason is that these are shared clocks and Linux does not manage their parents. We just describe these clocks for use by the various devices.
With this, you will have the clock driver done. As I said, one of the things to do with the clock driver is to look at debugfs to see whether things are done right or not, and to keep doing that iteratively.
We were talking previously about describing the various clocks. Here is a little more information on that, because I feel this is not documented anywhere and is a kind of tribal knowledge. On Qualcomm chipsets a crystal oscillator is present, which typically generates a clock at 19.2 megahertz or 38.4 megahertz. It feeds that clock to the PMIC; in this instance, that was the PM8150. The PMIC in turn feeds the 19.2 megahertz clock to the SoC as CXO in. So the PMIC is the entity which takes the crystal clock and then generates 19.2 megahertz as well as a bunch of other clocks. We call this the RPMH CXO clock, and that is the XO for the SoC. The controller which resides inside the PMIC to configure and control these clocks is called RPMHCC. This is something specific to Qualcomm. Other platforms or other SoC vendors may or may not have this kind of architecture, so you need to figure that out. This RPMh clock controller is not directly managed by Linux; we send messages through the RPMh framework to the clock controller and ask it to configure the various clocks and so forth.
In order to describe all these clocks, we will first look at the device tree description of the XO and RPMh clocks and the input feeding the SoC, and then we will go into details about the RPMh clock controller. Since we have two fixed clocks, we describe these as fixed clocks in the board DTS. This is not the SoC DTS, this is the board DTS. So we describe the XO board clock with whatever its clock frequency is and then provide the output name. In this case, we used xo_board.
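The board-level fixed clock described above can be sketched as follows. The "fixed-clock" binding and its properties are standard; the node label, node name and the 38.4 MHz value are assumptions for illustration, since the actual crystal frequency depends on the board.

```dts
/*
 * Board DTS sketch, not SoC DTS.
 * Assumed: label/node name and the 38.4 MHz crystal frequency.
 */
/ {
	clocks {
		xo_board: xo-board {
			compatible = "fixed-clock";
			#clock-cells = <0>;
			clock-frequency = <38400000>;	/* assumed crystal rate */
			clock-output-names = "xo_board";
		};
	};
};
```

Other nodes can then refer to this clock by phandle (&xo_board) rather than by its global name.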
There was an additional sleep clock present, which you will typically find on a bunch of platforms as a 32 kilohertz clock. That sleep clock is described here as well. In the previous diagram we saw that the crystal clock actually feeds the PMIC, so next let's describe the PMIC RPMh clock. The RPMh clock controller is described here, and we add a new compatible for it; we will go into a little more detail on that in a while. Then we specify that the clock feeding it is the XO board clock. This is the important part: it describes the parent clock. This is the work of Stephen Boyd upstream on describing parent clocks directly.
Then we go to the global clock controller. We describe that clock controller and tell it that its parent is the RPMh clock controller and which clock it takes from it. It also takes the sleep clock, so that is the second clock we feed to it. Together, these two device tree descriptions say that we have a crystal clock, the crystal clock feeds the RPMh clock controller, and the RPMh CXO output feeds the SoC as its CXO input. That was the parent description in the clock drivers. As you can see, there is a direct reference to the clock, so you might have another clock named XO on the board, but we will not have a namespace collision, because we refer directly to this XO board clock.
Now let's discuss the RPMh clock controller, which is the missing piece here. As I was saying, the PMIC clocks are actually managed by the RPM, which stands for Resource Power Manager, and this driver is already upstream, so you do not need to do much. The driver to look at is drivers/clk/qcom/clk-rpmh.c. As you saw in the previous device tree description, we add a new compatible and describe the clocks, because different platforms may have different clock outputs coming in from the PMIC.
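The parent chain just described (crystal, then the RPMh clock controller, then the global clock controller) could be sketched in device tree roughly as below. The compatible strings, the clock-names, the RPMH_CXO_CLK macro and the register window are assumptions based on how upstream Qualcomm device trees are commonly written; xo_board and sleep_clk are assumed to be fixed clocks defined in the board DTS.

```dts
/*
 * Fragments only; the two nodes live at different places in the tree.
 * Assumed: compatibles, clock-names, addresses and the RPMH_CXO_CLK macro
 * (from <dt-bindings/clock/qcom,rpmh.h>).
 */

/* Child of the RPMh resource controller (apps RSC) node */
rpmhcc: clock-controller {
	compatible = "qcom,sm8150-rpmh-clk";
	#clock-cells = <1>;
	clock-names = "xo";
	clocks = <&xo_board>;			/* crystal feeds the PMIC */
};

/* Under the SoC bus node */
gcc: clock-controller@100000 {
	compatible = "qcom,gcc-sm8150";
	reg = <0x0 0x00100000 0x0 0x1f0000>;	/* placeholder window */
	#clock-cells = <1>;
	#reset-cells = <1>;
	clock-names = "bi_tcxo", "sleep_clk";
	clocks = <&rpmhcc RPMH_CXO_CLK>,	/* CXO output from the PMIC */
		 <&sleep_clk>;			/* 32 kHz sleep clock */
};
```

Because both parents are phandle references, another clock on the board that happens to be called "xo" would not collide with this description.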
These clock outputs are described in the RPMh clock driver with their different offsets. The description is basically a table which needs to be added for the platform, so it is a platform-specific adaptation of the driver. We do not need to write the driver, just describe the compatible and the data. With this, the clock drivers are done.
We now switch to the other infrastructure used by the rest of the drivers. First is command DB, which is actually used by the clock controller. Command DB is a command database, as the name implies. It is a shared-memory SoC driver which helps in finding SoC-specific identifiers and information. As we discussed previously, there are RPMh clocks, and the RPM needs to tell the PMIC which specific clock needs to be enabled. How does it go about identifying that? This is where command DB comes in. It is populated by the firmware on the board, and you can look up in it that, for this particular clock, this is the identifier I need to use to communicate with the PMIC, and that is what we use. So this is one of the prerequisites for enabling the RPMh clock controller. In the downstream DT, you can find the information about command DB in the memory map. We add it to the device tree and it should work.
Next is regulators. Unfortunately, this is still one of the topics where downstream is not reused very much. Hopefully things will improve in future. As we discussed for the clock controller, the RPM also controls the regulators. We have a regulator driver for that, the Qualcomm RPMh regulator driver, which you can find in drivers/regulator/qcom-rpmh-regulator.c. In the downstream description you can see the PMIC ID. We use it to get the address from command DB, as we did for the clocks.
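The two pieces above, the command DB region and an RPMh regulator block keyed by PMIC ID, might be sketched as follows. Everything concrete here is an assumption for illustration: the reserved-memory address and size must come from the downstream memory map, the supply property names follow the pattern used by the upstream RPMh regulator binding, and the voltages and the vph_pwr and vreg_s6a labels are hypothetical board values.

```dts
/*
 * Fragments only. Assumed: addresses, labels, supply names, voltages.
 */

/* SoC DTS: region populated by firmware, looked up via command DB */
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	aop_cmd_db: cmd-db@85f20000 {
		compatible = "qcom,cmd-db";
		reg = <0x0 0x85f20000 0x0 0x20000>;	/* placeholder */
		no-map;
	};
};

/* Board DTS, child of the apps RSC node */
pm8150-rpmh-regulators {
	compatible = "qcom,pm8150-rpmh-regulators";
	qcom,pmic-id = "a";			/* used for the command DB lookup */

	vdd-s1-supply = <&vph_pwr>;		/* hypothetical board supply */
	vdd-l1-l8-l11-supply = <&vreg_s6a>;	/* LDO group fed by an SMPS */

	vreg_s6a: smps6 {
		regulator-min-microvolt = <900000>;	/* placeholder */
		regulator-max-microvolt = <900000>;
	};

	vreg_l1a: ldo1 {
		regulator-min-microvolt = <750000>;	/* placeholder */
		regulator-max-microvolt = <750000>;
	};
};
```

A consumer device node would then name the regulator it uses through a supply property pointing at one of these labels.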
Even if we were not on Qualcomm, how would one go about describing regulators? Let's take a step back and think about that. We take the schematics and look at how the different devices are connected to the regulators. The way I go about this is to start with the PMIC supplies: how many PMICs are present and how many supplies they provide, then whether there are any SMPSs or LDOs on the board and how they are connected to the supplies, and then how the LDOs feed the respective devices. Basically, I create a map of the regulator supplies all the way from the PMIC to the devices, and that is essentially the description we provide in the board device tree.
The key point here is that it should be the board device tree, not the SoC device tree, because the way regulators are connected to devices typically depends on the board. Different boards may connect them differently, and some boards may connect them directly. So the description should be coded in the specific board DTS file. We describe the PMIC supplies, the SMPSs and the LDOs they feed, then the supplies of the LDOs, and then the particular device node in the DTS says which regulators it is using. This is how the whole connectivity is established.
Once we are done with the basic clocks, pin control and regulators, we are at the cusp of enabling the rest of the devices. But there are small infrastructure items which are mostly upstream, where we just need to add a compatible, describe the device tree node, or add some driver data inside the particular driver. We will go through these items before we switch gears. In the SoC infrastructure, first is the PMU. For the PMU we do not need to touch the driver, which is upstream; we just need to add the compatible.
It is "arm,armv8-pmuv3". Next is PSCI. For PSCI we use the compatible "arm,psci-1.0"; nothing fancy here. Then we have SMEM. For SMEM we use the compatible "qcom,smem" and describe the node in the device tree. Next is the mutex. Here we use "qcom,tcsr-mutex" as the compatible, describe the device tree node, and we are done.
Then comes AOSS QMP. In this case we need to add a platform-specific compatible, because if you look at the QMP AOSS driver, which is in drivers/soc/qcom/qcom_aoss.c, there are platform-specific offsets and data which need to be added for each platform. So we add a new compatible for SM8150, describe the offsets in the driver and then the device tree node, and we are done with that. Then we have the mailbox. Again we add a platform compatible and the data, as we did for AOSS. In this case the mailbox driver is drivers/mailbox/qcom-apcs-ipc-mailbox.c. And last is the apps RSC, where RSC stands for resource state coordinator; we just need to add the compatible and the device tree node. These are the small SoC infrastructure items which are required by various devices and drivers, so we get them out of the way before we start on the bigger items, which in this case is UFS.
UFS has two parts to it: the controller part and the PHY part. The UFS controller is upstream, and we do not need to make any code changes in it. We can use the compatible "qcom,ufshc" and describe the node in the device tree. If you look at the downstream UFS description, UFS also contains ICE, the integrated crypto engine. That is not yet upstream, but people have been posting patches for it, so hopefully it should land upstream pretty soon. Next comes the UFS PHY.
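As a hedged illustration of the small SoC infrastructure nodes listed above, a few of them could be sketched like this. The PMU compatible is the one named in the talk; the SM8150-specific compatibles for the mailbox and AOSS QMP, and all addresses and interrupt numbers, are assumptions for illustration and need to be checked against the drivers and the downstream device tree.

```dts
/*
 * Fragments only. Assumed: SoC-specific compatibles, addresses, interrupts.
 * GIC_* and IRQ_TYPE_* come from
 * <dt-bindings/interrupt-controller/arm-gic.h>.
 */
pmu {
	compatible = "arm,armv8-pmuv3";
	interrupts = <GIC_PPI 5 IRQ_TYPE_LEVEL_HIGH>;	/* placeholder */
};

apss_shared: mailbox@17c00000 {
	compatible = "qcom,sm8150-apss-shared";	/* platform compatible + driver data */
	reg = <0x0 0x17c00000 0x0 0x1000>;		/* placeholder */
	#mbox-cells = <1>;
};

aoss_qmp: power-controller@c300000 {
	compatible = "qcom,sm8150-aoss-qmp";		/* platform compatible + offsets */
	reg = <0x0 0x0c300000 0x0 0x400>;		/* placeholder */
	interrupts = <GIC_SPI 389 IRQ_TYPE_EDGE_RISING>;	/* placeholder */
	mboxes = <&apss_shared 0>;
	#clock-cells = <0>;
};
```

The pattern is the same in each case: the generic driver stays as it is, and the new platform shows up as one more compatible string, plus a data table where the driver needs per-SoC offsets.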
Typically we need to make some driver changes in order to support the required PHYs. PHYs are mostly platform specific, and even where a PHY is reused, we probably need a new initialization sequence or a new calibration sequence for it. One of the good things about PHYs on Qualcomm chipsets is that they use a common PHY, called the QMP PHY, across various subsystems such as UFS, USB and PCIe. The QMP PHY driver is already upstream, and what we need to do for a specific platform is describe its initialization and calibration sequences, which obviously differ from platform to platform.
What we do is take the downstream PHY driver, which is actually not the QMP PHY driver; downstream uses a different, platform-specific PHY driver. We take that platform-specific driver, collate the sequences for initialization and calibration, and then code them as the initialization and calibration sequences in the QMP driver. It typically involves a bit of trial and error to get it working. Once you are done, your PHY should be up. By the way, even on the same platform, UFS, PCIe, USB and so forth require different sequences. If you know a little more about the PHY, you will know why: the UFS PHY's initialization sequence is a bit different from PCIe, which is a bit different from USB. Each one is a sequence specific to the type of implementation it is targeting.
With the PHY driver up and running, you should see the various UFS partitions enumerated in your dmesg log when you boot. That tells us that UFS is up. What I typically do next is run dd on the UFS partitions to see what the performance is, which tells me whether my configuration is right or not.
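On the device tree side of the UFS bring-up described above, the controller and its QMP PHY are two nodes linked by a PHY reference. The sketch below is heavily trimmed and hedged: the SM8150-specific compatibles, labels, addresses and lane count are assumptions for illustration, and the clocks, resets, interrupts and regulator supplies that a real description needs are deliberately left out.

```dts
/*
 * Fragments only. Assumed: SoC-specific compatibles, labels, addresses,
 * lane count. Clocks, resets, interrupts and supplies are omitted.
 */
ufs_mem_hc: ufshc@1d84000 {
	compatible = "qcom,sm8150-ufshc", "qcom,ufshc", "jedec,ufs-2.0";
	reg = <0x0 0x01d84000 0x0 0x2500>;	/* placeholder */
	phys = <&ufs_mem_phy>;			/* controller points at its PHY */
	phy-names = "ufsphy";
	lanes-per-direction = <2>;		/* assumed */
};

ufs_mem_phy: phy@1d87000 {
	/* New per-SoC compatible selects the init/calibration tables
	 * added to the QMP PHY driver */
	compatible = "qcom,sm8150-qmp-ufs-phy";
	reg = <0x0 0x01d87000 0x0 0x1c0>;	/* placeholder */
	#phy-cells = <0>;
};
```

The per-SoC PHY compatible is what ties the node to the platform-specific initialization and calibration sequences coded in the driver.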
If you get decent bandwidth out of the PHY, then it should be fine.
The last item we are discussing is USB. Here again, the controller is upstream, so no changes are required. We use the compatible "qcom,dwc3" in this case. One of the distinct features of this particular device node is that you also need to describe a child node which points to the core DWC3 IP block. This is how DWC3 support has been designed upstream. For this, we add the Synopsys DWC3 compatible, "snps,dwc3". This controller supports both super-speed and high-speed USB.
Just like in the case of UFS, we need to add PHY support for USB as well. We need to check which PHY is being used on the platform, and again the downstream device tree will tell you that. Typically, if it uses the QMP PHY for both super speed and high speed, then you already have a driver for it, and as discussed in the UFS case, we just add the sequences for USB. In a few places we have seen that USB does not use the QMP PHY but a different PHY. For example, the SM8150 platform uses a Synopsys PHY for high-speed USB, and for super speed it uses the QMP PHY. So we go ahead and add the Synopsys PHY driver, and upstreaming that is required as well.
With this, we have come to the end of the various component descriptions. Let's see where we are today on the upstream status of the SM8150 platform. The global clock controller is upstream, the pin controller is upstream, regulators are upstream, and the device tree description for all of these is upstream. The RPMh clock controller is upstream, all the remote processors such as ADSP and CDSP are upstream, UFS is upstream, the USB PHY is upstream, and the device tree for it has, I think, already been picked up in this cycle.
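The wrapper-plus-child layout described for USB above could look roughly like the following. The "qcom,dwc3" and "snps,dwc3" compatibles are the ones named in the talk, and "usb2-phy" and "usb3-phy" are the PHY names used by the DWC3 binding; the SoC-specific compatible, labels, addresses and the PHY node labels are assumptions for illustration, and clocks, interrupts and power domains are omitted.

```dts
/*
 * Sketch only. Assumed: SoC-specific compatible, labels, addresses.
 * Clocks, interrupts and power domains are omitted.
 */
usb_1: usb@a6f8800 {
	/* Qualcomm glue/wrapper around the Synopsys core */
	compatible = "qcom,sm8150-dwc3", "qcom,dwc3";
	reg = <0x0 0x0a6f8800 0x0 0x400>;	/* placeholder */
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* Child node pointing at the core DWC3 IP block */
	usb_1_dwc3: usb@a600000 {
		compatible = "snps,dwc3";
		reg = <0x0 0x0a600000 0x0 0xcd00>;	/* placeholder */
		phys = <&usb_1_hsphy>,		/* high-speed PHY (Synopsys on SM8150) */
		       <&usb_1_ssphy>;		/* super-speed PHY (QMP) */
		phy-names = "usb2-phy", "usb3-phy";
	};
};
```

Listing the two PHYs separately is what lets high speed and super speed be served by different PHY drivers on the same controller.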
With this, the whole base SoC infrastructure is upstream, and we can potentially look at bigger-ticket items such as media, modem and so forth.
There are some additional resources for people who want to know more about Qualcomm upstream support. Our team typically holds a BoF at Linaro Connect, which we call the Qualcomm upstreaming BoF. There we go through, in more depth, the upstream state of each platform, what the lingering items are and what other folks are doing. That gives you a good overview. If you want to look at past Connects, these sessions should be recorded and available on the internet, and you can find out what the Qualcomm upstream status was at previous Connects.
From the code point of view, our Linaro Qualcomm Landing Team has an integration tree which is always based on the latest kernel, so right now it will be based on the current rcX. If you look at this integration tree, it features all the latest patches which are being developed by folks in my team and are going upstream, so you can look at the recent work-in-progress code as well. Then there is 96Boards. We have the Snapdragon SDM845-based RB3 platform, which is our bread and butter these days as a development vehicle. There is also the recently announced RB5 platform, which features the SM8250 chipset; a lot of patches for the 8250 chipset are already upstream as well, so you can look at that board for your development needs too.
With this, we come to the end of this presentation. Hopefully it helped you understand various aspects of Qualcomm platforms, how things are built on Qualcomm, and how you can go about upstreaming if that is something you are interested in. With this, let's go to the Q&A.
Hey guys, now for the questions; feel free to keep sending them in. We have got about five to ten minutes to go through the questions. The first question is from Alex, about 32.764 kHz.
That is typically the sleep clock. This is the 32 kHz sleep clock which the oscillators typically provide as an input, and people use that clock. I don't see anything unusual in that. Can you follow up on what is unusual about it for me?
The next question is from Daniel Gomez: do you have any sort of docs or datasheets besides the downstream driver for doing all the upstream work? As part of the Qualcomm Landing Team, we do have access to some of the specs from Qualcomm and some datasheets, but not all of them. Sometimes we take advantage of that and are able to figure out answers, but in a bunch of cases we do not get a lot of information. We do, however, have access to a lot of Qualcomm engineers who help us understand things and answer our questions.
The third question is from Ken: Qualcomm SoCs are heterogeneous, with many cores such as RPM and ESS which run non-open-source binary software, and changes in these need to align with the version of the open-source kernel code. How is it best to manage the versioning and code control? For all the non-open-source binaries, we do not make any changes to them. We use whatever Qualcomm provides us as part of the release for a specific board, and we do not have access to the source of the non-open-source binaries. We just use them as they come, and we do not do any versioning or code control of those binaries at all. So I will not be able to answer how to do versioning on that part.
Next is from Sudeep: how much time did it take in total to upstream all of that? For SM8150, I took roughly three months to do the whole BSP upstreaming, and that is alongside my maintenance duties and other things which keep coming up. For me, this was actually the first time I was working on such a platform. If somebody is focused, knows what they are doing and has experience doing this kind of work, it should not take more than two to three months of dedicated time to upstream the complete base port.
The following question, from John, is: from SoC to SoC, how much upstream code is reused? As you can see, for a lot of the topics I mentioned, the work is really adding a compatible, because the base driver is already upstream. From that point of view, it does not take a lot of code changes; it is just adding the compatible and adding the data. But for things like the clock driver or regulators, you have to write the clock driver from scratch and describe the regulators for the board from scratch. For the rest of the BSP infrastructure items, if you look at the table you saw on the upstream state, I would say almost 70% to 80% is reused from one SoC to another.
Next is a question from Sanjeev: is there any upstream support for IPQ40x9? I am not sure about the IPQ line; I have not followed that. You can ping us on IRC on the Linaro channel or on the mailing list. I have seen some IPQ patches, but I am not sure which IPQ version they are for.
Any more questions, guys? Going once, going twice. Thank you, folks, for joining in. Hopefully it was a good learning experience. Unfortunately, I am having some video issues, so you are not able to see me, and I had to log back in on the phone line to give you the answers. Feel free to ping me on the Slack channel, where we can continue the discussion on this topic or any other topics you are interested in. Thank you very much.