Hello and welcome to this Embedded Linux Conference 2021 talk. We're going to talk about advanced camera support for Allwinner SoCs using only mainline Linux. First of all, a few words about myself: I'm Paul Kocialkowski, an embedded Linux engineer working at Bootlin, where I do consulting as well as training. I've been working especially on the Linux kernel, in areas related to multimedia and graphics, and I've also created, and am giving sessions of, the "Displaying and rendering graphics with Linux" training; there are public sessions available if you're interested. I live and work in the southwest of France, in Toulouse, where Bootlin has one of its offices. To begin this talk, we're going to go over some very common notions about image capture technology. First, let's take a look at a simplified image capture chain, so that we can see the major elements needed for a camera-based image capture chain; then we'll look at some of these aspects in more detail. First of all, we have the optics, which are in charge of shaping the light and converging it onto the sensor. The sensor is the electronic component in charge of converting the light it receives into digital values that can then be transported to a more complex system, like a system-on-chip. Then we have some processing, which can happen on the sensor or on the more complex system like the SoC, and which is in charge of converting the raw information received from the sensor into an actual picture that looks good and makes sense to us. Once we have this good-looking picture, we can send it to a display to be seen in real time, as a preview, or it can be encoded, to something like JPEG for a still image, or with a video codec like VP8 or H.264 to produce a video. During this talk we're not going to cover the display and encoding parts so much; we'll focus especially on the interface and processing steps. So let's take a closer look at the hardware interfaces used to carry the information from the sensor to the system-on-chip. Nowadays it's quite uncommon to find analog interfaces, they are mostly deprecated; instead, digital interfaces are used. We find two types of digital interfaces: the parallel family of interfaces, and the serial interfaces. Typically the parallel interfaces are used for old or low-end sensors, and the serial ones with the more high-end sensors. If we take a look at a basic parallel example, the way it works is that we typically have one TTL signal for each data bit that needs to be transmitted, so we'll find 8, 10, 12, 16 or 24-bit bus widths, plus some signals beside that, like the pixel clock, which gives the frequency of the transmission, and some synchronization signals, h-sync and v-sync, to mark the beginning of a new line and the beginning of a new frame. If we look at a serial interface, it's quite different, because in this case we're using differential pairs. Very often, in MIPI CSI-2, double data rate is used, meaning that there are two samples per clock cycle. There is one differential pair dedicated to the clock, the clock lane, which will typically run at pretty high rates, again because the data is serialized, and then there are a number of data lanes, between 1 and 4, across which the data is distributed and then gathered back on the other side. The sketch below gives a rough idea of the bandwidth each approach provides.
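To make the difference concrete, here is a small back-of-the-envelope calculation of the raw bandwidth of each interface family. All the numbers used here are illustrative examples of my own, not figures from the talk:

```c
/* Rough bandwidth math for parallel vs serial camera interfaces.
 * All numbers are illustrative, not figures from the talk. */
#include <stdio.h>

int main(void)
{
	/* Parallel: one TTL line per data bit, one sample per pixel clock. */
	unsigned long long bus_width = 8;          /* bits */
	unsigned long long pixel_clock = 74250000; /* 74.25 MHz */
	unsigned long long parallel_bps = bus_width * pixel_clock;

	/* MIPI CSI-2: serialized over differential pairs with double data
	 * rate, so each lane carries two bits per clock cycle. */
	unsigned long long lanes = 4;
	unsigned long long clock_lane_hz = 500000000; /* 500 MHz */
	unsigned long long serial_bps = lanes * clock_lane_hz * 2;

	printf("parallel: %llu Mbit/s\n", parallel_bps / 1000000);
	printf("serial:   %llu Mbit/s\n", serial_bps / 1000000);

	return 0;
}
```

This also shows why the clock lane runs so fast: the same pixel data that a parallel bus spreads over many wires has to fit on just a few serialized lanes.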
So that's the difference between those two main families of digital camera interfaces, parallel and serial, and we're going to talk about MIPI CSI-2 a little bit later, in the case of the Allwinner platforms. That was it for the interfaces; now let's talk a little bit about the processing step that I mentioned. Processing is necessary in the camera pipeline because the data that comes directly from the sensor, from the ADC that samples the photosites, doesn't consist of actual pixel values: instead, you only get one of the red, green or blue values for each photosite. This is called a Bayer pattern, where you basically have color filters in front of each photosite so that it only receives one of the colors. So in order to create pixels, you need to do some interpolation to actually get one red, one green and one blue value for each pixel. That's one of the things that needs to be done in terms of processing, but there are lots of other things that need to be compensated or corrected. For example, brightness is captured in a linear way, but in order for the image to be displayed on a typical display, you need some adaptation with a gamma curve, to give more weight to the information in the mid and high tones. There are also some things related to the sensors themselves, like the dark current: sensors typically have a non-zero output value for black, so you need to subtract an offset to get blacks that actually look black. Then you also get issues with noise. Typically the noise comes from the amplification stages, and in an analog electronic system you will always get some residual noise; this shows up as bad colors on the picture, so it's something that needs to be corrected. The colors that are captured are also typically off and won't really look realistic, so this needs to be corrected as well. The dedicated components that apply these enhancements, as well as other types of enhancements, are called image signal processors (ISPs). Typically they can be attached to the sensor, in the same package as the sensor, or they can be separate; there can be a dedicated ISP in the system-on-chip. Basically the ISP, which is in charge of applying all the enhancements that we need, is divided into three different domains. First we have the Bayer domain, which is the first step and handles the raw data. Then we have the RGB domain: after we have done the debayering step to get some actual pixels, we can apply some enhancements there as well. And finally we convert the image into a YUV representation, which is a different color model from RGB and is really well adapted for video, because it separates the luminance, which is essentially the brightness, from the chrominance, which is the color information; typically we apply some subsampling to the chrominance in order to reduce the size of the image. A small sketch of this RGB-to-YUV conversion is shown below.
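As a minimal sketch of the last step, here is a per-pixel RGB-to-YUV (YCbCr) conversion using the classic BT.601 coefficients, ignoring range and rounding details that a real ISP handles in fixed point:

```c
/* Converting one RGB pixel to the YUV (YCbCr) color model using BT.601
 * coefficients; a simplified sketch of the final ISP domain conversion. */
#include <stdint.h>

struct yuv { uint8_t y, cb, cr; };

static uint8_t clamp(float v)
{
	return v < 0.f ? 0 : v > 255.f ? 255 : (uint8_t)v;
}

static struct yuv rgb_to_yuv(uint8_t r, uint8_t g, uint8_t b)
{
	float y = 0.299f * r + 0.587f * g + 0.114f * b;

	return (struct yuv){
		.y  = clamp(y),                        /* luminance */
		.cb = clamp(0.564f * (b - y) + 128.f), /* blue chrominance */
		.cr = clamp(0.713f * (r - y) + 128.f), /* red chrominance */
	};
}
```

Chroma subsampling then simply stores fewer Cb/Cr samples than Y samples, for example one Cb/Cr pair per 2x2 block of pixels in 4:2:0, which is what makes this color model attractive for video.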
So this is an illustration of these three different steps: on the left side you can see the image with the Bayer pattern, which looks really off and bad; then the RGB step, where we have applied some enhancements like white balance and denoising to make the image look better; and finally the YUV color model decomposition on the right side. This is a list of the typical enhancements that we find in an ISP. I mentioned a few already, like black level correction. We can also mention dead pixel correction, for when photosites are stuck at a specific value, which needs to be discarded. White balance is also quite important, to adjust the balance between the red, green and blue channels so that the whites on the picture actually look white. Noise filtering I mentioned already. The color matrix is used to restore the fidelity of the colors, and gamma adjusts the brightness for the non-linearity of displays. Saturation increases the colorfulness of the image, and general brightness increases or decreases the luminosity. We can also play with the contrast, to increase or decrease the difference between the dark and the bright areas of the image. Those are typically the base enhancements that we find in an ISP (a toy per-pixel sketch of a few of them follows below). But we can also find some more advanced enhancements. For example, lens shading correction compensates for the irregular brightness introduced by lenses, where you basically have a bright spot at the center of the lens and darker areas towards the edges; the lens shading correction step flattens this, to get the same brightness on all areas of the image. Lens de-warping is used for fisheye lenses, or lenses with a very wide field of view, where the geometry gets distorted: de-warping restores the correct geometry. Then you have stabilization, which is about cropping into the image to remove the shaking that can happen if the camera is handheld and not very stable. Finally, another enhancement is the color lookup table, which is basically a way to translate the colors that you have into another, slightly different set of colors, in order to give a specific style to the image; this is often called a filter, so you apply a filter to the image to give it a specific style. Like I was saying, the hardware implementations of the ISPs that do all of these steps, to improve the image and make it look like what we expect, can be implemented in the sensor, in which case the data sent on the hardware camera interface is directly the YUV data, the final step of the processing. It's also possible that the sensor only does some of the enhancements, but not all of them, in which case you still get Bayer data that was slightly modified. And in the case where there is no ISP on the sensor, or just a very simple one, it's up to the system-on-chip to have its own ISP to apply the enhancement steps and get a good-looking picture at the end. In that case, the ISP on the system-on-chip needs some calibration data and some specific configuration adapted to the sensor and to the lens that are used, because some of the steps are quite specific to those two.
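Going back to the base enhancements listed above, here is a toy per-pixel sketch of a few of them, black level subtraction, white balance gains, a 3x3 color correction matrix and a gamma lookup table. Real ISPs do this in fixed point in hardware; this only illustrates the order of operations:

```c
/* Toy per-pixel version of some base ISP enhancements. */
#include <stdint.h>

struct isp_config {
	uint16_t black_level;   /* offset to subtract on each channel */
	float wb_gain[3];       /* per-channel white balance gains */
	float ccm[3][3];        /* color correction matrix */
	uint8_t gamma_lut[256]; /* encodes the gamma curve */
};

static void isp_process_pixel(const struct isp_config *cfg, uint8_t rgb[3])
{
	float in[3], out[3];
	int i, j;

	/* Black level correction and white balance, per channel. */
	for (i = 0; i < 3; i++) {
		in[i] = rgb[i] > cfg->black_level ?
			rgb[i] - cfg->black_level : 0;
		in[i] *= cfg->wb_gain[i];
	}

	/* Color correction matrix, mixing channels to restore fidelity. */
	for (i = 0; i < 3; i++) {
		out[i] = 0.f;
		for (j = 0; j < 3; j++)
			out[i] += cfg->ccm[i][j] * in[j];
	}

	/* Gamma, applied as a simple lookup table on the result. */
	for (i = 0; i < 3; i++) {
		float v = out[i] < 0.f ? 0.f :
			  out[i] > 255.f ? 255.f : out[i];
		rgb[i] = cfg->gamma_lut[(uint8_t)v];
	}
}
```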
So there are some parameters that need to be adjusted depending on the situation in order to create a good picture, so not all of the parameters can just be calibrated once and then used for the whole lifetime of the camera. This is typically the case for focus, which depends on the area of interest that we want to see sharp and clear. Then there is white balance, which depends on the light source that is used: you can have warm light sources, which tend to be more orange-ish, or cold sources, which tend to be more blue-ish, and depending on the light you will need to apply a different white balance to get the white areas of the picture to actually look white instead of blue-ish or orange-ish. And then there is exposure, which is one of the most important aspects of getting a good-looking picture, because it's about regulating the amount of light that will be captured by the sensor. This depends on a number of parameters. The first one is the diaphragm that can sit in front of the sensor, which opens or closes to allow more or less light to get in. Then there is the exposure time, which is the time configured in the sensor during which the photosites are exposed to light: if they are exposed for a short time, they gather less light than if they are exposed for a long time. And another parameter, which is not directly about the amount of light but about the second stage, is the amplification gain applied on the sensor. Even if you get a small amount of light, you can still apply an electronic gain, which will increase the signal that is received, but it will also amplify the noise, so typically this is not a very good way to get more information. These parameters can be set manually, and people who are photography enthusiasts know how to tweak them to get the result they want; there are also some artistic implications in changing these parameters. But in many cases we basically want these parameters to be controlled automatically, and this is where the notion of 3A algorithms comes in. These algorithms are basically about controlling these parameters automatically, and doing a good job at it. A toy sketch of the feedback idea behind one of them, auto-exposure, follows.
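As a toy illustration of that feedback idea, here is a naive auto-exposure update step: compare a luminance statistic computed on the last frame against a target, then nudge the exposure time for the next frame. Real 3A algorithms are far more elaborate and hardware-specific; the target and tolerance values here are arbitrary:

```c
/* Naive auto-exposure feedback step, illustrative only. */
#include <stdint.h>

#define AE_TARGET_LUMA	118	/* arbitrary mid-gray target */
#define AE_TOLERANCE	8

static uint32_t ae_update_exposure(uint32_t exposure_lines,
				   uint8_t mean_luma,
				   uint32_t exposure_max)
{
	/* Inside the tolerance window: keep the current exposure,
	 * to avoid oscillating around the target. */
	if (mean_luma >= AE_TARGET_LUMA - AE_TOLERANCE &&
	    mean_luma <= AE_TARGET_LUMA + AE_TOLERANCE)
		return exposure_lines;

	/* Scale the exposure proportionally to the error. */
	if (mean_luma > 0)
		exposure_lines = exposure_lines * AE_TARGET_LUMA / mean_luma;

	/* Clamp to what the sensor can actually do. */
	if (exposure_lines < 1)
		exposure_lines = 1;
	if (exposure_lines > exposure_max)
		exposure_lines = exposure_max;

	return exposure_lines;
}
```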
So we have auto-exposure, to control the exposure time and perhaps also the diaphragm if one is available; we have autofocus, which detects whether the scene is sharp and adjusts the focus lens to get the scene sharp and in focus; and we have auto white balance which, like I was saying, adjusts to the light source. So there are these 3A algorithms to do all of that. Common algorithms are described in the academic literature, so they are known, but the actual implementations depend on the specifics of the ISPs, especially because these algorithms need to implement a closed-loop, feedback-loop system: the ISP gathers some statistics about the picture, about the frame, which are used to deduce the correct parameters to apply for exposure, focus and white balance. The way that the statistics and the parameters are defined is usually specific to the hardware, and as a result the algorithms are often hardware-specific as well, and considered to be the secret sauce, so they are not often out in the open, and there are also no generic implementations of the 3A. Keep in mind that this is typically something that depends on the hardware, meaning the ISP and the sensor as well. So that's it for the general notions about the technology: we talked about the digital interfaces and then about the ISPs more specifically, which is the topic I wanted to cover on the Allwinner platforms. Before we get into that, let's take a global look at the status of Allwinner SoC support in mainline Linux, especially regarding the components that relate to camera integration. We basically have three main types of hardware blocks involved in the camera pipeline. The first one is the central one, called the CSI controller. It does a bunch of things; in particular, it has a DMA engine to write the pixels that it receives into system memory, and it can receive pixels from parallel interfaces. It supports different types of parallel interfaces: simple parallel with just TTL signals, and also BT.656 parallel interfaces, which are kind of a subtype, specific to some standards. This is the central controller that we nearly always find in Allwinner SoCs, or at least whenever the platform supports some camera interface, and there are basically three generations of these controllers. The first one was found in the A10, A20 and similar chips. It evolved into a second generation, found on a greater number of chips: it started with the A31, and we also find it in lots of other generations, up to the A64. The work that I've done was on the V3, so on this second generation of the CSI controller. For the newer platforms there is a third generation of this controller, now called the CSIC, which goes from the A63 up to the new platforms that Allwinner has been releasing recently, so this is the latest one that we know of. In that third generation there was basically a whole redesign of the camera interface blocks in Allwinner SoCs, so it's quite different from the second generation; even though you still find some similarities, there are also some significant differences. So that's for the CSI controller and CSIC. Then there is MIPI CSI-2, which is basically a separate, dedicated controller connected to the CSI controller, so it acts like a bridge.
We find different hardware implementations of this one: there were specific implementations on the A80 and on the A83T, and those two are really one-off implementations that were not found on any other platforms. Then we had the first generation of a common implementation, which was found first on the A31; it's also found on the V3, and probably also on the T7 chip. And then we find a second generation of common implementation, which is made as a combo with other interfaces like sub-LVDS and HiSPi, not all of which are always available, as part of a broader design of interface bridges; we find that on some of the devices from the third generation of the CSI controller, so the V5, V536 and V533. As you can see, the MIPI CSI-2 controller is not always available in Allwinner chips; just some of them have MIPI CSI-2 support, and they are pretty much all listed on the slide. Then we have the third important part of the camera pipeline, which is the ISP, the image signal processor implementation from Allwinner. There was a first generation that was tied, glued, to the CSI controller in the A10, A20 and related chips. Then there was a second generation where it became a separate block: it would still be connected to the CSI controller, but it had a different register layout, its own memory address and so on. It's found in the A31, A80, A83T and V3, and we also find a trimmed-down version on the A23, A33 and H5: even though it's not advertised, it's there in the hardware, but it doesn't support raw Bayer processing, only very minimal processing, so it's nearly useless, which may be why Allwinner doesn't mention that it's available in those SoCs as well. And then we have a third generation, which again matches the third generation of the CSI controller: the ISP was also redesigned a little bit, it supports more features than the second generation, and you find other naming as well, it's now called ISP500, ISP520 and ISP521. You find those on the V5, V536 and H616 SoCs. One thing we can note is that we typically get an ISP in the hardware when there is MIPI CSI-2 support. This is because MIPI CSI-2 will typically be used with sensors that send raw Bayer data, in which case the ISP needs to be in the SoC, whereas for the parallel interface it's quite common to have the ISP on the sensor, so you just receive YUV data and you don't need an ISP on the SoC side. So that's it for the hardware related to the camera interface. If we take a look at the general platform support for Allwinner, it's in pretty good shape: there's a very active community around it, the linux-sunxi community, and there's a wiki page about the current status of mainlining. Basically, what we can see is that the multimedia areas are the remaining ones that are not fully supported. There's currently a driver available for the first-generation CSI controller, so for the A10, A20 and related chips, and there's also a driver for the second generation, sun6i-csi, for the A31-based CSI controller. There is no support for the third generation, and there was also no support for MIPI CSI-2 or for the ISP, which is basically what I've been working on; ISP support in the Allwinner SDK was only available as a non-free blob, so it was quite difficult to get started in that area. So let's talk a bit now about the general status of the V4L2 framework, which is really the Linux kernel subsystem dedicated to media support. Basically, it supports everything that relates to pixels and is not a display controller or a GPU, which belong to DRM. V4L2 will typically expose video devices to user space, which provide a generic and coherent API to do all the things related to image capture. The typical steps are format negotiation, to configure the format of the pixels that will be received; memory management, to allocate the buffers and map them into the virtual memory of the process; and then a queue interface, where we can submit the buffers to the driver, have them filled by the hardware's DMA engines, and then get them back in user space with the pixels inside. A minimal sketch of this flow is shown below.
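Here is a minimal user-space sketch of that flow on a V4L2 video device: format negotiation, buffer allocation and the queue interface. The device path, resolution and pixel format are arbitrary, and error handling is stripped down to keep the flow visible:

```c
/* Minimal V4L2 capture loop: S_FMT, REQBUFS, QBUF/STREAMON/DQBUF. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>

int main(void)
{
	int fd = open("/dev/video0", O_RDWR);
	struct v4l2_format fmt = { .type = V4L2_BUF_TYPE_VIDEO_CAPTURE };
	struct v4l2_requestbuffers reqbufs = {
		.type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
		.memory = V4L2_MEMORY_MMAP,
		.count = 4,
	};
	enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	unsigned int i;

	/* Format negotiation: the driver may adjust what we ask for. */
	fmt.fmt.pix.width = 1280;
	fmt.fmt.pix.height = 720;
	fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_NV12;
	ioctl(fd, VIDIOC_S_FMT, &fmt);

	/* Memory management: allocate buffers, map and queue them. */
	ioctl(fd, VIDIOC_REQBUFS, &reqbufs);

	for (i = 0; i < reqbufs.count; i++) {
		struct v4l2_buffer buffer = {
			.type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
			.memory = V4L2_MEMORY_MMAP,
			.index = i,
		};

		ioctl(fd, VIDIOC_QUERYBUF, &buffer);
		mmap(NULL, buffer.length, PROT_READ, MAP_SHARED, fd,
		     buffer.m.offset);
		ioctl(fd, VIDIOC_QBUF, &buffer);
	}

	ioctl(fd, VIDIOC_STREAMON, &type);

	/* Dequeue filled buffers, then give them back to the driver. */
	for (;;) {
		struct v4l2_buffer buffer = {
			.type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
			.memory = V4L2_MEMORY_MMAP,
		};

		ioctl(fd, VIDIOC_DQBUF, &buffer);
		/* buffer.index tells us which mapping holds the pixels. */
		ioctl(fd, VIDIOC_QBUF, &buffer);
	}
}
```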
So this approach works well, but pretty much only for all-in-one devices, where you have a DMA interface available in the same device. We're going to see that there are some limitations for more complex cases, typically when we have a chain of multiple blocks, like the bridges that we might have, so multiple components that each can have their own configuration. For example, the sensor would be one of these blocks, the MIPI CSI-2 bridge would be another, and the CSI controller a third; and we can sometimes connect these blocks in different ways, so we need to configure the topology as well as each block. In order to support this, the notion of V4L2 video device was not sufficient, so a new notion was introduced into V4L2: the sub-devices, or sub-devs. A sub-dev represents just a single block, typically not DMA-capable, which is still exposed to user space through sub-dev nodes; sub-devs have their own format configuration and their own stream management, and basically the goal is that the video devices, which are the DMA interfaces into memory, will call into the sub-devs, typically to start or stop the stream, to get the whole pipeline going and finally receive the pixels into memory. In order for this to work, the sub-devs need to be gathered into a parent structure, a parent device, which is the V4L2 device, not to be confused with the video device: the V4L2 device is really the controlling entity that groups the video device and the sub-devs. If the hardware is relatively simple, and you just have one hardware entity that might have some sub-entities, then you can have a single driver which registers a V4L2 device, has a video device for the DMA interface part, and might have some sub-devs if there are parts that can be configured internally or whose topology can be changed. But in more complex cases you might have multiple drivers involved: typically, when you have a sensor, the sensor has its own driver, which registers a sub-dev, and then you have a DMA interface driver which is distinct and separate, so in this case you need some link between these two drivers. The way that it works is that the driver with the video device registers the V4L2 device, and then the driver for the sub-dev registers the sub-dev asynchronously, which basically makes it available for another driver to bind and use. So then we need a step for that driver to identify the sub-dev and to claim it in the V4L2 API. The way this works is that there is a representation of the connections between these blocks: the fwnode graph. Typically this uses the device tree port and endpoint representation, where you are able to create these links between the different devices. So the main driver will basically get a reference to the endpoint using fwnode_graph_get_endpoint_by_id(), and then it can parse the endpoint, because this endpoint will also contain some information about the bus, if that's applicable. For a sensor device, for example, you will get some information about the camera hardware interface bus that is used to connect the sensor to the SoC; here I've put the type and the structure for MIPI CSI-2 used with a D-PHY, so you will be able to get information such as the number of data lanes or the clock frequency that needs to be used with the sensor. A minimal kernel-side sketch of this lookup follows.
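Here is a hedged kernel-side sketch of that lookup and parse step, assuming the endpoint sits at port 0, endpoint 0 of the driver's own device tree node; the function name and the error handling are simplified placeholders:

```c
/* Sketch: look up and parse the fwnode graph endpoint describing
 * the connection to a sensor over MIPI CSI-2 with a D-PHY. */
#include <linux/device.h>
#include <linux/property.h>
#include <media/v4l2-fwnode.h>

static int driver_parse_sensor_endpoint(struct device *dev)
{
	struct v4l2_fwnode_endpoint endpoint = {
		/* We expect a MIPI CSI-2 bus using a D-PHY. */
		.bus_type = V4L2_MBUS_CSI2_DPHY,
	};
	struct fwnode_handle *handle;
	int ret;

	/* Get endpoint 0 of port 0 from our device tree node. */
	handle = fwnode_graph_get_endpoint_by_id(dev_fwnode(dev), 0, 0, 0);
	if (!handle)
		return -ENODEV;

	/* Parse the bus properties attached to the endpoint. */
	ret = v4l2_fwnode_endpoint_parse(handle, &endpoint);
	fwnode_handle_put(handle);
	if (ret)
		return ret;

	dev_info(dev, "sensor connected with %u CSI-2 data lanes\n",
		 endpoint.bus.mipi_csi2.num_data_lanes);

	return 0;
}
```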
This is an illustration of the device tree that we use to connect the sensor to the MIPI CSI-2 bridge: you can see the port and endpoint representation, with the properties describing the specifics of the bus. So now that the top driver, the main driver, was able to identify the sub-dev using the fwnode graph, it basically needs to wait for that sub-dev's driver to be available. This happens with the V4L2 async notifier: the driver creates a notifier, registers the fwnode handle that it got from the device tree port and endpoint representation, and then registers the notifier with a callback. When the driver becomes available, so when the sub-dev becomes registered asynchronously, the main driver gets the callback, knows that the sub-dev has become available, and the sub-dev is then attached to the V4L2 device registered by that driver. This is how the link between the two actually happens. So that's for the V4L2 sub-dev part: we now have a way to connect multiple sub-devs to a video device driver, and a sub-dev can also depend on other sub-devs using the same mechanism, so you can create complex chains like that, with multiple sub-devs that call into each other. But this doesn't allow us to represent the topology of these devices, and for that we have an extra API, called the media controller API, where we basically describe all of the elements involved in the media graph, the media pipeline. These blocks are called entities, and they can be video devices or sub-devs. Each entity has some pads, which represent the connection points between the entities: there can be sink pads, which receive some data, or source pads, which provide some data. Each entity also declares a particular function, and we find links between the pads of these entities, so we end up with a complete representation of the different paths that can exist between those entities. When multiple paths exist, all of the links are registered by the driver, and then user space is able to select which links need to be enabled or disabled, so that it's possible to change the actual topology, the actual data flow between the entities. We have the media-ctl command-line tool, which can be used to enable or disable links; it can also be used to visualize the topology of the different entities that interact together. Another important thing is that when we actually start streaming some data, some runtime validation is performed by the media controller subsystem: it basically checks that, on all the enabled links, the source and the sink have a valid connection. For example, if the two entities have some configuration, you typically need to make sure that the dimensions are the same, that the formats are the same and so on, so that it makes sense to connect the sink and the source of these different entities. Below is a sketch of how a driver declares such pads and links.
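To make the entities, pads and links notion concrete, here is a hedged kernel-side sketch of a driver declaring a two-pad bridge entity and linking it to a capture entity; the pad numbering and the surrounding driver structure are hypothetical:

```c
/* Sketch: describing a topology to the media controller with
 * pads and a link between a bridge and a capture entity. */
#include <linux/kernel.h>
#include <media/media-entity.h>

static struct media_pad bridge_pads[] = {
	{ .flags = MEDIA_PAD_FL_SINK },   /* pad 0: receives data */
	{ .flags = MEDIA_PAD_FL_SOURCE }, /* pad 1: provides data */
};

static int driver_setup_topology(struct media_entity *bridge,
				 struct media_entity *capture)
{
	int ret;

	/* Register the pads of the bridge entity. */
	ret = media_entity_pads_init(bridge, ARRAY_SIZE(bridge_pads),
				     bridge_pads);
	if (ret)
		return ret;

	/* Link the bridge source pad (1) to the capture sink pad (0);
	 * an immutable link cannot be disabled from user space. */
	return media_create_pad_link(bridge, 1, capture, 0,
				     MEDIA_LNK_FL_ENABLED |
				     MEDIA_LNK_FL_IMMUTABLE);
}
```

When links are not immutable, this is exactly what media-ctl toggles from user space, and what the runtime link validation checks when streaming starts.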
So this is now an illustration of the i.MX capture driver, which is quite a complex one: in green you can see all of the V4L2 sub-devs, so some sensors at the top and some bridges or processing blocks in the middle, and then in yellow we find the video devices, which represent the DMA interfaces where we'll actually get the data. So now that we have seen how to support complex pipelines with multiple devices, thanks to the sub-devs, the fwnode graph, asynchronous registration and the media controller API, let's talk about the ISPs more specifically. The ISPs typically have very specific parameters for their internal blocks, which are not represented through the media graph or with sub-devs, because they are way too specific. Typically we will have one core sub-dev that represents the processor of the ISP, and we will also find capture devices for the outputs: there can be one or multiple capture interfaces, depending on the hardware. But then we will also find video interfaces which are not for capture, but for providing the parameters, the configuration of the ISP. So you also get a video device with a queue, but in this case what you provide to the queue is not a pixel buffer that was allocated; it's a structure of data, so the type of the queue is different: it's a meta output queue, for the parameters. And you get something quite similar for the statistics provided by the ISP, especially to the 3A algorithms: in this case we have a meta capture queue, where we get a specific type of data structure representing the statistics that were gathered by the ISP. So with those different devices, we are able to represent the specific parameters of the ISP, the specific statistics of the ISP, and then we have the capture interfaces and a sub-dev in the middle, which kind of coordinates everything. If we take a look at one such ISP driver, the rkisp1 driver, which is a really good example if you want to understand how an ISP topology works, you can find a sub-dev for the sensor, a sub-dev for the ISP itself, which coordinates, and then you have in yellow the video interfaces: one for the parameters, one for the statistics, and then two capture video devices. So now let's talk about the specific work that I have done for advanced camera support on Allwinner platforms. We have seen that basically all of the base pieces needed to support it are available in V4L2, so what was left was to actually add support for the relevant components. The scope of this task was first to add support for the specific sensors that we were going to use, which use MIPI CSI-2. On my side I worked on the OV5648, with MIPI CSI-2, on the Allwinner V3/V3s, and this required adding support for the A31 MIPI CSI-2 controller, which is the one that the V3 SoC is using. In parallel, we had an internship in the summer of 2020, where Kévin L'hôpital worked on a different sensor (the OV8865) on a different platform, the A83T, which has a different MIPI CSI-2 controller, so we kind of worked on the two in parallel. Then, in a second phase, I added basic support for the ISP, the second-generation ISP, with basic features like debayering, so converting from the Bayer representation into RGB and into YUV, with specific gains and offsets for the red, green and blue components, and I also added support for the 2D noise reduction. This is the Banana Pi M3, the board that was used for the A83T part of the work, which our intern Kévin worked with.
So let's talk about supporting the sensors. We had two new sensors that were not supported by the Linux kernel, and adding support for them was actually quite significant work, because while we had some reference code, it was using lots of large arrays of register addresses and values, which were just provided as hex numbers with no documentation and no details. So we had to look for the documentation in the actual literature of the manufacturer and write a clean and proper driver that correctly defines the registers, instead of just using opaque values. We also wanted to write clear code, and so to avoid these large arrays of pre-configured registers and instead have proper functions to configure the registers for each part of the sensor. One of the aspects was also to find out and document the clock tree, which is used to derive the specific timings for the different dimensions and frame rates that will be used with the sensors. This resulted in big structures and lots of definitions, to create clean drivers, but sometimes, for a small number of registers, we didn't have the documentation even in the manufacturer's literature, so we still had to keep some fixed arrays; they were really minimal though, and we basically reduced that to the minimum. Here you can see the structures that were defined for the modes, on the right side: you can see the different elements that need to be configured, and especially the configuration for the PLLs, which define the different clocks that will be used internally to support the mode that we want, so the dimensions and the frame rate that we want for our sensors; we supported multiple modes for each of the sensors. This was first sent out in October 2020, there were a number of iterations, and eventually it was accepted in December 2020. You can see that each driver is between two and three thousand lines, so these are pretty big drivers. So now let's talk about MIPI CSI-2 support, which was developed in parallel with the sensor drivers. Again, like I was saying a bit earlier, the MIPI CSI-2 controllers actually feed the data that they receive into the CSI controller, so they are represented in V4L2 as sub-devs of the CSI controller, and then we have the sensors, which are sub-devs of the MIPI CSI-2 bridges. To support all of that, we needed to add some adaptations to the CSI code to properly select this MIPI CSI-2 interface instead of the parallel interface. These drivers also need to retrieve the pixel rate from the sensor in order to correctly configure the clocks, especially for the D-PHY block, which handles the physical layer of MIPI CSI-2. It was integrated using the generic Linux PHY API, which supports MIPI D-PHY, but the helpers didn't actually account for DDR, so we had to add a factor of two at some point in the code; a sketch of this clock math follows below. To actually get the hardware working, we could use some reference source code from Allwinner for the A83T, which has a number of magic values that we couldn't really figure out, because there was just no documentation around, so we imported some of those magic values into our mainline driver. In the case of the A83T, the D-PHY code is mixed with the controller code, so the registers share the same layout, but in the implementation we still separated the two, and created a PHY provider and consumer within the same driver.
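Here is a hedged sketch of the clock math involved: read the sensor's pixel rate through its V4L2 control, derive the per-lane bit rate, and halve it for the clock because of DDR. The helper name is hypothetical, and bits-per-pixel and lane count are passed in as parameters:

```c
/* Sketch: derive a D-PHY clock rate from the sensor pixel rate,
 * accounting for the lane count and double data rate. */
#include <linux/errno.h>
#include <media/v4l2-ctrls.h>
#include <media/v4l2-subdev.h>

static int driver_get_dphy_clk_rate(struct v4l2_subdev *sensor,
				    unsigned int bpp, unsigned int lanes,
				    unsigned long long *clk_rate)
{
	struct v4l2_ctrl *ctrl;
	unsigned long long pixel_rate, lane_bit_rate;

	/* The sensor driver exposes its pixel rate as a control. */
	ctrl = v4l2_ctrl_find(sensor->ctrl_handler, V4L2_CID_PIXEL_RATE);
	if (!ctrl)
		return -ENODEV;

	pixel_rate = v4l2_ctrl_g_ctrl_int64(ctrl);

	/* Total bit rate, spread across the data lanes. */
	lane_bit_rate = pixel_rate * bpp / lanes;

	/* DDR: two bits per clock cycle on each lane, so the clock
	 * runs at half the per-lane bit rate. */
	*clk_rate = lane_bit_rate / 2;

	return 0;
}
```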
For the A31-generation MIPI CSI-2 bridge, implemented on the V3/V3s, there was some reference source code in the Allwinner SDK which wasn't so bad, and we also found some documentation about these controllers in the A31 user manual, so this was quite helpful. The D-PHY in this case is separate from the controller: it's the same one that was used for MIPI DSI, which was already supported in the Linux kernel. The difference is that we needed to use this D-PHY in receive mode, so we added support for receive mode, but then we needed a way to select, to distinguish, whether the PHY was going to be used in RX or TX. At first I came up with a sub-mode, which is something from the PHY API that allows selecting a sub-mode for the PHY, but this wasn't really appropriate, because it's not a runtime decision: the block will be dedicated to CSI-2 or DSI, so it will either receive or transmit, but it won't be something you can switch at runtime, which is what the sub-mode is really about. Then I considered using a different device tree compatible to represent that, but this is not really a good fit either, because it's still the same hardware block; it's just the way that it's used which is different, so a different compatible wasn't really justified. In the end I used an optional device tree property to indicate the direction of the transfer that the PHY is going to be used in, so that when you configure the PHY, depending on the mode, it configures the receive or transmit part accordingly. For this work on the MIPI CSI-2 bridges, the first iteration was sent out in October 2020, and it was later integrated into the ISP series, the biggest series, adding support for the ISP. So let's talk about this one now, starting with a few elements about the ISP itself. The ISP receives its data from the CSI controller. It also has an input for direct DRAM access, so direct DMA, but it's very likely broken: it's described in the registers, but it's not really functional, or at least no one was able to make it work, including Allwinner, so we can consider that it just directly takes the data from the CSI controller. This also required some work on the CSI controller itself, to separate the bridge logic from the DMA engine, because when we use the ISP, we no longer want to use the CSI controller's DMA engine: we just want to configure some parts of it, but not the parts related to actually writing to memory, which will be done by the ISP. There is actually an internal mux inside the CSI controller to direct the data flow to the ISP or to the CSI DMA engine, and we found that once you start using the ISP and the mux is switched to the ISP path, you cannot switch it back to the CSI DMA engine without a reboot, which is kind of problematic; this is something that we will probably need to address at some point in the code, by printing an error or something like that. Another aspect is that there are two outputs available in the hardware; we only added support for one, the main channel. So like I was saying, adding support for the ISP required some rework of the CSI code to separate the DMA engine from the more common logic that we need to keep, but there was also the difficulty of attaching the CSI controller to the V4L2 and media devices registered by the ISP driver when the ISP is available.
When the ISP is not available, we still wanted the CSI controller to register its own V4L2 and media devices, so there are basically two code paths that are a little bit different to handle these two situations, and we have a helper designed to detect whether the ISP is available or not. This is the media topology that we have for our driver: you can see we have a sensor, a MIPI CSI-2 bridge, then the CSI bridge, which has its own capture interface and which can also be connected to the ISP; the ISP then gets a video node for the parameters and a video node for the capture, where the final pixels will be received. One thing I found out when adding support for this driver is that it has quite an unusual synchronization mechanism: only a very small number of registers can actually be accessed directly with memory-mapped I/O. For the other registers, which concern most of the modules of the ISP, you need to write the register information into a buffer, then provide the address of that buffer through one of the few directly accessible registers. At the next vertical sync, when a flag is set, the ISP will go and read this load buffer, copy its contents into its actual registers, and use those values to process the frame that it receives. So this is basically a synchronization mechanism that works by writing the registers for the next frame into a buffer in DRAM, which gets copied by the hardware itself; the hardware will also copy the previous contents of the registers into a save buffer, which is also allocated in memory. Like I was saying, we have a specific video device to receive the parameters of the ISP, with an attached UAPI structure, sun6i_isp_params_config, which basically holds the parameters for the modules that we support; a user-space sketch of this parameters queue is shown below. Currently we support debayering, with some coefficients, and 2D noise filtering in that driver, but there are many, many more modules with enhancement steps that exist in the hardware, and as a result the API, especially the parameters API, is incomplete: it doesn't support all of the modules. As a result, this UAPI cannot yet be made a stable, public UAPI in the kernel, which is why the UAPI and the driver were submitted to staging; they can be moved out of staging once all of the features are supported. The patch series for this was sent out in September 2021, and you can see it's about 8,000 to 9,000 insertions, so it's quite a big piece of work, with lots of code written and modified, including a major redesign of the CSI driver. Now that this has been submitted, it's still under review, and there will be more iterations before it makes its final way into Linux. Let's take a look at the future steps for this driver. We could add support for more platforms, especially the A83T, which should be very similar to the V3 implementation. We should also expose a hardware revision, and there is a way to do that with the media device, because even within the same generation of ISPs, there are some modules that are not available on some specific platforms, so we should account for that with a hardware revision. Then we should also add support for the statistics, for the 3A algorithms; we can also add support for the second output channel, as well as scaling and rotation, which the hardware supports. Completing the UAPI with the description of the parameters for all of the modules would be really great, and it would allow stabilizing the UAPI.
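To give an idea of how the parameters video device is used from user space, here is a heavily hedged sketch: the buffer carries a configuration structure rather than pixels, on a meta output queue. The device node path is hypothetical, and the contents of struct sun6i_isp_params_config are only alluded to, since the UAPI is still incomplete and in staging:

```c
/* Sketch: feeding parameters to the ISP through its meta output
 * video device. Paths and structure contents are placeholders. */
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev2.h>

int main(void)
{
	int fd = open("/dev/video1", O_RDWR); /* hypothetical params node */
	struct v4l2_requestbuffers reqbufs = {
		.type = V4L2_BUF_TYPE_META_OUTPUT,
		.memory = V4L2_MEMORY_MMAP,
		.count = 1,
	};
	struct v4l2_buffer buffer = {
		.type = V4L2_BUF_TYPE_META_OUTPUT,
		.memory = V4L2_MEMORY_MMAP,
		.index = 0,
	};
	enum v4l2_buf_type type = V4L2_BUF_TYPE_META_OUTPUT;
	void *params;

	ioctl(fd, VIDIOC_REQBUFS, &reqbufs);
	ioctl(fd, VIDIOC_QUERYBUF, &buffer);

	params = mmap(NULL, buffer.length, PROT_READ | PROT_WRITE,
		      MAP_SHARED, fd, buffer.m.offset);

	/* A struct sun6i_isp_params_config would be filled in here,
	 * e.g. enabling the debayering and 2D noise filtering modules
	 * with their coefficients; the driver then applies it to the
	 * next frame through the load buffer mechanism. */
	memset(params, 0, buffer.length);

	ioctl(fd, VIDIOC_QBUF, &buffer);
	ioctl(fd, VIDIOC_STREAMON, &type);

	return 0;
}
```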
And of course, adding support for more of these modules in the driver itself would be great: it would add some extra enhancement steps to make the received image look good and be what we expect it to be. Then, once some of these modules are available, we will be able to develop 3A algorithm support, to do automatic white balance, focus and exposure. By the way, this is something that should be implemented in user space, and there is a community-driven project specifically meant to gather these implementations: libcamera. It provides abstractions for applications that want to use these complex types of pipelines, so it handles both the complexity of the pipeline and the hardware-specific 3A algorithm implementations. We will need to add support for the Allwinner A31 ISP there, and it is definitely a very good fit: this is exactly what libcamera is there to support, so it would be really nice to add that. If you are interested in using this ISP, if you would like to see this work continue along the steps that I have just described, or if you would be interested in libcamera support, feel free to contact us, and we will be happy to start a discussion and see how we can tackle all of these different points. To conclude, just a final note about hardware availability: it is not very common to find Allwinner boards that can use the drivers I have been working on, so the V3 ISP and the A31 MIPI CSI-2 support, but there is the S3-OLinuXino that was announced by Olimex, which has a Raspberry Pi-compatible MIPI CSI-2 connector. You can see the board here; it will soon be available, and it will typically be able to leverage all the work that I have presented to you today. So that's it for me, thank you for your attention! If you have questions or things that you would like to discuss about this, feel free to contact me. Again, we are definitely interested in continuing this work, so if you have interest in that, let's get in touch and continue the discussion. Thank you!