So good morning. This talk is about supporting video serializer and deserializer chips in the Linux kernel and the work in progress to achieve that. I'd like to know how many people in the audience are using, or planning to use, one of these chips in their projects? Maybe 15 people; that's a lot, okay? So I hope you're going to find this interesting.

I am Luca Ceresoli, an embedded Linux engineer at AIM Sportline, where I'm designing the next-generation dashboards, data loggers and cameras for motorsport applications. I work on the Linux kernel, bootloaders, device drivers and FPGA programming, as well as build system integration. And I love open source, of course: I contribute to projects such as the Linux kernel, U-Boot and Buildroot whenever I can. And I've been doing some work on these video serdes chips.

First of all, I will briefly introduce what these chips are and what they do, in case it's not clear. I will then review the current efforts to support them in the Linux kernel. Then I will show my ideal implementation and the troubles I've found in trying to implement it, and review the path I'm following to solve or avoid those problems. Finally, I will speak briefly about how I'm implementing the remote I2C feature of these chips.

Okay. So in case you don't know what a serdes chipset is, consider a very, very basic video pipeline with an image sensor and a system on chip. The image sensor is sending frames to the system on chip via one of the typical video buses, maybe LVDS or a parallel bus. There can be infinite variations on this theme, but the common thing is that there is a producer chip and a consumer chip, and the producer is sending video frames to the consumer over a video bus. That is what we care about. It could just as well be a system on chip sending to a display, or many other things. Then there are control signals, typically I2C for register access and a few GPIOs, but again it could be different. Anyway, these are pretty slow compared to the video stream.

This is fine if the two chips live on the same PCB, or are separated by a short flat cable, as in many camera or display applications. But if you need to transmit video over several meters, or you have high electrical noise over the link, that's where serdes chips come into play. They work together as a pair: there is a serializer chip that takes the video from the producer and serializes it over a very fast and very robust link, usually two conductors, either a coaxial cable or a twisted pair or similar. This cable goes to the deserializer, which then reproduces the same or an equivalent video stream for the consumer. So the pair is relatively transparent to the producer and the consumer. And since you have this very fast video link, you also want to transmit the control signals over it, because it's very handy; these chips allow that too, so you get remote control of the image sensor or any other chip.

Okay, the typical application for these chips is automotive. You can have a rear camera, multiple cameras for autonomous driving, an infotainment display, and similar applications. These are not much different from a classic embedded application, because there is a system on chip at the center, and all the devices are known, fixed and always connected together. It's just that some parts are a little farther away.
And there is high electrical noise, which the serdes chips take care of. In my case, I have some extra requirements. The application I've been working on is an action camera that has a base module with the system on chip, storage, processing and so on, plus two hot-pluggable camera modules. Those modules are interchangeable by the user, even at runtime: one can remove and reconnect one of the sensors while the other is still recording, and potentially replace one sensor with a different model at runtime. That is a pretty strong hotplug requirement, which is not present in the classic automotive applications.

Okay, so a brief review of what's available. There are two main producers of this kind of chip: Texas Instruments has the so-called FPD-Link series, and Maxim produces the GMSL series. They are very different, but the main features they deliver are the same: they can drive a camera or a display, they handle the most common video buses, they do remote I2C and GPIO, and some chips also have remote audio, UART and more. And there are deserializer chips that can handle more than one serializer connection.

Specifically, the chips I've been working on are the Texas Instruments DS90UB954 and DS90UB953. The DS90UB953 is the serializer; it takes a MIPI CSI-2 bus. The DS90UB954 is the deserializer; it can receive two links from two different serializers and output them over a single MIPI CSI-2 link, so it can transport two streams over the same bus, which is one of the CSI-2 features. These chips also handle remote I2C and GPIO, and they transport the clock remotely, so overall they are pretty similar to the other chips on the serdes market.

Okay, Linux support at the moment is zero for these chips. In mainline there is no support for them, even though there is interest, not only from the people in this room but also in the Linux development community, and there are at least three patch sets around covering these chips. One of these patch sets was made by a group of Video4Linux developers; it covers the Maxim GMSL chips, and it was presented during last year's Automotive Linux Summit. Unfortunately the talk was not video recorded, but the slides are available online, and they are quite interesting. Another patch set was sent by Vladimir Zapolskiy, and it implements many chips in the TI family; it was presented at the same conference, and those slides are available as well, and pretty interesting too. And the third patch set is the one that I have sent to support the two TI chips I mentioned, and that I'm presenting here.

The initial idea behind my implementation is an ideal one, the best I can think of for this kind of chip and for the hotplug requirement that I have. The situation is this: at boot, the deserializer is always present, so it is always instantiated, but nothing is connected to it. Then at some point something is connected; the deserializer detects the link, and the driver is aware of that. So first the serializer driver is instantiated and the chip is configured. Then an I2C adapter is instantiated to control the remote I2C bus. And finally, in order to detect the model of the remote camera that has been connected, there is a small I2C EEPROM with a model ID; this is inspired by the BeagleBone capes and similar designs. So the software can know which specific module has been connected.
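To make that concrete, here is a minimal sketch of how the deserializer driver might read that model ID over the newly created remote I2C adapter. The EEPROM address and the register offset are assumptions for illustration, not values from any real module:

```c
/*
 * Hypothetical sketch: identify the freshly connected module by reading
 * a model-ID byte from a small EEPROM on the remote I2C bus. The EEPROM
 * address and the offset are assumptions for illustration only.
 */
#include <linux/i2c.h>
#include <linux/kernel.h>

#define MODULE_ID_EEPROM_ADDR	0x50	/* assumed EEPROM address */
#define MODULE_ID_OFFSET	0x00	/* assumed offset of the ID byte */

static int read_module_id(struct i2c_adapter *remote_adap)
{
	u8 offset = MODULE_ID_OFFSET;
	u8 id;
	struct i2c_msg msgs[] = {
		{	/* select the offset to read from */
			.addr  = MODULE_ID_EEPROM_ADDR,
			.flags = 0,
			.len   = 1,
			.buf   = &offset,
		},
		{	/* read back one model-ID byte */
			.addr  = MODULE_ID_EEPROM_ADDR,
			.flags = I2C_M_RD,
			.len   = 1,
			.buf   = &id,
		},
	};
	int ret;

	ret = i2c_transfer(remote_adap, msgs, ARRAY_SIZE(msgs));
	if (ret < 0)
		return ret;
	if (ret != ARRAY_SIZE(msgs))
		return -EIO;

	return id;	/* the caller maps this to a known module type */
}
```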
After that, it can load a device tree overlay describing that specific module, and all the other components get instantiated; there could be more than just an image sensor on the remote module, so they all get instantiated.

This is, as I said, an ideal implementation, but unfortunately I have found some troubles in trying to implement it. Basically, the sum of what I wanted to do is not doable, at least not currently, in the kernel. And this is why.

First, we have some troubles related to Video4Linux2 (V4L2), and a couple of them apply to non-hotplug applications too, so they apply to any use of this kind of chip. The first is stream multiplexing. Here we have, I don't think you can read it, but that's a V4L2 pipeline, and there is a deserializer here which is receiving two links and producing only one link, the MIPI CSI-2 bus that transports two different streams. The problem is that stream multiplexing support is not yet implemented in mainline Linux. There are patches around to implement it, but they are not merged yet, as there are still a few issues to be sorted out. Hopefully they will get into mainline quite soon, since this is how MIPI CSI-2 is actually being used.

The second problem is related to reliability. As I said, in the classic automotive application the hardware is fixed, never modified and always connected, but a sensor could be faulty. A V4L2 pipeline currently expects all of its components to be ready, connected and working before it can start streaming. So if one sensor is broken, the whole pipeline cannot work. There is some work to be done to allow at least the other cameras to keep working properly. This is also well known in the V4L2 developer community, and hopefully a solution is coming at some point.

The rest of the issues I found are specific to the hotplug application. The first is dynamic pipelines. In my case, I need to remove a part of the pipeline when a sensor module is removed, and insert it again, or maybe insert a different piece of pipeline because a different sensor has been connected. This is not envisioned in V4L2 currently: the pipeline is not supposed to be modified, especially while streaming. And this does not seem to be easy to solve in the near future.

Other problems for a good hotplug implementation come from the device tree world. The first is that device tree overlay insertion and removal is not implemented in mainline Linux. There have been patches around for several years now; they have not been merged, mostly because they would trigger other bugs. As far as I know, Frank Rowand is working on them, and this work is progressing, maybe not super fast, but it's good that it's progressing. So at some point this will be available.

Another problem related to the device tree arises from how video pipelines are described in it. For every link in a pipeline there are two pointers, phandle references actually, in the device tree: one from the source to the sink, and the other from the sink to the source. So in our case the overlay device tree, for example the camera sensor, would need to point to the base device tree, but the base device tree would also need to point into the overlay, which is not really nice to have. It can work, but having a reference into an overlay from the base DT doesn't look like a really good idea.
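As an aside, mainline does already have an in-kernel entry point that a driver can use to apply an overlay blob, and an audience member comes back to exactly this in the Q&A at the end. A minimal sketch, assuming a hypothetical firmware file naming scheme (and the three-argument of_overlay_fdt_apply() signature of the kernels current at the time of this talk):

```c
/*
 * Sketch of loading and applying an overlay from a driver, assuming a
 * hypothetical firmware file naming scheme. of_overlay_fdt_apply() is
 * the mainline in-kernel entry point; of_overlay_remove() undoes it,
 * and runtime removal is where the known lifetime issues live.
 */
#include <linux/firmware.h>
#include <linux/kernel.h>
#include <linux/of.h>

static int load_module_overlay(struct device *dev, int model_id, int *ovcs_id)
{
	const struct firmware *fw;
	char name[32];
	int ret;

	snprintf(name, sizeof(name), "camera-module-%d.dtbo", model_id);

	ret = request_firmware(&fw, name, dev);
	if (ret)
		return ret;

	/* The overlay data is copied internally, so fw can be released */
	ret = of_overlay_fdt_apply(fw->data, fw->size, ovcs_id);

	release_firmware(fw);
	return ret;
}
```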
Okay, so the good thing is that's all for the problems. The bad thing is that some of them are hard blockers for supporting a hotplug application, mainly the non-modifiable pipelines in V4L2 and the device tree overlay insertion and removal at runtime. I cannot count on all of these problems being solved before the product is supposed to be on sale. So I've been looking for some way out, and well, it cannot be anything but a workaround; however, I've been trying hard to find a workaround that is as mainline-friendly and as clean as possible, for a workaround at least. I set myself two primary goals. First, the workaround must not be needed for non-hotplug applications: since those seem to be the most common ones, they can stay clean, with no workaround. Second, it must not touch the serdes drivers, otherwise they could not go into mainline, or I could not use the same version as in mainline.

And the result is this. The core of the problem is that neither V4L2 nor the device tree wants to see the pipeline change at runtime, so I will not change it. I will pretend that the sensor is always there. The sensor driver is always instantiated, so the kernel believes there is always a sensor. V4L2 is happy, because the pipeline is never modified. The device tree is happy, because there is no overlay insertion or removal. The sensor driver becomes the ugly hack, because it has no standard way to start and stop the stream, and also a single driver must support every possible sensor, which is really, really ugly. But the good news is that this doesn't affect the serdes drivers, and it works.

Well, actually it slightly affects the serdes drivers, in this way. In my ideal implementation, as I mentioned at the beginning, I would instantiate an I2C adapter inside the serializer, where it looks most natural to be, because that's the chip that is physically driving the wires. But in that case, when the link goes away and the serializer driver is removed, the I2C adapter would be gone too, which is not possible, because the image sensor driver is still there. So the solution was actually pretty simple: move the I2C adapter into the deserializer driver, which is always present. This seems a little less natural, but in the code it is actually pretty clean, and since the I2C remotization feature is implemented by the serializer and deserializer pair together, I consider this still quite okay.

A similar situation happens for GPIO. These chips have very complex ways to configure remote GPIOs; I won't go into the details if you don't care about them. But the end result is that there is a GPIO chip instantiated in the deserializer driver, just like the I2C adapter, which is used to drive the GPIOs of the image sensor (see the sketch below). And this is basically the whole impact of my approach on the serdes drivers: the rest of those drivers can stay clean, and all the hacks are confined to the image sensor driver.

And well, okay, that's all for the overview of the problems and the way out of them. Now I would like to briefly introduce the remote I2C feature and how I've been working to implement it. This is quite important because, well, of course it has to work, but not only that: I2C is not really meant to be hot-pluggable, and remote I2C is something a bit weird, outside the classic I2C scheme. This has been discussed a lot with the I2C maintainer, Wolfram, and other people.
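Coming back to that GPIO chip for a moment, here is a hedged sketch of what instantiating it in the deserializer driver can look like. The register name, bit layout and line count are invented for illustration; the real chips have far richer routing options:

```c
/*
 * Hedged sketch: the deserializer driver exposes the remotized GPIO
 * lines as an ordinary gpio_chip. The register name, bit layout and
 * line count are invented; the real chips are considerably richer.
 */
#include <linux/gpio/driver.h>
#include <linux/module.h>
#include <linux/regmap.h>

#define DESER_REG_REMOTE_GPIO	0x6e	/* hypothetical register */

struct deser_priv {
	struct regmap *map;		/* register access over local I2C */
	struct gpio_chip gc;
};

static void deser_gpio_set(struct gpio_chip *gc, unsigned int offset, int val)
{
	struct deser_priv *priv = gpiochip_get_data(gc);

	/* Drive the remote line by flipping its bit in the chip register */
	regmap_update_bits(priv->map, DESER_REG_REMOTE_GPIO,
			   BIT(offset), val ? BIT(offset) : 0);
}

static int deser_gpio_init(struct device *dev, struct deser_priv *priv)
{
	priv->gc.label = "deser-remote-gpio";
	priv->gc.parent = dev;
	priv->gc.owner = THIS_MODULE;
	priv->gc.set = deser_gpio_set;
	priv->gc.base = -1;		/* dynamic GPIO numbering */
	priv->gc.ngpio = 4;		/* assumed number of remote lines */
	priv->gc.can_sleep = true;	/* accesses go over I2C */

	return devm_gpiochip_add_data(dev, &priv->gc, priv);
}
```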
So I think we now have a pretty good idea of how to handle remote I2C. First, the two vendors, TI and Maxim, implement remote I2C in very different ways. This has been discussed at length on the Linux I2C mailing list, and also during the Linux Plumbers conference last September; there are a couple of reports online if you want the details, and Wolfram also mentioned it yesterday during his talk.

I'm not a Maxim GMSL expert, but what they do for I2C is basically that the serializer plus the deserializer together implement the equivalent of an I2C switch. In practice, it is as if there were a physical connection, two physical wires actually, connecting the upstream bus to one or more of the remote buses. This is exactly what an I2C switch or an I2C mux does. So any transaction on any of those wires, on the remote bus or on the upstream bus, is seen on the other side. This is quite simple, and it has the same limitations as I2C switches: you cannot have two chips with the same address on the remote bus and on the local bus, and you have to be careful not to connect two remote buses together, or they will speak to each other, which is a problem in some applications. So this is how the Maxim chips work.

The TI chips work very, very differently. What they do is at a logical level, not at an electrical level: they have a translation table that performs address translation for each of the remote ports. For each remote port there is an, I think, eight-entry table, so you can assign an alias to any physical address on the remote bus. In the example, you have slave 0x0A on the remote bus, and you assign it the alias 0x2A, which is just a random address not otherwise used on the upstream bus. Whenever you issue a transaction to address 0x2A on the upstream bus, it matches the table, and the same transaction is propagated to the remote bus with a modified address, so that it becomes 0x0A again. This allows having the same address, 0x0A, on the upstream bus as well: if you address 0x0A, it goes to the chip on the upstream bus; if you address 0x2A, it goes to the second remote bus; 0x1A goes to the first remote bus, and so on. This is a little more flexible and powerful than the GMSL scheme.

The way I implemented this, as I mentioned before, is to instantiate a new I2C adapter for each of the remote buses, and the deserializer driver implements the address translation: whenever there is a transaction, it modifies the address to be sent out on the wire.

And this is how it appears in practice. In the example above there are three buses: upstream, remote 1 and remote 2. So there are three buses in the Linux system: i2c-0 is the upstream bus, and i2c-4 and i2c-5 are the two remote buses. You can use any of the remote buses pretty much like any other I2C bus, to almost all extents. For example, you can instantiate a new device on bus i2c-4 by using its real physical address. What happens under the hood is that the address is mapped internally to an alias, which is something you can see in the kernel logs if you are interested; in this case, address 0x4B has been chosen. But you don't really care much about the alias, because you can access the device using the bus number and the physical address, just like any regular I2C device.
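Under the hood, the translation can live in the remote adapter's master_xfer callback: swap each message's physical address for its alias, forward the whole transfer to the upstream adapter, and restore the addresses afterwards. A minimal sketch, where find_alias() stands in for the lookup into the chip's alias table and the message cap is arbitrary:

```c
/*
 * Hedged sketch of the address translation: the remote-bus adapter
 * swaps each message's physical address for its alias, forwards the
 * whole transfer on the upstream adapter, and restores the addresses.
 */
#include <linux/i2c.h>

#define ATR_MAX_MSGS	16	/* arbitrary cap for this sketch */

struct atr_port {
	struct i2c_adapter adap;	/* the remote bus, e.g. i2c-4 */
	struct i2c_adapter *parent;	/* the upstream bus, e.g. i2c-0 */
};

static u16 find_alias(struct atr_port *port, u16 phys_addr);

static int atr_master_xfer(struct i2c_adapter *adap,
			   struct i2c_msg *msgs, int num)
{
	/* set with i2c_set_adapdata() when the adapter is registered */
	struct atr_port *port = i2c_get_adapdata(adap);
	u16 orig[ATR_MAX_MSGS];
	int i, ret;

	if (num > ATR_MAX_MSGS)
		return -EOPNOTSUPP;

	/* Send each message to the alias instead of the physical address */
	for (i = 0; i < num; i++) {
		orig[i] = msgs[i].addr;
		msgs[i].addr = find_alias(port, msgs[i].addr);
	}

	ret = i2c_transfer(port->parent, msgs, num);

	/* The caller owns the messages: put the real addresses back */
	for (i = 0; i < num; i++)
		msgs[i].addr = orig[i];

	return ret;
}
```

From user space the alias stays invisible; instantiating a device works as usual, for example with something like "echo sensor-name 0x0a > /sys/bus/i2c/devices/i2c-4/new_device", using the physical address (the device name here is just a placeholder).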
Internally, the driver translates 0x0A to the alias address and sends the transaction with the alias over the wire; the serdes chipset then translates the alias back to the physical address on the remote bus. And so that's all for I2C remotization.

Let me draw some conclusions. First, video serdes chips are pretty complex: they do something that is somewhat weird, and the kernel is not completely ready to host them in a good way. So work is ongoing in many directions for different chips. The main issues are the limitations in V4L2 and in the device tree, but these are mostly related to a hotplug requirement; for non-hotplug applications there are only a few blockers, and not really huge ones. And at least for the remote I2C implementation of the TI chips, there is now a plan on how to proceed that is widely shared. So the rest is just getting on with the work. Thank you. Any questions?

[Audience] Thank you, good talk. What happens when there is a master on the remote bus?

A master on what?

[Audience] What happens when there is a master on the remote I2C bus? If you go to slide 20, for example. So let's say this one is a master. This one here.

Yeah, it happens; there are applications having this problem, and what happens is actually a mess. Wolfram spoke about that yesterday. There are these remote camera modules having a microcontroller that at power-up configures devices on the local remote bus. And if you close the I2C switch at that point, you basically have multiple masters talking, and anything can happen. So the best thing would be: don't do that. Ask your hardware designer not to do that; after all, you have a system on chip on the other side, and it can do everything. The way they handle this in the GMSL patches is basically a hack that probably could not be done better: they disconnect all of the switches and wait for some seconds; after that, they know the remote microcontroller has done its job, and then they start doing the rest. For the TI chips, if something like that happened, it would be a smaller problem, because transactions on the remote side are not propagated to the upstream bus or to the other remote buses. So it is a smaller mess. But still, if you want to talk to the remote side while another master is speaking there, there would be a mess, limited to that remote side. So don't do that if you can.

[Audience] Just to give a picture of what we need with GMSL, can you go back one slide, please? The thing is, in our case, on the right, on the remote side, we can reprogram the devices to have new addresses. So we need to reprogram them one by one to each have a unique address, because in the end we really want to have all of the switches closed: we need to start the cameras at more or less the same time, otherwise they will go out of sync. This is the problem we're facing. At the beginning we can have just one switch closed and the others open, to reprogram the devices, but in the end we want all switches closed at the same time. That's the challenge we are facing with GMSL.

Yeah, yeah. One of the bad things is exactly that these chips by default have all the switches connected.

[Audience] That's super annoying. All gates are closed at the beginning, and we have two of them.

Yeah, that's one of the annoyances in that setup.
Actually, one more thing: those who are working on the GMSL chips have implemented reprogramming of the remote chips, so all of the remote chips they are using can modify their own address. They connect the modules one by one, modify the address, disconnect, and do the same for all of them; at the end, they connect everything, so there is no duplication of addresses on the whole big bus. This works, but it could be avoided, because they have a mux: they could just use the mux to connect to one module at a time whenever they have to talk to it. But that would impact performance, because you would need to change the selected channel every time, and this is what they are trying to avoid, as far as I could understand.

[Audience] Yes, and it would take a lot to start the cameras one by one; they need to do it all together.

Right.

[Audience] So, since we started with difficult questions, I'm going to continue with other difficult questions. These deserializers and serializers: you mention the use case of cameras, but they are also used for displays, so the other way around compared to the camera use case, but it's actually the same chips that are used. So ideally we would like to be able to use the same drivers for Video4Linux pipelines or DRM pipelines. Do you have any plans or ideas on how to do that? I've seen an actual design using a GMSL serializer for a display, so the other way around, right: the serializer is on the SoC side and the deserializer is on the display side.

Okay, yeah. I don't know much about using these chips with displays; I don't have an application for that, so I don't know how that would work with DRM/KMS, sorry. Actually, both vendors have two different series of products, a display series and a camera series. I don't know exactly why, because after all, if there is, for example, CSI-2 entering and CSI-2 exiting, it's just CSI-2. I don't know if there is a specific reason for that. Maybe the display devices are more careful about display timing, I don't know. Or maybe it's just that they have audio and the camera ones do not, but then there could be cameras with microphones. Sorry, I don't know much about the display series.

[Audience] About the device tree overlays, I have two comments.

Sorry, about what?

[Audience] Device tree overlays. The first one is that for your endpoint you basically need two phandles, one in the base and one in the overlay. But you could add the property in the base part from the overlay. If you have a DT notifier in your deserializer driver, then you will become aware when the property is added.

So let me see if I got your point. You mean...?

[Audience] You have two remote endpoints, one in the base DT and one in the DT overlay. But the one in the base DT can actually be added by the overlay. And then, if you have a DT notifier in your deserializer driver, you will be notified when the remote-endpoint property is added (a sketch of this follows below).

Okay. Can it also be removed, and notified?

[Audience] Yes. But there may still be some memory leaks related to adding and removing overlays.

Who cares about that?

[Audience] And then the second comment is that, for the generic case, there exists the OF configfs method to add and remove overlays in a generic way, which will not be accepted upstream. But that's not something you need, if all your driver needs to do is get an overlay from somewhere and insert it into the system.
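For reference, a minimal sketch of that DT-notifier suggestion (this requires CONFIG_OF_DYNAMIC; the reaction logic is left as comments, as it is entirely driver-specific):

```c
/*
 * Sketch of the DT-notifier idea: register an OF reconfiguration
 * notifier and react when an overlay adds or removes properties,
 * such as a remote-endpoint reference under the deserializer node.
 */
#include <linux/notifier.h>
#include <linux/of.h>

static int deser_of_notify(struct notifier_block *nb,
			   unsigned long action, void *arg)
{
	struct of_reconfig_data *rd = arg;

	/* A real driver would check that rd->dn is its own node */
	if (!rd->dn)
		return NOTIFY_DONE;

	switch (action) {
	case OF_RECONFIG_ADD_PROPERTY:
		/* e.g. a remote-endpoint appeared: set up the new link */
		break;
	case OF_RECONFIG_REMOVE_PROPERTY:
		/* e.g. the module went away: tear the link down */
		break;
	}

	return NOTIFY_OK;
}

static struct notifier_block deser_of_nb = {
	.notifier_call = deser_of_notify,
};

static int deser_register_dt_notifier(void)
{
	return of_reconfig_notifier_register(&deser_of_nb);
}
```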
[Audience] So your driver could, for example, use request_firmware() or something like that, then pass the overlay, and it can be added. And I think that support is already there. The configfs interface allows the user to add some arbitrary overlay to the system, but that's not what you need: you just need some overlay describing your camera to your system, and your driver could handle that based on the ID you read from the EEPROM or something like that. So I think most of it is there. But when removing, there may be memory leaks.

Yeah. Actually, there could be solutions to that. Anyway, I started investigating that part, and at some point I kind of gave up, since having device tree overlay insertion and removal would still not be enough: I would also need the V4L2 part. So it's kind of like I could solve half of the problem, but it would not change the overall picture. But anyway, thanks for the suggestion about device tree overlays.

[Audience] My question is about this device here: if you use I2C address translation, does it appear on the upstream bus?

It appears on the remote bus, as you can see from this example. However, on the upstream bus, I think the alias address appears as busy, as used; I think so, yeah, so you should not be able to use it there. I have to check, anyway. So it should be marked as used by the I2C core, but of course there is a limit to what the Linux I2C core can prevent: there's the read/write interface where you can basically write anything to anywhere, anyhow. And so you could use the alias on the upstream bus to write to the downstream bus. It's an alias; it's like a symlink for a file, so you can access it this way or that way.

Okay, I think that's all. Thank you very much. Enjoy your lunch.