Hello, I am Pavel Machek and I will talk about cheap complex cameras. I have some technical issues with the slides because I don't have a copy on my screen, so this is going to be slightly interesting. Anyway, about me: I work for a company called DENX and spend a lot of my time doing embedded camera development, U-Boot, or build systems. When I am not hacking computers, I do a lot of horse riding, so you will see that in the pictures. This is my horse.

So, why I originally did this: I would like to have a usable cell phone. Unfortunately, this brick called the Nokia N900 seems to be the closest thing to a usable cell phone that money can buy, which is kind of sad. I really wanted to have a working light, because unlike cameras, I consider the light to be very, very useful, and that's how it started. These days I have a camera too, but, well, you will see. Big thanks go to Ivaylo Dimitrov and Sakari; Sakari is present here. Ivaylo did a lot of maintenance of the patches, so they were not lost through the years.

OK, so these days the hardware is pretty cheap, you get it free with your cell phone, and it's also pretty complex. You get a flash, you get a voice coil for focus, you get front and back sensors, you get a switch to select between them, and then there's quite a lot of hardware in the system-on-chip: a front end to decode the signal from the sensors, then a preview module, a resizer, statistics collection and so on. It's quite complex.

This is what the hardware looks like from the Linux side, from the Linux view. A somewhat better abstraction, but still quite difficult to use. This is what the hardware is like underneath. I don't think you are supposed to understand this in this short presentation, but the hardware is quite complex. This is actually a quote from a presentation from 2010, and the point is that there is a pipeline where different modules are connected to each other.
So there are questions such as which resolution and which format are passed between the pipeline modules, which is not obvious, because at the end you can have a different resolution than what the individual modules work with. And that's a big part of the problem with the media controller. Anyway, the thing is, it's an embedded mess.

So Video4Linux was originally designed for TV grabber cards. That makes the world very, very simple: you have a TV card, it has a set resolution, or maybe it has a few resolutions you can select from. It has a given format, RGB 24 bits, maybe one of a few, and you just capture, right? So this looks simple. Webcams: if your machine has a webcam, it's very, very simple there too. It basically looks like this: one device, you have it.

Unfortunately cell phones are different, as we've seen on the slides, and this is what my cell phone looks like from userland. Fortunately, I don't need to use most of these devices, but there are still a few I need to use. The three important ones are at the bottom: there's the focus coil, there's the flash, there's the sensor chip. I definitely need to talk to those. And then I need to talk to the OMAP3 ISP CCDC, which is where the raw data is coming from. The others are mostly not needed at the moment.

So again, I am making fun of the 2010 presentation where they introduced this media controller stuff, which was supposed to model the complex hardware, and they did model the complex hardware. But unfortunately, the claim was "this is not a new version of Video4Linux", and in reality they ended up with something which is in practice incompatible. So currently there's no support: if you run an application, it won't even start unless you specify options manually, and then most of the features won't be available.
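For context, this is roughly what driving such a pipeline by hand looks like with the media controller tools. The entity names, pad numbers, device nodes, and format below are placeholders, not verified against the N900 (the real ones come from `media-ctl -p` on your board), so treat this as a hedged sketch rather than a working recipe:

```shell
# Print the topology; copy the real entity and pad names from here.
media-ctl -d /dev/media0 -p

# Enable links from the sensor through the CCDC to its memory output.
# "SENSOR" is a placeholder; the real sensor entity has its own name.
media-ctl -d /dev/media0 -l '"SENSOR":0 -> "OMAP3 ISP CCDC":0 [1]'
media-ctl -d /dev/media0 -l '"OMAP3 ISP CCDC":1 -> "OMAP3 ISP CCDC output":0 [1]'

# Set a matching 10-bit Bayer format on each pad along the pipeline.
media-ctl -d /dev/media0 -V '"SENSOR":0 [fmt:SGRBG10_1X10/1024x768]'
media-ctl -d /dev/media0 -V '"OMAP3 ISP CCDC":0 [fmt:SGRBG10_1X10/1024x768]'

# Only after all this can a plain V4L2 application capture raw frames
# from the CCDC video node.
v4l2-ctl -d /dev/video2 --stream-mmap --stream-count=1 --stream-to=frame.raw
```

This is exactly the extra step that classic V4L2 applications never perform, which is why they fail to start on such hardware.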
So, kernel progress: the cell phone as sold never ran anything close to a mainline kernel, but after like 8 years of work we are now in, and we have support for the basic camera hardware in this old phone. OK, so this one is the Nokia N900. There's a newer one, the Nokia N9, and that's a bit better supported, thanks to Sakari again. The N900 sensor has been working since 4.13, and N9 support was merged a few weeks ago, but we have it in now. So kernel support is there. Autofocus coil support is there, but it is not connected to the rest of the devices, so you currently can't really use it. Flash support is in, but again it's not connected to the rest of the system. That one is reviewed for the N9; I need to do some more work to get it supported on the N900.

OK, and this kind of image is what you get with the kernel. Can anyone guess what this is? No. This can be recognized as a USB connector, about in the middle of the picture. So without userland support, this is what you get, and I believe this is not good enough. I don't think I want to replace the DSLR anytime soon, but it would be nice to at least be able to recognize what's in the picture, and that's clearly not the case here. So we need some kind of userland, and there are a few options.

One option is v4l-utils. There are advantages: it's alive and well, it's being maintained, and it is in C and has a reasonable coding style, so it's possible to read and possible to modify. Unfortunately, there are disadvantages. There is no media controller support, which means first that it doesn't work on the phone, and second that even if I got it working, it wouldn't be able to change the resolution. And it would be really nice to change the resolution, right? Run the preview in low resolution so I can aim the camera, then switch to high resolution for the final image. It has some kind of autogain support, but it's quite poor.
In particular, it tries to achieve an average intensity, and what we really want is not an average: we want a bright picture, but not so bright as to overexpose it. It doesn't have autofocus, and autofocus is actually the hard part, the part I spent quite a lot of time on. It has auto white balance, but again, this is tuned for webcams: for a webcam you want to change the balance slowly so that the user doesn't notice, which is not what you want for a still camera. A bigger issue is probably that it's 8-bit only, and even this cell phone chip has 10-bit resolution; DSLRs have 14-bit sensors. So if we wanted to use it without modifying it, we would be losing quite a lot of data just there.

Another thing is the design. The v4l-utils have a library, and the library tries to mimic the kernel interface, which basically means the interface between the library and the user application is already set for us, and we are expected to use the same kernel interface, which is not quite suitable for us. We want to do advanced stuff, like focusing on a given point and so on, which the kernel doesn't know how to do, and there's no completely straightforward way to extend that. Of course, this is solvable. Next thing: I would like to convert data, and the library can do it, but it is only willing to convert data as it comes from the kernel. I would really like to be able to convert my own data too.

OK. Then there's another project called FCam. It was a university project, and it supported exactly this hardware: a full-featured camera application that could do autofocus, autogain, everything. It even had accelerated previews, so data went directly from the hardware to your framebuffer without the CPU; 10-bit support, autofocus, autogain; it could do RAW pictures, it could do JPEGs. They even did stuff like HDR. They could change resolutions, and they had a nice-looking interface for programming.
They tried to hide the complexity from you, and the resulting application looked very, very nice, if it worked. So we are getting to the problems. The application is in C++, which may not be a problem yet, but it uses threading and it has a custom kernel interface. They realized some of the problems and fixed them, but in the process the application became dependent on their kernel patches. And unfortunately, I don't know where to get the kernel patches; if someone still has them, I'm interested. By now it's a dead project: I couldn't find anyone who cares anymore. It gets us photos, but it doesn't solve the problems. We won't be able to run usual applications like, I don't know, maybe Skype, because this is a single application good for taking photos.

So, my real goals. The first goal was to get the light working; that meant I had to do kernel support, so by now I would like to have the kernel tested. I'm trying to create something good enough so that the kernel can be tested easily. As a bonus, I would like a basic camera application with autogain, because you need that, a preview, and some way to get photos. And I would really like to have a quick shutter: on the original Nokia you pressed the button, it took like two seconds to focus and then one more second to switch resolution and grab the picture, quite a long time. So maybe it would be nice to get around that somehow.

Bad news, what are not my goals, because I don't have a team of ten people working for me. I don't care much about accelerated preview: I have a CPU, my CPU can copy the data around, it will burn power, but OK. I also don't care about high-quality JPEG. As long as the data is captured in raw format and can later be converted to something of reasonable quality, I don't care much about getting a high-quality JPEG straight away. And listed as very hard, so also not my goals: it would be nice to somehow solve video capture.
The camera can record at VGA resolution at least, but I care about still photos more. It would also be nice, though Video4Linux doesn't support it in any way, to be able to run independent applications, where one would run the preview and a second would, for example, record or take photos or start to stream. Like in the old sound days you had one application using the sound card, and these days you have many applications working at the same time; it would be nice to have something like that for video, but that's a faraway goal.

As you can see from the pictures, most of them are in macro range, because that's where this camera is actually quite nice. It won't give you nice pictures at a distance the way other cameras can, but for macro photos a small chip is actually pretty useful.

So I did some performance research. I should mention that for some reason, on the N900, full resolution, which is five megapixels, doesn't work. On the other hand, we have enough problems with one megapixel, and one megapixel is enough to get a somewhat usable photo; you will lose more quality by not having autofocus than by megapixels. If you try to convert from GRBG10, which is a 10-bit Bayer format, the Video4Linux libraries can convert it to RGB for you. Unfortunately, if you try to do this, it's so slow that it's not usable for the viewfinder, so that's not the way to go. Fortunately, if you just select some pixels and display those, you can do that in real time. You get a small window on the cell phone, but that's good enough to aim, maybe at a reduced frame rate. Good enough.

If you remember, the camera hardware has hardware features to collect statistics for you, and there are three modules, each specialized for a different kind of statistics. Unfortunately, if I wanted to use those modules, it would mean that my application would be dependent on my hardware, which is something I would really like to avoid.
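The pixel-selection viewfinder trick just described could be sketched like this. This is my own illustration, not SDLcam's actual code, and it assumes the 10-bit samples arrive unpacked, one per 16-bit word, which may not match what the hardware really delivers:

```c
#include <stdint.h>
#include <stddef.h>

/* Build a small RGB888 preview by sampling one GRBG 2x2 cell per
 * `step` cells, skipping the full demosaic entirely.  GRBG layout:
 * even rows are G R G R..., odd rows are B G B G...  Output size is
 * (w/2/step) x (h/2/step) pixels, 3 bytes each. */
static void preview_grbg10(const uint16_t *raw, int w, int h,
                           int step, uint8_t *rgb)
{
    int ow = (w / 2) / step;
    int oh = (h / 2) / step;
    for (int oy = 0; oy < oh; oy++) {
        for (int ox = 0; ox < ow; ox++) {
            const uint16_t *cell = raw + (size_t)(oy * step * 2) * w
                                       + (size_t)(ox * step * 2);
            uint16_t g1 = cell[0], r  = cell[1];      /* row: G R */
            uint16_t b  = cell[w], g2 = cell[w + 1];  /* row: B G */
            uint8_t *out = rgb + 3 * (oy * ow + ox);
            out[0] = (uint8_t)(r >> 2);               /* 10 bit -> 8 bit */
            out[1] = (uint8_t)((g1 + g2) >> 3);       /* average both greens */
            out[2] = (uint8_t)(b >> 2);
        }
    }
}
```

Because only one Bayer cell is read per output pixel, the cost scales with the preview size rather than the sensor size, which is what makes this usable in real time on the phone's CPU.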
So fortunately, if I only sample one in 300 pixels for autogain, I still get useful autogain, and it's fast enough. Also, for autofocus, I could sample the whole image and do some expensive computation on it, but it turns out sampling three lines is enough to get mostly useful focus. So, anything about this image? This is not an image from a cell phone, right? It was obviously taken with a big camera chip, because the background is blurred; this kind of image you can't get with your cell phone.

So here is my SDLcam project. It's in its early phases; I would like to merge at least some of the stuff into the official v4l-utils. The binary is called sdlcam, because it relies on SDL heavily, and the branch to use would be my1.13.

OK, so the bad news. Pipeline parameters are hardcoded, which means that you will not be able to use it on your camera easily, but it should be possible to modify. I did very simple ioctl propagation, because I wanted something useful fast, so I just send the ioctl to all the devices, and it succeeds on one of them. There's a very simple user interface: I needed to control it from the touchscreen, so I had to make a UI, but it's really simple. I capture into RAW, because RAW is actually simpler than JPEG, right? You just take the frame buffer and write it to disk. At least that's the idea. Unfortunately, the only somehow-documented image format for RAW is DNG, and DNG is a very, very complex variation of TIFF. I don't know, don't do this. There's an old format called portable gray map, PGM, so I use PGM. It can store 10 bits, and it's easy enough. It's possible to convert PGM to DNG using FCam, if someone really wanted to do that. Internally it's 8-bit; for JPEG capture, I do produce JPEGs, but they are of quite bad quality: white balance is missing, bad pixel correction is not there, lens shading correction is not there. But it still gets some kind of reasonable pictures, right? You can at least tell what's in there.
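Writing the raw buffer out as PGM really is about as simple as claimed. A minimal sketch of a binary (P5) writer for 10-bit data follows; this is illustrative, not the exact SDLcam code. The one subtlety is that with a maxval above 255, PGM stores each sample as two bytes, most significant byte first:

```c
#include <stdint.h>
#include <stdio.h>

/* Write 10-bit raw samples as a binary PGM (P5) with maxval 1023.
 * Each sample becomes two bytes, MSB first, per the PGM format.
 * Returns 0 on success, -1 on I/O error. */
static int write_pgm10(FILE *f, const uint16_t *data, int w, int h)
{
    if (fprintf(f, "P5\n%d %d\n1023\n", w, h) < 0)
        return -1;
    for (long i = 0; i < (long)w * h; i++) {
        unsigned char b[2] = { (unsigned char)(data[i] >> 8),
                               (unsigned char)(data[i] & 0xff) };
        if (fwrite(b, 2, 1, f) != 1)
            return -1;
    }
    return 0;
}
```

The whole format is one header line plus raw samples, which is why it is so much friendlier than DNG for a quick capture path.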
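The three-line focus sampling mentioned above boils down to a cheap contrast metric: sharper focus means stronger differences between neighboring pixels. A sketch of that idea (my own illustration, with the row choice invented, not taken from the SDLcam source):

```c
#include <stdint.h>
#include <stddef.h>

/* Contrast metric over one line of samples: sum of squared
 * horizontal differences.  Higher means sharper. */
static long line_sharpness(const uint16_t *line, int n)
{
    long s = 0;
    for (int i = 1; i < n; i++) {
        long d = (long)line[i] - (long)line[i - 1];
        s += d * d;
    }
    return s;
}

/* Whole-frame score from just three rows (quarter, middle, three
 * quarters down) -- cheap enough to run on every preview frame. */
static long frame_sharpness(const uint16_t *raw, int w, int h)
{
    int rows[3] = { h / 4, h / 2, 3 * h / 4 };
    long s = 0;
    for (int i = 0; i < 3; i++)
        s += line_sharpness(raw + (size_t)rows[i] * w, w);
    return s;
}
```

A perfectly flat line scores zero, and any edge content pushes the score up, so comparing scores across lens positions is enough to find the focus peak.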
So that's the good news. Autogain kind of works; there's probably not much to improve there. Autofocus mostly works, and there are some design improvements possible, which I will mention later. Shutter speed is quite reasonable: because I don't switch resolutions and always run at one megapixel, I am able to just save the last buffer and have my capture. I can do RAW capture, which is useful given that the quality of the JPEGs is bad. And it's good enough for testing kernels, which was kind of my original goal.

So the old autogain targeted an average intensity, which is a bad thing. The new autogain got ideas from FCam: it first tries to get enough bright pixels, and then tries not to get too many overexposed pixels; you want to limit the amount of overexposed stuff. Next thing, it's tuned so that it changes shutter speed and gain together. The idea is that as long as the shutter is quick, you only adjust shutter speed; at about 1/60 of a second you start to modify gain instead, and only if you run out of gain do you have no option but to increase the exposure time again. A big thing that should be done, and I am not currently sure how to solve it: it would be really useful to be able to make bigger adjustment steps. Unfortunately the algorithm is iterative, and if you make too big an adjustment, it just starts to oscillate. So this needs some more improvement.

So there are actually two kinds of autofocus. One is autofocus in a single sweep, which is suitable for taking photos: you run the lens over the whole range. Oh, actually there are three. The first one is fixed focus: for taking pictures at a distance of about 2 meters, it should be enough to just keep the lens at a fixed position and not do anything. Because the chip is small, the depth of field is pretty good, and basically you are only likely to make it worse if you try to move the lens. Then there is single-shot focus.
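Before getting into the focus modes, the autogain policy just described can be sketched in two small pieces: a verdict on a subsampled frame, and the shutter-before-gain split. All thresholds and units here are made-up tuning values for illustration, not the ones SDLcam actually uses:

```c
#include <stdint.h>

/* FCam-style exposure check on subsampled 10-bit pixels: we want
 * enough reasonably bright pixels, but not too many saturated ones.
 * Returns -1 to darken, +1 to brighten, 0 to leave alone.
 * Thresholds (700, 1020, 2 %, 10 %) are invented tuning values. */
static int exposure_verdict(const uint16_t *px, int n)
{
    int bright = 0, clipped = 0;
    for (int i = 0; i < n; i++) {
        if (px[i] > 700)   bright++;
        if (px[i] >= 1020) clipped++;
    }
    if (clipped * 50 > n) return -1;  /* > 2 % saturated: too bright */
    if (bright * 10 < n)  return 1;   /* < 10 % bright: too dark */
    return 0;
}

/* Split a desired total exposure into shutter time and gain: stretch
 * the shutter first, and only reach for gain once we hit ~1/60 s.
 * Units: microseconds for shutter, gain in 1/1000 (1000 = 1.0x). */
static void split_exposure(long total_us, long *shutter_us, long *gain)
{
    const long max_handheld_us = 1000000 / 60;
    if (total_us <= max_handheld_us) {
        *shutter_us = total_us;
        *gain = 1000;
    } else {
        *shutter_us = max_handheld_us;
        *gain = 1000 * total_us / max_handheld_us;
    }
}
```

The verdict function is what you would run on the one-in-300 pixel sample each frame, feeding small steps into the exposure target to avoid the oscillation mentioned above.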
If you are trying to get a picture, you just scan the whole range, select the position with the sharpest picture, and focus there. I actually do it in two steps: I run a second sweep around the good place. Unfortunately this is not going to work for video recording, so for video recording you want continuous focus. In my implementation, it compares the current position with one slightly nearer and one slightly farther, and if one of those looks better than the current one, it moves the lens in that direction.

So this is my wishlist for the kernel. I am not sure if all my wishes will be granted, but here it is. I would really like the hardware to boot in a useful way: there should be some default configuration for the pipeline, so that the pipeline is usable and I can at least get old applications to run. It would be really nice to support media controller enumeration, so I can ask the hardware which resolutions it can do, which is currently not there. I would like to get absolute units for many controls. Currently Video4Linux basically says, hey, this is a number, you select the number, and you don't know what it means. But photographers want to say, I want 1/100 of a second, right? And that's currently impossible, or at least impossible for some controls and not for others. So I would really like to have units there.

Capture settings for each frame I will explain on one of the next slides. This is the problem: the interface is too asynchronous. You select a resolution and start the capture, OK. The image is too dark, so you would like to increase the gain, and you do. But pictures keep coming, and you don't know which picture was already taken with the new settings, which is kind of bad. This is what the FCam people were fixing with their interface: they provided a timestamp for each frame, which is probably a simple way. Maybe there are better ways.
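Going back to the sweep-based focus for a moment: the single-shot mode described above, a coarse sweep plus a finer sweep around the best coarse position, could look like this. The callback stands in for "move the coil, grab a frame, score a few lines"; the range and step sizes are invented, and a synthetic sharpness curve stands in for real frames:

```c
/* Sharpness callback: in real code this would move the lens coil,
 * wait for a frame, and return a three-line contrast score. */
typedef long (*sharpness_fn)(int pos);

/* Try lens positions lo..hi in `step` increments, return the best. */
static int sweep(sharpness_fn f, int lo, int hi, int step)
{
    int best = lo;
    long best_s = f(lo);
    for (int p = lo + step; p <= hi; p += step) {
        long s = f(p);
        if (s > best_s) { best_s = s; best = p; }
    }
    return best;
}

/* Single-shot autofocus: coarse sweep, then refine around the peak. */
static int autofocus(sharpness_fn f, int lo, int hi)
{
    int coarse = sweep(f, lo, hi, 32);
    int flo = coarse - 32 > lo ? coarse - 32 : lo;
    int fhi = coarse + 32 < hi ? coarse + 32 : hi;
    return sweep(f, flo, fhi, 4);
}

/* Synthetic sharpness curve for testing: peaks at position 100. */
static long fake_sharpness(int pos)
{
    long d = pos - 100;
    return 100000 - d * d;
}
```

The continuous mode is the same comparison applied to just three positions, current, a step nearer, and a step farther, nudging the lens toward whichever scores best.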
A similar issue exists with focus, and it's probably even more critical there, because it's a hardware coil that's moving, and if you wait for it to stabilize, it will slow things down a lot.

So, wishlist for v4l-utils. I would really like it to work on modern hardware, which means media controller support. From the discussions it looks like if I want this, I will have to implement it myself, which is not too great, but maybe I can do it. I would like the conversion code to be usable standalone, so I can convert images from one format to another, and I would like to see 16-bit support, because losing bit resolution is bad. I would also like a way to get the color of a single pixel: the library currently only works on whole frames, but converting a whole frame is so slow that I can't do it for the preview, so it would be nice to have an interface for getting a single pixel.

Wishlist for the world: I would like something like PulseAudio for Video4Linux, so that I can run multiple applications. I would like better RAW libraries, because it seems like everyone uses UFRaw, and that's kind of strange code; it could be better supported, could be more readable, and so on. I would also like monitors and so on to have a consistent color representation, because when I prepared the presentation, the pictures looked really different on different hardware. I believe Apple people and graphic designers have some solutions there, but not me.

OK, bad pixels. Fortunately they are getting rare these days; truly bad pixels apparently come with some older chips. But even today I have pixels that go bad when I turn up the gain. This is how it looks: if you can see, there's snow around the apple. It's badly focused, but that's not the point, because there's no snow there, right?
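Patching a known-bad pixel after capture, and the white balance discussed next, are both small per-pixel operations. A hedged sketch of the two fixes, using illustrative helpers rather than SDLcam's actual code:

```c
#include <stdint.h>

/* Replace a known-bad pixel with the average of its left and right
 * same-color neighbours, which sit two columns away in a Bayer
 * mosaic.  Pixels too close to the edge are left alone. */
static void fix_bad_pixel(uint16_t *raw, int w, int h, int x, int y)
{
    if (y >= 0 && y < h && x >= 2 && x + 2 < w)
        raw[y * w + x] =
            (uint16_t)((raw[y * w + x - 2] + raw[y * w + x + 2]) / 2);
}

/* Per-channel white balance: scale a 10-bit sample by a gain given
 * in 1/256 units (256 = 1.0x), clipping to the 10-bit range.  The
 * R/G/B gains themselves would come from a gray-world estimate or
 * manual tuning. */
static uint16_t wb_apply(uint16_t v, int gain256)
{
    long r = (long)v * gain256 / 256;
    return r > 1023 ? 1023 : (uint16_t)r;
}
```

Lens shading is the same kind of multiply, just with a gain that grows toward the image corners instead of being constant per channel.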
Fortunately this can be removed in post-processing, so let's do it there. White balance matters: this is a picture without white balance processing, and this is the same picture with white balance. To my eyes, and on my monitor, the second one looks significantly better; let's see how it looks on the projector. Next thing, lens shading: the brightness at the edges is actually half of the brightness in the center. You might kind of expect it here, but this was a capture of a uniform blue background, and it doesn't look uniform on the monitor. Again, this is something that can be done in post-processing.

So I guess I will ask for questions now, and then if we have some time I can try to get the demo running, but I already tried, and it didn't work too well. So, questions first.

(From the audience:) Just to mention that your wish for the world is actually arriving: there's a daemon called PipeWire which is coming to Fedora 27, and to the other distributions in the future, and the first feature is actually to be able to multiplex and share the camera across processes. It will also solve the problem of capturing from your compositor.

OK, so thank you. I actually knew about that project but didn't dare to try to run it on my phone, so thanks for doing that. A good commenter said that there's no GLib, and they get a thumbs up for that.

(From the audience:) You're talking about the low speed of Bayer-to-RGB conversion; what about a high-speed converter for previews and a high-quality converter for pictures? I ask because some webcams do this.

I believe that even low-quality conversion is too slow to do in preview if I do it on the CPU. There is hardware that can do it for me, but it's too hard to use. And yes, high-quality conversion will probably be even slower than what I have there. There was some question over there? OK, so I will try to get a demo running, maybe, so a few moments while my phone reboots. Somehow the USB network connection doesn't work, so if you want to see something, I will try to get the demo output on the PC and not on the
camera. OK, so I have an SSH connection to my phone, and this is how the preparation looks. It's still booting, so it's kind of slow. It goes through all the devices, then it performs the default setup, then it tries to get some images, just to check early if something is really broken. And this should be the camera application running, and it is running, but you don't see it because it's on the other screen. So this is the preview; you can see that the autogain kind of works: if we aim it some other way, it will get a reasonable picture. I will try to do this. And there are basic controls, so I can turn off auto exposure and then control exposure manually, select time or select gain. I'm not sure if this will work, but if the autogain works, it should be possible to do autofocus; let's try. OK, that's not really visible, but I achieved my original goal: as you can see, the light works. So thanks a lot for your attention. If you want to come and see, it works better if the light is not pointed against you. Thank you.