Okay. Shall we? Yeah. So, good afternoon everyone. I hope you enjoy my fancy slide deck today — I'm kind of proud of that. So, good afternoon, and as you might read in my background, this is a presentation of the first stable, or sort of stable, libcamera release. We are here today to present a bit of the API we came up with, and to kick off the public review that will start tomorrow — there will be a meeting tomorrow morning. So, yeah, let's start.

A bit of an introduction to the project, if you have not heard about it before. We started one year ago, so we are very young. We are a team of core developers led by Laurent, who started the project one year ago and kickstarted this whole adventure. We are at the point now where we are targeting devices that run mainline, or as-close-to-mainline-as-possible, kernels, because we would like to encourage vendors — if not force them — to stay as close to mainline as possible, and we depend on features which have been integrated in recent kernel versions.

Currently we support a limited but increasing number of platforms. We started with Intel CPUs which feature an IPU3 ISP; we then moved to an embedded Rockchip device, an ARM device, the RK3399. We have support in the pipeline from Raspberry Pi — the Raspberry Pi 3 and 4 — which is not yet mainlined in libcamera, but we are working on that with the Raspberry Pi Foundation guys. We have support, of course, for UVC cameras, because that's what you find on most laptops, and we use the vimc test driver extensively for our test infrastructure. We are looking for new platforms and new devices to expand the scope of libcamera, and for new use cases, so if you're a vendor or a camera producer, get in touch with us and we will be happy to talk with you.

This is just context, for reference: we have a website, which has been updated this week; we have an IRC room where you can talk with us if you want to; we have a Git repository hosted at linuxtv.org; and we have a mailing list where all the development happens, and where you can get in touch and ask questions if you want to.

We've been around this year at all the major conferences, basically. In this talk I have just 35 minutes, so I cannot go to great lengths explaining all the motivation that led to the conception of libcamera, but I've collected a few references to the presentations we gave this year. Laurent, on this very stage — well, not this very one — around one year ago kickstarted the project, presenting the motivation. I was at FOSDEM a few months later, presenting support for the first platform we supported at the time. Kieran was in Bangkok in April and presented support for the second platform, and finally Laurent was on stage again in Japan, presenting the first support for the Android camera HAL — unfortunately I found no reference to the video, but there are slides around. And today I'm here again, to present the public API, which is the outcome of this journey — a one-year-long journey so far.

I'm not going to go to great lengths into this, but a bit of background: we come from a world that was much simpler than what we have today. We come from a world where SoCs didn't have much computational power on the ISP side, so all the image processing was done mostly on the sensor.
The SoC had a receiving port — parallel at the time, maybe CSI-2 — and a DMA engine, and the abstraction towards user space was a single /dev/video0; all applications went through that to control the capture pipeline. There was an abstraction, libv4l, and applications could either talk natively to the V4L2 API or use the libv4l abstraction.

Nowadays the world is much more complex than that. We've seen a big increase in computational power on the SoC side, specifically on the ISP side — this part here, the SoC, is becoming more powerful and can do a lot more things — and that means the sensor on the other side could be simplified and is meant to do what a sensor is supposed to do: capture great images, maybe in a format which can later be processed by the ISP itself. We've seen an increasing number of DMA engines and much more complex hardware pipelines, and Video4Linux has had to evolve to keep pace with this evolution. So we saw the introduction of the media controller API — that was like 10 years ago — and that led to an explosion of user space components. Now we have video devices, we have sub-devices, we have the media controller to control the linking and setup of the pipeline. That requires a lot of tuning in user space, it requires applications to know a lot about the platform, and everybody is doing it a little bit differently, with scripts, with custom applications. So we were missing a piece there: a piece that provides a unified interface for applications to interact with a camera, instead of dealing with single video nodes, single video devices, media links and all of that. That's the motivation for libcamera.

This is just a very simple picture of the stack we imagined at the time. We have libcamera at the bottom of the stack, and on top of that we have several abstraction layers. We have a V4L2 compatibility layer, to guarantee that applications that used to work with libv4l keep working with libcamera — we will see the first drop of the compatibility layer probably next week, and we're very excited about that. We will have abstractions like GStreamer elements, which are in the pipeline for the next months, and we want to keep working on that. We will provide language bindings — libcamera is C++ and we provide the C++ interface, but we imagine providing Python bindings and bindings for other languages. And as of now we have an Android camera compatibility layer, which so far supports the limited device mode, but we want to bring it to support per-frame control in the next months, before the end of the year. And of course there might be applications that use each of these abstraction layers, as well as libcamera-native components that use the API I'm here to present today.

Speaking about the API, we actually have several APIs. We have an API towards public applications, which is the libcamera public API up there, and that's what we're going to talk about today. But we also have internal APIs: one between the device-agnostic part of libcamera and the platform-specific part of libcamera, which of course requires an adaptation layer to the platform, to take care of dealing with media devices, video nodes, links, etc., and we have an internal API for that. We also have an internal API for another component which is essential in the design of libcamera, which is what we call the image processing algorithms, or IPAs.
These are components that are meant to be run in isolation, or integrated with libcamera, to give vendors space to implement 3A and tuning algorithms in a way that is isolated from libcamera, and we have an IPC-based protocol for that which is going to be presented and discussed tomorrow. Again, today we're going to present just the libcamera public API, so if you're a camera application developer, that's what you should care about.

So, beyond this wonderful slide deck, to present this talk we wrote a very simple and over-commented application, which I think is useful to give a grasp of the API we have, and which is available here. The rest of the presentation, which I hope will fit in 25 minutes, is live coding on an application, so I hope not to screw that up too much. Let's move to live coding — or, are there any questions on this introductory part? Okay.

So, I would like to show you how easy it is to write an application for libcamera. I'm cheating a bit: I'm starting from a skeleton application with just comments — we'll go through all of them — and a simple main function. It's C++, directly the C++ API, and I import the standard C++ headers and the few others that are required for libcamera: the only libcamera one is the libcamera.h include directive, plus the DRM FourCC header for the image formats.

The first concept libcamera presents to application developers is the camera manager. The camera manager is a single entity in the application, and when the camera manager runs, it basically creates a list of all the cameras it finds in the system. As I've said, libcamera has the concept of a pipeline handler, which is the device-specific abstraction; each pipeline handler is matched against the media entities it finds on the system and creates cameras by parsing all the video devices it finds. So the first thing you have to do, if you are writing an application for libcamera, is of course to create a new camera manager, and the camera manager needs to be started — and that leads to enumerating all the cameras in the system.

I would like — I should have prepared that; it's better in white, right? Okay. So I would like to compile the application and give you a brief run-through of what happens. When you start the application, we simply print the libcamera version, but behind the curtain something happens, and if we raise the log level we see that when we start the camera manager, all the pipeline handlers we have registered get queried against the system, and each of them registers some cameras. We go through all the pipelines we support at the moment — UVC, IPU3 and Rockchip — and one of them matches the system the application is running on and creates a camera from /dev/media0. Just as an example, if I load the vimc test driver and run the application again, we'll see that an additional camera gets created, because we have a new media device — a new camera in the system — and libcamera knows about that and creates a camera for you.

It's then possible to get the list of cameras from the camera manager. And just as an example again — yeah, I didn't want you to think I have no idea of C++, but "cameras", yes, thank you, that's collective coding, I like that. If we run the application here, maybe with a lower log level to make it less confusing, we see that now we have two cameras. libcamera found the cameras for you, so there is no need to go to the media devices, query all of them, and query the video nodes yourself.
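To make this first step concrete, here is a minimal sketch of starting the camera manager and enumerating cameras. It follows the API as presented in this talk; exact details (such as whether the manager is constructed directly or obtained as a singleton, or the name of the camera identifier accessor) have varied across libcamera versions, so treat it as illustrative rather than authoritative:

```cpp
#include <iostream>
#include <memory>

#include <libcamera/libcamera.h>

using namespace libcamera;

int main()
{
	/* The camera manager is the single entry point into libcamera. */
	CameraManager cm;

	/*
	 * Starting the manager matches the registered pipeline handlers
	 * (UVC, IPU3, Rockchip, vimc, ...) against the media devices in
	 * the system, and registers one camera per match.
	 */
	if (cm.start()) {
		std::cerr << "Failed to start the camera manager" << std::endl;
		return 1;
	}

	/* Enumerate the cameras the pipeline handlers created. */
	for (const std::shared_ptr<Camera> &camera : cm.cameras())
		std::cout << "Found camera: " << camera->name() << std::endl;

	cm.stop();
	return 0;
}
```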
The whole API revolves around a single concept, which is the camera. But what is a camera, actually, for libcamera? A camera is a collection of streams. Streams are sequences of images produced by processing the image stream from a single image source. In the simplest case a camera has a single stream, so there is no processing involved, but depending on the platform's capabilities you can have multiple streams — it just depends on what your platform can do. A simple, reasonable example: you have a single image source, like a sensor, that goes through the ISP and provides you three different streams. One may be the main capture stream; one is the viewfinder, which provides you a scaled-down image for viewfinding; and one is still capture, which maybe does video stabilization or other advanced features for capturing images. It's possible to list the streams of a camera with a simple API as well.

But the most interesting part about streams is how you configure them, because configuring a camera is a tricky operation: it boils down to allocating the resources of the system to each of the streams you are interested in using. Think about a platform with a single scaler, where you want to configure two streams that both use the scaler: that might not be possible, and the platform should prevent you from doing that, or give you back a configuration which is guaranteed to be usable on the camera.

To configure a camera we have a very simple API here as well, which is called generateConfiguration(). So let's try to call it. config is a unique pointer, which comes from the camera — did I get the camera? No. Yeah, so first of all we need to get a camera; we have listed all of them, but I just want to take the first one, so camera equals cm.cameras() — and we take the first one, because it's simpler. We need to acquire a camera before being able to use it. Acquiring a camera means that I lock the use of that specific camera for my application, so no other application can acquire that camera and use it while my application is using it.

So now that we have a camera — and I hope it still compiles... camera, where's that — now that I have a camera, I can simply generate a configuration for it. Generating a configuration, as I've said, is based on the concept of a stream role: I don't ask for a configuration for a camera by specifying "I want this size, I want this pixel format"; I ask for a configuration by saying "I want a stream to use as a viewfinder", "I want a stream to use for still capture". Of course it's up to the platform-specific part to give you back a configuration that supports what you asked for, if it's possible on the platform you're using. Right now we have a UVC camera — it's a simple camera — so we can only ask for a single stream role, and for simplicity we're going to ask for the viewfinder role. You can specify as many roles as you want, within reason, and the camera will give you back a configuration which is guaranteed to be usable. Does it compile? Yes.

So, as we said, a camera is composed of streams, right? A camera configuration is therefore a collection of stream configurations, and for each of them we can tune a set of parameters, like sizes, like pixel format.
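In code, acquiring the first camera and generating a role-based configuration looks roughly like this — a minimal sketch continuing the fragment above, not a verbatim copy of the demo application:

```cpp
/* Take the first enumerated camera and lock it for exclusive use. */
std::shared_ptr<Camera> camera = cm.cameras()[0];
camera->acquire();

/*
 * Ask for a configuration by role rather than by size/format: the
 * pipeline handler returns a template it knows the platform can
 * satisfy for the viewfinder use case.
 */
std::unique_ptr<CameraConfiguration> config =
	camera->generateConfiguration({ StreamRole::Viewfinder });

/* One StreamConfiguration per requested role. */
StreamConfiguration &streamConfig = config->at(0);
std::cout << "Default viewfinder configuration: "
	  << streamConfig.toString() << std::endl;
```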
But before we can use it, let's take a stream configuration from the config we just generated — index zero, since we have a single stream, so we take the first one. "It's in use"? It's fine. It's possible, if you want, to print out the configuration that has been returned by libcamera: that's the default configuration you get for the stream you asked for. So by default I get back a viewfinder-mode configuration with this size and this format. I can go and tune the configuration as much as I like: I can change the sizes — that's pretty trivial stuff — or even the pixel format. For each stream I receive a stream configuration where I can go and set parameters specifically for that stream. If I print them out — if it compiles, again — I get the configuration, which has changed.

Of course this can be done wrong: I can ask for configurations which are not supported by the camera, and I cannot apply an unsupported configuration to a camera, right? So we have an additional step, which is called configuration validation. Each camera configuration is self-validating: when you have set up all the configuration you want, for each of the streams you have requested, you go through a single step which is called config->validate(). If we print the stream configuration after validation, you see we asked for a stream configuration which is not supported — zero is not a valid value for a stream size — and validation adjusts it to the closest possible configuration which is guaranteed to be supported by the camera. Now that we have gone through that, if I ask for a configuration that is supported, it's not going to be changed — hopefully. Okay: if the configuration is valid, it's not going to be changed; if it's not valid, it's going to be adjusted, and what you get back is guaranteed to work on the camera you have configured. With a validated configuration we can now go and configure the camera — that's a unique pointer, so I need to do that.

Q: [inaudible] — Say again? Sorry, I can't hear you.
Q: Just asking if there was a difficulty, and that's why you're printing the FourCC, the DRM FourCC, in hex instead of strings.
A: Why we use the DRM FourCC?
Q: Well, it's the DRM FourCC, but in your debug output you print it in hex.
A: Oh, well, we just need a conversion table from the numerical value to the prettified name — that's nothing more than that. Yeah, I'll keep that, so you can pass that around.

So now we have configured the camera with a set of stream configurations. That means the camera now knows how many streams it should provide you, and what their sizes and formats are. And that means the camera now knows about the memory it needs to allocate for you, to hold the image data of the streams you have requested. So we have a simple allocation step, and the reverse operation, which is releasing the buffers. You don't have to do anything more than that: the camera, once configured, knows how much memory and how many buffers it needs to allocate for you, and from this point on the camera is ready to provide image data to you.

I would just like to show you very quickly that we have very nice documentation — I know it's not fair that I'm the one saying that — and the camera has a very specific state machine that shows you what you can and cannot do at any specific step of the configuration process. Once you have configured a camera you can allocate buffers on it, and you enter the prepared state; from that state you can actually start to request frames from the camera. How do you request frames from the camera?
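Tuning, validating and applying the configuration, then letting the camera allocate buffers, might look like the sketch below. The pixelFormat assignment follows the talk-era DRM FourCC usage, and the allocation call reflects the API at the time of this talk (it was later replaced by a separate FrameBufferAllocator class):

```cpp
/* Tune the returned template... */
streamConfig.size.width = 640;
streamConfig.size.height = 480;
streamConfig.pixelFormat = DRM_FORMAT_NV12; /* DRM FourCC, as in the demo */

/*
 * ...and let the configuration validate itself. Unsupported values
 * are adjusted to the closest configuration the camera supports.
 */
CameraConfiguration::Status status = config->validate();
if (status == CameraConfiguration::Adjusted)
	std::cout << "Configuration adjusted to "
		  << streamConfig.toString() << std::endl;

/* Apply the validated configuration: the camera now knows how many
 * streams it must provide, and their sizes and formats... */
camera->configure(config.get());

/* ...and can therefore allocate the buffers backing each stream. */
camera->allocateBuffers();
```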
libcamera has the concept of a request, and that's nothing new — the Android camera HAL has a request API, and other camera stacks have the same concept. A request is nothing but a set of buffers, created from the streams you are asking to be filled with image data. A request also has a set of controls associated with it, and controls — not a new concept for V4L2 — are tunable knobs that let you specify the brightness, the exposure, or other tunable parameters of your image capture process. A camera is fed with requests, and returns your requests once they are completed; a request is completed once all the buffers it contains have been filled with data.

We start simply by preparing a vector of requests in order to fill the camera pipeline. Each camera has a pipeline depth, so you may have to feed up to, say, four buffers to the camera before it can start streaming: we need to prepare enough requests to fill the pipeline of our camera and queue all of them before the camera starts producing frames. We create this vector; we need the stream, because the stream is where you create buffers from, and we get that from the stream configuration — config — and we get the stream from that, right. And from the stream configuration we know the pipeline depth — bufferCount — so that's the depth of our camera pipeline, and from that we can start creating requests.

Requests are created from the camera, and associated with buffers, which come from a stream. So our request is created from the camera, and a buffer, which is a unique pointer, gets created from a stream. The buffer API is under a bit of rework, but so far you have two ways to create buffers for your stream: you can ask the stream to give you a buffer with a specific index, if you want the memory to be allocated by the camera, or you can provide a set of dmabuf file descriptors, if you are importing buffers from an external location — which is one of the main use cases for cameras nowadays. Now, we have asked the camera to allocate buffers for us, so we ask each stream to create a buffer with a specific index, we simply provide the buffer to the request, and once the request has all the buffers it needs, we just queue it to the camera.

So, again — is it good? Let's make sure it still compiles. It doesn't: streamConfig is a pointer. Good. To sum up the steps again: I create a request from the camera, and I add to the request all the buffers I want to be filled with image data. I'm not showing it here, basically because I don't have the time to, but we can take the set of controls from the request and start tuning parameters there. We have an API based on set and get operations on a class called ControlList, so you simply have a list and you say: I want the brightness of my stream to be 255, or I want the exposure to have this or that level. Each request has a set of controls. Nowadays we don't have anything like the V4L2 request API for capture devices, but as we move towards that model, the controls can be applied specifically for each request.

So now we have prepared a vector of requests — we have filled the pipeline depth of our camera — and we expect to receive a callback when a request is completed.
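Putting that together, filling the pipeline with requests might look as follows — again talk-era API (Stream::createBuffer(), per-request ControlList); the control identifier and setter syntax are illustrative:

```cpp
/* The stream to capture from, and the pipeline depth to fill. */
Stream *stream = streamConfig.stream();
std::vector<Request *> requests;

for (unsigned int i = 0; i < streamConfig.bufferCount; ++i) {
	/* Requests are created by the camera... */
	Request *request = camera->createRequest();

	/*
	 * ...and filled with buffers created by a stream, here backed
	 * by camera-allocated memory. Importing dmabuf file
	 * descriptors is the other option.
	 */
	std::unique_ptr<Buffer> buffer = stream->createBuffer(i);
	request->addBuffer(std::move(buffer));

	/* Optionally tune per-request controls (illustrative id). */
	request->controls().set(controls::Brightness, 255);

	requests.push_back(request);
}
```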
When we have images available, and we want to display them to the user or do whatever the application wants to do with them, we'd expect the camera to provide a callback. The model libcamera uses to register and deliver notifications of events to applications is a signal/slot model, much inspired by the Qt toolkit. A signal is nothing but an event emitted by a class instance, like a camera, and a slot is nothing but a callback that can be registered to handle that signal — it's nothing but a flexible and fancy way to implement callbacks. A camera exposes three signals at the moment: one lets you know when a single buffer is completed, which is very useful if you want to do partial request completion for more advanced use cases, and one is called requestCompleted, which means that all the buffers in the request have been filled with data, along with the associated metadata produced by the capture operation.

In order to use that, we have to prepare a callback, and since I'm lazy I'm going to copy the prototype of the callback from the documentation. It's a static void callback that takes a request — that's the request that has completed — and a map of stream to buffers, which holds all the buffers that contain the data we requested. Once I have a callback like this one, I can simply go and connect it to the camera's requestCompleted signal: the camera has a signal called requestCompleted, and I connect it to my requestComplete callback. That means that every time all the buffers in my request have been filled with data, the application has the requestComplete callback invoked. It compiles? Yes — I'm surprised.

Looking a bit at the camera state machine again: we have been creating requests and preparing the camera, by connecting the signal, to receive data, so now we are in the prepared state, and we can move to a state called running. If you know the Video4Linux terminology, that's basically the stream-on operation. It's very simple to do, and we have a single operation for that, which is called start() — it also has a counterpart, which is stop(), of course. Once a camera is in the running state, we can start queuing requests to it: for all the requests we have prepared in our request vector, we simply queue each one to the camera. Does it still compile? Good — something happens.

So: we start the camera, the camera is running and ready to provide frames to you, and you start queuing requests to it. That's an iterative process, and of course — I removed that, and I've not been smart in doing that — we need an event loop. libcamera provides facilities to integrate with your application's event loop — the GLib event loop, the Qt event loop — or gives you an easy way to create an event loop, which is realized by polling a set of file descriptors, usually the video device file descriptors. I'm sorry, but I'm going to copy that — it's three lines of code, and it's as easy as that. We run an event loop for three seconds, and I just say I want to receive all the events that libcamera provides me, like frame completion. During these three seconds I expect to receive a callback to this function here, requestCompleted, one for each capture request that I've queued to the camera.
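A sketch of the completion slot, the start-up and the poll-based event loop follows. The EventDispatcher and Timer helpers shown are as they existed around the time of this talk; later releases changed how applications drive the loop:

```cpp
/* Slot connected to Camera::requestCompleted: invoked once every
 * buffer in a request has been filled with image data. */
static void requestComplete(Request *request,
			    const std::map<Stream *, Buffer *> &buffers)
{
	std::cout << "Request completed with " << buffers.size()
		  << " buffer(s)" << std::endl;
}

/* ... back in main(), after the requests have been prepared: */
camera->requestCompleted.connect(requestComplete);

/* The equivalent of V4L2 stream-on: move the camera to the running
 * state, then prime it with the prepared requests. */
camera->start();
for (Request *request : requests)
	camera->queueRequest(request);

/* Run libcamera's file-descriptor-polling event loop for 3 s. */
EventDispatcher *dispatcher = cm.eventDispatcher();
Timer timer;
timer.start(3000);
while (timer.isRunning())
	dispatcher->processEvents();

camera->stop();
camera->release();
```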
And if we try to run that, hopefully I should receive four frames back, right? So for three seconds I receive four frames — but I have just four requests, and what should I do when a request completes? Because after that, I have no more requests to queue. When a request completes — requests and buffers are transient objects, use-and-throw-away objects — you queue new requests to the camera as you need them. So when a request completes, you simply queue a new one, doing the same thing we have been doing before. We create a new request from the camera — where do I get the camera from? I made the camera global — so I can create a new request from the camera, and for each buffer which has completed, first I get the stream from the buffers map — it.first, yes, sorry, thank you — and I get the buffer back, which is — I'm using too many b's — buffer b equals it.second. And I want to know its index, because if the request with buffer number i has just completed, I can recreate a request with the same index. I now need to create a new buffer — again, it's a transient object — so I create a new one, associated with index i, and queue it again to the camera. I get a unique pointer, and it's simply the stream that creates the buffer for me with index i, and now I can go and add the buffer to the new request. Once I'm done with that, for all the buffers that completed in the request, I can queue the request to the camera again. Does it compile? Yeah, sort of — because "buffers" is not "stream". And if I run that again, I expect to receive request completion notices for all the requests that I keep queuing for as long as the event loop runs. So it's as easy as that, which is pretty intuitive.

And to show that I'm not cheating — I do have like 10 minutes — I could store a buffer to disk and show it to you, but if you have questions I would prefer to do something more interactive. So if you trust me that I'm actually capturing images, I'm not going to do that; if you have questions, I would prefer to do something more interactive. Go ahead, please.

Q: Hello — okay, sounds better. Do we actually need to create a new request every time, or could we just re-queue the same request again?
A: The thing is that a request is created with a set of controls, and when a request completes it has a set of metadata associated with it. If you don't change any controls, maybe you could reuse the same one, I would say, but it's easy and cheap to create a new request from the camera every time — they are throw-away objects. It depends on what you actually want to do.
Q: Okay. I guess we don't need to free the request, or do we? You just allocated a new request, right? So do we need to do something with the old one?
A: You could reuse this one...
Q: Yeah, but if you don't reuse it, do you need to free it?
A: No, it's deleted after the callback completes. The libcamera core, once the callback completes, deletes the request and all the metadata that you have not used, for you. There's no need to do that manually.
Q: Okay, perfect, thanks.
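The re-queuing logic from the live demo, roughly sketched below, assuming (as in the demo) that `camera` is a global; recall from the exchange above that the libcamera core deletes a completed request once the callback returns:

```cpp
static void requestComplete(Request *request,
			    const std::map<Stream *, Buffer *> &buffers)
{
	/* Requests and buffers are transient: create a fresh request,
	 * and re-create a buffer at the same index for every stream
	 * that just completed. */
	Request *newRequest = camera->createRequest();

	for (auto &it : buffers) {
		Stream *stream = it.first;
		Buffer *buffer = it.second;

		std::unique_ptr<Buffer> newBuffer =
			stream->createBuffer(buffer->index());
		newRequest->addBuffer(std::move(newBuffer));
	}

	/* The completed request is deleted by the core after this
	 * callback returns; just queue the replacement. */
	camera->queueRequest(newRequest);
}
```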
Q: In the future, would more complex scenarios, like a stereo camera that requires synchronization, be considered in libcamera — a stereo camera with multiple sensors, where requests must be synchronized?
A: You could create requests — that depends on the platform support, of course — where you associate two different buffers to a request. One, say, for the viewfinder, which you keep queuing to the camera, and at a certain point, when you want to capture a still, you also associate a buffer for the still capture use case, so the request will complete with two buffers: one from the viewfinder stream and one for still image capture. The same goes for stereo cameras: depending on how your pipeline supports and exposes that, you could receive a buffer with both sensors' images presented as one, or two different buffers that you merge into a single one. It depends on the use case — maybe the other libcamera people have different ideas on that.
Q: In my scenario there are two sensors, so two cameras, and I want to synchronize the two cameras with one request, because I have very tight timing constraints.
A: You could do something like creating a pipeline handler that exposes a single camera for the two sensors and does the synchronization inside the pipeline — that's up to the platform support you want to expose. Conceptually, if you have a stereo camera whose stream is composed of two frames coming from two different sensors, and you would like to expose that as a single camera, you need a single request with a buffer made of two images. You could do that in the pipeline handler, in the device-specific part of libcamera, without changing the API towards the application.

Q: Is there support for data other than image data in the buffers — stuff like sequence numbers and timestamps?
A: Yeah. I could show that; we can print them out — that's in the full application I'm not showing. You get all the information you would receive from the V4L2 layer — the very basic information like the timestamp and buffer sequence number — and, associated with that, a list of metadata which tells you the exposure and whatever controls have been applied to the image that was captured. Again, that depends on the platform: on the UVC camera we get no metadata, of course, but if your platform provides metadata from the sensor or from the ISP, you will find all of it in the request. The request has a method called metadata(), or controls() — I don't remember — which gives you back all the metadata associated with the image. If I run that again — do I have a buffer here? No, I need to... oh, I will show it just because you asked, but it's nothing surprising. Yeah, where's that — one, two, three — oh wow, okay; buffer 19 was not the clearest, where's 19 — oh, sorry, yeah. And you get all the basic information that you get from the video device, like the bytes used, the sequence number and the timestamp. Whatever other questions?

Q: Going back to the configuration model: do you provide some hints about resolutions and formats, or do you have to trial-and-error?
A: No — well, the validation part is basically a trial-based interface, you know: you try the configuration you want, and it gives you back a validated one. And what we thought about as hints is the generateConfiguration() part: it basically gives you a template, right? You ask for a viewfinder on your platform, and the pipeline handler provides you a configuration it thinks is suitable for the viewfinder use case — it doesn't make sense to give you a huge frame in NV12; it will probably give you a YUV format at a more reasonable resolution. So templates are, in our implementation, handled through the role abstraction, and it's up to the pipeline handler to provide you as many roles as it can.
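As a hypothetical illustration of the multi-stream answer above: with a platform exposing a viewfinder stream and a still-capture stream, a request carries one buffer per stream it wants filled, so a still buffer can be attached only on demand. `viewfinderStream`, `stillStream`, `index`, `stillIndex` and `captureRequested` are made-up names for this sketch:

```cpp
/* Every request carries the viewfinder buffer... */
Request *request = camera->createRequest();
request->addBuffer(viewfinderStream->createBuffer(index));

/* ...and a still-capture buffer only when the user asks for one.
 * The request completes once all of its buffers are filled. */
if (captureRequested)
	request->addBuffer(stillStream->createBuffer(stillIndex));

camera->queueRequest(request);
```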
We could go through — well, the stream configuration, indeed. Can you read that? Yeah.
Q: Yeah — hints; that's why I was asking for hints. DRM modifiers?
A: Not yet.
Q: Okay. Stride adaptation?
A: No — just the stride.
Q: Okay, so dmabuf import is kind of not working, right? Okay. What else — well, there's the offset, but that's the same thing again. Okay.
A: Someone else?

Q: Just a quick clarification: if you removed setting the configuration completely, would you get default values back based on the role? So for a viewfinder it would be small, and for still-capture pictures it would be the best the hardware can do?
A: Yeah, that's it, yes. Depending on the role, you get the template back, so I don't need to go and tune the configuration if I don't want to. So, without going to the capture part: I'm not setting any configuration now, but you see that if I print it out, I get these values here without doing any configuration myself. I'm not sure UVC supports it, but if I ask for a different role — I'm being adventurous here, it might not work — it's the same. That part is interesting — I'm not sure if I can show it, that's maybe super spicy — but validation is a virtual method, because it goes down to the pipeline handler, and it's specific to the device. So the validation takes care of the platform constraints and gives you back a template for it. In this case it gives you back the same sizes for still capture and viewfinder; on the IPU3 it would give you a very small frame for the viewfinder and a big one for still capture, in different image formats.

Q: Another question: does libcamera deal with things like encoding? Like, if we wanted to encode into an H.264 stream — codecs?
A: Yeah, that's out of scope for the project, I guess. We've had a discussion about memory-to-memory devices, actually, because that's something close to cameras — not exactly cameras, not exactly ISPs — so there's nothing preventing us from adding support for that to libcamera, but it's not the main target of the project. If you're interested in that, possibly. Any other questions? Over there — could you pass the mic, please?

Q: Is there any image processing support? Say, if you wanted to flip the image, or something like that?
A: Well, yeah, of course, if the ISP provides you the capability to do that. Much like with V4L2 controls, where you set a V-flip or H-flip control, you can send a request with a vertical or horizontal flip control, and depending on the platform capabilities you will get back a flipped image. On a UVC camera, of course, you'd have to do that in software, because there's nothing doing it in hardware, maybe. But if your ISP provides the controls for doing that, you can simply set the control on the request, and you get the flipped image back.
Q: Thank you.
A: Which is this part here, basically. We are growing the controls part, so we're very interested in knowing the use cases people have in mind: we started from the controls defined by Video4Linux, but for image processing they fall short of many use cases. Knowing that you need to turn your image by 45 degrees and rotate it is interesting input for new controls and new use cases for us.
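A hypothetical sketch of the flip example: libcamera defines no such control at the time of this talk, so `controls::VerticalFlip` below is an invented identifier standing in for whatever flip control a platform might expose:

```cpp
/* Ask for a vertically flipped image on a per-request basis,
 * assuming the ISP can do it and a matching control exists. */
Request *request = camera->createRequest();
request->addBuffer(stream->createBuffer(index));
request->controls().set(controls::VerticalFlip, true); /* invented id */
camera->queueRequest(request);
```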
Q: Just another question about controls: is it possible to have a control value evaluated lazily? For example, for video stabilization, you set some parameters when the request is handled, not when it is issued to the camera.
A: When the request is sent, you include the controls in the request.
Q: Yeah, but you know the correct value for that control only when you have processed the previous frame, and you set the control when you queue a request.
A: So you mean — so you might need — yeah, a lazy evaluation of the control. As soon as the request is executed, it is applied to the camera. If you want to apply a delay — we don't have anything like a delay.
Q: In my scenario it's the typical video stabilization problem: I evaluate the previous one or two frames and decide that the next frame needs a lateral shift, a lateral cropping of three pixels. But I've already prepared the buffer, because I have very tight timing constraints, so I can't wait for buffer allocation or something like that. I only want to adjust the last parameters, like the cropping area of the sensor, and setting them late — it can be a quite long operation.
A: Okay. And pay attention — you mentioned buffer allocation: this one here is not allocation, actually, so it's not expensive in the sense that you have to allocate memory; if the pipeline is done right, you don't have to do any remapping inside. It's basically just queuing a buffer, which is — well, it's expensive, but not as expensive as allocating it. So, so far, the latest time you can modify a request is when you queue it, but that's something we should — could — think about. Okay — where's that going to happen, Hans? Where's that going to happen?

So, I think we're running out of time. I will keep the discussion going, but that means we are skipping the breakout session. Feel free to get in touch if you have platforms that you would like to be supported by libcamera — you know where to find us — and if you have more interesting use cases, we will be very interested in them. Thank you.