All right, thanks, everyone, for joining. This is a session about this year's Google Summer of Code within AGL. We had two students, Suchinton Chakraborty and Malik Talha, and here we present their Google Summer of Code projects. I'm Jan-Simon Möller, the lead mentor for GSoC within AGL, and I worked with our two students throughout the year. One is Malik Talha, and the other student was Suchinton Chakraborty. They prepared presentations, and I will show these now.

Hello, everyone. My name is Malik Talha Saeed, and I am from Pakistan. I did my bachelor's in computer science at the National University of Computer and Emerging Sciences. During my four years at university, I have always been interested in AI-based applications, mainly language models and image-based models. Apart from that, I have also worked on web-based and desktop-based development. You can find my personal projects on my GitHub account. This year, I participated in Google Summer of Code as a contributor for Automotive Grade Linux. You can reach me at my personal email, which is shown here.

My GSoC project can be summarized as a voice assistant, which is basically a combination of a Flutter app and a Python-based gRPC server. The project is divided into three main stages. The first one is converting voice to text, the second one is text to intent, and the third one is intent to action. I'll explain these stages in greater detail in the coming slides. Apart from that, I have also implemented a customizable wake word detection capability. Customizable here means that you can choose any specific word, phrase, or sentence to act as the wake word. In this project, we also did voice integration to convert voice to text; this was done as part of last year's GSoC project by Aman. This year, we integrated the Snips and Rasa intent engines to extract intents from the voice commands.

Before getting into any other details, I'd first like to explain the project architecture. As you can see, here is our Automotive Grade Linux system, in which we have our KUKSA.val server running, our voice assistant app running, and our voice agent service running. The user interacts with the voice assistant app, and the voice assistant app has two modes. One is the wake word mode, and the second one is the manual mode. When the user chooses the wake word mode, our voice agent service spins up a GStreamer pipeline. This pipeline continuously runs in the background and keeps processing audio buffers in chunks. It will keep processing until the wake word is detected. As soon as the wake word is detected, we clean up this pipeline and send a response to our assistant app, or in this case the gRPC client, that the wake word was detected (a sketch of this loop follows below).

We have another mode called manual mode. In manual mode, the user starts the recording by pressing a button in our assistant app. As soon as the user presses the start recording button, we create a GStreamer pipeline and continuously record audio until the user presses the stop button. Then we clean up that pipeline and use the audio that was accumulated during the pipeline's run. We convert it into text using Vosk, which is based on Kaldi. Then we extract the intent. For this, we have two available intent engines: one is Snips, and the other one is Rasa. We can use either one of these, depending on the request made by the client.
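To give a feel for the wake word mode described above, here is a minimal sketch of continuous speech recognition with Vosk that scans the transcript for a wake word. This is not the voice agent service's actual code: it captures audio with the sounddevice package instead of a GStreamer pipeline for brevity, and the model path and wake word are placeholders.

```python
# Minimal wake word detection sketch (illustrative, not the AGL voice agent code).
# Assumes `pip install vosk sounddevice` and a downloaded Vosk model directory.
import json
import queue

import sounddevice as sd
from vosk import Model, KaldiRecognizer

MODEL_PATH = "model"            # placeholder: path to a Vosk model
WAKE_WORD = "hello automotive"  # placeholder: any word, phrase, or sentence
SAMPLE_RATE = 16000

audio_q = queue.Queue()

def on_audio(indata, frames, time_info, status):
    # Push raw PCM chunks into a queue, similar to pulling buffers off a pipeline.
    audio_q.put(bytes(indata))

recognizer = KaldiRecognizer(Model(MODEL_PATH), SAMPLE_RATE)

with sd.RawInputStream(samplerate=SAMPLE_RATE, blocksize=8000, dtype="int16",
                       channels=1, callback=on_audio):
    print("Listening for wake word...")
    while True:
        data = audio_q.get()
        if recognizer.AcceptWaveform(data):
            text = json.loads(recognizer.Result()).get("text", "")
            if WAKE_WORD in text:
                print("Wake word detected")
                break  # in the real service, clean up and notify the gRPC client
```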
The client has an option to choose between Snips and Rasa. Once we have the extracted intent in a proper structured format, we map this intent to a VSS signal, and then we finally use the KUKSA client to execute it. The KUKSA client communicates with the KUKSA.val server to execute this intent and perform the relevant operation. So that was the main project architecture.

Now let's go through the project features. First of all, this project is highly extensible and customizable, meaning that if you want to add more intents in the future, you can simply swap out the trained models and our mapping files, and you'll be able to extend the application without changing the underlying code. We'll come back to this later to see how exactly it is extensible and customizable. Apart from that, it is capable of giving highly accurate results if provided adequate compute resources, meaning that if you choose bigger models you'll obviously get better results, but for that you will need more RAM and disk space. The third point is that it is a single gRPC Python server that provides all the functionality, from recording a voice command to executing that command. This Python server can run in the background of your operating system as a background process, and any number of applications can communicate with it, even concurrently. We also use GStreamer pipelines for voice recording, and we have a Flutter app to communicate with the gRPC Python server. This app was mainly made to test our voice agent service; the gRPC Python server itself is basically the voice agent service.

I'll now explain the three stages of this project. The first one is voice to text. In this stage, we are using Vosk to convert our audio to text. Vosk provides multiple supported models and different model sizes for different languages. If you want to run a minimal setup and have constraints on your resources and RAM, you can use the smallest model, which requires around 50 MB of disk space and 300 MB of RAM at runtime. If you want more accurate voice-to-text conversion, you can use a bigger model that requires around 2 GB of disk space and around 6 GB of RAM.

Moving on to our second stage, the text-to-intent stage. This stage is quite important. After we convert our audio to text, we cannot directly execute this text. We need some mechanism to extract information from this text as structured data, so that we can tell our application what action to perform. For this purpose, we have integrated two intent engines: the first one is Snips and the second one is Rasa. Talking about the features of Snips: Snips is a really lightweight engine. It uses the scikit-learn library as its backend for its machine learning operations. It is easy to train and use. The downsides are that it is less customizable and, at the same time, less popular. However, it has really modest resource usage: for resource-constrained environments, Snips only takes around 300 MB of RAM to run. The second choice is Rasa. Rasa is heavier, but it is highly customizable. It uses TensorFlow as its backend, it's quite popular these days, and it's also easy to train and use. Rasa uses around 1 to 1.5 GB of RAM at runtime, but it provides us with a lot of options.
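As a rough illustration of what this text-to-intent stage produces, here is a small sketch using the snips-nlu Python package. The dataset file, intent name, and slots below are hypothetical, not the project's actual training data; the point is simply that parse() turns free text into a structured intent plus slots.

```python
# Illustrative text-to-intent sketch with snips-nlu (not the project's actual
# training data). Assumes `pip install snips-nlu` and an English dataset file
# "dataset.json" describing intents such as "VolumeControl" (hypothetical).
import json

from snips_nlu import SnipsNLUEngine
from snips_nlu.default_configs import CONFIG_EN

with open("dataset.json", encoding="utf-8") as f:
    dataset = json.load(f)

engine = SnipsNLUEngine(config=CONFIG_EN)
engine.fit(dataset)  # train here; a pre-trained engine could also be loaded with from_path()

# Free text in, structured intent plus slots out.
result = engine.parse("increase the volume by ten percent")
print(result["intent"]["intentName"])   # e.g. "VolumeControl" (hypothetical)
print(result["slots"])                  # extracted values such as the amount
```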
For example, we can even choose our own training pipeline flow and things like that. To get more details about how Rasa works and how you can train the model, you can look at the official Rasa documentation; it is pretty well laid out.

The third and final stage of our project is intent to action. This is the stage where we actually use the structured data, the extracted intent, and execute it using the KUKSA client. For this purpose, we have a custom mapper. What this mapper does is consume the JSON structures, the VSS structures that we define initially, to execute our extracted intent. Currently, we only support three actions: increase, decrease, and set. The way this mapper works is that we have two JSON files. In the first file, we define which specific intent maps to which specific VSS signal. In the second file, we describe all of those VSS signals. As you can see, for example here, I have described the Vehicle.Cabin.Infotainment.Media.Volume signal and provided all these details; these are required for each and every intent. As an overview, you can see we have set the default value, the default fallback flag, and the default change factor. We have also defined all the actions that can be used on this specific intent, and we have defined the values block here. In the values block, as you can see, we can define the lower bound, the upper bound, any values that we want to ignore, any additional values that we need, and so on (a sketch of this mapping follows below).

Now moving on to the next step: if you are interested in setting up this project locally on your system, you can easily do that. I have placed really comprehensive documentation in the official Automotive Grade Linux docs; you can go there and see it. Apart from that, you can also find a README file for the meta-offline-voice-agent layer. This layer can be found under meta-agl-devel. For a really minimal setup of this project, you will need around 250 MB of disk space and 3 GB of RAM. But if you want highly accurate results and a proper setup, then you'll need around 2.5 GB of disk space and 10 GB of RAM.

This GSoC project was a really great opportunity for me. I learned a lot of new things in a really short span of time. I was able to learn about the Yocto Project within the five to six months that I worked on this GSoC project. I also learned how to build servers using the gRPC protocol, got to know Flutter apps, how we do state management in Flutter, and how we create apps in Flutter. I also learned how background services in an operating system work. Apart from that, I gained knowledge about KUKSA.val, its databroker and server, and I learned how to make contributions to open source projects.
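To make the mapper described earlier more concrete, here is a hedged sketch of what the two mapping files and the increase/decrease/set logic could look like. The key names, file layout, and default values are hypothetical, not the actual schema shipped in the meta-offline-voice-agent layer.

```python
# Illustrative intent-to-action mapper sketch; key names and structure are
# hypothetical, not the actual files shipped in meta-offline-voice-agent.
intent_to_signal = {
    # first file: which intent maps to which VSS signal
    "VolumeControl": "Vehicle.Cabin.Infotainment.Media.Volume",
}

signal_config = {
    # second file: how each VSS signal behaves
    "Vehicle.Cabin.Infotainment.Media.Volume": {
        "default_value": 50,
        "default_fallback": True,     # use default_change_factor when no value is given
        "default_change_factor": 5,
        "actions": ["increase", "decrease", "set"],
        "values": {"min": 0, "max": 100},
    },
}

def resolve(intent, action, current, value=None):
    """Compute the new signal value for an extracted intent."""
    signal = intent_to_signal[intent]
    cfg = signal_config[signal]
    if value is None and cfg["default_fallback"]:
        value = cfg["default_change_factor"]
    if action == "increase":
        new = current + value
    elif action == "decrease":
        new = current - value
    elif action == "set":
        new = value
    else:
        raise ValueError(f"unsupported action: {action}")
    # clamp to the configured bounds before handing the value to the KUKSA client
    new = max(cfg["values"]["min"], min(cfg["values"]["max"], new))
    return signal, new

# e.g. "can you reduce the volume" with no explicit amount:
print(resolve("VolumeControl", "decrease", current=50))  # -> (signal path, 45)
```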
As for future work, there are a few things we can do to improve this project. First, we can extend the gRPC server to return more personalized responses. Currently, if you issue a command to the voice assistant, it returns a templated response, something like "successfully set the value" or "successfully executed your command"; we can make it return more personalized responses instead. Second, in our current setup we have a really minimal intent model and voice model trained for this purpose; currently our model only caters to three or four intents, but this can easily be extended. The third thing we can improve is to enhance our assistant app for better integration into the AGL operating system. For now, the app has to be open in order to work, both for the wake word and for issuing commands. In the future, we can make it run continuously in the background, so the user can issue commands and it will process and execute them.

I want to give special thanks to my mentors, Jan-Simon, Scott Murray, and Walt Miner. They were always there to help me out whenever I got stuck, and they provided a lot of insight into the existing Automotive Grade Linux infrastructure. I learned a lot of things thanks to them. Apart from that, I'd also like to thank the entire Automotive Grade Linux community for giving me this opportunity and for always being there whenever I needed them.

Now it's time for a live demo of my project. As you can see, I have the voice assistant app running here, and I also have the voice agent service running here. Apart from that, I also have a KUKSA databroker server running here. Let's first get the initial values to see what values for volume and fan speed are currently seeded into the databroker. I'll first check the volume value; as you can see, it is set to 50. Then I'll get the fan speed value; as you can see, it is currently set to 100.

Now let's try changing these values with the help of our assistant app. As you can see at the top, we have two options for the intent engine: one is Snips and the second one is Rasa. If we switch to Rasa, it says "switched to Rasa engine", and if we go back to Snips, you can see it says "switched to Snips engine". There is also a "try a command" section here; you can just tap any of the commands to see whether they work. For example, let's give the command "decrease the fan speed by 5 percent". Let's tap this and see what happens. As you can see, it says it successfully updated the intent value to 95. Let's also verify this on our databroker server: as you can see, the fan speed value was decreased to 95, just as we asked in our command.

We can also reduce the volume. As you can see, in this command we only say "reduce the volume"; we do not specify any specific value. For this purpose, as I showed in the intent-to-action stage of my presentation, I had a JSON structure with a default change factor, which was currently set to five, and the default fallback was set to true. So if you don't provide any value in your command, our voice assistant first checks whether the default fallback is true or false. If it's true, it just uses the default change factor. Let's issue this "can you reduce the volume" command. As you can see, it says it successfully updated the intent volume control value to 45. Let's also verify this: as you can see, it set the volume to 45. So that was a basic demo.
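For anyone who wants to query the databroker the same way the demo does, here is a minimal sketch using the kuksa-client Python package against a locally running KUKSA.val databroker. The host, port, and value are assumptions for illustration; Vehicle.Cabin.Infotainment.Media.Volume is a standard VSS path, but the exact paths used in the demo may differ.

```python
# Minimal sketch: read and write VSS signals on a KUKSA.val databroker.
# Assumes `pip install kuksa-client` and a databroker on localhost:55555
# (host, port, and signal path are assumptions for illustration).
from kuksa_client.grpc import VSSClient, Datapoint

VOLUME = "Vehicle.Cabin.Infotainment.Media.Volume"

with VSSClient("127.0.0.1", 55555) as client:
    # Read the current value, as done at the start of the demo.
    current = client.get_current_values([VOLUME])
    print("volume:", current[VOLUME].value if current[VOLUME] else None)

    # Write a new value, as the voice agent's KUKSA client does after mapping an intent.
    client.set_current_values({VOLUME: Datapoint(45)})
```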
Now let's actually try giving a voice command, because it is a voice assistant after all. So I'll press the record button here. Currently I'm in manual mode, as you can see selected here. I'll record a command; for example, let's give it a value for volume control: "Can you set the volume to 60 percent?" As you can see, it captured the command "can you set the volume to 60 percent", and then it says it successfully updated the intent volume control value to 60. Now let's verify here to see if it really did set it. And yes, as you can see, the value was set to 60 percent. Currently, in this demo, I only have three intents; it's a lightweight model trained on only three intents: the volume control intent, the fan speed intent, and the temperature intent.

That was the manual mode; now let's go to the wake word mode. Currently the wake word is set to "hello automotive", so I'll issue the wake word and we'll see what happens. "Hello automotive." As you can see, the wake word was successfully detected. The assistant is saying "wake word detected, now you can send your commands by pressing the record button". In the earlier stage, you could see that there was a red glowing ball and it was saying "detecting wake word", and you can see the pipeline was created here. The pipeline was continuously reading audio buffers, and as soon as it detected the wake word, it said "wake word detected" and returned our output. This is all integrated into Automotive Grade Linux; you can actually get this project. I think it's currently in the master branch, under the meta-offline-voice-agent layer. So that was it for the live demo. Now we'll have a Q&A session if anyone is interested in asking any questions. Yeah, we'll proceed to the second presentation.

Hi everyone, welcome to my presentation, where I'll be talking about my GSoC contribution to AGL, where I developed a Qt5 application to simulate CAN messages using KUKSA.val. My name is Suchinton Chakraborty and I'm from India. I'm currently a final-year student at Amity University Noida. You can get in touch with me via mail, or we can connect over LinkedIn, and you can visit my GitHub profile for all my projects. I have experience as a Google Summer of Code contributor at AGL under the Linux Foundation, and I've also worked as a software development intern at FOSSEE, IIT Bombay.

My GSoC project was to develop the AGL Demo Control Panel, which is a PyQt5 application that allows the user to interact with the various demo applications that AGL has, using buttons, sliders, and dials. This is done through KUKSA.val. My duration for the project was 22 weeks, and my mentors were Mr. Jan-Simon, Mr. Scott Murray, Mr. Marius Vlad, and Mr. Walt Miner. My weekly reports are uploaded on my personal blog, the code for this project is uploaded on Gerrit, and you can visit the AGL docs for the documentation of this project.

The agenda of this presentation is a project overview, what my project is all about and what it tries to solve, what KUKSA.val is, how the demo setup works and what it does, and some screenshots to go over what my application accomplishes. So, the project overview, what my application achieves: it basically provides an easy-to-use user interface for interacting with the various AGL demo apps using buttons, dials, and sliders, so that you can have a touch-friendly interface to all these applications.
The application is developed using the Qt5 framework, and on the back end it uses the KUKSA.val stack. It also makes use of CAN messages via the python-can module. As you can see in the diagram, the application has dedicated widgets for each AGL demo app, with specific signals allocated to specific widgets, and all of this is handled through a KUKSA client interface and a CAN interface, which can work over a single Ethernet or LAN connection to the AGL demo platform, wherever we're showcasing it, on a separate port.

So what is KUKSA.val? KUKSA.val is basically a client-server interface for handling vehicle information using a standardized data model called VSS, the Vehicle Signal Specification; we're currently on VSS 4. It enables the development of reusable, cross-platform vehicle applications. Some of the AGL applications that subscribe to these signals are the instrument cluster application, the HVAC application, and the steering controls used in your IVI systems. Currently AGL is using the databroker as the server, which is a Rust-based service for handling these VSS signals.

The demo application for my project can be run in a dedicated AGL image, and it can also be run as a Docker container, where you can see the GUI using a VNC viewer, or as a standard desktop application on your personal PC. The instrument cluster and HVAC signals are transmitted using the KUKSA VSS signals, and the steering wheel inputs can be switched between pure CAN messages and KUKSA client messages. The control panel supports both the WebSocket and gRPC protocols, both secure and insecure, for the AGL demo platform, and the user can easily start running scripts and change the vehicle speed and RPM for the instrument cluster application.
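As a small aside on the CAN path mentioned above, here is a hedged sketch of sending a frame with python-can. The interface name, arbitration ID, and payload are placeholders, not the control panel's real steering-wheel frames.

```python
# Illustrative python-can sketch (placeholder IDs and data, not the control
# panel's real steering-wheel frames). Assumes `pip install python-can` and a
# SocketCAN interface such as can0 or vcan0 being available.
import can

with can.Bus(interface="socketcan", channel="vcan0") as bus:
    msg = can.Message(
        arbitration_id=0x123,           # placeholder CAN ID
        data=[0x01, 0x00, 0x00, 0x00],  # placeholder payload
        is_extended_id=False,
    )
    bus.send(msg)
    print("Sent:", msg)
```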
So how does it work? Basically, after you run the main program, the application looks for the default configuration; it also has a fallback config. Using that configuration file, user preferences are loaded into the settings page, where all the values are defined. Once the user starts the service, we create an object using a singleton class, so as to keep the instance the same across the various widgets that access the demo platform. Simultaneously, we run the subscription service on a separate thread, so that any values that are updated on the demo platform are reflected on the control panel as well (a sketch of this pattern follows below). Once the connection is successful, the user is notified visually, and the user can easily navigate across these widgets to change values, start scripts, and manipulate all the relevant parameters.

By default, we launch into the dashboard page, where you have four tiles to navigate between the instrument cluster page, the HVAC controls, the steering wheel controls, and the configuration page. You can also handle the window controls with the close, maximize, and minimize buttons. For the instrument cluster, the user can easily manipulate the indicator status for the left and right indicators, turn the hazard lights on and off, and change the speed and RPM using the sliders provided. We can also manipulate the coolant temperature and the fuel level. The accelerate button is a push-down button: for as long as the user holds it down, the speed and the RPM are adjusted accordingly. The user can also switch the drive mode between parking, reverse, neutral, and drive. The navigation bar at the bottom allows the user to switch between the dashboard, the instrument cluster page, the HVAC and the steering controls, and to go to the configuration easily.

The HVAC page allows us to change the fan speed and the temperature for the left and right vents, for the passenger and the driver side. Then we go to the steering controls: on the left side we have access to the horn, the call controls for accept and decline, the voice controls, and the lane correction; on the right side we have access to the volume buttons, the media playback buttons, and the information button; and we have access to the cruise control settings.

When we go to the settings page, this is where the user loads the default configuration, whatever preferences are predefined in the configuration file, or they can easily change those parameters using the fields provided. The IP address and the port are where you specify where your server is running. You can also change the connection mode between secure and insecure, so SSL encryption is enabled or disabled accordingly. You can also switch between the WebSocket and the gRPC mode for the kuksa-client SDK. In the page settings, you can change the visibility of each widget, and in the steering controls you can switch between KUKSA messages and CAN messages. At the very top, you will find the start and reconnect buttons; they do what they're supposed to, so you can start and stop your client. The status button shows you the connection status; it is highlighted in green, yellow, or red to reflect the status accordingly.
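Here is a hedged sketch of the startup pattern described at the beginning of this walkthrough: one shared KUKSA client held by a singleton, plus a QThread that runs the subscription off the UI thread and notifies widgets through a Qt signal. Class names, signal names, and paths are illustrative, not the actual AGL Demo Control Panel code.

```python
# Illustrative singleton + subscription-thread sketch (not the actual control
# panel code). Assumes `pip install PyQt5 kuksa-client` and a databroker on
# localhost:55555.
from PyQt5.QtCore import QThread, pyqtSignal
from kuksa_client.grpc import VSSClient


class KuksaClientSingleton:
    """Keep one client instance shared across all widgets."""
    _instance = None

    @classmethod
    def instance(cls, host="127.0.0.1", port=55555):
        if cls._instance is None:
            cls._instance = VSSClient(host, port)
            cls._instance.connect()
        return cls._instance


class SubscriptionThread(QThread):
    """Run the databroker subscription on a separate thread."""
    value_changed = pyqtSignal(str, object)  # path, new value

    def __init__(self, paths):
        super().__init__()
        self.paths = paths

    def run(self):
        client = KuksaClientSingleton.instance()
        for updates in client.subscribe_current_values(self.paths):
            for path, datapoint in updates.items():
                if datapoint is not None:
                    self.value_changed.emit(path, datapoint.value)

# Usage in a widget (illustrative):
#   thread = SubscriptionThread(["Vehicle.Speed"])
#   thread.value_changed.connect(my_speed_slider_slot)
#   thread.start()
```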
For the conclusion of this presentation: thanks to this GSoC project, I was able to learn about the development of PyQt5 applications and Qt5 applications in general. I got to learn about design patterns, multi-threading in Qt applications, CAN messages, the KUKSA.val stack, and the various development tools that are used at AGL. I got to test and debug my application on real hardware, a Raspberry Pi 4 that was kindly provided by my mentors, and we also tested with actual CAN bus adapters. I got to learn a lot about the AGL community, its development tools, the whole workflow, and the entire pipeline. All in all, it was a wonderful experience for me. Thank you for your time.

Yeah, and if you want to see Suchinton's project live, it is upstairs in the showcase, where we have the demo set up. All right, that's the conclusion of the talk. Do you have any questions about GSoC in general or about those projects? All right, thank you for joining.