Right, so I want to talk about speech-to-text in Jitsi Meet today. So who here knows about Jitsi Meet? Okay, so not all of you. All right. The Jitsi organization is a set of open source projects. We focus on video conferencing solutions, and we're also a community of developers, so we have contributors from all over the world. I joined Jitsi as part of Google Summer of Code. I participated in 2016 and 2017, and afterwards I joined Atlassian.

So quickly, what is Jitsi Meet? Jitsi Meet is a secure and scalable video conferencing solution for your browser or your mobile devices. You can create a room and share the URL, other people will be able to join, and you can have your meeting. Internally at Atlassian we use it for our team collaboration tools, but other people can use it as well. So here is Edward Snowden using Jitsi Meet. You can also use it yourself: we host an instance over at meet.jit.si, but you can also host it yourself. Go to the GitHub repo and follow the instructions there if you want to do that.

I want to quickly go over a few projects in the Jitsi Meet ecosystem. The front-end application is also called Jitsi Meet. It runs on most major browsers, like Chrome and Firefox, and we also have an iOS and an Android application. The Videobridge is the selective forwarding unit for the video streams and, as in the previous presentation, it's very CPU efficient and the main limiting factor is bandwidth. The most important project for the speech-to-text was Jigasi. Normally it connects the SIP world with a Jitsi Meet conference, and it does this by joining a meeting as a participant. This way we were able to use it for the speech-to-text as well. One other project is Jibri, which you can use to
record a meeting and also stream it to a platform such as YouTube if your bandwidth cannot handle a lot of participants.

All right, so now to the transcription part. As I explained before, I took part in Google Summer of Code, and the goal of my project was to be able to offer accurate transcripts of a meeting. This means you don't have to take notes, and you can also search the transcripts afterwards, for example if you made a decision and forgot what it was. We also wanted to offer real-time subtitles so that hearing-impaired people can still participate in the meeting.

To be able to do this we had to choose a speech-to-text engine. We first looked at open source solutions, but we didn't find anything which met our constraints as of yet. So we had to go for a closed source solution, and we decided on using the Google Speech API.

The way it works is this. On the left you can see a video conference with two Chrome browsers and then, somehow, Internet Explorer. Jigasi is the fourth participant, which receives all the audio, and it can then forward this audio to the Google Cloud API. When the Google Cloud API returns the text of the audio, Jigasi can do a few things. One thing we do is send the results back to the conference, where we can display them as subtitles or put them in the chat. We can also store all the results, and then at the end of the meeting we can give a transcript of everything which was said.

I will now show a video of it in action, if the audio works.

So I'm pretty sure you weren't able to hear the audio, but now you got a live demo of how it would look if you were hearing impaired, for example, or were somewhere busy, maybe in an airport, and you still had to go to a meeting. You could see the subtitles which were displayed while I was talking, you could see that the transcripts are put in the chat, and at the end there was a link where you can see the
transcript. So this is what it looks like if you are running Jitsi Meet, but you can also modify it to suit your own needs. For example, my colleagues implemented something on top of it in our proprietary solution, which I'll show now. You can also see that it works for more than one person in the meeting.

Meeting today, so I'm going to start the transcriber so you can catch up later. Okay, cool. I'll follow the transcriptions live and clean up any issues that show up. Okay, cool. I've got it open. So why don't we go around and do a little round table status update? We can all just talk a little bit about what we're working on. I can start. So I did some testing on the flux capacitor today and things are looking good. I want to spend a little bit more time on it this afternoon, but I think it should be pretty close. Lenny, how about you? Yeah, I've been modifying the DeLorean. I almost have the instrumentation panel all set. I'll just need another couple of days there. How about you, Yana? Well, I've got an idea to turn a Mr. Coffee into a fusion reactor for a power source. I have some designs I want to go over with you guys later. Okay, great. Yeah, why don't we do that after this meeting, and then we can get back to work and sync up again tomorrow morning. Perfect. Sounds good. Great. See you guys. Thanks.

All right, so you could see that we were able to get some transcripts, which is nice. But the most important part is of course how accurate it is compared to what was really said. Here is an example of one sentence in the meeting we had. My colleague said: "I was mostly cleaning up code in Jitsi and Jicofo for Colibri, because I find that it's very hard to understand."
As you can see, the transcript which was returned missed the first three words and then skipped words like Jitsi, Jicofo and Colibri, which kind of makes sense, because those are probably not used as much. There are also some general mistakes, like "cold", and "it's", which is skipped.

We experimented with a few different things, and we noticed that you actually get the most accuracy if you upload your meeting to YouTube: about three out of four words are correctly transcribed. So unfortunately it's not... and the Google API was at about one out of two words.

In the future, we want to look into using Mozilla's DeepSpeech and Common Voice projects, which are open source, which would allow you more freedom as a user, and which would also enable offline use. We also want to look into using contextual data, like what you're talking about in the chat of Jitsi Meet, to try to see which words are commonly used and maybe see if those fit in the context.

Right, so thank you for listening. Are there any questions?
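
The talk does not say how the "three out of four words" and "one out of two words" figures were measured. As a minimal sketch, assuming accuracy means the share of spoken words that show up, in the same order, in the returned transcript, it could be computed roughly as below. The function name `word_accuracy` and the sample sentences are made up for illustration; they are not part of Jitsi or of any speech API.

```python
import string


def _words(text: str) -> list[str]:
    """Lowercase the text and strip punctuation from the edges of each word."""
    cleaned = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in cleaned if w]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference words that also appear, in order, in the hypothesis.

    Assumption: "correctly transcribed" is taken to mean "part of the longest
    common subsequence of words". Other definitions (for example word error
    rate) would give different numbers.
    """
    ref = _words(reference)
    hyp = _words(hypothesis)
    if not ref:
        return 1.0  # nothing was said, so nothing was missed

    # lcs[i][j] = length of the longest common subsequence of ref[:i] and hyp[:j]
    lcs = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i, ref_word in enumerate(ref, start=1):
        for j, hyp_word in enumerate(hyp, start=1):
            if ref_word == hyp_word:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    return lcs[len(ref)][len(hyp)] / len(ref)


# Hypothetical example: three of the six spoken words were recognised.
print(word_accuracy("we sync up again tomorrow morning", "sync up tomorrow"))  # 0.5
```

A score of 0.75 under this definition would correspond to "three out of four words", and 0.5 to "one out of two words".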