Hello everyone. Good morning from Japan, and greetings to everyone around the world joining in. In this talk I want to introduce you to SUSI.AI, a privacy-aware smart assistant. Let's start with a short self-introduction. By education I'm a mathematical logician and theoretical computer scientist; I worked for many years on many-valued logics and proof theory, but I won't bother you with proof theory, I guess most of you would run away. In my day job I'm currently working at Accelia, a Japanese company doing security, machine learning, blockchain, formal verification of protocols, this kind of stuff. In my spare time, as open-source activity, I'm the main developer of TeX Live, the distribution of the TeX typesetting system that Don Knuth developed quite some years ago. I wrote the TeX Live Manager and much of the infrastructure, and I do a lot for Japanese support, since I live in Japan. Besides TeX, I'm also a Debian developer, currently responsible for quite a lot of packages, including the AMD ROCm stack, the Julia programming language, all the TeX packages, the KDE and Cinnamon desktop environments, and quite a lot of other stuff. And when I'm free, I usually bring clients into the mountains: I'm a professional mountain guide, a board member of the Japan Mountain Guides Association, and I do whatever comes up there. First of all, I want to thank my company, Accelia, which allows me to do a lot of open-source and community work and supports me in this area; I'm very grateful. We do CDN, security, web services, virtual reality, and services for engineers, mostly in Japan, of course. I also want to introduce you to the FOSSASIA community, with whom we are developing this smart speaker, SUSI.AI. I think FOSSASIA is one of the biggest open-source communities in Asia. Our aim here is really to bring together a community across all borders. 
We have members from practically all Asian countries and many more, not only Asian ones, and we reach not only across borders but also across genders and ages. I've seen a lot of development communities, and I think this is the most diverse one I've ever participated in; if you look at the photos, you will probably see what I mean. FOSSASIA is not only a community, we also run several big and many small events for engineers and developers. We have the FOSSASIA Summit once a year in Singapore, with a few thousand participants and hundreds of lectures, plus further coding events, competitions, and a lot more going on. At the same time we work with top-class free and open-source hardware and software companies and communities, and we develop a lot of open hardware and open software ourselves. Above all, I think, we try to incubate and accelerate the acceptance, deployment, and development of free and open source, not only in the countries most of us are from, like Singapore, Japan, and India, but all around the Asian continent, which spans quite a lot. In the background you can see a few of the places our members and projects come from. So let's get to smart speakers, the current topic. I cannot ask live who actually has one of these devices at home, but within a techie community I guess most people will have a Google Home, an Echo, or some other smart speaker from Apple or whoever. There are a lot of smart speakers on the market nowadays, with more or less the same functionality and the same problem. They are very handy, I have to say. Personally I use them a lot, mostly for playing music, actually. If I want to know the weather I just look outside, and the news I check in other ways, but still, it's handy. 
There's only one problem, and I think most people will be aware of it: security-wise, these speakers leave quite a lot to be desired. Hopefully things are changing, but over the years there have been many, many cases of, in particular, unauthorized voice recordings and transfers of recordings to other customers. To some degree this is simply necessary for the companies, because they want to improve their voice recognition; their main aim is that the voice recognition is really good, and one of the biggest problems there is background noise, music playing and so on. To improve, you need to check whether the recorded voice and the automatic transcription actually agree, and this is mostly done by humans. So there are humans sitting there, listening to what was actually recorded and comparing it with what the computer spat out, and that gives all the people doing this kind of job access to, how should I say, the private communication of whatever customer was using the device at the moment. There have been a few headlines over the last years, mostly about voice history. Recently the Ring devices became very popular, and there were a lot of cases of bad IoT security, where devices were left open and other people could actually peep into other people's rooms. That makes it quite scary. And we don't know how far smart speakers are open to IoT attacks; maybe some bad actor is actually listening to what is going on. These are the reasons why we became active in the FOSSASIA community: we think this is not something we really want, and not something we want to propose to friends, to the community, to the population. And our answer, of course, is SUSI.AI. It's a different approach; well, not completely different, but different from a company's. So what is this SUSI.AI? It's a personal smart assistant, not surprisingly. 
It's very similar to Siri, Google Assistant, Alexa, whatever their names might be, but it is built from the very beginning to be privacy aware, and it has offline capability. In principle, our aim is that you can get one of these hardware devices, a smart speaker, and it can actually run offline. Of course, some things will not work offline; if you ask about the weather at some remote place, that will not work offline. But if I consider my own usage, most of it is music streaming, and if I put a smart device into my home and have all my music on a local streaming server, I actually don't need internet access. I need access to my home network and nothing else. So we really want as much functionality as possible to be available offline, without a central server being connected. One more thing of interest, because it's something we see with the big two or three: the development of skills there is rather closed. There are APIs, but it's not really easy for a beginner to start developing their own skills. There are companies doing this for their own products, services, and APIs, but users coming up with their own skills is rather uncommon. So we developed a skill language that is easy to learn, something you can edit wiki-style, like on Wikipedia; we will see this later on. And last but not least, it is completely open source, and it's also a completely open ecosystem. We are completely fine with people writing further frontends, further backends, further skills, and using it in whatever way; that's the good thing about open source. That's what sets it apart from most other smart speaker systems. Now for a bit of background: what is the origin of this? 
It actually started with a distributed search engine, YaCy, and a tweet search engine, loklak; there is still a lot of loklak in there. These two were developed by Michael Christen, one of the main developers in our group. Out of that came discussions, back before I was part of it, about why not make a smart speaker or a personal assistant out of this: if you can already ask for tweets about something and it searches for you, and you have a distributed search system, then these things were the starting point, and there are still traces of them in the code, for those interested. One of the things we tried right from the beginning was to get a lot of community involvement. For us it's important to have a lot of people interested in investing their time. We have participated in several Google Summer of Code and Google Code-in events; as far as I remember, FOSSASIA was at times the biggest participating organization in Google Summer of Code. We do a lot of workshops; the most recent was at Ars Electronica, the very famous electronic-art event in Linz, Austria. Several other workshops happened this year, some online, some offline, some with artists; we had several installation workshops to get things running on different software and operating systems. So there's a lot going on. The FOSSASIA community also runs its own Codeheat contest, where people can contribute not only to SUSI.AI but to whatever projects FOSSASIA does, with prizes such as free participation in the next big conference. And we have a lot of Gitter channels where most of the communication goes on. All of this is open to everyone; there is no formal membership in FOSSASIA or anything like that. 
Being a member is about doing stuff and being there. So let's go through the components of SUSI.AI. What makes up SUSI.AI? There are basically three parts. One is the brain: the brain of SUSI.AI is the SUSI server; I will go into the details of each of these later on. Then there are the skills, which I already mentioned: they can actually be developed in a wiki-style way, with a wiki frontend for skill development, since the skill language is rather simple, and there is a large community around them. And then there are various frontends that communicate with both the brain and the skills, for all kinds of operating systems and devices. Let's look first at the brain, the SUSI server. That's the core; it is the interpreter of the skill language. It receives questions and does natural language processing on them, analyzing the questions for things like singular and plural, and then interprets each question within the skill language, checking whether a skill matches or not. The SUSI server has further properties, like reflection and introspection: it can look back at the history of questions asked before, per user. This is not really used by now, but long-term introspection can be handled in principle. Other things on the more technical, necessary side are user and device management: you can use it anonymously, you can register as a user, and you can register your devices; all of this is optional, in fact. There are several deployments of this. As I mentioned before, privacy comes first, but we do have a deployment on the web, at susi.ai, the website down here, which is of course run on one of our servers. 
If you connect there, then, well, we don't save the logs for long, but what you ask at least has to be evaluated on our side. The point is that it is easy to make a local installation, and we actually suggest that everyone do one. Then you have a local SUSI server running on a little device, it's not so hard to run, for example on a Raspberry Pi, and it runs for you in your home. You have your brain at home; you don't need to contact any external server to get answers. So local installations are definitely possible, and that's what we suggest. Our typical hardware device, which I will talk about later, is Raspberry Pi based, and there too a SUSI server is running. So much for the SUSI server. It's written in Java, and it is quite an extensive code base, considering it has to handle the full skill language. It also has some skills built in, but the biggest parts are the evaluation of the language, reflection, and user and device management. Now the skills; other smart speakers also call them skills. They are typically question-answer items, and there is a huge collection, really huge, of user-contributed ones. Unfortunately, I have to say, just as you have probably seen badly written Wikipedia articles, everyone can create their own skills, so there is quite a variety in quality. That's why we have vetted skills, or system skills, and quite a reduced set enabled by default; but in principle there are a lot of skills out there. These skills also allow for API usage. Of course, the moment you start using external APIs, your privacy will be compromised in some way, because there is a connection from an IP address near you to some server. But it's useful, right? 
For example if you want to ask questions about the weather, exchange rates, or whatever comes to your mind. For the skills we have quite a lot of tooling: you can write them by hand in whatever editor, there is wiki-style editing support on the web, and there is a testing environment. The skill language itself is pattern matching from questions to answers and API calls; I won't go into details, since this is not a workshop about writing skills. Just to give you an idea, the skills are on this website; these are the skills people have uploaded. As I mentioned before, if you have your own server at home, you can also have your own skills; there's no need to publish anything. You can keep all of this on your local network, on your local SUSI server, with your local skills only there. The published ones have been developed by the community; some are rather trivial, some are a bit nicer. If you're interested, just look around. Okay, the third part is the frontends, and there are quite a lot. We have one for Android that is working; it also has the ability to configure our smart speaker device, which I'll talk about later. There is an iOS version in the works; I'm not sure how far along it is, because it's not what I'm managing. We have a desktop version; at the moment it runs on Linux and partly on Mac, where we have already tested a bit. We haven't tested it on Windows so far, mostly because a lot of Python modules have to be installed. You see here an application window, but of course there is also the option to run it in the background, like Cortana or any other desktop integration of a voice assistant: you would have an icon in the notification area where you can turn recognition on and off. Deeper integration into the desktop is something I will talk about later. 
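To give a concrete flavor of that pattern-matching skill language, here is a minimal sketch of a skill file. The general shape (metadata lines starting with `::`, then pattern lines followed by answer lines, with `*` wildcards captured as `$1$`) follows the SUSI skill format; the specific skill name, author, and phrases are invented for illustration.

```
::name Greetings
::author Example Author

hi|hello|good morning
Hello! How can I help you?

my name is *
Nice to meet you, $1$!
```

A question like "my name is Alice" matches the wildcard pattern, and the answer template fills in the captured text, producing "Nice to meet you, Alice!".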
Desktop integration is still on the to-do list. And we have a web frontend: you can just chat on the website susi.ai with the SUSI server, with the chatbot there, and ask it some questions, which is what I did yesterday. Be aware, of course, that if you use the Android version or the web version, both of them connect to the SUSI server on the web, not to a local one, because there is no local one in those setups. The desktop client will try a local server first, if there is one and it is detected; otherwise it will connect to the one on the web, and of course you can block that too. Those are the frontends; there are probably some more in development. I'm mostly responsible for the overall system and for the desktop clients, since I'm not very much into Android development. So what does privacy aware mean? As I mentioned, first of all we try not to send anything out. Speech-to-text is the biggest problem, I guess, because translating speech to text is challenging, and if you look at what resources good speech-to-text systems use, it's surprising. By default we use DeepSpeech, a project by the Mozilla Foundation, which actually also works on the Raspberry Pi, even on a small one; on the desktop it works excellently. Unfortunately we don't have a lot of language models by now; I hope this might improve, but the future of the project itself is a bit critical, I also have to say. If you don't mind, you can of course choose to use Google, Bing, or Watson speech-to-text instead; these are configuration options. They still provide much higher accuracy than DeepSpeech; Google in particular is very good. For Bing and Watson you have to log in and have some subscription, as far as I remember; Google speech-to-text is free at the moment, though you never know with Google, right? You can use it freely. 
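The backend choice just described, on-device DeepSpeech by default, cloud services only as an opt-in, can be sketched as a small dispatch function. The backend names mirror the services just mentioned, but this particular function and its configuration keys are a hypothetical illustration, not the actual SUSI.AI code.

```python
# Sketch: pick a speech-to-text backend from configuration,
# preferring the on-device engine unless cloud use is opted in.

PRIVACY_PRESERVING = {"deepspeech"}            # runs fully on-device
CLOUD_BACKENDS = {"google", "bing", "watson"}  # audio leaves the machine

def choose_stt_backend(config):
    """Return the STT backend name, defaulting to the offline engine."""
    backend = config.get("stt", "deepspeech").lower()
    if backend not in PRIVACY_PRESERVING | CLOUD_BACKENDS:
        raise ValueError(f"unknown STT backend: {backend}")
    if backend in CLOUD_BACKENDS and not config.get("allow_cloud", False):
        # Fall back to the offline engine unless the user opted in.
        return "deepspeech"
    return backend
```

With an empty configuration this returns `"deepspeech"`; a cloud backend is only used when the (hypothetical) `allow_cloud` flag is set.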
Google's service works quite well, as long as you have an internet connection and trust that your data is not kept forever. For text-to-speech we use, again by default and on-device, the flite system, which is okay: you understand the answer, the pronunciation is not the best, I would say, but mine is also not that good. Again, if you don't mind, you can use alternatives like Google or Watson; it's just a configuration setting. For those less concerned about this, it's rather easy to switch to Google. Of course, how much information about you gets out also depends on your usage of the device: if you connect to a lot of APIs, then a lot of information will also go out. As for the server, as I said, we have one deployed on the web for everyone to use, but we suggest running your own private installation, because then you can be quite sure about your privacy: the questions you ask, the answers that come back, the log files, everything is available to no one but yourself. For those interested in the development, everything is open source; you can get it on GitHub under the FOSSASIA organization. Practically all the relevant project names start with "susi". susi_installer is, so to say, what binds everything together: it installs onto a variety of Linux systems; at the moment we support Debian, Ubuntu, Fedora, Mint, and a few others. It's not everything, because we have to install quite a lot of stuff, but it tries to be as resilient as possible. susi_server is the development of the SUSI server in Java; there are also deployment branches with the ready-compiled Java files and startup scripts that can be used. susi_python is the interface library to talk with the SUSI server, interpret the answers, and rebuild them into Python objects. susi_linux is the client for Linux, but also for the Raspberry Pi. 
It's used across all these systems. It is written in Python, it does the voice recognition, it handles the communication with the server via susi_python, and it has lots of other features, like alarms and that kind of thing. susi.ai is the web interface that is hosted on the web; on the Raspberry Pi there is an even more advanced version where you can register and control your device. There are many other repositories within the FOSSASIA organization, because we have a lot of other projects too. One of the important ones for us is a fork of the speech_recognition Python module, whose upstream development unfortunately seems to be up and down. I implemented support for DeepSpeech within this fork, so that we can use DeepSpeech, Google, and all the other STT and TTS services supported by speech_recognition. If you look around and have questions, I'm happy to answer them. That's all for normal computers, desktops, whatever, but that's not all. We have actually put together, still a rather rough device, I have to say, a smart speaker built with the SUSI.AI system. These are the devices we are using; I actually have three or four here next to me. They are Raspberry Pi based; I think the first version ran on a Raspberry Pi 2, but nowadays, with DeepSpeech, that doesn't work out, so a Raspberry Pi 3B+ or 4 is much better. We used the ReSpeaker HAT for the microphone, and it handles the audio output at the same time, though that could be done differently. Just to make it nice, we have this 3D-printed cover with SUSI.AI on it that puts it all together with the loudspeaker. All the components I mentioned before, the server, the Linux frontend, and the voice recognition, everything runs on the Raspberry Pi. The installer just sets everything up so that you can plug it into your environment, and it starts normally by default. 
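To make the role of susi_python more concrete: in essence it queries the server's chat endpoint and turns the JSON reply into Python values. Here is a simplified, hypothetical sketch of that idea; the `/susi/chat.json?q=...` endpoint shape and port 4000 reflect the public SUSI server, but the exact response layout and these helper functions are illustrative assumptions, not the real library API.

```python
import json
from urllib.parse import urlencode

# Base URL of a local SUSI server; adjust to your own installation.
SUSI_BASE = "http://localhost:4000"

def chat_url(query):
    """Build the chat endpoint URL for a given question."""
    return f"{SUSI_BASE}/susi/chat.json?{urlencode({'q': query})}"

def extract_answer(raw_json):
    """Pull the spoken answer text out of a chat.json reply.

    Assumes the reply nests answers -> actions -> expression,
    the general shape of SUSI server responses.
    """
    data = json.loads(raw_json)
    for answer in data.get("answers", []):
        for action in answer.get("actions", []):
            if action.get("type") == "answer":
                return action.get("expression")
    return None

# Canned example reply, so this sketch runs without a server:
reply = json.dumps({
    "answers": [{"actions": [{"type": "answer",
                              "expression": "Hello, I am SUSI."}]}]
})
```

A client would fetch `chat_url("hello")` over HTTP and feed the body to `extract_answer` to get the text to speak.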
After a factory reset it sets up a Wi-Fi hotspot that allows you to configure the Wi-Fi password, so you don't need any cable. After that it connects, but you can also leave it offline, because, as I said before, it works right out of the box, completely offline. Just one last thing: I think two days ago I read an article that on Google's devices you can finally schedule actions. I didn't know that wasn't possible until now, because we have actually worked on this; it's not complete, of course, but we have some support for timed and delayed functionality. As I said, with my Google or Alexa devices I usually only play music, but we try to have functionality that matches and supersedes what the other companies are doing, and we have had timers for quite some time now. Okay, nothing goes without hurdles, and it was not only the coronavirus we had to deal with. You can imagine that such a huge project has quite some complications, and I think it's interesting to know what these problems are. First of all, it's a huge area: we are covering backend, frontend, speech recognition, machine learning, all kinds of stuff, and everything is covered by volunteers, which of course reduces development speed. Then, a lot of the development has happened through coding programs like Google Summer of Code and our own coding contests, and participants come from widely diverse backgrounds, diverse culturally but also in development skills. We have to deal with these problems, and it also makes the code base quite interesting, because it's a wide mixture: you get pull requests from students all around, some excellent, some good but not excellent, and not all of them follow 100% the same coding standards. 
Of course we have requirements and coding standards, but everyone working in real life is probably aware of these problems. One of the big problems, actually, is short-term versus long-term involvement. Most participants stay only for the duration of these coding programs; many of them want the certificate that they completed Google Summer of Code or our own Codeheat contest. Some of them stay, and we are of course happy, and they enjoy the development, but getting really good people for long-term involvement is really difficult. There are also questions more on the technical side. The popularity of programming languages has recently become a real issue: Java is still one of the most used languages, but popularity-wise we don't get many students wanting to do Java, while a huge number do JavaScript and Python. So development on the server side is quite a bit slower than the changes going on on the frontend and Linux side. Then there is the moving target that is Python. This is really a problem, because every new version breaks half of the system; due to binary incompatibilities everything has to be freshly installed. And then, we have recently relied on DeepSpeech a lot, but as some of you might know, Mozilla has decided to scale down a lot of its development efforts, and one of the affected projects is DeepSpeech. So it is not completely clear what the future of DeepSpeech is. We hope there is one, because we think it's a great project: Mozilla has done a lot of work on open language data, collecting spoken words for speech-to-text recognition, and it would be great if there were some continuation. Okay, so what's on our roadmap? Very easy, right? One SUSI.AI for each household. That is our dream; we are still a bit far from it, but long-term that would be nice. 
Currently we are really working on making this device work just like that, out of the box. If our developers have the device installed and something does not work immediately, they can fix it; that's easy, but that is not what we want to hand to other people. Everyone can try it out, that's no problem, everything is documented, and we already publish ready-made Raspberry Pi images: you just dump one onto an SD card and it should work. But we want to make it easier for people to get the device and play with it. What I'm also currently working on is desktop integration: as I mentioned before, a nice icon in the notification area where you can turn voice recognition on and off, integrated with the actions of the desktop environment, be it KDE, GNOME, Cinnamon, or whatever comes up. That would be nice. Then skill management: as I said before, there are a lot of skills, some are system skills, some are vetted or otherwise good. It would be really nice to have management that lets you activate skills on certain criteria: only a very restricted set, or all skills if you want to be explorative and try everything, or just those vetted by our main developers. There is something to do here. Then the server mesh: as I said, we really want people to install their own servers, for privacy reasons, and then it gets interesting with user registration. Imagine you have a smart speaker, or maybe two or three, like I have, each one running its own SUSI server, and maybe another SUSI server running on your NAS or wherever for the rest. Here it would be nice to have a server mesh that negotiates which is the main server. 
There would be one main server in the household, and that one is used, and the registration of usernames and devices of the family members is done only there. This is something we would like. And last but not least, steady development: since we're all volunteers, working on these projects while real life puts bread, or rice, on the table, it's not that easy to have steady development, but that is something we hope for. If you want to get in contact with us, the best way is probably the Gitter channels. There is the main fossasia/fossasia channel, which is just a catch-all for whatever questions you have, and for FOSSASIA announcements in general. And then there are a lot of fossasia/susi channels, named according to the project. At times there is very high activity about development, depending of course also on the input. And whenever you want, you can contact me personally at one of these email addresses. And yeah, that's all from my side. Thanks everyone for your attention and for joining in, and I'm now open for questions and answers. Okay, thanks everyone.