All right, it's just about time to get started here, so we'll go ahead and crank this up. Welcome. My name is Mike Anderson, and I'm the chief scientist for The PTR Group. Just who are The PTR Group, and what do they do? Our big projects right now are all related to NASA robotics. I'm hoping at some point to get our good friends at NASA to give me permission to talk about what we're doing in a public forum, but suffice it to say, it's a project called Restore-L, and it's all about being able to come up underneath Landsat 7, dock with it, and refuel it in orbit. So all the robotics, all the controls, everything, and it's running Linux. How about that?

We also do a lot of flight software; we're basically in the business of writing flight software. We do a lot of work in the IoT as well, and it was actually our work in the Internet of Things that brought me to this particular project. We've spent a lot of time doing real-time operating systems and, of course, running Linux on lots of very small embedded platforms. We're also into offensive and defensive cyber operations. The defensive part is pretty obvious; I'll leave what offensive cyber operations are all about to your imagination.

That's me. I've been in the business 40 years this year. How about that? I got my first job as a programmer in '77, go figure, and I've been in the real-time industry most of that time. So it's been an interesting transition, watching embedded systems go from IBM System/360 Model 30 mainframes that would fill this room down to things like the little laptop I have here running a Raspberry Pi.

Here's the plan: we'll talk about why you would want to do this at all, then we'll get into exactly how the Amazon Echo works, how it does what it does, and how you can go about doing voice enablement on your own devices. Then we'll look at setting up the system, what you have to do to register with the Alexa Voice Service, and getting an application up and running, and then we'll hit the summary.

So why do this? Well, it turns out that in our Internet of Things business, we had a customer come to us and say, "I want to do voice enablement of plumbing fixtures." And I go, "You want to do what?" So the concept of "Alexa, clean my toilet" is not too far away, I think. We'll see how that plays out. But trying to think through the use cases for voice enablement sent me down this path: OK, how do you do voice enablement on a platform? What does that mean? How do you handle the voice recognition? Traditionally, voice recognition has been a major problem for a lot of systems and has required a lot of horsepower. So how do you voice-enable a plumbing fixture? How do you do that, and exactly what is the process?

So I managed to get my hands on an Echo Dot, which is actually about the same size as this. There are really two Echoes: the first one over here is the original one that went on sale, and that one's got a speaker built into it. The Echo Dot, on the other hand, is a very small device focused just on the voice recognition piece. It does have an option for a Bluetooth speaker.
It also has an option for a normal plug-in speaker if you do, in fact, want to do music and things of that sort on it. But it is an interesting little device. iFixit did a pretty decent teardown of it, and it turns out the main thing that drives it is a TI DaVinci part, a DM3725 or DM3730, something in that family; I think I have the exact part later in the presentation. When you look at that, you go, OK, it's basically a BeagleBone Black with some audio processing attached. How hard can that be?

One of the first things you have to think about when we're talking about voice recognition and voice enablement is why. Why would you do this in the case of, say, plumbing fixtures? Maybe your hands are covered with chicken slime and you don't want to touch the handle, because then you'll transfer salmonella or something else to it. Those are the kinds of things you really have to think about: what is the use case here, and what am I trying to do?

You also have to understand, and we'll get into this a little further into the presentation, that the voice recognition in this case isn't actually done on the Echo, and it's not done on the Raspberry Pi either. It's done up in the cloud. So one of the key pieces you have to have to make all this work is an internet connection, and you have to get the device associated with your network and do all that wonderful, happy stuff.

Now, think about a device that is really an appliance, like that guy right there: there's no mouse, keyboard, or anything like that on it. Provisioning these devices is a perpetual problem in the IoT. We've seen this time and time again, for instance using Silicon Labs silicon and their Mighty Gecko parts to do Thread and a couple of other things: provisioning is a major problem. How do you get credentials onto it? How do you get certificates onto it? How do you make sure it's associated with your Wi-Fi, that it has the right SSID and password? All of that has to be done on a device whose only mechanism for interacting with it is a button.

So in many cases, what a lot of manufacturers do is enable these devices with Bluetooth, often just using the serial port profile, SPP, so the device looks like a serial port. You download an app from the app store, you start the app, you plug in the device, and it starts looking for something to pair with. In many of these cases, especially today, it's all being done with Bluetooth Low Energy, BLE, and these provisioning schemes typically skip any real pairing step: the device just says, "Hey, here I am, what can I do for you?" and your mobile app provisions it. The mobile app is the thing that actually downloads the certificates and gets the SSIDs and all that sort of stuff taken care of. We're finding that that model works well for the people in this audience; it does not work well for my grandmother. She has just discovered what a smartphone is, and unfortunately she's now on WhatsApp, so I get lots of traffic from her.
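Just to make that provisioning flow concrete, here is one common shape for the device side, sketched with the Node.js bleno library since we'll be living in Node.js later anyway. Everything in it, the UUIDs, the JSON payload, the characteristic layout, is invented for illustration; a real product defines its own provisioning protocol:

```js
// Sketch of BLE-based Wi-Fi provisioning on the device side, using the
// Node.js 'bleno' library. UUIDs and payload format are made up for this
// example; a phone app would write {"ssid": "...", "psk": "..."} to us.
const bleno = require('bleno');

const SERVICE_UUID = '7e570000000000000000000000000001';   // hypothetical
const WIFI_CHAR_UUID = '7e570000000000000000000000000002'; // hypothetical

const wifiCharacteristic = new bleno.Characteristic({
  uuid: WIFI_CHAR_UUID,
  properties: ['write'],
  onWriteRequest: (data, offset, withoutResponse, callback) => {
    const creds = JSON.parse(data.toString('utf8'));
    console.log('Provisioning for SSID:', creds.ssid);
    // Here you would write wpa_supplicant.conf and trigger reassociation.
    callback(bleno.Characteristic.RESULT_SUCCESS);
  }
});

// Advertise as soon as the adapter is up; the phone app finds us by name.
bleno.on('stateChange', (state) => {
  if (state === 'poweredOn') {
    bleno.startAdvertising('provision-me', [SERVICE_UUID]);
  } else {
    bleno.stopAdvertising();
  }
});

bleno.on('advertisingStart', (err) => {
  if (!err) {
    bleno.setServices([
      new bleno.PrimaryService({
        uuid: SERVICE_UUID,
        characteristics: [wifiCharacteristic]
      })
    ]);
  }
});
```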
Nonetheless, what's it going to take here? You have to have a microphone, a speaker, and an internet connection. One of the things I found is that the microphone piece of this is an interesting challenge; we'll talk more about that a little later.

There is also a wake word associated with this, and any voice recognition system you run into will typically have some magic wake word. Fortunately, the wake word can be changed. I have one at home where the wake word is now "computer," and then it wakes up. But the downside of the wake word is that the system is always listening. It's always waiting for that magic word, or for something that sounds close enough to it that it suddenly wakes up. So if you're a privacy fan, these types of devices are probably not for you, because they are listening all the time and connected to the cloud most of the time. As a consequence, anything that comes close to the wake word will open up a connection to the cloud and start doing voice processing. If you're the paranoid type, this is probably not a good thing for your house. It's one of those cases of: all right, this is kind of cool, you can call an Uber and order pizza and a few things like that, but in the long run, is it something I want in my house? Being a security guy, it makes me very nervous, so I unplug it most of the time. But every now and then, it's fun to play with. It's not much different from your cell phone, except that in this case it's automatically doing the cloud processing all the time. Then again, if you're paranoid, you shouldn't have a cell phone anyway, so let's just unplug, go out into the middle of Canada someplace, and wait for the beavers to show up, I guess.

In any case, let's move on. Once you have made the connection to Amazon's web services, and of course this is the Amazon Echo, so everything has to involve their cloud system somehow, you get access to what they call skills; there are several thousand of these things now. One of the things about the way the Alexa system works is that I can develop my own skills, meaning I can add new capabilities to the device. So if I wanted to have, and this is one of the things I'm working on in my hotel room, an Alexa-enabled robot where you just say, "Alexa, drive forward," and it starts moving, completely voice controlled, you can develop that as a skill. Skills can be written either in Node.js or in Java; those are the two flavors they have right now. Your device then implements the skill.

So basically the flow is: my device is listening, I say the wake word and a command, the device takes the voice it just recorded and sends it up to the cloud, the cloud does the audio processing, figures out which skill the utterance relates to, and sends a command back. When the command comes back, your device does whatever your skill said it's supposed to do. This implies lots of interesting things, also from a security perspective, because now I have a message initiated from inside my firewall going up to the cloud; the cloud does some processing and sends me something back. That connection has to stay open for the cloud to send me something back, and as soon as it does, I'm supposed to go off and do something else, like order pizza or send for an Uber. So the whole process of implementing a skill is as secure as they can make it at the moment.
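To make the skill side concrete before we go on, here's roughly how a skill like my robot example gets described to the cloud. The intent and utterance names below are hypothetical, and note that a custom skill actually needs an invocation name, so in practice the phrasing ends up more like "Alexa, tell the robot to drive forward." In the Alexa Skills Kit of this era, you declare an intent schema:

```json
{
  "intents": [
    { "intent": "DriveForwardIntent" },
    { "intent": "AMAZON.StopIntent" }
  ]
}
```

And you map spoken phrases onto those intents with sample utterances, one per line:

```
DriveForwardIntent drive forward
DriveForwardIntent move forward
DriveForwardIntent go straight ahead
```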
But again, on the privacy point: if you're paranoid, you probably don't want one of these devices in your house.

Now, originally the Alexa Voice Service, AVS, was launched here in the United States, and it was only available in the US. As long as you spoke US English, you could use the service; if you spoke British English or something else, you were just out of luck. Fortunately, in the past couple of months they've rolled out new voice recognition engines that recognize UK English and German. So I guess they decided those were the two high-tech places they wanted to go first. Obviously they're going to do enablement for other languages, certainly Mandarin, probably Japanese and a few others, but at this point the only three regions currently supported for the Alexa Voice Service are the US, the UK, and Germany. If you happen to speak US or British English in some other country, you can still use the service, but it connects back to the US servers to do its processing. Obviously that implies some delay, and there are issues with data sovereignty: some nations don't like the idea of servers in the US touching any of their data, so doing all the processing there has its own set of implications. Each additional region is going to need its own speech engine built. They're working on these, but they don't have a kit out at this point for developing new speech engines; that's all being done internally at Amazon.

Now, in general, the design guideline before you try to add voice recognition to any project is to work through the use cases. Why do you want this? What do you think you'll be able to do with voice recognition that you can't do with something simpler, like a cell phone or a button? We've seen this kind of creeping elegance in the IoT before. I have my Philips Hue light bulbs, and when I walk into my room, I have to have my cell phone to control the color of the bulbs or to turn the lights on. It's one of those cases of: you know, they invented this thing called a light switch, and it's worked really well for the past 100 years. It still seems to work, but now I have to have my smartphone to turn on the lights. Interesting approach, but this is part of the creeping elegance that goes along with the IoT. They basically have solutions looking for problems, and in this particular case, yes, they have an interesting solution in voice recognition that's looking for use cases. They're always trying to come up with new problems to solve.

As for the Amazon side, the automatic speech recognition engine feeds into Alexa's natural language understanding engine, and that's where all the heavy processing for voice recognition is done. All of that happens up in the cloud; fortunately, your device doesn't have to do it.

To get started, you do have to register at amazon.com as a developer. Registering as a developer doesn't cost you anything; you just supply a user ID, a password, and your email address. Then the next thing you have to do is start learning what Alexa skills are all about. They have this thing called the Alexa Skills Kit, and it takes you through the process.
There are two versions of it: one for JavaScript with Node.js, and one for Java. You download whichever kit you're interested in and start going through it. It describes what a skill is, how it's implemented, and what you have to do to turn on a new skill.

They do have several different classes of skills, if you will. There are public skills: if you've developed this really cool skill, like driving a robot, and you want to put it out in the public space so everybody can see it and use it, you can declare it as a public skill. Alternatively, you can declare it as a private skill, and as a private skill, nobody can see it except you. So you typically want to develop as a private skill, and if you decide at some point in the future to make it public, you declare it public then.

There are, fortunately, several example skills on the Amazon developer site. As I say, several thousand of these things have been developed by now, ranging anywhere from controlling a Philips Hue light bulb, to controlling the Nest thermostat, to ordering pizza from Domino's, to getting an Uber. There are lots of these skills connected, all of them seem to be operational at this point, and people are using them.

Now, the cloud part of this. When you're developing a skill, there are two pieces. There's the piece that runs on your local machine, which is basically the thing that understands: OK, I've got this new skill; I'm going to implement turning on the light bulb, or I'm going to implement changing the thermostat temperature. Then there's the cloud part, and this is where the Node.js piece comes into play. I'm going to go up into the Amazon cloud, and it doesn't run in a VM; it runs on one of their newer services, AWS Lambda, which is basically a remote-procedure-call-like service. You develop this remote procedure call up in the cloud, you test it up in the cloud, it connects to your device, and you implement that as a skill.
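To make that concrete, here's a minimal sketch of what the cloud half of my hypothetical robot skill might look like in Node.js, assuming the alexa-sdk npm package that the Node.js kit is built around. The intent names match the schema I sketched earlier, and the part that actually tells the robot to move is hand-waved in a comment:

```js
'use strict';
// Hypothetical cloud handler for the robot skill; runs as the RPC-style
// service described above. Assumes the 'alexa-sdk' npm package.
const Alexa = require('alexa-sdk');

const handlers = {
  'DriveForwardIntent': function () {
    // In a real system, queue a "drive" command here for the device to
    // pick up (for example, publish to a topic the robot subscribes to).
    this.emit(':tell', 'Driving forward.');
  },
  'AMAZON.StopIntent': function () {
    this.emit(':tell', 'Stopping.');
  },
  'Unhandled': function () {
    this.emit(':ask', 'Try saying: drive forward.', 'You can say: drive forward.');
  }
};

exports.handler = function (event, context) {
  const alexa = Alexa.handler(event, context);
  alexa.registerHandlers(handlers);
  alexa.execute();
};
```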
Fortunately, there's a GitHub repo for all of this, so you can go to the repo, download the skill development kit, and take a look at how it's put together. They have most of the code, and they claim it will run on Windows and, of course, under OS X. But of course I run Linux, and fortunately they have completely enabled the Linux side of it. It's not like Linux is a second-class citizen here; it's more like Linux is the first platform and then they enabled the other ones, which is a nice change of pace.

Once you have the skills kit and you've written the cloud part, you've got the skill defined, and you can test it with a real Amazon Echo. The little Echo Dot is only about $40, so you can get one, plug it in, test it, make sure your skill is working up in the cloud, and then move things over to your local embedded Linux platform.

Now, what are you going to need? Well, as I said, the Echo Dot uses a TI DM3725-class part. It's basically a BeagleBone Black-class chip, a Cortex-A8. It doesn't have much memory, I think 256 MB of RAM, and not a whole lot of flash. So that pretty much means you can use almost any of the current development boards. Even if you had an early-generation BeagleBone Black, or even just the original BeagleBoard, you're still plenty fast enough for the kinds of things this application requires.

You have to have an internet connection, and therein lies an interesting problem: how are you going to get connected? Obviously, if you've got Ethernet, that's one thing, but most devices these days want to be wireless, which means you have to have Wi-Fi on the board. And unless you're using one of the newer boards with Wi-Fi built in, you have to figure out how you're going to get Wi-Fi onto the platform. I've found USB dongles actually work reasonably well. The Wi-Fi interface you can get from Adafruit, and a couple of others out there, work reasonably well; you just plug them in and they go. That's an excellent thing, and it speaks to the kind of progress we've made in the Linux world, where you can buy a USB device off the shelf, plug it in, and not really have to worry about whose driver it is and all that other stuff. It works with most of the devices out there.

You also have to have a microphone and a speaker. Now, the microphone I happen to use in this case is this Bluetooth microphone here, but a USB microphone plus an audio amplifier going to the speaker works reasonably well. The audio output on the Raspberry Pi, because that's the board I happen to be using here, is not amplified; it's made for headphones. So you're going to have to run it into something like a Class D amplifier if you want anybody to actually hear what it says. This Bluetooth device I have is made for cell phones; it's a microphone and speaker in one. But I found there's a latency in waking it up. You can say "Alexa" and it'll actually clip the audio a little bit, so you have to wait for it to wake up and then say "Alexa" again, and then it'll be all right. That's just the nature of the Bluetooth speaker. If you're using hard-wired connections for the speakers and microphone, it's a little better than that.

Then you've got the usual power supply, Ethernet, SD card, et cetera, that you need to get the platform up and running. In my case, as I said, I'm using a Raspberry Pi 3 because it came with Bluetooth and Wi-Fi built in. There's not really a whole lot of processing that has to be done on the Pi 3, so you could just as easily have used a BeagleBone Green Wireless, the new BeagleBone Black Wireless, UDOO boards, or any of those platforms with built-in Wi-Fi. It just happens that I've got about a dozen of these things sitting around the house, so it was a convenient platform for me.

So, setting up the Raspberry Pi. This is an interesting challenge, because obviously you grab Raspbian and start off. You can actually use Raspbian or Ubuntu MATE, but Raspbian has a completely worked-out path: there are several places on the web that will walk you through the process of setting up Raspbian to do the voice recognition piece and run Alexa. The Alexa sample app happens to use both Node.js and Java.
So that means you've got to get the Java 8 engine onto the platform, which takes not only quite a bit of disk space but also a lot of time and a fairly reasonable internet connection if you want it done in your lifetime. Once you start running the setup script, and we'll show you what that script looks like in a moment, it takes about an hour and a half to two hours on a fast internet connection to get the Raspberry Pi set up and ready to go. And since it's a Debian distribution, anything running a flavor of Jessie will probably work just fine as well.

Now, in my particular case, I happen to be using this little gizmo, a Raspberry Pi laptop kit. My wife asked me what I wanted for Christmas, and I said this looked cool, so she bought it. They come in two flavors, the gray one and this lime green one, and of course I went with the lime green, because it's different, right? This device will power the Raspberry Pi 3 for about 10 hours off its battery. The screen is a 1366 by 768 display, so a fairly reasonable size, and the keyboard, mouse, touchpad, and everything run right off the USB. It's actually a pretty interesting approach, and it made it easy for me to sit there and interact with the Raspberry Pi, because normally, when you first power up a Pi, you've got to plug in HDMI and a keyboard and a mouse and get all the cabling set up, and that's kind of a hassle. And because I'm a mentor for FIRST robotics teams, I had ready access to a robotics student who was sitting around twiddling her thumbs with nothing else to do. I said, "Hey, you want to build a laptop?" "Sure, let's build a laptop." So I used, er, student labor to put it together for me, and then I had to go through and verify everything was done right, but hey, that was kind of cool as well.

All right, so next we need to clone the Alexa sample application; again, it's out on GitHub. We can do a git clone of that repo, and that brings down all the sources for the audio recognition engine, the wake word engine, and the client that actually talks to the services you're going to have up in the cloud. That will ultimately connect up to Amazon and establish its connection and do its verification and all that sort of stuff. But first, before we can do that, we have to pull down credentials for the device, because we're going to be going up into Amazon's cloud system, and they're not too keen on just letting any device up there. They want to make sure it's got certificates, security passwords, the whole smear, in order to access their cloud system. And there are a couple of other things you end up having to do, which we'll talk about in a moment.

So we've got to register the device. When you go into the Amazon developer console, there's an option that just says Alexa, and if you click on it, it shows you a couple of different choices. The one you want is the Alexa Voice Service. You click on its "Get Started" button, and that brings you to a new page to register a device, or rather a product type.
In registering the product type, you have to give your device an ID; whatever name you want is fine. I think I called mine "PiTop Echo" or something like that. Then you click Next, and it takes you to a security screen that asks you to create a new security profile. Now, the security profile is not as scary as it sounds: basically, they're looking for a name and a description. And I'm looking at "security profile description" going, what does that mean? Does it mean I'm supposed to say what kinds of services it can do? It actually just means they want you to say, in words, "this is a Pi laptop" or something similar. So it really isn't nearly as scary as it sounds.

Once we click Next, it generates a series of credentials. The screen here is washed out, but under the blue there's a whole bunch of stuff: your security profile ID, your client ID, and then the client secret, which is basically a shared secret they generate for you. What you have to do is copy this stuff out of the web page and into the Alexa build environment in order to be able to authenticate.
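Where does that copied stuff actually land? In the sample app, the companion service reads it from a config file. The sketch below is from memory, so treat the path and field names as assumptions and check the README of whatever version you clone:

```js
// samples/companionService/config.js (layout from memory; verify against the repo)
module.exports = {
  clientId: 'amzn1.application-oa2-client.xxxxxxxx',  // Client ID from your security profile
  clientSecret: 'your-client-secret-here',            // Client secret from the same page
  products: {
    // Device Type ID you registered, mapped to one or more serial numbers
    'PiTopEcho': ['123456']
  }
};
```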
That gets you at least part of the way there. Next, I need to register the device itself, so I have to describe the device and specify what kind of category it is. They have consumer electronics, TVs, and other kinds of categories. I couldn't find a category that matched what I wanted to do with it, so I just called it "other." You have to give it a description, and they actually have an option for uploading a picture of it. So it's like, oh, cool, I'll upload a picture and then I'll have an icon. Sounds like a good idea.

They have a couple of other questions they ask you, and one of the major ones is: do you want access to Amazon Music services? If you do, so that I can say, "play Fall Out Boy," then I have to sign an extra piece of paper, basically. It's digital, obviously, but it's a whole other form that has to be filled out, dealing specifically with access to Amazon Music. So if you want Amazon Music on the device, you've got to jump through some extra hoops, simply because of all the copyrights associated with music these days. If you don't care about Amazon Music, skip that whole page, don't check the box, and everything's good.

Now, here's one of the things that's a little confusing about the way the Echo works: when you're using the Alexa Voice Service, the device reaches up into the cloud and presents security credentials, and the cloud responds with, in effect, "here are your credentials, and this is where you should go to find them." The client running AVS on the local device then has to talk to the server that's in communication with Amazon, and that's what these redirect URLs are about. If you go back up to the security profile for the device and click on "web settings," there's an option with this mysterious thing called a redirect URL. It's not entirely clear what that is; they don't describe it, they just say "edit," and, OK, now what? What we found is that what they're really looking for is where your local connection point to Amazon lives, including the port number. You connect to Amazon, present your credentials, and it sends down these redirect URLs that say: these are the places you are allowed to connect to. It's another security step to make sure you don't have arbitrary devices hooking up to Amazon, only devices that have actually been authorized. So there's an origin URL, and then a return URL that has the authentication pieces built into it, and both have to be filled in. The first time I tried this, it said, "I can't connect; enter your redirect URL," and I went, what's a redirect URL? After quite a bit of searching and poking around in various other systems, I found out what the redirect URL is, and that turned out to be exactly what I needed to make any of this work. So it's secure, sort of. Maybe not. It's just a different process.
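For what it's worth, if you're running the reference Java client against the sample app's companion service on its default port of 3000, the values I ended up with match what I recall the sample app's documentation asking for; verify against the version you cloned:

```
Allowed Origins:     https://localhost:3000
Allowed Return URLs: https://localhost:3000/authresponse
```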
Now, the windows. You have to start the pieces in order: first the Node.js companion web service, then the client application, then the wake word engine. Assuming all of that is working and you have a working speaker and microphone, it will start a connection up to Amazon and ask whether you want it to use your default browser to connect. If you say yes, it lets you log in to Amazon. And that's where I am right now, where it said, "Amazon is experiencing problems, we'll get back to you as soon as we fix them," which is just what happened, not 20 minutes ago.

In the second window, we're running our application; this is actually the Java client implementation. It opens up a box that says, "please register your device by visiting the following URL," with instructions for provisioning the device. Once your browser opens, you're going to get a warning that the connection isn't private, and of course you have to decide whether or not you're going to allow that. On the page you get directed to, you log into your Amazon account, and then you're redirected to something called developer authorization. On that page, you paste in the value from the security profile you used before, from that little dialog box that popped up here; oh, there it is, sorry. That little thing is the provisioning token. As soon as you've got the provisioning token in there, it should connect and report that it has accepted the tokens. Once it accepts the tokens, you're live at that point, and you can start asking questions. You can minimize the browser, because you don't need it anymore. Actually, you can minimize all the windows; you don't want to kill them, but you don't need to be watching them either.

Then there's the wake word engine, with the individual engines you can choose from; we talked about the wake word engines and what their dialects are. In my particular case, the KITT.AI wake word engine seemed to work reasonably well. Now, it is possible, as I said, to change the wake word, so if you don't like "Alexa," they have other options. You do have to actually go in and hack the code; it's not just a "put the wake word here" setting. It's not that simple, but it's not that complicated either. And after I had it up and running, I asked Alexa to make me a sandwich, and she said, "okay." But she wouldn't do it unless I said, "Alexa, make me a sandwich." That's what's required.

Now, if everything works, you've got Alexa running, and it's time to button this thing up. How do you make it stand-alone? Well, in my particular case, I had a keyboard, mouse, and monitor attached to this thing, so it was pretty easy. But when we want to make this into an appliance we can set out on a desktop or a countertop, we need to make it so it doesn't require a keyboard, mouse, and monitor to get everything going. In this case, I was using TightVNC, just so I could log into the device if I had to.

The other thing, which is much more prevalent now but, as it turned out, was not installed automatically, is Avahi. Once the Wi-Fi interface associates and gets a DHCP lease, how do we know what the IP address is? How do we deal with all of that? It turns out that if you set up the Avahi daemon, you can simply refer to the box by a name like "pi-alexa," or whatever you want to call it, and it can be found on your local network segment. That's one of those little things they don't really talk about in any of the setup documentation I found: if I have no idea what its IP address is, how do I talk to it? Avahi makes that possible.

We also set up TightVNC as an autostart: you simply drop these files in, save, and as soon as the system reboots, it automatically starts VNC. Then you can use a VNC client; Remmina is what I use. I just point it at my pi-top, it finds it on the network using mDNS, and lo and behold, I'm connected. We also need to make the three processes start automatically, instead of opening them up in windows by hand. Depending on how you want to work, you can do what I did and use VNC, or you can use normal startup scripts with systemd. I've had lots of problems with systemd, especially on the Raspberry Pi, not wanting to start things like SSH servers that it should normally be able to start, but I've managed to beat it into submission at this point.
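If you do go the systemd route despite my grumbling, a minimal sketch of a unit for one of the three processes might look like this. The unit name, paths, and npm entry point are my assumptions about where you put the sample app, so adjust accordingly:

```ini
# /etc/systemd/system/avs-companion.service (hypothetical name and paths)
[Unit]
Description=AVS sample app companion service
Wants=network-online.target
After=network-online.target

[Service]
User=pi
WorkingDirectory=/home/pi/alexa-avs-sample-app/samples/companionService
ExecStart=/usr/bin/npm start
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable avs-companion, and make similar units for the Java client and the wake word engine, ordered to start after this one.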
So at this point, we have a connection and a very simple skill. Out of the box, you say "Alexa" and ask it a question, "Where is Uganda?", and it comes back and tells you Uganda is located in Africa. Those kinds of skills are there automatically; Uber is there, the pizza ordering is already there. All of those skills are available, and you can go through the list of skills and say, "I want that skill on my device." So you can pick and choose which skill sets you want on the device, and of course you can customize your own skills. This becomes kind of that intelligent agent you didn't know you wanted or needed. It sits there and listens, and if you ask it a question, it answers, which some of my neighbors find particularly enjoyable. They come over to my house just to ask it stupid questions and get an answer, and I go, "Why don't you buy your own?" And they go, "Yours is green, that's much more cool."

So there are lots of opportunities here for expanding the system, adding new skills, adding new capabilities. What you do is import a skill into your particular system, and then it has a unique name you can give it. It uses the credentials you created for your particular device and extends them. It's kind of like scoping in C++: a variable called Sam isn't just Sam anymore, it's Sam plus a whole bunch of decoration. Similar sort of thing.

So in this session we did kind of a quick whirlwind tour of setting Alexa up and getting the Alexa Voice Service interfaced. All things considered, the only significant problem I've run into with this device so far has been, and continues to be, related to security: our good friend Amazon timing out on me, saying it can't do something, that they're working on it and they'll get back to me as soon as they finish. They've never gotten back to me. In any case, I usually just restart it and it reconnects. But the process of getting the device voice enabled is really not that hard. Once you get past the initial setup, once you've registered as a developer, registered the device, and gotten the security credentials and all that administrative stuff, the actual code itself is not that complicated, especially the Node.js side. If you have any facility at all with JavaScript, it's pretty easy to set up a new interface, and again, it does not require setting up a VM in the Amazon cloud; it uses their remote-procedure-call mechanism for the skill processing. I assume that if you're going to become a manufacturer and this is what you're going to be selling, there's probably some money that has to change hands somewhere. But at least for hobby work, it doesn't really cost you anything other than the device itself, the microphone, and the speaker.

There is one thing I've found that is a bit of a problem. If you look at the actual design of the Echo Dot, it has something like six or seven microphones arranged around it, and a lot of additional voice processing is done across that array for echo cancellation and things of that sort. That doesn't happen on this device, because I only have one microphone. That's going to be a problem, and that's one of the places where we'd really need a steerable phased array of microphones with beamformers in order to do the echo cancellation correctly. Maybe that's next year's talk: beamformers and how we do audio processing.

Any questions? Yeah. Have I looked at those? Nope, not familiar with them. Is there an open source one? Not that I've found yet.
But there's certainly enough out there in open source to voice-enable almost anything if you really want to. Any other questions? Yeah, about the Bluetooth speaker latency: that's something peculiar to this piece of crap that I happened to buy. What I've noticed is that that particular unit just takes too long to wake up. It shouldn't, but it does. It also doesn't speak English, of all things. So, nonetheless, that's probably my mistake for having purchased that particular unit in the first place. The easy way to avoid the problem is to just use wired audio: a wired microphone and something like this little dongle, which I think is about five bucks, and it works with PulseAudio. Obviously, if you're going to do any audio work on this, you have to install PulseAudio and all the rest of the stuff that goes along with it. For those who don't like PulseAudio: what they have working right now works with PulseAudio, but if you want to use JACK or something else, then by all means.

Any other questions? Yes, next question in the back? Yes, you can. Yep, you don't have to use Java, mercifully. Yeah. Actually, I'll be talking about that later; they are compatible with MQTT, so that's the one I've worked with so far. There may be others, but that's the one I found documentation on. So it's a standard pub/sub MQTT-type implementation.

Yeah, question? No, no, no. In this particular case, the wake word engine is not Java. The client in this case was Java, but that was just the choice of implementation; it didn't have to be in Java. And then the skills are done in either Java or Node.js.

At this point, I'm getting about 65 percent success bringing it up. It comes up most of the time, unfortunately not all of the time, and I'm trying to figure out exactly why; there's some weird timing thing on the Raspberry Pi when it starts to bring up all these individual engines. Part of the problem is that the Raspberry Pi, unlike the BeagleBone with its built-in eMMC, relies on the SD card, and any oxidation on the SD card contacts, any jostling of the device while it's coming up, you know, if your dog barks really loudly, I don't know, but something happens and sometimes it doesn't start up. If you were doing this on something with a real eMMC implementation, a much more solid boot device, I suspect it would be much more reliable.

Yeah? Mm-hmm. Now, the natural language understanding piece of it: there are some tweaks you can do. I don't know exactly to what level you can tweak it, but as I said, of the two different wake word engines, one worked reasonably well for my dialect and one worked reasonably well for my wife's dialect from the Northeast. So you may have to experiment around a little to figure out which one does the best job for you. They definitely have at least two of them so far; hopefully there will be more.

Yeah, oops, yes, sorry. I'm sorry, what was that? There's no real training sequence that I've been able to find. There may in fact be something you can do inside the wake word engine itself to train it to be better at recognizing things, because I've found that words that sound like Alexa will also set it off.
And so, you know, there was some technical discussion I was having with somebody on the phone, I said the wrong sequence of words, and it woke up, and I go, oh, that's interesting. I wouldn't have expected it to do that, but it did. So as far as the wake word engine is concerned, I'm sure there's some tuning you can do to it, but there's no particular training or calibration cycle you have to go through. Unlike a lot of what we see with voice recognition systems, you don't have to train it to understand you, which is one of the reasons they send all this stuff up into the cloud: they have a lot of horsepower up there for the audio processing.

Yeah? The answer, I suspect, let me think it through, I think the answer is yes. What you can do is use the Alexa speech processing and have the skill simply return you the text. No, it's not required to be an audio response, because I can say, "Alexa, set the Nest thermostat to 70 degrees," and the Nest thermostat then turns over to 70 degrees. So you can have it do anything in terms of the way you implement your skill. Oh yeah, yeah. You can have the skill come right back to your local device, and your local device implements the skill. So if you say, "Alexa, do blah blah blah," and the skill you've set up simply sends the processed speech, the actual text out of the speech-to-text recognition, back to you, then absolutely, you can do that.

Yeah, we're getting close. Last question? Whether proxies are a problem? It turns out that in the web setup, they actually have an entry there for proxies. So if you do use a proxy, basically when you're setting up the device, you'll tell it that it's got a proxy. Yeah. And no, I have not done the reverse engineering on the actual firmware yet. And I would never do that, because that would be a violation of the Digital Millennium Copyright Act.

That's it. Thanks, guys.