I'd like to welcome you all to Building Hal. We're going to learn how to run some Ruby code with our voice today. It is kind of a lot of fun and also terribly difficult, so I hope you will forgive me when probably all of my demos fail. But maybe one of them will be funny to make up for it.

So I am Jonan. I live in Portland, Oregon. I go by The Jonan Show on Twitter; you can find me there. I don't properly live in Portland. I actually live in Beaverton, which is a very big deal if you're from Portland. Is anyone here from Portland? Except Jason, I know. Yes, we have a couple of other Portlanders here. If you are not from Portland proper, you live outside of the city limits in one of the burbs of Portland, and you call yourself a Portlander, you are very likely to be forcibly tattooed with an owl by the Hipsters Union. So please, nobody tell them that I said that. I am actually from Beaverton, and I apologize.

I work for a company called Heroku. They invented the color purple. You may have heard of it. They're quite kind to me. They send me to fancy places like Cincinnati to come and talk to people and attend conferences, and I love it. So use Heroku and things. If you have questions about Heroku, I'm always happy to talk.

So my wife and I, about a year ago, we bought our first home. We're very proud. And one of the first things that I wanted to do with my home was to automate things. Well, the first thing that I did, I renovated the foyer. So we have the ball pit here. This is kind of to disarm people as they enter. Once the drawbridge comes up, you need to calm them a little bit. So I started to do some work automating my home. I have smart light switches, and I have Alexa in there, and I can trigger things throughout the house. For example, if I say: Alexa, trigger movie mode. "Sending that to IFTTT." So I just turned off all the lights in the house and started Netflix. I think it's like 10 AM there. My wife is probably wondering what's going on.

So when I started doing this, I wanted to do some home automation, and I'm writing a little bit of code here and there. But a lot of that stuff is already pre-built, and I wanted to get more into it. And I definitely did not want to use one of these draconian interfaces, this terminal, to try and deploy my code to production. The idea is absurd. We live in 2016. This is the future. We can just use our voice.

And so I started to study speech recognition, and it was a terrible, terrible mistake. I'm just kidding, actually. Speech recognition is wonderful, but it is intensely complicated. If I were to talk to you for the next 60 days without stopping, we wouldn't cover it. So I'm going to glance quickly over some concepts. I'm going to start at a very low level, or rather the opposite of what low level means in software terms. Very high level, I guess. I'm going to talk very simply about sound and about voice.

And we're talking about speech recognition here. We're not talking about voice recognition. Voice recognition is "my voice is my password, verify me." You are recognizing someone's voice and using it to unlock a magic door or a computer or some such. We're talking specifically about speech recognition. So there are two major categories of speech recognition. You have isolated word recognition, which is the stuff that we've had around for a long time. So when you call up your airline and you say, "I would like to check a bag," and they say, "Did you say cancel all your flights for 2016?" That's isolated word recognition. And it's not very good.
But it's simple to implement, because you're just matching waveforms, essentially. You're trying to get the other human to say the word exactly the way you said it, and then you compare the waveforms and they match, and so you have an isolated word recognition system. They're obviously much more complex today, but that's where it started. And what we're going to talk about is continuous recognition, which is the sort of speech recognition that Alexa does and many of the other devices we have today.

So before we get into that, I want to talk very quickly about how speech is generated. And if you will bear with me for a moment: remember when you were a child and you were swinging on a swing set, and you would pump your legs? In the middle of that swing it was a small movement, and it had rather a large outcome. You went back and forth, up and down this swing. Now imagine you're holding a marker out to the side and you're just drawing on a piece of paper that I'm dragging up very quickly. I'm much taller than you, so this is fine. And as I'm dragging this up, that shape that we're making, that's a sine wave. It's probably a sinusoidal wave, not a perfect sine wave, but that shape will come out kind of like that squiggly line that I'm sure you've all seen before.

The reason that I mention the low amount of energy that it takes to generate that is that it's the same reason that we have these, which is really just a great excuse to use this disgusting picture in a presentation. This is an endotracheal intubation. I really hope you never have this procedure. It looks like someone's being implanted with a robot kind of in preparation for the apocalypse. But those are your vocal cords right there, and they close when you breathe out, and your exhale forces air through them, kind of like when you fart, like this. And that makes sound, right? And we do it very quickly. So it's kind of like a throat butt that you use. If you want to think about it that way, all speech is basically throat farting. But you can hear this sound when you say very low tones, right? If I say, ah, it starts to click, right? You can hear the sound coming apart. And those are the sinusoidal waves that we're talking about.

So again, review for many of you: we're gonna talk about frequency today. The frequency is how many of these bumps we have and how close together, right? And when we listen to music, or we talk about it in terms of what the human ear perceives, we call that pitch. We're also gonna talk about amplitude, and similarly we call that volume when we're talking about what the human ear perceives.

So a little bit of vocab about the words. We are coming into the words for speech recognition now. This is an utterance. An utterance can be a word; it can be a couple of words. An utterance is what speech recognition implementers use to describe a sequence of spoken language that turns into a computer command somehow, something that can be understood by a computer. I say an utterance, the computer understands, right? And we break those utterances down into their phonemes. So phonemes are different from letters in English. They are just the sounds that we use. These are the 44 different sounds that exist in the English language. And that may seem counterintuitive to some who don't have a background in linguistics. So I'm gonna explain very quickly how some of these sounds come to be, and specifically how even the same phoneme can sound different sometimes.
So if I say the word bad, or I say the word ban, that would be the same "a" sound, true? Kind of, right? That is one that on paper we would all mark as a yes on every test we've ever taken, probably even linguistics students. But the reality is, when I said bad, I made one "a" sound, and when I said ban, I made another "a" sound, right? They're different. And single sounds like that are called monophones in the context of speech recognition. So we're ultimately trying to break down to monophones at a minimum to start doing speech recognition on a particular sound. And even better than that, we can break down into biphones, because when either of those "a" sounds is preceded by b, it sounds different than if it were preceded by c or n or t, right? Bad and ban. And so this would be a triphone. And triphones are actually what most of these systems use nowadays. So they'll break them down into groups of sounds as triphones and build a rather large tree that I'll show you in just a moment when we get into the discussion of hidden Markov models.

First, I want to again give you some quick review on Fourier transforms. And I trust you're all comfortable with this, so we'll just move on. Everybody's got this. I'm just kidding. Simpler than this: to transform your foyer, you just remodel it. Oh yeah, I was pretty pleased. Okay.

So let's talk about Fourier transforms here. This is the thing I heard a lot of people talk about and I assumed was far above my head, and it probably still is. This is not something that I could necessarily rewrite from scratch, but I think I understand it. So if you have this square waveform like this, this wave is actually comprised of a number of sine waves that we can identify. And we can split out all of the sine waves that can be used to make this square waveform. And here they are, splitting out. And now we can use these sine waves, the frequency and the amplitude of those sine waves, to make a new graph that does not include time. We can describe a segment of sound without the concept of time, which is exactly what we want to do when we're doing speech recognition. We want to remove time. We want to put amplitude on one axis and frequency on another. And that way we don't have to worry about how long a sample is.

And you can imagine that this particular strategy is not very helpful for a very large sample size, right? If I have a five-minute waveform, not only is it very time consuming to take it apart this way, but it is possible to generate two different waveforms that could both be described by the same set of sine waves in that circumstance. So when we're working with speech recognition, the trick is to just use very short sample sizes, generally something like 5 or 10 milliseconds, that we're taking out of your speech. And we're using this Fourier transform to analyze that speech and then to map it to a set of numbers that we can look up in a tree, because we're trying to match it to something that we already know, right? So this is going to show amplitude and frequency instead of time. We've removed time from the equation.

And the idea that a wave of any shape or size could be made up of a number of sine waves is kind of counterintuitive to people, but it's less counterintuitive when you see it like this. And that is that a sine wave is just a circle plus time. So if I told you to fill the earth with golf balls, I think you can imagine that in your head. And when you ran out, or you couldn't fit any more golf balls, I told you you could use grains of sand, right? And then we went to something smaller and smaller. You can see then that any shape that you can imagine could theoretically be made up of circles, right?
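To make that amplitude-versus-frequency idea concrete, here is a minimal Ruby sketch, not anything from the talk: a naive discrete Fourier transform over one short frame of samples. This is the slow, obvious version, not the fast one we get to next, and the 16 kHz sample rate and 440 Hz test tone are just assumed example numbers.

```ruby
# Naive discrete Fourier transform of one short frame of samples.
# Returns a magnitude per frequency bin: amplitude versus frequency,
# with time removed. It's O(n^2), so it only makes sense on tiny frames
# like the 5 or 10 millisecond windows described above.
def dft_magnitudes(frame)
  n = frame.length
  (0...n).map do |k|
    re = 0.0
    im = 0.0
    frame.each_with_index do |sample, t|
      angle = 2 * Math::PI * k * t / n
      re += sample * Math.cos(angle)
      im -= sample * Math.sin(angle)
    end
    Math.sqrt(re * re + im * im)
  end
end

# 10 ms of a 440 Hz tone sampled at 16 kHz (160 samples).
frame = (0...160).map { |i| Math.sin(2 * Math::PI * 440 * i / 16_000.0) }
bins  = dft_magnitudes(frame)
peak  = bins.first(bins.length / 2).each_with_index.max_by { |magnitude, _| magnitude }
puts "loudest bin: #{peak[1]} (about #{peak[1] * 16_000 / 160} Hz)"
```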
So this Fourier transform is actually quite difficult on a large scale. It takes a lot of computing power. It was the kind of thing that we weren't really able to utilize until 1965, when these two gentlemen came along, Cooley and Tukey. Cooley worked for IBM; I believe Tukey was a professor. And they came up with the fast Fourier transform. I'm not going to go too deep into this because we have a lot to cover, but the idea of the fast Fourier transform is to basically split the set of all of the waves that you're trying to calculate and thereby divide and conquer. So instead of the O(n²) algorithm that a straightforward Fourier transform would be, a fast Fourier transform is O(n log n), in all of the implementations of it. And there are actually a couple of different ways to do this thing. So I encourage you to look into it if you are interested. It is fascinating.

Let's talk about Markov chains real quick. This is something you probably are familiar with, where we can take English sentences and just predict what the next word is going to be based on the previous word, right? You assign a word a probability and you build a big tree. And we start with "the," for example, right? And then one of those options is "The Jonan Show," right? So probably like 0.001% over here of all of the human language that is spoken. That was really generous of me, actually. I mean, I might even say 0.1%. So the idea is that we then go down through this tree and we build up a sentence, right? And you've seen all of these silly ebooks Twitter bots doing exactly this. When your friend has their name and it says "ebooks" after it, or some of them have "horse" in them, I'm not sure of the meme, but the idea is these are all Markov chains, and they generate text by just taking existing sentences and guessing what the next word is gonna be, and they're often wrong in hilarious ways.

So speech recognition does not use Markov chains, but it uses something called a hidden Markov model. And this is code I don't really want you to look at. This is just constants. We're just describing a bunch of things. And I'll show you in a graph form. So the part that is hidden here is the starting state. We don't know whether it was rainy or whether it was sunny. What we do know is that 60% of the time it's rainy and 40% of the time it's sunny. So this is probably Portland. We keep telling people it's 80% rainy, though, because it's getting a little crowded there. So if it was rainy, the chance of it transitioning over to sunny is about 30%; 70% of the time it will stay rainy. And then if it's rainy, we have a 50% chance of cleaning, a 40% chance of shopping, and a 10% chance of walking in the rain, which is glorious. And if you move to Portland, you'll get lots of chances. So this is very similar to what you know and love in Markov chains. It's just that the state at any given time is hidden from you. And that more accurately models speech recognition systems, so this is typically what they use.
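Here is that rainy/sunny model as a small Ruby sketch of the forward algorithm: given a sequence of observed activities, it sums over every possible hidden sequence of weather states. The rainy-side numbers are the ones from the slide; the sunny-side transition and activity probabilities are not in the talk, so treat those as made-up placeholders.

```ruby
# The hidden Markov model from the slide. The rainy numbers match the talk;
# the sunny numbers are assumptions made up here to complete the example.
START      = { rainy: 0.6, sunny: 0.4 }
TRANSITION = {
  rainy: { rainy: 0.7, sunny: 0.3 },
  sunny: { rainy: 0.4, sunny: 0.6 }           # assumed, not from the talk
}
EMISSION = {
  rainy: { clean: 0.5, shop: 0.4, walk: 0.1 },
  sunny: { clean: 0.1, shop: 0.3, walk: 0.6 } # assumed, not from the talk
}

# Forward algorithm: how likely is this sequence of observed activities,
# summed over every possible (hidden) sequence of weather states?
def sequence_probability(observations)
  forward = START.map { |state, p| [state, p * EMISSION[state][observations.first]] }.to_h
  observations.drop(1).each do |observation|
    forward = forward.keys.map do |to|
      p = forward.sum { |from, prob| prob * TRANSITION[from][to] } * EMISSION[to][observation]
      [to, p]
    end.to_h
  end
  forward.values.sum
end

puts sequence_probability([:walk, :shop, :clean]) # => a small probability
```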
And that is the technical depth that I'm going to go into on speech recognition. We're done with that part now. It's fine. So again, these trees that we're building, they're not actually about rain and they're not about going shopping. They're about these monophones that I was talking to you about, or diphones or triphones. We're gonna build a big tree of them, and we're going to go and try and find the word that you said based on the sounds that we identified, using a fast Fourier transform to match them up to samples that we already have or that we can predict exist.

So one more thing we can do to help with that process is we can define some grammars. You can tell the computer what to expect. Just like those isolated word recognition systems, newer speech recognition systems like Alexa will take an input grammar, and that grammar will tell them what to expect to hear, and it will very seriously steer... Yes, Alexa, thank you for your help. They will very seriously steer the direction of that speech recognition.

So let's make a thing. I'm gonna start off with Alexa, and I'm gonna be using IFTTT. IFTTT is the same thing that I just used to trigger the lights to go off in my house, much to my wife's surprise. And we're gonna be using Hubot as well. I specifically chose this approach because it was code-light. This is not a thing that you really have to code. If you already have a Hubot implementation, you can start using your voice to deploy, with some minor caveats. But the plugin that you're gonna use for Hubot is called hubot-deploy. It's written by this man Atmos. He works with me at Heroku. Brilliant guy, Corey Donahue. And Heaven is also written by Atmos. And Heaven is... sorry, hubot-deploy is the part where you can talk to Hubot. Now I can tell Hubot "deploy Steve to staging" and it will deploy Steve to staging. Heaven is the part that receives deployment events. So deployment events are events emitted by GitHub. You go into GitHub for your repo and you basically set up a little webhook, and you give it an endpoint and you say, hey, whenever someone makes a deployment, just let them know over there, and it ships out a deployment event to that webhook.

So this is the architecture of this system we're about to describe here. We've got Amazon Alexa, and I'm gonna say a thing to it, and it's going to use IFTTT to trigger a Slack message. And that Slack message is going to be heard by Hubot. And Hubot is going to interpret that Slack message and it's going to create a deployment in GitHub. GitHub is going to generate a deployment event callback for Heaven. Heaven is going to deploy the application to Heroku, and it's gonna notify Slack. So it's simple, really. See, there's no code. It's fine, what could go wrong?
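To make the Heaven-shaped piece of that diagram concrete, here is a hedged sketch, emphatically not Heaven's real code: a tiny Sinatra app that accepts GitHub deployment webhook events, does the deploy, and reports back to Slack. The SLACK_WEBHOOK_URL variable and the deploy_to_heroku helper are hypothetical stand-ins.

```ruby
# A sketch of the Heaven-shaped piece of the diagram, not Heaven itself:
# receive GitHub "deployment" webhook events, deploy, tell Slack about it.
require 'sinatra'
require 'json'
require 'net/http'

SLACK_WEBHOOK_URL = URI(ENV.fetch('SLACK_WEBHOOK_URL')) # hypothetical incoming-webhook URL

def deploy_to_heroku(app_name)
  # Placeholder: a real version would build and release the app on Heroku.
  puts "deploying #{app_name}..."
end

def notify_slack(text)
  Net::HTTP.post(SLACK_WEBHOOK_URL, { text: text }.to_json,
                 'Content-Type' => 'application/json')
end

post '/events' do
  payload = JSON.parse(request.body.read)
  if request.env['HTTP_X_GITHUB_EVENT'] == 'deployment'
    app_name    = payload.dig('repository', 'name')
    environment = payload.dig('deployment', 'environment')
    deploy_to_heroku(app_name)
    notify_slack("#{app_name} deployed to #{environment}")
  end
  status 200
end
```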
Live demo, let's try and ship it. Okay, so I'm gonna show you the Slack channel that I had here, maybe, assuming I can computer good. How do we go? Yeah, looking good? All right, so now, if I tell Alexa: Alexa, trigger deploy Steve to production. "Sending that to IFTTT." So that is sending it to IFTTT. And at 11:42, Steve got a message, deploy Steve to production. And hypothetically, Heaven would have said some things about it, and Steve would now be deploying to production. So let's go see in our terminal here. Oh, this is great, I should un-mirror so I can see what's happening at the same time. Let's tail our logs on our Steve production app and see if Steve actually restarted. Slug compilation started, slug compilation finished, restarting. State changed from up to starting. IFTTT is deploying Steve to production. Production deployment of Steve is done. That was at midnight last night, and that's the message you would see now, but you don't, because most likely network communication issues.

But that is fundamentally how it works. You can see that it did actually deploy in reality. Let's just call it a success and move on, shall we? Thank you very much. And back to slides.

Does anyone know why the mafia squirrel is the official ship-it squirrel? I actually don't think there's an answer to that question. I think someone at GitHub picked an image at random, and now this is an emoji that everyone uses to ship things.

The nice thing about this approach is you can also use slash-heroku or similar plugins that are already part of your Hubot installation. You can send Hubot any command that you would like, and Hubot can do all sorts of things. So the one caveat that I was talking about is that IFTTT is a user without a user ID. And so you kind of have to hack your way around that, but it's pretty easy to figure out. And if you have questions, or you're actually going to try and deploy all of your production infrastructure with your voice, don't do it, then you should talk to me about it and maybe hire me for an exorbitant consulting fee. So Capistrano also has a plugin that you can use from Hubot, as do many other deployment systems. Anything that you can do with Hubot, you can now do using your voice and your Amazon Alexa.

But you will notice that Alexa, when she spoke back to me, said "sending that to IFTTT," which is not a very interactive reply, right? I had to go and check for the Heaven message that never came to find out if my thing had actually deployed, and I found myself reading log messages and using one of those disgusting terminals like a mortal, right? So we have a solution to that problem. We can use a thing called the Alexa Skills Kit, and the Alexa Skills Kit is a way to build skills on top of the Alexa platform inside of Amazon. So you can go into a GUI that I'm gonna show you in a moment, and I'm gonna move very quickly past it, because it's terrifying, because it's very GUI and small. Don't look at that, look at this instead. Oh, I made it so much easier.

So this first page you're gonna be looking at, there's only one thing you really care about here, and that is your application ID. You're expected to use this to verify all of your incoming requests, and here's why. If I had shown you this skill and I was not verifying that all of my requests came from here, then many of you would have just whipped open your laptops and started sending messages to Alexa on stage to help me with my presentation. I know you would have done it. You're probably still trying. But I am checking that that is the application ID that I'm using. Yeah, don't take a picture.

All right, this is the invocation name of our Alexa skill. We call Alexa "that which shall not be named," because it will wake up. And then we have this piece. This is the meat and potatoes, right? So we'll walk through this here real quick. This is the intent schema, where you describe all of the intents that you have on your platform. So an intent is something like deploy, right? I want to deploy a thing. And then I say, if a sentence looks like this, this part is the application and this is the environment. And those are the slot types that I define. I can have an application list slot type and I can have an environment list slot type. And those are types. I'm defining my own custom types in the language of the Alexa Skills Kit. And before you ask, no, there is not a way to do this outside of the GUI, which is why I'm showing you the GUI. It's really not that difficult, but I was not excited about doing a thing that I could not do programmatically.
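One small workaround for the GUI-only setup, sketched here as an assumption rather than anything shown in the talk, is to keep the schema in code and generate the JSON you paste into the intent schema box. The intent and slot names below mirror the slide; the JSON shape follows the old Alexa Skills Kit intent schema format, so double-check it against the docs.

```ruby
# A sketch: keep the intent schema in Ruby and generate the JSON to paste
# into the Alexa Skills Kit GUI. Names mirror the slide; the JSON shape is
# the old ASK intent schema format, so treat it as an assumption.
require 'json'

intent_schema = {
  intents: [
    {
      intent: 'DeployIntent',
      slots: [
        { name: 'Application', type: 'APPLICATION_LIST' },
        { name: 'Environment', type: 'ENVIRONMENT_LIST' }
      ]
    }
  ]
}

puts JSON.pretty_generate(intent_schema)

# Custom slot values go into the GUI one per line, like the slide's lists:
puts %w[steve hal heaven].join("\n")   # APPLICATION_LIST
puts %w[production staging].join("\n") # ENVIRONMENT_LIST
```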
So let's look at an intent here. This is an example intent, the intent to deploy. I've got two slots here, one slot for the application and one slot for the environment, and I've got custom types defined there. I've got an application list and I've got an environment list. And I've defined the application list to include Steve, Hal, and Heaven. These bars are just there to make up for newlines. They're not actually in the vocabulary that you need to make an application list. You just list them on lines in the little text input field and they show up here. So there's my environment list. I can deploy Steve or Hal or Heaven to production and staging.

And then I give Alexa some more hints, that grammar that we were talking about. These are sample utterances. The sample utterances will also be used to give your users help when they ask what they can do with your service. So my sample utterances look like this: I first declare the name of the intent, and then I say, deploy application to environment. So if I say, Alexa, ask Hal to deploy Steve to production, then it would ship that off. It fires off the skill.

And on the back of all of this, I've written a service, and that service responds to these inbound requests from Alexa at an /api/alexa endpoint on halrb-production.herokuapp.com. It gets the message from Alexa of what was said, and it is expected to reply precisely with a well-formed response, or Alexa throws a giant baby fit and quits. And the GUI is not my ideal form of inputting code, as you might imagine, but it's not as terrible as it may seem, except when it sometimes breaks for no reason that you understand. And I had one intent schema that was building for about 48 hours, and I ended up having to delete it and start again. So sometimes these things are not perfect. My advice to you is to set up your intent schema and then don't monkey with it anymore, especially the night before your presentation. Don't change it.
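The service behind that endpoint isn't shown in the talk, so here is a sketch of its rough shape, with assumed paths and names: verify the application ID, dig the slots out of the intent, kick the deploy off in the background, and answer with the JSON Alexa expects.

```ruby
# A sketch (not the actual Hal service) of an endpoint that handles the
# Alexa request: verify the application ID, read the slots, respond.
require 'sinatra'
require 'json'

ALEXA_APP_ID = ENV.fetch('ALEXA_APP_ID') # the application ID from the first GUI page

post '/api/alexa' do
  body = JSON.parse(request.body.read)

  # Reject anything that did not come from our own skill.
  halt 403 unless body.dig('session', 'application', 'applicationId') == ALEXA_APP_ID

  slots       = body.dig('request', 'intent', 'slots') || {}
  application = slots.dig('Application', 'value')
  environment = slots.dig('Environment', 'value')

  # Kick the deploy off in the background; Alexa will not wait around for it.
  Thread.new { puts "deploying #{application} to #{environment}" } # placeholder for the real deploy

  content_type :json
  {
    version: '1.0',
    response: {
      outputSpeech: {
        type: 'PlainText',
        text: "Hal has deployed #{application} to #{environment}, hypothetically."
      },
      shouldEndSession: true
    }
  }.to_json
end
```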
So let's try and ship something maybe with that. Should we try it? Alexa, ask Hal to deploy Steve to production. "There was a problem with the requested skill's response." Oh, lovely. So let's try a different application. Alexa, ask Hal to deploy Heaven to production. "I'm having trouble finding the application 'happened,' though it's clear you're trying to deploy to the production environment. I can't deploy Hal, Steve, Heaven or lasers. Do you want to try again? Alternatively, you could become a monk." It's sounding better and better. Hal, wait. Alexa, send Hal deploy Steve to production. Alexa, ask Hal to ship Steve to production. "Hal has deployed Steve to production, hypothetically."

I'm not actually able to give you information about whether or not the deploy was successful, because I can't deliver messages to you unprompted. That would actually be a terrible feature. Imagine if LinkedIn could send voice messages to your living room. You'd never stop hearing about how you're getting noticed. That's an excellent point, Alexa. So the one thing that is a downside to the Alexa Skills Kit is it can reply to you, it can give you feedback about whether or not a thing was successful, but it can't then later reply. A deploy is something that we obviously want to have happen in a background job. These requests will eventually time out. We can't go and launch a deploy and wait around for it to finish to actually give Alexa the response. There is a way to keep the conversation open. I've opted not to use it, because it is sometimes difficult to maintain that. So with the Alexa Skills Kit it is possible to do that thing, but if you're going to try to accomplish that sort of feedback with Alexa, then you should be using the Alexa Voice Service, which is basically a way for you to implement your own Alexa. You can use the Alexa Voice Service to feed it text and get back what Alexa would say if Alexa existed within your system, with the same kind of boundaries that apply to Alexa, like she won't say bad words and things, much to my chagrin. And so then the Alexa Voice Service feeds those results back to you on your own device, and then you play it through your speaker, right? I'm not going to detail the Alexa Voice Service.

We're going to play with this guy. This is Hal. And Hal uses a different way to accomplish speech recognition. So we looked at Alexa, we looked at the IFTTT triggers. Hal is going to be using the Web Speech API, which is a very new thing. I'm very bleeding edge, you see. And the Web Speech API is only going to be available in Chrome and Firefox and probably Safari, although I have not had any success getting it to work on anyone's iPhones. I am not to be held responsible for mobile device compatibility, that's ridiculous. So just get Androids like me, just buy my exact phone, and then you can use my application.

So we create a recognition object. This is JavaScript, you're probably familiar with it. Our recognition object is a SpeechRecognition. This is the way Firefox calls it. At the top of this code I have assigned SpeechRecognition to mean either SpeechRecognition, if it exists, or webkitSpeechRecognition. And I've done the same thing for all of the objects we're going to talk about, so I can just call them SpeechRecognition. So we've got our recognition, a SpeechRecognition. We've got our grammar. The grammar is actually coming from the server in this case. I'm pulling it out of the code. This is a js.erb template, right? So I've got on my deployment model a grammar method that defines the grammar. And the idea is that I can eventually define that grammar dynamically, and I can reconfigure my speech recognition as the applications that exist in my sphere change. So if I create a new GitHub repo and a new Heroku app or a new Linux node somewhere, then I can add that to the grammar dynamically, so Hal will immediately be able to deploy those things.

And then I create a grammar list. I take that grammar that I've created and I add it from a string. These grammars are designed very much like the JSON you were just looking at. It's a format called JSGF. You can read all about it online. I have. It's fascinating. You can do some very complex things with the grammar. I would not be surprised if it was Turing complete.
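Since that grammar string comes down from the server through the js.erb template, here is a sketch of the kind of method that could sit on the deployment model, building a JSGF grammar from whatever applications and environments currently exist; the class and constant names are assumptions, not code from the talk.

```ruby
# A sketch of server-side grammar generation: build a JSGF grammar string
# from whatever apps and environments currently exist, so new apps become
# speakable without touching the JavaScript. The names here are assumptions.
class Deployment
  APPLICATIONS = %w[steve hal heaven].freeze   # in reality, looked up dynamically
  ENVIRONMENTS = %w[production staging].freeze

  def grammar
    <<~JSGF
      #JSGF V1.0;
      grammar deploy;
      public <command> = deploy ( #{APPLICATIONS.join(' | ')} ) to ( #{ENVIRONMENTS.join(' | ')} );
    JSGF
  end
end

puts Deployment.new.grammar
```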
But we are adding this grammar list into the grammars that exist. We're setting a couple of things on the recognition: we want interim results, so we tell it to give us results along the way while it's still trying to recognize things. We tell it a language. You can tell it a number of languages depending on the accent that you're trying to recognize in English. So if you have speakers who are using a French accent and you know it, then you can set this to French, and it will overcome some of those things by using triphones from French samples or French English samples. And then we start our recognition. So our recognition is started, and then we do something with the result.

When the recognition returns us a result, we dig it out of this hash of results, get the transcript, and we put it right into the output on the page, the text box. And then we do this App.deployment.deploy thing. And this may be a mystery to some of you who have not worked with JavaScript, so I'm gonna run through it real fast. This is Action Cable. This frayed cable here represents my implementation of Action Cable, which is absolutely not the way to use Action Cable, and I'm very excited about it. This is how you don't use Action Cable.

So I've created this App.deployment object that I just used, and I'm creating a subscription to my deployment channel. And I'll show you the deployment channel in a moment. And then when I call deploy on that deployment, as I just did right here, I'm calling deploy on that deployment, right? It's just a method call, right? I'm just calling that method right there, and I'm sending some text down the line.

So here's the deployment channel. The deployment channel has this deploy method here, and on a deployment it says, hey, respond to this data that I got, and it logs it, right? And the stream_from "deployments" part there is the part you should pay special attention to. That should include some kind of session ID or a user ID or anything to prevent every single WebSocket that your application uses from being broadcast to at the same time. But of course I've omitted that, because I wanted that feature. So when we stream from deployments, any subscriber, any webpage that comes to halrb-production.herokuapp.com, and if you are using a Nexus 6P and Chrome with the same version that I am and you go to this website, you too can experience the glory of Hal, maybe. And when you do, you and your peers will all hear the same things out of your speakers, because I'm using the same channel, which is not the way that it should be done. I feel like I've covered that.

So we get our message way down deep, we do a thing, we actually deploy, and then we ship a message back up on the deployment responses channel. And then this is the deployment responses channel. It's streaming from deployment responses, and then in the JavaScript we are taking that data and we're calling message received, and this is actually now finally calling that method back in our JavaScript, which is just this function that creates a SpeechSynthesisUtterance, which is a thing that you can do in any of these various APIs, and then it actually speaks from the browser.
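For the Rails side of that, here is a sketch of the two channels as described, with the same deliberately unscoped streams the talk warns about; the exact class and stream names are assumptions.

```ruby
# A sketch of the two channels described above. Streaming from one global,
# unscoped stream is exactly the everyone-hears-everything behavior the talk
# warns about; a real version would scope the stream by session or user.
class DeploymentChannel < ApplicationCable::Channel
  def subscribed
    stream_from 'deployments'
  end

  # Invoked from the page when App.deployment.deploy(...) is called.
  def deploy(data)
    Rails.logger.info "deploy requested: #{data['text']}"
    # Do the actual deploy way down deep, then report back to every subscriber.
    ActionCable.server.broadcast('deployment_responses', text: "Deploying #{data['text']}")
  end
end

class DeploymentResponsesChannel < ApplicationCable::Channel
  def subscribed
    stream_from 'deployment_responses'
  end
end
```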
So who wants to see if this demo works? I do. I'm pretty excited about Hal. Hal is my favorite implementation. Let's see if Hal does anything. I think I actually turned Hal off, because I'm smart and Hal is dangerous. Whoops. Hello, this one. Not you or this. This one is Hal running on localhost, which is not the one we wanted. That's the one we wanted. So this is in production right now, and hypothetically it has a wake word. Let's try. Hal. Nope. So let's just click the button. We have an option to click a button. The wake word is very difficult, because Hal is not a thing that is typically said to these, right? So in all of the grammars that they are using and all of the trees they're looking up in, they'll think it is Hal or pal or hell or hill, and I actually check for all of those variants, but I've apparently struck a variant that I was not expecting. So we can also just click the button here and tell it to do a thing. Whoops. Nope. It was listening that whole time. Clever job. Good job. Let's stop with it. Nope. Hal's kind of a smart ass.

I still don't understand. He's just taunting me now. All right, let's try again. Deploy Steve to production. Note the status when it completes. Ta-da, Steve is deploying to production, and the status message that I was talking about, we should see eventually something come up here in the logs. Steve is the Hubot implementation, by the way, who was chatting with us earlier. We're redeploying Steve, because Hubot is actually kind of quick to deploy, because we cached the modules. Anyway, the feedback cycle now can become that we can just send those messages up on the Action Cable anytime that we want, right? We can send it, we can broadcast to all of the users anytime that we would like. See, so we can give any feedback we would like from the robot, we can talk down to the Ruby. It's pretty cool, right? This is the best one so far. I was very excited about Hal.

Okay, so let's go talk about another robot. This is Marvin. Marvin, I was less excited about, because Marvin is sad all of the time. But Marvin is a pocketsphinx-ruby implementation. So pocketsphinx-ruby sits on top of the PocketSphinx library, which is part of CMU Sphinx, an open source speech recognition toolkit developed by Carnegie Mellon University, thus CMU. So Marvin uses pocketsphinx-ruby, and we should probably just, like, play this slideshow also. I wonder how long I've been doing that. Has it been a while? Oh good, all right. This is Marvin. Sorry, Confreaks. And this is pocketsphinx-ruby, which sits inside of Marvin, all right?

So we will look at using PocketSphinx real quick here. With PocketSphinx, you want to disable logging up front. When you're just playing with it, when you first run it as a demo, it's great, because it gives you a lot of feedback, probably a hundred lines to start. It tells you about the triphones it's using and the identifications it's doing, and it dumps the text to the terminal, and that's great. But it's not gonna work for what we want. So we disable the logging, and this is how you do it. You do not redirect standard out to /dev/null. You'll regret it, probably at about 11:30 p.m. the night before your talk.

This code is really bad. You're gonna be really happy about this. I'm quite pleased with myself. So we've got this PocketSphinx keyword spotting configuration. Keyword spotting lets you define a particular word that is our wake word for Marvin. When we say Marvin, Marvin wakes up and does the thing. And that ends this particular recognition. So we take that keyword spotting and we feed it in. We create a PocketSphinx grammar configuration as well, which we're gonna use for the recognition after we have recognized our wake word. So the grammar definition DSL looks like this. You can just make a sentence. "Deploy Steve to production" is the thing I want to say. This DSL does not necessarily allow you to do this in more interesting ways than just that. And there are many more complex ways to define grammars than this, but they're not implemented in the pocketsphinx-ruby gem. I'm sure they are coming sometime in the future, or maybe from you, if you were so excited by Marvin that you decided to go out and work on this.
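Roughly, the two configurations described above look something like this with the pocketsphinx-ruby gem; the wake word, the sentences, and the log-file trick are my best reading of the talk, so check the gem's README before trusting the exact API.

```ruby
# Roughly what the two configurations look like with the pocketsphinx-ruby
# gem; check the gem's README for the exact API before copying this.
require 'pocketsphinx-ruby'

# Wake word configuration: spot the keyword "marvin" in live speech.
keyword_spotting = Pocketsphinx::Configuration::KeywordSpotting.new('marvin')
keyword_spotting['logfn'] = '/dev/null' # quiet the decoder chatter without touching $stdout

# Grammar configuration for what we expect to hear once Marvin is awake.
grammar = Pocketsphinx::Configuration::Grammar.new do
  sentence 'Deploy steve to production'
  sentence 'Deploy hal to staging'
end
grammar['logfn'] = '/dev/null'
```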
So finally we create our recognizer. We've got a live speech recognizer here, and we're gonna feed it that keyword spotter that we created earlier, and that's just gonna sit there. It's gonna loop infinitely until it hears the magic word, and then it's gonna wake up. And then we are going to recognizer.reconfigure with our grammar, because now we're done with that keyword spotting. We're gonna reconfigure it with a new configuration to start listening for something different. It's gonna be listening for live speech. And then this is how you make a bell sound on a Mac: you use the say command with the Bells voice and have it say "b," and you'll hear how great that is. There are actually a lot of ways to make a bell sound, but this is probably my favorite. And then we recognize the speech, and we're looking for things that look like deploy Hal or Steve or Heaven to staging, right? So we run our recognizer, we match the application and the environment, and then there's this deploy method that presumably does an actual deploy. And once the deploy is done, then the loop will just start over automatically, or else it will hang infinitely. In which case you do what all good Ruby programmers do and you wrap it in a timeout, which is actually not something you should ever do. Timeout has a lot of problems. If you ever care about code, don't do this. But again, I don't particularly care about Marvin, and so I did it. Poor Marvin, that's why he's so sad.

Also, when you're raising a timeout, this is just gonna raise an exception that you can't predict. And of course, naturally, pocketsphinx-ruby will raise exceptions at random, depending on maybe the word you said or whether the speech recognizer was ready. So the best thing to do is just to rescue all of them and ignore them. So you just put a rescue-all-exceptions at the end of your code and they magically disappear. I actually had a developer one time tell me, when I was first starting out: you know, you can make that go away, just put rescue nil at the end. And I was like, do you know what that does? It's so bad. But I actually didn't know at the time not to do that. I did it several times. This is the equivalent of that thing. Just rescue all the errors and throw them away. We probably don't care what Marvin's whining about anyway. And then at the beginning, of course, because we're raising errors, now we'll just wrap our two infinite loops in another infinite loop. So, problem solved, right? We'll just swallow those errors and we'll keep looping forever.
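Putting that together, and reusing the two configurations from the sketch above, the whole Marvin loop might look roughly like this, warts deliberately included: the outer infinite loop, the timeout, and the blanket rescue are exactly the parts you were just told not to imitate in code you care about.

```ruby
# The Marvin loop, warts and all, reusing keyword_spotting and grammar from
# the sketch above. The timeout and the blanket rescue are the parts you
# were just told not to imitate in code you actually care about.
require 'timeout'

recognizer = Pocketsphinx::LiveSpeechRecognizer.new(keyword_spotting)

loop do
  begin
    recognizer.recognize do |speech|
      break if speech =~ /marvin/i          # wake word heard; stop spotting
    end

    recognizer.reconfigure(grammar)          # now listen for an actual command
    system('say', '-v', 'Bells', 'b')        # the Mac bell noise from the talk

    Timeout.timeout(10) do
      recognizer.recognize do |speech|
        if speech =~ /deploy (\w+) to (\w+)/i
          application, environment = Regexp.last_match(1), Regexp.last_match(2)
          puts "deploying #{application} to #{environment}" # placeholder for the real deploy
          break
        end
      end
    end
  rescue StandardError
    # Swallow every error and loop forever, exactly as you should not.
  end
end
```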
Who wants to see Marvin? I wanna see Marvin. Marvin's great. All right, let's see if I can show Marvin without screwing everything up here again. Well, that didn't work. Let's do this. I'm escaping and I'm tapping and I have a Marvin almost ready. All right, but now I can't see it, because I have to Command-F1. Good. I'm sure you appreciate the running commentary. Make sure this makes it into the talk. This is important. Okay, so we just run Marvin with this bin/marvin command. Marvin, I'm sure you'll be pleased to hear, will be a gem someday that you can install yourself, and you can swallow all of your errors and loop infinitely on your own computers. So when we run Marvin, Marvin comes alive in a very uninteresting way, because we are shrunk. But look how pretty the ASCII art is. If we embiggen and enhance: hello, Marvin, you look sad today. Deploy Steve to production. And Marvin has actually done it. Yes, I know, you're Marvin, thank you. So that's Marvin. Marvin is... "Deploy Steve to production." Thank you, Marvin. That will be all. Let's shut the Marvin down there. You just have to break twice to get out of your infinite loops. So, you know, programming. So we've now deployed to production. We've deployed Steve again to production using our lovely robot Marvin.

And I wanna go back real quickly over what we've talked about here. We talked about Alexa, and using Alexa with IFTTT to ship things. We used Hal to ship things. We used Marvin to ship things. And we were shipping all of those things to Heroku, because I work there, and they sent me here, and that was nice of them. But also because I have infinite free Heroku, and it's dumb not to use that. But all of the things that I've shown you are very easy to do with all of the other services, right? It's plug and play. I just happen to be using this thing.

And while I'm on the subject of Heroku, it occurs to me that I forgot to show you something which I wanna show you real quick. I forgot to show you that we actually have Heroku commands inside of our robots. And so I can go over to Hal again. Let's go back to Hal. Whoops, there you are. You're on the other screen. And Hal is being surly with his own name. So we're just gonna click the button and we're gonna say shut down production. And this is the part where Hal is close enough and loud enough for Alexa to hear him. But it didn't work. So let's try again up here. Shut down. Yes. Yes. Alexa, stop. And then, what, nope. What are you doing? Thank you, Alexa. Scale web zero. All right, we'll stop him there. I'm gonna shut him down a different way.

Let's try Marvin here. Marvin will help us. Marvin is a sad robot, but he is occasionally helpful. Let's see what Marvin has to say. Nope. Marvin, it's fine. Marvin, murder Hal's process. "It seems you have outlived your utility, fleshwad. It may be time to make better use of all that wasted carbon." Alexa doesn't think very much of us humans. I'm going to kill you somehow. With meat. All right. Well, I think I've had enough of being insulted by my robots. For now, I will return to my slides, only very briefly, to say to you all: thank you very much for having me. It has been a pleasure. If you'd like to play with some robots later, come hit me up. I'll leave Alexa out by the booth or something, and you can talk to her about things.