So I'm going to give you guys a little bit of a talk about home automation with the Amazon Echo and Ruby. I hope it works well; I've never tried this thing out in a room with 300 people, so I guess we'll see. But basically, I wanted to create an API where there wasn't one, to ameliorate all of my first world problems. I don't want to get up off the couch. Why would you? So who am I? I'm @zachfeldman on GitHub and Twitter and all those things. I'm the co-founder and chief academic officer of the New York Code and Design Academy. We're one of those newfangled coding boot camps that you keep hearing about. One of our students is actually in the audience right there. How's it going? NYCDA.com: classes on web development from basic to advanced, iOS development, UI/UX design, pretty much just awesome technology classes. So check them out, and ask me any questions afterwards if you want. We're launching in Amsterdam in September, so I'm stoked about that.

So the Amazon Echo is this thing right next to me that looks kind of like a garbage can. It's kind of like R2-D2 in that way. It has seven microphones, which is amazing. I can tell you almost for sure that this phone doesn't have seven microphones, nor do any of the Android phones. So all the approaches to speech recognition that have happened so far have been on inferior hardware, if you ask me. I don't know a ton about actually taking a waveform and parsing it into text; that is not really my area of expertise. But I can guess. I also have a music degree. I can guess that it's probably a lot easier to parse speech from something that has seven microphones versus the one to three microphones phones have at most. So that's why this is so amazing.
I'm sitting on my couch at home, or even in my bathroom, and my living room is over here, and that's where my Echo is. I say, well, I'm not going to say the wake word just yet, but I'll ask it to do something, and it'll hear me from 20 feet away and parse that text pretty clearly for this generation of voice recognition technology. So when I got my Echo about eight or nine months ago (I don't know how I got it that quickly, don't ask me, and I don't know how to move your name up the wait list either, beats me), it came with some built-in functionality. Like, for instance: Alexa, set an alarm for 45 seconds from now. Let me move the mic over to the Echo so you guys can all hear it. Alexa, where is the International Space Station? "The International Space Station is in space." Thank you. Thank you very much. Alexa, play Sober by Childish Gambino. The song's Sober, right? Yes. "Sober, by Childish Gambino." So that's one of my favorite songs. Alexa, what's the forecast for tomorrow? "Tomorrow in Brooklyn, you'll see rainy weather and can expect a high of 84 and a low of 73." Alexa, do my talk for me. "Hmm. I can't find the answer to the question I heard." Oh, no. I don't have any other slides. So that's my alarm. Alexa. Alexa. Alexa. Alexa. Alexa. Stop. All right. So I love this song. Alexa. Alexa. Stop.

So it comes with a bunch of amazing built-in functionality. When I unboxed it, I was wowed and amazed by this, but I was also thinking: why can't it do more? This is a little bit ridiculous. It seems like I could just parse the text and have it control other things in my life. So, yeah, I wanted to add some more functionality to it. And to do that, I created a kind of proxy, a weird kind of API. I'll go ahead and start it up right here, and hopefully it'll work. Let's see. I'll explain a little bit more about exactly how this works later on, but I'm not touching my computer right now, just putting that out there. It's kind of all happening.
So once this is done booting up, I can say: Alexa. Add an event, breakfast at Tiffany's at 5:30. Stop. And then hopefully, if all has gone well, I'll have an event. Oh, there it is: Breakfast at Tiffany's. And it has the word "stop" in it, which is kind of a hack. We have to work around the limitations of the device. Obviously, Amazon is not condoning this, so if anyone from Amazon is in the audience, sorry. But I also have some other functionality to show you guys to begin with. So: Alexa. Tell the world, Gotham... Alexa. Tell the world Gotham Ruby Conference rocks. Stop. No. So, you know, it's kind of working: it at least heard me say "tell the world", which is the instruction to tweet. Right, you guys get the point of that. Alexa. Tell the world Jesse Chan-Norris is the man. Stop. Jesse Chan-Norris is one of the organizers, and also one of my mentors. Let's have a hand for Jesse. And hopefully he showed up in my Twitter. Well, you know, at least someone kind of like him. So, you know, the point is you guys are all going to favorite these, right? It's going to be great. The point is, it at least heard the instruction "tell the world"; the text after that may have been a little garbled. It works a little better when I'm at home and, like I said, not in a room with 300 people, but it works okay.

So how does this work? Does anyone know? I'm just kidding, I know the answer. It's the Alexa Home project. We click on this, it's just kind of a short URL, and this is a GitHub repo if you guys want to take a look at it. Have any of you guys ever heard of GitHub? Yeah. So anyway, there's a scraper, which scrapes the commands from the Amazon Echo web app, and there's a server, which receives the scraped commands and then parses them. So basically it's two different components here. And the scraper uses Watir WebDriver to log into the Amazon Echo web application, which you saw.
And then, when a new command is posted, it sends the command to the Alexa Home server application to be parsed further. So what's Watir? A lot of you guys have used Watir before for feature testing; basically, it's a way to run WebDriver really easily from your tests. But I'm actually not using it for that, which is a little bit weird. Why can't I just use Nokogiri like everyone else? Another question you might ask is: why would I scrape it from the web app? Why wouldn't I just try to intercept network requests from the device itself? And I'm glad you asked. I actually did try to do that. They're, obviously, encrypted. So if any of you guys are worried that whatever you say to this thing is going to end up in the hands of Amazon, I will tell you that it only sends a request whenever you give it a command, and that all those requests are encrypted. I couldn't find a way to decrypt them very easily. So I went to the next available option: there's a web application that has a history of exactly what you've said. Check it out. So I was like, great, I'll just scrape it using Nokogiri. But then I thought, I don't want to poll it every two minutes or every five minutes or something. I want this to be immediate: I want to say something and immediately have a command be executed. So we're not just web scraping; we're actually monitoring for ajaxComplete events on the document. And it's kind of like a webhook, in a way: whenever the web page does something, it sends a command to my server.

So let's see some code. This is my watir_login.rb file, which is basically taking care of all the magic here. I'm creating an object called AlexaCrawler and giving it some settings: the URL of exactly where all the history for the Echo is, and the login URL (if you go to echo.amazon.com, you usually get redirected to this page). Also the refresh time in minutes. This is a hack.
So every 32 minutes, I want to just kill the application entirely, kill Firefox, and restart it. And it's a lot more reliable that way. When we start up, we initialize a new instance of a Watir browser, and that opens up Firefox for me, or whatever your default browser with Watir is. And then there's also this keep_alive method, which ensures that, like I said, every 32 minutes the process is killed and then restarted again. And as you saw before, we're basically using Watir to fill out all the fields to log into the application on echo.amazon.com, clicking the submit button, and then going to the history URL to look at the history from the Echo.

Once we're on that history page, how do we find out what the next command is? Good question. We're actually going to inject some JavaScript into the page. First of all, for restarting the script, it recognizes whether there was a last command, so that we don't repeat commands. I was having this weird problem where I would tell Alexa to do something, and then 32 minutes later it would do that thing again. So I would say, turn on all the lights, or turn off all the lights, and then later I would just be sitting in my apartment talking to my friends, and all my lights would just turn off. That wasn't very fun, and I had to find the bug; I was just executing the same commands twice. So we have to protect against that. Then, once a new command gets pushed to the page, we figure out the command by doing some jQuery parsing of whatever the first command on the page is. And then we send a GET request to the server that I have running on localhost:4567. Can anybody tell me, from localhost:4567, what framework I'm probably using? Sinatra, exactly. So our good old friend Sinatra. And then I run keep_alive. It's a recursive function; it just calls itself over and over again, which is, you know, not super awesome.
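The scraper just described could be sketched roughly like this. To be clear, this is my reconstruction, not the repo's actual code: the class name AlexaCrawler is mentioned in the talk, but the URLs, form-field names, and the '.textInfo' selector in the injected JavaScript are guesses; the real code lives in the Alexa Home repo.

```ruby
# Placeholder JavaScript injected into the history page: watch for
# completed AJAX calls, pull the newest utterance, and forward it to
# the local server. The '.textInfo' selector is a guess.
INJECTED_JS = <<~JS
  var lastCommand = null;
  $(document).ajaxComplete(function() {
    var command = $('.textInfo').first().text().trim();
    if (command && command !== lastCommand) {
      lastCommand = command;
      // Hand the command off to the Sinatra server on localhost:4567.
      $.get('http://localhost:4567/process', { q: command });
    }
  });
JS

class AlexaCrawler
  HISTORY_URL = 'https://echo.amazon.com/#cards' # assumed
  LOGIN_URL   = 'https://echo.amazon.com'        # redirects to sign-in

  def initialize(email:, password:, refresh_minutes: 32)
    @email, @password = email, password
    @refresh_seconds = refresh_minutes * 60
  end

  # Watir is required lazily so the class loads without the gem.
  def start
    require 'watir'
    @browser = Watir::Browser.new :firefox
    login
    @browser.goto HISTORY_URL
    @browser.execute_script(INJECTED_JS)
  end

  def login
    @browser.goto LOGIN_URL
    @browser.text_field(name: 'email').set @email
    @browser.text_field(name: 'password').set @password
    @browser.button(type: 'submit').click
  end

  # The talk's version recurses; a plain loop does the same job
  # without growing the stack. Kill the browser, start fresh.
  def keep_alive
    loop do
      sleep @refresh_seconds
      @browser&.close
      start
    end
  end
end
```

The lastCommand check in the injected script is what guards against the replayed-command bug: after a restart, the first command seen is remembered rather than re-executed.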
I know I'm not exactly following best practices here, but I just wanted this thing to stay alive as long as possible without my interference. Cool. So that's how the scraper works: we're just using JavaScript injection to send a GET request to a local server.

So the server receives the scraped commands from the Amazon Echo, and we're basically using regex-powered parsing to match against key phrases for different modules that we've written. And it can be used with any plain-text input, not just from the Echo, which is really interesting, actually. Some of you may know the GoogolPlex hack for Siri, where you could say "GoogolPlex", go through some kind of proxy, and scrape whatever Siri was saying. I feel like it's been deactivated recently, but you could find a way to take the commands from Siri, or the commands from Google Now, or something else, and pass them to this. That's why I decided to break it into two parts: I wanted other people to be able to use this code for their projects too.

So, the server architecture. Sounds like a really complicated word, right? Five lines of code. Sinatra. The first pass that I did was super simple: require Sinatra, a GET route at /process with a parameter q, scan that parameter for the words "turn on", and then do hue stuff. So pretty much I just had to write "# hue stuff" and then all the hue stuff happened. It was very simple. Joking. So the first pass of this was, like I said, to control the Hue lights in my apartment. I was scanning for the words "turn on"; when I found those words, I used the Hue API to ping the lights in my apartment and turn them on, and eventually turn them off as well. And as you can imagine, the code base grew from that into this really gnarly if/elsif/else kind of thing that was just looking really terrible.
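That first pass might have looked something like the sketch below. This is my reconstruction under stated assumptions: the route logic is pulled out into a plain handle method so it can run standalone, and the two hue helpers are stand-ins for the real Hue API calls, which the original made instead.

```ruby
# Stand-ins for the real Hue API calls (hypothetical):
def turn_on_lights;  :hue_on;  end
def turn_off_lights; :hue_off; end

# Route logic as a plain method. In the Sinatra app itself this
# would sit behind something like:
#   get('/process') { handle(params[:q]) }
def handle(q)
  if q =~ /turn on/i
    turn_on_lights
    'lights on'
  elsif q =~ /turn off/i
    turn_off_lights
    'lights off'
  else
    'no match'
  end
end
```

The if/elsif chain here is exactly the shape that got gnarly as more commands were added, which is what motivated the modular rewrite described next.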
And things just got more complicated, and I wanted to encourage open source contribution. Two things that I think really encourage open source contribution: documentation, right? Don't we all love documentation? Really easily readable documentation, and also an easily extensible, modular system. So not just a huge if/elsif tree, but something that people could actually see as a pattern, replicate really easily on their own, and then write their own code against. So this is the current code base, and like I said, that's at alexaho.me. Alexa, Home... stop. Sometimes this thing gets a little bit annoying, but you know. So my code is a little bit better formatted now, and I'll show you guys on GitHub just because it looks a little cleaner there. We've got the docs folder, which basically describes getting started and running the program on a Raspberry Pi, which is really nice: I don't have to have my laptop open all the time, I just have a Raspberry Pi in a closet in my apartment that runs this program for me. It's basically editing the autostart file on the Raspberry Pi. There's also how to stop Alexa Home, because not everyone knows how to grep through all their processes and kill one, so I thought that was pretty important. And then there's documentation for all the modules: Hue, a JRiver media player, Nest, Uber, Evernote, and also scheduling. So that's the docs folder; the scraper is in a separate folder, the server is in a separate folder, and then we have all the different modules in their own folders too. So it's a very modular kind of architecture. And if you take a look at the current state of things, it's a lot cleaner than it was when I first started. Still not necessarily the best pattern, but it's easy to contribute to, which is most important. So whenever you create a new module for this platform, basically it's a class whose name starts with the words...
I'm not going to say them, because it's going to react to me, but you know what I mean. The A word. By the way, the two things you can wake it up with are that word and also "Amazon", if you set it that way. So if you happen to have an ex-girlfriend with that name, you can just call it Amazon instead if you'd like. So anyway, here we're configuring the Twitter client. Every class defined for this project has a wake_words method, which is basically an array of all the different phrases that can wake up that specific module. Before, when I was saying "tell the world", that's exactly how that works: I just defined the wake words inside that array. And then there's a process_command method, which is on every single class, and it takes whatever command the Echo output, parses it, and does something with it. This is a really simple example, because the Twitter gem is very easy to use. So whenever the command comes in, I'm basically removing the word "stop", also removing "tell the world", and then returning whatever is left; that's what gets sent to my Twitter account. And at the bottom of each module I have this constant, MODULE_INSTANCES, and we push a new instance of each module into that array; later on we iterate through the array to check each module. That's actually done in app.rb. So here we're saying, first of all, which modules to use: modules = YAML.load_file, and this is just a quick YAML configuration file, so I can decide which modules to load. In fact, if you're going to be trying this at home when you get your Amazon Echo, I'd recommend commenting out the Hue module unless you have a Hue, because it's going to ask you to press the button on top of your Hue device, which you do not have. So you're going to need to do that, and that's actually why we built in this modular module-selection tool. Wow, that was really meta.
Because we wanted people to be able to use this without necessarily having all the components that are necessary for all the different packages it uses, so we have some selectivity there. So once again, we're using this to figure out which modules we're going to load. We iterate over each module and require it inside of this loop, and then, whenever we get a command, we use the module instances that we have and basically scan through the command for any of the wake words. If we find any of the wake words, we process the command using that specific module. So everything's a lot more modular now. People are contributing to the project. There's a man, Steven Arkanovic, out in Washington, who has already contributed three or four different modules and is just really gung-ho about it, and it actually got him into Ruby, which I was really happy about; it's great to see another Rubyist join the team. And generally, I'm just much happier with the way this architecture is laid out. More open source contributions are, in my mind, something that's great. Cool.

So, imitation is the sincerest form of flattery. Well, Bezos laughs, on a yacht. Jeff Bezos, that is, the head of Amazon. Alexa Home was actually first to market with what are now native integrations. What I mean by that is that I hacked together these integrations, and then about a month later Amazon came out with the same exact thing. Whatever, Amazon, that's cool. I'm not going to sue you or anything. I would never do that. That's impossible, apparently.
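The config-and-dispatch flow from app.rb could be sketched roughly like this. The YAML keys and module names here are placeholders of my own, and the EchoBack class is a throwaway demo module so the sketch runs on its own; in the real app, requiring each enabled module file is what pushes its instance into the shared MODULE_INSTANCES array.

```ruby
require 'yaml'

# Stand-in for the repo's YAML config that selects which modules
# load; comment a module out to skip its hardware requirements.
CONFIG = YAML.safe_load(<<~YML)
  modules:
    - twitter
    - calendar
YML

# Scan an incoming command for any module's wake words; the first
# module that matches gets to process the command.
def dispatch(command, instances)
  instances.each do |mod|
    if mod.wake_words.any? { |phrase| command.downcase.include?(phrase) }
      return mod.process_command(command)
    end
  end
  nil
end

# Throwaway demo module standing in for a real one:
class EchoBack
  def wake_words
    ['say back']
  end

  def process_command(command)
    command.sub(/say back/i, '').sub(/\bstop\b/i, '').strip
  end
end
```

First-match dispatch keeps the server ignorant of what any module actually does; adding a capability is just a new file with wake_words, process_command, and a line in the YAML.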
But we came out with a Hue module, and the Hue module for this project can control not only turning the lights on and off, but also the brightness of the lights, the saturation, and many, many different colors, which is really nice. The module that they have can turn them on and off, which is great, but I want more functionality. There's also a Google Calendar module, contributed by Steve Arkanovic, and they actually just integrated that two or three weeks ago. So leading up to this talk I was panicking a little bit; I was like, what if they take all the integrations that I've created, and I'm literally presenting the same exact thing that is now inside of this product? But luckily they only took two of them, so I still have some other things to show you guys, which is nice. There's also an official API now, so you don't have to use my hack anymore. They've only vaguely released some details of the developer program, but it's not just parsing text; it's a bit more difficult than that, and frankly I think my hack is easier to program apps with than their official program. So what I'd love to do one day is create some kind of Ruby DSL for creating applications for the Amazon Echo that makes it as easy to use and as modular as the library that I've built, versus what's out officially right now.

So I'm going to give you guys one final demo, and then I'm going to leave some time for questions. This is a demo with Uber, so we need to bring up that app and make sure my server is running. Yep. Oh, I need to rate this guy. What should I give him? It was a pretty good ride. Good job. Cool. So I'm not pressing any buttons here, and let's just hope that this works. Alexa, get me a cab to Union Square, New York, New York. Stop. So I'm not touching a thing, just using my voice. Give it a moment here, see what happens... and my driver is confirmed and en route to my apartment in Brooklyn. So you're all
wondering, oh, well, why are we in Brooklyn right now? So I did all the testing there, and that's kind of what's hard-coded in right now, but in the future I hope to be able to parse both beginning and ending locations as well, so you can actually get a cab with your voice, which is pretty cool. I have to cancel this ride, unfortunately, so I'm sorry, Milan; if any of you guys know Milan, send him my apologies. Cool, so that's what I've got for you guys. I have more stuff at alexaho.me, and I'd love for you guys to contribute to this project as you get your Echoes, so we can keep hacking on it and eventually, like I said, build the DSL. That means you can bring all those modules over to the native platform once that's out to the public. Thanks, guys!