My name is Dave. You can find me on Twitter at Dave Tapley. So firstly, thank you for coming to my talk. It's going to be fairly broad. And I'm going to be showing you how I've been engineering a solution to a problem in a board game. So if you are here because it says board games in the title, just to clarify, there is actually only one board game. But that's still one more board game than at any other talk this year. And if you are here because it says augmented reality in the title, then, yep, I've been working on an augmented reality app for the web using Rails. So to that point, almost nothing I'm going to show you is probably the right way to solve these problems. It's definitely not the right tool for the job. But I've had a lot of fun with this little project. I've got to use some old and new technologies in weird and wonderful ways. But before we get to all that engineering fun, we need to talk about board game fun. So introducing Pitchcar, which you can see on the screen. And I thought the best way to show you Pitchcar would just be to show you how it works. Introducing the RetailMeNot Phoenix Pitchcar demo team. Hey. So it's about as simple as games get. You're just going to take it in turns to put a piece on the start. Everyone has one of these little cars. It's just a little wooden disk. And then you take it in turns to flick the cars around the track. And then the first person to complete two laps is the winner. The track is made up of these little pieces. They kind of jigsaw together. It's just straights and corners. And this is actually a smaller track than you can build, but trivia: with the full game, there are 630 possible combinations of track. So you have plenty of replay opportunities there. Something to notice with the track is these little red walls. Just missed your cue. So the walls are really what makes the game fun. You can slide around them and bounce off them. As you just saw, a side effect is that there are places where there are no walls. 
And so there we go. Right on cue. So coming off the track is just part of the game. It happens all the time. The good news is you're not out of the game. You just go back to where you were. That's a bit of a problem, because firstly, remembering is hard. And sometimes you can knock yourself and take other people off the track at the same time. And even if you can remember where you have to go back to, it's a game that's all about being in the right place at the right time. And it's very easy to remember you were maybe half an inch to the left, or the other person was, for example, behind you. So we're all human. No harm, no foul. But I'm an engineer, so I thought we can do better. So let's have computers do it. So by it, broadly, I mean capturing images, like the one you see on the screen, and then we want some way to identify where the cars are. And then if someone leaves the track, we want to show where they were, so they can be put back accurately. That's cool. Thanks, GitPitch. So before we get to how I engineered the solution, I was going to do a bit of a tech preview to try and justify some of the decisions I made. So I figured there were broadly three things that I had to do. One is actually capture the images from the webcam. I'd never really played with this before. Turns out there's this web real-time communications API, WebRTC. Here's what Mozilla has to say about it. If you scroll down a bit on this page, no surprise, it involves JavaScript. And I'm kind of like most people: I cautiously say JavaScript is the language of the web, I guess. It's OK. But I haven't got to play with any new frameworks in a while, so I thought, you know what? Everyone's talking about it, so I thought I'd give Vue.js a try. So we're going to see some Vue.js. I won't say too much other than I'm kind of back into JavaScript now, which I never thought I'd say. So once we've got these images, we're going to need some way to process them, do some computer vision. 
That sounds terrifying in Ruby, so I just cheated, basically. And I'm using this 18-year-old, super stable, Open Computer Vision library, OpenCV. But there's a Ruby wrapper, so we don't have to actually look at any C code. And then I wanted it to be real-time. I don't want to have to submit a form every time someone takes a shot, because that would be tedious. And I haven't got to play with Action Cable too much, so this project was a good entry to that. So just in summary, this is the how-broad-the-talk-is-going-to-be slide. We're going to see this WebRTC protocol. I'm going to be saying how much I like Vue.js, probably more than I should. OpenCV on the back end, that's going to be a bunch of math, but it'll be OK. And then Action Cable for sending it all back and forth. So the plan, and this is going to be the bulk of the talk, is getting these images, Action-Cabling them over to the server, processing them in OpenCV to figure out where the cars are, so when they come off the track, we can show you where to put them back. And then if we can get through this, there might just be a demo that may or may not work. So part one, I needed to get the image. This seemed like the logical place to start, because this is what developers do. I'd never really seen Vue components or anything, so I found this Vue WebCam component. If you go to the GitHub page, they have this little snippet. And this is their hello world for Vue WebCam. The top element, that vue-webcam one, that's what Vue's going to take over. And it's going to put in that video tag that implements WebRTC. And then below that, you see we've got this button element. The only interesting thing about that is the @click. And that's Vue.js's way of saying, give me an on-click handler. So we're going to see that function in a minute. And then the image tag, you can see there's a colon before the src attribute. This is another Vue.js-ism. 
And this is saying, set the source for this image tag to be whatever is in the Vue.js component's photo property on its view model. And we'll see how that works later. And then there's a corresponding JavaScript snippet which implements that take-photo button. That happens when you click the button. And so it looks fairly simple, but I immediately started thinking, well, what is this .photo? Like, how do you even represent a file or image data in JavaScript? So spoiler, it's a string. This .photo is a massive string. Specifically, it's a data URI. I'd never met these before. Mozilla says data URIs, prefixed with the data: scheme, allow content creators to embed small files inline in documents. So "small files", that's vague, it's 2017. Under a gig, it's probably small, I think. That's reasonable. So I wanted to play around with these data URIs. So here's a file, a small file. Let's see if we can get a data URI version of it. Turns out there's this little handy website. You put in a URL, you hit Generate Data URI, and you get out, at the bottom, that little input. It's a bit misleading, because if you copy out the text in that thing at the bottom, that input, it actually is this. So, massive strings, I wasn't kidding. What's amazing to me, and I'd never seen this, is if you stick that monster string in the src attribute of an image tag and load it in a browser, you just see the image. Like, that just works. I learned that src attributes don't always have to have files. You can just be like, here's the data. So I'm going to call that a success, one step down. Albeit as a data URI, we do have the image. So the next challenge was to Action Cable it over to our server. So no surprises. To get a Vue.js component working with Action Cable, we go and Google Vue.js Action Cable. It turns out Richard LaFrancoe has been doing the same as me, albeit with Pong. But it's still a video game, so that's the same ballpark. I had a look at Richard's code. 
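To make that concrete before moving on: building one of these monster strings is just Base64 plus a prefix. A minimal Ruby sketch, not the app's actual code, and the MIME type is whatever your image really is:

```ruby
require 'base64'

# Build a data URI from raw image bytes: a MIME prefix, ";base64,",
# then the Base64-encoded payload. The result can be dropped straight
# into an <img src="..."> attribute.
def to_data_uri(bytes, mime = 'image/png')
  "data:#{mime};base64,#{Base64.strict_encode64(bytes)}"
end

to_data_uri('abc')
# => "data:image/png;base64,YWJj"
```

In practice the bytes would come from `File.binread` on an image, and the string gets enormous, exactly as described above.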
It turns out there are broadly three things you have to do. Kind of make the Action Cable JavaScript, the JavaScript that ships with Rails, be available in Vue.js. Then you want to subscribe to an Action Cable channel. And then once you've got your channel, you can start sending messages over it. So let's look at those three. Nothing too complicated. Just import Action Cable. This createConsumer call, that's just part of the Rails Action Cable API. And then what Richard did is just stuck it on the Vue prototype, so it's globally available. I'm OK with that. Oh, and this just goes in the setup code Vue gives you, which runs when the page loads. And then the next step was to actually create the subscription, so you've got something over which to send the messages. Vue.js gives you this created callback. And it calls this right before it's going to put a new Vue component into the DOM, or around then. So Richard just uses that opportunity to create his subscription to the Action Cable server. So that's quite nice. The element goes in the page, and it sets up its own channel to talk to the cable server. And then the final, third piece of the puzzle is just using that video channel. And so the video channel sits on the Vue.js component. And so inside the component, I can just say this.videoChannel and send stuff over it. And if you haven't seen Action Cable, it just pops up in Rubyland: you just have a class with a name corresponding to the channel, and you get this receive method invoked with the data. So as a quick aside, you'll see this pattern over and over again. It's really cool. It's about as far from the Rails kind of classic request-response cycle as I've gone, still using Rails. And you essentially have these completely encapsulated DOM elements that are maintaining their own asynchronous communication with the server, just in isolation. So just to close the loop, here's our take-photo function. 
And so I guess we can just, now we've got a video channel, we can just send that giant string, the data URI, back over it. Seems too good to be true, right? It did work. It does work. But perhaps tellingly, it turns out that in development mode, Action Cable will log the entire message payload to the development log, for every message. So it's not pretty, but I'm calling that a success. It turns out that you do actually get your data URI. So we've got our data URI in Rubyland now, on our server. So next up, we're going to see if we can get it loaded into this OpenCV library. So if you have a look at the OpenCV docs, you find this class, the closest thing to an image class in OpenCV, IplImage; IPL because, I think, Intel Image Processing Library. If you scan down, it does have, if you can read that, a load function on IplImage. But it takes a file name. So now we've got the problem of how to fit our string-shaped data URI into our file-shaped load method. So I got thinking, it's pretty obvious in these data URIs that everything after the comma is the data, I guess. So I thought maybe I can just regex on the comma, and that's the data, and put the data in a file. That seemed a bit ugly. So back to Google. It turns out that Debal's written this data URI gem. So that's awesome. I thought, well, OK, how should I be handling these data URIs? More regexes is the answer, just regular expressions all the way down. But at least it's in a gem, hey, so that's good. More gems. So I mean, it's basically the splitting-on-a-comma gem for my purposes, but why not. And then I've got this data portion, still in a string. What I couldn't believe worked is that you can just open a file in binary mode, write the string to it, and then just give that to IplImage load, and it works. This seems like something to me that an undergraduate would do, and then we'd be like, no, you can't just write binary strings to files in Ruby, because encoding and things. 
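For the record, that "undergraduate" move, split on the comma, decode, write in binary mode, looks something like this in plain Ruby. This is a sketch of the shape, not the app's code, and the data_uri gem does the parsing far more carefully:

```ruby
require 'base64'
require 'tempfile'

# Everything after the first comma is the payload; the header says
# whether that payload is Base64-encoded.
def data_uri_bytes(uri)
  header, payload = uri.split(',', 2)
  raise ArgumentError, 'expected base64 data URI' unless header.end_with?(';base64')
  Base64.decode64(payload)
end

uri = "data:image/png;base64,#{Base64.strict_encode64('pretend-png-bytes')}"

file = Tempfile.new(['frame', '.png'])
file.binmode                        # binary mode, as in the talk
file.write(data_uri_bytes(uri))
file.flush
# file.path is now a real file on disk that IplImage.load can open.
```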
But no, I guess it really worked, just the first time. So that's cool. We've got our image loaded. I need to do a quick segue now to talk about matrices. So I think probably most people have heard of a matrix, but if you haven't, it's a table. I think it's impressive that Google manages to pull a definition that contains the words rectangular and rows and columns, and doesn't at any point say it's just a table, but sure. So images are just matrices, where each cell in the table is just the color of the pixel at that point. OpenCV makes this clear in a nice, satisfying, object-oriented way by just having their image class subclass their matrix class. So as far as OpenCV is concerned, an image is just a table of pixels. They are one and the same. If you need more convincing that an image is just a table, then why not this spectacular website where you can upload an image and download it as a spreadsheet, where they've set the background colors of the cells to the pixel colors. It's fabulous. I'm glad it exists. So cool. Now we've got our image, making good progress. We've got it in OpenCV. And I feel like we have some intuition that an image is just a table of pixels. Maybe we can start making headway into finding the cars. So if we look at this image again, what we probably want to know is which are the car pixels, and the car-like pixels are probably the blue pixels in this case. Seems like a reasonable assumption. So we need some way to say to OpenCV, can you just tell me which pixels are blue? So we're all web programmers. So let's just pick a web color definition of blue, maybe this blue. And then OpenCV gives us this eq method on an image, on a matrix, which is an image. I'll pick one or the other eventually. So we can just define blue. A CvScalar, I think, is just a fancy word for a value that has multiple values. So we're just defining blue, and that's really just the web hex code going in. 
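Spelled out in plain Ruby, with nested arrays standing in for a CvMat (a stand-in for what eq computes, not OpenCV's implementation):

```ruby
# An "image" is just a table of pixels: rows of [r, g, b] triples.
# eq, spelled out: compare every pixel to one exact color, producing
# a same-shaped table of trues and falses.
def eq_mask(image, color)
  image.map { |row| row.map { |pixel| pixel == color } }
end

image = [
  [[255, 255, 255], [0, 0, 255], [255, 255, 255]],
  [[0, 0, 255],     [0, 0, 255], [0, 0, 255]]
]

eq_mask(image, [0, 0, 255])
# => [[false, true, false], [true, true, true]]
```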
And then we can use this eq function. And what we get back is what I'm calling mask. And it's another matrix, but this time, instead of the matrix cells containing colors, it's just trues and falses. So for every pixel: is that pixel blue? Reasonably sensible. It would be nice to visualize mask. So, in what can only be called leaky abstractions abounding, now the matrix class suddenly has a save-image method. So there's some OO badness for you. But that's good. Of course, like the load method, it needs a file name. But we're already in love with data URIs, so let's just try and do it the other way. This didn't work quite as spectacularly easily as the first time, but it's basically the same principle. You just write to a temporary file in binary mode, read it back as a string, and then you do have to Base64-encode the string you get back, for reasons I don't really understand. But the good news is then you just put it back after the comma, and now you have a data URI again. So that's cool. As a side note, this seemed like it would be terribly unperformant, but I think in reality the file system doesn't even bother to write these files to the disk. It really didn't affect the performance too much at all, in fact. So now we've got our mask URI. We know that we can just stick it in a src attribute, and what you get is disappointment, quite frankly. I don't think there's any chance you can see it on the screen, but somewhere in that sea of false blackness, there are a few glimmers of true pixels, which show up completely white, I promise. Somewhere. But that's a start, and you might be ahead of me here. You've probably figured that the probability of a pixel being exactly that color of blue, in the huge RGB color space, is basically zero, and what we probably want is something more bluish, like a range of colors. Good news. OpenCV has our back. 
So this is kind of almost identical to the eq function, except now we're gonna give it two colors, and we're gonna say any car pixel that's within this min-max range of colors can be true in the resultant matrix. So what we probably want is some way to visualize these masks coming back from the range function and tune our min and max colors, because what we want is to find the blue car-colored pixels, but we don't wanna get true for random bits of dust that happen to be blue and floating through the image. And so the first thing was, rather than copy-and-pasting my mask URI like a chump, we're gonna use Action Cable, and this time we're gonna send our data back from the cable server to our browser. And so all I'm doing is using that OpenCV in-range function, building another data URI like we saw, and then I'm just gonna broadcast it back over Action Cable. Let's call it ColorChannel, why not. And then on the JavaScript side of it, this first snippet, this is what we saw kind of back at the beginning, where you create the subscription in the Vue.js component's created callback. What we're doing now, if you haven't seen Action Cable JavaScript before: when you create a subscription, you can also provide this received function, and Action Cable's gonna call that whenever it gets a message in. So what we do when we get our message, our data, is we're just gonna assume it's got the mask URI in it, and then we're gonna assign that to this.maskUri, and this, in this context, is the Vue component itself. And this is where, in the template, the HTML below, that :src bind-to-the-data-model thing I showed earlier becomes incredibly powerful, because now whenever we receive an Action Cable message, we're gonna take out the data URI, we're gonna update the view model, and then the image just magically updates, because JavaScript works nowadays. So that was really impressive, and again, I was kind of blown away how simple it was. 
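The range check itself, again in plain nested-array Ruby as a stand-in for OpenCV's in-range function: a pixel is true only if every channel sits inside its min-max band.

```ruby
# True where each channel of the pixel lies within [min, max] for
# that channel; this is what lets "bluish" match, not just one exact blue.
def in_range_mask(image, min_color, max_color)
  image.map do |row|
    row.map do |pixel|
      pixel.zip(min_color, max_color).all? { |ch, lo, hi| ch.between?(lo, hi) }
    end
  end
end

image = [[[10, 20, 230], [250, 250, 250], [30, 40, 210]]]
in_range_mask(image, [0, 0, 200], [60, 60, 255])
# => [[true, false, true]]   # the two bluish pixels pass, white fails
```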
So now we don't have to copy and paste our mask data URI every time. That's promising. We still don't really have any way to update the range of colors, the min and max that we're passing in. So let's have a look at that. Getting used to it now, more Vue.js. The only thing really new on this snippet of HTML is that v-model, and that's Vue.js's way of saying, hey, whatever's inside this input, just put it in the view model, in a range object; and then that range object has a red object, and that has a min and a max value, and the same for the other colors. And then there's an @change attribute on the input, and that's kind of just like the @click we saw with the button, so it's basically an on-change handler. And you can see in the snippet below, in the component, it has this methods object where you define all these methods, and I just say, hey, whenever setHSV is triggered, send a message over Action Cable. And so this pattern's getting established now, of things happening and sending messages, and in this case, we're just gonna send that whole range. So that's gonna be, for red, green and blue, what are the mins and maxes? And that message we can pick up back in Ruby. I promise this is the last bit of the exchange. We're gonna extract out the range, and then we're gonna make our CvScalars, and then make our new mask based on the new values. And what you end up with is a nice little component like this, and I was really pleased with just how, I thought this was quite elegant. So basically every time you change an input, it's firing a message over Action Cable with the new range, it's building the new mask on the cable server, turning it into a data URI, and that data URI is going back down over Action Cable and just becoming the source for the image tag. 
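On the Ruby side, the handler for that message just unpacks the range hash into a min color and a max color. A sketch with hypothetical key names, since the real message shape may differ:

```ruby
# Turn { 'red' => { 'min' => .., 'max' => .. }, ... } into the two
# color triples that get handed to the in-range call as CvScalars.
def unpack_range(range)
  channels = range.values_at('red', 'green', 'blue')
  [channels.map { |c| c['min'] }, channels.map { |c| c['max'] }]
end

message = {
  'range' => {
    'red'   => { 'min' => 0,   'max' => 60  },
    'green' => { 'min' => 0,   'max' => 60  },
    'blue'  => { 'min' => 180, 'max' => 255 }
  }
}

unpack_range(message['range'])
# => [[0, 0, 180], [60, 60, 255]]
```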
So this is nice. We now have our mask, eventually, and you can tune it pretty accurately, I found; you can get pretty ballpark-good colors, so you just see the pixels where the car is and nothing else. Of course, we still don't really know where the car is. We just have this mask, and what we really wanna be able to say is, you know, what's the x, y, like exactly where in the image is the car? I don't just need a mask of it. This, which seemed like it was gonna be the biggest part of the talk, is one slide, and it kind of justified all of OpenCV. I think you can see it. This is a ridiculous thing. I'd never heard of it. It's called the Hough transform, and it's just this amazing bit of math where you just give it a matrix of trues and falses, or greyscale values, and it tells you where the circles are. And it's cool. I was like, huh, I guess there's a bunch of parameters to it, I don't know, five or ten parameters; the internet told me here's some that probably work, and they did, so. Yeah, I can't believe how simple that is. So you just give it the mask with all your truthy car pixels in a sea of false not-car pixels, and as long as the car is a circle, which it should be, then you get back this circle object, and it has the x, y position of the center of the circle in the image. And we assume that if we can't see it, it's missing, and why not. That's just broadcast over Action Cable, because that's what we do now. So yay, making progress. Again, it was surprisingly reliable; it took a lot of tweaking, but the general premise is pretty robust. How are we doing, 15, okay. So the last piece is trying to figure out who actually fell off. So here's a nice picture. This is a simplified state of the game. It's pretty obvious the blue car's on the track. The pink car got knocked off; that's not great. 
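Before carrying on with who fell off, here's that step in miniature. A real Hough transform votes in parameter space and recovers the radius too; as a crude stand-in, the centroid of the true pixels already gives an x, y for a single round blob (pure Ruby, hypothetical helper name):

```ruby
# Average the coordinates of all true cells in a boolean mask.
# Returns nil when there are no true pixels, i.e. the car is missing.
def blob_center(mask)
  coords = []
  mask.each_with_index do |row, y|
    row.each_with_index { |cell, x| coords << [x, y] if cell }
  end
  return nil if coords.empty?
  xs = coords.map(&:first)
  ys = coords.map(&:last)
  [xs.sum.to_f / xs.size, ys.sum.to_f / ys.size]
end

mask = [
  [false, true, true, false],
  [false, true, true, false]
]
blob_center(mask)       # => [1.5, 0.5]
blob_center([[false]])  # => nil
```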
What you can't see is the orange car, which is over the other side of the room under someone's chair, because people get very excited playing Pitchcar, and that's fine. So looking at this kind of thing, I then thought, oh, okay, maybe there are actually three states. We can be on the track in the image, but we might be in the image and off the track, or we might have just been flicked out of the camera's view altogether. And then I realized that actually, you can be slightly lazier, and in reality, you're just on the track, or you're anywhere else in the universe and you're not on the track. And this sounds kind of common sense, but I went down a rabbit hole, and when I realized this, it allowed me to do something which I think is quite a nice optimization, and it starts with another mask. And so this is gonna be like the mask we saw before. It's just a matrix of trues and falses with the same dimensions as the image. But this time, instead of asking OpenCV, hey, where are the trues and falses, we're just gonna define where the trues are in the image. And we're gonna do it in such a way that all the pixels that are track pixels are true, and everything else is false. And when we've done that, we can do matrix math and just delete out everything that isn't on the track. And so this was nice, because now the orange and the pink car are just gone. They're just not even in the image anymore. So now all I need to care about is: hey, if the car's in the image, it's by definition on the track. So I felt good about that. It did mean that I actually now have to try and build this mask, which is not too bad, quite a bit of geometry, but bear with me here. So one of the nice things is every track has one of these starts. It's this guy, the checkerboard. And we know the dimensions of the start piece; it's always the same. So knowing that, we really just need to know that it's here in the image. And that really comprises three things. 
It's kind of the position, x and y; the rotation, which is probably gonna be either that or that; and then the scale, like how close and far away it is. And the scale I can just define as how many pixels long is that gonna be. And so I made a tool with Vue.js and Action Cable. And so this is almost a copy-paste of the mask tuning tool I made. So again, we've just got these little inputs. There's an on-change; when they change, the callback in the Vue.js component just sends a message over Action Cable. Action Cable is gonna make the mask you see in the middle, apply it to our frame, to our latest image, and then just send that back down. And one thing I realized when I was doing this is it's like debugging with puts, but in 2017, because all you do is publish over Action Cable. And I feel like it's a really interesting pattern, and I wonder if there's scope for some kind of interactive debugger. But anyway, so my code is littered with just publishing images as data URIs. So if you thought that log from earlier was bad, you should see it when I'm developing. So that's good, we've positioned the start piece, one step down. At this point, I'm gonna introduce my highly sophisticated code for describing tracks. It does match the one on screen, yep, good. So we've got the start, which is always a straight, and then we've got a right turn and another right turn and a left turn. Maybe I could have done some clever machine learning, but it's really not that hard to just type in the specific code for the track. So we know the next piece after the straight is gonna be a right turn. So where does that go? Well, we know that after a straight, it's always gonna be one straight's width across. And then the rotation, well, kind of by definition, a straight doesn't turn, so it's just the same direction. And the scale is just always the same, like nothing ever gets closer or further away. So that's good. 
Now we just have to draw the corner, a corner of trues, onto the mask. I skipped the straight and just said draw a rectangle, because, I assure you, there's a method in OpenCV for that. There isn't a method in OpenCV for drawing the corner piece of a Pitchcar tile, unfortunately. It's a bit of an oversight, in my opinion. So I just did more drawing. So I drew a big circle of true, and then you draw a small circle of false inside, and then you just take out a corner. Voila, just like that. All these primitives are in OpenCV. You can draw a circle of true, draw a circle of false, and then just crop the image down. And because I know where the corner goes from the previous slide, I can just draw on the corner. So, making progress, that's reassuring. The next piece, well, we know it's now gonna be one track's width across to the right, because we turned. So for the next straight, what you don't wanna do is this and then that. So we have to know that the next piece is rotated around to the right, relative to the last one. And I actually implemented this quite nicely. It worked; it was a reassuring win for OO. I just have a Straight class and a Left class and a Right class, and when you draw a piece, I just ask the previous piece where the next piece goes, so it's kind of defined recursively, which made it a lot easier. And you just keep doing that. So pull off the next letter in the track description, ask the previous piece where to put it and how it's rotated, and then, if your math's right, which mine wasn't for so very long, you'll end up back at the start. And so now we've got this mask of trues, and we can delete out everything else, and then this image on the end, we can just take that and do our where's-the-car detection. And so either the car's gonna be on the track, because it's in the image, or it won't be on the track. And I'm gonna skip over this bit because it's a bit boring, quite frankly, the code. 
But the logic then is just: if we see the car, remember it's there, and if it's still there, just show it's there. But if a car ever goes off the track, like we can't find it anymore, then just leave a marker where it was, and then the person can put it back. That's basically it. We've done it all. Should we see if it worked? Okay, so bear with me a second. All along, this was just this. If I scroll down, I can probably... so here's what it looks like when it's running. So the first thing we're gonna do is, because the track got bumped a bit because of these guys getting excited, this is why we're dialing in the position to make sure it lines up, on the scale and the x, y. That's a bit closer. Da, da, da, now we're just gonna run this. This null-image thing, I didn't talk about it in the talk, but if you want to know the secrets, you can come and find me. It's basically a way of deleting out the glare by capturing what's going on at the start. All right, let me do a reset. Do you wanna try it, guys? I'll come back to the Pitchcar team. So I'll show you the masks first. Do you wanna put one on? Hey! So that's what it looks like. Oh, I don't know, I'm too excited. And then, yay! Oh, blue, pink, almost. I'll do a reset. So it's having difficulty on account of people's hands being the wrong color. There you go, do you wanna give it a flick? Hey! So this is it running. Ooh, there it is. So it's just gonna sit here and try and find them and show where they are. And then, if someone would be so good as to flick their car off the track... watch, you won't be able to do it now. Hey! So you see the orange now has a little question mark. So this is the logic of it just saying, where did the orange go? I don't see it in the image anymore. And if Matt puts it back, not where it was, for example, there... oh, he managed to get it right on. You've trained my app into obsolescence. Brilliant. There you go, it's lost it again. 
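The bookkeeping behind that question mark, remember where each car was last seen and surface it as a marker when it disappears, fits in a tiny class. A sketch only: as the demo shows, the real app also waits for the car to reappear near the marker before clearing it.

```ruby
# Tracks one car across frames: remember the last detected position,
# and expose it as a "put it back here" marker when detection fails.
class CarTracker
  attr_reader :marker

  def initialize
    @last_seen = nil
    @marker = nil
  end

  # Call once per processed frame with the detected [x, y] center,
  # or nil when the car can't be found on the track.
  def update(position)
    if position
      @last_seen = position
      @marker = nil              # car is visible again: clear marker
    elsif @last_seen
      @marker = @last_seen       # car vanished: mark where it was
    end
  end
end

tracker = CarTracker.new
tracker.update([120, 80])   # car seen on the track
tracker.update(nil)         # flicked off!
tracker.marker              # => [120, 80]
```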
So yeah, the crosshair won't reactivate unless it's convinced that Matt's not trying to cheat. So hopefully, when he gets it somewhere nearby... ooh, ooh. Hey, there we go. And so now, when the orange car moves again... PitchCam, thank you. Okay, so the question was, how long did it take me to solve the problems, and, knowing what I've learned, do I think it'd help to reimplement a similar thing? I think probably the biggest takeaway was definitely the power of this Vue.js-with-Action-Cable pattern, talking to elements in the DOM. And what I skipped over is I was using webpack to kind of rebuild the JS on the fly. And I guess this is all coming in some new version of Rails, just for free. I was just using Docker to run them in isolation. But I think in a year, I wouldn't be surprised if everyone's just using a lot more JavaScript and everything's a lot more asynchronous. And to your second point, much faster now. Like, really, once you've gone through the pain of setting up webpack and getting it built, the development turnaround, as Vue.js does its hot reloading... I do annoyingly have to restart Action Cable, but it boots in a few seconds. So really it is like, oh, I need a component that's really reactive: I just copy another component, make a new Action Cable channel, and I'm going. Yeah, so the question was, have I thought about building the track mask automatically by just inferring the darkness? Absolutely, yes. There are endless machine learning possibilities here. One thing I was trying to do, that I didn't really make much progress on, I've got an open issue for: there's a gem, it's really a punny name, for genetic algorithms in Ruby. What's it called... Darwinning. But there's a few bugs in the gem. And what I wanted, one thing that's quite annoying, for example, is orienting the mask. 
So I think as a first thing, I'll still type in the ten-character code, but then it'll probably be easy to quite quickly use the genetic algorithm to machine-learn the orientation and map it over. And another thing that happens all the time is someone will lean in and they might knock the track, and if it goes off a little bit... you'd be surprised, over a ten-minute game, how much the track can drift from where the camera thinks it is. Yes, much to do. Yes, so the question was, why? So it is, yeah, because it's cool. Although also, yeah, when you're playing the game, it's surprising how often disagreements will occur as to exactly where someone was. Do you work for the company that publishes the game? No, I did have a slide just to clarify that, but no, I have no connection. I should have made an affiliate link for the game, though. You can get it on Amazon. So the question was, would I use Vue.js for a consumer web application? Absolutely yes. Like I alluded to, I've been quite stagnant on JavaScript frameworks, and I've kind of jumped from Angular 1-point-something-old to Vue, and I've played with React a little bit, but it really blew me away how far along JS frameworks have come. So yeah, I'm 100% in on Vue. So the question was, is Vue.js better than React? At this point, I'm just gonna come over to here and do this. So: I'm a Vim user, Vim is the best thing ever, Emacs sucks, I don't know. Yeah, maybe no, I've played with React a bit. It was okay; I felt like everything I did in React was okay, and I only used it for a few little MVP projects. And then I went to Vue.js for this project, and I was like, oh, everything's way simpler. Everything in React that was a bit like, okay, I get it eventually; in Vue.js I was like, yeah, that's obvious. So I think it has a much easier learning curve, as a Vim user. I think that's it. All right, thank you very much.