Thank you, and thanks for coming to the epic NHL goal celebration hack with the Hue light show and real-time machine learning. This talk is about a hack I did last year for the NHL playoffs. Essentially, I built a little celebratory light show in my living room for when the Habs scored a goal, and I did a blog post and a YouTube video about it. So instead of describing it further, I'll just show you a little demo. I put this video up on YouTube, it got some attention on Reddit and Hacker News, and within a week it had about a hundred thousand views and got picked up by the CBC and the Washington Post, which was pretty cool. Before getting more into the hack, a couple of words about me. My name is François Maillet. I'm the head of machine learning at a Montreal startup called Datacratic, working on our new product, the Machine Learning Database, which is free to try. So if anyone's interested in machine learning, go check it out at mldb.ai. Before that, I studied computer science and machine learning at the University of Montreal, mostly working on music-related problems such as music recommendation and playlist generation. In this talk, I'm not going to show you a bunch of code snippets and go through lines of Python code; I'm going to go through the big design decisions I had to take to make this hack possible. I've got a blog post up with a lot of detail on how this was done, so if you'd like to know more, go check that out. So how did I get started with this? Two years ago, a friend of mine lent me a nice USB button that I kept on a table in front of my couch in my living room, and whenever the Habs would score, I'd jump on the button, that would start the goal song, and we'd all dance to the music. So that was cool. But could we take it further? Could we somehow make the goal song start by itself when a goal is scored?
Not only that: in the year between the two seasons I'd bought a couple of Philips Hue lights, and I thought, wouldn't it be cool if the lights would also flash to the music? Keeping in mind the most important thing about this: it's got to be fast. So, how to automate it? I could scrape the web, hit some API, look at Twitter, a website, If This Then That. Anything like that, I realized, was a bit too slow, because if the team scores and it takes one, two, three, four seconds until someone does something on the internet that your system can pick up, you've missed the moment, right? So we needed something faster to make this work. Having worked with sound before, my intuition was that maybe there's something in the sound signal that we can use. I'll just play you a little clip, if it works. For those here who don't speak French, that was a French commentator calling a goal. And if you were watching a hockey game blindfolded and heard that, you could probably figure out that something happened, that there was a goal, right? So that was the intuition behind the hack: there is something in the sound signal that maybe we can use to do this automation. The hack has three big components that each stand by themselves. First, the light show API: that's what we want to happen, flashing lights and the goal song. Second, we're going to be using machine learning to do the detection, hopefully some nice and friendly machine learning, and I'll have more on that later on. And third, we need this to work in real time during an actual hockey game, so there's a real-time component to this. You can picture those pieces as I go through them.
My setup is a Mac mini in my living room hooked up to my HDTV, and I can stream the hockey game from the internet. So everything is going to be running on a Mac mini that has the game and is hooked up to my sound system. So, machine learning. What's the machine learning task? We're doing binary classification: trying to tell two different things apart. The first is our positive class, goals by the Canadiens. That's what we want to detect. The negative class is basically everything else. Looked at another way, we're trying to model the probability of a Habs goal given some audio. So what I did is I went on the Habs website, looked around, and found the highlights section. Conveniently for me, they had all the previous games of the current season with four-minute highlights, and highlights are cut around the goals for and against. So I said, well, I'm going to create myself a little dataset. I spent an evening, maybe two, just going through the highlights. And as any of you with experience in machine learning or data science will agree, in a project like this, at some point you're going to feel like a monkey. For me, that was the moment: spending all that time recording the audio and taking down the exact times where goals occurred. I put those times in CSV files alongside the corresponding wave files, which I could use later on. The way it works with a machine learning model is that you have to represent the thing you're trying to model as a feature vector: a numerical representation of the characteristics of what you're looking for. A simple example would be separating, say, Python programmers from C++ programmers.
Features for that could be what company they work at (maybe some companies have more of one type than the other), so you could have a one-hot vector representing that. It could be their age, their hair color if you think that's relevant, the length of their beard, anything like that. In our case, we're using the audio, so we need audio features. I used a nice Python library called librosa, which has all the basic audio feature extractors built in, and I used the mel power spectrogram. Just a few words about that. On the bottom left you have a waveform; if you open up a wave file and look at it, that's the actual sound. Next to it is a spectrogram, a frequency representation of the sound: the x axis is time, the y axis is frequency, and it's a heat map, so the color is the amount of energy present at that frequency-time pair. What's great about this is you can actually see what the sound looks like, and it's at the basis of probably every smart audio system. So, as I said, you can see what the sound looks like. This is the mel power spectrogram of a Habs goal, and I've highlighted the actual moment where the commentator yells goal. You can see the bright bands sweeping down as he yells, and then the straight horizontal lines of the goal horn, because a goal horn is constant. So that's great; I thought maybe I could actually make this work, since it's pretty clear where he's yelling goal. But then what most people ask is: how could you set apart goals for and against the Habs? Well, let's take a look. This is a goal against. You can see that the reaction is much smaller. I kind of got lucky: I mostly watch games on the francophone stations, either TVA or RDS, and as many of you can guess, those stations are a tiny bit biased for the Habs, right?
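In the actual hack the features came from librosa's mel power spectrogram; as a rough, numpy-only sketch of the underlying idea, here is a short-time Fourier magnitude spectrogram (librosa additionally applies a mel filterbank and a power scaling, which this sketch leaves out):

```python
import numpy as np

def spectrogram(signal, n_fft=2048, hop=512):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames).T

# Two seconds of a fake "goal horn": a constant 440 Hz tone, which shows up
# as a horizontal line in the heat map, just like the real horn in the talk.
sr = 22050
t = np.arange(2 * sr) / sr
horn = np.sin(2 * np.pi * 440 * t)
S = spectrogram(horn)
print(S.shape)  # (1025, 83): frequency bins by time frames
```

The constant tone puts almost all the energy in a single frequency row, which is exactly the visual cue the talk describes for the goal horn.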
When the Habs score, they go nuts; when the other team scores, it's more like, "goal... let's just move on." So there's a pretty big difference between the two, which I got lucky about. That's not the case for CBC, which is a national network trying to be much more balanced. But for me I said, well, what do you know, maybe this is going to work after all. Okay, so we need feature vectors for the model. We have a mel power spectrogram, which is a matrix, and we need an actual vector of numbers. The simplest thing I can do is just vectorize it: take every line of the matrix and lay them one after the other. That gives me, for two seconds' worth of audio, a feature vector of approximately 11,000 elements. Fantastic: I now have a way to represent my audio for the models. Now we actually need to train a model. I have my examples, but I need to put them together into a dataset, which means explicitly making a list of positive examples and negative examples. Positives are easy: goals by the Canadiens, which is what I'm looking for, no problem there. For the negative examples, it gets a tiny bit trickier. I want you all to close your eyes and imagine a Saturday night, an epic confrontation, Hockey Night in Canada, a game between the Toronto Maple Leafs and the Montreal Canadiens. Everyone is at the edge of their seat waiting for something to happen. And then the unthinkable happens, the statistical anomaly occurs: the Leafs score a goal. I know it's hard to believe, but it does happen sometimes. That's bad enough; now imagine that on top of it, the light show starts. Not only did we get scored on, but the algos are trolling us about it. To make sure that doesn't happen, we explicitly give all the goals against us as negative examples, so the model really spends learning capacity setting those two types of examples apart.
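To make that "approximately 11,000" concrete: assuming librosa's defaults (128 mel bands, a 512-sample hop at a 22,050 Hz sample rate, which are assumptions, not figures from the talk), two seconds of audio gives about 86 frames, and flattening the matrix row by row yields the feature vector:

```python
import numpy as np

n_mels, sr, hop = 128, 22050, 512   # assumed librosa-like defaults
n_frames = 2 * sr // hop            # ~86 frames for two seconds of audio

# Stand-in for a real mel power spectrogram (bands x frames).
mel_spec = np.random.rand(n_mels, n_frames)

features = mel_spec.flatten()       # rows laid out one after the other
print(features.shape)               # (11008,), i.e. ~11,000 elements
```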
But we also need to give the model examples of what a normal game sounds like, so we select 50 random segments from each highlight that are far away from any goal, so the model can see pretty much everything that can happen during a game. And then we can train something. I used scikit-learn, which most of you probably know, and very simple off-the-shelf algorithms. The reason is that, both feature-wise and model-wise, I had to get something done quickly that I was fairly sure would work. I woke up with this idea two weeks before the start of the playoffs and I needed it working for game one. So I basically asked: what's the simplest thing that could work? I tried these and they worked, with decent performance that was good enough for me, so I was able to just use that and move on. But I want to take a second to talk about the difference between a prediction and a decision. When you give a model some audio and ask what it thinks, it's going to predict something: it'll say, well, I think there's a 50% chance there's a goal in this two-second window. But if you think of how we're going to use this, it's in a real-time context with a stream of audio coming through, so we can afford to make multiple predictions. I ended up making 10 predictions per second, and it took more than one prediction saying there was a goal for me to actually decide to act on it. Combining many predictions gives you a much more robust decision heuristic. To show you how this looks, I'm going to start this movie. This is a two-second window with audio moving through it, and next to it is the probability coming out of the model. You see that I have many predictions; the red lines just mean those specific predictions were above a certain threshold. But I don't have to act on a single one.
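The training step with scikit-learn can be sketched like this; the synthetic arrays below are stand-ins for the real flattened-spectrogram vectors and labels, and the class sizes and dimensions are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins for real feature vectors: positives (Habs goals) vs. negatives
# (goals against plus random "normal game" segments), given a mean shift
# so the toy problem is actually learnable.
X_pos = rng.normal(1.0, 1.0, size=(100, 50))
X_neg = rng.normal(-1.0, 1.0, size=(200, 50))
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 100 + [0] * 200)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# predict_proba gives the probability of each class; column 1 is "Habs goal".
proba = clf.predict_proba(X_pos[:1])[0, 1]
print(round(proba, 3))
```

The key point from the talk is that `predict_proba` is only a prediction; the decision to fire the light show comes later, from combining many of these.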
I can look back at the last n predictions and use whatever heuristic to make an actual decision. So what I did was a 10-fold cross-validation. I had 10 games in my dataset, so I trained on nine, tested on the tenth, and rotated through every game so each gets tested on once, trying a bunch of heuristics and playing with hyperparameters while looking at four metrics. The first: the Habs scored and the model said there's a goal. That's great, that's what we're looking for, so you want that as high as possible. The second is the really bad case with the Leafs, the worst thing that can happen, so you try to keep that as low as possible. The next: no goal, and the model said goal, so the light show starts for nothing. That happened once during the singing of the national anthem. Why not? It's funny once, let's say, but you also try to keep that one down. The last one: the Habs scored and the model said no goal. Which is okay, it's not the best, but if the Habs only score a single goal in that game and your friends came over to see the light show, they kind of ask, "do you actually have something that works, Frank?" And, well, yes, believe me. I ended up using logistic regression; SVMs were better but overfitted a bit more, I think. The actual decision heuristic looked at the last 20 predictions; there had to be at least five yes votes, and to vote yes, a prediction had to be above 90%. That gave the following performance. Looking at the stats from the previous season, you can extrapolate from the cross-validation what performance to expect: loosely, 0.84 Habs goals per game that we're going to miss, and going off by mistake once every four games on one of those Leafs goals. So those numbers aren't perfect, right?
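The decision rule described above (act only if at least 5 of the last 20 predictions are above 90%) can be sketched in a few lines of plain Python; the class name here is made up:

```python
from collections import deque

class GoalDecider:
    """Turn a stream of per-window probabilities into a robust yes/no decision."""

    def __init__(self, window=20, min_votes=5, threshold=0.9):
        self.preds = deque(maxlen=window)   # ring buffer of recent predictions
        self.min_votes = min_votes
        self.threshold = threshold

    def update(self, prob):
        self.preds.append(prob)
        votes = sum(p > self.threshold for p in self.preds)
        return votes >= self.min_votes      # True -> trigger the light show

decider = GoalDecider()
stream = [0.1] * 10 + [0.95, 0.2, 0.97, 0.93, 0.4, 0.99, 0.96]
fired = [decider.update(p) for p in stream]
print(fired[-1])  # True: the fifth confident prediction tips the decision
```

A single spurious spike above 90% never fires the show on its own, which is exactly the robustness the talk is after.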
You'd want to get the 0.84 down to zero and the bottom one with the Leafs as low as possible. But the more Habs goals you detect, the more risk you run of catching Leafs goals by mistake. That trade-off is one you always have to make in any machine learning system. To make those numbers better, I'd have to go back, build better feature extractors, train more powerful classifiers, get more data, something like that. But for my actual purpose, which was getting something done within two weeks, this was pretty good, better than I would have expected for the fairly trivial features and model I was using. So I moved on to the next step. Having de-risked what I thought was the hardest thing to get working, now we can actually have some fun with the light show. There are two parts to it. First, we need to play the goal song, Le Goal Song. I don't know if some of you remember it; it's from 10 years ago at the Molson Centre, before YouTube and everything. That was the real goal song that they should have kept, and I'm still in love with it, so that's what I want to play. But we also have the Philips Hue lights to flash. For those who aren't familiar with them, they're smart LED bulbs from Philips that you can put in any light you already have. They come with a hub that you plug into your network, and an iPhone or Android app talks to the hub to set the intensity and color of your lights. While researching how to do this hack, I realized there's a REST API running on the bridge. How convenient. And not only that, but there's a nice library called phue by Nathanaël Lécaudé (some of you may know him; I didn't know he'd written it), a very nice, simple library that wraps the Philips Hue bridge API into nice classes and objects.
So we can actually talk to our lights. What does this look like? This is the only code I'm going to show, but it's so small and nice. To make flashing lights in your living room with the Hue lights, you basically just import the library, connect to your bridge, and set some color codes. I took one light, moved the colors around in the iPhone app until I had the bleu-blanc-rouge I wanted, and took down the color codes. Then flashing the lights is just for loops: you rotate through lights and through colors, sleeping once in a while. That's all there is to it. I spent a few evenings tweaking the for loops in my living room while playing the music, to line them up as best I could. Back then I didn't have blinds, and there's a huge condo building looking right into my living room. So a few times I'd look out and there would be a bunch of people just staring, wondering, who is this guy? What is he doing with the disco lights in his living room? I'm actually not making this up. Anyway, after a few nights of that I had the for loops tweaked to work with the music, so we had a light show. I wrapped that code with a REST API of my own using Bottle. That gave me, on my Mac mini, a /goal endpoint I could call to trigger the light show and start the music, which decoupled it nicely so it's independent from the rest of the system. That's going to be useful; I'll tell you why in a second. The final part that makes this all come together was making it work on a live audio stream, because this was nice, I had flashing lights and I could score wave files on my disk, but it needed to work on an actual game. Remember, I'm playing the game in my browser through a totally legal cable subscription service of some sort.
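The shape of those for loops can be sketched with the hardware calls stubbed out. In the real hack each command would go to the bridge through phue (something like `bridge.set_light(light_id, preset)`), with a `time.sleep()` between steps to line the flashes up with the goal song; the light IDs and color names below are placeholders, not the actual bleu-blanc-rouge codes read off the iPhone app:

```python
LIGHTS = [1, 2, 3]                   # placeholder Hue light IDs on the bridge
COLORS = ["bleu", "blanc", "rouge"]  # placeholders for the real color presets

def light_show(n_steps):
    """Build the sequence of (light_id, color) commands for the show.

    A real version would send each command to the bridge with phue and
    sleep briefly between steps instead of collecting them in a list.
    """
    commands = []
    for step in range(n_steps):
        light = LIGHTS[step % len(LIGHTS)]   # rotate through the lights
        color = COLORS[step % len(COLORS)]   # and through the colors
        commands.append((light, color))
    return commands

print(light_show(4))  # [(1, 'bleu'), (2, 'blanc'), (3, 'rouge'), (1, 'bleu')]
```

Wrapping a function like this behind a Bottle route (roughly `@route('/goal')`) is what gives the decoupled /goal endpoint the talk describes.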
So I used a little piece of software called Soundflower, which adds virtual audio devices to OS X. You set your system output to Soundflower, meaning nothing goes out your speakers because the sound is being routed to Soundflower. Then you use PyAudio, a nice Python package for playing and recording audio, tell it to get its sound from Soundflower, and you can do processing on that sound. The way it looks is: audio goes from the browser into Soundflower, into PyAudio. On the one hand, PyAudio sends it out to the HDMI output; on the other, every frame goes into a ring buffer where we keep the last two seconds of audio. Each time we add a frame, we ask the machine learning model what it thinks about those last two seconds, put that prediction into another ring buffer where we keep the last 20 predictions, and the decision heuristic looks at that. If the decision heuristic says, based on those last 20 predictions, I think there's a goal, it just calls the /goal endpoint and everything starts. But I'm still using the good old USB button from two years ago because, as I said, it's not perfect yet; we still make some mistakes. If it doesn't start when there's a Habs goal, you just jump on the button and start it, which is great. Worse, if it goes off by mistake, you need to stop it right away, because you've got people crying in your living room who just got scored on, so you hit the button. And to remind myself that I had to keep working on this and make it better, I made it play this sound when stopping the music. Okay. Once the hack was done, I was psyched, but to do some self-promotion on the internet, I put together a little YouTube video, and I did this in probably the ugliest way possible.
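The plumbing described above can be sketched with two ring buffers built from `collections.deque`. The model here is a stand-in, the frame size is an assumption, and in the real hack the frames arrive from PyAudio's stream while the trigger calls the /goal endpoint:

```python
from collections import deque

SR = 22050
FRAME = 1024                              # assumed samples per incoming frame
audio_buf = deque(maxlen=2 * SR)          # ring buffer: last two seconds of audio
pred_buf = deque(maxlen=20)               # ring buffer: last 20 model predictions

def fake_model(samples):
    """Stand-in for the classifier: 'probability' of a Habs goal."""
    return 0.95 if max(samples, default=0) > 0.5 else 0.05

def on_frame(samples):
    """Called for every incoming audio frame, as in PyAudio's loop."""
    audio_buf.extend(samples)             # keep only the most recent 2 s
    pred_buf.append(fake_model(audio_buf))  # score the current window
    goal = sum(p > 0.9 for p in pred_buf) >= 5
    if goal:
        pass  # here the real hack would hit the /goal endpoint
    return goal

quiet = [0.0] * FRAME
loud = [0.9] * FRAME
results = [on_frame(quiet) for _ in range(10)] + [on_frame(loud) for _ in range(6)]
print(results[-1])  # True: enough confident predictions have accumulated
```

Because both buffers have a fixed `maxlen`, old audio and stale predictions fall off automatically, which is all the "ring buffer" bookkeeping the pipeline needs.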
I'm sure a bunch of people here have ideas on how this could be done better, but I just generated a bunch of PNG files with matplotlib, which you all know, synced to the audio, so I had hundreds of PNG files for each of my two graphs. I stitched them together with a bash script and ImageMagick, then converted those to a movie using ffmpeg. Dirty and ugly, but it worked fine. Then I pulled that video into iMovie and did a little montage with the shots I'd taken with my iPhone of the lights flashing. So what's next with this? There's no problem detecting every goal; I can detect every goal, that's fine. The only problem is that trade-off between good and bad goals for the few cases that are on the fence. Something I didn't do to begin with, I guess because I thought it was going to be harder (I know audio better, maybe), is some image processing of the actual broadcast. If we could answer a simple question such as whose goalie is on screen right now, or which side of the rink they're on, then combining that with the audio model could bring the overall accuracy to probably a hundred percent, since we can already detect every goal without any problem. That's something I'll probably look at for the upcoming playoff season. The other thing is generalizing to the other francophone hockey station, RDS. I trained this on TVA Sports' feed because they were airing the playoffs; then I tried it on RDS and it didn't work at all, and as I said, CBC didn't work at all either. And that's maybe not surprising, right? These are trivial features trained on only this one commentator; he's the only one in the dataset, so I totally overfit to him. That was fine for what I needed this to do, but it would be interesting to figure out what features would generalize to different commentators.
So, the last slide, just for context: as was said, I gave this talk two weeks ago at PyCon in Toronto, and I teased them a little with my Habs jokes, or rather Leafs jokes, so I left in this last slide. Remember, this was in Toronto, so it was pretty funny. So thank you, and I'll take any questions now if you have them.