So, a little intro on Melvin. Melvin is from NUS. And how do you pronounce it, Magarena? "Magarena." Magarena, sorry. Magarena is an open source version of the card game Magic: The Gathering, and he's going to be talking to us about how to get machines to do what we love to do, which is to play computer games. So it should be fun. Thank you.

Thank you for your very kind introduction, and thank all of you for being here for this talk. I'll talk about Magarena at some point, but I thought I would make it a bit more general and talk about how various open source game AI projects work beneath the hood, to get into a little bit of the computer science, as well as why these projects are successful and what makes them work. So this will be my talk. Let's go.

This is a picture from a game I played when I was a child. How many of you here play games? Alright, good. The general idea, as you may have heard, is to get computers to play games as well as people. And this is not a new idea. You can trace it back even to Alan Turing, who developed an algorithm for playing chess but had to run it by hand, because there were no machines back then, so he simulated the moves of the algorithm in his head. It took him about half an hour to make one move as he worked through his algorithm. The picture here is from an early program called Battle Chess. It was on the Amiga, but I played it on the PC. And it gave me the impression that, wow, the computer is actually better than me at chess, when I was a small kid. So how do they do that? That's the question. Games are a good test bed for thinking about problems because they are very clean and abstract models. Especially for beginners, I think, game AI is a good way to get started.
One of the early breakthroughs in this area was in chess. And I realize now this was a long time ago: 1996, more than 20 years ago. The match was between a computer, Deep Blue, and Garry Kasparov, the world chess champion at the time. Kasparov was representing the best of human chess players, the grandmasters. If you remember that time, you'll remember they played two series of matches. In the first series, Kasparov won. So IBM said, let's do a rematch, and Kasparov said okay. And then in the second series, unfortunately, Kasparov lost. So he won one and he lost one. But then IBM said, Deep Blue has beaten Kasparov. And Kasparov said, let's do another rematch, and IBM said no. Because it's good to leave on a high note, right? At least for Deep Blue.

In case you wonder what Deep Blue looks like, I did some research for this talk, and this is what it actually looks like. It's not that amazing; it just looks like a couple of big black cabinets, but large. It's actually a massively parallel system: it has many nodes, and they also have specialized chips designed to play chess. And today you can get a program from the internet that is much better than Deep Blue, even on your Android phone. It's called Stockfish — you can see the last line; that's where you can find it. It's a community project, written in C++, licensed under GPLv3. And if you look at the rankings of chess programs, Stockfish is the most powerful chess engine. It has an Elo rating upwards of 3,000 — around 3,300 — while the top human grandmasters are about 2,800. So this is the best of the best. And the surprising thing when you look at the list is that all the other programs are proprietary.
A lot of the people who enter these computer chess competitions like to keep their programs proprietary, because otherwise the other programs next year could borrow your ideas and beat you. So it's kind of surprising: Stockfish is open source — the only open source program in the top five — and it sits right at the top of computer chess. How did it do that?

Before that, let me tell you how Stockfish actually works. How does it find the best move in chess? I'll go into a little bit of theory, but with pictures. This image shows the game of tic-tac-toe — chess is way more complicated, so I'm showing tic-tac-toe because it's easier. The central abstraction used in these programs is what we call a game tree. What is a game tree? The positions of the game are the nodes, and the edges mean you can make a move to go from one position to another. Reading top to bottom, you go from the beginning and then make more and more moves: X makes a move, then O makes a move. You can see two levels here, but of course the game goes on for many more levels. This is a very important idea.

Okay, so let me introduce something called optimal play. Let's think of a very small game — in computer science we like to work with small problems so we can understand the situation. It's a game with only two players: we call one of them the max player and the other the min player. The first move is made by the max player, the next move is made by the min player, and then the game is over. It's a really simple game. Each player has only two options, so you see two edges leading out of each node, and you can visualize the whole game on this tree.
So the game has four possible outcomes: each player chooses between two moves, so two times two is four. Now let's label the nodes. We'll take the convention that we label a node 1 if the max player wins — you'll see in a moment why it's called the max player — and 0 if the min player wins. We'll assume there are no draws: either the max player wins or the min player wins. This game is actually biased towards the max player, who wins in three of the four outcomes.

But those labels are only at the bottom. Now we want to work backwards. If you're the min player, what should the number be? Look at the left node: it's the min player's turn, and the min player can choose to go to the right, to the outcome labelled 0, which is where the min player wins. So that node itself gets the label 0. Put another way, we look at the child nodes and choose the smallest number, because the min player wants to reach a small number. That's the rule. On the other side, unfortunately for the min player, both children are 1s, so the smallest number available is 1, which is a loss for the min player. And at the very first node, the max player would of course choose to go to the right, because going right means every outcome leads to the max player winning.

In this way you can work backwards and figure out, at any point in the game, who is going to win. Mathematically, at every point in the game you know who is going to win — even in a game like chess. It's just that, well, that's the problem: chess has a very, very big tree. It has about 10 to the 46 states, where the states are the circles.
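To make the backward-induction idea concrete, here is a minimal sketch in Python of the tiny game above. The tree representation and the function name are my own, just for illustration — the talk only describes the idea.

```python
def value(node, maximizing):
    """Work backwards: a leaf holds its outcome (1 = max player wins,
    0 = min player wins); an inner node takes the max or min of its
    children's values depending on whose turn it is."""
    if isinstance(node, int):                 # leaf: the outcome is known
        return node
    children = [value(child, not maximizing) for child in node]
    return max(children) if maximizing else min(children)

# The tiny game from the slides: max moves first, then min, then it ends.
# The left min-node can reach a 0 (min wins there); the right only 1s.
tree = [[1, 0], [1, 1]]
print(value(tree, True))   # 1: with optimal play the max player wins
```

Evaluating the two min-nodes on their own gives 0 and 1, which is why the max player should go right.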
That's 10 to the 46 circles — or, for people who do programming, about 2 to the 153 circles. A very, very huge number. If you worked backwards all the way to the beginning, you would have to go through all of those states, which would take forever. So we don't do that. People do actually do it partially: they work backwards a few levels from the end of the game, but not all the way to the beginning. These are called chess endgame databases, and people really do study them. That's all I'll say about that. But we can't go all the way back to the beginning, because the tree is so big.

So what do you do? Here is how I'll draw the very, very big tree: there are the top nodes, and then this triangle, which stands for the millions and millions of nodes below — there are 10 to the 46 of them, so I can't draw them all on the screen; I'll just show them as triangles. The answer is simple: since I only have a small amount of time to make my move, I won't go all the way to the bottom, because that takes forever. I'll just choose a level and say, I only look ahead two levels. In this example, the computer only considers the future for two moves: I make a move, then you make a move, and that's it — we just ignore the rest.

When we do that, we get the minimax algorithm. This is very important; it's a very standard thing you would learn. Now, before, we could write the numbers 0 and 1 at the bottom, because when the game finishes we know who wins. But now we have a problem: at this level, two moves in, the game isn't finished — chess takes many, many moves to finish. So after two moves the game is incomplete, and we don't actually know who wins. Here's where some heuristics come in: we need some method of giving a number to an incomplete game.
The game is stopped halfway through. We look at the board — maybe count how many pieces each side has, and so on — and give it some kind of score. Call this a heuristic score. If we look at the board and think the max player is going to win, we give a number close to 1 — but not exactly 1, because we can't be sure. Say the max player is the player playing black (in chess you play black or white), and the white player is the min player; if white looks like it's going to win, we give a small number, based on some sort of heuristics, rules of thumb.

So let's say we assign the numbers this way: 0.7, 0.1, 0.6, 0.9. They're not 0 or 1 because we don't know for sure who is winning; we look at things like the material and positional advantage on the board. Once we've done this, we can perform exactly the same procedure as before — the exact same algorithm works. So what happens? We know the values two levels down, and we work backwards. The min player on the left gets 0.1, because it will choose the right-hand side — it wants the smallest possible value. The one here gets 0.6, going to the left. And then for the max player, the number at the top is 0.6, which means the max player should make the move on the right-hand side, because that gives the better outcome. Does that make sense? You can ask me questions later.

Right, so this works. That's the basic idea, but there are many improvements you can make. For example, how do you actually calculate that 0.7 or 0.1? That's the heuristic — I didn't say how it's computed. And I kept things very simple: I said we consider exactly two levels, but maybe you don't want to search every branch to the same depth.
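The same procedure with heuristic scores at the cutoff can be sketched like this. This is a toy illustration, not Stockfish's actual code; the leaves simply hold the scores 0.7, 0.1, 0.6, 0.9 from the example, so the evaluation function here is just the identity.

```python
def minimax(node, depth, maximizing, evaluate):
    """Depth-limited minimax: at the depth cutoff the game is usually
    unfinished, so we fall back on a heuristic evaluation instead of a
    certain 0/1 outcome."""
    if depth == 0 or not isinstance(node, list):
        return evaluate(node)                  # heuristic guess
    values = [minimax(child, depth - 1, not maximizing, evaluate)
              for child in node]
    return max(values) if maximizing else min(values)

# The example above. A real engine would compute the leaf scores from
# material, mobility, king safety, and so on, rather than store them.
tree = [[0.7, 0.1], [0.6, 0.9]]
print(minimax(tree, 2, True, evaluate=lambda leaf: leaf))   # 0.6
```

The root value 0.6 comes from the right-hand subtree, matching the conclusion that the max player should move right.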
Maybe some parts of the tree are more interesting and worth searching deeper, while other parts are obviously bad moves and you don't want to spend time on them. And one of the real problems in developing this kind of software — something I don't see a lot of people talking about — is that it's actually very, very hard to test changes. The issue is this: say you want to contribute to Stockfish. You look at one of the commits — it's small; it says something like "more weight for pawns in initiative if both flanks are active". This is about the heuristic score: there's some weight, some number, for pawns, and maybe instead of the number being 0.7 you change it to 0.8 in the way the scoring is done. But Stockfish is already a very good piece of software, already very powerful. So how do you know whether changing that heuristic number from 0.7 to 0.8 makes it better or worse? Nobody knows, actually. There's no way to tell by inspection.

The only real way is to run thousands and thousands of games, with one version of Stockfish using 0.7 and another version using your proposed change, 0.8, and let them fight each other — make them play chess over and over and over again. And that is what this system, Fishtest, does: it's a distributed testing system. So one other way to contribute to Stockfish — like all the ways of contributing you've heard so far — is to contribute CPU time. You join the distributed testing system and use your computer to run games between different versions of Stockfish: the original version and the version with some proposed change. And they have to play many games, because the effects are very small — maybe over a thousand games one version wins just one more game than the other.
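The testing idea can be sketched as follows. This is a deliberately naive toy: the real Fishtest plays actual engine games and, as far as I know, decides with a sequential probability ratio test, while here the two "engine versions" are stand-ins with a hidden win probability and the decision rule is a rough two-sigma check.

```python
import random

def play_game(p_new_wins):
    """One game between the old and new versions; 1 if the new version
    wins. In reality this would be a full game of chess between them."""
    return 1 if random.random() < p_new_wins else 0

def compare(p_new_wins, games=10_000):
    """Play many games and apply a rough two-sigma test on the win rate."""
    wins = sum(play_game(p_new_wins) for _ in range(games))
    rate = wins / games
    sigma = 0.5 / games ** 0.5       # std. dev. of a fair coin's win rate
    verdict = "likely stronger" if rate > 0.5 + 2 * sigma else "no clear gain"
    return rate, verdict

random.seed(0)
print(compare(0.51))  # a 1% edge may not clear the bar even at 10,000 games
```

This is why thousands of games are needed: a change worth only a few Elo points shifts the win rate so little that it drowns in noise at small sample sizes.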
And that's how they judge whether something is an improvement to the project or not. This is arguably the key reason Stockfish is number one in computer chess: rather than debating endlessly about which version is better, you just run this test and know for sure whether your change is good or bad — whether it makes the program stronger or not. That's a very, very important thing.

Okay, let's move on to a different domain, a different game. I talked about these heuristics — how do you weight a pawn, a king, a queen, and so on — and of course you can tune them with Fishtest. But it turns out that for some games it's very hard to even write down a formula for the heuristic score. Chess happens to be relatively easy in that respect, which is why this approach works so well for chess. So what kind of game is hard? For example, the game of Go, which was big in the news about a year ago. Go is a game you also play with black and white pieces, but the pieces are all the same — just these stones. The general idea is to occupy as much territory as possible. The stones are like troops: you place them in a connected fashion to occupy territory on the board, and if you surround your opponent's pieces, you capture them. That's a rough idea of Go — it's a beautiful game, and you should go look at how it works.

Go has a counterpart to Deep Blue: a program called AlphaGo, developed by the Google DeepMind team. Slightly more than a year ago, AlphaGo played a series of matches against Lee Sedol, one of the top Korean Go players. If you followed it, you'll remember it was big news, because AlphaGo won — four out of the five matches.
So how does AlphaGo play Go? As I said, Go is one of the games where you can't easily construct a heuristic score: even if you ask an expert in the middle of a game, it's often very hard for them to tell you which side is winning. The reason is that a small change in one part of the board can affect another part. For example, if you place a stone that connects two groups of your army into one, the connected group is much more powerful. That's why evaluation is hard in Go: one small stone can join together a much larger group.

AlphaGo, unfortunately, is not open source, and I doubt it will be any time soon — and you might need a huge compute cluster to run it, which would be quite expensive. What we do have today are some other programs. I'll talk about one of them, called Pachi — this is its website. Pachi is written mostly in C. It implements the same basic algorithm as AlphaGo, and recently it also incorporated the kind of neural network that AlphaGo uses.

So what is the basic algorithm? Coming back to the game tree pictures we saw — back to our old friend, the game tree with many, many nodes. Go has many more nodes than chess; I don't remember how many, but far more. And we have the same problem as before, except now we have no good way to assign numbers to the nodes at the cutoff. We don't know how to give the numbers. So what do we do? There's a very simple and elegant idea, from statistics. We don't know which future will happen, so we have the computer make random moves from a position — random moves for both sides — and eventually the game will finish, because if you keep randomly placing stones, the board fills up and the game is over. When the game is over, you can count the territories and know who actually won.
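Here is a sketch of that idea on a toy game standing in for Go (Go itself would be far too much code): players alternately remove one or two stones from a pile, and whoever takes the last stone wins. Finishing the game randomly many times gives a statistical score for a position. The game and all names are my own illustration.

```python
import random

def random_playout(stones):
    """Finish the game with random moves; return True if the player to
    move at the start of the playout ends up taking the last stone."""
    turn = 0                          # 0 = the player we are evaluating
    while stones > 0:
        stones -= min(random.choice([1, 2]), stones)
        if stones == 0:
            return turn == 0          # whoever took the last stone wins
        turn ^= 1

def estimate(stones, playouts=20_000):
    """Monte Carlo score: fraction of random playouts won by the player
    to move. This replaces a hand-written heuristic evaluation."""
    wins = sum(random_playout(stones) for _ in range(playouts))
    return wins / playouts

random.seed(42)
print(estimate(3))   # ≈ 0.25: a bad position for the player to move
```

From three stones the player to move actually loses under optimal play, and the random-playout estimate of roughly 0.25 correctly signals that it's a bad position, even though no heuristic knowledge went in.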
Because that's how we know who wins: when the game is over, it's obvious. So let's have the computer make random moves until the game is over. Of course, one random game gives a pretty random result. But the trick is that we do it many times — say a thousand times. The intuition is that if white was originally in a very strong position, then when you complete the game in this random way, white tends to win more often. Think about that. So we make random moves and gather statistics: who wins more often? After a thousand different random playouts, played all the way until the game is over, we can use the statistics to calculate the number. Instead of just looking at the board and the pieces, we actually make moves and complete the game, and when the game is complete we know for sure who won. We do that for all the nodes at the cutoff, and so on.

But it turns out to be very slow if you do it exactly like that, because you'd be making many random playouts for every node at the bottom. So people came up with a refinement called Monte Carlo Tree Search — a very refined idea. Instead of making many random playouts per node, we make just one random sequence of moves at a time; this is called a simulation. When you do a simulation you get either a win or a loss, because you play the game until it's finished. Then you take that win or loss and update the statistics of all the parent nodes; this is called back-propagation. And then — okay, I'll tell you what selection and expansion are. They have to do with which part of the tree we want to expand, because, as I said, the tree is not all equal. Some moves are really bad moves that we would never want to make. Your position might have ten different options.
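Putting the four steps together — selection, expansion, simulation, back-propagation — here is a compact sketch of Monte Carlo Tree Search on a toy take-away game (players alternately remove one or two stones; whoever takes the last stone wins). This is my own illustration using the standard UCT selection formula, not Pachi's actual code.

```python
import math
import random

class Node:
    """One position in the search tree: stones left, plus statistics."""
    def __init__(self, stones, parent=None):
        self.stones, self.parent = stones, parent
        self.children, self.wins, self.visits = [], 0, 0

def uct(child, parent_visits, c=1.4):
    """Standard UCT score: exploitation term plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    return (child.wins / child.visits
            + c * math.sqrt(math.log(parent_visits) / child.visits))

def playout(stones):
    """Random moves to the end; +1 if the side to move here wins, else -1."""
    side = 1
    while True:
        stones -= min(random.choice([1, 2]), stones)
        if stones == 0:
            return side
        side = -side

def mcts(root_stones, iterations=3000):
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend by UCT while the node has children.
        while node.children:
            node = max(node.children, key=lambda ch: uct(ch, node.visits))
        # 2. Expansion: add one child per legal move, pick one to try.
        if node.stones > 0:
            node.children = [Node(node.stones - take, node)
                             for take in (1, 2) if take <= node.stones]
            node = random.choice(node.children)
        # 3. Simulation: finish the game randomly from the new node.
        #    An empty pile means the player who just moved already won.
        result = playout(node.stones) if node.stones > 0 else -1
        # 4. Back-propagation: update statistics up to the root.
        #    node.wins counts wins for the player who moved INTO the node.
        while node is not None:
            node.visits += 1
            if result == -1:
                node.wins += 1
            result = -result
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return root_stones - best.stones   # number of stones to take

random.seed(7)
print(mcts(4))
```

From four stones the winning move is to take one, leaving the opponent a losing pile of three; the search converges on that move by visit count, spending most of its simulations on the sensible branch.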
One of them could be a really bad move that we would never play in real life, because nobody makes that silly move. So we want to spend most of our time on the sensible moves — the moves Go experts would actually play. How do we do that? That's what those two steps are for. The selection step tells the system where to explore next — which branch of the tree to follow; these branches are the possible futures. And when I reach the bottom, the expansion step increases the size of the tree by one node, and then I do one simulation — just one — and back-propagate. Then I repeat all four steps, again and again.

[Audience question about neural networks.] AlphaGo's neural networks — which I have no time to explain properly — are mainly used to help select the good moves, so the search doesn't even have to visit the lousy ones. You could say it's similar in that sense, but I'd say it's quite different if you think about it. MCTS does perform simulations, but the simulation is random; it's not giving you a test case where you check whether the output matches. It does do back-propagation, but it's not updating parameters like a neural network — it's updating statistics: how many times I won versus how many times I lost. That's the difference. And as I said, AlphaGo adds the neural network on top to reduce the search further, but unfortunately there's no time to talk about that.

Okay, the last part — the third game, because I'm running out of time. The third game is a game of hidden information. The two games I've shown you, chess and Go, have a common theme: you can see everything happening in front of you; there are no secrets. But a lot of games do have secrets. In particular, this game, called Magic: The Gathering. This is a picture from Grand Prix Singapore 2015 — the year I got cut, unfortunately.
And you can see the player there has a number of cards in hand which he can see but his opponent cannot. That's quite common in card games — poker is another example — but I'll talk about Magic because that's my project, one I've worked on for a number of years. It's written in Java, also GPLv3. Unfortunately the website is cut off on the slide, but you can Google it. This is a program called Magarena. It implements the Magic: The Gathering card game on the desktop, and it has an AI you can play against. Let me tell you a bit more about it, since this is my project. This is how it looks. Unlike most of the other engines — the chess and Go engines usually have no UI, because the UI is provided by some other program — we do everything in one program, so it comes with a UI as well.

So you might ask: how do we do this, now that you cannot see the cards? I don't know what cards you have, because they're hidden. How do we still perform the four steps? Essentially, the problem is: how do we perform simulation? Remember, simulation means we make random moves until the game is over. But I don't know what you have in your hand. You can make a move using something from your hand, but I don't know what it is — so how do I simulate? That seems like a very tough problem. In fact, there was an issue open for one year about how to do this. Because what we did initially was very simple: we pretended the AI has X-ray eyes. It could see everything — the cards in your hand and the ones on the table — because that was the easiest way to get things working. But our players would complain that the AI was cheating. So this issue sat open for a year: how do we solve it? It's actually not that difficult.
Think about what the random moves are doing: the future is very complicated, with so many different options, and by making random moves we explore one possible future. We call this sampling in statistics. And we can do exactly the same for the hidden information. For the cards we can't see, we just pretend, filling them in with some plausible cards based on what we have seen. All the cards in the deck are known, and you have seen some of them on the table, so what the opponent holds must come from the rest. We don't know how the rest is arranged, so we just deal in some random ones. This is sampling a random possibility for the hidden information. The hidden information is like a set, a cloud of possibilities: the opponent probably has this, or this, or this. So we just choose a random instantiation — we instantiate the hidden stuff with one of the random possibilities. It's just like playing random moves: random moves explore one possible future; here we explore one possible element of the set of hidden information.

With that, we can actually perform simulation, because now we have actual cards — no more secret cards. It's not the real hand, just one random instantiation, but it turns out that's good enough to perform the rest of the steps. Just this one trick. So that's how we handle hidden information. Of course, this is a small hack, a small trick — it's not the most effective approach. People have discovered more effective methods, especially in work on poker.

Let me summarize quickly, because I'm running out of time. What did I talk about today? On the theory side, there are a few principles or basic algorithms that these game-playing programs use. First, the idea of optimal play, computing backwards from the end of the game.
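The sampling trick can be sketched like this. The card names, deck size, and function names are made up for illustration; Magarena's real code is of course far more involved.

```python
import random

# A made-up 60-card deck standing in for the real card pool.
FULL_DECK = [f"card{i}" for i in range(60)]

def sample_hidden_hand(seen_cards, hand_size):
    """One random instantiation of the opponent's hidden hand, drawn
    only from the cards we have not seen yet."""
    unseen = [c for c in FULL_DECK if c not in seen_cards]
    return random.sample(unseen, hand_size)

def simulate_with_sampling(seen_cards, hand_size, simulate):
    """Fill in the hidden information, then run an ordinary simulation
    as if nothing were hidden."""
    hypothetical_hand = sample_hidden_hand(seen_cards, hand_size)
    return simulate(hypothetical_hand)

random.seed(0)
seen = ["card0", "card1", "card2"]       # cards already on the table
print(simulate_with_sampling(seen, 7, simulate=len))   # 7
```

Each call deals a different plausible hand, so repeating the simulation many times samples over the cloud of possible hidden states, just as random playouts sample over possible futures.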
Second, with minimax, we cut off the tree at a particular depth, assign not 0 and 1 but some number in between, and still apply the same backward inference. Then there is Monte Carlo Tree Search, which is built on random moves, and in the last part, sampling the hidden information. In practice, if you want to contribute to this kind of software, these are the open source projects: Stockfish, the best chess-playing program; Pachi, one of the top open source Go-playing programs, which uses Monte Carlo Tree Search; and Magarena, which also uses Monte Carlo Tree Search, plus the sampling trick to take care of the hidden information — the cards you don't see. You can find the slides on SlideShare. So that's all I have for you today. Thank you.

Thank you very much, Melvin. If you have further questions for Melvin, you can catch him later on. Next up we have