So I would welcome you to the next upcoming talk, which is called Automatic Algorithm Invention, by Wes Faler, who is one of the software engineers of the Part-Time Scientists. You know, they want to go to the moon, and he's going to show us how to use Cartesian genetic programming to evolve fitting image filters for their camera. So let's have a warm welcome for him; we are lucky to have him for this talk. Hello everyone. Oh, I hear the echo, so I guess you can hear me. Thank you for coming out to hear a talk on my new favorite subject. My new favorite thing; hopefully it'll be yours too. I think there's no need for this slide after the introduction. So the slide rules say I need to tell you what I'm going to tell you, so this is what I'm going to tell you. What's not quite up here is something I hope to convey: I have a lot of passion about this topic, and my goal is for everyone in here to look at this and go, "That is so simple. I can do that. I can do that better than he can. I've got 20 minutes. I'm going to knock that out and show him." That's my goal. I hope we all take it to heart and do it. And there's sort of a secret goal at the end; maybe somebody can help me with that. Also, I've not had a chance to time this, so this will take as long as it needs to take; just go with that. So let's talk about why. It's important to understand why somebody starts to think about this kind of tool. That way you can start to say, what projects am I in where I should maybe be thinking about this? So we're sending a rover to the moon, a little dune buggy about half the size of this podium, and it's got some cameras on it. Those cameras are like most cameras we get: they need a debayer filter, which was new to me prior to this project. And we really can't test this stuff too well here. I mean, there aren't too many copies of the moon.
None that we can actually put our cameras on and test, anyway. So we needed a, pardon? So we needed an algorithm, and we needed a way to make new algorithms all the time. For background, I want to tell you what I had to learn to get started: what is debayer image filtering all about? It turns out most of the cheap chips in your cameras and such are actually just black and white, just grayscale, and somebody a long time ago figured out that if you take these grayscale sensors and put little pieces of colored plastic over top, letting in only red, only green, only blue, you end up with this little repeating pattern of four: two greens, one blue, one red. The guy's name was Bayer. He invented this and patented it. I hope he made some good money on it, because there are certainly a lot of cameras out there that use it. But the upshot is, now we've got to take those four pixels and convert them to a single RGB value, because that's what all of our image processing software counts on. And if we didn't do that, we'd have to use, I think it's called a Foveon; there's another kind of chip where they stack these sensors on top of each other. Very expensive, very hard to produce. So these cheap chips are sort of the lay of the land; we have to deal with that. It turns out there's no single equation that tells you how to combine these. If we were to take a picture of, say, this half of the room, we've got a lot of contours in the crowd, a lot of flat area here, and some shadows over there. It turns out you need different equations depending on the local parts of the image in order to recover the original light rays coming in. There is no single optimal equation. But we tried. We tried.
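To make the Bayer-quad-to-RGB step concrete, here is a minimal sketch in C++. The struct and function names are my own, and the "average the greens, reuse the red and blue" rule is only the naive baseline; as the talk says, a real debayer filter needs different equations for different parts of the image.

```cpp
// One 2x2 Bayer quad (RGGB layout):   R  G
//                                     G  B
// The simplest possible demosaic: average the two green samples and
// reuse the single red and blue for the whole quad. This loses the
// local detail that real debayer filters try to recover.
struct Rgb { float r, g, b; };

inline Rgb naiveDebayer(float r, float g1, float g2, float b) {
    return Rgb{ r, (g1 + g2) * 0.5f, b };
}
```

A quad with reds and blues at zero and both greens at full value, for instance, comes out as a pure half-intensity green pixel.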
We said, let's just get some sample images, break them out, and try good old traditional optimization. You know, take some coefficients times the reds, blues, and greens, and try to optimize this equation across all the pixels in the image to come up with a minimum error. And this is not algorithm invention. This is optimization, and there's a very clear distinction between the two that we'll get to. So you start with inputs: hey, I have a Bayer image. And outputs: in my case, a debayered image, which I created with POV-Ray 3D modeling, so I could get mathematically precise ray tracing of what the incoming photons really were. And then we start with this formula structure. I said, look, this formula: just find these four coefficients, the optimal blend of them. That's my goal. And of course, constrain it so as not to make anything ridiculous. Ideally, they would be integer multiples so we can shove it onto a computer chip, an FPGA, in hardware. So that's our mindset: take this little image, get our nice RGB out of it, and maybe we can come up with what nobody else has been able to come up with, using optimization (oops, and a laser pointer) to tweak this equation. Well, that didn't work so well. If it had worked, we'd be done right now. So we had to turn to invention. And the reason we had to turn to invention is, again, that there's no point in a human writing this algorithm here on Earth using Earth data. We fully know that as soon as that rover gets to the moon, or maybe opens its eyes in space to take a look at a few stars, we're going to have whole new sets of data that we really just can't simulate too well here. So we need a process in place to quickly make new algorithms to clean up these images as fast as possible.
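The "traditional optimization" attempt above can be sketched like this: hold the formula structure fixed (here a single `out = c * in`, a deliberately stripped-down stand-in for the four-coefficient blend the talk describes) and only search for the coefficient that minimizes summed squared error over sample pixels. The data and function names are hypothetical.

```cpp
#include <vector>
#include <cstddef>

// Optimization, not invention: the structure is fixed, only the
// coefficient c is searched. A coarse grid search over c in [0, 2]
// stands in for a proper optimizer.
float bestCoefficient(const std::vector<float>& in,
                      const std::vector<float>& expected) {
    float bestC = 0.0f, bestErr = -1.0f;
    for (int step = 0; step <= 200; ++step) {
        float c = step * 0.01f;
        float err = 0.0f;
        for (std::size_t i = 0; i < in.size(); ++i) {
            float d = c * in[i] - expected[i];
            err += d * d;                 // summed squared error
        }
        if (bestErr < 0.0f || err < bestErr) { bestErr = err; bestC = c; }
    }
    return bestC;
}
```

For outputs that really are half the inputs, the search lands on a coefficient of about 0.5; the point of the talk is that for debayering, no single such coefficient set exists.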
The seconds count when radiation is pouring into your rover and you need to get that thing driving to win the money and the prize. We don't have a few days to sit around and put together algorithms. So we need automation, and that led us to the idea of invention. Invention differs from optimization mentally in that with invention, you do not start with the formula structure. In optimization, we take our pride and say, I know how this should be; you just work out the details, and the details are these coefficients. But I know the structure of the universe. Invention says: I don't know the structure of the universe. I know what I start with. I know what I want. You tell me the rest. You make the equation or the algorithm. That's invention. That's the mindset that got me here and started me looking at something called genetic programming. A quick show of hands: who has ever used genetic programming? That's fantastic. Okay, that's more than I thought. So feel free to boo if I get something wrong; I could use the feedback. I think that's just the normal fitness function; I do find it easier in writing fitness functions to complain than to praise. So, you know, it's okay. So this is tree-based genetic programming, which says any old formula, any old formula your algebra teacher would hand you, is really a nice syntax tree. And once you look at a formula as a syntax tree, you can start to do nifty things with it. You could say, hey, what happens if I take this divide and make it a plus? What happens if I grab this whole tree branch and move it over here, and move this other branch over there? Do I end up with a better formula or not? That's tree-based genetic programming: you take this individual and start swapping its tree-based genes around, and it does invent formulas for you. It's pretty nice. Not quite what we needed. Wouldn't have been bad, though.
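The "take this divide and make it a plus" idea can be sketched in a few lines. This is a hypothetical, two-operator toy, not the talk's code: a formula as a syntax tree, with a point mutation that flips the operator at a node.

```cpp
#include <memory>

// A formula as a syntax tree. Only '+' and '/' plus leaf values 'v',
// just enough to show a point mutation changing the formula's meaning.
struct Node {
    char op;                          // '+', '/', or 'v' for a leaf
    float value;                      // used only when op == 'v'
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> leaf(float v) {
    auto n = std::make_unique<Node>();
    n->op = 'v';
    n->value = v;
    return n;
}

float eval(const Node& n) {
    if (n.op == 'v') return n.value;
    float l = eval(*n.left), r = eval(*n.right);
    return n.op == '+' ? l + r : l / r;
}

// Point mutation: flip the operator at this node.
void mutateOp(Node& n) { if (n.op != 'v') n.op = (n.op == '+') ? '/' : '+'; }

// Build (a op b), optionally mutate the root, and evaluate.
float buildAndEval(char op, float a, float b, bool mutate) {
    Node n{op, 0.0f, leaf(a), leaf(b)};
    if (mutate) mutateOp(n);
    return eval(n);
}
```

So 6 / 3 evaluates to 2, and one mutation later the same tree is 6 + 3, a completely different formula: that single-gene swap is the whole engine of tree-based GP.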
But we on the Part-Time Scientists are deeply fascinated with this thing called an FPGA chip. Another show of hands? About the same number, but not the same people. That's fascinating too. Okay. So an FPGA chip: field-programmable gate array. Imagine millions upon millions of AND gates, OR gates, XOR gates; you get to write in a scripting language, that scripting language is converted into a series of wiring patterns for those gates, and then it's programmed onto the chip. So the chip starts out as sort of anything: it'll grow up to be a microprocessor, it'll grow up to be this equation, it can be anything you want. We wanted to find a solution that would fit well on an FPGA, because we're talking about processing millions of pixels of images, potentially in a steady stream. So we need some hardware acceleration, and it had better be compatible with the hardware we're putting up. So along comes this interesting technique. Pardon me, this thing's excited. It's called Cartesian genetic programming, and this was news to me. It was invented about 10 years ago by Julian Miller. I think he's in the UK. Nice enough chap. Terrible website. We should help him with that. The big benefit to us when I first saw it (I went looking for open source genetic programming tools; I said, why type something in when somebody's got it out there?) is that what this thing makes looks like circuits. It looks like circuits. It's parallelizable and it's FPGA-friendly, and that's what we need. So all these things are leading us down this path. For Cartesian genetic programming I have a few slides for you. Start with a grid, in this case a two-by-three grid of circuit elements, two inputs, one output. Take that grid of elements, say these can be anything, just toss function chips down there, shake it up, and see what circuit you get out of it. And I take that randomly generated circuit and I test it. It's terrible.
Stunningly terrible. If randomness got us something on the first shot, I think we probably wouldn't have much of a job thinking about solutions anyway. So it's stunningly terrible, and that's just the nature of how this works. So then take that and make a few little changes. Reach out, move a couple of wires around, flick a couple of chips out, pop some new functions in there. And score it again. What do you know? It's still terrible, but not as much. It's like infinitely terrible minus a little bit. That's progress. And where there's a process we can follow to make progress, we can automate it and let this thing start to produce something for us. That's the gist of it. I know our Part-Time Scientists logo is "hell yeah, it's rocket science," but this is not necessarily rocket science. It's just random science. So the overall flow: okay, we've got this Cartesian grid of circuit elements wired together, and there are some rules for it. I can point you to some books that'll give you all the nitty-gritty detailed rules, although you'll probably pick them up from here and any code I hand out. Start with one parent circuit, and that parent circuit has a score; arbitrarily here I assign, you know, 500. Then make mutant children. There's something about that phrase I love. Just make a bunch of mutants from that. Okay, so we make three mutant children from the parent, assign their scores, and then the critical step is to figure out which one to promote. Who gets to be the parent? There is one parent for the next generation. That's a critical thing with Cartesian genetic programming: one parent, multiple mutant children, and then you collapse back to one parent and keep repeating the cycle. It's very important that we pick the best child who is not worse than the parent. Now, you'll notice in the numbering that I'm using, I choose lower numbers to be better.
And that's because I tend to write fitness functions personally as complaints. So, fewer complaints; just like a lot of C routines return zero for success: zero problems. So it's very important we take that child who's not worse than the parent, promote them, then score the promoted one's mutant children, and just repeat the process. Those are the simple steps. Now, I want to talk philosophy here about why we have to do it that way. This is a nuance that's glossed over in a lot of the texts. Well, "a lot of the texts": the one text I've been able to find, which was published two months ago. Doesn't it figure: you write a bunch of code, and a week later somebody publishes the definitive work on it. I could have used that. It would have saved me a couple of weeks' worth of sleep. So when you promote the child who's not worse than the parent, you're saying the new child's fitness is less than or equal to the old one's. Whereas promoting only a child that is strictly better requires a strict less-than condition. So you see the types of scoring that can show up here. Here, the parent had nobody better, so it got promoted; and there, one child ended up better. On the surface, these don't look much different. I mean, yeah, I kind of gamed the numbers so the 400 is a little more blatant, but I gamed the numbers there. This makes more sense when we look at it in terms of mutations. I think there's something related to zombies in this, because it's all about mutations. Track this from the start: begin with the parent, who had zero mutations relative to itself. It got promoted, then each of its children had one mutation relative to the parent, and here we just promote that same parent again. And what we're doing is counting on one mutation to make all the difference in the world.
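The promotion rule, the heart of the loop just described, fits in a few lines. This sketch works on scores alone (lower is better, "fewer complaints"); in real code the index would select a whole genome. The function name is mine.

```cpp
#include <vector>
#include <cstddef>

// One parent (index 0) plus its mutant children. The next parent is
// the best child whose score is NOT WORSE than the parent's. The <=
// is the whole trick: an equally-good child carries its neutral
// mutations forward instead of being thrown away.
std::size_t promote(const std::vector<float>& scores) {
    std::size_t best = 0;                        // index 0 is the parent
    for (std::size_t i = 1; i < scores.size(); ++i)
        if (scores[i] <= scores[best]) best = i; // note <=, not <
    return best;                                 // index of next parent
}
```

With scores {500, 600, 500, 700}, the tying child at index 2 is promoted over the parent; only when every child is strictly worse does the parent survive.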
When we do that bit about only promoting something that's strictly better, we count on that one mutation having to make one giant leap. But when we promote anything that's at least as good, we replace whatever we're using with something that's at least as good, and we carry the mutations it had forward. By carrying those mutations forward, we get what the books call neutral search, which says: we've hit a flat spot on the hill. We don't know which way to go, but let's just keep wandering this flat plateau rather than calling it good. So by carrying that mutation forward, we have the chance that one more mutation will activate hundreds of mutations we made along the way, and we could have a big leap. And it turns out this little subtlety, the difference of one equals sign, makes the difference between success and failure in a Cartesian genetic programming project. It gets like a paragraph online, but in all the other evolution work that I've done, you promote only something that's better. Not so with Cartesian genetic programming. It's very important you leave room for mutations to stick around, be carried forward, and get suddenly activated. So that's the critical term, neutral search, for the comp-sci people who are looking for something worth a thesis. Don't worry, we're almost at the code. We've just got to get some of the terminology out of the way. And how are we doing on time? Fantastic. Okay, so Cartesian genetic programming: we start with our grid of genes. We have some number of inputs, some number of outputs, some number of columns, some number of rows, and then inside each gene is a little function block with an operator type, a constant, and input indexes. Input indexes tell you where those wires come from. And there's a general rule that anything in a column can only talk to things behind it. So anything in this first column can only talk to inputs.
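The grid bookkeeping just described, rows, columns, and the "only talk to things behind you" rule, reduces to two small index formulas. This is a sketch with names of my own choosing, assuming the 4-floats-per-gene layout (function, constant, two input indexes) the talk describes.

```cpp
// Grid shape: genes laid out column by column, each gene 4 floats
// (function id, constant parameter, input index A, input index B).
// Sources are numbered with the inputs first, then gene outputs
// column by column.
struct GridShape { int inputs, rows, cols, outputs; };

// Flat offset of gene (row, col) in the float genome (arity fixed at 2).
int geneOffset(const GridShape& g, int row, int col) {
    return (col * g.rows + row) * 4;
}

// One past the highest legal source index for a gene in column `col`:
// all inputs plus everything computed in earlier columns. This is the
// rule that lets a single wave of execution sweep left to right.
int maxSource(const GridShape& g, int col) {
    return g.inputs + col * g.rows;
}
```

For a 4-input, 3-row grid, genes in column 0 may only reference sources 0 through 3 (the inputs), while column 1 may also reference the three values column 0 produced.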
The next column can talk to anything here or to an input, and an output can hop over and just grab an input or anything calculated in the middle. All right. Though I didn't draw the slide, mostly due to sleep deprivation, it is possible to evolve a neural net with this. If you allow an operator type of the sigmoid function and give sufficient columns, basically log base n of your inputs, you can reproduce a full neural net in this. So this subsumes neural nets, and a lot of other things can be evolved with it. You could even start with a neural net pattern in the grid and let it mutate from there, to accelerate your research. So that's the general gist of it, and I should point out there's one more thing on here. You'll see in the code it's called n_a, the arity. That's how many of these input arrows there are, and two is pretty much the standard, you know, unless you've got a divide that somehow takes three. All right. As threatened: code. You may boo me for pulling up Dev Studio. Okay, so is that readable? Big enough font? Good. Okay, I can't hear you, but bigger. Yeah, thank you. Comically bigger, or just bigger? Good. Good enough. Zero complaints, perfect fitness. All right. This is our core class, the individual class, and first off you'll notice I use floats everywhere, and that's because this is supposed to get ported to a GPU. I will say there has not been time to port it to a GPU, because that got put off in favor of the presentation coming up on Friday, and then this one got picked back up rather quickly. But it should be easy to port, because we've kept the data types GPU-friendly throughout. So the heart of an individual is this obnoxious float array that's number of rows times number of columns times, in this case, four (one function, one parameter, two inputs, so four), plus each of the outputs.
So it's basically taking that whole genome block we had and laying it out in one big massive array of floats, plus some supporting routines. And again, my goal is to show you: I could do this, so why do I even get to talk about this? So here's how we set it up. Grab yourself a pointer to your genome, and then for every column, for every row, set your function. In this case, I pick zero to 255 and let a modulus operator take care of it in a switch later. Then a parameter: there are some functions, for instance "add a constant value to an input," where you just take the first input, add the parameter to it, and that's your output. Versus, say, adding two numbers. And then assign two random sources. The critical thing here is to honor the rule that in a given column, you can only reference things that are behind it. That way, a wave of execution can sweep through. So again, very simple. It's just a matter of getting your mind around the data structure. And then the output genes: basically, every output gene can pick anything before it. And if I'm not making sense with the code, just let me know. So that's your basic holder. And it's ridiculously simple to execute these things. Start with your individual, a little accounting, and then for every one of your genes, you skip through in blocks of four. And just a switch-case statement, dereferencing that. It's trivial. That's all the comp-sci. When I started reading about this, there was page after page after page of comp-sci. It's like, yeah, yeah, yeah, show me the code. And then it's like, wait, that's it? That's all there really is to it? So it just goes on and on for, in this case, whatever functions you want. I will say this is a handy little function: no-op, which just takes the first input and presents it to the output. That's been very helpful. I find I don't get as good a result if I leave it out.
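The execution sweep described above, walk the genome in blocks of four, dereference two sources, switch on the function code, looks roughly like this. The function set here is a made-up three-entry sample (including the talk's beloved no-op); names and codes are my own, not the talk's actual table.

```cpp
#include <vector>
#include <cstddef>

// Sample function set; a real one would have many more entries.
enum Fn { FN_NOOP = 0, FN_ADD_PARAM = 1, FN_MUL = 2, FN_COUNT = 3 };

// `genome` is genes of 4 floats each: function, parameter, source A,
// source B. `values` starts as the inputs; each gene's output is
// appended, so later genes can wire to earlier results -- the wave of
// execution sweeping through.
std::vector<float> execute(const std::vector<float>& genome,
                           std::vector<float> values) {
    for (std::size_t g = 0; g + 3 < genome.size(); g += 4) {
        int   fn  = static_cast<int>(genome[g]) % FN_COUNT; // modulus trick
        float p   = genome[g + 1];
        float inA = values[static_cast<int>(genome[g + 2])];
        float inB = values[static_cast<int>(genome[g + 3])];
        float out = 0.0f;
        switch (fn) {
            case FN_NOOP:      out = inA;       break; // placeholder gene
            case FN_ADD_PARAM: out = inA + p;   break; // ignores inB
            case FN_MUL:       out = inA * inB; break;
        }
        values.push_back(out);
    }
    return values; // inputs followed by every gene's output
}
```

A one-gene genome {1, 10, 0, 0} run on input {5}, for example, appends 15; the no-op gene just copies its first source through, sitting there as a place for mutations to accumulate.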
I think it's because it lets you have a placeholder for mutation to happen in. The second input to this one can mutate, and then later one small change to the function code here and, boom, now it's an addition. And that drastically changes the output equation. So it's one more place for mutation to accumulate and then make a big leap later. Okay. So, show of hands: trivial code? Trivial code. Fantastic. Okay. The next step is to think about how to express test cases in a general way. And the general way to express test cases is this nifty little 2D array. I had to figure this out on my own, because nobody has presented a nice framework for this stuff yet. So: every input that you have available, every expected output you want to go with those inputs, and then, for every member of the population, what was their fitness score? So when member zero executed these inputs and you compared its outputs to these outputs, what was its fitness? It's just a big dynamically allocated 2D array of floats. And the process is: for each test case, take its inputs, map them to an executor's inputs, run your grid, your big set of switch-case statements, produce your outputs, then compare the outputs you got with the ones you expected and give me a single number. Just one number. I've not experimented with multiple dimensions; that seems to be a big topic, but I haven't played with it yet. It's usually a sum of squared differences, a pretty standard Euclidean-distance type thing. And you just iterate that through for every test case, for every individual. Take all those, sum them up. Whichever individual has the lowest score, who's not worse than individual zero (individual zero is the one and only parent), becomes the best one. You promote them, and you repeat this until you decide you're done. More code. And then a demo. So the test case is a pretty simple piece of code.
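The test-case layout and scoring loop just described can be sketched as follows. This is a simplified, hypothetical version: the executor is any callable (standing in for the grid executor), and the fitness is the plain sum of squared differences the talk mentions.

```cpp
#include <vector>

// Each test case is a row of inputs followed by its expected outputs,
// all packed into one flat float array -- the 2D test matrix.
struct TestCases {
    int numInputs, numOutputs;
    std::vector<float> data; // per case: inputs, then expected outputs
    int count() const {
        return static_cast<int>(data.size()) / (numInputs + numOutputs);
    }
};

// Score one individual: run it on every case's inputs, compare to the
// expected outputs, and return one number. Zero means "no complaints".
template <class Executor>
float scoreIndividual(const TestCases& tc, Executor run) {
    float total = 0.0f;
    int stride = tc.numInputs + tc.numOutputs;
    for (int c = 0; c < tc.count(); ++c) {
        const float* row = &tc.data[c * stride];
        std::vector<float> out =
            run(std::vector<float>(row, row + tc.numInputs));
        for (int o = 0; o < tc.numOutputs; ++o) {
            float d = out[o] - row[tc.numInputs + o];
            total += d * d;              // sum of squared differences
        }
    }
    return total; // lower is better
}
```

An individual that exactly doubles its input scores a perfect zero against cases {2 → 4, 3 → 6}; one that always outputs zero racks up 52 complaints on the same cases.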
It seemed like the researchers working on this before me did not make an effort to genericize their code; it was always a one-off for whatever they were doing. I'm hoping I can share this with you and make it a little more generic. So that's why this whole test-case holder. It's just a big pile of test data, the same arrangement we saw on the slide, and then utility functions to get at and deal with the arrays. Not a big deal. The real algorithm, about shaking things up, is here: the population. So, the population. As I say, we didn't get it to a GPU, but at least we keep every core on the CPU busy. OpenMP: a really easy way to add parallelism, and you can turn it off for debugging. My God, is this stuff hard to debug when you get pointers wrong. You've got eight threads going, pointers flying all over the place; turn off OpenMP and drop it down to one thread, and it's a whole lot simpler. For every individual, set up an execution environment. And then for every test, go through your squared differences and add it all up. And then promoting the best is just simply thumbing through. Notice the magic less-than-or-equal sign, buried in the middle of some for loop. Without it, this doesn't work. Literally, you can try to do a simple three-bit adder, and if you leave that equals sign out, you'll never get all eight test cases to come up together. Leave it in, and it rocks. And then just a few individual copies. You need a lot of support routines just to shuffle the memory around, but don't let that get in the way of seeing the algorithm here. And then copy the parent and, my favorite, mutate the children. So, simple code? Yeah. All right. That's more people now. That's great. We're evolving. Okay. Are we on time? Okay, moving along pretty well. So I'm going to show you the demo I put together. And it's quick. So if you blink, I'll run it a few times.
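The "copy the parent and mutate the children" step might look like this. A sketch with assumed simplifications: every genome entry is treated as an opaque value in [0, 256), whereas the real code would also have to respect the column-ordering rule when rewiring inputs.

```cpp
#include <vector>
#include <random>

// Produce one mutant child: copy the parent's genome, then overwrite
// each entry with a fresh random value with probability `rate`.
// The explicit seed keeps the sketch deterministic and testable.
std::vector<float> mutate(std::vector<float> genome, float rate,
                          unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    std::uniform_real_distribution<float> value(0.0f, 256.0f);
    for (float& g : genome)
        if (coin(rng) < rate) g = value(rng);
    return genome;
}
```

With a rate of zero the child is an exact copy of the parent (which, together with the less-than-or-equal promotion rule, is what lets neutral mutations ride along for generations); with a rate of one, every entry is rerolled.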
But I wanted to at least give you the preview of the demo. So over here we have a Bayer image, B-A-Y-E-R. You notice it's heavily mosaicked, that gray and white; it's actually double resolution for this. It's all grayscale technically, although it's full of reds, greens, and blues. Here's the image run through the best Cartesian genetic program so far. This is a validation. What we do is take several hundred pixels out of here at random and say, these are my test cases. Now evolve, evolve, evolve using those few hundred test cases. And when you get one that's better, run the tens of thousands of test cases that are here and produce a visual thing for a human to inspect. That's your validation. And then over here is the uber-complicated-because-I-could visualization of the test matrix. It turns out that with the test matrix being a nice 2D grid of floats, you can take the fitness chunk of it and make a little height map, which displays lightning quick on a GPU. So that's the technical data. It'd be like you guys reading the LEDs on equipment. So, all right, don't blink. What you'll see is it'll come up, it starts bad, and then pretty quickly snaps into place. In only a few hundred generations, this thing locks in an image that's actually pretty close. Now, you will see some discolored pixels in the shadows. The discolored pixels are there because dark pixels are underrepresented in the sample data. Now, if they're underrepresented in the sample data when I intentionally set out to write automated tests for hundreds and hundreds of pixels, how represented in the data would they be if I said, no, no, boss, I'll just write this by hand? I'd probably run a few dozen. I probably would not run thousands. So testing with this opened my eyes to how much testing needs to be done. Let's run this a few times.
It's very common for it to start up looking like moss, or all grayscale, and then snap into color. So these are true RGB pixels, everything flattened. Now, does anybody know... hey, there's a good one. It's all about the random starting conditions, but no matter what your conditions, it really doesn't take long to snap into place. Sorry, I've watched that hundreds of times. It's like my favorite thing. If I could turn it into a screen saver, I would. I just love seeing all the starts. And every time it changes, it's like a surprise; like the goldfish seeing the castle again for the first time. Yes, sir? I simply have to use human voting. That's the best I can do. What I can do before the moon is mock up the best things I can with, say, POV-Ray or something where I have deep mathematical control over the light sources. But on the moon, I'm going to have to let humans vote. And humans may change their vote later, and then we have to go recompute prior images. So it's important we store the raw data stream and can reconstruct it with a new algorithm on the fly. We kind of have a unique feature in that way, and it's very valuable recording a lot of it. Yes, sir? Why don't we do the whole thing... oh, believe me, I wanted to. Sorry, I was told to repeat your question: if we're getting all of this down to Earth and we're running this stuff, why don't we just do all of it on Earth and skip the whole hardware setup on the rover? It's a bandwidth thing. You know, RGB or some other color space compresses a whole lot better than the flat Bayer data. At least, that's what I've been told by my hardware guys who are working on our compression. I'm just a software serf; I have to go with it. One more, one more for good measure. I'm sorry, it's my new favorite thing. Okay.
Now, as I understand it, we're moving right along: scaling. So first off, this works. I'm very pleased with what we were able to get. I'm very pleased that in a few seconds on a cheap laptop we can get something pretty good coming out of this. I think we can go mission-ready. I think we can do bigger nets and get even better image quality. I'm very happy with that. Now it's a matter of scaling, because this has taken us on to other experiments, and in a talk I have coming up on Friday I'll show you how we use this stuff to actually evolve communication protocols, which is far more interesting, I think, to the hacker community. But I'd like you to walk into that talk going, "I could do that." So, it turns out Cartesian genetic programming scales poorly, poorly, poorly. The more outputs you need, the worse it gets. It's N squared in the number of outputs, at least as far as how this thing scales. And by that, I'm not referring to the execution time of your grid, because there are optimization and compilation techniques for that. What I'm referring to is how many generations it takes before you lock in on an answer that's close enough to use in production. There are some problems where 20, 30, 40, 50, 100 million generations, weeks upon weeks of computer time, never lock in on an answer, simply because they have a dozen or more outputs. So managing that is the big research area for Cartesian genetic programming. The code is simple to get started with, but then how do you scale it up to more complicated problems? One technique that's met with a lot of success is what they call simple chromosomes. That is: take your same inputs and feed them through different networks that are evolved at the same time but separate from each other. So nothing from here can connect down to here; nothing here to here. Everything stays strictly within its own network.
And each has one, exactly one, output. So this gets us really low on the O(N squared) curve. Those problems that would take 100 million generations and still not converge will converge in four to five million generations with this approach. So this approach is the salvation for Cartesian genetic programming on bigger problems. However, if these two outputs are related to each other, then you end up co-evolving substructures within the middle here, and you're counting on the substructure evolved here being as robust as the one evolved there. That's not necessarily a good thing to count on. And in fact, it can be hard to score this chromosome independent of that chromosome. How do you say the red was good but the green was bad? I guess you can do that, but it really bites for the communication protocols we had to do. So we've invented something we call complex chromosomes, which says: all right, you've got five outputs you want; at least break them up. Break them up into two or more small sets, and don't provide them all the same inputs; only provide the inputs that we know, empirically or as domain experts, are actually relevant to those outputs. So add the extra part: there's the world; map the world into subsets of inputs, run them through separate nets, get separate subsets of outputs. This converges rather nicely. This is easy to code. This fits the mindset very well for simulations, not just test cases; it fits very naturally with event-handling simulations. And I've shown, over the last several months of run after run after run, that this converges. In fact, this type of thing will converge to a flawless communication protocol in 24 generations. Until a hacker gets involved, but that's another story. So I definitely recommend you look into using complex chromosomes if you want to use this for a real project.
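The routing step of the complex-chromosome idea, hand each sub-net only its expert-chosen subset of the world's inputs, is tiny. A sketch with hypothetical names; each chromosome's actual network is omitted and would be any executor.

```cpp
#include <vector>

// A complex chromosome sees only the world inputs a domain expert
// marked as relevant to its output set.
struct Chromosome {
    std::vector<int> inputIdx; // indices into the world's input vector
};

// Gather this chromosome's view of the world, in its declared order.
std::vector<float> gatherInputs(const Chromosome& c,
                                const std::vector<float>& world) {
    std::vector<float> sub;
    for (int i : c.inputIdx) sub.push_back(world[i]);
    return sub;
}
```

A chromosome declared with indices {2, 0} against world {10, 20, 30} sees {30, 10}; shrinking each sub-net's input set like this is what keeps the generation count off the O(N squared) curve.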
And now the really exciting part, as if this were not exciting enough. So far, what we've been generating are not technically algorithms, but formulas: a given input always gives you the same output. So now introduce a set of variables, state data. The state data is input along with the world's data and gets output as well, forming a loop. Now this network has a memory. The founder of Cartesian genetic programming has had a lot of luck with medical imaging. Doing medical image analysis, he's finding cancer a lot better and a lot faster than humans or any other image analysis, and it's because he's giving his grids multiple passes at the data, with state. We find that by using state, we're able to evolve communication protocols as well. So this little twist of adding a few extra variables you hold on to is very helpful. Now you're not just evolving a function; you're actually evolving an algorithm. However, it's a state machine. So no while loops; you've got to go back to the old comp-sci view where all the world is a state machine, because that's the way this thinks about it. And to really slam it home, take this contrived class. Yes, a widget; who writes widgets every day? I don't. Anyway, what you've got is just two member functions, taking some input, giving you an output. Normally, if you want to return one thing, you might return it as a native type, or you might return it as an object. Here we're just doing floats, because again, we're CGP. This class has two chromosomes, each with private state, plus state the chromosomes share. So it's like having two sets of the state data, one piped into each function. This class could be evolved. This shows you: when you start to pile your chromosomes together with state data, you are actually evolving objects.
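The state-feedback loop just described can be sketched generically: some of the net's outputs are stored and fed back in as extra inputs on the next step, turning a pure formula into a state machine. The "net" here is any callable; the names and the accumulator demo are my own.

```cpp
#include <vector>

// One step of a stateful network: append the held state to the world
// inputs, run the net, split its result into this step's outputs and
// the state carried to the next step.
template <class Net>
std::vector<float> stepWithState(Net net,
                                 const std::vector<float>& world,
                                 std::vector<float>& state,
                                 int numOutputs) {
    std::vector<float> in = world;
    in.insert(in.end(), state.begin(), state.end()); // world + state
    std::vector<float> all = net(in);                // outputs + new state
    state.assign(all.begin() + numOutputs, all.end());
    return std::vector<float>(all.begin(), all.begin() + numOutputs);
}

// Demo: an accumulator net whose output and next state are both
// input + state, stepped twice. A pure formula could never do this;
// the memory is what makes it an algorithm.
inline float accumulateTwice(float a, float b) {
    std::vector<float> state{0.0f};
    auto net = [](std::vector<float> in) {
        float s = in[0] + in[1];
        return std::vector<float>{s, s};
    };
    stepWithState(net, {a}, state, 1);
    return stepWithState(net, {b}, state, 1)[0];
}
```

Feeding 5 and then 3 through the accumulator yields 8: the second step sees the first step's result through the state loop, exactly the memory effect the talk describes.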
Private internal data, private algorithms, public interface: you're actually evolving little miniature classes you can drop in. A few tips, and we're just about at the end. First, visualize everything. Wait for it. Okay, here we go. Sexy, right? No? Sexy. I can't tell you how many yawns, I-don't-care looks, and unanswered emails you get when all you can say is: look at these numbers, look how well this image has converged. Oh, and the reason it's going faster is that it's not running validations, because it hasn't come up with a better fitness number; it's slow at the beginning because it's coming up with new ones all the time. So it's critical that you come up with some kind of visualization to show: how is this doing? How is this working? How well is it working? You'll be able to put that in front of your stakeholders; they came to you with this problem and you solved it. Also, I can't tell you how valuable this was for debugging my code. When everything came out in red and I could not get any green to show up, I knew I had an index off when I set up the test cases; I was pulling the wrong data from the test cases, one that was never presented with the key color. So visualize, visualize, visualize. I've got some data here that helps me technically, and some data over there that helps me explain this to people, to show them what's going on. Well, technically you can fly through this, but it's not taking my keyboard commands. In fact, you could fly through a demo. I think it's nice. Very important. Second, we're used to writing unit tests. Excellent. But our unit tests are pass or fail. Hand of God: you live or you die, pass or fail. That doesn't work too well with Cartesian genetics. Unless you write millions of unit tests, you want a unit test that says "almost there" or "not even close." You want shades of gray.
You want that unit test to give you some indication of how much progress you still have to make, because without that, it's hard to have a fitness function. If every call just gives you back a zero, all zeros look the same; all failures look the same. So how are you to know that you actually made progress? That's where the neutral search helps a little. But if your test cases return some shades of gray ("you got 10 of the rows back when you queried the employee database"), you can get search algorithms to converge faster. So you have to change your paradigm a little from the traditional unit test. Third, increase your population. Everything I've been running has used a population of four, because that's what the main researchers did their work with, so I just followed that to get started. But I think they chose four because that was the number of cores on their CPU. I've got eight, and test after test I've run now shows me: a population of eight takes three times as long to run as a population of four. But even if I cut the generation count to a third, so I'm running, say, 100,000 generations of population four in an hour versus 33,000 generations of population eight in an hour, I get much better results with the population of eight. So crank that population size up as high as you can go; there are probably diminishing returns. And finally, mutation rate. There are like 100 parameters you can tweak and tune with this algorithm, but the key one is mutation rate. If you end up with a giant grid of functions, maybe 4,000 functions possible in there, of which 20 or 30 are active, one little lightning bolt through them at a time, you might be inclined to say: crank the mutation rate up to 20%, let's change a lot every time. It doesn't work. Likewise, if you have a tiny little grid, you might need a high percentage.
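The "shades of gray" idea above amounts to replacing a boolean test with a distance: a sketch, with illustrative names, of scoring a candidate by how far its outputs are from the expected ones, so the search can distinguish "almost there" from "not even close."

```python
def graded_score(candidate, test_cases):
    """Sum of absolute errors across test cases; 0.0 means all pass."""
    return sum(abs(candidate(x) - expected) for x, expected in test_cases)

cases = [(0, 0), (1, 2), (2, 4), (3, 6)]     # target behavior: f(x) = 2x

almost = lambda x: 2 * x + 1                 # off by one on every case
wrong  = lambda x: 0                         # constant, way off
print(graded_score(almost, cases))           # → 4  (small error)
print(graded_score(wrong, cases))            # → 12 (large error)
```

A pass/fail test would score both candidates identically (zero passes), but the graded version tells the evolutionary search that `almost` is the better parent to keep.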
So it's really the number of mutations per generation that matters, and it needs to be not a percentage but a slowly increasing number. With 4,000 nodes, you might want to mutate only 10; with 1,000, maybe seven. It doesn't scale with grid size in any obvious way, and there's no good rule of thumb on it right now; it's something you have to experiment with. You have to evolve as a researcher when you do this. You're going to do hundreds of little probing runs. You're going to say: okay, that's good, I'm going to run this; I'm going to Berlin for the week and I'm going to let this run, because I did my probing runs and I think this is probably a worthwhile track to let the servers keep working on. Places to go next with this research: push this to a GPU. You'll notice I left the GPU out, for good reason. Especially validation: validation is so many test cases, you want to push that to the GPU. There's a great free ebook out there called Evolved to Win, and they had really good luck using islands. So you take a little island, and within that island the individuals don't know anything about the other islands; they just compete with each other. And every once in a while, the best ones from those islands migrate to the islands around them in a chain, actually a circle. It turns out you can evolve a strategy that way that's better than if you constantly played a master. Novices fighting each other on islands will evolve better strategies than a novice who fights the master and is trained by that one master. It's better to be trained by novices and middling players, working your way up. That seems to be the data with backgammon, and probably the data in a lot of competitive situations. And then, because like I said there are hundreds of things you can tune here, put another optimization algorithm on top when you do your probing runs and let it auto-tune for you. A great way to go. And lastly, somebody here needs to help me put this up as an open source project.
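The ring-of-islands scheme just described can be sketched in a few lines: each island hill-climbs on its own, and periodically the best individual is copied to the next island in the circle. Everything here (the toy fitness target, the mutation operator, the parameter values) is an illustrative assumption, not the book's or the speaker's setup.

```python
import random

def fitness(x):
    return abs(x - 42)                        # toy target: find the value 42

def mutate(x):
    return x + random.uniform(-1, 1)          # small random step

def evolve_islands(n_islands=4, generations=200, migrate_every=20):
    random.seed(0)                            # deterministic for the demo
    islands = [random.uniform(0, 100) for _ in range(n_islands)]
    for gen in range(generations):
        # Each island does an independent local search step.
        islands = [min(x, mutate(x), key=fitness) for x in islands]
        if gen % migrate_every == 0:
            # The current champion migrates to the next island in the ring.
            i = min(range(n_islands), key=lambda k: fitness(islands[k]))
            islands[(i + 1) % n_islands] = islands[i]
    return min(islands, key=fitness)

print(evolve_islands())                       # converges close to 42
```

Migration only ever copies the champion over a neighbor, so the best fitness in the whole ring never gets worse, while the islands still explore mostly independently between migrations.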
See my friends at Part-Time Scientists. And a shameless recruiting plug: join us. I'll open it up to questions now. Thank you. Thank you for your talk. We have one audio angel running around over there, so please raise your hand if you want to ask a question and wait for the microphone. We also have a signal angel active in the IRC, so please also ask your questions there. The first question, please. Yes: you said that it would be good to keep some state. What comprises a good set of state variables? Where do you get them from? Is it by randomly selecting some gene, or keeping them for the next run? Where does a good state come from? Let the algorithm decide what it chooses to put in the state. Don't play God with it. I've seen it over and over with the communication protocol. Yes, it provides a feedback loop. He has another t-shirt to give out if you have a good question. Two more good questions. Next: the process of optimizing the inputs you mentioned, the four cores, the eight cores, and the parameters you can vary, that looks exactly like the problem you're solving. Can you do genetic computing on genetic computing to optimize the process? Potentially. So, can you use genetic computing to optimize the genetic computing? I would think so. What you would end up with is an algorithm for optimizing any genetic computing, whereas what would be sufficient is to use standard optimization techniques. I personally wrote a big particle swarm optimizer and found a new attractor technique that's really good on small sets, and I hope to plug that in on top. But yes, you can totally do genetic programming optimizing genetic programming optimizing genetic programming, as many layers as you want, until... well, it's the old turtles-on-top-of-turtles thing. Excellent question.
Next question: you said that you eventually want to implement the algorithm that you're evolving on an FPGA. Yes. And I can imagine that with such a complicated algorithm there are a lot of dead ends, and you said you included a no-op statement. Yes. So is there any algorithm simplification involved when you convert the algorithm to run on an FPGA? Yeah, certainly. A lot of the algorithms that Cartesian genetic programming generates have tons of redundant, kind of pointless code. So what you really want to do is not generate a circuit directly; you want to generate an intermediate language like VHDL or Verilog and let the optimizing compiler there trim it down. And if you give me a moment, I can show you some of what I mean... if I have any images here... nope, sorry, I can't find one quickly enough for my taste. But Julian Miller came up with this: what he did was generate C code and then let the optimizing compiler remove the dead variables, because the evolved optimization only goes so far. That cuts out a lot of dead ends. For us it would probably be manual, because we'd only burn what gets sent to the moon into the FPGA once; but what we do down here would be highly automated. Next: I do not understand how you decide to stop the mutation. How do you fix the borders of the mutation? Yeah, how do we know when to quit? Well, it's easy if you have definite test cases where you can say all my test cases pass. Say you're evolving some Boolean circuit and you can produce the 1,024 test cases: when they all pass, you can quit. When it's an image, that's a little harder to know. What you find when you look at fitness over time, as the generations go, is that the fitness drops off.
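The dead-code removal just described has a well-known CGP form: walk backwards from the output genes and mark only the nodes actually on the path, so everything else is never executed (or never emitted as C/Verilog). The genome layout below is one common convention, used here for illustration.

```python
def active_nodes(n_inputs, nodes, output_genes):
    """nodes[j] = (function_id, src_a, src_b); sources index the program
    inputs first (0 .. n_inputs-1), then earlier nodes (n_inputs + j)."""
    active = set()
    stack = [g for g in output_genes if g >= n_inputs]  # skip direct inputs
    while stack:
        j = stack.pop() - n_inputs
        if j in active:
            continue
        active.add(j)
        _, a, b = nodes[j]
        # Follow only connections that point at other nodes, not raw inputs.
        stack.extend(src for src in (a, b) if src >= n_inputs)
    return sorted(active)

# 2 inputs, 4 nodes; the single output reads node 3, which reads node 1.
nodes = [(0, 0, 1), (1, 0, 0), (0, 2, 2), (1, 3, 0)]
print(active_nodes(2, nodes, output_genes=[5]))   # → [1, 3]; nodes 0, 2 are dead
```

On a real genome this is the "20 blocks out of 2,000" effect: only the marked nodes need to be evaluated per fitness call, or handed to the downstream compiler.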
And what you find is that progress slows and slows and slows. At some point, usually after some probing runs, and I've been finding after about 100 to 200 thousand generations, at least for communication protocols, I quit and just start a new run. So right now I rely on manually doing the little islands, and then I take the best out of hundreds of runs and go with that. You could periodically take those and re-run them, but there is no good way to know when to stop if you've got a soft test case. Next: hi, I'm the signal angel, and I have the questions from chat. Excellent. Hello, chat. They went in two distinct directions. The first: which kind of hardware are you using? ATI or NVIDIA cards, for the GPUs? And if you use FPGAs, what type would that be? Well, we're NVIDIA fans, and CUDA certainly makes it very easy to do the coding; I find it easier to work with than OpenCL. So we use the NVIDIA cards. If you had the extra time, you could do it in OpenCL and port it to anything. And Xilinx FPGAs is what we're using. Spartan-3E or 6 was the question? That's one for my hardware guys. What kind of FPGA are we using, Arne? Virtex-5. Virtex-5, thank you. And how do you tell how the image should look at the end, is the other question. Well, for a lunar image we'll have to rely on human judges. For the test images you saw running up there, what I did was use POV-Ray, the 3D ray-tracing program. I took one of their pebble scenes, an award-winning scene, because I thought it looked lunar. I rendered it at a low resolution, so every light ray coming from a pebble was actually kind of thick, and I rendered it at a high resolution, so the light rays are thin. Mathematically they're equivalent, and I can use the pair to recover the mapping between them. Okay, one question here, straight ahead.
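For a soft test case like the rendered image pair above, the fitness is typically some per-pixel distance between the filter's reconstruction and the reference rendering. A minimal sketch, assuming RGB tuples and a mean-absolute-error score (the actual metric used in the talk's runs isn't specified):

```python
def image_fitness(reconstructed, reference):
    """Mean absolute per-channel error; 0.0 would be a perfect match."""
    total, count = 0.0, 0
    for row_out, row_ref in zip(reconstructed, reference):
        for px_out, px_ref in zip(row_out, row_ref):
            for c_out, c_ref in zip(px_out, px_ref):   # R, G, B channels
                total += abs(c_out - c_ref)
                count += 1
    return total / count

# Toy 2x2 images: the "reconstruction" is slightly off in a few channels.
ref = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (128, 128, 128)]]
out = [[(250, 0, 0), (0, 250, 0)], [(0, 0, 250), (128, 128, 120)]]
print(image_fitness(out, ref))    # small positive error, shrinks as the filter improves
```

This gives exactly the graded, shades-of-gray signal discussed earlier: the evolved debayer filter's score keeps dropping toward zero as its output approaches the reference rendering.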
You said the runtime grows with at least n to the power of two. Have you ever thought about implementing pattern recognition or backtracking into your code for similar cases with a similar pattern calculation? It's the generation count that grows. And there are optimizations you can do to eliminate dead code, especially for the communication protocols; these image nodes are really tiny, while the communication ones are bigger. Dead code removal makes about a 20-times difference: you only execute the 20 blocks out of the 2,000 that really matter. Aside from that, I don't think there's much else you can do, but I'd be open to creative suggestions. Next: have you ever tried running this with a fitness function that is actually pretty expensive? Say you want to discover unknown patterns and a human has to evaluate each of the test runs, so the fitness function is essentially the bottleneck. How do you deal with that? Small populations and islanding. Because you randomize one parent, and then you make progress, and that parent's initial setup carries so much momentum that at some point you just can't go much further with it. So you start with, say, a hundred 100-generation runs, running them in parallel if you can, and then you say: okay, the ones that do best, I'll commit those to much longer execution cycles. Islands are your answer there. Next: do you have any thoughts on the wider philosophical implications of this kind of thing? Do you believe Wolfram when he says that in the future we won't write algorithms anymore, we will only guide our machines to search the space of all possible algorithms? Do you believe the hype on that? Not yet, mainly because of the scaling issues here. I've yet to see this thing write a user administration screen for assigning rights to somebody on a web page, and I doubt it could.
However, I do believe that we humans spend too much of our time rewriting the same little algorithms that we should be boilerplating. In some sense, what you're doing with this is saying: to solve this problem, I must put all these tools on the desk, because any worker who solves it needs these tools. That's our primary expertise: creating the tools and using our intuition about which tools somebody probably needs to solve the problem. I think we spend too much of our time solving the problem rather than figuring out how to solve problem-solving, how to put the right tools on the desk. That's the growth area for us, and until we've grown more in that area, I don't think we'll be able to answer Wolfram's question. So, two more questions and then we have to finish. Yes: the way I understood it, you implemented it so that you only choose one parent for the next population. Yes, that's correct. So this is more similar to a greedy neighborhood search than to a genetic algorithm, where you generate a whole population, choose from it, and recombine the solutions into a new population. There is technically no crossover, which characterizes true genetic algorithms; there's no crossover of genetic material from one individual to another. That is true. However, when you start doing a chromosome approach, and I don't think I explained it very well, so I apologize: say I have five chromosomes representing my individual, and a population of four individuals, so I've got 20 chromosomes out there. What you do in that case is take the best first chromosome across individuals, and that chromosome goes into the next parent, so the next parent might not have come entirely from one individual. So as soon as you start using the chromosome approach, it does become more of a genetic approach. Some people
have experimented with crossover, grabbing two individuals and crossing them over, but they have not been very successful; chromosomes, though, will bring this back toward the true genetic approach. Thank you. One more. All right, did you have a question there? Okay, I see. You're going to do it on your own? Okay, then we have... close enough. Nobody? The mission angel... oh, I'm sorry, the signal angel wants to say something. Yeah, we have one last question about the FPGAs, because someone says there are already cameras with onboard FPGAs that follow an open standard and can be reprogrammed, and they are asking why you are not using such cameras. That's a very good question, and I will take it up with my management. All right, that's it. Yes, that's it. Thank you very much. Thank you everyone for sitting through a talk on my new favorite subject.