Thanks for coming today. I hope you've enjoyed the conference so far; I know I have. It's been a rich variety of talks, something for everybody, so I hope I'll increase the diversity today with this talk, which is probably not what you expect. My name is Phil Smith. I'm an amateur scientist, I do this stuff as a hobby, and I use Python to do the work. The thing I'm interested in is recombination. Does anybody not know what recombination is? Some idea? Okay, I'll just go over it very simply. In your body you have two sets of chromosomes, two sets of genes, one from each parent, and when you make eggs or sperm this process of recombination goes on. A sequence of DNA is produced that has taken bits of DNA from each of your parents. So it starts on one... okay, I think I've touched something. So it's basically going along one chromosome, jumps over to the other, takes some DNA from that, jumps back, takes some more DNA, at some rate, which is called the recombination rate. This goes on inside the cells that are going to produce your eggs and sperm. The other thing that can happen is that you've got 23 pairs of chromosomes in humans, and when those chromosomes line up during meiosis they can independently go in either direction, so you end up with a mixture of chromosomes in each daughter cell. That's separate from recombination, but it's kind of the same thing: it's recombination with a rate of 50%. I should check this is working... here we go. Oh, I hadn't turned it on, that's fine. Okay, so just to bore you even more, how many of you guys did biology at school?
So I used to hate this stuff. I learned it time and time again and forgot it just as often. But these are the stages that cells go through when they're producing gametes, which are either eggs or sperm, and the one we're mainly interested in is pachytene, up there. You'll forget this, but don't worry. This is the stage here, and what's happening is that the DNA starts to condense, the chromosomes that are homologous pairs line up, and they start to do this crossing over here, swapping DNA. As they pull apart you'll see these bridges form between the two chromosomes. This is also where independent segregation occurs, and you end up with two new genomes which are different from the two parents because this recombination has occurred. They contain genes from each parent, but in different combinations. So as you can see, this is a kind of complicated dance, and presumably it does something. So the question is: what does it do, and why is the cell doing it? That's part of what I'll be talking about today.
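The crossing over described above can be sketched on plain lists of bits. This is only an illustration under assumptions: the function name `recombine`, the per-position jump probability `rate`, and the random choice of starting chromosome are all hypothetical and are not taken from the speaker's code.

```python
import random

def recombine(maternal, paternal, rate, rng=random):
    """Build one gamete from two equal-length chromosomes.

    We copy along one chromosome and, before each position after the
    first, jump to the other chromosome with probability `rate`
    (the recombination rate in this sketch).
    """
    if len(maternal) != len(paternal):
        raise ValueError("chromosomes must have the same length")
    parents = (maternal, paternal)
    current = rng.randrange(2)  # which parent's chromosome we start on
    gamete = []
    for i in range(len(maternal)):
        if i > 0 and rng.random() < rate:
            current = 1 - current  # cross over to the other chromosome
        gamete.append(parents[current][i])
    return gamete

# Example: a low recombination rate gives long runs from each parent.
print(recombine([0] * 12, [1] * 12, rate=0.2))
```

With `rate=0.5` each position is as likely to come from one parent as from the other, which is the sense in which independent segregation was described as recombination with a rate of 50%.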
Part of the problem with... so what we're talking about is sex: why is there sex, and the evolution of sex. It's an old problem in biology. It goes back quite a long time, but it was really formulated in the 1970s, and one of the big problems is called the two-fold cost of sex. Having two sexes is very wasteful. If we had just one sex and it reproduced asexually, you'd produce more offspring, because if you have males and females only the females can have offspring; therefore if you had only females you'd produce twice as many offspring. So this is the cost-of-sex idea: it's quite expensive, so what's it for? Then you have recombinational load. You might have a really great combination of good genes here, say one of your parents was just okay and the other was really good, and we recombine them and we don't get good, we get something between good and okay. So that has some kind of cost. And we have the cost of courtship, which is really expensive: you can have all these kind of useless but pretty appendages that increase your chances of mating. So again, from a biological point of view we're going to an awful lot of trouble. Why? What's it for? I won't go into all the theories about why this occurs, but just to say it's an ongoing argument and probably always will be. So, sex is complicated and it's expensive. Those are the two things I want you to take home from that, if you didn't know them already; this is a Python audience, they might not know.
Right, so the next thing is a completely unrelated subject, and this is the Wolfram hierarchy of machines, so hopefully you feel a bit more at home with this. Has anyone heard of the Game of Life? Anyone not heard of the Game of Life? Feel free to Google it during the talk. It's an amazing thing. Anyone who looks at the Game of Life, I think, becomes intoxicated by it at some stage of their life; they just think all they want to do is do
this now. John Conway came up with it around 1970, late 60s or 70s, and he was basically extending the work of John von Neumann back in the 1940s, who tried to think: okay, if I have a machine that makes itself, what's the minimum amount of work I can do that in? What's the easiest way of doing it? So Conway came up with these two-dimensional cellular automata, which do all kinds of crazy things. Stephen Wolfram, back in the 80s, came along and thought that's too complicated, let's just do one dimension. So he said let's do one-dimensional cellular automata, and he came up with these, which are equally intriguing, and after looking at them he thought there are actually only four kinds. To me this is one of the most insightful things I know about: that there are only four kinds of machines. If that's true, and if you accept it, that's an amazing statement. So class one lead to a homogeneous state, so they basically do nothing. I'll click to the next slide. Class two go to a limit cycle and just oscillate. Class three are chaotic and go to random, so they're basically random number generators. And class four have complex structures that persist for some time; some of them look like this. So class ones do nothing, class twos oscillate, class threes are random number generators, and class four machines can, in theory, simulate all the other machines. That's the big difference. I always used to get hung up on this and thought class three and class four should have been the other way around, because class four is like something in between class two and class three, and I thought they were idiots for doing it that way. But actually they're right. Surprisingly, these guys are smart; they got it right. So class four can simulate all the other machines. Everybody happy with that? Any questions? Okay, so the Wolfram cellular automata. This is an example, this is rule 110, which is probably the most interesting rule, and I'll just go
through how it works. We have this top line up here, the first line, which ranges from zero, zero, zero, that's white, white, white, to black, black, black, and we have all the possible combinations. There are only three bits, so there are two to the three possible blocks. This rule says that if it's three whites then we put a white down underneath, on the first block; for the second block, if the right-hand side is black then we make it white; and so on, up to three black dots. The rules are enumerated this way: two to the one equals two, two to the two is four, and we just sum those up and it gives us a number, and that's the rule, and there are 256 rules. So we have a complete set of rules for all the possible automata we can have with just those three bits. Any problems with that?
So there are 256 rules, and some rules are left-right homologs of each other. Going back, there's rule 124, I think it's 124, where that just goes the other way. So this one is biased that way, and 124 is biased the other way. There are lots that are just... really, look at rule zero, for example, at the top here: it does nothing, because with rule zero everything is zero, nothing works. With rule 255, everything is black, everything works. So those are class one machines; they don't do anything at all, really. And this is a thing called the Wolfram Atlas. You can go online and have a look at the properties of all the different rules. Here we have time... oh, I should start. Here we have time, and this is rule 110. What they've done here is they've started with a random number at the top, and it wraps around, so the last bit on this side calculates its state from one bit on the other side over here, and vice versa. Because it wraps, you can see the gliders go off one side and come on at the other. What you can see is that the gliders have a bias in one direction, but when the gliders collide you can get new gliders forming. So these
are persistent structures going down here. They're gliding down through time and through space, and they're interfering with other gliders, sometimes catastrophically. You can also see this repeating pattern here, where it's just like a mat; that's sometimes called the ether. That's what things propagate through, so these gliders can't propagate unless there's an ether formed to propagate through. That's an important thing that we'll cover later on.
Rule 30 is a pseudo-random number generator. You put in a random input, or any input, here, and it basically just makes a pattern like this. The thing you can pull out here is that there are kind of glider-like structures, but they don't persist for very long. They're quite fragile: they hit another glider and they disintegrate. And rule 30 is a random number generator at 30 bits and above, or 27 bits and above; if you go below that bit length, it stops being a random number generator. So there are phase changes from one type to another in some cases.
So I wanted to play with these, and what I wanted to do was basically generate genomes. I want to have a genome, a bit string, and be able to say whether it is viable or non-viable. So what I did is I took this Wolfram system and I didn't wrap around, so it just truncates, and eventually it goes down to one bit. That is my indicator bit; it tells me whether the string is viable or not. So I get a bit string, which has to have an odd number of bits, I apply my rule to it, and I keep applying it until I get one bit. If that bit is a one, then it's viable; if it's a zero, it's not. I can do that to an arbitrary bit string, as long as its length is odd, and I can apply an arbitrary rule. And I can mutate it. So I've taken this right-hand bit and I've flipped it to a white bit, to zero, to see if it affects it, and it doesn't propagate down, because it's going zero, zero, zero, one, zero, zero, that's zero, two, one,
zero, one, zero, one, zero, that's one, so it hasn't changed it. That's what we call a neutral mutation: it has no effect. Or I can switch this bit here, make that a one, and this propagates down and sets the indicator bit to zero, and so that's a lethal mutation; it's just killed it. So now I can have an arbitrary rule and an arbitrary genome, and I can see if it's alive or dead. For small genomes, five bits and nine bits, I can do all the genomes and all the rules, and I can look at whether they're viable or not. You can see, for example, rule zero out on the edge, where everything dies, and rule 255, where everything lives. In the middle it gets more interesting.
And what I can do is connect all the viable genomes to their one-bit neighbors, their Hamming-distance-one neighbors: if two strings are viable and they differ by one bit, then I connect them up, and I can make a pretty little network like this. You can see different rules have different networks. Some of them are very simple, just regular graphs like these; some are regular graphs of different sizes; some have these dense regions joined up together somehow; you can't reach that one; down to rule 26, which has this really weird graph shape. Unfortunately, once you get to interesting bit lengths, it's not practical to draw it.
So now I've got my network, I've got my nodes, and I can mutate them. If a genome is a bit string of nine, it's got nine possible neighbors, because I can flip nine bits. How many of those are viable? It might be, say, five out of nine are viable, or seven out of nine are viable. So if I randomly choose a bit and flip it, how many times does it kill the genome? That's what we call mutational load. If you have lots of neighbors, you have low load; if all my neighbors are
viable, then every mutation survives. So I'll talk about density and load; basically, load is one minus density. The denser that part of the network, the more neighbors you have, so it's quite an important thing. And I can do recombination: I can take two genomes, swap the bits, choose one of the offspring, and ask whether it is viable or not, and I can measure how often it dies.
So here we have the response. This is asexual reproduction, up on the left here, with the white circles; dark circles are sexual reproduction; and then back again. This is generations along here, so I can switch sex on and off with different recombination rates. This is rule 30 with recombination. It's a random number generator, so recombination does nothing; it just randomizes it some more. Whereas with rule 110, initially the population starts to move towards the dense regions of the network, where there's less load. So the population moves in there, it starts to move down here. I introduce recombination and all hell breaks loose, because I've smashed up all those combinations of genes. But I'm selecting for those individuals that have the lowest recombinational load, they're surviving more, and it drops down to some equilibrium point. Then if I turn recombination off, I've got less mutational load than before, because the population has gone into a denser region of genome space. So I can kind of tell the difference between these two types of machines, the class three machine and the class four machine: class four machines respond, well, appear to respond, to recombination. Now you might say, okay, just check all the class four machines. Unfortunately nobody knows which ones they are, because it's undecidable whether something is class three or class four, except by looking at it and thinking it looks like one. So I have no way of knowing whether I'm doing this better than anything else, because nothing else does it, but it looks good, so we'll stick with that, okay,
all right, so, any problems with that? Any questions? Okay, so how's that working? Well, recombination seems to be selecting for gliders. We take this rule 61, which is quite a nice little rule, and we do this stuff. What we do is we take all those little diagrams I drew before, with the little triangles, for everyone in the population, superimpose them on top of each other, and take a heat map of that. That's what this is down here: a heat map of the population at that time. Everybody get that? That shows us which pixels are popping out, which pixels everybody has. If it's black, everybody has that pixel black; if it's white, everybody has that pixel white; if it's gray, it's pretty much random. So we start with a random population and it's pretty much gray. We run asexually and we get a bit of a pattern emerging. When we turn on recombination, it brings that pattern out: we're making the genomes more similar, they're going into that well in genome space, and we're getting selection for this dense, connected region. We take recombination off and it expands back out, and we get back the same pattern we had at the end of the asexual run. So how's it discriminating?
Well, what it's doing is this: because it can make gliders, we don't need so much of the genome. We just need that little part that makes the glider that's going to slide down and hit the indicator bit, so the rest of the genome is pretty much neutral. And because we've got these small gliders, we can have redundancy. We can send two gliders down, and as long as they can sort out the conflict here, that's fine. So we can have redundant systems. The other thing is that any bits outside of that which may interfere can be kicked off by another glider; we could have a glider that just blocks them. So these gliders allow us to make things really robust. Now, I'm having trouble with this next part, so if it doesn't make a lot of sense, or it doesn't look really good, it's because it's not that good yet. It's hard to prove; we're still trying to make something that's convincing.
We should be able to calculate recombinational load for both class three and class four machines, and we should have an equation that works for one but not for the other. We've really only got three variables we can play with. We've got the Hamming distance, which is the distance between the two parents; we've got the recombination rate, which is how often we're jumping between chromosomes as we generate the next chromosome; and we've got the mutational load of each parent, which is whereabouts in genome space each parent is. We can get the first two quite accurately. Mutational load is quite hard to get: it doesn't tell us everything, it just tells us about that point, and actually it's the topology around that point that we really need to know. But it's not a bad approximation.
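The viability test and the measurable quantities mentioned so far can be sketched in a few lines of Python. This is a minimal sketch, not the speaker's implementation: the names `step`, `is_viable`, `hamming` and `mutational_load` are hypothetical, and the bit ordering (left cell as the high bit of the three-cell block) is assumed to follow the standard Wolfram numbering.

```python
def step(cells, rule):
    """Apply an elementary CA rule once with no wrap-around.

    Each output cell is looked up from a three-cell block, so the
    row gets two cells shorter on every step (it truncates).
    """
    return [(rule >> (4 * a + 2 * b + c)) & 1
            for a, b, c in zip(cells, cells[1:], cells[2:])]

def is_viable(genome, rule):
    """Shrink an odd-length bit string to one indicator bit; 1 means viable."""
    if len(genome) % 2 == 0:
        raise ValueError("genome length must be odd")
    cells = list(genome)
    while len(cells) > 1:
        cells = step(cells, rule)
    return cells[0] == 1

def hamming(a, b):
    """Number of positions where two equal-length genomes differ."""
    return sum(x != y for x, y in zip(a, b))

def mutational_load(genome, rule):
    """Fraction of single-bit mutants that are not viable (one minus density)."""
    dead = 0
    for i in range(len(genome)):
        mutant = list(genome)
        mutant[i] ^= 1  # flip one bit to reach a Hamming-distance-one neighbor
        dead += not is_viable(mutant, rule)
    return dead / len(genome)

genome = [0, 1, 1, 0, 1, 0, 0, 1, 1]
print(is_viable(genome, 110), mutational_load(genome, 110))
```

The relative Hamming distance between two parents would then be `hamming(a, b) / len(a)`. As noted in the talk, the load computed this way only describes the immediate neighborhood of one point, not the wider topology around it.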
So we can try to come up with an equation for recombinational load. This is your load. We can have negative load, which may sound counterintuitive, but redundancy gives us negative load: negative load means you have to have at least two mutations for the genome to be broken. So we can go down this way to basically the length of the genome, so we can have minus N, where N is whatever the length is. We can measure these two things: we can measure the Hamming distance, we can set the recombination rate, and we can play the game. Mutation will drive the population down into these wells; we know it doesn't do it completely, but it does it reasonably well. For recombination, we grab two points from DNA space and recombine them, we get a whole lot of offspring, and those offspring get scattered around the place. But load is one minus density, so the areas that have high load have low density, and the offspring that are closely related to a poorly connected parent are most likely to die. So this will preferentially kill offspring that land in the less dense regions, and that basically drags everything down into the well, because if you're well connected, your recombinant offspring are likely to be close to you and are therefore more likely to survive.
So this is my second attempt at an equation to explain this. The Hamming distance is a relative Hamming distance, so it's the Hamming distance divided by the genome length; then there are the loads of the two parents, and it's a negative power. And this shouldn't work for the class three machines. Class three machines are on this jagged surface: they can fall down a little bit, but when they have recombination it just throws them back out onto the top again, because that seems to be how they work. So let's try that. We've got rule 22, which is a class three machine. It seems that it will, under mutation load,
improve a little bit, but you can have recombination for as long as you like and nothing happens. So this is with our calculated load here. It kind of gets the figures in the right ballpark, not quite though, and there doesn't seem to be any real pattern. We take this population at this time, calculate the theoretical load and compare it with what we actually measure, to see if they match up, and they don't match up very well. Here we do it again, and this has moved down here, so they're actually deeper into the well, but it's not really making any difference to the calculated load or to the recombinational load. With rule 110, we can try it, and what we get is this: we start with that same population here and we get some improvement, so this moves slightly to the left and the load moves slightly down, and then when we're at equilibrium it drops down into this well. So down here, we're down in the well. So we have an equation of sorts. It's not great, but it can discriminate between the two.
So now that we can identify class three and class four machines by using recombination, we can use that to look at some more interesting problems, and one of those is the arms race. Arms races are well established in biology, and probably in a whole lot of other things. If you take, you know, originally throwing a stone at somebody, and now we have tanks and armor and anti-tank weapons, and the whole tank versus anti-tank arms race that's still going on. This is a common theme, so what I wanted to do was play with it. We can have a host genome on the right-hand side here and a pathogen genome on the left, we can put them together, and the gliders can fight it out. If the host wins, it goes back into the host population; if the pathogen wins, it gets dumped back into the pathogen population; and we can just keep doing that. We keep fixed population sizes, and we can write it out like this: we have zero and one for the length of
the host and zero and one for the length of the pathogen. If it's a one the host wins; if it's a zero the pathogen wins. The thing you want to understand about this is that when the pathogen wins, it's winning against a host genome that was winning in the previous generation; so that host won before and it's losing now. And the same goes for the pathogen. Okay, so I'll play this again. Here we have the pathogen on one side and the host on the other, and the gliders are fighting it out, and this shows how well it's doing. On the small graph you'll see the pathogenic load, so how many of the hosts are dying. These are of equal genome length, well, near enough; there's a one-bit difference because the total has to be an odd number. So that's the host genome and that's the pathogen genome, and you'll see these gliders forming, but then there's a counter-glider which wipes them out, and then a new glider forms, and it just goes on in this kind of chaotic, cyclic pattern. So this is at equilibrium: it meanders up and down a bit, but no real pattern emerges.
We can change the ratio of host to pathogen genome length, and I actually thought this was quite interesting, because I don't think, from a disease point of view, anyone has really considered genome length. When you think about it, genome length is important, because whoever can recruit the most genes into the fight is going to win. If you've got a very small genome, you've got very little genetic material to play with, so with a bigger genome you can recruit more genes. Obviously if you've got all of the genome you're going to win, and if you've got none of the genome you're going to lose, and the equilibrium point is at around 50%. Rule 110 forms gliders better one way than the other, so it's slightly to the right of that. So that's with 75-bit genomes. Let's introduce recombination.
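The host and pathogen contest described so far, before any recombination is added, might be sketched as follows. Everything here is an assumption made for illustration: the names `indicator_bit`, `host_wins` and `next_generation` are hypothetical, mutation is left out, and the way the fixed population sizes are maintained (collecting winners and topping up from the old population) is a guess, since the talk does not spell it out.

```python
import random

def step(cells, rule):
    """One truncating update of an elementary CA rule (no wrap-around)."""
    return [(rule >> (4 * a + 2 * b + c)) & 1
            for a, b, c in zip(cells, cells[1:], cells[2:])]

def indicator_bit(cells, rule):
    """Reduce an odd-length row to its single indicator bit."""
    if len(cells) % 2 == 0:
        raise ValueError("combined length must be odd")
    while len(cells) > 1:
        cells = step(cells, rule)
    return cells[0]

def host_wins(pathogen, host, rule):
    """Pathogen on the left, host on the right; an indicator of 1 means the host wins."""
    return indicator_bit(list(pathogen) + list(host), rule) == 1

def next_generation(hosts, pathogens, rule, rng=random, max_contests=10_000):
    """Run random contests; winners go back into their own population."""
    new_hosts, new_pathogens = [], []
    contests = 0
    while ((len(new_hosts) < len(hosts) or len(new_pathogens) < len(pathogens))
           and contests < max_contests):
        host, pathogen = rng.choice(hosts), rng.choice(pathogens)
        contests += 1
        if host_wins(pathogen, host, rule):
            if len(new_hosts) < len(hosts):
                new_hosts.append(host)
        elif len(new_pathogens) < len(pathogens):
            new_pathogens.append(pathogen)
    # Assumption: if one side never wins enough, refill from the old population.
    while len(new_hosts) < len(hosts):
        new_hosts.append(rng.choice(hosts))
    while len(new_pathogens) < len(pathogens):
        new_pathogens.append(rng.choice(pathogens))
    return new_hosts, new_pathogens
```

In this sketch the pathogenic load for a generation would be the fraction of contests in which `host_wins` returns false. The one-bit difference in genome length mentioned in the talk shows up here as the requirement that the combined length be odd.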
The theory says recombination should help: you're going to get these wonderful new combinations of genes, they're going to win, win, win, they'll get sick of winning, and they'll just spread through the population. And that just does not happen. Here we've got pathogenicity along here, and we're introducing host sex, so the host is doing recombination, and it's a disaster. The load goes right back up to where it was when we began, and it drops down again under asexual reproduction here. Then we try it in the pathogen and the same thing happens, but with the opposite sign, going in the opposite direction, and again it goes back to equilibrium there. So in both cases sex made it worse. So what's going on? Well, we can look at the length of the genome and say: if the pathogen genome is longer, it's going to have more material it can draw on, it's going to drag the equilibrium this way, and it's going to have a well somewhere in genome space that it can use. But the population will be dragged the other way by the smaller genome, and so it's basically going to sit at the equilibrium point, which will mean that for the side with the small genome it's a random number generator. The side with the big genome can just deal with the opponent using part of its genome, and the rest just makes sure it hits the indicator bit. For the host, if the host genome is long, it can just isolate the pathogen and say, okay, you stay over there and don't interfere with my genes, and I'll deal with the rest of my stuff, and so it can go into a well. But the pathogen genome is completely bound up in the fight. It has no redundancy left; there can't be any redundancy, because every gene has been recruited into the fight. So the way I think of it is the octopus fight.
If you've got two octopuses fighting, they recruit legs into the fight until all the legs are occupied and there's an equilibrium state. If one limb now gets better or worse, you're going to win or lose just by that, because once one leg's gone the other legs can then gang up until you've won the fight. So this is kind of like an octopus fight: the more legs you have, the more likely you are to win.
Right, so we can run this. I'll just let this run through and you'll see something weird going on, and then I'll explain why I did it. This is a fight between two genomes of equal length, host on the right, pathogen on the left. You'll notice pathogenicity is slowly dropping, and it's all going along, and bam, that happens. What I did here was I was actually trying to make sex work. I thought, okay, let's drop the mutation rate right down so that all the pathogen genomes are basically the same and it can then adapt to that. And that didn't work at all; in fact, sex still didn't work. But what you get here is a highly structured defense. These are all pretty much fixed. It creates a zone of death here. The pathogen cannot propagate across it: there's no ether for its gliders to come over. So the pathogen genome is just bouncing off here and becomes randomized, because nothing works. And you get this solution. So if you want to watch again, watch this graph and that graph; I'll just repeat it for you. See what's happening with the pathogenicity: it's gradually dropping until this highly organized structure evolves. And it has to evolve slowly because it's such a big structure; if the mutation rate was high, it would keep breaking. So eventually it comes along and it just cleans up. Bam, it wins. Okay.
Now, part of the reason why it doesn't happen on, sorry, on the host side is because there aren't that many gliders going in that direction that rule 110 can select from. Okay. Any questions on that? Bamboozled? Unconvinced?
So, the conclusion, right. The surface in these arms races is spiky. It's not a smooth surface; it's a very jagged surface. So if you have mutation, you're going to get thrown out of that spiky little well. You can't stay there, you can't survive there; you get thrown back onto the surface. Recombination is going to do the same thing. So recombination disrupts these multi-gene structures, and mutation breaks them as well, so you have to have a very low rate of change. In conclusion, you can't have sex on a spiky surface. It just doesn't work. And class three machines are spiky surfaces; class four machines have smooth surfaces.
I have to mention Python at least once. That was it. I used mpi4py to run over several processes. It took a bit to figure out, but eventually it worked. I used Cython to try and speed things up, so I just recompiled some of the functions in Cython. I used Matplotlib to draw things, and Jupyter notebook to analyze the data.
And just a few acknowledgements. Thanks to all from the Python group, who've been really good. My friend Lawrence Deliveroy, who's the guy I ring up when I don't know what to do, and who really suggested using Python for this stuff. It was a great idea, because there's such a great tool chain for analyzing data and doing things, and there are so many people using it; basically, whatever you want to do, someone's already had that problem and there's an answer. Some of my long-suffering science friends at the University of Auckland and Massey. Mark Kucinoff at Auckland, who was instrumental in getting this started when we had a few discussions. And this is Stuart Kauffman. He wrote this book, The Origins of Order.
It's a really good book to read if you're interested in biology and complexity. He was one of the guys who set up the Santa Fe Institute, and I had the pleasure of meeting him early this year and having a chat with him. It's quite old now, but it's still a really good book. I think that's it. Thank you.
That's a good question. Bigger, longer, more is one thing. There are possibly other phase transitions out there. We saw some in rule 30; there could be phase transitions in some of the other rules. And one of the things I want to look at is whether there is anything I can use this for practically. Highly unlikely, but you never know. Any other questions? Yep.
In ecology or biology, some mutations can have favourable results in terms of longevity and some negative. Does this differentiate between those two potential pathways?
Yeah. This is based on basically looking at the role of recombination in longevity. There's a theory of aging called reliability theory, which basically says you have so much reliability and you consume it with time. Different people have different amounts of reliability, and eventually you run out of spare parts or spare capacity. I wrote a paper on this saying that recombination selects for reliability because of glider formation and redundancy. So it selects for redundancy. Because basically what you're doing is... if we go way back... excuse me, sorry to be a pain. Okay, the thing that these are about is that you have these logical statements that say things are true or false. And everything in the real world is conditional: there are conditional truths, or contextual truths. So this might be true, but if we introduce some other change, it might change it to false at some distance. I originally designed a kind of nested rule system, so the truth of a statement depended on the truth of something else somewhere far away.
Okay, and what recombination does is it keeps changing the context of those truths, and so it selects for truths that are context-robust, but not context-universal truths. Do you understand that? Okay. Any other questions?
You just look at all of them. Basically, what I did originally was generate all the rules and go through them, and there is information on the rules: there are articles about rules and there's an atlas of rules, and you play with them and see what happens. You don't have to look at all of them, because some of them are identical. There are black-white equivalents and left-right equivalents, and so you quickly narrow down the rules that are of interest. But in the end I wrote programs that just looked at all of them, and I just looked at the output and said, oh, that one looks interesting. Yeah.
Would you mind going over the 64 possible rules one more time? Because I see...
Okay, sure. Where is it? So you've got three bits here, right? So there are two to the three possible blocks you can have. So you can have... that could be zero, zero, zero, zero along there. So there are two to the eight possible rules you can have. But I think there are only about 88 actually unique rules once you take out all the black-white homologs and left-right homologs. Because this rule, you can see it's biased one way. So black, black, white is not there. And if I had black, black, white was black, and then white, white, black... so if white, black, black was white, then that would be the same rule, you understand?
So it's like symmetric.
It's symmetric, yeah. If you go on the Atlas, go and look at the Wolfram ECA, Elementary Cellular Automata. The Atlas has all the properties of the rules and their homologs or related rules. And there are logical versions of this, rather than just going that's a two, that's a three. But that's how I do it, I just go: that's one, that's two, that's three.
If I find a two or a three in those bits as I scan along then it's viable. Yep. Any other questions? Well, thank you very much.
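The rule numbering and the left-right and black-white homologs discussed in the last answer can be sketched like this. It is an illustration only: the helper names `rule_table`, `from_table`, `mirror`, `complement` and `equivalence_classes` are hypothetical, and the numbering assumed is the standard Wolfram one, where the three-cell block read as a binary number picks out one bit of the rule number.

```python
def rule_table(rule):
    """Map each three-cell block (a, b, c) to the output bit of a rule 0..255."""
    return {(a, b, c): (rule >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def from_table(table):
    """Rebuild the rule number by summing powers of two for blocks that output 1."""
    return sum(bit << (4 * a + 2 * b + c) for (a, b, c), bit in table.items())

def mirror(rule):
    """Left-right homolog: read every block in the opposite direction."""
    t = rule_table(rule)
    return from_table({(a, b, c): t[(c, b, a)] for (a, b, c) in t})

def complement(rule):
    """Black-white homolog: swap the two colors in both input and output."""
    t = rule_table(rule)
    return from_table({(a, b, c): 1 - t[(1 - a, 1 - b, 1 - c)] for (a, b, c) in t})

def equivalence_classes():
    """Group all 256 rules into classes related by mirror and/or complement."""
    seen, classes = set(), []
    for rule in range(256):
        if rule in seen:
            continue
        cls = {rule, mirror(rule), complement(rule), mirror(complement(rule))}
        seen |= cls
        classes.append(sorted(cls))
    return classes

print(mirror(110), complement(110))
print(len(equivalence_classes()))
```

Only one representative from each class needs to be examined, which is the narrowing-down step mentioned in the answer; `len(equivalence_classes())` gives the number of rules left after both kinds of homolog are merged.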