We are live to the world from Albuquerque, New Mexico. It's noon on Tuesday. This is an experiment. There are folks watching on the live stream. Hi everybody. It really makes it exciting. I want to explain what we just looked at, and that's going to be the major activity of this update, as well as just trying to kick the tires on this crazy stuff. So let's get into it. So for those of you who are just joining us, what is this all about? What is the T2Tile project all about? It's about building indefinitely scalable systems. I find that when we build engineering systems, we just build a machine, and a machine has edges. And what we tend to do is throw all the inconvenient stuff to the edge, all the waste, the garbage, the pollution, the cost to society, and so forth. We push them to the edge and let them drop. The idea of indefinite scalability says, no, what if you can't do that? What if you have to build the system so that your design can go arbitrarily far? Then it has to account for everything. In computing, that means we have to give up on the idea of central processing and random access memory. We have to give up on the idea of deterministic execution, the idea that computers are completely repeatable. How are we going to do it? Well, that's what the T2Tile project is all about. We start with an individual little piece of hardware, a tile, like this one. This is the T2 tile that we've been working on for many years now. The point is we can repeat this, make as many as we want, and connect each one to copies of itself. The point is it's all completely decentralized and bottom up. Often in parallel computing systems there's a head node, a big cheese, someone who's in charge. We don't have that here. It's all working from the bottom, and each individual tile, as far as it knows, is the center of the universe. It's kind of like people. And to go with that, because that's a very different way to compute than thinking I'm in charge of everything, I'm the central processor.
Everybody just waits until I tell them what to do. As a result, we need a whole new best-effort software stack, because these things might crap out individually without the whole grid going down. So that's what we're trying to do. We've built hardware. We've got simulators of the hardware. Most of what we've been doing for the last several months has been in the simulation. The hardware still needs some work, and we're going to be getting back to that soon. But why make this whole effort? We've got computers. The reason I am motivated to do this is I am seriously concerned about the future of information technological society. Now, maybe we're all going to die of pandemics or the next fungal plague or whatever anyway, but assuming we manage to get past that, assuming we manage to hold onto some kind of actual organization that allows for innovation, that allows for liberty and bottom-up influences to persist, I'm concerned about the computer technology base that we have. Based on CPU and RAM, it's incredibly fragile, it's incredibly brittle, and it's essentially impossible to make secure. And we don't even want just security. What we want is loyalty. And it doesn't even make sense to say, how could a machine be loyal? And the reason it doesn't make sense is because our machines don't know us. And people say, oh, you know, Facebook has so much data about us. Yeah, but they're not our friends. They don't have our best interests at heart. We want our hardware to be ours, to live and die with us. That's the goal. That's a pretty big goal. That's pretty ambitious. A whole new type of hardware, a whole new approach, a whole new software stack. When is it going to be here? When is it going to run Excel spreadsheets? And the answer is: this is a research and development project. It's closer than it was two weeks ago.
And where this whole story just ended, last update, was we had taken this idea of a tectonic plate, a little solid block of atoms that we could customize to do what we want, and said, well, what if we pop them out and spread them apart so there's some empty space in between them, and we could put other things in there. Last time we made a crossbar switch. This time we made something else. My repeating goal in this is, you know, this is a crazy spatially distributed cellular automata, a new kind of computing architecture. Well, not that new. People have been studying cellular automata in various shapes and forms all the way back since the beginning. In the details, with the engineering focus, yes, it's very different. The main difference is, oh, I'm cropped off here, paths to utility: the goal is not to imitate life directly. That may be helpful to get something done, but that's not the top-order purpose of it. The purpose is to figure out a way to do engineering on this sort of thing in a way that could actually be useful to society, so that we could build future tile grids that would control systems that you might not be insane to let near you, because they would be beholden to you. They would be your buddies. You know, I suppose they might get pissed at you from time to time, but they're yours. They will die for you. That's what we're looking for. And the path to utility this episode goes through closed box function optimization, what used to be called black box function optimization, but now we're trying to find terms that are less color-oriented. So, closed box versus open box: in open box function optimization, you're allowed to look at the details of the functions. Oh, this is a linear function, this is a quadratic function, oh, I know what to do with this. In closed box function optimization, all you've got is this:
You send points off to the function and you get back values to the search strategy, and the search strategy's job is to figure out what points to send to the function in order to get higher and higher values back. Separate from the search strategy, there's some kind of observer that decides when we're done, when it's good enough, or when we haven't done well enough and we just have to give up. And that's what we're doing here. So, right: Stochastic Iterated Genetic Hillclimbing. SIGH, S-I-G-H. It's an algorithm for doing closed box function optimization, and it's the algorithm that I wrote my PhD dissertation on in the 1980s. You know, these days people do such a great job doing Twitter threads and Instagram pictures and all kinds of live public posts to explain their thesis research. And I'm just a little bit behind. The way SIGH works, it's a neural net. So it has a bunch of units, a bunch of decision units that can be on or off, that represent the value, the point that we're going to evaluate in the function. In this model, that's called the government. The idea is based on an election. We have a population of voters, and the voters somehow vote zero or one for each position in the government. And then when we have a complete government, it gets sent off to be evaluated by the closed box function. And the goal is for the voters to figure out ways to elect governments that lead to success. Now, in a lot of ways this is very natural, very common. So we have these... where is it? We have... oh, yeah, I keep forgetting I'm live, so I'm supposed to be able to do this. It's a little bit backwards though, so I'm still learning. Right. So here's the government positions. And then down here is the population. The government positions are all zero or one, but the voters have three choices.
They can support a particular set of candidates with a plus, they can oppose a particular set of candidates with a minus, or they can adopt state zero, meaning they're apathetic and couldn't give a crap. And the way the learning algorithm works is: if you support or oppose, either way, the result of the closed box function coming back will descend upon you. If you picked something that produced a value that's judged good, and you supported it, your weights get bigger. If you picked something that turns out to produce a score that ends up with negative reinforcement, then your weights go down. If you're apathetic, your weights don't change. And that's how it works. Is it great? Is it amazing? Does it beat all other approaches to closed box function optimization? No, no, it doesn't. I mean, it depends on how you count. The easiest thing to count is how many evaluations of the function it takes to actually find the best possible value, or find a value that the observer is willing to judge as successful. But that's just one approach. And so what we've got now, what is new this week, is... oh, and here's the actual rule. This is from the book, from my dissertation. It's got these various phases: first you hold an election, which is where you let all of the voters stay constant and they vote on what they want the government to be. And then at the end of that, you decide who won the election, and now you have the reaction, where the voters get to decide if they like the government or not. And then we have the consequences, where we take the result of the government, we get it evaluated by the black box, the closed box function, and we decide how well we're doing. That's what this is. I mean, it's pretty complicated. It's got all kinds of stuff going on. And it's got, you know, what the heck is this? It's got a foot. It's got the FUT. That's the function under test.
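As a rough illustration of that election/reaction/consequences loop, here is a tiny sketch in Python. This is my loose simplification of the idea as described, not the dissertation's actual update rule: the parameter names, the majority-vote election, the probability shape in `stochastic_vote`, and the exact weight update are all assumptions.

```python
import random

def stochastic_vote(w):
    """A voter's choice on one position: larger |w| makes a committed
    vote more likely; the sign of w sets its direction."""
    if random.random() < 1.0 / (1.0 + abs(w)):
        return 0                        # apathetic
    return 1 if w > 0 else -1           # support or oppose

def sigh_step(weights, f, r_avg, delta=0.1, lr=0.25, decay=0.9):
    """One election/reaction/consequences cycle of a SIGH-like learner.
    weights[v][p] is voter v's weight on government position p."""
    n_voters, n_positions = len(weights), len(weights[0])
    # Election: every voter casts +1 / -1 / 0 on every position.
    votes = [[stochastic_vote(w) for w in row] for row in weights]
    # The government is the majority of the non-apathetic votes.
    gov = [1 if sum(votes[v][p] for v in range(n_voters)) > 0 else 0
           for p in range(n_positions)]
    # Consequences: evaluate, then reinforcement comparison with ambition.
    value = f(gov)
    r = value - (r_avg + delta)         # positive only if we improved enough
    for v in range(n_voters):
        for p in range(n_positions):
            if votes[v][p] != 0:        # apathetic voters don't change
                agree = 1 if (votes[v][p] > 0) == (gov[p] == 1) else -1
                weights[v][p] += lr * r * agree
    r_avg = decay * r_avg + (1 - decay) * value   # running backwards average
    return gov, value, r_avg
```

The key property the sketch preserves is the one from the talk: committed voters (support or oppose) feel the reinforcement, apathetic voters are untouched.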
Let's break this down just a little bit, just to get the idea, so that we understand maybe what we saw at the beginning. So all of this started, as is typical with these computations, with a single seed. This is an SW, a seed weight matrix. And what happens is you put down a single one of these, and the first time it gets an event, assuming there's empty space around, it pops out into some initial configuration. In cellular automata, an initial state like that is usually called a garden of Eden configuration, because it didn't come from any previous state of the cellular automaton. In the case of the Movable Feast Machine, the concept of a garden of Eden state isn't as obvious, because since there's a certain amount of randomness and computations don't happen the same way each time, a particular configuration being impossible to produce by the rules is much less likely to come up. So in any case, on the first event, this thing pops out to its initial configuration. And look at that, that's really complicated. One, two, three... that's, what, nine atoms in the initial configuration. And each one of those is doing something that's important for the whole thing. Well, let me use the mouse. I can still do that, right? Yeah, here we are. So SO, that's the SIGH operator. That's the thing that's going to run the whole operation. The VV is the vector of voters that we saw along the bottom of the image. The PQ is a vertical vector: the poll questions, the elected offices, whatever they happen to be. And these two WMs are going to be weight matrices. And the way these weight matrices work is they're inherently asymmetric. They're going from this to that, or from this to that, and I've got versions which handle the various rotations. So this upper WM is the one that's coming out of the poll questions, broadcasting the results of the election so that all the voters can see it.
This one down here is broadcasting the result of a single voter's choices so all the government offices, the poll questions, can see it. And the SGs are state gates. Now, one of the things I was really interested to try to do, but I just couldn't take the time, well, I couldn't get it working: the SIGH algorithm is broken down into these phases. First you hold a vote, wait for it to be done. Then you have the reaction, wait for it to be done. Then you have consequences, wait for it to be done, and then vote again. But in this model, it might be really kind of interesting to not bother to break those phases down quite so much. I haven't tried it yet. It's just one of those things; hopefully one day there'll be enough time for it. So given that we don't want to do that, we actually want to say, no, no, no, the voters should ignore everything that's happening until it's their turn to vote. Then and only then should we grab what they're thinking at that moment and have that be the vote. That's their moment in the polling booth. And that's what this SG is for. That's what this state gate is for. The SIGH operator is going to trigger the state gate on the voter vector when it's time for them to do their thing. And we're going to, as quickly as we can, I mean, it has to propagate down the steps, capture everybody's vote, and then latch it and keep it. And the state gate up here is the same thing, except that's for deciding the result of the election. So the SIGH operator triggers the state gates in turn to create the vote-react cycle. The EV over here is the evaluation thing. That's actually another state gate, although it's not colored quite the same way, that the SIGH operator uses when it's time to say, okay, what were the results? I have to actually pass this result off to the black box... I keep saying that... the closed box function, to get a score.
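The latching behavior of those state gates can be sketched in a few lines. This is purely my illustration of the idea, not the actual T2 element: the gate ignores the live upstream value until triggered, then snapshots it and holds that snapshot for downstream readers.

```python
class StateGate:
    """Ignore the live upstream value until triggered; then capture
    it and hold that snapshot until the next trigger."""
    def __init__(self):
        self.latched = None

    def trigger(self, live_value):
        # The "moment in the polling booth": snapshot the current state.
        self.latched = list(live_value)

    def read(self):
        # Downstream readers only ever see the latched snapshot.
        return self.latched

# The SIGH operator would trigger the voter gate, then the poll-question
# gate, in turn, so each phase sees one consistent snapshot.
votes = [1, -1, 0]
gate = StateGate()
gate.trigger(votes)
votes[0] = -1            # later churn doesn't affect the captured vote
```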
Now, in a better implementation of this, the function to be optimized would really be more weight matrices, more things that would be evaluating what this system did to decide whether they like it or not. But for purposes of this demo, I cheat, and the EVs, which we'll see stacking up in a second, just fold all the bits down into a single little word and then pass it off to one individual atom which evaluates it. And then here's the key. How do you know if a function value of 115 is good, or if 1012 is good or bad? How do you tell? The SIGH algorithm is based on reinforcement comparison. The idea is we remember a running backwards average of all the function values we've seen. And if the one that we get now is better than the running backwards average, then we say, hey, you did something good. That's a positive reinforcement. If it's worse than the running backwards average, then we say that's punishment. Now, of course, no matter what it is, it's going to pull the backwards average a little bit in the direction that it goes. So with that approach alone, we could just freeze up. We could get a score of 1,000. Oh, that seems great. And now the reinforcement is zero, but that's okay. Everybody just says, let's keep doing the thing that gets 1,000. I'm good with it. There might never be any more searching. So the additional parameter in the SIGH algorithm is called, I don't even remember what it is... well, it's called delta, but basically it's the ambition, it's the restlessness. So instead of just comparing the backwards average to the current score and saying that if it's an exact match the reinforcement is zero, we add in a little expectation. The value of the function needs to keep increasing a little bit, or else the reinforcement eventually goes negative.
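To make that concrete, here is a small sketch of reinforcement comparison with the ambition term. The decay rate and the delta value are illustrative assumptions, not the dissertation's actual constants.

```python
def reinforcement(value, trace, delta=0.5, decay=0.9):
    """Reward relative to a running backwards average of past values,
    plus a built-in expectation of improvement (delta)."""
    r = value - (trace + delta)            # must beat the average by delta
    trace = decay * trace + (1 - decay) * value
    return r, trace

# A frozen system that scores 1000 forever drifts negative: the average
# catches up, and then delta makes standing still feel like punishment.
trace, rs = 0.0, []
for _ in range(100):
    r, trace = reinforcement(1000.0, trace)
    rs.append(r)
```

With zero delta the reinforcement would merely settle to zero and the system could happily freeze; the delta term is what keeps flat performance feeling like failure.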
So we can see it, and it does happen in the opening video that was going all over the place: the election, the government, freezes up and the same parties get elected over and over and over again. And in this particular case, the evaluation, the closed box function, produces the same result over and over again. Now that won't necessarily be true in general, but it's true here. And so eventually the reinforcement goes negative. And eventually the negative reinforcement starts to melt away the weight matrices, until the voters start getting apathetic, and they start flipping over to oppose things they supported, or vice versa, and the whole system melts. So you get this interesting long-term behavior. I can certainly see behaviors like that in my own thinking, of myself as a closed box function optimizer, trying to make something do something, like for example trying to get code to work. And I could make different decisions about how much ambition I would have. So once I got something to work, I would be excited about it for a while, but then, you know, it was boring. I wanted to do something cooler. So I'd tear it up. SIGH just does that, because of the reinforcement and the delta, the reinforcement comparison: what we're really looking for is improvement, improvement by this much. And if we get negative reinforcement, what that tends to do, the way it works through the algebra, is it tends to make the weights in the matrices smaller, closer to zero, whether they're positive or negative, which makes the system more random. And that's it. So the seed pops out into the initial configuration, the egg, which then grows. And did I mention RM? I don't know if I did. RM is the reinforcement matrix, because, you know, once we've gotten the reinforcement score, we've said, oh, that was an improvement.
We need to tell all the weights in both the networks, from the voters to the government and from the government back to the voters, what happened, so that the weights can be updated. That's what the reinforcement matrix does. It carries the reinforcement out to the grid, and that's the main thing that you see in the opening video, this kind of red wave sweeping out and then blue pulling back in. That's the reinforcement matrix distributing the results to the weight matrix and then coming back. Alright. And here's the whole thing. So we can see, right, we've got the SIGH operator. He's going through; he's a little state machine going in a loop. Here's the function under test. The last value that it evaluated was 17,920. The backwards average value was 15,220. So we did better, so the last reinforcement was positive. In this case it's two, and so on. And, you know, how well is it working? I'm not completely sure. Why don't we try it? We won't get to see it all; let's just try it for a minute or two before we run out of time. Let's see if I can get this to go. Alright, I think that's it. So we'll get ourselves a seed and we'll fire it off. And there we go. Let's get some atom viewers here. Here's our foot and our SIGH operator, so we can tell. Alright, so, you know, at the moment, and the video does switch in the middle... right now it's hard to see; it's better when it's going fast, because you see the snappier decisions. In the early part of the video, which we'll watch once again to close out this thing, hopefully we'll see it with slightly clearer eyes. When the voters are making their decisions, the vertical lines tend to change colors; there was a horizontal one going, so that's the results of an election being distributed back, and so forth. And now the SIGH operator's in state four, which means it's actually going off to evaluate the thing. Does it work? Well, what's the FUT? What's the function under test?
In this particular case, what it's trying to do, and again the SIGH construction, the nodes, the voters, the poll questions, the weight matrices have no idea of this, it's hidden down inside the foot: the way you improve the score is by having as many of the poll questions green, as many set, as possible, except the number of them set needs to equal three, mod four. One other thing. In all of the basic things that happen in weight matrices, if you just have every unit of this vector connected to every unit of that vector, and they're going back and forth, then they have no real orientation. If one of these units wants to be on by default, it's got no way to do it, because all of its inputs are coming from the flexible, variable state on the other side. So in this particular case, the very top question in the poll questions is a true unit. That means the result is always green, no matter what the vote is. And the same thing over here... we should switch. It's really hard to tell. Well, this is what it looks like in just the element view, without showing anything that depends on the actual specifics of the network, but you know, it's more fun the other way. So this column out here is the true unit, and this row over here is the true unit. So when we're evaluating the score, we chop that one off, and we take all of these other ones: green, green, red, green, green, green, green, green. So that's seven greens and three reds, and seven is three mod four. So in fact it looks like it has optimized the function here. If it turned any of these three on, it would have more on, which gets it some bonus, but then the count would no longer be equal to three mod four, so it would pay a penalty, and so on. Did the job? Yes. Not that this is that hard a function, not that it's that big a function, et cetera, et cetera, but hey, function optimization doesn't go away.
It comes up all over the place, and it's something to have an approach that's reasonably general, that makes very few assumptions. It doesn't know that it should be expecting to reach a hundred, or a thousand, or whatever; as far as it knows, it could be that minus ten thousand is the best possible score, and the reinforcement comparison mechanism would automatically home in on that. Okay, so that's it. Now, the one thing that I did want to mention, and then I will wrap up: with the SIGH operator sequencing it so that the entire voter vector goes and then the entire poll question vector goes, it doesn't matter where you actually are on any given vector. What I wanted to do was to let them go in their own time, because, you know, when we distribute a reinforcement signal, it emerges from the lower left corner and arrives at some parts of the weight matrices sooner than others. We do all of this extra work with all the sequencing so that, in effect, everybody gets a fair chance. But what if we didn't do that? What if we put the things that we wanted to be fast, like reflexes, spinal cord stuff, what if we arranged for that stuff to be represented near the head, near the action? We could have fast learning loops going between the few key voters and the few key election positions, and then slower and more general and more vague stuff further out. And if the reinforcement gets a little bit muddy out there, because it's some combination of the previous election and the current election, well, maybe that'll be helpful. Maybe that'll allow those guys to pick up the more general, bigger landscape features, rather than getting hung up about every last little point on every little fast time scale. That would be for the future. Okay, so we'll stop this guy. Alright.
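Putting numbers on the function under test from a few minutes ago: here is a guess at its shape in Python. The talk doesn't give the actual bonus and penalty values, so the constants below and the exact scoring form are made-up assumptions; only the structure (more greens is better, but the count has to land on the right residue, which the on-screen arithmetic showed as three mod four) comes from the description.

```python
def foot(bits, bonus=10, penalty=50):
    """Hypothetical function under test: reward set bits, but
    penalize any configuration whose count isn't 3 (mod 4)."""
    count = sum(bits)
    score = count * bonus
    if count % 4 != 3:
        score -= penalty              # wrong residue: pay the penalty
    return score

# The configuration read off the screen, after dropping the always-on
# true unit: seven greens and three reds.
print(foot([1] * 7 + [0] * 3))        # 70: seven is three mod four, no penalty
print(foot([1] * 8 + [0] * 2))        # 30: eight is not, bonus minus penalty
```

With these particular constants the penalty outweighs the extra bonus, so seven greens beats turning everything on, matching the configuration the run converged to.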
So that's it. The reason that we're happening live, and we've got ten-some viewers, look at all these folks, thank you so much for coming. I haven't been able to look at the chat while I've been doing this, but I will stick around, I guess, after it's over. I'm not sure what happens when I end the stream; I really haven't ended the stream before. All of this is possible because I finally broke down and installed Ubuntu 20.04 on the workstation that we're seeing all this with, and played around with OBS, Open Broadcaster Software, which is super cool. It turned out that it didn't do Max Green, mentioned a few updates ago, out of the box, and it turned out to be harder than I expected to do Max Green, but I came up with an amazingly disgusting hack, which is what we are using, and it actually works not too terribly. The screen is kind of fuzzy, and that's my biggest concern about it. That's not entirely because of my hack, although it may be partly because of the hack. So, yes, everybody bugging me to update: you were right. Thank you, everybody telling me, like Andrew, to try OBS: yes, you were right. And if this all works out, if I get better at this, I would like to be using this more often for more stuff. For example, I am participating in a Zoom workshop the rest of this week, starting tomorrow, and I may use this as well: the SFI, Santa Fe Institute, workshop on the frontiers of evolutionary computing. Next week I've got one as well; it just keeps coming. The next update will be on August 3rd. Thank you all, folks, for coming. Tell you what, I'm just going to rerun the opener and catch up and try to read these comments for the last minute or two. But anyway, where are we here... thanks, folks, this was fun. So you see the red stuff going forward and back: that's the reinforcement matrix. Right now the colors, the blues, the greens, mean the weights are big; it's using a rainbow display. And that's because the foot is basically locked.
Oh, and it just melted. And so you see the colors have gone back to more green, which is the range of zero weights, and now it looks like they're actually starting to grow again. And that's exactly the kind of thing that you get with SIGH: you get this excitement of convergence. This is the guy for us... bam, bam, bam. Look at that, look at all those reds and blues; those are big weights, positive and negative. And I think it's going to end... yeah, it's going to end. Alright, I'm going to end the stream here because I'm jittering. Hope to see everybody next time.