Computers keep changing the world, but their power and safety are limited by their rigid designs. The T2 Tile project works toward bigger and safer computing using living systems principles. Follow our progress here on T2 Tuesday Updates. This is the 48th T2 Tuesday Update. Let's get into it. Last week, we got intertile events working in the limited sense of when everything was perfectly set up and there were no surprises anywhere — which we achieved by using loopback cables, so it was really "intertile" from one tile to itself. And we established the "lemonade benchmark" — not a real benchmark, just an informal number — of six milliAER: six thousandths of an event per site per second, on average, on the tile. This past week I've been working a little bit on hardware stuff and on software stuff, and I'll talk about both. In the coming week — well, next week is this biological computation workshop at the Santa Fe Institute, so that's going to take up the time next week. This week, what I want to try to do is put myself on the record to find out how to do aluminum machining, CNC stuff. Since I don't have the machinery to do it, that means finding out who does this sort of thing in Albuquerque — somebody I could maybe visit, that I could phone up and act like a complete ignoramus, because once again I know nothing about this particular thing, just like I knew nothing when I started the PCB manufacturer story — and see if I can't start moving the ball a little bit there. Because we're going to need some kind of frame to hold the power zones: a four-by-four array of tiles that will link together, look half decent, and be strong. So that's what's coming up. All right. And in education and outreach: I've been collecting addresses for folks who want to get some of the commemorative hardware and 2019 T2 tile stickers.
If you want to get in on that — or at least on the first mailing of it — get a physical mail address to me. Probably the easiest thing to do is to go to Gitter and send me a direct message. But if that doesn't work, because you don't like that or whatever, find an email address for me using your internet skills; most addresses you can find that are obviously about me will work. Get me a physical address, and I guess I'll send them out after the Santa Fe biological computation workshop next week. In addition, from Andrew Walpole, lead of communications for the T2 Tile project, I got some new swag. Do I have it right side up? Yeah, I do. T2 Tile on the front, the event window on the back. When life gets too tough, when the bugs are too hard, when the performance is too slow. It's nice. All right. Okay, hardware. So, back to the hardware. You've seen this tile before; long-time watchers know what this tile is. It's the Keymaster — you know it by its white case. Well, what you may not know is that, just like kids and animals in movies, this is not the only Keymaster that has ever existed. In fact, we are now on the third PCB — the third circuit board — for the same BeagleBone. It started out originally as E15. This goes all the way back to April, when we were getting them manufactured at ETS; this is the E series of tiles because they were made by ETS. At some point that tile went bad — something was wrong with the Northwest connection, I don't know what it was — so I took the BeagleBone off it and mounted it on E224. And this past week, that went bad. So now the Keymaster, even though it still has the same white case, is in fact on E225. And I didn't know whether to say this is a problem with manufacturing, but increasingly it seems to me it's a problem with having these exposed pins on the back. Strange Hardware made a comment last week — thank you —
pointing out probably the obvious point: sounds like you need to print some back plates to avoid touching those exposed pins. And it's a real problem with these things. When they're just sitting like this, I don't actually worry that much about static electricity and so forth, just because I've never had any real issues with it, even though people want to sell you these static straps and static mats and all of this stuff — it hasn't been a problem for me. But when we're talking about hot-swapping one of these things, or hooking them together while they're powered up, or, for example, pushing an intertile connector or a ribbon cable into one of these things while it's powered up, you want something to push against, and you end up going like that — and your finger is right on the exposed pins. Which, number one, isn't very comfortable. But, number two, it's making electrical contact of some degree, depending on how moist your skin is and so forth, between pins that aren't supposed to be connected. So, yeah, absolutely, we need something. And the problem is, we've already got these nice feet that are all kind of measured out, and I didn't want to mess with them — and there really isn't a lot of thread left on the feet's socket screws anyway. But then I realized, well, we've got these four as well: the four screws from the brass standoffs that hold the BeagleBone in position. So here's a special one that doesn't have any parts mounted on it. The BeagleBone has these four brass standoffs that we mount on, and then it goes in the holes. And I've been using these nuts — long-time viewers recall, these are the nuts from Dubai — to hold the brass standoffs on. But I don't like it, because they loosen up, and I knew that down the road I was going to have to use a lock washer or some Loctite or something like that.
And it was another step; I didn't like it. I said, well, hey, why don't I just get rid of these nuts and have a back plate that would self-tap onto those brass standoffs? So I started designing it, printing up a bunch of things, trying to get the positions of the holes and all the rest right — so that everything was covered that wanted to be covered, without covering up everything, because I wanted to keep as much of the nice circuit board visible as I could. Guess what happened next? Oh yeah, I was just trying to line up the positions, get the holes all right. Yeah, that's what happened next. Once again, long-time viewers know the story of this. Oh my goodness. What I should have done is unpack the Ender 3 — I got a brand new printer — and get that set up. That's the thing. But I took the path of least resistance and instead went back and looked at the connections I had repaired before. This fuzzy thing at the beginning here in the front, that's one of the thermal temperature-sensor wires that I repaired; the yellow fuzzy thing in the background is the other one. I had just used regular hookup wire to connect these things up, because I figured, how bad could it be? But it failed again. So this time I was like, no — and I had bought this wire, this SuperWorm wire, that is meant for radio-control racing vehicles. It's super thick and super floppy, with zillions of incredibly fine copper threads in it, because it's meant to take vibration and high current and all kinds of stuff. I had gotten it for evaluating an alternate solution for powering the boards, long ago. So I took a couple of lengths of this and replaced the thermistor — the temperature-sensor — wires yet again, right in that area where they all flex. And in fact, we are back in business. And these things actually print pretty quick, so they're starting to stack up.
We've got nearly 100 of these things going now, and they are not bad. And so now, where's my example here? Well, we've got this one, just so you can see what it's like when it's mounted, with the brass standoffs holding it down. And in particular, all of the naughty bits — all of the pins — are covered. So I feel good about that. We'll see how long the current Keymaster, the third Keymaster — like the drummers for Spinal Tap — actually manages to last. That's the hardware story. All right, the software story. Last week, we actually got intertile events working in the limited case where everything was really perfectly set up. We took some video, measured the actual AER on the single tile on the loopback, and achieved six milliAER. And six milliAER means a tremendous amount of time, relatively speaking, is going into each event — each intertile event is taking a significant amount of time. I didn't multiply it all out, but it's a lot. And this reminded me that — well, oh yeah, first, I just wanted to say thanks to all the folks who put nice comments on the video last week, because it was kind of a tough thing, to have expectations really high and then be saying, you know, thousandths — we're talking thousandths. So John and Luke and Andrew and AJ: thank you. It actually helps a lot. And part of it is, again, what I said in the pilot episode of T2 Tuesday Updates: if you do it in public, you can't wait until you get to the happy ending. You can't wait until everything looks good. You've got to go when you've got to go. So that's what we did, and we've got additional information this week. What I realized was — we went through this with the locking in July, where the locks started out being like 300 milliseconds each and we got them down to 150 microseconds, something like that.
And in the process, we made this slick buffer scheme where you could make lock events — grab this lock, release this lock, this pin up, this pin down, and so forth — in a way that was much lower overhead than actually printing messages to a log file. Although Linux tries hard to make writing to the log files quick, this was much quicker, because it just built an internal buffer, made little one-word event notices, and banged them down. Then a separate program would read them from user space later, decode them, and say what each one was: here was the lock event, and then 10 microseconds later there was this one, and so on and so forth. So, number one, it occurred to me we could use that — we could use the lock tracing to get a sense of where the time is going for the cache updates, because now that we have intertile working in the loopback sense, it's got to be doing both locking and packet updates. So I did some experiments with that. This is the output from the user-space lock-tracing program. It's got the event number first, then the absolute time from the beginning of the trace, then the relative time from the previous event: this is one microsecond later than that, this is 19 microseconds later than that, and so forth. And then information about what the event was: the request that the user made — the write from user space — is now returning, and then 100 microseconds later another write from user space came, and so forth. So this is a case where the top part of it, up to the blocking write, was grabbing locks and freeing locks, and then in the next section, from here down, it's grabbing a lock, and so forth. And looking at this stuff while MFMT2 was running — while the intertile stuff was running — watching it, this popped out immediately.
Microseconds, microseconds, microseconds — 109 milliseconds. 109 milliseconds: that's over 100,000 microseconds. Add up all those microsecond things and they're a hill of beans compared to this one event that came 109 milliseconds after the one before it. And what it is: that "write return success" there is saying the locks were successfully grabbed, and this "blocking write from user" is user space saying, okay, I'm done with the locks. So in fact it's 109 milliseconds that's somehow getting used by the cache-update process. And it's not actually the locks themselves — the locks themselves are dozens and hundreds of microseconds, which is never going to get us to hundreds or thousands of AER, but that's not the problem at the moment. The problem at the moment is the tenth of a second that's happening, somehow, while doing the packet update with the locks held — between grabbing the locks and releasing them. So what I did was go back and implement a similar event-grabbing system for packets, using the queues and all the same stuff I had developed for the locks, separately, in the part of the kernel that's doing the packet-update stuff. That way we could see whenever user space sends a packet to Linux, or Linux sends a packet to the PRUs — the coprocessors that actually push the bits through the wires — or the PRUs send a received packet back to Linux, or Linux sends the received packet back to user space. Because that's the pattern of a cache update: MFMT2 is sitting in user space; it sends a packet to Linux, to the PRU, to the neighboring PRU, to the neighboring Linux, to the neighboring MFM. That's how information moves. So I built the packet thing, and then I extended the trace-decoding program so that it would consider events from both the locking queue and the packet queue and interleave them according to their actual times, so we could see what was happening.
But this pattern — a tenth of a second, a tenth of a second, a tenth of a second — was really reliable. Basically every event, or at least every event that involved grabbing locks, was taking like a tenth of a second. Now, that also meant that all the time to do all the events that didn't involve taking locks was like nothing. So I started to get it working. I was using angle brackets — "Southeast to PRU" — those are packet reports that got interleaved into the lock reports, and very quickly this is what jumped out. Up here we've got "from user": that means MFM is sending packets to Linux in order to ship them out via the PRUs. It's sending packets destined for the Southeast. It's a four-to-eight-byte packet, an eight-to-sixteen-byte packet — I don't get the exact lengths of the packets because there wasn't enough room in the event tag. This is a begin, atom, atom, atom, atom, and an end — so that whole thing is a cache update. And they're all being sent, bam, bam, bam, bam, bam — but none of them are going from Linux to the PRU for another 54 milliseconds. That's incredible waste. Again, that's 54,000 microseconds compared to all of that stuff. And that was the overall pattern: stuff was going from user space to Linux — okay, that was reasonable — and then there was this huge delay, repeatedly like 50 milliseconds plus or minus a little bit. And eventually the penny dropped. Eventually I got it. So this piece of code here, this loop going around: what it's doing is taking packets that have come from — I'm sorry, that have come from user space — and sending them off to the PRUs. That's what "ship current outbound packets" does. And then it sleeps a little bit — it sleeps for 50 milliseconds. But that's supposed to be fine, because when any packet is actually written from user space to the Linux kernel, that process gets woken up. It says: don't finish the rest of your sleep; get to work; send this thing out.
So this 50 milliseconds is only supposed to be an absolute last-resort timeout, because explicitly what's going to happen is this thing is going to be woken up. And here's a routine — "wake outbound packet shipper" — that calls this wake_up_process thing to explicitly get that guy going, to go around shipping packets and so on. And this is what it looks like up here when the thing is actually getting called: wake outbound packet shipper, wake up the process, don't sleep for 50 milliseconds, and so forth. No. That's not the way it works. Finally: msleep_interruptible. Interruptible. msleep_interruptible is interruptible in the signal sense — it is not wakeable by wake_up_process. The 50-millisecond wait was going to be a full 50 milliseconds no matter how many packets were ready to be sent. That's where the time was going. Where does the time go? It's sleeping for 50 milliseconds before it even considers sending packets that have arrived from user space off to the PRUs. What one needs to use, I eventually found out, is not msleep_interruptible but schedule_timeout. So I changed it to schedule_timeout. And this was totally non-obvious, as evidenced by the fact that I found this message from 2016 — so, not this year, but not that old either — making an explicit change to the documentation to say that schedule_timeout will wake up when anyone calls wake_up_process. Because the documentation didn't used to say that; it gave the impression that schedule_timeout wasn't going to wake up for anything, and that's why you'd want msleep_interruptible. So we make this change, we build it all up, and here's what we have now. Look how fast it's going. 601.32 — one time around — 602.52. 52 events in 80 seconds. 650 milliAER. That one change, msleep_interruptible to schedule_timeout, got us a factor of 100. That's what it boils down to.
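The buggy-versus-fixed pattern, as I understand the description, looks roughly like this in kernel C. This is a sketch only — the loop shape and the name ship_current_outbound_packets are stand-ins for the project's actual source:

```c
/* BUGGY: msleep_interruptible() is interruptible only by *signals*.
 * wake_up_process() does rouse the task, but the msleep loop notices
 * no signal is pending and goes back to sleep for the remaining time,
 * so every pass waits the full 50 ms regardless of pending packets. */
while (!kthread_should_stop()) {
    ship_current_outbound_packets();
    msleep_interruptible(50);
}

/* FIXED: schedule_timeout() returns as soon as anyone calls
 * wake_up_process() on this task, so the 50 ms becomes a true
 * last-resort timeout instead of a mandatory delay. */
while (!kthread_should_stop()) {
    ship_current_outbound_packets();
    set_current_state(TASK_INTERRUPTIBLE);
    schedule_timeout(msecs_to_jiffies(50));
}
```

The set_current_state call matters: schedule_timeout only sleeps if the task has first marked itself TASK_INTERRUPTIBLE (or TASK_UNINTERRUPTIBLE); in TASK_RUNNING it just busy-spins through the timeout.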
It turns out that that same misunderstanding of msleep_interruptible also applies in the locking stuff. I haven't fixed that yet, because I only realized it while I was getting ready to do this update tonight. And I've got some ideas about other possible improvements; it's going to take a little more work and more redesign. But I think there's a reasonable chance we'll get to one AER. And that feels good after last week. The next update will be out in a week. Thanks for being here. Have a good week.