By the way, if anybody has any questions, feel free. So I just figured I would test it — why not just test it and find out? I had a few servers and a lot of storage, so I was wondering, how many targets can I really do? Maybe I could do 1,000; 1,000 would be interesting. I also wanted to test what happens when you have a massive number of targets and the error handling system kicks in, because that seems to happen naturally when the system gets busy: you have lots of I/O going on, things get slower and slower, and pretty soon you get timeouts and retries and out-of-order I/O. So if there are any bugs, they get found. And also the startup and shutdown: as I mentioned, scanning sysfs and disconnecting from thousands of sessions — in iSCSI terms, how are you going to do that? So I used targetcli to create my targets; that drives the kernel target subsystem. And I created a simplistic script to do that. It didn't work out at first. I had a couple of servers: on one I put 2,000 targets, and on the other I put 4,000 targets, so I had 6,000 targets. That should let me test massive numbers. I wanted to test the initiator and the targets to find where the bottlenecks are. I didn't really want to test I/O — that does need to be tested, but it wasn't the focus of this. I was trying to find the O(n²) stuff, and I did find some, as I alluded to a moment ago. I found some issues in the kernel — not with the I/O itself, but with the communication that happens when you attach to new targets: the load became so high that the system got behind and things started getting retried. And I found some kernel oopses, especially in older kernels, so I moved to a newer kernel. It was a little better, but there are still some bugs in there. So I created a simple shell script to create targets — create target one, create target two, from 1 to 2,000. But the problem is that it ran really, really slowly.
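As a rough illustration, the naive approach looks something like this — the IQN scheme, file paths, and exact targetcli command set here are my assumptions, not the speaker's actual script. The key point is that each target needs several separate targetcli invocations, and every invocation rebuilds the full configuration state from scratch; the echo wrapper just prints the commands so the sketch runs anywhere:

```shell
#!/bin/sh
# Hypothetical sketch of the naive per-target creation script.
# Each target takes ~4 separate targetcli invocations; each invocation
# reloads the whole (growing) config, which is where O(n^2) creeps in.
gen_target_cmds() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        iqn="iqn.2003-01.org.example:target$i"   # assumed IQN naming scheme
        echo "targetcli /backstores/fileio create disk$i /var/targets/disk$i.img 1G"
        echo "targetcli /iscsi create $iqn"
        echo "targetcli /iscsi/$iqn/tpg1/luns create /backstores/fileio/disk$i"
        echo "targetcli /iscsi/$iqn/tpg1 set attribute generate_node_acls=1"
        i=$((i + 1))
    done
}

# Print the command list rather than running it; pipe to sh on a real host.
gen_target_cmds 2000 | wc -l   # 4 invocations per target -> 8000 in all
```

So even before measuring, 2,000 targets means on the order of 8,000 full state rebuilds.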
It ended up taking about 30 hours to create 2,000 targets, and I thought, well, that's really not usable. So I plotted the data, and sure enough, it's O(n²). My simplistic script was probably the problem, I figured, because I was calling targetcli about four times for each target, and each time you call targetcli it has to build its state, then throw it away, then build it again. So I decided to try the targetcli daemon. That's a daemon that runs in the background which we don't actually use in our distribution, because it's logistically a little harder, but if you're doing thousands of targets it really helps to have the state maintained. So I did that, and got much better response times: now I could create 4,000 targets in an hour. So how you approach this really matters. The convenience of targetcli is great, but it builds its state every time you start it up. As you can see — let me back up there — that line isn't exactly straight. The purple line — so there is a little bit of O(n²) in there somewhere, but it's much better. I bet if I got to 10,000 targets you'd see a little curve in it. But still, that's much better: that's data showing about 4,000 targets in one hour. But now I have to actually use them. So the first thing I did was go from my laptop — this laptop, which is running Leap 15.3 with the 5.3 kernel, so it's not exactly new — and this is where I started running into some issues in the kernel. At the beginning I just did a simple loop: log into all 2,000 targets at once. You can tell open-iscsi to do that — log into all targets and let me know when you're done. That ran into some I/O issues in the kernel. So to work around that, I logged into one target at a time.
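The two login strategies described — one bulk login versus one iscsiadm invocation per target — look roughly like this. The portal and IQN values are placeholders, and the `echo` prefix is a stand-in so the sketch runs without real targets; drop it on a live system:

```shell
#!/bin/sh
# Stand-in so this runs anywhere; on real hardware use ISCSIADM="iscsiadm".
ISCSIADM="echo iscsiadm"

# Strategy 1: a single command; open-iscsi logs into every known node record
# at once and returns when done.
$ISCSIADM -m node --loginall=all

# Strategy 2: one target at a time. Each iscsiadm invocation rebuilds its
# state and rescans sysfs, which is where the per-target overhead comes from.
for i in 1 2 3; do
    $ISCSIADM -m node -T "iqn.2003-01.org.example:target$i" --login
done
```

`--loginall=all` and `-m node -T <iqn> --login` are standard iscsiadm usage; everything else above is illustrative.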
Okay, so that's probably not going to be good, because again I'm building state each time I start up the command. Some things surprised me. In iSCSI, there's a sequence you have to go through to use a target: first you do discovery — you ask, tell me what targets you have, and you get back a list of targets — and then you can say, I want to log into all of them, or some of them. It surprised me that discovery of 4,000 targets only took about three quarters of a second. I guess the reason is that it's a long list, but there's not a lot of back and forth; it's not like you have to talk to the target once for each entry. Right — you're essentially doing very few I/Os to get that list, so it's really just the size of the list being transferred, and that's just normal I/O. So that was very pleasing. Then I went to Tumbleweed, our rolling release with a newer kernel — this had a 5.17 kernel. And I know Mike Christie has recently been working on a bunch of fixes in this area of the kernel that aren't in this version. I thought about patching it, but instead I wanted a non-moving target for my tests; the next step, after I've identified some problem areas, would be to try some of the patches we've been working on. And the login timing is O(n²). Again, I'm guessing it's due to the fact that my open-iscsi command is scanning sysfs twice for each target, and sysfs is getting bigger and bigger as you log into more targets — so there's your O(n²). But again, it took a long time. And as I just said, I'm pretty sure the problem in this graph is the initiator, not the target, because the target side is already pretty much O(n) at this point. What's that — is this the iSCSI NOP thing again? That transport "fail fast" means some transport error happened, and that typically only happens if you run into a NOP timeout. Right.
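The suspected cause has a simple shape: if login number i rescans the sysfs entries for all i sessions created so far, twice, the total work is 2·(1 + 2 + … + n) = n(n+1), which matches the O(n²) curve in the graph. A trivial sketch of that arithmetic (the uniform per-entry cost is of course an assumption):

```shell
#!/bin/sh
# Toy cost model: login #i rescans all i existing sessions' sysfs entries,
# twice. Total entries touched over n logins = 2*(1+2+...+n) = n*(n+1).
n=4000
total=0
i=1
while [ "$i" -le "$n" ]; do
    total=$((total + 2 * i))
    i=$((i + 1))
done
echo "$total"   # equals n*(n+1): quadratic, not linear, in target count
```

At n = 4,000 that's about 16 million entry scans — small per-login, ruinous in aggregate.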
Because that's what triggered the transport failure. And by the way, I had initiator NOPs turned off too — but the target might have had them on. And with fail fast it retries indefinitely too, so that keeps the kernel pretty busy. But have you figured out why you get the fail fast? No, I didn't get to track this down; I only discovered it a couple of days ago. It would be interesting to see what was going on in your target while your initiator was running this slowly — whether the timeouts happen because the target has so much data structure or state that it takes longer to respond when you do all these logins. And the last thing I tried to measure was logging out of targets, but I did not get very good data on this. It was just logistically hard to do, and I did not want to go through the list one at a time like I'd done earlier. So is this O(n²) or O(n)? I don't know — I need more data here. So really I've ended up with more questions than answers, and I'm sorry about that, but there's still a lot to do; I'm only scratching the surface. I need to test the logouts you just saw, I need to figure out the O(n²) problems in the initiator, and I need to track down the kernel issues — that's perhaps the most important. And this makes me think — this is kind of off subject — why don't we have better tests for this? We have iSCSI tests, by the way, but they're just not very good. Oh, and then what about multipath? How do I test this with multipath? Because a lot of our customers, especially those with thousands of disks, use multipath. Just for clarification: you really created 4,000 targets — how many LUNs did each target have? I'm sorry, what? How many LUNs did each target have? How many what? LUNs — meaning devices. I'm sorry, I can't understand you.
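For reference, turning off the initiator NOP-outs mentioned above is done with the open-iscsi timeout settings below; setting both to zero disables the pings (which also means dead connections take longer to detect — a deliberate trade-off in a stress test like this):

```
# /etc/iscsi/iscsid.conf — disable initiator NOP-out pings
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
```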
Right, so you create targets, which is fine, but the target needs to provide disks, meaning LUNs. What's that? So how many disk devices did each target have? Devices? Yeah, disk devices — you have to export some data. Yes, well, they all had one. Just one, so you have one-to-one, right? Yeah, each one had a one-gigabyte file backend. Oh, just a single one then. All right, good, that was the question. Yeah, but they were sparse, so they really didn't take space. No, no, that's not the point. The point is that what really matters is not so much the number of targets but the number of disks, because when you log in, you first log into the target and then you do the SCSI inquiry thing to figure out how many devices are really there. Right. Only once you've done that do you have the disk device, and only then is the login complete. So how long the actual login takes really depends on how many disks you have. Right — and the system is scanning for the partition table to figure that out each time it opens a disk, right? Yeah, sure. That was one thing that surprised me too. What I passed by kind of quickly is how much I/O occurs when you get a new disk. It's fine for a single disk, but when you get a thousand new disks it's overwhelming, because for each one the system has to read the partition table, enable caching, and check whether certain SCSI commands are supported. One thing that surprised me too: about a year ago I hooked up some code in open-iscsi that Mike Christie originally created but didn't use, for no-wait logins — it allows you to log into a target but not wait for the response. I thought that would be faster, and it wasn't. So I want to find out why. Any other questions? Thank you. So, I'm just curious — can you go back to where you were doing the login timing and were surprised by it? Yeah, so, okay — it ended up taking much longer than you thought.
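The sparse one-LUN-per-target backing described above would be created with something like this — again with an echo stand-in so the sketch runs anywhere; `sparse=true` is the targetcli fileio option, and the file path is my placeholder:

```shell
#!/bin/sh
# Stand-in so the sketch runs anywhere; on a real target host use
# TARGETCLI="targetcli".
TARGETCLI="echo targetcli"

# One sparse 1 GB file-backed backstore per target: the backing file
# consumes almost no real disk space until blocks are actually written,
# which is how thousands of 1 GB LUNs fit on modest storage.
$TARGETCLI /backstores/fileio create name=disk1 \
    file_or_dev=/var/targets/disk1.img size=1G sparse=true
```

With one LUN per target, the login-time device scan (partition table read, SCSI command probing) happens exactly once per target; more LUNs per target would multiply that cost.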
What's the actual scale of — I mean, was this eventually successful? I can't quite tell. It's in seconds, so about 3,600 seconds is an hour? Okay. Yeah, so this is... It does seem surprising, because I know we looked at this at one point, so I'm wondering if there's some sort of regression in here. I know in the past the sysfs scanning from user space was an absolute disaster, and there were some old attempts at caching attributes to avoid the read syscalls — code that just generated an enormous list that was scanned repeatedly without ever any effective cache hits. But it looks like we pulled that out a few years ago, and that was the last time I remember explicitly looking at logging into thousands of targets. So I just wanted to check on that. Yeah — and Mike Christie, excuse me, suggested a couple of areas in the sysfs code to look at to try to figure out why it's O(n²). But the time is coming when we're going to have this many targets, and can you imagine, with a thousand targets, how long is the system going to take to boot?