The question is whether there is any interest in commonizing the NVMe over Fabrics transports, and to give you an understanding of why I'm even proposing this: it has typically ended up, at least in my case on the FC transport and supporting it, that we've been playing catch-up with changes in the other transports. When it first started we had RDMA implemented; the Fibre Channel transport took that as a code base and implemented on top of it, so we were actually pretty close for the first two years of the implementation. Then we got into a lot of things on Fibre Channel that had to deal with very long connect times, very long reset times, and a lot of failures in the actual connection process that weren't typical of the way an RDMA connection or configuration was put together. We ended up solving a few of those, and it did cause the FC transport to change a little, such as changing the connection process so that we reuse the reconnect thread. But that put us into an asynchronous process that waits on the completion of a work-queue entry, which is very different from what RDMA and TCP do today: they still make the initial connection in the thread that issued the ioctl to do the create, and they have a teardown path that stands alone as well as being part of that path under error.

Starting from those implementations, when TCP came in I believe it encountered a lot of the same things we did, and they kept RDMA in sync with them. What ended up happening is that we went through about three revs or waves of issues where, as a kernel, we were seeing problems in the concurrent rescan and teardown philosophy, and that caused a lot of rework around the queue freeze, start/stop queues, and so on. Really, all I'm saying is that as the transports were maturing, they started implementing a bunch of these things in very different places. If you look at some of the cases for error recovery and deletion, we ultimately get to about the same steps, but we do them in very different ways with different code paths. Some of that has been pulled into the common layers; the rest is just a lot of replication in the transports.

So one thing I've always had in the back of my mind, even at the beginning of the NVMe over Fabrics implementations, is making a common layer with a lot of call-outs. Initially we were still trying to figure out what actually had to be there, so we weren't going to stop and commonize it before we knew what was needed. But at this point I think we've all stabilized enough that we can go back, remove a lot of the duplication, and try to commonize some of these paths, such that most of the interaction with the block-mq layer, the request queues, and so on can live in a common layer with call-outs for the transports. It won't be entirely clean: there will be a lot of call-outs to the transports, not just a few routines, because there are many places where we jump out and do transport-specific actions. Obviously, if we rework that area it will mean a stability bump. My question is: does it make sense for us to even entertain going down that path, and is this something to try to do for a long-term support model? If there are any suggestions or comments on this, let me know.
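To make the shape of that proposal a bit more concrete, here is a minimal sketch in C of what a common fabrics connect path driven by transport call-outs could look like. The structure and function names (nvmef_transport_ops, nvmef_common_connect, and so on) are hypothetical illustrations, not the existing nvme-fabrics API.

	/*
	 * Illustrative sketch only: a common connect sequence where the
	 * block-mq / request-queue interaction lives in one place and the
	 * transports only fill in call-outs.  Names are hypothetical.
	 */
	struct nvmef_ctrl;			/* opaque controller handle */

	struct nvmef_transport_ops {
		/* transport-specific association / admin queue setup */
		int  (*create_assoc)(struct nvmef_ctrl *ctrl);
		/* transport-specific I/O queue creation */
		int  (*create_io_queues)(struct nvmef_ctrl *ctrl, unsigned int nr);
		/* transport-specific teardown, shared by reset and delete paths */
		void (*destroy_assoc)(struct nvmef_ctrl *ctrl);
	};

	static int nvmef_common_connect(struct nvmef_ctrl *ctrl,
					const struct nvmef_transport_ops *ops,
					unsigned int nr_io_queues)
	{
		int ret;

		ret = ops->create_assoc(ctrl);
		if (ret)
			return ret;

		ret = ops->create_io_queues(ctrl, nr_io_queues);
		if (ret) {
			ops->destroy_assoc(ctrl);
			return ret;
		}

		/* common layer: unfreeze/start queues, kick off rescans, ... */
		return 0;
	}

A transport such as FC could still run this sequence from its asynchronous reconnect work item, while RDMA and TCP could call it from the create ioctl thread; the point of the sketch is only that the ordering and error unwinding would be written once.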
The one place this has already bitten us is transport errors. This is basically an extension of the connect issue: currently all transport errors are assumed to be retryable, so whenever the transport reports an error, NVMe will just retry. The only way we can really abort or terminate commands is if we get an appropriate status from the NVMe CQE; only then will the command actually be terminated. That causes issues during connect, because if there's an error during connect we will, as you said, continue to retry the connection until a timeout hits, or if we are particularly unlucky, not at all, and we'll just be stuck at connect. I did do a patch for Fibre Channel to check for the DNR bit and also allow the transport to fail the command if it figures out that a connect can't succeed and really should be failed. But this is an FC-only thing, and I would now need to do the very same patch two more times, once for TCP and once for RDMA, just to have consistent behavior across the board. That really doesn't belong in the drivers at all; it should be handled in the NVMe layer to actually get consistent behavior (there's a rough sketch of that kind of common check at the end of this discussion). These are the kinds of things we should be doing far more often: move what is common into the common layer instead of having each driver do the very same thing all over again.

The other thing I remember recently: there was a state added to the NVMe state machine that got added to some, but not all, of the transports. That kind of thing is probably one of the things that would aggravate you.

Absolutely, and that's really part of the issue. I don't really blame the patches, because if you look at the patch history it was fairly subtle; in fact, the error didn't actually correspond to the areas modified by the patch, it was just one random spot that wasn't caught by a particular state change. Asking a developer to cover all the transports, as you're saying, is rather difficult, and most of the stuff we're running into now is common functionality. If we had features such as single connection loss without terminating the controller, we'd have to coordinate that across all of the transports simultaneously. The bottom line is that I think this commonality is a good idea; the question is whether we're willing to take some of this bump in the three transports.

My suggestion would be: some of this stuff is sort of generic to NVMe over Fabrics, right? So perhaps what makes sense is to look at what commonality could be extracted out of the transports and put into the fabrics code, particularly the stuff that's more spec-related.

Yeah, that's exactly what we'd do.

So if that's what you're asking, I think that's a straightforwardly reasonable thing to do. What I don't think we should do is put non-spec-related stuff into the fabrics code just because it happens to be common, because that code tends to track the spec pretty closely.

Understood. OK, so I will poke the TCP and RDMA transport folks a bit and try to get an answer. Bottom line, so far the distros seem to see the same thing I do, so we'd push patches out as an RFC. Thank you.
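As a rough illustration of the retryable-error and DNR point above, here is a minimal sketch of the kind of common disposition check that could live in the NVMe layer rather than being duplicated per transport. NVME_SC_DNR is the spec's Do Not Retry status bit; the enum and helper names are assumptions for illustration only, not the kernel's actual retry logic.

	#include <stdbool.h>
	#include <stdint.h>

	#define NVME_SC_DNR	0x4000		/* Do Not Retry bit in the CQE status */

	/* Hypothetical transport-reported disposition for a fabrics command. */
	enum nvmef_xport_err {
		NVMEF_XPORT_OK,			/* completed at the transport level */
		NVMEF_XPORT_RETRYABLE,		/* temporary association/path loss */
		NVMEF_XPORT_FATAL,		/* transport says retrying cannot help */
	};

	static bool nvmef_should_retry(uint16_t cqe_status,
				       enum nvmef_xport_err xport_err)
	{
		/* Controller explicitly set DNR: honor it and fail the command. */
		if (cqe_status & NVME_SC_DNR)
			return false;

		/*
		 * Let the transport veto the retry too (e.g. a connect that can
		 * never succeed), instead of each transport duplicating this
		 * check or looping until the connect timeout fires.
		 */
		if (xport_err == NVMEF_XPORT_FATAL)
			return false;

		/* Default behavior today: everything else is retryable. */
		return true;
	}

With a check like this in the common layer, the FC-only DNR handling described above would not need to be replicated for TCP and RDMA to get consistent behavior.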