All right. So I'll start by giving a brief introduction to NVMe in general, then how NVMe over Fabrics builds on top of that, and what some of the differences in a Fabrics implementation are compared to the PCI Express version. Then I'll dig into the implementation that I've done so far for FreeBSD. Lastly I'll talk about future work; the implementation I have is definitely not complete, there's more to do. And then if we magically have time, which I doubt, I have some pre-canned slides showing a demo. We could even do it live if there's a miracle in the space-time continuum.

So let's cover some basics about NVMe, which is NVM Express, a standard for managing storage that was originally non-volatile memory. It's primarily targeted at flash, although reading the NVMe 2.0 spec, there are extensions to handle spinning rust, which seems kind of odd. NVMe defines a protocol that you use to issue commands to a controller, and it's similar in nature to SCSI and ATA, except perhaps simpler, certainly simpler than SCSI. The terminology NVMe uses is also, in some cases, gratuitously different from other storage standards, so I'll try to explain their nomenclature. In NVMe terms, a host is the client side, what in SCSI terms you might consider an initiator. The host is the thing that's going to be reading or writing data, so you can think of the device driver for a PCI device as the host, whereas the controller is the actual thing managing the storage and servicing requests. So the PCI device would be the controller in the PCI model.

The general flow of how things work in NVMe is that you have FIFO queues that are unidirectional, but going in each direction. You have submission queues that a host uses to send commands to a controller, and then you have completion queues where a controller reports the status of a command back to the host. These queue entries are fixed in size: submission queue entries are 64 bytes, and completion queue entries are 16 bytes. I'll on occasion refer to them by their acronyms, SQEs and CQEs. There are two different types of queues in NVMe. You have an admin queue that you use to handle generally non-I/O-related requests; this is how you fetch things like additional information about the drive. For example, if you want to fetch error logs, you do it by commands on the admin queue. And then you have I/O queues that handle real I/O commands like read and write.

So what does the general control flow look like? You have a host on the left and a controller on the right. To start with, and this is for PCI, the host will allocate a set of queues for the admin queue in memory, both a submission queue and a completion queue. The host will construct commands and stick them into the admin submission queue, and the controller pulls them out in FIFO order. When the controller has finished a command, it puts a completion into the admin completion queue, and then the host can read it out. One thing that I have in a later slide: even though the queues themselves are processed in FIFO order, completions can be out of order with respect to commands.
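For reference, here is a sketch of those two fixed-size queue entries, following the field layout in the NVMe base spec. The struct and field names here are mine for illustration; the FreeBSD driver has its own definitions.

```c
#include <stdint.h>

struct sqe {			/* submission queue entry, 64 bytes */
	uint8_t		opcode;
	uint8_t		flags;		/* fused ops, PRP vs. SGL select */
	uint16_t	cid;		/* command ID, chosen by the host */
	uint32_t	nsid;		/* namespace ID */
	uint32_t	cdw2, cdw3;	/* reserved */
	uint64_t	metadata;	/* metadata pointer */
	uint8_t		dptr[16];	/* two PRP entries or one SGL entry */
	uint32_t	cdw10, cdw11, cdw12, cdw13, cdw14, cdw15;
};

struct cqe {			/* completion queue entry, 16 bytes */
	uint32_t	cdw0;		/* command-specific result */
	uint32_t	rsvd;
	uint16_t	sqhd;		/* submission queue head pointer */
	uint16_t	sqid;		/* submission queue ID */
	uint16_t	cid;		/* matches the command's CID */
	uint16_t	status;		/* status code plus phase bit */
};
```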
So one of the fields in a command, an SQE, is a command identifier that's allocated and managed by the host. When a controller reports a completion back to the host, it includes that command identifier in the completion to match things up and tell you which command it is reporting status for. And it's the host's job not to send a command ID that it already has outstanding, unless you just really like excitement and corrupted data.

In addition to the admin queue, you'll have one or more I/O queues that also have submission queues and completion queues, that the host sends commands on and gets completions back from the controller on. One detail of PCI NVMe that I don't have here is that you can actually have a many-to-one mapping of submission queues to completion queues. So if you wanted to, you could allocate multiple submission queues but tie them to a single completion queue, or some smaller number of completion queues. Or you can pair them one to one. It's up to however the host side decides to configure the queues when it creates them. Some of the types of commands you can send on the admin queue include actually creating the I/O queues, telling the controller how many you want and where they are, as well as, as I mentioned, fetching error log entries. The I/O queues handle things like read and write.

So what do NVMe commands look like? They're a fixed size; as I mentioned before, SQEs are 64 bytes. This means they don't contain any data themselves. All the data, for the most part, for things like read and write, is handled indirectly with some kind of scatter/gather entry. NVMe over PCI uses a unique type of scatter/gather entry that it calls a PRP, where each entry is just the physical address of a page with an offset, and the length is implicit: the length is either the remaining amount of that page, based on its size and the offset, or, given the total length of the command, the controller can infer how many PRPs it's going to need to finish a request. The command structure itself embeds two PRP entries, and there are ways to chain them to get an arbitrary number that I won't delve into here. But in particular, the head of your scatter/gather list is embedded as a fixed-size field in the fixed-size command.

NVMe also defines an alternate type of scatter/gather list that it calls an SGL, which is a more traditional scatter/gather list in that each element has both an address and a length; the length is not implicit or inferred. These scatter/gather list entries also have a type. I believe from reading the spec that it's possible to use these on PCI Express devices, although I think it's rare; our PCI driver in FreeBSD doesn't bother using them. As we'll see, NVMe over Fabrics uses SGLs exclusively. And the first SGL entry, because you can chain those as well, is stored in the command in the same place you would otherwise store your PRPs.

So what do completions look like? Completions, just like commands, are a fixed-size structure, 16 bytes in size. They also do not contain any data.
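To make the command-identifier rule concrete, here is a minimal, illustrative sketch of host-side CID bookkeeping; it is not code from the talk, just one obvious way to avoid reusing a CID that is still outstanding.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_DEPTH	128	/* illustrative queue depth */

struct cid_set {
	uint8_t	busy[QUEUE_DEPTH / 8];	/* one bit per possible CID */
};

static bool
cid_alloc(struct cid_set *set, uint16_t *cidp)
{
	for (uint16_t cid = 0; cid < QUEUE_DEPTH; cid++) {
		if ((set->busy[cid / 8] & (1 << (cid % 8))) == 0) {
			set->busy[cid / 8] |= 1 << (cid % 8);
			*cidp = cid;
			return (true);
		}
	}
	return (false);		/* queue full: caller must wait */
}

static void
cid_complete(struct cid_set *set, uint16_t cid)
{
	/* Called when the CQE carrying this CID arrives. */
	set->busy[cid / 8] &= ~(1 << (cid % 8));
}
```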
If you are doing a request that's going to get data back from the controller, like a read, the way the protocol works is that you provide a scatter/gather list describing the buffer in the original command, and the controller will send the data over to that location. Then, when it's done, it sends you a completion entry telling you, hey, I finished, and here's the status. So the data is sent before the completion. And here's where I mention the fact that we have command IDs to match up commands with completions, because completions can arrive out of order.

So that's my quick and dirty NVMe over PCI: the NVMe in your laptops, the existing NVMe driver in FreeBSD, and what it manages. Now let's talk about how NVMe works over a fabric and how that's different from the PCI model. The first thing that NVMe over Fabrics does is replace the notion of a mostly in-memory submission queue and completion queue with queues that are managed by your transport in a transport-specific manner. One of the things we'll get into is that NVMe over Fabrics has different levels of abstraction, and some things are deferred to the transport to manage in a transport-specific way. But at the level of generic Fabrics, you have some kind of queues, processed in FIFO order, to submit commands to and get completions back on; they're just not necessarily in memory the way they are for PCI. It's currently defined for several transports: Fibre Channel, which I think was the first one, several flavors of RDMA, although those are all defined in some ways as a single transport in the spec, and then there's a definition for TCP, which is what we're going to focus on later.

In Fabrics, unlike in PCI, every submission queue is paired with exactly one completion queue. So you always use a single completion queue to get back completions for commands from a given submission queue. In many places in this talk I'll talk about an SQ/CQ pair, or in many cases just a queue pair, and that means a pair of a submission queue and a completion queue serving as either an admin queue pair or an I/O queue pair. In Fabrics terms, to avoid some confusion, they define some of their own words: if you have a logical connection between a host and a controller that contains an admin queue pair, and possibly one or more I/O queue pairs as well, that whole group together is called an association. And there are rules about, for example, losing a queue pair: maybe you lose the whole association because they all go away. The reason it's called an association is that, depending on your transport, you may actually have multiple underlying transport connections that together build up your association.

Another thing that Fabrics adds is something called a discovery controller. This controller doesn't do I/O like a normal controller; it gives you a kind of name service, sort of. iSCSI has a very similar concept. In fact, there are a lot of things in NVMe over Fabrics, especially when we get into TCP, that are very close to iSCSI, if you're familiar with it. But this is a way that you can connect to almost a DNS service, sort of, for NVMe over Fabrics, and find the addresses and connection information for the other controllers that actually do I/O, for you to connect to.
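As a vocabulary aid, a tiny sketch of how those pieces nest, with made-up types; libnvmf's real types are different.

```c
struct qpair;	/* one SQ/CQ pair; over TCP this is one connection */

struct association {
	struct qpair	*admin;		/* exactly one admin queue pair */
	struct qpair	**io;		/* zero or more I/O queue pairs */
	unsigned int	num_io;
};
```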
So we talked about commands and completions before for basic NVMe. In Fabrics, there's a new abstraction called a capsule, which is how you're going to submit commands and get completions back. We embed an SQE or a CQE into a capsule that we're going to send over the transport, and that represents a command we want to submit or a completion we're going to receive. You can think of the SQ and the CQ as being a FIFO of these capsules rather than of just the queue entries themselves. In particular, one reason there's this bigger abstraction than just a queue entry is that you can have data associated with a capsule, representing the data buffer that goes along with it when you're actually sending data to the other side or having data come back.

I mentioned before that Fabrics commands never use the PRP style of scatter/gather list; they always use the SGL approach. Among other things, as we'll see further along, there are different types of SGLs that you need to use depending on what transport you're using. Also, in Fabrics, in some cases you're able to embed data directly in the capsule. So you can have a capsule that, logically, as it's sent over the wire, contains both the queue entry as well as the data associated with it. Or you can have an approach where just the queue entry is sent across, in particular the submission queue entry for the command, and it has a scatter/gather list saying that there's some buffer on the host side, which the transport has a way to manage, that you can read or write from. The data isn't included as part of the capsule; there's some other transport-defined method for the I/O.
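Putting the capsule idea in code form, a hedged sketch with hypothetical field names (libnvmf's actual capsule type is opaque to its consumers):

```c
#include <stdbool.h>
#include <stddef.h>

struct sqe;	/* the 64-byte submission queue entry sketched earlier */
struct cqe;	/* the 16-byte completion queue entry */

struct capsule {
	void	*qe;		/* an SQE (command capsule) or CQE (response capsule) */
	void	*data;		/* optional associated data buffer */
	size_t	data_len;
	bool	data_in_capsule; /* in-capsule data vs. transport-managed buffer */
};
```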
So that was actually a good pausing point. Now I want to drill down a little further in the stack and talk about the TCP transport, and specifically how Fabrics works over TCP. The TCP transport for Fabrics defines its own little protocol with its own packets, which it calls protocol data units, because, as I mentioned before, it's a lot like iSCSI. So there are PDUs, a similar term you'll hear in iSCSI, that are used to do the communication, both to pass capsules and to manage how the data that goes along with a capsule is transferred. In the TCP transport, we use a separate TCP connection for every single queue pair. Your admin queue pair is a single TCP connection, and every I/O queue pair you create is an additional TCP connection. I believe this is different from iSCSI, which uses a single socket for everything. TCP supports in-capsule data, but only for commands, only when you're sending data out, not for data coming back. It also supports something it defines as a command buffer, which I'll describe in a bit, which is how you can do data transfer without using in-capsule data: you have an associated data buffer, and there are other PDUs that do I/O to and from that data buffer.

So these are some of the PDU types; actually, these are all the PDU types that we have with TCP. We have two at the beginning which deal with how you establish your connection and set some parameters at the TCP level, such as whether you're going to use digests on your PDUs or not. There are two more for when an error is detected at the TCP protocol level: we have a way of reporting that error back, telling the other side we're going to terminate the connection, and those exist in both directions. Then we have PDUs to send command capsules, which contain an SQE, and response capsules, which contain a completion. Finally, we have three PDUs that manage how we send data: one that allows us to send data from the host to the controller, which you can think of as something you'd use for a write; one from the controller back to the host, which you would use for a read; and then something called ready to transfer, which, if you know iSCSI, is just like the R2T in iSCSI and has to do with how the controller can throttle when it's ready for more data to come from the host. In particular, in NVMe over Fabrics, all the data I/O, when you're not using in-capsule data, is initiated by the controller side: the controller decides when it wants to transfer data in a given direction.

[In answer to a question:] It depends on what you're doing. For example, there's a cap on how much in-capsule data you can do, in TCP in particular. Also, if I remember correctly, admin queue pairs can only do in-capsule data on TCP. And in the driver, it's more that currently it's based on what size you can do, which is determined by the controller: it tells you the maximum capsule size it can handle.

So what does a PDU look like in NVMe over TCP? First, they all start with what's called a common header, which is a fixed-size structure that includes things like the type of the PDU being sent. That's followed by a PDU-specific header, which varies in size based on what type of PDU it is and holds PDU-specific data. For example, the error-reporting PDUs, in their PDU-specific header, actually carry the header of the PDU that triggered the problem, as well as information telling you which field in that PDU was incorrect. Then you can have a header digest, if you have turned them on; it's a negotiated feature between the two sides. Then you have a data region, if your PDU contains data, followed finally by a data digest, which is also optional based on whether you negotiated it during connection setup.
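For reference, the 8-byte common header looks roughly like this, per the NVMe/TCP transport spec; the names are mine, and FreeBSD's own definitions live in its nvmf headers.

```c
#include <stdint.h>

struct nvme_tcp_common_hdr {
	uint8_t		pdu_type;	/* ICReq, CapsuleCmd, C2HData, R2T, ... */
	uint8_t		flags;		/* e.g., header/data digest present */
	uint8_t		hlen;		/* length of common + PDU-specific header */
	uint8_t		pdo;		/* offset of the data within the PDU */
	uint32_t	plen;		/* total PDU length on the wire */
};
/* The PDU-specific header follows immediately; for capsule PDUs it is
 * essentially the queue entry itself. */
```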
So let's walk through a few examples of how we send data across the connection with NVMe over TCP. I'll start with the easy one: the PDUs sent for a command that's sending some data from the host to the controller, where we embed the data inside the capsule that we send. We start off building the actual NVMe command we want to do, and I should have mentioned before that the commands, for the most part, are defined identically to the way they're defined for PCI. There are a few subtle differences, and there's a new fabrics command with a set of subcommands for fabric-specific operations, but things like read and write, aside from the SGL being a different type, are otherwise identical. So we construct the actual submission queue entry for whatever command we're doing.

For the data, in this case data we're going to send, we use what's called a data block SGL: when we fill out the SGL we use the data block type, and for in-capsule data we say the address offset is always zero, and the length in the SGL is how much data is included. We then append that data right behind the SQE and slap the common header for the PDU in front; the PDU-specific header portion for capsules is actually just the queue entry itself. And that right there is our full PDU, and we send it across to the controller. The controller processes it, figures out the result of the operation, constructs a completion queue entry, sticks it in a capsule with a common header, and sends that across to the host. That's the whole process of how we get data back and forth, and that one's pretty simple, right? In TCP it's just immediately in the stream, embedded as part of the message that we send along with our SQE.

So now let's get to the more complex cases. The first one is a request that's going to receive data, so think of this like an I/O read request. We're going to send a request from the host to the controller, and the controller is going to send data back to us. The first thing we do on the host is configure some kind of command buffer. The backing store for it doesn't matter: logically the buffer goes from zero to n, however many bytes we're doing, and we can back it however we want, but logically it's just a contiguous buffer. Then we build our command entry and slap a header on it, so we have our entire command capsule. This time, for the SGL entry, TCP uses what's called a transport type: there's a set of SGL types in Fabrics that are specific to a given transport, and they start at a certain number. For TCP, the first transport type is this command buffer, and it's the only one TCP uses; RDMA, for example, may have its own transport types. The transport type means we're going to do data transfer by reading and writing from this command buffer. This is our entire PDU, and we send it off to the controller.

For our read request, the controller looks at it and says, ah, this is a read, it's going to produce some data, maybe it reads it up from the drive. It's got a chunk of data to send us, and it stuffs that into a controller-to-host data PDU and slaps on the little common header. This tells us where in the command buffer this data goes: what the starting offset is and how long it is. Also, one of the requirements in Fabrics is that the buffer has to be read or written sequentially, and it's a protocol error if you get those out of order, in case you've dropped one or something like that. So the controller builds this as a PDU and sends it across to the host, and the host copies the data into the command buffer. In this case we didn't get all of it yet, so the controller constructs another one with the rest of the data, the host receives it, and now the command buffer is full. Finally, at this point, the controller sends the completion across in a response capsule PDU.
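Here is a minimal sketch of the host side of that read flow, assuming a simple contiguous buffer: each C2HData PDU must land at the next expected offset, and anything else is a protocol error. Illustrative only; the real code would terminate the connection on an error.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct io_request {
	uint8_t		*buf;		/* the logical "command buffer" */
	uint32_t	len;
	uint32_t	next_off;	/* next byte we expect to receive */
};

static bool
c2h_data(struct io_request *req, uint32_t off, const void *data,
    uint32_t dlen)
{
	if (off != req->next_off || off + dlen > req->len)
		return (false);	/* out of order or overrun: protocol error */
	memcpy(req->buf + off, data, dlen);
	req->next_off += dlen;
	return (true);		/* completion arrives after the last chunk */
}
```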
So now let's look at the last case, which is the most complicated one: we as the host want to send data to the controller, so a write. But I mentioned before that all the I/O is initiated by the controller. We start off again with a command buffer, but this time our command buffer has data from the start, because we're sending data over, not getting data back. We build a command capsule like before, also with a transport-type SGL; in fact, this looks identical to the last read request. The direction the data moves is implied by the command you're doing; there's nothing command-agnostic in the SQE that says it, it instead has to be inferred from the command. We send that across, and at some point the controller decides, okay, I'm ready to read some portion of the data. When it's ready, it constructs a ready-to-transfer PDU, which tells the host which part of the buffer it wants: it gives you an offset and a length of how much data the controller wants back from the host at this point. So it sends that across, and in this case the R2T is saying, I want this particular chunk of the data. Again, it has to start at the beginning and stay sequential. The host responds by building a host-to-controller data PDU, and this includes the offset and length, so you can confirm you got what was asked for; it has the little common header, and then the data is in the data portion of the PDU. In this case, you'll notice the host only sent part of the data, so you can do multiple cycles of this; you could have multiple host-to-controller data PDUs to fulfill a single R2T request, but I didn't want to put twenty lines on the slide because it wouldn't be readable.

[In answer to a question:] The controller, when it's ready to get some data, has to ask starting from the beginning and go sequentially, to follow the protocol. How it handles the data when it arrives is up to it; that's implementation-defined. The command buffer itself is also implementation-defined; it's just a way to think about the offsets and lengths that are passed in these PDUs, and those offsets and lengths are only in terms of this buffer on the host side, not the controller side. And then finally, when we finish transferring the whole thing, the controller sends back a response with a completion saying, hey, we're done.
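And a sketch of the host's half of that R2T exchange, again with hypothetical names; send_h2c_data() is a stub standing in for real PDU construction. The host answers an R2T with one or more sequential H2CData PDUs covering exactly the requested range.

```c
#include <stdint.h>

#define MAX_H2C_PAYLOAD	8192	/* per-PDU data cap; illustrative value */

static void
send_h2c_data(uint16_t cid, uint32_t off, const void *data, uint32_t len)
{
	/*
	 * In a real transport this builds one H2CData PDU (common
	 * header, PDU-specific header carrying cid/off/len, then the
	 * data) and writes it to the socket.  Stubbed out here.
	 */
	(void)cid; (void)off; (void)data; (void)len;
}

static void
handle_r2t(uint16_t cid, const uint8_t *cmd_buf, uint32_t r2t_off,
    uint32_t r2t_len)
{
	/* Cover the requested range in order, possibly in pieces. */
	while (r2t_len > 0) {
		uint32_t todo = r2t_len < MAX_H2C_PAYLOAD ?
		    r2t_len : MAX_H2C_PAYLOAD;
		send_h2c_data(cid, r2t_off, cmd_buf + r2t_off, todo);
		r2t_off += todo;
		r2t_len -= todo;
	}
}
```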
Okay, so that is my little introduction to Fabrics and Fabrics over TCP. Let me see how I am on time, because I don't know when this is supposed to end: 11:45 plus 20 minutes, right, so 12:05. I'm not completely behind schedule.

So now I want to shift gears and talk about what I've done so far to implement this in FreeBSD, the design that it uses. The first abstract choice I made was to use a three-layer design for how all of this is structured. In the middle I have some notion of a transport abstraction, and it's somewhat aligned with how the Fabrics spec itself is defined. On the top end, the transport allows you to send and receive capsules, and you can attach data, a data buffer, to a command capsule in some way. At the bottom end you have multiple backends for different transports. In theory, when you're first creating an association, you have to know the type of transport you're creating to get it started, but once you've created a queue pair and allocated it from that middle layer, the yellow region on the slide, you don't have to worry about any more transport specifics. It's kind of like a devsw or ifnet in the kernel, where we have an abstract layer between consumers of the interface and the protocol backends. The protocol backends sit below this, and you can have them for TCP or RDMA or any other fabric you wish. And at the top, the clients of this abstraction are either a host or a controller.

I started in user space, so let's walk through the bits I've done in user space so far. I have a library called libnvmf, and it defines that yellow layer: a transport interface, which includes the API that faces upward and allows you to send and receive capsules, but also, internally, it defines a class, and since it's in C, that's a structure of function pointers, an interface that transports are required to implement to actually be a transport. There's a little bit of glue in the yellow layer to handle some things so that we don't have to duplicate as much down in the transport-specific bits. Currently it provides an implementation of the TCP transport. This library is designed for simplicity and, as I say later, debuggability; it's not designed to be high performance. It was something so that I could debug easily and make sure it was correct and functional. So it is not thread safe; if you really want thread safety, you're going to have to do that yourself. And it uses blocking I/O on sockets, not non-blocking I/O with more asynchronous stuff, just to make life simpler. It does contain several helper routines that sit on top of the bare transport layer. One example is a helper routine for a host that sends a command and waits for the response to come back, which I use a lot in the userland bits.

I've written both a user space host and controller. The host is a little program called nvmfdd in my branch, and you can think of it kind of like dd. It connects to a remote controller; you give it parameters like the address of the I/O controller you're connecting to and which namespace you want to access, you can give it an offset, like a starting LBA, and a length in bytes, and then a read or write command. It either reads from standard input to construct a write command that it sends across, or, if you're doing a read, it sends a read command across and dumps the result to standard output. So you can do simple I/O with it, and that's enough to test the protocol; it's sufficient to exercise all the cases I described before for how we send and receive data. Doing this in userland was handy because it was a lot easier to find and fix the bugs that way.

More recently I've written a simple user space controller I call nvmfd. It supports multiple namespaces: you can have a namespace that's backed by a file, or by a character device that looks like a disk, for example a zvol if you wanted, and you can also create a temporary RAM disk whose contents start off as zeros and get thrown away when you're done. It provides a discovery controller as well as an I/O controller. These things are not designed for performance, and for the most part they're not designed to be used outside of development. They're really designed to help me flush out bugs in my implementation, and in my assumptions, in particular as I was testing against Linux to make sure that my notion of things matched up with theirs.
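That "structure of function pointers" might look something like the following; the member names here are my guesses at the shape of such an interface, not libnvmf's actual API.

```c
struct capsule;
struct qpair;

struct transport_ops {
	/* Queue-pair (connection) lifetime. */
	struct qpair	*(*allocate_qpair)(const void *params);
	void		 (*free_qpair)(struct qpair *qp);

	/* Capsule allocation and transmit/receive. */
	struct capsule	*(*allocate_capsule)(struct qpair *qp);
	void		 (*free_capsule)(struct capsule *nc);
	int		 (*transmit_capsule)(struct qpair *qp,
			     struct capsule *nc);
	int		 (*receive_capsule)(struct qpair *qp,
			     struct capsule **ncp);
};
```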
Sometimes their notions are odd; they don't believe that errors exist, for example. I remember early on, because of a bug I had, I didn't like a field in one of their PDUs, and I sent back one of the PDUs that says, I want to terminate the connection because of an error. The Linux kernel printed out "unknown PDU type"; it completely doesn't understand that errors exist. It did still close the connection, because of the unknown PDU type, but it's not exactly graceful.

So in my user space implementation, these are the three layers I ended up with. I have libnvmf, which handles the yellow layer; it's hard to see on this slide, but RDMA and Fibre Channel are hashed out because they're not done. They could be done in the future, but it does provide TCP. And I've implemented a host, which is nvmfdd, and a controller, nvmfd. So I have that bit done in user space.

The next step I worked on, or rather I've bounced back and forth between some of these, is the kernel data path, where the goal is, for actual performance, to handle the bulk I/O in the kernel itself; you don't want to do that in user space. I took the same transport abstraction that I had written in libnvmf and mostly mirrored it into the kernel. There is some regrettable code duplication, in particular in bits of the TCP code that are currently duplicated, like how you validate a PDU and decide which fields are involved, and all the error checking for that. But unlike the user space case, the kernel data path, because it's written for our kernel and for performance, does not use blocking I/O. Instead it's much more asynchronous and uses callbacks for different things.

When you create a queue, you pass along a callback that you want invoked any time any type of capsule is received. If you're a host, this callback is called when you get a completion back; if you're a controller, this callback is called when you get a command that you need to handle and respond to. There's also a callback for I/O operations: when you're going to do I/O along with a command, as part of describing the buffer that holds either where you want the data to go or where it's coming from, you have to register a callback that gets called when the I/O is done. And when it's done is kind of non-obvious. One of the things that iSCSI more recently had shoved into it, as a bit on the side, was a notion of zero-copy buffers, so that it would use unmapped mbufs and avoid copying the data inside the iSCSI layer before sending it down to the NIC. For Fabrics I did that by design from the start. So, for example, when you send data out over a TCP connection in the kernel, we create external mbufs that directly reference the data buffer you're using to send the data. When you're doing a write, say, you have a struct bio on the host side and you're going to send the pages from that bio out; we use external mbufs that hold a reference count back to that I/O buffer, and only when all those mbufs have been sent, and your NIC has gotten the TX completion interrupt back to free the mbufs, does the reference count drop all the way to zero, and then your I/O callback gets called.
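A sketch of the callback flavor of that kernel interface, with invented names; the real kernel API differs.

```c
#include <stddef.h>

struct capsule;

typedef void (*capsule_cb_t)(void *arg, struct capsule *nc);
typedef void (*error_cb_t)(void *arg, int error);
typedef void (*io_done_cb_t)(void *arg, size_t transferred, int error);

struct qpair_callbacks {
	capsule_cb_t	receive;	/* host: a completion arrived;
					   controller: a command arrived */
	error_cb_t	error;		/* connection-level failure */
	void		*arg;
};
/*
 * Each I/O buffer attached to a capsule additionally carries an
 * io_done_cb_t; as described below, it may fire after the completion
 * callback, not necessarily before it.
 */
```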
So depending on how your NIC is working and the flow of interrupts, you may actually get the callback for your completion, coming back and being received in a different mbuf, before you find out that your I/O is fully done and freed, and in particular before it's safe to report the I/O as done and to reuse whatever the backing store is for the bio or the CCB or whatever it may be. So this I/O callback can happen after you've gotten your completion; it's just non-obvious at first glance. And finally, when you create a queue, you also give a second callback which says: if an error occurs, call me and let me know. That's usually so that you can decide you want to kill the connection and deal with that.

In the kernel, what I've chosen to do for all the I/O buffers that are attached to a capsule is to use an abstraction called a memdesc that we've had in the kernel for a while. If you've seen commits from me cleaning up memdescs recently, and now we even use them in the base NVMe driver to simplify some things, and I've added a whole set of routines to copy data in and out of a buffer described by a memdesc, that's all motivated in part by this work. In particular, this means that the transport layer, and the transports themselves, don't know anything about mbufs or bios or CCBs or whether it's a kernel buffer or anything. They just get an opaque handle to a memory descriptor, a memdesc, plus an offset into it and the length of how much data out of that memdesc they're supposed to send, or, when they get data in, where they're supposed to put it. So up in the host or the controller you might know what kind of buffer you're dealing with, but nothing below that, anywhere in the stack, has to know what kind of data buffer you're working with. That's mostly true: in TCP, in order to construct those external mbufs, I do have to know what kind of data buffer you're working with, but only there.
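A sketch of what that buys you, assuming a hypothetical capsule_attach_data() helper; memdesc_vaddr() is one of the real constructors in sys/memdesc.h, and there are others wrapping bios, uios, and so on.

```c
#include <sys/types.h>
#include <sys/memdesc.h>

struct capsule;

/* Hypothetical helper standing in for the real attach routine. */
void	capsule_attach_data(struct capsule *nc, struct memdesc *mem,
	    size_t offset, size_t len);

static void
attach_kernel_buffer(struct capsule *nc, void *buf, size_t len)
{
	struct memdesc mem;

	/*
	 * Wrap the buffer once here; below this point nothing in the
	 * transport needs to know what kind of buffer this was.
	 */
	mem = memdesc_vaddr(buf, len);
	capsule_attach_data(nc, &mem, 0, len);
}
```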
The design here is that user space should still handle the initial setup. In particular, one of the things the NVMe spec mandates, but that Linux for example doesn't implement, and I don't implement yet, is that all remote controllers should implement TLS: you can have your connection go over TLS and not just in the clear. And I don't think it makes any sense at all to try to do a TLS negotiation in the kernel; that seems rather nutty to me. [Five minutes!] Thank you; I'm actually not too bad on time.

So the design is: do the fast bits in the kernel. The part I have done so far is a host, nvmf, which gives you an in-kernel Fabrics host. It does not try to share code with nvme. If you look hard enough out on the interwebs, there is a GitHub repository that has an NVMe-over-RDMA client for FreeBSD, and that particular code does things to the existing nvme driver that are kind of gross. In particular, once I had a design with those I/O callbacks, and the complexity that when you're doing I/O you have two callbacks to deal with, it was just a lot simpler to write my CAM SIM from scratch rather than trying to shove that into the nvme driver. But it does create devices that look like nvmeX; they're hung off of nexus instead of off a PCI bus, but they show up the same in dmesg, and it creates similar devices in /dev, so you can use nvmecontrol with them directly for things like identify. We only support disk access via CAM, so ndaX devices, not nvd. And one other thing it has: if some kind of connection error occurs, it will tear down the connections, but the /dev devices stay around, and all the I/O requests that are open stay paused, and then you have the ability to reconnect with a new set of connections, at which point all the I/O resumes.

One other part of this, since I mentioned some parts are in user space: there are some extensions to nvmecontrol, so you can find out Fabrics things via identify, and there are a few new commands. There's a discover command to query a discovery controller and dump its log of the things it knows about. There's a connect command, which is how you actually connect to a remote controller and establish a new association, a disconnect command for when you want to go away, and then there's reconnect, which is what you use if an error occurs and my TCP connections go away, but I want to re-establish and keep going and not lose I/O; you can use reconnect to do that.

All right, so the things that are currently in the kernel today: I have nvmf_transport.ko, which is the yellow layer, nvmf_tcp.ko, and then nvmf.ko, and in particular these blue boxes depend on the yellow as well as the green. But you don't have to load all the transports; you explicitly load which transport you want as well as which kind of client you want.
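As a rough usage sketch of the module split and the new nvmecontrol verbs; the argument shapes below are abbreviated placeholders rather than copied from the manual pages, so consult those for the exact syntax.

```sh
# Load the transport layer, the TCP transport, and the host client
# explicitly (kldload also pulls in dependencies).
kldload nvmf_transport nvmf_tcp nvmf

nvmecontrol discover <address>            # dump a discovery controller's log
nvmecontrol connect <address> <sub-NQN>   # establish a new association
nvmecontrol reconnect <device> <address>  # re-establish after an error
nvmecontrol disconnect <device>           # tear the association down
```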
All right, what am I left with, Peter? Two? All right, I can maybe do this last bit real quick; well, two slides. The things I still need to work on: an in-kernel controller, which is a big ball of work that I won't go into here, but my slides will be available if you want to read them later. It might be nice to do other transports; I have some interest perhaps in RDMA, and I actually had someone email me two weeks ago whose company wants Fibre Channel, so I'll let them have that fun. TLS should be doable, and it should be doable just fine in FreeBSD using kernel TLS offload; it should be pretty transparent, actually. And I'm going to skip past this part, which you can see later.

The code is available in a branch off my GitHub repository; it's called nvmf2, don't ask about the first one. There is a caveat, which is that this is my development work branch and I rebase; that's my workflow. So if you're going to try to follow it for some reason before it lands in main, you'll have to suffer, and just know that's part of how I roll: I rebase. I do need to give a big thanks to Chelsio, who paid for all of this; this has been something they funded me to do.

And lastly, with what little time remains, I'll wait for questions.

[On a question about plugging this into bhyve:] It's not really relevant to bhyve, because the transports all run on top of an existing transport like TCP; it doesn't really make sense, because the NVMe emulation in bhyve is basically a PCI device.

[On a question about pending I/O:] For the nvmeXnsY devices, I actually keep the bios queued, but no one should do bio I/O via those anyway, even in the PCI driver. So I basically keep the bios queued, and then if you detach the device I'll abort them all, or if you reconnect they'll get resumed. With CAM, I think life is actually less terrible: I'm able to freeze the queue and send the CCBs back with a CAM aborted status, and then later, when I get reconnected, I unfreeze the queue and CAM resubmits, since I think for CAM the CCBs actually get put back on the queue. So that was Warner asking about paused I/O. Am I done, Peter? Okay. You can find me afterwards if you have more questions; especially, we're going to be at lunch, I believe, next. No? Oh, I'm used to lunch at noon. Okay, fair enough.