So, I'm Henrik from Cisco Systems in Norway and I'm going to talk about time-sensitive networking. Before I get going, out of curiosity, how many of you know about TSN or AVB? All right. Okay. So you actually know what you're talking about. Excellent. This is going to be fun. So, my day-to-day work at Cisco is doing real-time tweaking of the systems we're making, a mixture of kernel and user space work, which means that I tend to spend my days looking at traces, thinking, huh. And that also means that I write a lot of scripts, because I'm lazy. I like hardware, which is why the current work on the TSN driver is excellent, because then I get to get more hardware, which my wife loves, because I tend to bring all of it home. So the reason why we started looking into AVB at Cisco is that we do telepresence systems, basically massive systems connecting lots of microphones and speakers. And if you ever try to set up a lot of audio, you quickly realize that all the cables are going to be one big mess. And also, no matter how large a backplane you make, you always need more in some settings, so you would like a more flexible arrangement. That also goes the other way: if you have a big system, you may not want to use all the AD and DA converters, so if you could avoid all that, it would be useful. And finally, it would be nice to have something a bit more flexible than doing point-to-point analog trunking everywhere, so you don't have to have splitters and mixers and all the extra things you need in order to route the audio where you need it. I work for the audio group, by the way, so my interest was primarily in audio when I started looking at this.
So we started looking into this, and last year we had our yearly demonstration day at Cisco, so I took one of the big telepresence rigs and said I want to use this as a speaker, took my desktop computer, took the AVB driver, and basically used Spotify to play on the big telepresence system. Spotify didn't know that it was going over the network; it just saw a normal ALSA sound card. That was a really fun demo, a lot of work, a lot of things that could go wrong, a few things that did go wrong, but the end result was a nice way to show that AVB is not only for pro audio, it's actually a fairly useful general concept. So that was basically our way into AVB; I'm not with the switching group. And of course, like any other standard, there's a lot of different terms. It started out being AVB, audio video bridging, and then it moved into TSN because they realized you can do a lot more than just audio and video. And you talk about bridges and end stations. How many of you know about bridges and end stations? Are these familiar concepts? Somewhat? Ah, okay. And you need stream reservation. I'm going to go through some of this, but just to give you some pointers as to where we're going. You have the stream reservation classes, A for the best quality, and then B, and recently they also added C and D for automotive. And you have to look into timing domains and stream reservation domains, and the intersection of those two will be the actual AVB domain. So as I said, we primarily use digital audio. We do that for echo cancellation and mixing and routing, and then we ship it off to the network. So at some stage we need digital sound anyway, and whether we do the AD conversion in the codec or it's done in the network doesn't really matter. It gives you great flexibility in how you create your setups. We do fairly complex systems, so being able to do even more complex things than we do today is something we are interested in.
And it also gives you a fairly high audio capacity. You can send a few hundred audio channels over a single network link; that would be interesting to try with analog cables. But at the end of the day, this is basically about building basic infrastructure. So if you put the audio and video aside for a moment, this is all about giving you the ability to send something time-critical over the network without worrying about losing frames or jitter in when the frames arrive. And once you start looking at infrastructure, you realize that open standards are the only sensible way to go. There are a few other audio standards; one of them is not open, and one of them is best effort. AVB is guaranteed delivery, so you can actually mix it with other traffic, which means that even though you have the audio stream going from the server at the top to the speakers at the bottom, you can do a pretty massive file transfer across the network over the same link without having to worry about losing frames or having glitches in the audio. That's really important, because you don't have to have two separate networks, for instance. I'm sure in a professional audio setup or on a factory floor that's not the same problem, but for a lot of people and in a lot of cases, having a single network is a requirement, so you don't have to do all the sorts of tweaking you do with some of the others. And as I said, AVB then moved on to TSN when they realized that AVB could do more than just audio and video, and pretty much from day one people started using it for more than audio anyway. I think XMOS early on had a demo about using AVB to control a servo with a gyroscope, which was a really cool demo.
So now they've been looking into consumer AV as well, and the automotive industry is interested in this, and then you have the whole industrial application with industrial IoT and control and robotics and all the parts that actually need some time-sensitive protection on their streams. So if any of you attended Steven Rostedt's talk this morning about deadline scheduling: what you actually do underneath is a bit like the deadline scheduler, you pace the frames out evenly. We'll get back to that later, but if you keep that in mind, some of these things will be a bit easier to understand, I think. All right, so that was basically a crash introduction to AVB, and I intend to go a bit more into detail about TSN, which is basically just a bunch of standards. And standards are great because there are so many of them, but these standards actually try to solve a problem, unlike all the others, of course. The most important ones, I guess, are 802.1Qav, which is basically the forwarding and queuing of time-sensitive streams, which gives you the tools you need to implement 802.1Qat, the stream reservation, which is what actually gives you guaranteed delivery and bounded latency. And in order for all of this to work, it's important that the entire domain, all the end stations and bridges, agree on a common time, so they adopted PTP version 2 and made their own specific profile for AVB, gPTP, which is a bit simplified so that all the end stations can agree on the time. So when you say play this at time X, all the speakers will play out at the same time. Then you have 1722, which is the layer 2 transport protocol, basically the header of the frame you're sending over the network, and they recently bumped the next version up to draft 16, and that introduces a lot of new streaming formats as well.
So the original one was pretty much the FireWire standard for audio, and now they've added floating point and PCM and encrypted and discrete control formats, so you have standardized support for pretty much everything, or a lot of things at least. And they're also working on the RTP version, so you can actually traverse VLANs and cross larger networks. And you have the discovery and enumeration, which is basically how you connect the endpoints: if you have a speaker and a microphone, they are pretty dumb, and if you want to connect them, you need someone outside to do that, and that's where discovery and enumeration comes in. And you also have frame preemption, which we don't worry about in the kernel driver that I'm going to describe, but it's used in some of the bridges to lower the latency through the bridges even further, by stopping frames in flight and letting high-priority frames go through instead. And of course the FireWire standards, which I was told yesterday have been opened up now, so you don't have to pay for them anymore, which is great news, but that's a digression. All right, so 1722, as I said, that's layer two, and that's where they started, because on a microphone, why would you implement TCP/IP, or UDP for that matter? They started out solving an actual problem on small networks, typically used by pro AV, and as they also discovered, multipath routing is kind of difficult when you have a reserved path through the network. So they solved the easy part first, and they've moved on to the higher layers now. They have a working group in the IETF to do just that for layer three and up. All right, so in order for this to work, if you want to actually send AVB frames through the network, you have to reserve a path through the network, you need to declare membership in a VLAN, and you need to define your stream reservation class priorities. So as I said earlier, you have a set of domains.
You have a stream reservation domain, and you have a timing domain, and the AVB domain is the intersection of those two. And the funny thing is that you are allowed to have different stream class priorities. So a class A on one bridge can have a different priority than on another bridge, and then those two bridges will belong to separate domains. And you do all that in the multiple VLAN registration protocol and the stream reservation protocol. You figure all that out so that you know what domain you belong to. And then you basically send a set of frames through the network querying whether you can reserve the bandwidth you need. And if all the bridges, which are typically switches, say OK, then you've got your path, and then you can start sending data. And if one of them says no, then there's not enough bandwidth or not enough resources to do that. A talker will typically list, in a talker advertise, what attributes it has available. Then a listener will look at them and tell the talker these are the attributes I will accept, and then they negotiate and find out what attributes to use for the stream. All this is done in MSRP. And that gives you extraordinarily low packet loss. They want to say zero, but that's kind of difficult, so they're saying somewhere between one in a million and one in ten billion lost frames. I'm not exactly sure where they have these numbers from, but since you have reserved the path, you are pretty much guaranteed not to lose frames due to congestion. So on the previous slide, that one: the problem when you send a lot of traffic here is that if you send a massive file here, then you risk losing frames on this link. And this you avoid when you do the stream reservation. And I'll stop jumping back through the slides now. So the subnet is basically how the grouping works.
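The hop-by-hop bandwidth check just described can be sketched as a toy model. The names and the per-hop logic here are illustrative, not MSRP itself (real SRP exchanges talker advertise and listener ready attributes), but the pass/fail idea is the same: a reservation only succeeds if every hop has room under the 75% cap AVB allows for reserved traffic.

```c
#include <stdint.h>

/* Illustrative sketch only: each hop on the path checks whether the
 * requested bandwidth still fits under the AVB limit of 75% of the
 * link, and a single "no" kills the whole reservation. */

struct hop {
    uint64_t link_kbps;      /* link speed                           */
    uint64_t reserved_kbps;  /* already reserved by other streams    */
};

/* At most 75% of a link may be reserved for stream traffic. */
static int hop_can_reserve(const struct hop *h, uint64_t need_kbps)
{
    uint64_t limit = h->link_kbps * 75 / 100;
    return h->reserved_kbps + need_kbps <= limit;
}

/* A reservation succeeds only if every bridge on the path says yes;
 * only then is the bandwidth actually committed on each hop. */
int path_reserve(struct hop *path, int nhops, uint64_t need_kbps)
{
    for (int i = 0; i < nhops; i++)
        if (!hop_can_reserve(&path[i], need_kbps))
            return 0;
    for (int i = 0; i < nhops; i++)
        path[i].reserved_kbps += need_kbps;
    return 1;
}
```

Once a hop is at its cap, any further talker is refused up front instead of losing frames later, which is where the very low loss numbers come from.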
I'm not a networking expert, so I'm afraid to give you the exact answer, but a layer two frame is not able to jump between subnets normally. You have to go through a bridge and be forwarded through. And as it is now, you have to stay within that subnet. So if you meet a router, you pretty much stop, if you can think of it that way. There are switches and routers that will let you through, but that's a bit beyond my skills on the networking side. So the AVB domain will stop at the subnet, pretty much. I'm sure you can do tricks to traverse this. So as I said, there are a few different traffic classes; A and B are the original ones. And you have the default priority, and the reason why we actually have a default priority is that the priority can be set differently in your domain, so you need to do the MSRP thing to figure that out. If you are a class A stream, then you can send frames at most every 125 microseconds, and you are guaranteed a max transit time from talker to listener of two milliseconds. And that translates directly into: you don't have to have more than two milliseconds of buffering on your microphone or your speaker. Class B is 50 milliseconds, because then you are allowed to jump over wireless links as well. And then automotive added C and D. The observation interval there is interesting; I don't know where they got the numbers from, it's 750 hertz and some other frequency, so probably there's something else they want to play with, but what it is, I really don't know. The automotive profile also does away with the whole stream reservation protocol, and they use a fixed grandmaster for timing and all these things, because they typically have a very static environment. I've looked at AVB, and I haven't worried about the automotive side yet. Yeah. No, that's the slowest. So, in an AVB domain you have to disable jumbo frames, so you can send up to 1,500 octets in each frame.
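A back-of-the-envelope sketch of those class A numbers. The 4 octets per sample assumes the AM824 audio encoding from the FireWire standard; the figures are illustrative, not from the driver:

```c
#include <stdint.h>

/* Class A sends one frame per 125 us observation interval (8000/s) and
 * guarantees at most 2 ms transit over seven 100 Mbit hops. These
 * helpers just turn those figures into bytes and nanoseconds. */

#define CLASS_A_INTERVALS_PER_SEC 8000
#define AM824_BYTES_PER_SAMPLE    4     /* IEC 61883-6 audio encoding   */
#define MAX_FRAME_OCTETS          1522  /* 1500 payload + L2 overhead   */

/* Samples per channel that go into each class A frame: 48000 -> 6. */
static inline uint32_t samples_per_interval(uint32_t rate_hz)
{
    return rate_hz / CLASS_A_INTERVALS_PER_SEC;
}

/* Audio payload octets per frame for a given channel count. */
static inline uint32_t payload_bytes(uint32_t rate_hz, uint32_t channels)
{
    return samples_per_interval(rate_hz) * channels * AM824_BYTES_PER_SAMPLE;
}

/* Octets a listener must buffer to ride out the max transit time. */
static inline uint32_t transit_buffer_bytes(uint32_t rate_hz, uint32_t channels,
                                            uint32_t transit_ms)
{
    return rate_hz / 1000 * transit_ms * channels * AM824_BYTES_PER_SAMPLE;
}

/* Why the 2 ms budget holds: worst case, every hop makes you wait for
 * one max-size frame already in flight, then serialize your own. */
static inline uint64_t worst_case_transit_ns(uint64_t mbps, int hops)
{
    uint64_t frame_ns = (uint64_t)MAX_FRAME_OCTETS * 8 * 1000 / mbps;
    return (uint64_t)hops * 2 * frame_ns;
}
```

For a two-channel 48 kHz class A stream, that works out to 48 payload octets per frame and 768 octets of listener buffering, and seven 100 Mbit hops come in at roughly 1.7 ms, inside the 2 ms budget.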
So the transit time is 2 milliseconds in a 100 megabit network over seven hops. If you look at the timing, and you hit a bridge just as a new frame has started transmitting, you have to wait for that one. And even if all of them are 1,500 octets, you are within 2 milliseconds over seven hops. You will be prioritized, but it won't preempt the frame already in flight, hence the reference to the frame preemption standard earlier. So it's just that you will never be more than 2 milliseconds behind; that's the idea. And that also means that you don't have to buffer more than 2 milliseconds at the end station, at the listener. You're not guaranteed to be exactly at 2 milliseconds; you can have lower as well, and you can also set your own max transit time if you want to, but these are the default values. For a gigabit network you will have lower times, but you will still have 2 milliseconds as the max transit time. So to give you a rule of thumb: you need to buffer 2 milliseconds. Yes. And you should probably be careful what else you put onto the network if you really want to push this down. I mean, on an idle network you can get a frame over seven hops pretty quickly. And the 50 milliseconds is so you can also jump over two wireless links. Because wireless, even though you have a managed access point, where the access point will actually tell each client when to transmit, the latency is much greater. So that's the 50 milliseconds: then you have time for two wireless links. Any other questions, by the way? No? All right. So since we now have fairly fast and reliable transmission through the network, we need to agree on what time it is for all the endpoints. And that's where you have the gPTP standard. It's basically a simplification of the standard PTP protocol, so you only have two different kinds of participants, a bridge and an end station.
And it gives you a clock accuracy of one microsecond over seven hops. So if you have a fair seven-hop network, you know that they will agree on the time to within one microsecond, which is enough for sample sync in a 48 kilohertz sample setup. It tracks with nanosecond granularity, and it has some simplifications so you can do a faster clock convergence. And you're only allowed to have a single grandmaster. In PTP, you can have several clock masters at the same time, and that makes the whole grandmaster selection algorithm a lot more complex. They just did away with all of that to make it faster. So as a talker, if I want to send a stream, there's some hardware I need. And one of those things is the credit-based shaper, which is probably the most important requirement you have. What the credit-based shaper does is pace out the frames for you in hardware, so you don't have to do that in software. In theory, you only have to configure the idle slope; there are a few other variables you can look at. But if you look at how it actually works (this I grabbed from Wikipedia, because they made way better images than what I was able to do myself): you have a frame already in transit, started here. This also goes back to the 2 millisecond thing, right, that you hit a frame in transit, but this is on your local system on the way out. So you have to wait until this frame has finished going out. And the moment you have the first AVB or TSN frame ready, you will start building up credits, and the idle slope basically gives you the rate at which the credits accumulate. The moment you start to transmit, the credit will start to decrease, and the send slope is basically just the speed of your link minus the idle slope; it's simply how fast the credit decreases. And at this stage, it had enough credits to start sending another frame immediately, because credit was at or above zero. And it finished sending here.
And even though it has a third frame ready, the credits are in the negative, so it has to wait, and then it can finally transmit. So you will have a small burst in the beginning when you have more queued up due to interference, and then it will pace them out evenly, which looks a lot like what SCHED_DEADLINE does for tasks. And if you want to do this in software, it's going to be a nightmare. So that's the hardware requirement. Then you have the other, almost-requirement. You really do want this in hardware, but you don't necessarily have to have it. And that's timestamping of incoming frames. When you do the PTP sync, you want the PHY and the MAC to actually keep track of the time for you, to reduce the jitter and the uncertainty as much as possible. I saw, I think it was Texas Instruments that had a paper on this, and they managed to get the jitter in software down fairly low, I think 600 to 700 microseconds, while in hardware it's less than a microsecond. So if you don't have support for this in hardware, it's not the end of the world, but you really want it. And on top of this, if you're going to do audio, you probably want to tie the timing circuit on your NIC into your audio clock, so you don't have to do resampling, but that's another story. Yeah. So this is basically how you construct a frame. And what you need to look at is the destination address. If you do the VLAN registration protocol, you don't have to send to a single address; you can do multicast. And that's where it starts to get interesting, because then you can send a single stream and the network will just multiply it out to all the endpoints if you want to. That's one of the really nice features, I think, of doing all this in the digital world: you get a lot of it for free. The other thing is the VLAN tag and the priority code point. This is what you figure out when you talk to the network before you set up the stream.
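The credit mechanics from that slide can be captured in a toy discrete-time model. This is my own sketch, not the i210 register interface: credit climbs at idleSlope while a frame waits, drops at sendSlope while one transmits, and transmission may only start once credit is non-negative. Per the standard, sendSlope is the line rate minus idleSlope; here it is kept as a positive drain rate for simplicity.

```c
/* Toy credit-based shaper model in discrete ticks. Not the hardware
 * interface: on the i210 this is programmed through registers. */

struct cbs {
    long credit;
    long idle_slope;  /* credit gained per tick while a frame waits */
    long send_slope;  /* credit lost per tick while transmitting    */
};

/* Send one frame taking tx_ticks on the wire; returns how many ticks
 * the frame had to wait for the credit to climb back to zero. */
int cbs_send(struct cbs *s, int tx_ticks)
{
    int waited = 0;
    while (s->credit < 0) {                       /* build up credit     */
        s->credit += s->idle_slope;
        waited++;
    }
    s->credit -= (long)tx_ticks * s->send_slope;  /* drain while sending */
    return waited;
}
```

With idleSlope at 25% of line rate, back-to-back frames end up spaced out: the first goes immediately, then each subsequent one waits while the credit recovers, which is exactly the even pacing described above.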
Then you figure out, OK, if I'm a class A stream, you set the priority code point to 3, for instance, and then the network knows that this is a class A stream. You have the standard EtherType, which is 0x22F0. And then you look into the 1722 AVTP data unit. You have the stream ID, which is a 64-bit value, basically the unique identifier for your stream. This is what the network will look at to see: are you allowed to send this? Have you budgeted for sending this? And you also have the timestamp, which is the lower 32 bits of your gPTP time. It's either capture time, this sample was captured at time X, or presentation time, play this audio sample back starting at time Y, which is why it's important to have synchronized time across the network. That allows you to sample from multiple microphones over the network, stitch them all together in software later, and actually get a proper sound image. And then you have some more headers from the FireWire standard. So if this is a class A stream, I think about 30% of the frame is your payload and the rest is headers. Yes. Which comes down to: what do you want? Do you want throughput, or do you want low-latency deterministic traffic? If you want low latency, you are going to pay with more headers, unless you want to do your own proprietary thing, doing away with all the headers on a dedicated network. But if you want to coexist with other traffic, you're not getting away from this, I'm afraid. Even so, you're able to pack quite a few audio channels into this. I highlighted 2-, 8- and 32-channel streams; you talk about a stream having a set of channels, and then you put that stream through the network under its stream ID. I think you're able to push 400 channels over a gigabit link if you really want to. Whether you can process that on the other side is another matter.
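The wire fields just described (the 0x22F0 EtherType, the 64-bit stream ID and the 32-bit timestamp) can be sketched like this. The offsets follow the 1722 common stream header, but the helper is illustrative: it assumes an untagged frame for simplicity and skips the VLAN tag and the rest of the AVTPDU.

```c
#include <stdint.h>

#define ETH_P_TSN 0x22F0  /* the 1722 EtherType */

/* Big-endian stores, since everything here is network byte order. */
static void put_be16(uint8_t *p, uint16_t v) { p[0] = v >> 8; p[1] = v; }
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = v >> 24; p[1] = v >> 16; p[2] = v >> 8; p[3] = v;
}
static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, v >> 32);
    put_be32(p + 4, (uint32_t)v);
}

/* Fill in EtherType, stream ID and the 32-bit AVTP timestamp for an
 * untagged Ethernet frame (a real AVB frame also carries a VLAN tag). */
void avtp_header_fill(uint8_t *eth_frame, uint64_t stream_id, uint64_t gptp_ns)
{
    uint8_t *avtp = eth_frame + 14;          /* after dst, src, EtherType */
    put_be16(eth_frame + 12, ETH_P_TSN);
    put_be64(avtp + 4, stream_id);           /* 64-bit stream ID          */
    put_be32(avtp + 12, (uint32_t)gptp_ns);  /* low 32 bits of gPTP time  */
}
```

Note how the timestamp is just the truncated gPTP clock: as long as everyone agrees on time to within the wrap period, 32 bits are enough to say "play this at time Y".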
The last of the channel counts I highlighted, the 61-channel one, is basically where you fill up the 1,500 octets you have available in the frame. If you go beyond that, you need jumbo frames enabled, and you are not allowed to do that in an AVB domain. So 61 channels seems to be the maximum for this particular encoding; there are other encodings that are a bit more efficient. So we get to the point where we've actually written a kernel driver for this, which has not been upstreamed yet because we're not there yet, but we can play with it. I started doing this two years ago now. That was the original AVB attempt, not TSN, basically doing everything in an ALSA driver. Then I reworked everything and submitted something for a bit of wider review in June this year, and I just rebased it on top of 4.8, and that seems to work OK. And I've set all the registers you need to set on the Intel i210 card, so it can do TSN streams, but you have to do some register fiddling: you need to kick it into Qav mode and set the idle slope and hicredit and a few other things. And I think that works. It's not really stable, but it works well enough to send a simple two-channel audio stream, for instance, over ALSA, so you can mess around with it. And it introduces a few config switches. The driver is open source; I have a link later to where you can find it if you're interested, and it should apply cleanly to the latest stable kernel. OK. So the idea is you have a TSN core that handles pretty much everything, and the interface to user space is configfs. Now, configfs was made to do USB gadget management, but it's a really nice way to let user space kick up new drivers. You just create a new directory, and then it will create a new link for you, and then you just set the attributes using read and write. So it gives you a fairly intuitive way to do this.
And in order to use the core, we introduced something we call shims, basically a thin layer between subsystems. So we have an AVB ALSA shim that positions itself between TSN and the ALSA subsystem, and that allows you to use ALSA, and it will just ship the traffic off to TSN and out onto the network, and the other way around. And I'd like to make a Video4Linux shim and maybe a TSN raw socket shim and so on. It should be fairly easy to just add new shims as we go along. So yeah, we had to introduce some new network hooks, because we need to configure the registers on the network card, and that's not easy to do from user space. So we added a few hooks into the network driver ops: basically, is the network card capable, and configure it, for instance increase the idle slope on the network card. And I've done that for the igb driver, the Intel i210 card. The shim is basically a thin wrapper; you have to fill in a set of functions that the TSN core needs. You need a probe function to basically create the ALSA card, some buffer refill and drain callbacks for when you need to actually move data, and a close. And if you want to do shim-specific headers, for instance, you have some functions for that. These are not all of them; I just grabbed a few. So when we actually send the frames, we have to cheat a little bit, because I can't send a new frame every 125 microseconds, or 250 if I do class B. So what we do is we have a one millisecond timer, and we just batch out a few frames and let the credit-based shaper in the hardware pace them out. So basically it just runs in a loop and ships out four or eight frames: we create the frames, copy some data from the shim's buffer into the frame, and ship them off. And the other way, receiving frames, we hook into the RX handler, which is fairly early in the networking stack.
And if it's a TSN frame with a known stream ID, we just grab the data and then drop the skb, the socket buffer, afterwards, so the rest of the kernel doesn't care about the data. We just lift it out and remove it. And you call into the buffer drain callback when you need the shim to go and do something with the data. And when you actually use this, it's basically just load the modules, and you can have a look at the attributes. These are all the attributes you have available at the moment: you have a buffer size, what class you're using, what kind of end station you are, payload size. The priority code point is basically the default, 3 for class A and 2 for class B; if your network uses something else, you need to fix that here before you start sending frames. All the stream reservation and so on we just leave to user space to figure out, and what you find out, you have to put in here, including the remote MAC if you want to do multicast, for instance. And it will just give you a default stream ID that you can change. And when you configure it, it will be something like this: change the buffer size, change the remote MAC, set the stream ID, set a new VLAN. The default VLAN ID is 2, because that's what the standard says. If you want to do something else, go for it. I tend to use 1, because I have a semi-intelligent switch at home that insists on doing everything on VLAN 1 if I do VLAN-tagged traffic, otherwise it will drop it. Hence the 1. And when you're ready, if you list the ALSA devices you have none; then you say I want an ALSA shim and enable it, and you get an ALSA device, and you can use aplay or Spotify or whatever. At the moment, it is happiest if you do 48 kilohertz two-channel data. It should support 44.1 as well, but I normally work with 48 kilohertz; that's what I've tested. And that's a lot of fun. You can stream data, music, from one machine to another and start and stop music.
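Going back to the shim interface for a moment, here is a hypothetical sketch of what the ops table could look like. The names and signatures are my illustration of the probe, refill, drain and close callbacks described earlier, not the driver's actual API:

```c
#include <stddef.h>
#include <string.h>

struct tsn_link;  /* opaque handle owned by the TSN core */

/* Hypothetical ops table a shim would register with the TSN core. */
struct tsn_shim_ops {
    const char *name;
    /* Create the subsystem-side device, e.g. the ALSA card.          */
    int    (*probe)(struct tsn_link *link);
    /* Core needs data to transmit: fill buf with up to len octets.   */
    size_t (*buffer_refill)(struct tsn_link *link, void *buf, size_t len);
    /* Received octets are ready: hand them over to the subsystem.    */
    size_t (*buffer_drain)(struct tsn_link *link, const void *buf,
                           size_t len);
    /* Tear the device down again.                                    */
    void   (*media_close)(struct tsn_link *link);
};

/* A do-nothing shim, just to show the shape of an instance. */
static int dummy_probe(struct tsn_link *l) { (void)l; return 0; }
static size_t dummy_refill(struct tsn_link *l, void *buf, size_t len)
{
    (void)l;
    memset(buf, 0, len);  /* a real shim would copy audio samples here */
    return len;
}

static const struct tsn_shim_ops dummy_shim = {
    .name          = "dummy",
    .probe         = dummy_probe,
    .buffer_refill = dummy_refill,
};
```

The point of the split is that the TSN core owns the timer, the framing and the network card, while a shim only knows how to move payload between its subsystem and the core's buffer.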
It's lots of fun and giggling until it crashes. Yeah, so, current status. It works on 4.8, which is the latest kernel. I have the device ops for the Intel driver done. Normally I do my testing on a VM and on an i7 with an i210 card. I also have a debug mode, because I need to test this in my VM, which does not have an Intel card capable of doing TSN. And you have the AVB ALSA shim that works. Yeah, and most of the registers for the i210 should be working. And I plan to send out a new version, a v2 revision, soon. And then we have a small backlog: some details, some extra shims, actually doing the user space TSN control, doing the MSRP, the MVRP, all the bits and pieces to actually set up the stream through the network. I've left that to user space and ignored it so far. I need to redo the buffer management if I want to do a Video4Linux driver. And there are also some issues: if I want to have more than one stream, I need to sync the timestamps, so that requires some more attention, and also how I do the whole network part. So that's pretty much it. If anyone has questions, go for it. Sorry, 802.1AS, whether we're supporting it? Well, again, we sort of just leave that to user space at the moment. So if you have a gPTP daemon running: Open-AVB has a gPTP daemon, and it will log the timestamps to a shared memory segment. At the moment, the driver will just grab the timestamp from the system clock, so you need to adjust for that. That's one of the things that needs a proper rework, because it works-ish, but it's not perfect. So yeah, there's a lot of work remaining. This is not a stable driver, but it's at the stage where you can start to play with it and enjoy your machine blowing up in various interesting ways. Yeah. Yeah, last time I tested, I did an aplay from one machine, and a record piped to aplay and the local sound card on the other, and that worked. You will get glitches, because I'm not doing any resampling whatsoever.
But once we get that sorted, you should be able to do that without them. You will also get buffering left and right, though. Yeah, but yes, it should work. Yeah. Yes. Yes, yeah. Yes. Yes, there are other cards. The nice thing about the Intel card is that you can get it as a PCIe card, so you can just slot it into your machine. There is more hardware coming out that will support AVB, and does support it in hardware, but normally there are vendor-specific, user-space drivers. There's not much work going on in the kernel, because they see this as a customer value-add thing. And I get that. I think it's the wrong approach, but yes, there is more hardware. It's been a slow start, because a certain big networking company did not support TSN for a long while, and then Cisco started doing that. So Cisco was sort of holding it back, but it's moving a lot faster now than a year ago. So there's more hardware available if you're interested. But there's no upstream support for this in the kernel, for instance; there are a few bits and pieces that need to be sorted out. Yes? Yeah, the i350: it has enough transmit queues, it has eight, I think, but it does not have the credit-based shaper. The i210 has four transmit queues, so you can take two of them and use them for class A and class B, and you can take a third for PTP if you want to, and the last for best effort. So having more than one transmit queue is important, but most importantly, it's the credit-based shaper. And as far as I know, the only card from Intel that supports the credit-based shaper is the i210. Yeah, everybody happy? Yes? So you don't have the EEPROM at all? It just boots into default mode? I think the i210 needs to start up from some sort of firmware image, and I don't know, I would assume that you would need some support, but exactly what needs to be included in the base image, I don't know. You could try to read out the registers.
If you look at the i210 driver I did, you can see what registers I'm poking and you can just try to set them and read them back. And if you get the correct value, then it's supposed to work. Yeah, yeah. No more questions? Okay, thank you. Thank you.