Alright, thank you for coming. This is a talk about real-time networking for real-time systems, and as a note, this is about cases where the network itself is part of the real-time system, not just a real-time system that happens to use a network, so keep that in mind as we go along. My name is Henrik; I work with video conferencing at Cisco in Norway. We make big video conferencing systems, and I'm part of the audio group, so we have quite strict real-time requirements for the processing that we do, which is why I do most of my work in real-time tweaking and tuning of the systems, making sure that we meet all the deadlines. This has steered me into looking a lot at traces, which generates a lot of data, so I have scripts for everything, so many scripts that I can't find half of them anymore. And for the last couple of years I've been working with AVB and TSN on and off, and that is what this talk is all about.

So first, real-time systems. A real-time system is a system where the correctness of the result depends not only on the logical result, but also on the time at which the result is produced. Meaning that it doesn't matter if you plan the perfect trajectory for your robotic arm: if you're two seconds late, you run into something. And this also maps into the audio domain that I work in. Another aspect is the consequences of a failure. When a normal system fails, the user of the spreadsheet gets annoyed. In a real-time system, the user gets dead, and that's kind of bad. Now, we don't typically kill our users in our telepresence systems, but that's the extreme end of real-time systems, which is why you also have strict requirements for the network, as we'll get into.

Leading on from this, it means that you need to certify your system to make sure that you can meet all the real-time requirements. Even for simple single-core systems this is actually quite difficult, because you need to take all the small details into consideration: do you have virtual memory, for instance, or buffers? All the small details matter. And as you add more cores it becomes increasingly difficult; multi-core real-time systems are a conference topic in themselves, so I'm not going to go into detail there. Then some bright chap decided to put a bunch of different cores on the same chip, which made it worse. But we're going to do even better than that, because we're going to add a network to all of this and see why that is an interesting problem.

Because when you do that, you open yourself up to a new set of challenges. One of them is time. It doesn't matter if you produce your result at the correct time if you don't know what time it is. And no two clocks will ever run at the same rate: the same software stack will never run at the same rate on two machines, the temperature will do something to your clocks, so will the voltage, and the network adds to this complexity. You have the Network Time Protocol, which you can tune to sub-millisecond accuracy if you do everything right, and I mean absolutely everything. And then you have the Precision Time Protocol, which pretty much solves this problem for you. I bring this up now because it is really important for the next steps we're going to talk about. You do need hardware support for PTP to have it work properly with decent accuracy, but we're not going to go into the details of the Precision Time Protocol. The next challenge is a reliable packet-switched network.
And by reliable I don't mean that the switches don't go down on you and die; I mean real-time reliable: the frames will actually go through the network deterministically, you won't lose frames, and they will be there on time. Because it is perfectly acceptable for a normal network to just drop your frames: if you have collisions, or if the buffers fill up, the switches will simply drop frames. You also have the problem of being queued up in an outgoing buffer; even though you have a high priority and need to get through, the switch will probably just process the entire buffer before it gets to you. You can improve this by using the priority field in the VLAN tag, but then you'll still have jumbo frames or other frames already in flight.

One of the problems you're faced with is that you cannot express to the network, in any decent way, "I want my frames to arrive no later than, say, 2 milliseconds after I send them". You send them and they will hopefully be delivered, without being dropped, whenever. For a real-time system, that's not good enough. You also cannot express to the network card "I want this frame to be sent at that time"; you have no guarantee that it will be sent at that time. If you're lucky, it will not be sent too much later; perhaps it will even be sent sooner. There is no way for you to say. It's also really difficult to get an exact bandwidth through the network. I know there are ways to reserve bandwidth through various means, but getting a really accurate bandwidth, and never losing frames once you have been granted that bandwidth, is actually quite difficult. This is what AVB and TSN, and ultimately DetNet, try to solve for you. And of course you have all the other challenges you'll face when you build a distributed real-time system over a network; those are your problem, not mine. I'll try to fix the network for you, and you can deal with the rest.

So, AVB and TSN. Out of curiosity, how many of you have used AVB, or know what AVB and TSN are? Ah, wow, cool, a fair amount of you actually. Good. The initial idea was that ADCs and DACs are getting cheap, MACs are getting cheap, so why not move the digitization as close to the source as possible? You solve a lot of problems doing that: you don't have to worry about SNR and ground loops in big systems, especially for large AV setups. But if you want to use a network, you need a reliable network; best effort won't help you. If you can do that, let's just assume you can get the network going, then you have a very flexible setup. It's easy to route flows from a single microphone to several sinks, for instance: you can send it to a mixing desk, to storage, and to a monitor for local feedback, without having to drag extra cables. It also gives you very high capacity on a single cable. As you may have noticed, I do audio, so I think in audio terms, and capacity is basically never an issue for me: I can put a lot of audio through one cable, which is good. The interesting problem we're faced with at Cisco, where I work, is that we're actually limited by the physical dimensions of the backplane. In the first image of the backplane of the codec, you saw a lot of connectors. We're getting to the point where it's not the internal processing that limits what we can do, but the number of microphones we can physically connect to the system. You can pull tricks by attaching external mixers and hooking into USB and so on, but that's another issue.
We would like to have control over all the sources we get audio from, and using AVB we can expand this greatly; then we're back to the CPU being the bottleneck again. Another benefit, of course, is that if you have reliably reserved bandwidth, you can coexist with best-effort traffic, so you don't have to run two networks.

Initially, they started making simple microphones and speakers, really small, simple units, and there's no point in implementing an entire IP stack on units like that, so they just send raw Ethernet frames. They specified a profile for PTP (gPTP) so they could have accurate timestamping; it makes some assumptions about the network so the clocks converge a bit faster and a new grandmaster is picked faster. You use the Stream Reservation Protocol (SRP) to tell the network "I want to reserve bandwidth". And they also made a sort of plug-and-play protocol for discovery and enumeration of AVB units, which means that on an AVB subnet it's easy to connect a microphone to a speaker, which makes sense in an AV setup. The downside of that, of course, is security: you don't really care who connects a microphone to a speaker, because you run a subnet in a professional media setup, and no one outside is going to connect to that subnet, you hope. The focus is reliable, low-jitter streams, plus a way for the admins to connect the dots.

To get the reliable stream they use the credit-based shaper, and the main thing you need to know about the credit-based shaper is the idle slope, which is how fast you recharge your credits. That basically determines how fast you can send out frames: you send a frame and you spend credits; while you're below zero you can't send a new one, and the moment you are at or above zero you can send a frame, provided another frame is not already in transit. It's a fairly elegant way of pacing frames out onto the network; a small code sketch of this mechanism follows below.

Once they got this working, people started using it for all sorts of crazy things. Suddenly people were using AVB to send sensor data across the network and to control robotic arms, and it became pretty obvious that AVB was not living up to its full potential. So they renamed it Time-Sensitive Networking in 2012 and expanded the scope of the project. Not only did they target professional audio and video; they added consumer gear, because things are getting cheaper all the time, and also infotainment systems in cars and trains, message boards, all sorts of places where you just want to send a stream and use multicast to send it out simultaneously. Automotive is another interesting use case, where they wanted to use this for control systems: for ABS, for engine control, for engine monitoring, places where you really don't want to lose frames. TSN is what you need if you want guarantees like that over a packet-switched network. And of course you have the industrial applications, where you want to converge the operational network with the IT network to get the whole Industry 4.0 thing. There, TSN is not the only solution, but it is a crucial building block: once you have that underlying infrastructure, you can do all the other cool things on top. And this really shifted the focus away from giving you a reliable, low-jitter media stream, towards a way to send frames predictably through the network. Not necessarily at a constant rate.
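Before moving on, to make the credit-based shaper mechanics a bit more concrete, here is a minimal sketch of the credit accounting in C. This is an illustration of the idea only, not the exact 802.1Qav state machine; the structure and function names are mine, and the gating rule is simplified.

```c
#include <stdbool.h>

/* Simplified credit-based shaper state (illustrative, not 802.1Qav-exact).
 * Credits are in bits; slopes are in bits per second. */
struct cbs {
    double credit;      /* current credit, in bits */
    double idle_slope;  /* recharge rate, e.g. 20e6 for a 20 Mbit/s reservation */
    double send_slope;  /* drain rate while sending: idle_slope - port_rate */
};

/* Credits recharge at idle_slope while a frame is queued and waiting. */
static void cbs_wait(struct cbs *s, double seconds)
{
    s->credit += s->idle_slope * seconds;
}

/* A frame may start transmission once credit is at or above zero
 * and no other frame is currently in transit. */
static bool cbs_can_send(const struct cbs *s, bool port_busy)
{
    return s->credit >= 0.0 && !port_busy;
}

/* Transmitting a frame of `bits` at `port_rate` bit/s drains credit
 * at send_slope for the duration of the transmission. */
static void cbs_send(struct cbs *s, double bits, double port_rate)
{
    s->credit += s->send_slope * (bits / port_rate);
}
```

With an idle slope of 20 Mbit/s on a gigabit port, the send slope is 20e6 - 1e9 = -980e6 bit/s, so a frame costs far more credit to send than it earns while waiting, and that is exactly what paces the stream down to its reserved bandwidth.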
It's more important to be able to send a frame exactly when you want to, and to make sure it is actually sent and propagated through the network in that time slot. If you want to replace industrial control protocols, this is really important. So they did a lot of work on time-triggered transmission, frame preemption, guard windows and all these things, and also on growing to larger networks; industrial networks will be quite a bit bigger than the average professional media network. And as you grow, you also need to look into path redundancy. By path redundancy I don't mean rerouting: if you discover that you have lost a node in the network and then reroute the flow, it's already too late. You need to split or duplicate the flow across several paths, so if one path goes down it doesn't matter, because copies of the frames are going through another path. You get immediate recovery and don't lose any frames.

This was all good, but why stop there? So they formed a working group in the IETF to grow this into wide-area networks: DetNet. For the data plane you can do the pseudowire thing, or there are IPv6 mechanisms you can use. But DetNet is more a set of recommendations for what a deterministic network should look like; it's not hard requirements of the kind the IEEE sets forth. It expects that the end stations can sit in TSN subnets, if you want to adhere to that, and then you specify the parameters you want across the DetNet network. And of course multi-path routing is even more important in such networks. They also did away with the everyone-can-configure-everything approach; I believe it's actually a requirement that you have a central controller, or a set of specified controllers, and those are the only ones who can set up and tear down paths. Because if you're connecting to your wind turbines, you don't want just anyone to be able to connect to those turbines and do stuff. So security is much more important in DetNet than it was in AVB.

As for motivation: broadcasters would like to use this for TV, if you want to stream to all your home users. People are using, or are interested in using, this for blockchains; not the public Bitcoin thing, but large private blockchains for logging or whatever, where something like DetNet is interesting because the consensus process can go faster. Building automation systems. And one of the really active participants is the utilities, because they want the smart meters and the smart grid connected to the power plants, to keep the grid frequency right and balance the load; everyone in that space is really interested in this, and there security matters even more, of course. But it still has basically the same requirements. You need a low packet-loss ratio; you can never say zero, but they claim you should be able to get it really low. Low end-to-end latency, of course. And perhaps even more important for DetNet is the ability to coexist with best-effort traffic, because digging new cables is really expensive; if you can reuse the same cables for this, it will save you a lot of money, and that's important.
Now, the important part of the talk: what's happening in the Linux kernel. Last year I gave a talk about a very hackish module I wrote that exposes an ALSA interface, so you can use it to connect to the network and play audio. It was initially meant as a proof of concept, and it was really cool; I saw a lot of potential for it at work in video conferencing, where we could just pop up a new ALSA device, connect to the network, and everything would be peachy. Getting there was a bit of a hack, though: I did away with all the timing, I hooked directly into the hardware queue on the network card, and it broke several times in interesting ways. But the main problem with that approach was that it called itself TSN while it was basically classic AVB: it did, or attempted to do, a reliable media stream. It had no support for sporadic frames, for instance, and that is really important if you want to support TSN in the kernel.

Recently, things have been happening quite fast. Before we move on: I tend to mention the i210, a network card from Intel, and the reason is that it's the only network card I know of in a PCIe form factor. There are several other NICs available from other vendors, but they're typically bundled on evaluation boards or made for embedded systems. So if you want something in your computer to test this with, the i210 is pretty much your only choice.

About a month ago, a set of patches appeared from Richard Cochran, who is the PTP maintainer. He had taken the time-triggered approach and used the i210 to send frames at a specific time. The i210 has two modes, a CBS mode and a launch-time mode, and these patches use the latter. He created a new socket option, SO_TXTIME, which gives you the ability to send a frame at a specific time; really useful if you want to do industrial applications, for instance. The setup is pretty straightforward: you create a socket and set the SO_TXTIME socket option. You have to do a bit more work when you assemble the frame: you attach a cmsg saying "this is the transmit time", then you send it off, and the network card adds it to its queue and sends it at the appropriate time. A sketch of this flow follows below.

This is a work in progress, so there are a few limitations. For instance, if you specify a time more than half a second into the future, it collides with how the launch time is implemented on the i210: the time may be interpreted as being in the past, which then wraps into the future, and everything gets interesting. It also means that if you send frames out of order, it won't sort them for you, because you should be really careful about reordering descriptors handed to the network card. So you have to do a bit more work and be a bit more careful. This will probably improve greatly in the next set of patches; Richard is aware of all these limitations.

He also ran a test, which was quite impressive: he took two machines and connected them with a crossover cable, to avoid interference from the network itself, running PREEMPT_RT tuned to an impressive degree, using the i210 card of course. He then looked at, if I send frames at predictable rates, what is the offset from when I expect them to arrive? And he did this with a pure software implementation and with hardware offload as well, to show the difference.
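Before the results, here is a rough illustration of the SO_TXTIME flow described above. The patches were still in flux at the time, so take the exact names (struct sock_txtime, SCM_TXTIME, CLOCK_TAI as the reference clock) as the interface being proposed rather than a settled API; error handling is omitted.

```c
#include <linux/net_tstamp.h>   /* struct sock_txtime */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

/* Arm the socket: per-packet transmit times will refer to this clock. */
int enable_txtime(int fd)
{
    struct sock_txtime cfg = { .clockid = CLOCK_TAI, .flags = 0 };
    return setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg));
}

/* Send one packet with an absolute launch time (in nanoseconds), attached
 * as a control message so the driver can program the NIC. */
ssize_t send_at(int fd, const void *buf, size_t len,
                const struct sockaddr *to, socklen_t tolen,
                uint64_t txtime_ns)
{
    char cbuf[CMSG_SPACE(sizeof(txtime_ns))] = { 0 };
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg = {
        .msg_name    = (void *)to,  .msg_namelen    = tolen,
        .msg_iov     = &iov,        .msg_iovlen     = 1,
        .msg_control = cbuf,        .msg_controllen = sizeof(cbuf),
    };

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_TXTIME;
    cm->cmsg_len   = CMSG_LEN(sizeof(txtime_ns));
    memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));

    return sendmsg(fd, &msg, 0);
}
```

Keep the caveats above in mind with something like this: on the i210 the launch time has to be within roughly half a second of now, and frames must be handed over in launch-time order.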
Looking at the first column of his results, you have the plain approach, without the hardware offload, and you have a difference of at most 75 microseconds, which in itself is pretty good. For hard real-time requirements it's nowhere near what you need, but even so, it's pretty good. The problem is the peak-to-peak variance, meaning how much the frames vary in spacing relative to each other, which is quite large. But if you look at the next column, where he offloaded this to the network card, you see that the peak-to-peak has dropped to 100 nanoseconds, which is really good. So I'm a bit excited about this. It shows that the network card is able to deliver what we need, and that it's possible to write a driver that really works. He also ran it with a period of 250 microseconds, which is the AVB class B frame interval, so you can do proper class B with this card using this mechanism as well. That's good.

At about the same time, another set of patches dropped, from Intel. They have spun through several iterations and it's looking very good. They implement the credit-based shaper approach, basically giving you a steady cadence of outgoing frames. They also provide a proper qdisc, so it's easy to use and you can configure it through traffic control (tc), and it hooks into the multiqueue priority qdisc, mqprio. I'm not going to cover qdiscs in great detail, so if you're unfamiliar with them, you have a lot of reading to do. Enjoy. But basically, you give your socket a priority and start sending frames to it, and if you have configured the CBS and MQPRIO schedulers correctly, it just works.

To set this up, you have a rather nasty-looking traffic control command; it's really not that ugly once you deconstruct it, as shown below. You create four queues and provide a map: priority 3 maps to queue 0, priority 2 to queue 1, and everything else, including priorities 0 and 1, to queues 2 and 3. These are the default priorities set in the AVB standard: if you want to do class A, that is priority 3 by default, unless the network has specified something else, and priority 2 is class B. Then you list the queues; hardware queues 0 and 1 are the ones with the CBS shaper and the time-triggered launch mechanism. Then you can attach the CBS scheduler to one of these queues, here queue 0, and give it an idle slope, which is basically your bandwidth. Here you ask for 20 megabits, which means that this scheduler will not send more than 20 megabits per second out onto the network. They did not include a test like Richard's. And this is how you use it: you create the socket the standard way and just set the socket option SO_PRIORITY, which is already present in the kernel, and give it a priority, here 3. Then all the frames you send on this socket are fed to the CBS scheduler and paced out onto the network; a second sketch below shows this.

So I decided to give this a test. My test is not up to Richard's standard, so there's a lot to improve here, but it was just to get a feel for whether it works and how well. I had an old, crappy computer at home with an i210 card, going through a couple of switches up to my workstation. I tried to improve a few things by setting priorities and locking things down, but I'm not using PREEMPT_RT, I'm traversing a real network rather than a crossover cable, and there's other stuff running on these computers.
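Before my results, here is that traffic control command deconstructed. This is a sketch, assuming a gigabit link and an interface called eth0; the credit parameters (sendslope, hicredit, locredit) follow the examples that circulated with the patches rather than anything shown in the talk.

```sh
# Map priorities to traffic classes, and traffic classes to the four i210
# queues: priority 3 (class A) -> TC0 -> queue 0, priority 2 (class B)
# -> TC1 -> queue 1, everything else -> TC2 -> queues 2 and 3.
tc qdisc replace dev eth0 parent root handle 100 mqprio \
    num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 hw 0

# Attach the CBS shaper to queue 0 (mqprio class 100:1) with a 20 Mbit/s
# idle slope; values are in kbit/s, so on a 1 Gbit/s link the send slope
# is 20000 - 1000000 = -980000.
tc qdisc replace dev eth0 parent 100:1 cbs \
    idleslope 20000 sendslope -980000 hicredit 30 locredit -1470 offload 1
```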
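The talker side then needs nothing beyond standard socket calls plus SO_PRIORITY, which has long been in the kernel. A minimal sketch; the UDP socket type is my assumption, and error handling is omitted.

```c
#include <sys/socket.h>

/* Create a socket whose traffic is steered into the class A queue. */
int make_class_a_talker(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    /* Priority 3 is the default AVB class A priority; with the mqprio map
     * above, frames from this socket land in hardware queue 0, where the
     * CBS shaper paces them out at the reserved 20 Mbit/s. */
    int prio = 3;
    setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio));
    return fd;
}
```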
As for my test results: they're not really clean, but hopefully they give you an indication of how this works. I ran with and without load on the system, the load being basically a parallel make running alongside, just to see how that affected the talker. The first two columns are with hardware offload. You see a few outliers at both ends, and they are typically paired: when you see one up at 1,000, there is probably one near zero to match, because two frames went back to back. But all in all the spread was fairly good. It was centred around 500 microseconds, 514 for the hardware offload, a bit lower for software for some reason; I'm still not quite sure why there's a difference there. And the standard deviation was fairly low. If you look at the software approach, you see that without load it's not bad, but it's not perfect either, and when I added load it got really bad. The good thing, though, is that I got a constant 20 megabit outgoing bandwidth, and the frames were pretty evenly spaced, even though I had crappy hardware to test on. And the load on the system was low; I won't claim it cost zero cycles, but it was fairly low, which means that using this in an embedded system, for instance, is really desirable. So I'm eager to test this in one of our products. Finally, I just plotted all the samples. In the lower right corner you can see the software case with load; can you spot where I read from disk? You see a few spikes, so this is basically not a real-time system. But all in all it was fairly stable.

As for the future: I need to redo the evaluation, do proper PREEMPT_RT, and fix all the mistakes I didn't have time to fix before the conference; this arrived a month ago, so I've been happily playing with it. I would also like to do a CBS-like qdisc based on Richard's transmit time, because then you can really compare apples to apples. I hope both get merged into the kernel. Richard's patches require a bit more work; the Intel set I think will be merged, it's looking really good. And as I said, the load on the systems was next to nothing, but I need to do a proper load measurement of that. And of course I want to hook my old ALSA shim TSN driver into this and see if it works.

So yes, that was my talk and my journey through the latest patches for TSN. It's looking really good: if anyone wants to use TSN with the Linux kernel, you now have patches that work really well, and they will hopefully be merged soon. So thank you. Questions?

Yes. I can take the microphone over, or I can just repeat the question, whatever is easiest; just ask and I'll repeat. The question is whether there is a software fallback for the txtime as well, in the patches so far. No, not at the moment, and that is one of the things that will probably hold the merge back, because the maintainers are really strict about having both a software approach and a hardware approach when you do hardware offloading. There are a few other things that need to be worked out as well, but currently txtime only works with the i210 driver. Yes: the i210 hardware has dedicated support both for CBS, a continuous cadence of outgoing frames, and for launch time, so you can specify exactly when a frame should be sent; you have basically a half-second window within which you can specify it. And it's tied into the PTP hardware as well.
So once you synchronize through PTP, the time on the network card will be the PTP time. Yes, the microphone is approaching. How about support for other network cards: is this emerging, or is this the only one for now? Currently this is only for the i210 card, and that's basically because it's the easiest card to test with. But there are other vendors that have this, especially the launch capability: NXP has one, Synopsys is working on one, and Renesas has one; yes, the Renesas H3 has it, so Renesas will be there. So more support is coming, but until now there hasn't been a central framework for doing this, so it's been a bit of a chicken-and-egg problem. The hardware support exists, but currently it's vendor-specific: you can use their proprietary stacks. We hope to move this into the kernel, so it will hopefully grow a lot more.

Yes, all the way in the back. Sorry, I didn't catch that; what are the limits with multiple streams? Currently the Intel approach, the CBS, does not care about streams, it only paces out frames. Neither of these approaches actually looks into the packets: the SO_TXTIME path just looks at the attached cmsg for when to send the frame, and it's up to user space, or some other kernel driver, to multiplex several streams. So if you have several streams, you need to do some juggling in user space.

What a qdisc is? A qdisc is basically a software queue, or a scheduler, for outgoing traffic, and there are several different qdiscs that can do different things. The mqprio qdisc that you tie into here basically just uses priorities to map traffic to a hardware queue, so you can steer traffic to a specific queue, and then you attach the credit-based shaper to that queue because we have hardware available for it. So a qdisc is basically just a scheduler for packets, attached to a queue. I'm not sure if that answered your question. You wanted to ask another one? Okay.

The question: he's doing embedded software, not related to voice or anything, for satellites, using multiple queues, with constant-time queues that currently run at high priority. Can he make sure that Linux will transmit to the hardware, or to the interrupt, with a constant delay, so he can use that instead of a priority, which doesn't guarantee a constant delay? Yes: this gives you the hardware support, and the support in the kernel, for setting a specific time for transmitting a packet. But you still need to construct your frame properly if you want it to traverse the network; both of these approaches stop caring about AVB or TSN once the frame has left the port. But can you still open a socket with the right attributes to get a constant delay, or is that mixing things up? Are you thinking of going through the network, or just transmitting from your application to the network card? From the Linux kernel to the driver, which forwards into the hardware. Yes, then you can now set a specific time for transmitting, and the network card will send it at that time. Okay, let's talk afterwards. Yeah, that's fine. Thank you.

Any other questions? Okay. Everybody happy? Okay. Thank you. Thank you.