Good morning and good afternoon, everyone. My name is Wang Yi. I am a cloud software engineer at Intel, and I have been working on the StarlingX project for more than two years. Today the topic I am going to share with you is time-sensitive networking (TSN) enabling on StarlingX. Okay, let's start my sharing.

Here is the agenda. Firstly, I will brief the concept of time-sensitive networking; through the introduction, you will have a basic understanding of what TSN is. Secondly, I will introduce how we enabled TSN on the StarlingX platform. Next, I will share the details about how we verified the TSN functionality we enabled. In the last part, I am going to discuss the future work for TSN on StarlingX.

TSN is not a single technology. It is a set of standards developed by the IEEE 802.1 working group. The TSN standards define some new functions for Ethernet networking, such as traffic shaping, frame preemption, traffic scheduling, ingress policing, seamless redundancy, and so on. These new functions provide a whole new layer of control for managing Ethernet traffic. For time-sensitive traffic, TSN can guarantee determinism with bounded low latency, low jitter, and extremely low packet loss. Furthermore, TSN allows non-time-sensitive traffic to be carried through the same network. That means critical and non-critical traffic can coexist in the same network, and TSN can still guarantee the timely delivery of the critical traffic.

The TSN standards were designed for four aspects respectively: synchronization, latency, reliability, and resource management. Synchronization means all nodes in a TSN network should share a common understanding of time; then the behavior of the nodes can be coordinated and scheduled on the same time basis. IEEE 802.1AS was defined for this purpose. Roughly speaking, the 802.1AS standard is a subset of the IEEE 1588 standard. The 1588 standard is also called the Precision Time Protocol, abbreviated PTP. With PTP, it is possible to synchronize distributed clocks with an accuracy of less than 1 microsecond.
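To make the PTP part concrete, here is a rough sketch of what clock synchronization looks like with the open-source linuxptp tools that come up later in this talk. The interface name and the config file location are illustrative assumptions, not details from the talk:

```shell
# Illustrative interface name; substitute the TSN-capable NIC port.
IFACE=enp2s0

# Synchronize the NIC's PTP hardware clock (PHC) to the network's
# grandmaster clock using the gPTP (802.1AS) profile. The gPTP.cfg
# example config ships with linuxptp; its install path varies by distro.
sudo ptp4l -i "$IFACE" -f gPTP.cfg -m

# In a second shell, discipline the system clock from the synchronized
# PHC once ptp4l reports a locked state (-w waits for that).
sudo phc2sys -s "$IFACE" -c CLOCK_REALTIME -w -m
```

Both commands run as daemons; `-m` prints the offset statistics to the console so you can watch the clocks converge.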
The latency aspect is about deterministic and prioritized packet transmission. TSN has defined a couple of standards for that, such as 802.1Qav, 802.1Qbu, 802.1Qbv, and 802.1Qch. Reliability is for robust transmission; there are three standards defined here: IEEE 802.1CB, 802.1Qca, and 802.1Qci. The last aspect is resource management, which is for consistent network configuration: TSN has defined a few standards to achieve automatic configuration of a whole TSN network.

Real-time transmission services are required by many industries, such as industrial automation, transportation, automotive, and so on. The Industrial Internet of Things is one of the most important user scenarios for StarlingX. That is why we need to enable TSN on the StarlingX platform.

On this page, I would like to explain a little bit more about the 802.1Qbv standard, which is one of the TSN standards we enabled on StarlingX. The idea of 802.1Qbv is that there are several traffic queues, and packets to be transmitted fall into one of them according to their QoS priority. Each queue has a gate, and a control module called the Time-Aware Shaper can open and close these gates at specific points in time. So the Time-Aware Shaper can close the gates of the non-critical traffic queues while critical traffic is being transmitted. In this way, the delivery of critical traffic can be guaranteed.

Linux already has a lot of support for TSN. In user space, there are some utilities. The iproute2 package includes a set of tools; for example, the utility ip can be used to create and configure Linux VLAN network devices, and tc can be used to configure qdiscs. ethtool is a utility to configure the behavior of network interface cards. linuxptp is an open-source project; it is an implementation of the PTP protocol, and we can use it for time synchronization on Linux platforms. In Linux kernel space, there are software components called qdiscs, that is, queueing disciplines. From the software stack's perspective, a qdisc is a lower layer in the network stack that is used to control traffic.
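To give a feel for how these user-space tools express a Qbv-style gate schedule, here is a hedged sketch using the taprio qdisc (introduced in a moment) with an ETF child in txtime-assist mode. The interface name, priority map, and gate times are invented for illustration and are not the exact configuration from this work:

```shell
# 1 ms cycle split into three gate windows (times in nanoseconds).
# Traffic class 0 (highest priority) gets a 300 us exclusive window.
sudo tc qdisc replace dev enp2s0 parent root handle 100 taprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 2@2 \
    base-time 1000000000 \
    sched-entry S 01 300000 \
    sched-entry S 02 300000 \
    sched-entry S 04 400000 \
    flags 0x1 txtime-delay 200000 \
    clockid CLOCK_TAI

# Attach an ETF qdisc to the first hardware queue so that packets
# carrying a transmission timestamp use the NIC's LaunchTime feature.
sudo tc qdisc replace dev enp2s0 parent 100:1 etf \
    clockid CLOCK_TAI delta 200000 offload skip_sock_check
```

The `map` line assigns the sixteen packet priorities to traffic classes, and each `sched-entry` opens the gates named by its bitmask for the given interval; the three intervals together form the repeating cycle.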
Qdiscs can prioritize, schedule, and shape traffic, among other things. Linux has already implemented a couple of qdiscs, such as taprio, mqprio, CBS, and ETF, which implement different TSN standards. taprio is the Time-Aware Priority Shaper; it implements the 802.1Qbv standard. ETF is Earliest TxTime First; it is based on a feature called LaunchTime that is provided by a few network interface cards. The idea is that we can pre-compute the expected transmission time for each packet, and the hardware will then follow that configuration and send the packets at the specified times. In this way, we can achieve better determinism. igb and igc are Intel network interface card drivers; they have TSN support.

Here is the TSN work we have done on StarlingX. We enabled two TSN standards, 802.1AS and 802.1Qbv, in StarlingX Kata containers. Regarding hardware, we chose the Intel i210 network interface card because it has native hardware support for TSN. For software, we used the StarlingX 4.0 release; of course, the latest master version also works. We chose the Kata container as our first step to enable TSN on StarlingX, rather than a generic container. The reason is that when we started this work, StarlingX was using CentOS 7.6, whose kernel version, 3.10, is kind of low; it doesn't have some TSN features we require. A Kata container is more like a virtual machine: it has its own kernel image. This gave us the chance to customize a kernel image with the required TSN features. For the kernel image, we chose Ubuntu 20.04 as the base image, and we installed the required TSN utilities into it. We chose Ubuntu 20.04 rather than a lower version such as 18.04 in order to get the latest TSN utilities, which support the TSN features we require; 18.04 doesn't meet our requirements.

To enable TSN in Kata containers, here is what we have done on the StarlingX platform. We made a customized Linux kernel with TSN support for Kata containers. We also built a container image with the TSN stack. We figured out a solution to pass through the i210 network interface card into Kata
containers. We also figured out a solution for time synchronization in Kata containers. The last item is that we enabled the taprio and ETF qdiscs in Kata containers. Due to the time limitation, I don't have time to go deep into each item, so if you are hoping to learn more details, you can refer to the document in the StarlingX community; below is the link.

In order to verify the TSN functionality we enabled on the StarlingX platform, we set up an experimental TSN network. As shown in the picture, the TSN network consists of one TSN switch and four nodes. The TSN switch is a generic PC with a TSN switch card. The TSN switch card is a PCIe card with four 1-gigabit Ethernet ports; it is based on an Intel FPGA, and it can support 802.1AS and 802.1Qbv. The four nodes are all Intel NUCs (Hades Canyon), and each has one Intel i210 NIC. One of the four nodes was installed with StarlingX in the simplex configuration; the other three nodes were installed with Ubuntu.

Here is the software architecture for the TSN verification. The left part is the StarlingX node. On the StarlingX platform, we created a Kata container using the container image with the TSN stack we had built. Then we passed through the Intel i210 NIC of the host into the Kata container. In the Kata container, we did time synchronization and enabled the two qdiscs, taprio and ETF, by utilizing those TSN utilities. A small program, the TSN sender, was running in the container; it periodically sends a packet to the Ubuntu node on the right side. On Ubuntu, we did time synchronization too, and a TSN receiver program was running on the Ubuntu platform. It received the packets from the Kata container and dumped the time information for later analysis. Since the Ubuntu node is the receiver side, we don't need to enable qdiscs there; doing time synchronization is enough for the test.

The time synchronization across the whole TSN network is a little complex; we used five steps to achieve it. In the first step, we synchronized the system clocks of the TSN switch and the StarlingX node with an external clock via the NTP protocol.
In the second step, we synchronized the PTP clock of the TSN switch with its system clock. In the third step, we synchronized the PTP clocks of node 1 and node 2 with the PTP clock of the TSN switch. In the fourth step, we synchronized the system clock of node 2 with its PTP clock. In the last step, we synchronized the system clock of the Kata container with its PTP clock. Through these five steps, the three components, that is, the TSN switch, node 1, and node 2, are on the same time basis. We didn't perform time synchronization on node 3 and node 4, because in our test we didn't use them for critical traffic transmission.

We have three performance indicators to measure the TSN performance; they are meant to prove that TSN is functional on the StarlingX platform. The top figure shows one complete period. Interval start is the time when the interval starts, that is, when the sender program wakes up from its sleep state. Tx program is the time when the sender program calls the socket function to send a packet. Tx hardware launch time represents the time we preset for when a packet should be sent by the hardware. Rx hardware is the time when the NIC on the receiving side receives the packet. Rx program is the time when the receiver program receives the packet through the socket function in user space. All the time information was collected by the test programs, the TSN sender and the TSN receiver. Based on them, we can calculate the three indicators. The first one is the scheduled time: the interval between the interval start and Rx hardware. It can indicate the determinism that TSN technology brings to the system. The RT application latency is the interval between the interval start and Tx program; it indicates the real-time performance of the OS itself and doesn't depend on TSN technology. The TSN network jitter is the jitter of the scheduled times; it shows the variance of the scheduled times and is also a determinism indicator of TSN performance.
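Once the sender and receiver have dumped their timestamps, the three indicators can be computed offline. Below is a small self-contained sketch with made-up nanosecond timestamps (columns: interval start, Tx program, Rx hardware); the log format and file names are assumptions for illustration:

```shell
# Made-up sample data: one packet per line, nanosecond timestamps in the
# order: interval_start  tx_program  rx_hardware
printf '%s\n' \
  '0       3000    1253000' \
  '2000000 2004000 3253100' \
  '4000000 4007000 5252950' > timestamps.log

# scheduled time        = rx_hardware - interval_start
# RT application latency = tx_program - interval_start
# jitter                = spread (max - min) of the scheduled times
awk '{
  sched = ($3 - $1) / 1000          # convert ns to microseconds
  app   = ($2 - $1) / 1000
  if (NR == 1 || sched < min) min = sched
  if (NR == 1 || sched > max) max = sched
  printf "sched=%.3f us app=%.3f us\n", sched, app
} END {
  printf "jitter=%.3f us\n", max - min
}' timestamps.log > indicators.txt

cat indicators.txt
```

With these sample numbers the scheduled times cluster around 1,253 microseconds while the application latency varies freely, mirroring the shape of the real results discussed below.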
In the test, we set the period to 2 milliseconds. One packet is sent per cycle, and the scheduled time, that is, the time within a period when the packet should be sent by the hardware, was set to 1,250 microseconds. During the test, around 10,000 packets were sent from the StarlingX node to the Ubuntu node. The right figure shows the test results. The top figure is the scheduled times. As you can see, the scheduled times for all packets are around 1,253 microseconds, and the variance is very small; it shows the determinism performance is very good. Let me explain a little more about the value of 1,253 microseconds we got. Roughly speaking, it is composed of two parts: 1,250 microseconds comes from the scheduled time we set, and a little more than 3 microseconds is spent on the link, from the StarlingX NIC, through the TSN switch, to the NIC of the Ubuntu node.

The bottom-left figure is the RT application latency. The value varies in a wide range, as you can see, from 3 microseconds to 7,058 microseconds. Since we didn't use an RT kernel for either the StarlingX host platform or the Kata container, this result is expected; the performance can be improved by utilizing an RT kernel. The bottom-right figure is the TSN network jitter. You can see a very, very sharp peak; it shows the variance of the scheduled times is extremely small, less than 100 nanoseconds. I know that in some critical industrial user scenarios, the jitter of the scheduled time is required to be less than 1 microsecond. Our test result shows that TSN technology can meet that requirement.

The second test is a stress test. We kept the same configuration as in the first test; in addition, we launched an iperf program on node 3. The iperf program sends massive traffic to node 4 through the TSN switch; we use this to emulate non-critical traffic, so that the massive non-critical traffic and the time-sensitive traffic coexist in the same network. On the right side, you can see that iperf sends data into the network at a rate of around 956 megabits per second.
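The background load in this stress test can be emulated along the following lines (shown here with iperf3; the address and duration are illustrative, and the talk does not state the exact invocation used):

```shell
# On node 4, start a server to sink the non-critical traffic:
iperf3 -s

# On node 3, push TCP traffic through the TSN switch toward node 4
# for 60 seconds and report the achieved rate:
iperf3 -c 192.168.0.4 -t 60
```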
The ports of the TSN switch and the NICs are all 1 gigabit, so 956 megabits is almost the maximum rate we can achieve. But the rate at the iperf receiver side is around 371 megabits, not 956 megabits. The reason is that we enabled 802.1Qbv on the TSN switch, so only about a 40% time slice was allocated to the iperf traffic. On the left side, you can see that we got a similar result to the previous test. This test result proves that TSN allows critical and non-critical traffic to coexist in the same network, and TSN can still guarantee the timely delivery of the critical traffic.

Though we have already enabled TSN in StarlingX Kata containers, we still have much TSN work to do on the StarlingX platform. For example, we need to support TSN in generic containers. The RT performance is not good, and we need to optimize it. Currently we can only configure the TSN network manually; in the future, we need to enable automatic configuration for production. Another interesting topic is how to share the TSN capability among containers, given that the number of TSN NICs on a host is limited. And for now we only have a recipe for the integration work; in the future, we need to do further integration with the StarlingX platform.

That's all for my sharing. Thank you for your time.