Hello everybody. Welcome to DevConf. My name is Davide Caratti, I work for Red Hat in the Networking Services team, and I'm based in Milan. My work is mostly focused on Red Hat Enterprise Linux builds. Today I want to share with you, step by step, where I'm at with my attempts at putting continuous integration into TC. I assume most of you know what TC is: it's the component of the Linux kernel that implements traffic schedulers, packet classification, and packet mangling.

First, we are going to briefly see how unit tests can be written within the TDC suite, and then we will see an example implementation for the BPF action, which is kind of trendy. Then we will see how this test case, and the whole of TDC, can be automated in a way that lets every single patch targeting the netdev mailing list run these simple unit tests, with the goal of avoiding regressions and undesired behaviors. I chose Patchew to automate the tests and summarize the test results. We will overview the setup of the whole test system and see an example report. Next, we will overview possible future developments for the TDC suite and for Patchew as well. Finally, if there are any questions and they are easy enough, I will be happy to try and answer.

So, why am I doing this, and why do I think it's useful? As you might know, the TC code has been in Linux for many years, so theoretically we would not expect to see many breakages over time. But the reality is different, because the TC code is constantly being expanded, improved, and sometimes rewritten, because there are many users of TC. I want to mention time-based transmission, a new qdisc which has been added by Intel. I also want to mention OVS offloads, because Open vSwitch is using the TC layer to offload classifiers and actions to the NICs. So, one of the strongest requests from people working on NICs is to avoid breaking things. Any change of behavior should be done on purpose, and any improvement should be designed to be backwards compatible. So, unless it's a bug, the current behavior needs to be preserved.

And as you probably already know, CI is one of the answers to these kinds of needs. About two years ago, TDC was pushed to the upstream Linux kernel and started introducing self-tests on the TC layer. And more recently, the Patchew project landed upstream to provide continuous integration for the QEMU project.

Before going on, it's worth mentioning the people who wrote most of the code I'm using for this presentation. Kudos and thanks to the authors of TDC, namely to Lucas, Keara, and Roman from Mojatatu. And of course, many thanks to the authors of Patchew; they work on the QEMU project. Particular thanks to Paolo Bonzini and Fam Zheng. Last but not least, kudos to you, in case you plan to contribute to open source, and particularly to TDC and Patchew: there's a lot of work that needs to be done. Especially if you are new to the Linux kernel community, like I am, writing a self-test is definitely a good way to get onboarded. This is a mail that David Miller sent to one of my colleagues, and it's quite self-explanatory: "Series applied."

So, I assume most of you have basic knowledge of what TC is. In any case, there's a lot of documentation on the web and in the man pages. A fun fact: TDC is also a source of documentation, if you have trouble configuring some TC rule. Anyway, what's relevant here is the interface that programs use when they communicate with the kernel. It's all made of netlink messages that carry the configuration data inside. You can get an idea of how big this configuration space is by trying a simple command: just show the simplest qdisc configured in the kernel and look at what kind of messages travel on the netlink socket.
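One way to peek at that traffic is sketched below; strace is just one option (a nlmon device would work too), and the device name is illustrative:

    # dump the qdiscs on a device and watch the netlink messages going by
    strace -e trace=sendmsg,recvmsg tc qdisc show dev lo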
And this is the message: a big netlink message carrying the configuration data. After that, consider how many of these objects we have: 35 qdiscs, 12 classifiers, and 15 actions, and we are still counting, because I know of people writing more qdiscs and more actions. So, with this configuration space, the probability of breaking something is not negligible.

Now, let's take a real example and assume we want to unit test the BPF action. Let's install a dummy eBPF program that is going to be executed on the data plane; for example, we want to mangle packets one by one as they are transmitted. So, we install this program and then we query the kernel back to see if the program has been correctly installed. We do a tc action add with the specified program, and then a tc action dump to see if the program has been installed correctly. This will invoke the corresponding methods in the kernel-side counterpart of the action. The netlink messages will enclose these parameters, which are specific to the BPF action, and also this kind of parameters, which are common to all TC actions.

OK, this unit test, and many others like it, are currently covered by TDC. You just need to pushd to the tc-testing folder in the Linux source tree, and then you can list all the test cases available for the BPF action and selectively run one test, or run them all.

And let's see how test cases are written. Each TC action, every filter, and every qdisc has a dedicated list of unit tests. Briefly, a unit test is made of a command under test that is launched; there is an expected exit code; we issue a verify command and check whether the match pattern matches match-count times. So, back to our unit test example: we assume we did the setup phase, where all BPF actions have been cleared. This is the command under test. We check the expected exit code, which should be zero. We issue the verify command, and we check the match pattern and the match count. It's really quite easy.

So, that's it; that's what TDC is. The test infrastructure can be extended: there is the possibility to use plugins, so we can put some variants on the setup. We can use namespaces, we can use virtual ethernet devices, or we can use tools like valgrind or kmemleak to find memory leaks. The number of TDC test cases is growing as the Linux kernel evolves, and please note that qdiscs are not tested yet; coverage will be introduced this year. So, there's a lot of work that needs to be done, and like I mentioned before, contributions in this area are very welcome.
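To give an idea of what such a unit test looks like in practice, here is a sketch of a TDC-style test case for the BPF action; the ID, the bytecode, and the match pattern are illustrative rather than copied from the real bpf.json:

    {
        "id": "d959",
        "name": "Add cBPF action with valid bytecode",
        "category": [ "actions", "bpf" ],
        "setup": [
            [ "$TC actions flush action bpf", 0, 1, 255 ]
        ],
        "cmdUnderTest": "$TC actions add action bpf bytecode '1,6 0 0 4294967295' index 66",
        "expExitCode": "0",
        "verifyCmd": "$TC actions get action bpf index 66",
        "matchPattern": "action order [0-9]*: bpf.*index 66",
        "matchCount": "1",
        "teardown": [
            [ "$TC actions flush action bpf", 0, 1, 255 ]
        ]
    }

And running it boils down to something like:

    cd tools/testing/selftests/tc-testing
    ./tdc.py -l            # list the available test cases
    ./tdc.py -e d959       # run a single test case by its ID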
So, now that we know how to write a unit test, let's see how to do the CI part. For this, there is Patchew. Why Patchew? Why did I choose it? Because it's new, and I like new things. It's open source, of course. And it's used by another big project, so that's kind of a good sign. This diagram summarizes the architecture used by Patchew. There is an importer node that checks for new emails; it pushes information to a server, which holds a dashboard, and pushes every patch it receives to a git repository. Then there is a tester that polls the server, gets new patches, clones from the git repository, and simply runs the test.

So, let's see how a patch is processed. At the very beginning, somebody sends a message to a mailing list. There is a mail client that polls for new content and checks for new messages. It checks whether a message contains a patch and, similarly to what is done inside Patchwork, it recognizes the follow-ups: the Reviewed-by, the Tested-by, the Acked-by. And it stores this information. The importer applies the patch locally and tries to push to a mirror of the Linux kernel, which is currently in my GitHub, and after that it creates a tag. If the push operation has been successful, the importer then updates the status in the server, like this: if the push is successful, the importer sets the G status flag, and the G is blue, meaning the patch applied successfully. It can be gray in case the patch does not apply. The importer is also able to understand follow-ups: if somebody reviews the patch, it applies the R flag, and if the patch is superseded by a new patch, Patchew understands it and sets the O flag. And in case a series is incomplete, there is a question mark.

OK, now it's time to see if our patch passed the test. The tester periodically polls for new, untested patches and simply clones the tag locally. It compiles the kernel, launches a virtual machine with that kernel, and runs the test. If the test goes OK, there is a green T flag; otherwise, there is a red T flag, like in this case. And that's how the dashboard looks after the importer applied three patches and tested two of them without finding any issue. You can click on the subject and it looks like this. And since I'm logged into the project, I can reset some of these flags to redo the test or redo the apply log.

So, here is how I managed to install Patchew. Did it work well? Well, the scripts I used to create the tester are on my GitHub. During this work, the traffic on the mailing list was almost zero, because there was a long period where the branch was closed, so no patch was applied. Also, many TDC test cases were broken, and what is strange is that the breakage happened very recently, in the last month. So we really want to make this stable. The current state is: there is no dashboard on the official Patchew; I have a draft dashboard with a semi-broken importer that is running the unstable Patchew branch. Stay tuned, because in the next days I'm going to put it online with a live test.

What's next? For TDC, a lot of things. We are planning to do functional tests on the data path, injecting traffic. We are planning to do performance tests, to check whether the install rate or the packet rate degrades with some patches. We need to add more and more test cases for testing qdiscs. And we are evaluating inclusion in the CKI project, which is a much bigger project for kernel continuous integration, once TDC provides enough coverage. And then we need to fix the loose dependency on iproute2, because many breakages happen just because a commit in iproute2 changed the current behavior. There is a way to fix this, and it's to use the JSONified output. For Patchew it's much easier, because everything that is in the future is in the nice to-do list on the Patchew project.
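Putting the tester pieces together: the test script can be as simple as something like this. It's a rough sketch, not the exact script from my GitHub; the qemu command line, the paths, the image name, and the result marker are all illustrative:

    #!/bin/sh
    # build a kernel with a minimal config plus the TC options
    cp minimal.config linux/.config
    cd linux
    make olddefconfig
    make -j"$(nproc)"
    cd ..

    # boot a VM with that kernel; the guest image runs tdc at startup
    # and powers itself off when the run is over
    qemu-system-x86_64 -enable-kvm -m 1G -display none \
        -kernel linux/arch/x86/boot/bzImage \
        -append "console=ttyS0 root=/dev/sda rw" \
        -drive file=test-image.img,format=raw \
        -serial file:console.log

    # once the VM has turned off, grep the console log for the results
    # (assuming tdc's TAP-style "ok" / "not ok" lines end up on the console)
    if grep -q "not ok" console.log; then
        echo "tdc reported failures"
        exit 1
    fi
    echo "all tests passed"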
I'm done. Any questions? I left some candies here. So, many questions? Cool.

No, I'm not planning to... sorry, the question is whether we are planning to test driver changes. The TC layer should not impact drivers, but in some cases it does, because TC actions and TC classifiers can be offloaded. So in case a driver offloads a rule and the driver does not behave correctly, that's a problem. The solution for this is to use plugins: if you use the ns plugin, you can specify a network device, and all the tests will flow through this network device. So you install the TC action on a specific device, and this way you will be testing drivers. Am I planning to test patches for drivers? No. I don't think it's feasible with this infrastructure, just because it's too slow. CKI is the correct thing to do. Speaking of CKI... I know you. You are famous. Me too. Any more questions?

There's some extra content. So, the git configuration: on the dashboard, as soon as you log in, you can just put the script on the server, configure email notifications, and configure the git repository. For GitHub it was quite tricky for me to find this, so maybe it's a useful reference for somebody. This is the test script; it's very easy: I just copy a minimal configuration of the kernel, I compile the kernel, I run a virtual machine with that kernel, and as soon as the virtual machine turns off, I grep for the results, and that's it. Oh! This is the traffic on netdev in the latest two years. It's not much, and that's why we can leverage a light platform to do this kind of work.

Why did that test case fail? The test case failed because the configuration of the kernel was wrong: I was testing the BPF action without compiling in the support for SHA, and so the eBPF system call failed.

Well, yes, it could do this. So, the question is whether it reports the test failure to the mailing list. It can do this; for testing purposes, it's better to be really careful when doing this, because we don't want to spam the mailing list. That's exactly what typically happens. Thank you very much.
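As a footnote to that kernel configuration problem: the test kernel needs the right options compiled in, roughly along these lines (an illustrative subset, not the exact fragment from my setup):

    CONFIG_NET_SCHED=y
    CONFIG_NET_CLS_ACT=y
    CONFIG_NET_ACT_BPF=m
    CONFIG_BPF=y
    CONFIG_BPF_SYSCALL=y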