Hello and welcome to my talk. My name is David Cermak and my topic for today is validation of the TCP/IP stack on an embedded device. Let me decompose this title. "Validation of something on embedded devices" practically defines the first part of the talk, which will address not only validation but also verification in general on embedded devices. Now let's focus on the "something" we are going to validate: your TCP/IP stack. So not a network stack in general, but the exact one that is used on the device, the one that is compiled, linked and flashed into the system. These are basically the two topics I'm going to talk about.

But before going ahead, let me briefly introduce myself. I am an embedded developer at Espressif Systems. I frequently commit to Espressif repositories: ESP-IDF, the official development framework for IoT, and also some other Espressif repositories, for example esp-mqtt or esp-lwip, which is the official Espressif network stack, or better said, its fork, because we use lwIP, or Lightweight IP, and that is practically an industry standard for many other microcontrollers and smaller devices. My main area of interest at Espressif is networking and protocols, and this perfectly matches network stack testing and patching. This is, of course, the subject of the talk and also the reason for this contribution, because it covers what we do, or perhaps what we should do, at Espressif in this area. I think it is very important to highlight that users of the ESP platform and its products can benefit from these techniques, as the actual design of an IoT system can be implemented and tuned on a developer machine without even touching the actual chip. And not only Espressif chips: other IoT platforms and vendors solve the same problems of validating their network stacks, so sharing methods, issues and especially patches is essential here.
For example, some of the conformance tests that I'm going to show later were developed at Intel; we call them the Intel net-test-suites and run them on Espressif chips. Links and credits are shared on the last slide with other references.

So much for the introduction; now let's get back to the talk. This is the agenda of the presentation, and unsurprisingly it is divided into the two sections the title suggests. The first part is the more general one: we will look into the common methods embedded developers typically use for testing and debugging, and most importantly we will compare and contrast them, especially when it comes to testing the internals of a TCP/IP stack. The second part is more practical: I will give specific examples and lessons learned from different testing methods. Here I will first talk about conformance tests, i.e. whether a specific protocol implementation complies with the standard, and as a second example I will show how fuzzing techniques can find issues and design or implementation flaws in specific protocol implementations.

So let's move on to the first section. Here are the options we typically use for testing, debugging, analyzing, reproducing issues, regressions, CI, simply everything related to development on embedded systems. We can test on the real device, or we can recompile the thing under test for a different platform and run it on the developer's machine, and we can also do something in between: we can use an emulator, a computer program that emulates the target platform. Let me mention that as the emulator for the ESP platform we use QEMU, an open-source emulator and virtualizer which is also widely used in other projects and fields. Here we run the target platform practically as a virtual machine on the developer's PC. So now let's go to the target tests.
Testing on the real device means testing under exactly the same conditions that are used in production or in the field. Here we have a test application sending packets and expecting responses or timeouts. On the other side we have the genuine target device running the exact application and the software stack under test, which in our case is the highlighted box of the TCP/IP stack. Since every engineer is partly a natural scientist, or used to be one as a child, I like to compare this method to a so-called in vivo experiment, in other words a real-life experiment: we have a subject under test, this little rabbit, we inject a TCP packet into it and we watch its behavior to see if it is what we expect. This rabbit is in fact our TCP/IP stack under test, which runs in its natural environment, living on top of the exact same software layers and running on exactly the same bytecode and silicon as it would in the field.

The exact opposite method, quite far from real life, is what embedded people usually call host tests. The device under test, or the part of our interest, runs as a computer program on the same machine as the test application. The only thing that stays the same as in the previous setup is the source code. So basically we recompile the portion of code under test for a different platform and run it on that platform. Again, looking at this method with a scientist's eyes, we might call it an in vitro experiment, which means an experiment "in the glass": we extract only the fragments of interest, put them into a chemical glass and perform all kinds of experiments on them using all the available laboratory equipment, and that is the main benefit of this method.
We can use all the tooling for analyzing, debugging, profiling and validation, and do all the experiments we would like on a much more advanced, faster and more feature-rich platform than our real target. We are, in fact, in a lab crammed with modern tools and equipment.

Using an emulator is something in between. It is actually a host test per se, but the program under test runs in its original form, compiled for the original target platform, hosted practically inside a virtual machine. Scientists would probably call this method in situ, which means "in its original place", as the TCP/IP stack under test runs in a kind of matrix of the original environment, executing its genuine machine code, while at the same time we are still in the lab, on a platform rich in tooling and equipment.

And here comes probably the most important slide of my talk: a practical comparison of these methods and practical recommendations. Host tests are a good approximation for most scenarios where we check for security issues, such as array overflows, wrap-around variables or writing behind an array, as well as for conformance testing and protocol-related issues. Target tests are essential for testing timing issues, race conditions and driver-related issues, or for checking exact reports by customers who provide steps to reproduce on the target device. Target tests are not very suitable for automated tests in CI, where we require very good stability and robustness of the test. Emulator tests are useful everywhere we have to use the target platform but still want to benefit from the host platform; an example could be port-related issues, since the port layers are different in host tests.
Again, the emulator is very useful, but in my experience, when testing network stacks, the number one choice for 99% of use cases is host tests: they are fast, reliable, robust and provide a good enough approximation for validating protocols. One example where a host test would not help, the one remaining percent of use cases where I would suggest using an emulator, is a recent issue we had, a "TCP close refused" issue. In this issue the TCP/IP stack receives input packets, the input processing runs in a separate thread, and the data are posted to a queue. This queue is of course of a limited size, especially on the target platform; nevertheless, the stack handles queue overflows correctly, and this works perfectly for data packets, because that is the usual case. The problem comes when the queued message is not a data packet but just a flag, in this case a FIN flag indicating connection closure. This is a very unlikely case and it was not handled properly, so we may end up with a half-open, half-closed connection. That is the problem we had, and it could only be reproduced either on the real device, because we need to use the real port layers, or using an emulator.

This slide actually concludes the first part of my talk, listing the benefits and drawbacks of the methods we have already discussed. Let me just mention that for a host test we generally need to port the code to the host platform in the first place, and this might not be a trivial task. Also, running the target inside an emulator, we may experience some subtle differences in timing, and when talking about the TCP/IP stack, the I/O device, or the driver for this I/O device, the device which sends packets to and fro, is generally different. Let's move forward to the second part of the talk, giving
specific examples of TCP/IP stack tests. The first one is about conformance testing using a TTCN-3 engine. I will briefly introduce the TTCN-3 workflow, explain why it is so important and useful, describe the environment and finally give a specific example. This is the official homepage of TTCN-3; the acronym stands for Testing and Test Control Notation version 3. And this is an example of a test case written in that language. Similar to what I showed in the pictures of the different test setups, this test case sends a SYN packet and expects a SYN+ACK packet; if we receive one, we pass the test, otherwise we set the verdict to fail. As you can see, this is a very simple language, useful for testing protocols. Even though it is so nice and simple, you may ask why we need yet another language to write our test cases. To answer this, let me open another web page: this is the Titan implementation of a TTCN-3 compiler, on GitHub. The nice thing about TTCN-3 is that it cleanly isolates the module from the port and from the platform, so the test case itself is perfectly platform- and I/O-agnostic, and we can easily reuse the implementation of a protocol. This is an example of the protocol modules which are implemented in TTCN-3 and are available there.

Going back to our example, here is the test setup for our conformance testing. You can clearly see that this is a target test, although not 100% the target test I showed before, since the input and output media used to pass packets back and forth are a bit different: we convert the network data into a byte stream and pass them over the standard input and standard output of the board to the test application and back to the board. The reason for doing it this way was historical, and we already plan to refactor this setup into a standard host test. This setup helped us identify certain issues, certain cases and scenarios where the
lwIP did not comply with the specs. All of these violations were, let's say, corner cases, where the other end of the connection sends something unexpected. As an example of such a corner case I am showing an issue where the sender sends a packet with no flags set, no flags at all, no TCP flags. The correct reaction should be to completely ignore such a segment, but lwIP fell into its default case and was actually sending an RST, and the fix for this is quite simple.

All right, so that was the conformance testing, and now let's move on to the second example, which is using fuzzing techniques for finding issues and vulnerabilities in the TCP/IP stack. We run these examples as host tests and use AFL for fuzzing. AFL stands for American Fuzzy Lop, and it is one of the smart fuzzers, fuzzers that use some guidance: they use code coverage as feedback to exercise the input vectors, and therefore the tested source code has to be instrumented. This is an example of a run against the DHCP server in ESP-IDF; again, links are shared on the last slide with other references. Let me just mention that AFL provides this nice table with all the statistics and results, but what is important from our point of view is that it is the DHCP server we are testing, because this is an Espressif implementation, not part of lwIP, so we have to test it properly ourselves. lwIP itself has some fuzzing of its own, and we also test some other components of lwIP using this fuzzer, but this is probably the most important fuzz test that we have.

Now let's switch to an example of a typical issue found by fuzzing. Here we simply raise an error rather than assert, converting this case from "cannot happen" to something that could happen under some circumstances: we drop the packet, record an error, write an error message to the console and continue, rather than completely crashing the system. And now we
are moving toward the end of the talk. Here are the links and references to the examples I have shown and the tools used in the presentation. This is just for reference; you can actually run all the tests, at least these two examples and other related tests, by following the links provided here.

Finally, let me conclude this talk with three points of summary. If you are working on an embedded device on some network-protocol-related task, just try to compile it on the host: it will be faster and easier, and it will be a good approximation for the task. If you are using the ESP32, check out QEMU; it is an easy and convenient way to run it inside your Linux, macOS or Windows machine. If you are testing conformance to a certain protocol, check out the TTCN-3 language and its implementations; chances are the test case is already implemented and you can simply reuse it.

And that is all from me. Thank you for your attention, check out the links and references, and if you find some of these techniques useful, please try to use them. Thank you again and see you.