Welcome everyone. We'll have a talk by Alexander Stefan about testing C code with Python. Please welcome Alexander.

Hello everyone. Thanks for joining the session. I work as an embedded software developer, so I write firmware for microcontrollers, mostly. Unfortunately, this is mostly C code, not yet Python code, though recently we've ported MicroPython to one of our controllers, so somehow we're getting better.

Before I start with the talk, I'd like to know a bit more about you and your experiences with unit tests. So if you've written any unit test in any language yet, please raise your hand, so I get an overview. Okay, great, that's most of you. And who of you has written unit tests for C code, probably in C then? Okay, that's probably about half of you. Last question then: who enjoyed the experience, especially if you compare it to writing Python code instead? Well, a single guy. Yeah, perfect. So then maybe I can show you a more fun way to write unit tests for C code.

Now you might wonder what my motivation is for that, and some of it can probably be summed up with this quote here: that the C language combines all the power of assembly language with all the ease of use of assembly language. So with C you've got control of everything, but you usually also have to control everything. You need to do everything yourself; there's little support from the language. And for testing, you probably don't need all this power. You're not constrained in resources, and you don't have the performance requirements that you might have in production code. So you could actually use a higher-level language to make it easier to write your test code, rather than doing everything in a low-level language like C.

Now let's look into that in a bit more detail. If you write unit tests for C code with C code, then there are some good things.
You've got the same language everywhere, so as the developer you do not need to switch contexts between different languages, different styles, different syntax. That might also be good for a lazy developer who only knows a single language. And of course, if you're working in an embedded environment like I do, then you could run your unit tests on the target device, or at least on a simulated device, so that if there are bits and pieces of your code implemented in assembly, for example, you can also test those.

But it's also a bit limited in some ways. I already talked about the limitations of the language: you can only use C constructs, which are not as powerful as Python constructs, so you need to write much more code than you would in a high-level language. But you're also limited by what the framework has to offer. If you look up unit testing frameworks for C code, there are tons of frameworks out there, but most of them are very basic. They don't offer the advanced features that you might be used to from unit testing frameworks for Python code, for example.
So there are only a few frameworks that offer mocking, for example. And in the end you're also limited by what the ecosystem has to offer. For example, we would like to test some cryptographic algorithms in our implementations. Of course you can call into OpenSSL to verify some calculation, but it's not really that easy, and it might be nicer to do that in Python.

Now maybe we can do better than that, and I've prepared a few examples to show you what unit testing C code with Python would look like. The first example is the most basic thing I could think of. We've got a single function in our C code that just adds two integers and returns the result. So this is the header file, the public interface that we want to unit test. And this then is the implementation of that function: it just adds the numbers and returns the value.

If you write a unit test for that, it could look like this. As usual with Python unit tests, you've got a test case class as the container for all your test cases. The single function in there is your test case; we've only got one here. And it's rather simple: it loads the source code that I've shown you before, creates a module out of that, and then has an object on which it can call the functions that are defined in the module. This function returns the result, and we can assert that the result is really correct. Now you don't see any C code in here, and no construct that really does anything with the C code from before. The only mention of it that you see is the name of the module, the parameter for the load function. And this is where all the magic happens.
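The slides are not part of this transcript, so here is a minimal sketch of what this first example might look like. It assumes the `cffi` package and a C compiler are installed. The file contents, the names `AddTest` and `WORKDIR`, and the details of `load` are reconstructions for illustration, not the speaker's exact code. The CFFI calls used (`cdef`, `set_source`, `compile`) are the library's real API. The sketch writes the two C files to a temporary directory so that it is self-contained.

```python
import importlib
import pathlib
import sys
import tempfile
import unittest

import cffi

# Hypothetical contents of add.h and add.c, reconstructed from the description.
ADD_H = "int add(int a, int b);\n"
ADD_C = """\
#include "add.h"

int add(int a, int b)
{
    return a + b;
}
"""

# Write both files to a scratch directory and make it importable.
WORKDIR = pathlib.Path(tempfile.mkdtemp())
(WORKDIR / "add.h").write_text(ADD_H)
(WORKDIR / "add.c").write_text(ADD_C)
sys.path.insert(0, str(WORKDIR))


def load(name):
    """Build a Python module from <name>.c and <name>.h and return its functions."""
    # Step 1: read the source code of the module under test.
    header = (WORKDIR / f"{name}.h").read_text()
    source = (WORKDIR / f"{name}.c").read_text()

    # Step 2: hand the interface and the implementation to CFFI and build.
    ffibuilder = cffi.FFI()
    ffibuilder.cdef(header)
    ffibuilder.set_source(name + "_", source, include_dirs=[str(WORKDIR)])
    ffibuilder.compile(tmpdir=str(WORKDIR))

    # Step 3: import the freshly built extension module.
    importlib.invalidate_caches()
    return importlib.import_module(name + "_").lib


class AddTest(unittest.TestCase):
    def test_addition(self):
        module = load("add")
        self.assertEqual(module.add(1, 2), 3)
```

Running the class with `python -m unittest` (or any other unittest runner) compiles the C file on the fly and then calls `add` like an ordinary Python function.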
So let's look into that. The load function here consists of three steps. First it loads the source code of the module: it opens the C file and the header file and reads out the source code. Then it uses CFFI to build a Python module out of that source code. There are three calls that you need to make on the CFFI object for that. The first call, the cdef call, tells CFFI what interface it has to export to our Python code. So we pass in the header file contents that define the public interface we want to test, so that CFFI can generate the interface for us. With the second call we need to tell CFFI about the implementation of the function, so we pass in the source code here. The last step then is for CFFI to actually build the module that we want to have. It runs a C compiler in the background and builds the module, and as the last step we can import that module and return it to our test case. And that's really all you need to run the example that I've shown you before.

Now I've got three more examples that all build on this implementation, so I'd like to quickly ask whether there are any questions on this example already, so that you can better understand the following examples.

You mean if you have more than one source file?

It works now. Yeah, because if in this example source file I have some sort of includes and dependencies on other source files, how do I go about that? Do I have to compile them also and link them somehow, or how does it work?
Yeah, I've got some more complex examples with multiple files and with external dependencies, and I'll show those later. Okay, any more questions? Otherwise I'll continue with the second example.

The second example is still rather basic. We've again got a single function that you can call multiple times, and it will just add up all the parameters that we pass into it and return the current sum. This is its interface, and this again is the implementation. So now we've got a global variable that we use to sum up everything. The function just adds to it and returns the current value.

The unit tests now look like this. To make matters a bit more interesting, I've now implemented three unit tests, not only one. And so that I do not have to repeat this load call in every test case, I use the setUp method. This gets executed before each test case is run and loads the module for the test case. Then the test case can access the module just as before, call the function there, and assert that the results are correct.

But if I were to run this test case with the load function that I've shown you before, it wouldn't work. And why wouldn't it work? Well, in the source code there's this global variable, and the load function that we had before just imported the module at the end. If you know a bit about how importing works in Python, those imports are cached. So if there are multiple test cases running, the first one will actually import the module and initialize the global variable. All the other test cases will just get the cached import back, and the variable won't be initialized again. So the assumption of the test cases that the sum always starts with zero doesn't hold here.
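The caching behaviour just described can be reproduced without any C code. The sketch below uses a small pure-Python module as a stand-in for the compiled one; the module name `sum_demo`, the function `add_to_sum`, and the helper `load` are made up for this illustration.

```python
import importlib
import pathlib
import sys
import tempfile

# Stand-in for the compiled C module: a module with one global variable.
# The name sum_demo and its contents are made up for this illustration.
MODULE_SOURCE = """\
total = 0

def add_to_sum(value):
    global total
    total += value
    return total
"""

module_dir = pathlib.Path(tempfile.mkdtemp())
(module_dir / "sum_demo.py").write_text(MODULE_SOURCE)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()


def load(name):
    # Mirrors the last step of the load function: a plain import.
    return importlib.import_module(name)


first = load("sum_demo")              # really executes the module, total == 0
first_result = first.add_to_sum(5)

second = load("sum_demo")             # served from the sys.modules cache
second_result = second.add_to_sum(5)

print(first is second)                # True
print(first_result, second_result)    # 5 10
```

Both calls to `load` return the very same module object, so the second "test" starts from a sum of 5 instead of 0.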
And so the test cases would fail. Now there are several solutions to this. I'm just going to show you the simplest one, and that looks like this. The load function is still the same; just the first line, the one with the comment, got added, where I generate a random name for the module. This avoids all caching by essentially importing a new module every time this function is called. That might not be the most performant solution, and it will also use more memory, but it nicely avoids all the problems that you could otherwise have with caching old data. For this I use the uuid module, which just generates a random unique ID, and append that to the file name, which is then used as the module name. All the other code in here is the same as before. So each test case can still load the module and gets a fresh copy every time. You could also implement that in a different way, and just re-initialize the module every time after you've imported it, but that would take more code, so I don't show it here.

Okay, then example number three, and here we are getting to multiple files now, since all the other examples so far were very basic, with just a single C file and a single header file. Now we take at least a second header file, and we want to do some mathematics with complex numbers. So we define our own structure for that, which has just two fields for the two parts of a complex number, the real part and the imaginary part, and we have that in one header file. Then we want to implement a function that uses this type. So again we use the example of addition, adding two complex numbers and returning the result, and we can implement it like this.
We just add both parts together and return the result at the end. Now the test case for this, again, doesn't really need to know much about the C code. We load the module as before, and you don't even have to deal with the complex type that the header file declared somewhere. When you want to call the add function, you just pass in the lists here, and CFFI will automatically generate structures for that, so that the C code is happy and gets the correct values. Also, the result of this function call is a nice Python object where you can access the parts of the structure by their normal names and assert that all the results are correct.

But again, for this example to work we can't use the previous implementation of the load function, because the previous implementation just looked at the source file and the header file of the module that we want to test. It doesn't really know about the other header file that we also need. Now if you remember the source code, you could say: well, the other header file got included into the module's header file, so it should be present there. But unfortunately CFFI cannot deal with these include statements. So what we need to do is run some kind of preprocessor, like the C preprocessor, over the source code, so that there are no more include statements in there.
No other directive that CFFI doesn't understand, either; otherwise it would throw an error. This is done with the preprocess call in here. Again, there are multiple ways you could implement that. I've chosen to just run the GCC preprocessor over the source code and get back the results. At the end I've got one large string that contains the contents of both header files, and CFFI is happy with that.

Now for the last example. It gets even a bit more complex, because now we have some external dependencies. In this case you can imagine you want to program a microcontroller, and maybe the vendor of the microcontroller provides you with a nice library like this one, where you can read GPIOs using simple function calls. The vendor has chosen to implement a different function for each GPIO that you can access, so he provides you with a library that has this interface here. But maybe in your code you'd rather use this interface: you only want a single function call and a parameter to select the GPIO that you're interested in. Now you can implement that in your own code. You just look at the parameter, call the appropriate function, and if you get a parameter that you cannot deal with, you return some kind of error code.

And now this is the code that we want to cover with our unit test. We don't want to test the vendor's library, so we don't want to use the read GPIO zero or one calls here. We probably couldn't use them in the unit test anyway, because they might access some registers of the microcontroller that aren't there in our test environment. So we somehow need to replace those calls with our mock functions, so that we can run a test case that knows what the GPIO values are.

The test case for that looks like this. The first change that you'll notice compared to the previous implementations is that the load function now returns two values: not only the module as before, but also an FFI object. That's part of CFFI's interface, and we use that in the first test case to replace the C function that
we don't want to use with a Python implementation. So we define a function that has the same name as the C function we want to replace, and we tell CFFI: hey, when this C function gets called, please use this Python implementation instead; don't use a C implementation that you might find somewhere. The Python implementation can just return a fixed value. Then the test case can call the function that we want to test with the correct parameter and see that the value that was defined before is returned in the end.

The second test case, for GPIO number one, does the same thing but using a different construct. In this case we don't really want to define a function, but we want to use a mock object like you might be used to from the unittest library. And you can do just the same with it. You configure your mock object to return a value when it's called, and then tell CFFI: hey, this is not a function, but just something else that you can call; use that in place of the C function. Then the test case works again and can call this function, and at the end you can also use the assert methods that are provided by the MagicMock object.

In this case, again, we need to modify the load function. This is the old implementation again for comparison, and we need to add some more code to that for this example to work. There are three changes here, all again marked with a comment. The first change is that it's not sufficient anymore to just process the header file of the module; we actually need to process all the header files that are included in this module. So it just uses a regular expression to collect all the include statements, then runs that through the preprocessor, and as a result gets one large string again that contains all the content of the include files of our module. The main work then is done in the next two lines, where we need to tell CFFI which functions we want to replace with Python code and which functions are
implemented in our C code. So the first line just goes through the source code and looks for all the function definitions, so that we know which functions are implemented by our source code. The second line then goes through all the includes that we have and looks for all the function declarations in there. Whenever it finds a function that is not implemented in the source code, it will tell CFFI: hey, please insert a Python implementation in here that we can replace later. The functionality is all there in CFFI. We just need to prefix the function declarations with this extern "Python+C" statement; then CFFI will know: okay, I need to generate some code for that. And this will already make the compiler happy. It will find a reference for this function, so it can call it, and we can later replace it with Python code. And the last change is, as I said before, that we now need to also return this FFI object from the load function, so that the test cases can tell CFFI about the implementation that they want to use.

Now I'll show you in a bit more detail how this step in the middle works, where we analyze the source code to find the function definitions. This is based on pycparser, and this is the first part, which collects all the function definitions. pycparser will analyze your source code and build an abstract syntax tree out of it. You can then walk this tree with a class that's already provided, and whenever you hit a function definition, this visit function here is called. It gets the node out of the tree and can just ask this node: okay, what is the name of the function? It adds this to a list, and so in the end, once it has walked through the whole tree, you've got a list of the names of all the functions that are implemented in the source code.

This is then used in the second part, again based on the pycparser module, where we actually parse all the include contents into an abstract syntax tree and then tell pycparser to regenerate the corresponding C code from that, so that we can modify some bits of it. pycparser already has support for regenerating code from the tree, and we just hook into that. Whenever we see a declaration (this is then again the visit function for declarations), we look at the declaration and see whether it's a function declaration. If it is, and the name of this declaration is not in the list of functions that we found in the source code, then we'll just prefix it with the extern "Python+C" statement, so that when CFFI parses the source code again, it will know what to do with these functions.

Okay, this was the last example that I wanted to show you. To sum up, I want to talk quickly about some of the drawbacks that this approach might have if you're used to other approaches. One of the main drawbacks is probably that, if you use this code as I've shown it to you, and your C code does something bad and tries to access a null pointer, for example, then it will also crash the test process, because the code actually runs in the same process. There are no boundaries between them. So when your C code destroys something, your test will crash, you won't get any nice error reports, and you might not like that. One solution to that problem would be to run each test case in a separate process and have one main process collect all the results. Now if one test crashes, it just crashes the single test case. The main process can still report the errors, and all your other test cases will continue to run. This might add a little overhead, of course, because now you have multiple processes running that need some more computing time. But at the same time you can also run your tests in parallel, so if you've got multiple cores it might actually be faster in the end than running everything serially.

Another big problem might be that debugging your test cases gets harder now, because you've got a Python process that
calls some C functions that again might call some Python functions. So where do you really debug that? You can attach a debugger to your Python test cases, but that won't help you much once you enter C land; you won't see what the C code does there. Or you can attach a C-level debugger, so you can see what the implementation does, but then you have to deal with all the C calls that are done by the Python interpreter internally and that you need to skip somehow. So it would be nice, of course, to have some better integrated solution here, some combination of two debuggers, one for the Python side and one for the C side, that smoothly hands over control once you enter the other part. Or one could also argue that, since we are talking about unit tests here, if you really need to debug your unit tests, maybe you could also think about simplifying your code, simplifying your unit tests or even the implementation, so that you don't need to debug them in order to find a problem, but have unit tests that can really tell you where the problem is when something breaks.

But to end on a positive note: if you're going to remember something from this talk, I'd like you to remember that writing the test cases is really simple. No matter how complex your C code looks, you've seen in all the examples that I've shown you that the test cases look pretty much the same, because all the complexity that you need to care about is hidden inside CFFI and the wrapper code that I've shown you here. As a test case author you don't really need to deal with that. You can just concentrate on writing your test cases, and you need to solve the hard parts only once, have them in a generic part of the code, and never look at that again as long as it works. So thank you for your attention.

Yeah, do we have any questions?

Can you run tests from Python on a compiled library, for example from built binary code? Can you import that in CFFI?
Yeah, that is actually one of the main use cases for CFFI: you can interface from Python to existing libraries, so that you can build a nice Python interface for libraries that already exist without needing to reinvent them. So that's of course possible. This approach was more meant to test the source code, so it passes in the source code, not a library, but of course you can also tell it here to use the existing library.

Could you also do this trick with mocking, or put alternative function definitions in a loaded shared object or something like that? Do you think that this is possible?

Well, that depends. If the function that you want to mock is not part of the library, but part of another library, and you don't link against that library, then it should be possible, because then you have to insert your own implementation of that function anyway for it to compile. But if you want to mock a function that's part of the library that you want to test, and it's implemented in there, you can't really replace it, because it's part of the same binary and the code will just call the function in there. You can't really take it out and insert another implementation there.

So you cannot switch out the binary calls, okay.

Say I refactor my C code, change the name of a function or its signature, and forget to adapt my test. How easy is it to spot the mismatch? Do I get a proper error message, or does it just crash?
No, CFFI will tell you. If you want to call a function that doesn't exist, then there's no such attribute on the module, and you'll get the usual error for that. If you change the type, it probably depends a bit on how compatible the old type is with the new type. If you change an int to a float or something like that, you might not even need to adapt your test cases. Even if you pass in an int, CFFI will just convert that to a float value for your call.

But if I used a different struct name or so, would it detect that?

Yeah, if you change the names so that they are not compatible, you get an error message. If, on the other hand, you have a structure that has a completely different name but the same types in there, then you probably won't notice. If you change the complex number structure, for example, and just switch the order of the fields, you won't notice that when you pass in the parameters. You will only notice it when you check the assertions at the end.

No more questions? Yes.

Can the CFFI module also be used for C++ code?

For what, please?

C++.

For C++, I think it's not completely supported, no. But there's the main CFFI developer, Armin, in one of the front rows. You can ask him about new features.

Short answer is no.

Okay, thank you for your attention, and thank you Alexander again.