Let's welcome our next speaker, Tobias, with his talk. We've got five seconds, so three, two, one, go.

So welcome everybody, I'm happy that so many people are interested in securing existing software and are not afraid of using formal verification. Let's start. I think it's not news for you that there's a long history of security vulnerabilities, and still today's software contains many security-critical bugs. Here you see a list of bugs which were found in communication protocols, and for this talk I selected one vulnerability in Fizz on which I want to show you how existing software can be made more secure and how formal verification can be used in this case.

So Fizz is a TLS 1.3 implementation created by Facebook. It is written in C++ and has, of course, a focus on security. Facebook used modern C++ methods to ensure that common errors like buffer overflows do not happen. But nevertheless, a vulnerability was found: an unauthenticated remote attacker was able to trigger an infinite loop in an application which uses this TLS library. This maybe doesn't sound that severe, but with little effort and in little time an attacker could spawn many processes which are spinning in an infinite loop, and so keep a server with many cores very busy in a short time.

Let's have a look at the code which was attackable. This is part of a parsing function where a while loop is used. In this while loop, a parsing function is called which reads a 16-bit integer from a network packet, and this 16-bit integer is then stored in a variable called length. To this length a static value is later added: the size of the header. After that, trimStart is called on a buffer object, which tells the buffer how many bytes were consumed. So what this does is: we read how long this record is and then skip over it. After we have done this, we continue at the start of the loop. What is the assumption here?
The developer thinks, of course, that the length must be bigger than zero. If this is not the case, no bytes are consumed, and so we get back to the beginning of the loop and have the same situation again: we are stuck in an infinite loop. What does this mean for the value which we read from the network packet? We must have the assumption that the length is less than 2^16 minus 5, because the length of the static header is five. That's a valid assumption if you look at valid TLS record packets, because there the length must be much lower than that: the TLS standard says it must be below 2^14 plus 256. But a remote attacker need not obey this rule. He was simply able to send a length with the special value 2^16 minus 5; then we add five to it, get an integer overflow, the length is zero, and we are stuck in an infinite loop.

So I would say the solution should be to check that the length is according to the standard. But Facebook's actual fix is not this solution: they just use a bigger integer type. Now there is no integer overflow anymore, and so it works fine even if the packet is not standard-conforming.

But how could we prevent such bugs? The usual answer is to keep a good software quality: code reviews, testing, fuzzing and so on. But these are all measures which Facebook took, and they didn't find the bug. So we need some more intense techniques like static code analysis, and this is indeed how the bug was found: a company called Semmle, which has since been acquired by GitHub, used their tool CodeQL, which does a variant analysis. What variant analysis does is look for patterns of known vulnerabilities and search for these patterns in the code. If a pattern matches, we know this is a known kind of vulnerability. But this is also the drawback: we can only look for patterns which are known to us. If there are any unknown vulnerabilities, we cannot find them.
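The failure mode of that loop can be reproduced in isolation. Below is a minimal sketch of the arithmetic just described, not Fizz's actual code; `bytes_consumed` and the constant are our own names.

```cpp
#include <cstdint>

// Minimal sketch of the overflow described above (not Fizz's actual code).
// The record length is read as a 16-bit integer and the 5-byte header size
// is added back in 16-bit arithmetic, so the sum can wrap around to zero.
constexpr std::uint16_t kHeaderSize = 5;

std::uint16_t bytes_consumed(std::uint16_t record_length) {
    // Wraps modulo 2^16: 65531 + 5 == 65536, which is 0 in 16 bits.
    return static_cast<std::uint16_t>(record_length + kHeaderSize);
}
```

With a standard-conforming length, `bytes_consumed(100)` is 105; with the attacker's value 65531 (2^16 − 5) the sum wraps to 0, so trimStart consumes nothing and the loop never advances.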
So I would favor a more constructive approach, and that is formal verification, where we can really prove, while designing the code, that there is no unexpected behavior in the program. Doing this on existing applications, which are mostly C or C++, is not really possible, as these languages have too many features. We can maybe do it for a limited subset, but not for all features which C++ gives us. That's the reason there are languages especially designed for formal verification, and SPARK is one such language.

SPARK is a programming language which is based on Ada. Like Ada, it is compilable with GCC and LLVM and is also very well suited for embedded systems: it comes with a customizable runtime, and you can configure it in a way that you have only a really minimal runtime, so you can also use it on systems with very constrained resources. What SPARK also offers are contracts: you can add preconditions, postconditions and invariants to your program and so define in high detail which behavior a subprogram should have. This is also the basis for the verification tools. They check, when you use these functions, whether all contracts are fulfilled at all times, and they do this statically, before really compiling or running the code. What you can achieve with the verification tools is a proof of absence of runtime errors, so there are no errors like buffer overflows, integer overflows and similar things, or you can even show functional correctness, so that the program really adheres to its specification. Ada and SPARK are mainly used in safety-critical areas, but also in security.

Let's have a practical look at SPARK. Here I've implemented the attackable code part in SPARK. It's really similar to what we have seen before: we declare a length variable of an integer type and then read some 16-bit value from an external source. Then we add five to the length, like it was done before, and then we use this length variable later on.
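SPARK contracts have no direct C++ equivalent, but the idea can be approximated with runtime assertions. The following is only an illustrative analogue with our own example names (the crucial difference is that SPARK discharges such checks statically, before the program ever runs):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative C++ analogue of a SPARK contract (our own example, not from
// the talk's slides). In SPARK one would write something like
//   function Add_Header (Value : Length_Type) return Length_Type
//     with Pre => Value <= Length_Type'Last - 5;
// and the prover shows the precondition holds at every call site.
std::uint32_t add_header(std::uint16_t value) {
    assert(value <= 65535u - 5u);               // precondition: sum stays in range
    std::uint32_t result = static_cast<std::uint32_t>(value) + 5u;
    assert(result >= 5u && result <= 65535u);   // postcondition: result in range
    return result;
}
```

Here the checks only fire at run time; the SPARK tools would instead report, at analysis time, every call site where the precondition cannot be proven.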
SPARK is different from C++ in that we usually don't use the built-in integers, but always define integer types with the specific properties we need in each case. There are two different kinds of integers. We can define modular integers, which behave like C or C++ unsigned integers: they have a fixed size and are allowed to wrap around at any time. Or we can use range types, where we specify the range which is allowed for this integer, and which have the property that they are never allowed to overflow. We use such a range type here for our length variable, and when we now run the verification tools on this code example, the tools will tell us that a range check might fail in line 15. They also tell us for which value of the length this integer overflow could happen.

So SPARK gives us really nice functionality for finding such errors, but the reality is that currently most software is written in unsafe languages like C and C++. We could think of using SPARK instead of C and C++, but porting existing software to this language is very expensive, especially if you do it manually. So what could we do? We could think of two options. Maybe we could replace only critical parts of the software. Usually the most critical parts of software are the parts which interact with the outside world; often these are the communication protocol implementations, and if we replace these parts with a verified implementation, we already enhance the security. Another option is to use code generation: do not write all implementations manually, but use a specification and generate code from it.

This is what we have done with RecordFlux. RecordFlux is a toolset which is developed by Componolit. This work is also supported by Thio Tristan, and it comprises various things.
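The two integer flavours can be contrasted in C++ terms. In this hedged sketch, plain `uint16_t` arithmetic plays the role of a SPARK modular type, while the checked variant mimics what a range type guarantees, except that SPARK proves the check statically rather than performing it at run time:

```cpp
#include <cstdint>
#include <optional>

// "Modular" behaviour: fixed size, silently wraps around, like C integers.
std::uint16_t modular_add(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint16_t>(a + b);
}

// "Range type" behaviour: an out-of-range result is rejected instead of
// wrapping. SPARK would flag this overflow statically ("range check might
// fail"); in this sketch the check happens at run time.
std::optional<std::uint16_t> range_checked_add(std::uint16_t a, std::uint16_t b) {
    unsigned sum = static_cast<unsigned>(a) + b;  // widen before adding
    if (sum > 65535u) return std::nullopt;        // range check failed
    return static_cast<std::uint16_t>(sum);
}
```

The modular version reproduces the Fizz bug (`modular_add(65531, 5)` wraps to 0), while the range-checked version rejects exactly the input the SPARK tools warn about.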
One thing is that it has a specification language in which it's possible to specify the format of messages in a formal way. This format specification can then be verified and checked for any inconsistencies. If there are no inconsistencies, we can later use the specification to generate parsers and message generators from it. All code which is generated by RecordFlux is SPARK code, so we can automatically prove that there are no runtime errors in the code. Another feature which SPARK gives us is that we can statically ensure that only valid messages are accepted and that only valid messages can be processed. In the same way it is also ensured that we can only generate valid messages. The source code of RecordFlux is available on GitHub, where you will also find some documentation. If you're interested in the more theoretical side of RecordFlux, you can also have a look at our research paper.

So now we have all the building blocks to improve the security of an existing application. We use Fizz as our example, of course, and we replaced the existing parser with a verified parser which we generated with RecordFlux. For that, we had to create a specification of all messages which are defined in TLS 1.3 and then use this specification to generate the parser. We had to remove the existing C++ parser of Fizz and then integrate our verified parser into it. For the code which we have written and all code which we have generated with RecordFlux, we are able to prove that there are no runtime errors in it. The code of this proof of concept can also be found on GitHub.

Here you see an excerpt of the message specification, in particular the specification of a TLS record message. Just to give you a high-level overview: on the left side we specify some elementary types like enumerations and integers.
And on the right side you see the specification of a TLS record message, where we specify the message format. Such a TLS record message is not that complicated: you have a tag, which is the content type, a version field, the length, and a payload field. What makes it a bit more complicated is that we here differentiate whether you have a plaintext payload or an encrypted payload.

When we now check the specification, we will see that there's an error in it, an inconsistency: we have two conditions which both check whether the tag is not equal to application data, and as these sit in the two branches for the encrypted and the plaintext payload, it's not clear for the parser which path it should take when it parses such a message. The resulting parser would not be deterministic in this case. So we have to remove this inconsistency, and if we do that and check again, the tools tell us that the specification is now correct. Now we can also use the specification to generate code from it.

Now that we have created the verified parser, we have to think about how it can be integrated, so we need an interface to this parser. As SPARK and C++ are not compatible with regard to the data structures they use, we have to add some glue code in between. Here you see a SPARK function which takes a buffer as input, so just a pointer to binary data, and which gives us a result in a record which contains all the information of the message which we have extracted. A record in SPARK is similar to a struct in C or C++. All functions and types which are highlighted in green here were generated by RecordFlux.

So what is done here? The first thing is that we declare a context. A context stores the internal state of our parser for this particular message. As the next step, we initialize the result, so we set some default values which we want to have in our result structure. And then we initialize our context.
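To make the record layout concrete, here is a simplified C++ sketch of the header fields just described, including the standard-conforming length check; the struct and function names are ours, not taken from the RecordFlux specification, and real TLS parsing involves more cases.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Simplified sketch of the TLS record header described above (our names).
struct RecordHeader {
    std::uint8_t  tag;      // content type, e.g. 23 = application_data
    std::uint16_t version;  // legacy record version
    std::uint16_t length;   // payload length
};

// RFC 8446 limits a ciphertext record length to 2^14 + 256 bytes.
constexpr std::uint16_t kMaxCiphertextLength = (1u << 14) + 256u;

std::optional<RecordHeader> parse_header(const std::uint8_t* data,
                                         std::size_t size) {
    if (size < 5) return std::nullopt;  // the record header is 5 bytes
    RecordHeader h;
    h.tag     = data[0];
    h.version = static_cast<std::uint16_t>((data[1] << 8) | data[2]);
    h.length  = static_cast<std::uint16_t>((data[3] << 8) | data[4]);
    if (h.length > kMaxCiphertextLength)
        return std::nullopt;            // reject non-conforming lengths
    return h;
}
```

With this check in place, the attacker's length of 65531 is rejected as non-conforming instead of ever reaching the overflowing addition.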
In this initialization, we add a pointer to our context; this pointer is stored inside the context, and then we can do the things which we want to do during parsing. The actual parsing is then done by calling Verify_Message. What Verify_Message does is check, field by field, whether each field is valid, so that all conditions which we have stated in our specification are fulfilled, and the result is stored in the context. Later on we have different functions which we can use to access this information in the context. So what is done here is: we check if the length field is valid, and if this is the case, we are allowed to also access its value, which is done for example here with the Get_Length function.

The nice thing about using SPARK in this case is that we can prove that the parser is used correctly. If you were to forget to check that the length field is valid and just tried to access its value, then we would see in our SPARK tools that the precondition of the Get_Length function is not fulfilled, as we did not check validity before. So we really see whether the parser is used correctly and that we don't do anything which is not supported or which could lead to some error in the end.

Next we need to integrate the parser's interface. As I already told you, the SPARK data structures and C++ data structures are not compatible, but SPARK offers a foreign function interface. We can specify on a record that it should have convention C, so that it has the same binary layout as a C struct. You see an example here, with our record on the left side and the corresponding C struct on the right side. The same is also done for functions: we have the function Parse_Record_Message, where we also define that it should have convention C and define its exported name, and on the right side you see its C counterpart.

The next thing is that we have to integrate the parser into the actual Fizz code. For that, we also have to think about the buffer object.
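The check-before-access discipline can be mirrored in C++. Below is a rough analogue with hypothetical names (in the generated SPARK code the precondition is proven statically at every call site; here it is only a runtime assertion):

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Rough analogue of the generated parser interface (hypothetical names).
// A field's value may only be read after its validity has been checked;
// SPARK proves this usage rule statically, this sketch asserts it at run time.
class RecordContext {
public:
    void verify_message(std::uint16_t raw_length) {
        // The field is valid only if it fulfils the specified conditions.
        if (raw_length <= (1u << 14) + 256u)
            length_ = raw_length;
        else
            length_ = std::nullopt;
    }
    bool valid_length() const { return length_.has_value(); }
    std::uint16_t get_length() const {
        assert(valid_length());  // precondition, like Get_Length's Pre in SPARK
        return *length_;
    }
private:
    std::optional<std::uint16_t> length_;
};
```

Calling `get_length()` without first checking `valid_length()` is exactly the misuse the SPARK prover would reject at analysis time.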
As I already mentioned in the beginning, Facebook implemented Fizz with data structures which prevent buffer overflows, and this buffer object also offers other features like scatter/gather. So we need to convert this structure into something that we can use on the SPARK side. What we actually need is one pointer to a byte array, but what we have here is a list of pointers to various buffers. So what is done in the first lines is that we allocate enough space, then copy all buffers into one contiguous buffer, and then take a C pointer out of it. This C pointer can then be used for our parse function. We also have to allocate some memory for the result, and then we are able to call the Parse_Record_Message function.

Then you see that we can remove some code, these red lines, which is the code we have seen before: the parser which was used for reading the length. Now we can just use the result which we got from our parse function and which we already validated. So we've seen this function before: we just add the plaintext header size to the record length, and this is now secure, because we check that the length is according to the standard, and in the standard the value of the length field is far below the 16-bit maximum that overflowed before. What I forgot: in between we also added a check whether the record which we got from our parsing function is correct. We check if it's a valid ciphertext, because we expect here a TLS record with ciphertext in the payload, and if it's not a valid TLS record message, then we throw an exception and terminate the TLS connection.

We also looked at the performance of our modified Fizz. For that we created a small HTTP server which just returns a static page, and this HTTP server then used Fizz for the TLS connections.
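The buffer conversion just described amounts to flattening a chain of buffers into one contiguous array. A simplified sketch follows; Fizz's real IOBuf chain is more involved, and `flatten` is our own name:

```cpp
#include <cstdint>
#include <vector>

// Simplified sketch of the conversion described above: a list of separate
// buffers is copied into one contiguous byte array, whose data() pointer
// can then be handed to the SPARK parse function.
std::vector<std::uint8_t>
flatten(const std::vector<std::vector<std::uint8_t>>& chain) {
    std::size_t total = 0;
    for (const auto& part : chain)
        total += part.size();           // compute the required size

    std::vector<std::uint8_t> contiguous;
    contiguous.reserve(total);          // allocate enough space once
    for (const auto& part : chain)
        contiguous.insert(contiguous.end(), part.begin(), part.end());
    return contiguous;
}
```

This copy is also where the measured overhead comes from: the generated parser itself is cheap, but every record has to be made contiguous before SPARK can see it.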
On the client side we used wrk, which sends a continuous stream of HTTP requests and measures the latency and the throughput. We measured the handshake layer and the record layer separately: for the handshake layer we opened a TLS handshake for each HTTP request, and for the record layer we opened only one TLS handshake and then sent a continuous stream of HTTP requests over this TLS connection. What we saw is that the impact is really small, around 1% to 3%. We also analyzed where this performance effect came from, and it's not the parser code which we generated, but mainly the conversion functions which convert the data structures between C++ and SPARK. We have only seen a really simple TLS record message, but there are of course more complicated TLS messages, where you have sequences of messages which can again contain sequences, and these are represented in Fizz with C++ vectors and objects, which of course are not compatible with SPARK. So we need to do some conversion, and that costs some performance in the end.

So now we have seen that it's possible to secure an existing application with a formally verified library, but we want to go even one step further in the future. We are currently working on Queen TLS, a component-based high-assurance implementation of TLS 1.3, where we want to implement all critical components in SPARK. All these critical components should also be specified with the specification language of RecordFlux and then generated from it. The current status is that we have worked mainly on RecordFlux, because it's the main building block; we have specified all messages of TLS 1.3, and we are currently working on designing the architecture and the protocol specification. The implementation of Queen TLS will also be available as open source on GitHub.

Coming to a conclusion: what we did is create a formally verified library, in this case a parser for TLS 1.3.
We integrated this verified library into the existing software Fizz, and we have seen that there are some obstacles: it's cumbersome to convert between the SPARK and C++ data structures, but it is possible in the end. We have also seen, without doing any optimization, that the performance impact is really small. So you can conclude that it's really possible to use formal verification to secure existing software. That's it, so if you have any questions, I'm happy to answer them.