So, hello everyone, please welcome Tobias Reiher on RecordFlux, facilitating the verification of communication protocols. Thank you.

So, thank you for coming. I will start my talk with an overview of the problems we have seen in communication protocol implementations. Then I will talk about our solution to these problems, and at the end give a short outlook on what we are planning to do with RecordFlux in the future.

In the past, we have seen multiple security vulnerabilities in communication protocols. Here are just three examples. There is BlueBorne, a set of vulnerabilities which was found in implementations of Bluetooth. These vulnerabilities affected millions of devices, and an attacker was able to take full control of a device, and even to write malware that spreads wirelessly from device to device. Also very famous is the Heartbleed vulnerability, a bug which was found in the OpenSSL implementation of the TLS heartbeat extension. There it was possible to extract sensitive data out of another TLS endpoint. Another interesting bug was found last year in the implementation of libssh. There it was possible to circumvent the authentication process by just sending one specific message at the right time. So, without knowing a password or a key or anything like that, you could simply get access to the SSH server.

What we can see from these vulnerabilities is that they have been found at all software layers, from very low-level implementations like Bluetooth, through TLS, up to application-layer protocols like SSH.

But what are the causes of these problems? When you look at communication protocols, they are usually very complex, and implementing complex protocols is very time-consuming and also error-prone. The specifications also play an important role. Such specifications are usually only available in English prose, so they are easily misunderstood by a developer, and it is easy to misinterpret or forget to implement some detail. Also, these specifications are not formal. If we had formal specifications, we would be able to verify at the end that the implementation is correct, but as long as formal specifications do not exist, that is not possible.

Communication protocols consist of multiple parts. We have messages, which are described by formats. We have the protocol semantics, which are usually described by a state machine that tells us which message is expected at which time. And for security protocols, we also have security properties, which define whether a message, or data in a message, has properties like integrity or confidentiality.

To tackle the vulnerabilities we have seen in communication protocols, we have to tackle each of these parts. As a foundation, we need to ensure that no runtime errors exist. So we have to make sure that there are no buffer overflows, integer overflows or anything like that in the implementation; we can do that by program verification. When we have this foundation, we can look at the messages and ensure that the message formats are handled correctly, using functional correctness proofs. With these proofs, we can ensure that vulnerabilities like the ones we have seen in BlueBorne or Heartbleed are prevented. At the next level, we also have to ensure the protocol semantics. Different techniques are possible there; we could use temporal logic or model checking.
When we have ensured this as well, we can prevent errors like the one seen in the SSH implementation. One step further, we also have to ensure the security properties of security protocols, and there we could think of using security protocol proofs to ensure that the implementation fulfills the properties we want it to fulfill.

But let us start at the foundation, the absence of runtime errors. To solve this problem, we are using SPARK. SPARK is a programming language and a verification toolset, and it is especially designed for error avoidance. This is the reason why SPARK contains features like a strong type system and formal contracts. The verification toolset also allows a flexible depth of verification: we can prove data and control flow, we can prove dependency contracts, we can prove that no runtime errors exist, or we can go even further and prove the functional correctness of the code. Because of these features, it is used in critical projects; you can think of areas with high safety requirements, like aerospace, or areas with high security requirements, like the government or defense sector. As it is a compiled language, it can also be used for system-level programming, and so it is also used in the implementation of microkernels. If you are interested in SPARK, you can find more details on the website of AdaCore, which develops these tools.

To see the power of SPARK, here is a simple example. We have a simple C function which calculates the absolute value of an integer x. This function mainly consists of an if statement which checks whether x is greater than zero. If this is the case, then x is returned; otherwise, the negated value of x is returned. Maybe someone already sees the error here. In most cases it works fine: we can put some numbers in and we get the correct value back. But in one case we do not get the correct value back. What is the reason for that? The representation of the integer uses two's complement, and this means that not every negative number has a positive counterpart. The lowest negative number has no positive counterpart, and if you try to negate this value, you get an integer overflow and end up with a negative value again.

If we now implement this function in the same way in SPARK, we can use the SPARK tools to check whether any errors exist in the code. The nice thing about the SPARK tools is that you do not have to run the code or even compile it; you can just run them on the source code and they will check whether any errors exist. In this case, at line six, you see that the SPARK tools found an error. They also tell us what the error is: a failing overflow check, together with a counterexample showing when it happens. Here it tells us that when X equals Integer'First, we have an overflow failure. It also gives us a hint about what we can do to prevent this error: we could add a precondition. This is the function body, but the precondition is added to the specification. If we now add a precondition which states that X must not be Integer'First and run the SPARK tools again on this function, the SPARK tools will confirm that this error cannot occur anymore.
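As a minimal sketch of this example in SPARK (the function name is illustrative, not the exact code from the slides):

   --  Declaration: the precondition excludes the one value whose negation
   --  would overflow a two's-complement Integer.
   function Absolute_Value (X : Integer) return Integer with
     Pre => X /= Integer'First;

   --  Body: without the precondition above, GNATprove reports a failing
   --  overflow check for "-X" when X = Integer'First.
   function Absolute_Value (X : Integer) return Integer is
   begin
      if X > 0 then
         return X;
      else
         return -X;
      end if;
   end Absolute_Value;

With this contract in place, the proof succeeds, and every caller is in turn obliged to show that it never passes Integer'First.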
So you see, the SPARK tools are able to prevent runtime errors. But we can do even more, and that is what we are doing with RecordFlux. What is the goal of RecordFlux? We want to be able to dissect, generate and verify communication protocols. At the current stage, we are focusing on the specification and the parsing of messages, but in the future we want to do more. RecordFlux currently consists of a specification language which is able to describe protocol messages. We implemented a parser which is able to parse such specifications and creates an intermediate representation out of them, which is then used by a generator to generate SPARK code. The SPARK code can then be used to verify protocol messages and also to parse them.

Here is an example. Our specification language is inspired by Ada, but we adapted it to fit our purpose. In this example, we describe a simple message, a TLS heartbeat message, which consists of four fields: a message type field, a payload length field, a payload field and a padding field. The first two fields are fixed in size, and the two fields at the end are variable.

We represent such a message by a message type, which is similar to the record type in Ada or the struct type in C. It looks like this: we simply list all field names together with the type of each field. Then, of course, we also have to define the types of those fields. Let us think about what an adequate type for the message type field would be. The message type field has a size of one byte, so we could just use an integer with a size of one byte, but that is not really the best choice. When we look into the specification, we see that only two values are valid for the message type: one for a heartbeat request and two for a heartbeat response. So we use an enumeration type where we only specify these two possible values. Now, if we see a message which does not contain one of these two values, we already know that it is not a valid heartbeat message. We do something similar for the payload length field. There we specify a restricted integer which only allows the values from zero to two to the power of fourteen minus twenty, because that is the maximum size of the payload field. For the payload and padding fields, we use a composite type, a built-in type called payload type, which simply represents a variable-sized byte array. To make our specification modular, we put all types into a named package, which is here called TLS heartbeat.

When we look at this specification, it is not really complete yet, as we did not specify the length constraint for the whole message and the length constraint of the padding field. So we need to do more. Also missing is the relation between the length field and the payload field, and this relation is especially important. As we have seen with Heartbleed, this relation was not respected, and so it was possible to send a heartbeat request with a high value in the payload length field but only a small payload. The receiver of such a request has to copy the payload back, but it did not check whether the payload length really matched the payload it received, so it just read data according to the payload length and thereby read more data than just the message, and this data could be confidential, like keys. So this relation is really important. Here is the full specification of the message; highlighted are the length expressions in green and the message constraints in orange.
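To give an impression of what such a specification looks like, here is a hypothetical sketch. It is written in the style of the current RecordFlux syntax, which differs in some details from the version shown in the talk, and it omits the overall message-length constraint and the minimum padding size discussed in a moment:

   package TLS_Heartbeat is

      -- Only the two message types defined for the heartbeat extension are valid.
      type Heartbeat_Type is (Heartbeat_Request => 1, Heartbeat_Response => 2)
         with Size => 8;

      -- Restricted integer for the payload length field.
      type Length is range 0 .. 2 ** 14 - 20 with Size => 16;

      type Message is
         message
            Message_Type   : Heartbeat_Type;
            Payload_Length : Length
               then Payload
                  with Size => Payload_Length * 8;            -- length field is given in bytes
            Payload        : Opaque
               then Padding
                  with Size => Message'Last - Payload'Last;   -- padding is the rest of the message
            Padding        : Opaque;
         end message;

   end TLS_Heartbeat;

The important part for the Heartbleed-style problem is the size expression on the payload field, which ties the size of the payload to the value of the length field.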
Internally, we use a representation of our messages in a graph-like form, where each field of the message is represented by a node and the nodes are connected by edges which specify the order of the fields. We can also add further information to the edges: we can add length expressions, as is done here between the payload length field and the payload field, where we specify that the length of the payload field is the value of the payload length field multiplied by eight. We multiply by eight because the payload length is given in bytes, but lengths in our specification are given in bits. In the same way, we specify the length of the padding field, simply as the rest of the message, and we also add restrictions: the message length has to be smaller than this value, and the padding length has to be at least 16 bytes.

This is a rather simple example; the full advantage of our model shows for more complex protocol messages. We can represent optional fields by adding an edge which skips a node. We can also represent, by a condition, fields which are only inserted in some cases, or fields which have a different meaning depending on their value. You can think of Ethernet: there you have an EtherType field which, depending on its value, is interpreted as a length field or as a type field.

Here is an excerpt of the code which we generate. All functions which we generate for a message take a plain byte array as input, which is then used for all further operations. We generate verification code and access code. The verification code for each field checks that the message is valid up to this point, and only if that check succeeds can we access the value with the access function. For example, for the message type field we have a valid message type function and a get message type function, and by using preconditions we ensure that we first have to check whether the message type field really contains a valid message type; only if that is the case can we access it. For convenience, we also have a function called Is_Valid, which checks all fields one by one. If there is a valid path from the start node to the final node, this function returns true, and we can then just use all access functions without checking each field again and again.

It is also possible to represent the relation between a payload field and another protocol layer. We can state that the payload field, under some conditions, contains another protocol message, and this is also checked at the code level. For that we have this Is_Contained predicate, which is a precondition of all functions, as you can see. So we ensure that a developer or user of the code can only use a buffer which has been correctly labeled as containing a message of this type, in this case a heartbeat message, and thereby ensure that they do not use the wrong buffer.
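As an illustration of the idea, here is a hypothetical sketch of what such a generated SPARK interface could look like; the names, types and contracts are invented for this sketch and differ from the actual generator output:

   package Heartbeat_Parser with SPARK_Mode is

      type Byte  is mod 2 ** 8;
      type Bytes is array (Natural range <>) of Byte;

      type Heartbeat_Type is (Heartbeat_Request, Heartbeat_Response);

      --  Label stating that the buffer potentially contains a heartbeat message.
      function Is_Contained (Buffer : Bytes) return Boolean;

      --  Hypothetical labeling operation for a buffer obtained from an
      --  external source; it establishes Is_Contained.
      procedure Label (Buffer : Bytes) with
        Post => Is_Contained (Buffer);

      --  Verification function: the message is valid up to the message type field.
      function Valid_Message_Type (Buffer : Bytes) return Boolean with
        Pre => Is_Contained (Buffer);

      --  Access function: only callable once the corresponding check has succeeded.
      function Get_Message_Type (Buffer : Bytes) return Heartbeat_Type with
        Pre => Is_Contained (Buffer) and then Valid_Message_Type (Buffer);

      --  Convenience function: checks all fields along a valid path through the message.
      function Is_Valid (Buffer : Bytes) return Boolean with
        Pre  => Is_Contained (Buffer),
        Post => (if Is_Valid'Result then Valid_Message_Type (Buffer));

   end Heartbeat_Parser;

Because the checks are expressed as preconditions, forgetting one of them is not a runtime problem that has to be found by testing; the SPARK tools simply refuse to prove the calling code.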
Let us have a look at how we can use the generated code. Here we have a very small function where we get the byte array from an external source via this virtual read function. We have to label the buffer, as I already said: because we get it from an external source, it is not labeled yet, and we have to state that it is a buffer which potentially contains a heartbeat message. Only after we have done that can we access the fields of the message. As we already specified the relation between the length field and the payload field, the user does not have to access the length field anymore. We can just concentrate on the fields which are really relevant, which here means the type field and the payload field.

But there is still an error in this code: if we run the SPARK tools, they will tell us that we cannot just call the access functions like this. They tell us that the precondition might fail. What is the reason for that? We did not check the validity. So we first have to check whether the message is valid. This can be done with the Is_Valid function: by adding an if statement with Is_Valid and only accessing the fields if we get a positive result back, we ensure that we only access valid messages. Alternatively, we could check each field one by one and thereby also find out where in the message the error occurred.
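As a rough sketch of this usage pattern, again assuming the hypothetical Heartbeat_Parser interface from the previous sketch (Label, Is_Valid and Get_Message_Type are illustrative names, not the actual generated API):

   with Heartbeat_Parser; use Heartbeat_Parser;

   procedure Process (Buffer : Bytes) is
   begin
      --  The buffer comes from an external source, so it first has to be
      --  labeled as potentially containing a heartbeat message.
      Label (Buffer);

      --  Fields may only be accessed after validation; this check discharges
      --  the preconditions of the access functions.
      if Is_Valid (Buffer) then
         case Get_Message_Type (Buffer) is
            when Heartbeat_Request  => null;  --  e.g. build and send a response
            when Heartbeat_Response => null;  --  e.g. match against a pending request
         end case;
      end if;
   end Process;

Without the Is_Valid check, the SPARK tools reject the call to Get_Message_Type, because its precondition cannot be shown to hold; this is exactly the error described above.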
Looking back, what have we achieved so far? By using SPARK, we ensure that no runtime errors exist in our implementation. We have built RecordFlux to correctly identify whether a message is valid, and if that is the case, we can also access the fields of the message. In the future, we also want to represent the protocol semantics in RecordFlux and ensure this part of the protocol as well, and further down the road we also want to use security protocol proofs to ensure that, for security protocols, the security properties are fulfilled by the code which we generate.

We will use RecordFlux for a project which is called Qtls, a component-based, high-assurance implementation of TLS 1.3. This project just started this year and is partially funded by the European Union and the state of Saxony. An essential part of this project is that we want to split the implementation of TLS into critical and non-critical parts, and especially for the critical parts we want to use SPARK and RecordFlux to make sure that they are really correct. As a base platform we want to use the Genode OS Framework, but potentially others will also be supported. We see multiple challenges in this project: we have to find out how to do the separation and still end up with a sensible implementation of TLS, how we can prove that the implementation is secure in the end, and of course we also want to achieve good performance and make sure that the implementation has no side channels. As the project just started this year, there is not much done yet, but there is already a repository on GitHub, and the source code will also be available in the future.

Coming to a conclusion: we have already created a specification language which is powerful enough to specify real-world binary protocols. We have already specified Ethernet including VLAN tags, IPv4 and UDP, and we are currently working on specifying TLS 1.3. We are able to generate SPARK code based on the specification and to prove for this generated code that no runtime errors exist and that messages are correctly identified as valid or invalid. RecordFlux is still in development; we have found some minor bugs while specifying TLS, which I am currently working on. The bigger thing which we also want to do is adding message generation. Currently we can only parse messages, but we also want to be able to generate messages based on our specification. After that, we also want to be able to specify more protocols. We already looked into USB, and there we saw that some features are still missing, because currently we are focusing on the TLV message scheme. That means that we statically compute where a field starts and how long the fields are, based on other fields and on length fields, but in some cases this is not possible. There are protocols where you have to do this at run time to find out where the next field starts. You can think of arrays which contain variable-sized elements: you first have to parse these elements before you know where the next field after the array starts. That is something that we also want to support in the future.

At the end, the code of RecordFlux is also available online, so if you want to have a look at it or at the documentation, you will find it on GitHub. So thank you.

So, are there any questions?

Hi, how do you handle hardware specificities like out-of-order execution or reshuffling of instructions? Do you model hardware at all?

No, we are not at that level. We are modeling just the messages, and the generated code then parses these messages.

So you are not working on side channel things?

That is something that we want to do in the future, also looking at side channels, but that is not done yet.

Anything else? So, thank you, Tobias.