Hello everyone, I am Uri from Certora, and today we're going to talk about bad proofs in formal verification. People often think of formal verification as the holy grail, the highest assurance against bugs. And yet we sometimes see that even formally verified projects still contain bugs. For example, this is a high-profile case where we at Certora formally verified a specific property for one of our customers, and the deployed code still contained a bug. In this lecture we're going to see how that is possible, how it happens, and how we can prevent it. The lecture is divided into a few parts. First we will cover what formal verification is, what the proofs you get out of it are, and what it means for a proof to be bad. Then we will showcase two different types of bad proofs, and we'll also show you how you can sometimes tell whether a proof is bad. At the end we'll walk through a real-life example, the same example I showed you on the previous slide; we'll delve into it deeply and understand it. Let's go. Formal verification is the process where we take a piece of software and check whether it behaves according to some predefined set of rules. We call those rules security properties, or specifications. They are one of the two inputs, along with the code we want to check, that we feed into the software that executes the formal verification, in this case the Certora Prover. After the Prover gets those two inputs, it can give us one of three possible outcomes. One is a proof: in all possible cases, behaviors, and states, the program behaves as intended. Another is that it doesn't always behave as intended, and then we get what we call a counter-example, which often shows us something that is hard to find and very often is a bug in our program, an unintended behavior.
The tool can also time out, and in that case we don't know whether the software behaves well or not. Let's show that with a simple example of Solidity pseudocode. This is a transfer function: we're transferring some amount of tokens from one address to another. An example of a specification that we can give to the formal verification tool is this invariant, which says that the total supply of the token equals the sum of the balances of all addresses holding the token, which makes sense. If we feed those two inputs to the Prover, we will get a bug: a counter-example, a specific scenario where the invariant no longer holds. Here we can see that when we transfer an amount of 18 from the address of Alice to the address of Alice, we get a violation. Alice's balance grew by 18 tokens when it shouldn't have; the invariant is broken, and that's why it's a bug. However, if we fix that Solidity code, say to this implementation, which doesn't contain the bug, we actually get a proof. What does that proof mean? It means that the sum of the balances of the token over all addresses is the same before and after the transfer, and we can be sure that's true no matter which addresses and which amount we choose. That's one of the key advantages of formal verification: exhaustiveness. We're not checking a specific set of inputs; instead we're covering the entire input space. That way we're able to find bugs that are often missed by humans. Another big advantage is that whenever we get a violation, it's very easy to verify: we get specific numbers and specific addresses, so we can run the scenario ourselves, see whether we get a violation, and track down exactly where the problem lies. And we can also get a proof of correctness, which is something we can't really get through many other security methods for smart contracts.
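The self-transfer counter-example can be sketched in plain Python. This is a hypothetical reconstruction of the kind of bug the prover finds, with invented names and balances, not anyone's actual contract code:

```python
def transfer_buggy(balances, sender, recipient, amount):
    """Buggy transfer: reads both balances up front, so a self-transfer
    credits the recipient from a stale snapshot and mints tokens."""
    sender_bal = balances[sender]        # snapshot before any writes
    recipient_bal = balances[recipient]  # stale if sender == recipient
    if sender_bal < amount:
        raise ValueError("insufficient balance")
    balances[sender] = sender_bal - amount
    balances[recipient] = recipient_bal + amount  # overwrites the debit

def total_supply_invariant(balances, total_supply):
    # The invariant from the talk: totalSupply == sum of all balances.
    return total_supply == sum(balances.values())

# The prover's counter-example: Alice transfers 18 tokens to herself.
balances = {"Alice": 100, "Bob": 50}
transfer_buggy(balances, "Alice", "Alice", 18)
print(balances["Alice"])                      # 118: 18 tokens appeared
print(total_supply_invariant(balances, 150))  # False: invariant broken
```

A fix that debits before crediting (as in the corrected implementation the talk mentions) makes the self-transfer a no-op and restores the invariant.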
However, those proofs are very hard to verify. Obviously we cannot go through all the different inputs ourselves and check each one; otherwise we would have done that instead of trying to prove it, right? And sometimes our proofs don't actually mean what we think they do. Sometimes we're proving something completely different from what we intended. Those proofs are bad because they give us a false sense of security, and with a false sense of security we might deploy code prematurely, code that might contain bugs. Let's look deeper into those specifications I've been telling you about. The code you see on the screen behind me is written in a specific language called the Certora Verification Language (CVL), used for writing specifications, but the same principles hold no matter how you represent your property. The general anatomy is divided into three parts: first a precondition, then an operation, and then a postcondition. If we look at this specific example, here we check that the transfer function behaves as intended. First, as our precondition, we read the balance of Bob for a specific token and keep that number. Then we do the operation, in this case the transfer function. After the operation we check the postcondition: we read Bob's token balance after the transfer and check that it is indeed equal to the balance Bob had before the transfer plus the amount of tokens we transferred to him. Let's look at it more visually. When we define a property, we define a set of starting states through the constraints we give; that's the circle. The arrow represents the operation. Then we want to land within one of the desired states, the states in which the assert expression holds.
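The precondition/operation/postcondition anatomy can be mirrored in plain Python. This is a sketch of one concrete run, nothing like the Prover's exhaustive search, and the function names are made up for illustration:

```python
def transfer(balances, sender, recipient, amount):
    # Correct implementation: debit first, then credit,
    # so even a self-transfer leaves balances consistent.
    if balances[sender] < amount:
        raise ValueError("insufficient balance")
    balances[sender] -= amount
    balances[recipient] += amount

def transfer_integrity(balances, sender, recipient, amount):
    # Precondition: rule out the self-transfer corner case and
    # record the recipient's balance.
    assert sender != recipient
    balance_before = balances[recipient]
    # Operation: the function under test.
    transfer(balances, sender, recipient, amount)
    # Postcondition: the recipient's balance grew by exactly `amount`.
    assert balances[recipient] == balance_before + amount

balances = {"Alice": 100, "Bob": 50}
transfer_integrity(balances, "Alice", "Bob", 18)
print(balances["Bob"])  # 68
```

The prover checks this shape for every possible sender, recipient, amount, and starting balance at once; the sketch only exercises one of them.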
So the desired behavior is that from every starting state, if we do the operation, if we draw the arrow, we end up within the green circle, and all of this lives within the space of possibilities. When we get a violation, we get a counter-example: we started from one of the starting states, we did an operation, we have an arrow, but that arrow doesn't land within the desired states. That's a counter-example, that's a violation, and that's what might be a bug. The interesting thing to note is that if the tool cannot find a counter-example for the specification or property we've provided, it will output a proof. That is because in logic we define a false statement as a statement for which we can give a counter-example. That covers the first section of this lecture, what proofs are and what formal verification is; now we look at two different types of bad proofs. The first type is what we call a vacuous rule, or a vacuous proof. Vacuous means empty, meaningless, insignificant. This is best shown with a real-life example. I am 29 years old and I don't have any children yet, and I'm claiming the following statement: if I let my children drink Colombian coffee, they will sleep better at night. Is that statement true or false? If we want to say the statement is false, we need to provide a counter-example. We need to start from one of the starting states, meaning you need to pick one of my children. Then you need to do the operation, meaning let them drink Colombian coffee, hopefully not too hot, so as not to injure them. And then you want to observe that they actually do not sleep better at night. Can you do that? You cannot, since I don't have any children, and therefore there is no counter-example to that claim. If there is no counter-example to a claim, the claim is true. Here, it's true vacuously: we don't have a starting state, so it's true.
However, I can also say something that seems completely contradictory. I can say that if I let my children drink Colombian coffee, they will not sleep at night. It sounds like the exact opposite, yet the same principle still holds: you still cannot provide a counter-example. I still don't have a child, and therefore that statement is also true. If we look at it visually, we just don't have any starting states, we don't have any arrows, and therefore we cannot have a counter-example. Let's look at a code example. This balanceOf function belongs to an OpenZeppelin token, and the interesting part is that it requires that the address whose balance we query cannot be the zero address. One of Certora's employees was working on the OpenZeppelin contracts and was writing a rule, a property, that he thought should hold on OpenZeppelin's tokens. However, that rule, that specification, contained an error. You see, in the middle of the rule, this requirement that the balanceOf the zero address for a given token must be zero. That can never happen, because every call to balanceOf for the zero address reverts. If it reverts, in particular it never returns the value zero. Therefore nothing can satisfy this requirement, and if nothing can satisfy this requirement, we don't have any starting states. This is a vacuous rule, and therefore it doesn't actually matter what appears in the rule after that requirement; it will be true no matter what we do. In fact, I could assert something completely ridiculous, like zero being greater than one, and that would still be true, because you still cannot give me any counter-example. When a rule is vacuous, we can prove anything we want. Unfortunately, vacuous rules are not just an academic concern.
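Both the coffee claims and the unsatisfiable balanceOf requirement boil down to quantifying over an empty set, which Python makes visible: `all()` over an empty iterable is vacuously True. The data and names here are invented for illustration:

```python
children = []  # there are no starting states

# "If I let my children drink Colombian coffee, they will sleep better."
claim = all(child["sleeps_better"] for child in children)
# The apparently opposite claim is just as true -- vacuously.
opposite = all(not child["sleeps_better"] for child in children)
print(claim, opposite)  # True True

# Same shape as the OpenZeppelin example: balanceOf reverts for the
# zero address, so "require balanceOf(0) == 0" admits no starting state.
def balance_of(balances, account):
    if account == 0:
        raise ValueError("query for the zero address")  # always reverts
    return balances.get(account, 0)

try:
    satisfiable = balance_of({}, 0) == 0
except ValueError:
    satisfiable = False  # the requirement can never be established
print(satisfiable)  # False
```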
Formal verification has been used a lot on hardware over the years, and studies have found that about 20% of hardware specifications, as first written, are actually vacuous. And whenever they are vacuous, they hint at a real problem, either in the code you're trying to verify or in your specification. So this is a pressing problem, but fortunately we have some ways to catch such rules. One of the ways we use at Certora is what we call a reachability check. We take the same specification as before, exactly as it is, and add something at the end: an assertion of false. Obviously, asserting false should always fail, so we expect the rule to fail; we expect a counter-example, because nothing can satisfy false. However, if we do not get a counter-example, if the rule is proven, that means we never actually reached that last line of the specification, because the requirement ruled out all possible starting states. That's how we check it: we say something absurd at the end and see whether we fail. If we don't fail, that's the problem; that's when we know our rule is vacuous. For completeness, I will say that usually when we have a vacuous rule, it's not due to a single precondition, a single requirement, that is never satisfiable. It's usually due to a combination whose intersection is empty and unsatisfiable, while each part on its own makes sense. For example, here, if we require that x is smaller than y, maybe we can prove the property. Say we add another requirement, y is smaller than z. Now we limit the starting states to the intersection. And if we add a third requirement, that z is smaller than x, then we have no intersection anymore: by transitivity, x would have to be smaller than itself, and that's never true for any number.
Therefore we don't have any starting states, and this rule is vacuous. But note that some combination of those requirements could make for a sensible rule. Moving on to the next type of bad proofs, we're going to talk about tautologies. A tautology is something that is always true, and therefore it actually tells us nothing useful about the code we're trying to prove things about. For example, let's look at this rule. This rule tries to check, again, the integrity of the transfer function. We check that the balance of a recipient is zero, then we transfer some positive amount of tokens to that recipient, then we assert that the recipient's token balance actually grew. However, the rule on the screen is wrong, because we didn't compare the balance of the user before and after the transfer. We check that the balance of the user after the transfer is not smaller than itself, and that is true for any number. That's a problem, because in this case we didn't actually check what the transfer function is doing; we're checking something that is always true. The transfer function could burn all the tokens, could send all the tokens to me, or it could do something entirely different and not move any tokens around. We don't know; we didn't actually check that. If we look at a tautology visually, it's something that's always true, so it encompasses the entire space of possibilities. Here, the arrow representing the operation lands somewhere within the space of possibilities, and therefore the assertion holds. The problem is that we could have drawn any arrow, we could have done any operation, and it would still be true, so we didn't check anything specific about our code at all. One of the ways we find tautologies at Certora is to remove all the preconditions and all the operations, leave just the assertion at the end, and see whether it still passes.
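The empty-intersection example and the assert-false reachability check can both be sketched with a brute-force search standing in for the SMT solver; the function names are mine, not the Prover's, and a finite domain stands in for all integers:

```python
from itertools import product

def rule_is_vacuous(preconditions, domain):
    """Reachability check, sketched: append `assert False` to the rule.
    If no assignment satisfies every precondition, the assert is never
    reached and the bogus rule is 'proved' -- i.e. the rule is vacuous."""
    for x, y, z in product(domain, repeat=3):
        if all(pre(x, y, z) for pre in preconditions):
            return False  # reached `assert False`: got a counter-example
    return True  # `assert False` was proved: no starting states at all

# Each requirement is satisfiable on its own, but together they force
# x < y < z < x, hence x < x by transitivity -- an empty intersection.
pres = [lambda x, y, z: x < y,
        lambda x, y, z: y < z,
        lambda x, y, z: z < x]
print(rule_is_vacuous(pres, range(10)))      # True: vacuous
print(rule_is_vacuous(pres[:2], range(10)))  # False: x < y < z is fine
```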
If it passes, that means it does not depend on the requirements or the operation; it's something that is simply always true, and therefore it's a tautology. So here in this example, we just remove all the lines but the last two, and in this case the rule would pass, it would always be true, and that hints that it's a tautology. Now it's time to delve into the real-life example I showed you briefly at the beginning of the lecture. Before that, I need to introduce a new notion, the notion of an invariant. An invariant is something that is always true; in particular, it's something that is preserved by operations, in our case calls to functions of a smart contract. The way we prove an invariant is by induction, which you might be familiar with from school; it's similar to proving things about natural numbers by induction. We have the base of the induction: we check that the condition is true right after calling the constructor. Then we do the induction step: we assume the invariant holds, we run some function of the contract, which must be public or external, and then we check that the same condition is still true afterwards. If we do that, we get that the invariant is always true, just as induction proves statements about all natural numbers. The interesting thing here is that we check every possible function of the contract. This example belongs to Notional, one of Certora's customers; they are using our tool and writing specifications, and we'll go over it. This is a more complex example, because it's a real-life example. Their system allows lending and borrowing at fixed rates, by the way. What they tried to prove here is that an asset in their system cannot be counted both as a bitmap asset and as an active asset.
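The strip-everything-but-the-assertion check from a moment ago can be sketched the same way, with a sweep over sample values standing in for the prover; the names are illustrative:

```python
def buggy_assertion(balance_after):
    # The bad postcondition from the talk: compares a value with itself.
    return balance_after >= balance_after

# With the preconditions and the transfer removed, the assertion still
# holds for every value we try -- a strong hint it's a tautology.
print(all(buggy_assertion(b) for b in range(-1000, 1000)))  # True

def meaningful_assertion(balance_before, balance_after):
    # A real postcondition mentions both states, so it fails once the
    # operation connecting them is removed.
    return balance_after > balance_before

print(all(meaningful_assertion(a, b)
          for a in range(10) for b in range(10)))  # False
```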
And here we see that they try to reach that conclusion at the end; the highlighted line at the end says that the same currency can't be both active and bitmap. The first line here is just a natural requirement on the index of the asset, and it's not very interesting. The rest is where it gets a bit more complicated. Here we have one statement that says the bitmap currency cannot be zero, and that will always have to be true. Then we have another statement with an OR. If we look at its first part, we see that we require that the active unmasked currency equals zero. Looking at just that part of the OR, we have that the bitmap currency is not zero and the active currency is zero. Zero and nonzero are always different, so the conclusion is trivially true. What happens if we look at the other part of the OR? Here we require that the active masked currency is zero. In Notional's system, assets are represented using 14 bits plus two more bits for a mask, so the masked value encompasses all 16 bits and the unmasked value the lower 14. But if the masked currency is all zeros, the unmasked currency must also be all zeros. Therefore it follows that it is zero itself, and we have actually reached the same statement as before: the bitmap currency is not zero and the active currency is zero, so they must be different, and that's always true, because zero is always different from nonzero. In total, this entire statement, although complex, is just a tautology. It's hard to see, and you can understand why the person who wrote it could make that mistake; it's not apparent, but this is indeed a tautology. Unfortunately, the code was deployed in the belief that it could not contain that bug, and unfortunately it did. The bug lay in this specific function, enableBitmapCurrency. It's a bit hard to see the bug itself, so I'm just going to show you the exploit scenario.
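The bit-level reasoning can be checked mechanically. This is a hypothetical restatement of the buggy invariant based only on the description in the talk (a 16-bit field whose lower 14 bits are the currency id), not Notional's actual code:

```python
UNMASK = 0x3FFF  # lower 14 bits: the currency id without the two flag bits

def unmasked(currency16):
    return currency16 & UNMASK

def buggy_invariant(bitmap_currency, active_masked):
    # Precondition 1: the bitmap currency is nonzero.
    if bitmap_currency == 0:
        return True  # precondition fails: vacuously satisfied
    # Precondition 2, the OR: unmasked active == 0 OR masked active == 0.
    # But masked == 0 already forces unmasked == 0, so either branch
    # leaves us comparing a nonzero bitmap currency against zero.
    if unmasked(active_masked) != 0 and active_masked != 0:
        return True  # precondition fails: vacuously satisfied
    # Conclusion: the two currencies differ -- always true at this point.
    return bitmap_currency != unmasked(active_masked)

# Sweep the whole input space: the invariant never fails, so it
# constrains nothing about the contract -- it is a tautology.
print(all(buggy_invariant(b, a)
          for b in range(32) for a in range(1 << 16)))  # True
```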
The exploit scenario goes like this. We enable the bitmap currency for an account, let's say ETH. We deposit a second currency for that account, let's say DAI. And then we try to enable DAI as the bitmap currency. Due to the bug, DAI will now be counted both as an active asset and as a bitmap asset, meaning it is counted twice. The user's collateral will appear larger than it truly is, so we could borrow more money than we should, and therefore we can drain funds from the system. Fortunately, this bug was found by a white hat, so no real damage was done; Notional had to pay the maximum bounty amount of 1 million USD, but no funds were lost. The interesting thing is that the invariant I showed before is pretty close to the correct one. The correct, fixed invariant looks something like this; it actually looks simpler in this case. And the interesting thing is that if we run this invariant on the buggy version of the code, we get the exact same counter-example: set one token as the bitmap currency, deposit another token, then move the bitmap currency to be the other, active currency. We get this exact counter-example. Notional used some of the top auditors in the field, and formal verification, and the bug was still missed. However, it's interesting to see that if formal verification had been used correctly, it could have caught this critical bug. So not only can we catch the bug with the fixed version of the rule, we can also verify the fix, right? The specification doesn't change when the code changes, so we can run it again. More importantly, this incident was a catalyst for Certora to invest in ways to catch bad proofs and ensure the quality of specifications automatically. Today, you could have caught that the buggy invariant was a tautology using the method I showed you before: we just take the assertion and check it without the preconditions and without any operation.
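The double counting can be sketched with a toy collateral calculation. This is my own illustrative model of the scenario, with made-up amounts, not Notional's accounting code:

```python
def collateral_value(account):
    """Sum deposits over the bitmap currency plus all active currencies.
    If one currency appears in both places, it is counted twice."""
    total = 0
    if account["bitmap_currency"] is not None:
        total += account["deposits"].get(account["bitmap_currency"], 0)
    for currency in account["active_currencies"]:
        total += account["deposits"].get(currency, 0)
    return total

# Step 1: enable ETH as the bitmap currency.
account = {"deposits": {"ETH": 10, "DAI": 100},
           "bitmap_currency": "ETH",
           "active_currencies": []}

# Step 2: deposit DAI as a second, active currency.
account["active_currencies"].append("DAI")

# Step 3 (buggy): switch the bitmap currency to DAI; ETH is moved to
# the active list, but DAI is never removed from it.
account["bitmap_currency"] = "DAI"
account["active_currencies"].append("ETH")

print(collateral_value(account))  # 210, though real deposits total 110
```

The inflated collateral (210 instead of 110) is what lets the attacker borrow against value they don't have.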
We would have seen that the assertion holds on its own, because it's a tautology, and therefore this mistake would not be possible with our tool today. We're still working on more ways to ensure the quality of specifications and to catch more possible types of bad proofs. To sum it up: specifications are written by humans, just like code, and they're equally hard or even harder to write. This is something you have to keep in mind; you shouldn't expect your specifications to be perfect, just as you don't expect your code to be perfect. Therefore, you should always check your spec. It's good if another person reviews it, if an expert reviews it, and it's even better to also use automatic checks like the ones we at Certora provide. You should always be suspicious of what the tool gives you, as with any other security tool. When you get a bug, it's always a good result, because you can always check it and see whether it's real. However, when you get a proof, you don't have a good way to verify it, and the best course of action is to be suspicious of it: don't take it blindly and just deploy your code. And as we've seen, by writing the correct specifications you can still find bugs that are worth millions or billions of dollars. Thank you. I would like to take some questions. Thank you, great presentation. It seems from what you explained that when you have a bad proof that is vacuous, you can detect it immediately, just with something like a coverage report showing which part of your specification was never reached. But when you have a tautology, it's more difficult, because you need to remove preconditions, and you can have, say, ten preconditions; you don't know which one, or perhaps which combination of them, is responsible. So what is your view on this? Is it correct that it's easier to detect vacuous rules than tautologies? Okay.
So one thing I want to emphasize: the checks that I've shown don't catch all types of vacuity, and not all types of tautology; they catch only the simplest types. I chose to present them because they're the easiest to explain in a 20-minute slot. There are more involved checks that try to cover more and more cases. We don't cover all of them currently; it's an ongoing effort at Certora, and in formal verification at large. Maybe it was not clear, so I'll say something simple: the removal of all the preconditions and operations is done automatically by Certora. It's an automatic check, so in that sense it's not harder. Now, if I understand the question correctly, you're asking: how can I fix it, which part of the statement causes it to be a tautology? That is indeed sometimes more difficult, but it's something you can do systematically. Say your expression is an OR expression. You can check each part of the OR independently and see which branch is always true. If one branch is always true, the entire statement is always true; therefore that branch causes the tautology. This is actually a feature that's very close to production at Certora; maybe I'm wrong and it's already in production, I'm not so sure on the details, but it's really close. If you have an implication, it's the same idea: the premise, the part before the arrow, the part that comes before the conclusion, if it's always false, you will have a rule that's vacuous. So depending on the structure of your conditions, you can learn more intelligent things about them that can guide you. It's still not perfect, but this is something we're actively working on. There's time for another. Thank you all.
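The branch-by-branch localization described in the answer can be sketched as follows, again with enumeration over sample values standing in for the solver; the helper names are mine:

```python
def always_true(pred, samples):
    # A disjunct that holds on every sample is a tautology candidate.
    return all(pred(x) for x in samples)

# An OR-shaped assertion: if any single disjunct is always true,
# the whole disjunction is always true.
disjuncts = [lambda x: x * 0 == 0,  # always true: this branch is the culprit
             lambda x: x > 5]       # genuinely depends on x

samples = range(-100, 100)
culprits = [i for i, d in enumerate(disjuncts)
            if always_true(d, samples)]
print(culprits)  # [0]: the first branch makes the assertion a tautology
```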