Thanks for having us. I'm Rikard. I work for Runtime Verification, where I usually work on KWasm, and I'm here with Shang-Wei from NTU Singapore. We want to tell you about K, and how and why specifying Solidity in K is a good idea.
The K semantic framework is built on the idea that every programming language should have a formal semantics. From a formal semantics, the K framework can automatically derive all the common software tools. This includes parsers and interpreters, but also more complex tools like compilers, model checkers, deductive verifiers, and so on. So the work of building T tools for L languages goes from T times L to T plus L.
K is by no means a new technology. The framework has been developed for over 15 years. We use it heavily at Runtime Verification, and there are lots of publications surrounding it. It has also proven very language and blockchain agnostic: the formalism works for high-level languages like Solidity, C, and Java, and for low-level languages like Wasm and EVM, and we have worked on different blockchains like Ethereum, Algorand, Tezos, and so on.
I just want to show you the formalism that K is based on. It's a rewriting formalism, so I'm going to give you a short example of what that looks like. The first thing you need to define is the state that the rewrites act on, which we call the configuration. It's built up of what we call cells. Here we have a configuration which contains the program to run, in a cell that we named K. We have an environment for the current variables, which is a key-value map, and we have a local memory storage. There are more cells that we don't really care about now.
Rikard, I'm sorry to interrupt, but the slides are kind of blurry. Is there an option where you can either share them with the audience, or maybe Shang-Wei could share the screen instead? Sure, let's see. They're kind of hard to read right now. I can share my screen, but does it affect Rikard's presentation?
So we have some transitions, but we should be okay. I'll just say "next slide" or something. Let me try as well. That would be great. Sorry for the interruption. Yeah, so that just means that people will have to select the slides rather than the speaker's picture, because the live stream follows the audio. Yes, okay. So let me share my slides. Great. You can click through. Much better quality. Yes, this is very sharp now. Thank you. Okay. Yep, keep going.
And here we come to syntax. You write your syntax in typical EBNF form, but we have these handy annotations. For example, strict(2) means that the second argument here is strict, so the right-hand side is evaluated first. That way you can add some semantic meaning to your syntax declarations.
If you go to the next slide, here's an example of what a rewriting rule looks like, and I'll show you how it acts on a specific configuration. Here we have a configuration with the identifier foo, and you assign three to it. You write your semantic rules with this rule keyword. Basically, what we do here is look in the current environment, find the pointer, and then go to the storage and modify the value at that pointer. I'll go back. Yeah, so it's the rule keyword, and you see these little rewrite arrows; those specify where the state is going to change.
Next slide. A rewriting rule applies to any configuration that its left-hand side unifies with. In this case, the configuration matches the rule: the assignment in the rule matches foo = 3, with foo assigned to X and three assigned to Y. Actually, this might be a little annoying to do in this way, so let's skip over the rewriting part. I'm happy to explain this in detail to anyone later, or you can look at the slides yourself. So let's go to formal verification.
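
For readers without the slides, here is a minimal sketch of the assignment rewrite described above, written in Python rather than K. The cell names `env` and `store`, the tuple encoding of the assignment, and the function name `assign_rule` are assumptions made for illustration; they are not taken from the slides or from any K definition.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """A toy configuration made of three cells (names are illustrative)."""
    k: tuple     # program still to run; the first item is executed next
    env: dict    # variable name -> pointer
    store: dict  # pointer -> value


def assign_rule(cfg):
    """Rewrite `X = Y` at the head of the k cell, if the rule matches.

    Returns the rewritten configuration, or None when the left-hand side
    does not match the given configuration.
    """
    if not cfg.k:
        return None
    head = cfg.k[0]
    if not (isinstance(head, tuple) and len(head) == 3 and head[0] == "assign"):
        return None
    _, x, y = head                      # X binds to the name, Y to the value
    if not isinstance(y, int) or x not in cfg.env:
        return None
    pointer = cfg.env[x]                # look up the pointer in the environment
    new_store = {**cfg.store, pointer: y}   # update the value at that pointer
    return Config(k=cfg.k[1:], env=dict(cfg.env), store=new_store)


before = Config(k=(("assign", "foo", 3),), env={"foo": 0}, store={0: 0})
after = assign_rule(before)
print(after)
```

Only the `k` and `store` cells change, which mirrors the point that the rewrite arrows mark exactly where the state changes; the `env` cell is read but left as it was.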
Basically, from a semantics written in this formalism, there's a straightforward way to reason about program execution. If you click next: the K framework derives a prover for free. The basic idea is this. You treat every rule as an axiom. Next; you can do next three times. You write a claim as a rewrite rule. Then you start from the left-hand side and apply all axioms that match, branching whenever more than one applies, and you show that on every branch you always reach a state that matches the right-hand side.
So, next slide. Let's see. Maybe I shouldn't, no. Yeah, that's probably good. So why should you bother making an executable formal specification? I'd say it's the best of both worlds. You get something that's readable and reasonably high level. With K, you can even write it in a literate style, inline with your documentation. It's executable, obviously, so you get an always up-to-date, correct-by-construction reference interpreter. Everyone working on a formal verification tool can now do so "properly", in quotation marks, because you actually have a formal definition of the language. I'm very curious to see what's going on with the SMTChecker tomorrow regarding this. But yeah, having some formalism that describes the language is usually a good idea.
I also find that it's a good prototyping tool for trying out language changes, because rather than just hacking away on a language change in the compiler, for example, you need to specify it in a way that is ruthlessly unambiguous. And at least with K, the semantics are even composable, so you could write a separate semantics for, say, Yul, and include that in the Solidity semantics. And it actually shouldn't be that intimidating, because defining a semantics is sort of on par with writing an interpreter in terms of work. So with that, I want to ask you to consider the statement that Solidity should have an executable formal semantics.
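
The proving idea described earlier (start from the left-hand side, apply every matching rule, branch where several apply, and require every branch to reach the right-hand side) can be sketched as a small search. This is a simplified illustration, not how the K prover is implemented: it explores concrete states instead of symbolic ones, and it uses a step bound (`max_steps`, an assumption of this sketch) instead of any real treatment of loops. The names `all_paths_reach`, `dec_one`, and `dec_two` are hypothetical.

```python
def all_paths_reach(start, rules, goal, max_steps=10_000):
    """Check that every execution path from `start` reaches a `goal` state.

    Each rule is a function from a state to a list of successor states
    (an empty list means the rule does not match).
    """
    frontier = [start]
    steps = 0
    while frontier:
        state = frontier.pop()
        if goal(state):
            continue                      # this branch is discharged
        successors = [nxt for rule in rules for nxt in rule(state)]
        if not successors:
            return False                  # stuck before reaching the goal
        steps += 1
        if steps > max_steps:
            return False                  # bounded search: give up
        frontier.extend(successors)       # branch on every applicable rule
    return True


def dec_one(n):
    """Toy axiom: a positive counter may decrease by one."""
    return [n - 1] if n > 0 else []


def dec_two(n):
    """Toy axiom: a counter above one may decrease by two."""
    return [n - 2] if n > 1 else []


def is_zero(n):
    return n == 0


# Claim: starting from 5, every path ends at 0.
print(all_paths_reach(5, [dec_one, dec_two], is_zero))
# With only dec_two, the path 5 -> 3 -> 1 gets stuck, so the claim fails.
print(all_paths_reach(5, [dec_two], is_zero))
```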
And with that, I just want to hand over to Shang-Wei, who has been working on just such a specification. Okay, thanks, Rikard, for the introduction to the K framework. I'm going to take over and introduce how we define the formal semantics of Solidity in the K framework.
To do so, you need to define two components. The first component is the configuration, which represents the state of smart contracts. If you look into the configuration file, you'll find that it has two main parts. The first part, marked in red, is for the execution of a smart contract instance, while the second part, marked in blue, records the status of the whole blockchain network. Let's zoom into the red part, and you will see that we have a dedicated cell called execution engine for the execution of a smart contract instance. Inside this cell, we have several important cells. For example, there is the call stack for function calls, either external or internal. We have the call state, including the ID, which is the address of the current instance, the call ID, the call value, storage, local memory, etc. If we look into the blue part, you can see that we have a cell called accounts, in which we store all the contract instances that have been deployed on the blockchain, including each one's address, contract name, balance, storage, etc.
Now let's move on to the second component to define, which is a set of semantic rules indicating how each Solidity statement behaves based on the current configuration, as well as how it updates the configuration. Let me use this statement as an example. Here we have a statement that declares an unsigned integer variable in storage whose initial value is three. How do we define the semantics of this statement? We need to write a semantic rule for it. For example, if in the K cell we see a statement of this syntax, we know that it's a variable declaration.
So we rewrite this statement to an allocate term in K. It looks like this, and at the same time, we need to look up the necessary information and put it here. For example, we need to know the current account in which we are going to declare this variable, so we look up its account ID. We put it here together with the variable information: the name of the variable, the expression, the value, the type, the location, etc. Then we move on to this allocate term. Whenever we see this allocate term in the K cell, provided with the necessary information, we are able to make the corresponding change in this account, that is, to insert this variable record into this account cell. Here I omit the details, but to sum up: to develop a formal semantics of Solidity in the K framework, we need to define, first, the configuration and, second, a set of semantic rules like this for each statement.
Now I'm going to talk about the challenges we face when we develop the semantics. The first challenge is that Solidity is changing very fast, in both syntax and semantics. Currently the latest version is 0.6, but if you look at the version history, on average there is a version change almost every month, which makes it quite challenging for us to keep up. The second challenge is that the language description in the official documentation is not comprehensive; complicated or corner cases are usually not mentioned. For example, for function modifiers, the following details are not mentioned: what if the underscore statement is used more than once, what if a modifier is inherited more than once by a function, et cetera. We need to figure this out by ourselves based on experiments, which is quite time consuming.
As for the current status of KSolidity: the project started at the beginning of 2018, and so far we have two versions.
Version 1 supports Solidity 0.4, and this table summarizes the features supported by our semantics. As you can see, almost every core feature is supported, except those that we are not able to support. For example, there's inline assembly; basically this is EVM bytecode, and it is out of the scope of Solidity itself. Since Solidity 0.5 was introduced, we have planned a refactoring from version 1 to version 2 to support Solidity 0.5. Currently we have finished core expressions and statements, and we are still working on some advanced features, for example function modifiers, user-defined types, inheritance, et cetera. You can find our version 2 implementation on GitHub now. With version 2, you can do automatic testing or proving of your smart contracts.
Now I would like to share with you one of the interesting findings from when we developed the semantics, and it dates back to Solidity 0.4. Here we have a very simple test case that we used to test our semantics. You can see that there's a very simple contract, Test, consisting of two state variables, A and B, with initial values 1 and 2 respectively. We have a function foo in which we declare a local array D with two elements. After that we assign 7 and 8 to the two elements respectively. Now the question is: what are the values of A and B after we execute the function foo? Well, based on our semantics, A is still 1 and B is still 2, but the program gets stuck at this statement. This is because when we declare D, we don't specify the location, so by default it will be in storage. Based on the semantics, it will be a reference to storage. But it has no initial value, meaning that we don't know where D points to. So when we want to execute this statement, we don't know where to store 7.
However, if you execute this contract with the Remix compiler, I mean the 0.4 version, you will find that the result is that A becomes 0 and B becomes 8. You may be surprised, because you thought you were dealing only with local variables, but actually global variables are affected. Obviously, something went wrong here. We reported this finding in our technical report in 2018 on arXiv. After our investigation, we found that the Solidity 0.4 compiler implemented some implicit behavior which is beyond developers' expectations. The problem comes from this statement. When we declare this array D, we don't specify the initial value, but the compiler assumes that the default value is 0. So D actually points to slot 0 in storage. That's why, when we execute this statement, the contents become like this. And when you execute the second assignment statement, the contents become like this. So that's why A becomes 0 and B becomes 8. Of course, this behavior has been fixed since Solidity 0.5: now you need to specify the initial reference for D, otherwise the compiler will complain. From this example, we can see that a formal semantics of Solidity is very important, especially for developers.
Now I would like to summarize this talk by introducing the possible applications of KSolidity. First of all, our semantics is fully executable, meaning that you can execute your smart contract based on our semantics, and you will have an output configuration. Actually, you can have the output configuration after each statement. And you can do formal verification. You can have some assertions in your smart contract, and our tool can help you do symbolic execution to check whether an assertion can fail or not. Or you can even try to prove that your smart contract is correct, but of course you need to specify the properties. And then you can even do compiler verification. This is what you can do.
For example, you have a smart contract, right? You can run your smart contract based on our semantics, and you will have an output configuration. On the other hand, you can compile your smart contract with a compiler to get your EVM bytecode, execute the bytecode, and get your real output. After that, you can compare the two outputs to do cross validation; if they are not consistent, something is going wrong. And last but not least, you can even do semantic consistency checking. For example, how do you know that behavior at the Solidity level conforms to that at the EVM bytecode level? To do the consistency checking, you need a formal semantics of EVM, which is provided by another project, KEVM, from Runtime Verification. All right, I think that's pretty much what I want to share with you today. I think Rikard and I would be happy to take questions if we still have time.
Yes, you still have a couple of minutes. So feel free to ask questions in the Solidity Gitter chat, or right here in the room. Okay, somebody's raising his hand. I see it already. Hey, guys, thanks for the talk. Wait, can you hear me? Yes. Okay, cool. Yeah, so this stack, KSolidity to KEVM, of course makes a lot of sense, and it's super nice. But let's assume that KYul exists. How much easier would going from KSolidity to KYul and from KYul to KEVM be, if you want to verify the whole stack, than the single large step from KSolidity to KEVM? Hey, Rikard, are you going to answer? Yeah, sure. So you mean for compiler verification? Like, verifying the intermediate steps? Right, if you have the source code, and then you have the compiled bytecode, and you want to check it with KEVM, for example. Yeah, I mean, it would be the same thing, in the sense that if you have a complete semantics of Yul, you could symbolically execute your program, and then you could check that.
I mean, the tricky part is to find the equivalence: figuring out that, well, we expect this value to be output or stored somewhere by the EVM, and then correlating that with the corresponding part of the Solidity configuration. It would be the same thing, I think, but it depends on how the semantics are written. In principle, it should be the same or simpler.
Actually, I have something to add. When we define the formal semantics of Solidity, we try to keep the configuration as close to the EVM's as possible. For example, we still keep the gas, the transaction number, etc. This is because we want to do the consistency checking. Some of the cells, though, we cannot really use, for example the gas cell: at the Solidity level, you don't know how much gas each statement consumes. So we just put it there and have some estimation. One way is to compile your Solidity contract into EVM bytecode, calculate the gas, then go back to the Solidity level and put it into the gas cell. So I think if you want to do the consistency checking, it would not be difficult, because 80% of the configuration is the same.
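
As a rough illustration of the consistency check discussed in this answer, the sketch below compares the cells that two output configurations share and skips the gas cell, which is only estimated at the Solidity level. Flat Python dictionaries are a hypothetical stand-in for the nested K configurations, and the cell names and values are invented for the example; they are not taken from KSolidity or KEVM.

```python
def diff_cells(solidity_cfg, evm_cfg, ignore=("gas",)):
    """Return the shared cells whose contents differ between two configurations.

    Cells that exist on only one side, or that are listed in `ignore`,
    are not compared.
    """
    shared = (solidity_cfg.keys() & evm_cfg.keys()) - set(ignore)
    return {
        cell: (solidity_cfg[cell], evm_cfg[cell])
        for cell in sorted(shared)
        if solidity_cfg[cell] != evm_cfg[cell]
    }


# Illustrative outputs of running the same contract at both levels.
solidity_out = {"id": 1, "storage": {0: 7}, "gas": 0, "callStack": []}
evm_out = {"id": 1, "storage": {0: 7}, "gas": 21000, "pc": 5}

print(diff_cells(solidity_out, evm_out))  # no mismatch in the shared cells
```

An empty result means the two levels agree on every cell they have in common; any entry in the result points at a cell where the source-level and bytecode-level behavior diverge.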