Okay, perfect. Yeah, so what he's saying is that this branch is not reachable, for whatever reason. We're going to figure out how to work this out symbolically, and how we can use symbolic computation to find that this branch is unreachable.

So what is symbolic computation? Symbolic computation is about representing properties using mathematical equations, and we use the solutions of these equations to reason about the properties we started with. The property we wanted to know in the previous example is that the revert is unreachable, and we have to figure out a way to convert this Solidity code into mathematical equations. I want to give a hint about the climax: usually, the system of equations having a solution means that a property can be violated. On the contrary, if the system of equations doesn't have any solution, then it usually means that the property is always true. This is not generally true, but most of the time this is what you're trying to do.

Now we want to convert code into mathematical equations. How do we do that? The question is what we encode. We could encode Solidity directly. We could encode Yul, which is an intermediate representation of Solidity, one level down from Solidity. We could also encode EVM, which is at the lowest level. Solidity is a complex language; let's skip encoding that, because there are too many rules we would have to handle. EVM, on the other hand, is too simple, and we would have to extract a lot of information about the control flow and then encode it. So let's ignore Solidity and EVM for encoding and just deal with Yul. It's in the right middle ground.
It's simple enough, and it has enough information about the control flow.

The most fundamental thing to encode is a variable, and a variable in the EVM is a 256-bit integer. Most of the time you represent variables as elements of the integers, written Z. If possible, we add the constraint 0 <= x <= 2^256 - 1.

Now that we know how to encode a very simple variable, we need to figure out how to assign values to variables. Here's a very simple Yul block with three variables x, y and z: x is assigned the value 1, y is assigned calldataload(0), and z is a Yul expression, lt(x, y). We want to represent each of these assignments by constraints. For x and y it's very simple. For x you just have the equation x = 1. For y we just declare y, that's it; it's a plain EVM variable. We can't really produce any extra constraints from calldataload(0), because it can be anything, so we just have to treat y as a regular EVM variable. And with z we have to figure out a way to encode lt(x, y). We will deal with that later.

But the big question is: can we handle every assignment? Here's a different Yul block, where you assign the value 1 to x and then there's a switch statement with three different control-flow branches. Depending on the value of calldataload(0), which we don't know, x can be assigned 2, 3 or 4. So the question is, can we actually encode the switch?

This brings us to the notion of static single assignment. SSA variables are variables that are only assigned once, and working with SSA variables simplifies our analysis quite a bit. Here is an example of a different Yul block with two variables, x and y. y gets assigned calldataload(32) at the beginning and is then reassigned something else. By definition, y is not an SSA variable.
It's assigned twice. But it's actually possible to transform this block into another Yul block where we introduce a new variable z, and all these variables are actually SSA variables. Generally speaking, we only want to work with SSA variables, because they really simplify our analysis. However, it's not always possible to do a Yul-to-Yul transformation such that all the variables are SSA. There was the switch example from two slides ago: you cannot encode x as an SSA variable purely in Yul. But we can still get a lot done with just taking whatever variables are SSA. We also have a step in the Yul optimizer called the SSA transform that lets us transform Yul into what we call a pseudo-SSA form. A lot of variables are SSA, but not necessarily every variable.

So whenever we encounter a non-SSA variable during the analysis, we replace it by a free variable. A free variable is what we mentioned two slides ago, sorry, three slides ago: it only carries the basic constraints that you can give to any variable in the EVM. There is an important caveat here: every time we encounter a non-SSA variable, we have to replace it with a fresh variable, because it may have been assigned something else between the two reads. We can of course optimize this further, but this is what we'll do for now.

Now let's think about how to encode some EVM instructions. Perhaps the most fundamental EVM instruction is addition: you take two numbers, add them and leave the result on the stack. So how do we symbolically represent the addition x + y? What do you think, is it just x + y?
Unfortunately, it's not that simple. If you look at the EVM semantics, addition is defined as x + y modulo 2^256. And if you look at high-level Solidity code, we have checked arithmetic, so x + y would revert if x + y is greater than or equal to 2^256. So we are already seeing that it's not that simple to encode add. It's doable, but it's not the easiest.

Here is an easier set of instructions: lt, gt and iszero. Here's a formal representation of these opcodes, where we define when they take the value one or zero. For lt(a, b), if a is less than b the opcode gives one; otherwise it gives zero. gt is the same with the comparison reversed. For iszero, if the value is zero you get one, and zero otherwise.

Let's digress for a bit and talk about difference logic, starting with an example. x, y and z are integer variables, and there are two constraints: x - y <= 4 and y - z <= 3. The question is, does the system have a solution? It does: you can just assign x = 4, y = 0, z = 1, and these two constraints are satisfied. To step back a bit, you can generalize difference logic by saying: you have n variables x_1 to x_n that are integers, and constraints of the form x_i - x_j <= k for a constant k.

But let's look at a different example. You add one more constraint here, which is z - x <= -8, and the question is, does this have a solution now? Any takers? It actually doesn't have a solution. How do we prove this? Assume that there is a solution. Let's add all the equations.
So you add (x - y) + (y - z) + (z - x), and the right-hand side is going to be 4 + 3 - 8. The left-hand side is going to be zero, so we arrive at 0 <= -1, which is a contradiction. So there is no solution.

But how do we use a graph theory trick to do the same thing? We encode each of these constraints using a graph. Every variable is a node in this graph, so you can see that x, y and z are three nodes, and we assign weights to the edges. These are the weights that come from the constraints: x - y <= 4, so 4 would be the weight of that edge, and similarly for the others. What's important is that it's a directed graph, and the direction follows a rule: for a - b <= k, the edge goes from b to a and has weight k.

The important takeaway is that if there's a negative cycle in our directed graph, then there are no solutions to our problem. You can see here that there is a negative cycle: if you add up 4 + 3 + (-8), that is -1. That's what we are looking for.

So how do we find negative cycles in a graph? There is a very classical algorithm called Bellman-Ford which can tell you, given a directed graph, whether there is a negative cycle. You can also use it to find the shortest path between two nodes, which is the classical use case, but it can also tell you if there's a negative cycle. It's surprisingly easy to implement; you can even implement it in Solidity. Leo has a repo where he implements Bellman-Ford and much more, completely in Solidity, and he's going to give a talk tomorrow at 11 p.m.
Sorry, 11 a.m. You can come to the talk for more details.

Here is some insight about unsatisfiability. Unsatisfiability is when the set of constraints has no solutions. A lot of the time we only encode a very small subset of what we could actually encode, and we are very generous about ignoring the constraints we can't solve. As I said already, we ignore non-SSA variables. As long as we only care about unsatisfiability, we can do this: dropping constraints can only add solutions, so if the smaller system is already unsatisfiable, the full one is too. We can usually optimize when the constraints are unsatisfiable; otherwise, we just leave the code unchanged.

So we talked about difference logic, but what does it have to do with all these EVM opcodes? It turns out that we can represent these three EVM opcodes using expressions that fit difference logic. In the case of lt(a, b), the value is one only when a - b <= -1, and zero when b - a <= 0. Similarly, you can build these constraints for gt and iszero. In the last case, zero is just a variable that we use to stand for the value zero. There is some nuance here, but you can just treat zero as a variable.

So how do we encode Yul?
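Before moving on, the opcode-to-constraint mapping just described can be written down as a small sketch. This is an illustration only, not compiler code: the helper names (`lt_cases`, `gt_cases`, `iszero_cases`, `holds`, `check_encoding`) and the tuple layout are assumptions made for this example, and the `iszero` encoding relies on the assumption that every value is already known to be non-negative.

```python
# Each difference constraint "a - b <= k" is stored as the tuple (a, b, k).

def lt_cases(a, b):
    """Constraints implied by each possible result of lt(a, b)."""
    return {
        1: [(a, b, -1)],  # a < b   is  a - b <= -1  over the integers
        0: [(b, a, 0)],   # a >= b  is  b - a <= 0
    }

def gt_cases(a, b):
    """gt(a, b) is lt(b, a) with the operands swapped."""
    return lt_cases(b, a)

def iszero_cases(a, zero="zero"):
    """'zero' is a dummy variable standing for the value 0 (assumption: a >= 0)."""
    return {
        1: [(a, zero, 0), (zero, a, 0)],  # a == 0: a <= 0 and a >= 0
        0: [(zero, a, -1)],               # a != 0 and a >= 0 means a >= 1
    }

def holds(constraints, values):
    """True if every constraint a - b <= k is satisfied by the given values."""
    return all(values[a] - values[b] <= k for a, b, k in constraints)

def check_encoding(limit=4):
    """Brute-force check on small values: exactly the matching case holds."""
    for x in range(limit):
        for y in range(limit):
            values = {"x": x, "y": y, "zero": 0}
            results = {"lt": int(x < y), "gt": int(x > y), "iszero": int(x == 0)}
            cases = {"lt": lt_cases("x", "y"),
                     "gt": gt_cases("x", "y"),
                     "iszero": iszero_cases("x")}
            for op, result in results.items():
                if not holds(cases[op][result], values):
                    return False
                if holds(cases[op][1 - result], values):
                    return False
    return True
```

Calling `check_encoding()` compares each encoding against the plain comparison on a handful of small inputs, which is only a sanity check of the sketch and not a proof.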
A lot of the time we want to know if the value of an expression is always zero or always non-zero. Take this example of if(condition) with something going on inside the if statement. The question we want to answer is whether we can replace condition by zero or one. And inside the branch we can add the additional constraint that condition is true.

In particular, look at if lt(x, y). We start by checking whether adding the constraint x < y makes the system unsatisfiable. In difference logic, this is x - y <= -1; we just add it to our other set of constraints, and if the system is unsatisfiable, we can replace lt(x, y) by zero. Similarly, we can check whether adding the constraint x >= y makes the system unsatisfiable; in that case we can replace lt(x, y) by one. And inside the if body we can add the additional constraint that x is less than y, which in difference logic is x - y <= -1, and then we can keep doing our symbolic computation.

Going back to the problem from the beginning, here is a version of the same Solidity code in Yul. We have three variables x, y and z, all read from calldata. The two are not 100% equivalent, but more or less this is how the Yul code would look. We have three if statements: the first one has lt(x, y), the second one lt(y, z), and the third one lt(z, x), and the last one would revert if we ever reach it. The question we want to answer is whether the last lt(z, x) can ever be zero. Sorry, can it ever be
true? If it's never true, we can replace it by if(0), which is what we want.

Let's think about how to encode the problem now. We have three variables x, y, z that are integers. We don't have any extra conditions for calldataload, because we can't really tell anything about it. We add a dummy variable zero, as I said before. Now we add the constraints that these variables are 256-bit numbers, that is, 0 <= a <= uint256 max. The first set of constraints simply says that x, y and z are non-negative, and the second set says that x, y and z are bounded by the maximum value of uint256. Inside the first if branch you can add the constraint x - y <= -1. Inside the second if branch we can add the constraint y - z <= -1. In the third one we can add a similar one, z - x <= -1.

We have collected quite a bit here. So how do we represent all these constraints as a single graph? You can see the nodes x, y and z, and you can also see the node zero. These edges are the constraints for non-negativity and also for boundedness; here M is the maximum value of a 256-bit number. This constraint is z - x <= -1, and similarly the others are on the outer edges of the graph. The question we want to ask now is: is there a negative cycle in this graph? It turns out there is one: the one on the outside is a negative cycle, which means that the system has no solutions.

So now we can actually replace this if lt(z, x) by if(0), and once we have this if(0) we can just completely remove this branch. After that, these branches are simply empty.
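The graph check from this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the compiler's implementation: the function name `has_negative_cycle`, the tuple layout `(a, b, k)` for `a - b <= k`, and the node name `"zero"` are made up for the example, and the three if statements are assumed to be nested so that their constraints accumulate.

```python
def has_negative_cycle(nodes, constraints):
    """Bellman-Ford style check.

    Each constraint (a, b, k) means a - b <= k and becomes an edge
    from b to a with weight k. Returns True when the graph has a
    negative cycle, i.e. when the constraints have no solution.
    """
    # Starting every node at 0 acts like a virtual source with
    # zero-weight edges to all nodes.
    dist = {n: 0 for n in nodes}
    for _ in range(len(nodes)):
        changed = False
        for a, b, k in constraints:
            if dist[b] + k < dist[a]:
                dist[a] = dist[b] + k
                changed = True
        if not changed:
            return False  # distances are stable and form a solution
    return True  # still relaxing after len(nodes) rounds: negative cycle


M = 2**256 - 1  # maximum value of a 256-bit number
nodes = ["x", "y", "z", "zero"]

# Non-negativity (zero - v <= 0) and boundedness (v - zero <= M).
bounds = []
for v in ["x", "y", "z"]:
    bounds.append(("zero", v, 0))
    bounds.append((v, "zero", M))

# Constraints collected inside the three if bodies.
path = [("x", "y", -1), ("y", "z", -1), ("z", "x", -1)]

unsat_inside_third_if = has_negative_cycle(nodes, bounds + path)
unsat_inside_second_if = has_negative_cycle(nodes, bounds + path[:2])
```

With all three path constraints the outer cycle has weight -3, so the check reports a negative cycle and `lt(z, x)` can be replaced by 0. With only the first two constraints there is no negative cycle, so nothing can be concluded and the code would be left unchanged.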
You could actually remove the entirety of the code.

In the case of difference logic the solver is very simple. As I said, you can write it in Solidity; Leo has done that. However, in general the solver can get quite complex, and the question of correctness will always come up. The biggest priority for Solidity is the correctness of the compiler, and we really want to minimize trusting external tools as much as possible when they influence the generated code. So in the case of a symbolic solver, if there's a way to verify that the result is indeed true, then that is very good for us.

In the case of difference logic you can ask: is it possible to produce a proof that the system is indeed unsatisfiable, that there are no solutions? It turns out you can actually do that. The proof of unsatisfiability in this case is just a set of constraints where, if you add up the left-hand sides, you get zero, and if you add up the right-hand sides, you get a negative number. So you get something like 0 <= -1, which is a contradiction. The solver can just tell you which constraints add up to zero on the left and a negative number on the right, and this can be verified by whatever tool is going to use the result of the solver. In general you can get proofs from a lot of symbolic solvers, but it's not always possible.

Maybe the example was very simple; who cares about these three if branches? Let's look at a more real-world example where this could actually be useful. A lot of the time, users like to add their own checks before the compiler's checks. Here is an array, and you read from the array at some index. The user wants to check whether the index is greater than or equal to array.length, which means there is an out-of-bounds access, and the user wants to revert with a custom error. However, the
compiler will automatically do this check in this code anyway. So doing this check manually is actually wasting gas, and it's redundant. But this is a good pattern; sometimes users want to revert with their own error messages. We can actually use difference logic here: once you get out of this if branch, you can add the constraint that the index is less than array.length, because this branch is always terminating. If you get into this branch, it's going to revert. In general it can also be a branch that always returns. So we can add this extra constraint, and when you hit this check once again, you can prove that the check is going to be unreachable, and then you can optimize out those branches.

So where do we go from difference logic? So far we could only encode lt, gt and iszero. Once we graduate from constraints of the form x - y <= k, we can think about generalizing this. One generalization would be constraints of the form a_1 x_1 + a_2 x_2 + ... + a_n x_n <= b, where the a_i and b are constants and the x_i are symbolic integer variables. We can solve those using what's called linear programming and the simplex method. Once you can do that, you can encode addition and subtraction, because addition would be x_1 + x_2 and subtraction would be x_1 - x_2, and they fit that form. But there are some nuances here, because addition in the EVM has wrapping behaviour.
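The wrapping behaviour can be made concrete with a short sketch. It is only an illustration of why a linear encoding of add needs a case split; the function names (`evm_add`, `add_by_cases`, `checked_add`) are hypothetical, and using `OverflowError` to stand in for a revert is an assumption of this example.

```python
WORD = 2**256  # EVM words are 256 bits wide

def evm_add(x, y):
    """EVM add: the result wraps around modulo 2**256."""
    return (x + y) % WORD

def add_by_cases(x, y):
    """The same result written as two linear cases.

    A solver for linear constraints cannot express the modulo
    directly, so it would branch on whether the sum overflows.
    """
    if x + y <= WORD - 1:
        return x + y         # no overflow: result = x + y
    return x + y - WORD      # overflow:    result = x + y - 2**256

def checked_add(x, y):
    """Checked addition as in high-level Solidity: overflow reverts."""
    if x + y >= WORD:
        raise OverflowError("revert: arithmetic overflow")
    return x + y

# The two formulations agree, including at the wrap-around point.
samples = [(0, 0), (1, 2), (WORD - 1, 0), (WORD - 1, 1), (WORD - 1, WORD - 1)]
cases_agree = all(evm_add(x, y) == add_by_cases(x, y) for x, y in samples)
```

Each branch of `add_by_cases` is linear in x and y, which is the form a linear solver can work with; the price is that every symbolic addition doubles the number of cases unless one of them can be ruled out.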
So you have to have some kind of branching to deal with this modulo, but it's doable. We can also encode multiplication and division similarly, where one of the operands has to be a constant and the other one can be symbolic. In the case of division the second one has to be constant; in the case of multiplication either operand can be the symbolic one, as long as the other is constant.

Yeah, I think that's it for my talk.

Okay, thank you, Hari. We have some time for questions, so please raise your hand. Yeah, I see a hand over there in the back. A volunteer is coming to you with the mic, one second. Oh, and another one there. Nice. Oh, and another one there.

Hey, I'm wondering: how do you know, as a Solidity developer, when your checks are redundant, and what can you do to inspect that?

Yeah, I think the only way to do that would be perhaps to try both. I mean, write both versions and see if they use the same gas, or check the intermediate representation, perhaps, or the assembly. I mean, none of the things I mentioned here are implemented so far. There are some branches; these are mostly experimental features.

Sorry, what? Yeah, but it's almost...

Maybe just to use it as a segue: which parts are already implemented, and what else might we see in the future?

So we have an experimental optimization stage already that uses an SMT solver. It's called the reasoning-based simplifier. It uses the full power of an SMT solver, but it's disabled by default in the compiler. You can probably enable it if you specify an optimization sequence that includes this one step, so you can still get some of this if you add an extra letter to the optimization sequence. But the whole point of the talk is that you can build a very simple logic that can do a small subset of the computations. This is not done.
This is not in master yet. But most of the SSA transform is already there.

No idea.

I thought I saw a third question somewhere. Please raise your hand if you have one. If not, a big round of applause for Hari. Thank you. We've got one more question? Okay, one last one.

So you said we should be doing this optimization on Yul rather than on EVM bytecode or Solidity, because it's more conducive. It seems like people would use this sort of analysis to potentially look at MEV constraints, and I'm wondering how you could apply this to on-chain bytecode, or if that's infeasible.

Yeah, you can probably decompile EVM into some Yul and try to do a similar analysis on it. There is a tool being built by Leo and some other people called yools; you can check the status of that. It can do some of this symbolic computation. It's meant to use the full power of SMT solvers, not this restricted version, but you can check it out. So once you have a translator from EVM to Yul, you can apply these tools. Or you can use hevm, which works on EVM bytecode directly. That's another possibility.

Awesome. Thank you so much, Hari. I hope your brains got preheated throughout the last two talks, because now we are going even deeper. I guess we can already use the time while I announce the next speaker to set up, because this will be more of a workshop setup. Sina, maybe you want to come up on the stage already to set your infrastructure up.

The next speaker we have here is Sina. He's part of the go-ethereum team, and in the team his focus is mostly on tracing, the JSON-RPC and the GraphQL APIs. Today he'll be talking about EVM tracing in geth. Brace yourselves for a very detailed download on basic tracing, commonly faced problems, as well as an introduction to the more recently shipped features and how to write efficient tracers. Welcome, Sina. Big applause.