Hi and welcome back to program analysis. This is video number three of the lecture on operational semantics, and in it we'll look at an abstract machine for SIMP, the simple language whose syntax we defined in the previous video. In this video, the next one, and the one after that, we look at three different ways to define the semantics of the language, each at a different level of abstraction. We start here at the lowest level, where we define an abstract machine for the language. You can think of this as a computer designed specifically for executing this one language only, and we'll use this hypothetical machine as a device to explain what the language really means.

So let's get started by defining this abstract machine. The machine has four elements, and you can think of them as four parts of the hardware of this hypothetical machine. Of course the machine is not really implemented; it is just a thought experiment to explain what the language means.

The first element is the so-called control stack, which we denote with c. It stores all the instructions that we still need to execute. Essentially, once we begin executing a program we put its instructions on the control stack, and the machine then executes them one by one.

Next we have another stack, called the auxiliary stack or result stack, which we denote with r. It stores intermediate results. Whenever we have, for example, the result of a computation, we store it on the result stack in order to use it later on.
So now we have these stacks, but of course we also need something in our computer that performs the actual computations. Similar to a real machine, this is called the processor. The processor performs all the operations that we have in SIMP, which specifically means arithmetic operations, comparisons, and Boolean operations.

Finally, a computer also needs some way to store its current state. In our abstract machine this is done by the memory, also called the state, which we denote with m. You can think of the memory as a partial function that maps locations to values; it is partial because not all locations need to have a value. In our case the values are integers, because these are the values that SIMP operates on.

Let's introduce a little bit of notation. If we have a given state m and want to write something into memory, we write m[l ↦ n], which means: take m and update the location l so that it now maps to the integer value n. So if we look up location l after this update, that is m[l ↦ n](l), we obviously get the value n that we have just written there. If we instead look up any other location l', that is m[l ↦ n](l'), we get m(l'), the value that was at that location before. This holds for all l' that are not equal to the location l that gets updated.
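To make the update notation concrete, here is a minimal Python sketch. It is not part of the lecture: it assumes that locations are strings and that the memory is a dictionary, and the names `Memory` and `update` are made up for illustration.

```python
from typing import Dict

# Assumption for this sketch: a memory is a dict from location names to integers.
# Locations that are missing from the dict have no value (partial function).
Memory = Dict[str, int]

def update(m: Memory, loc: str, n: int) -> Memory:
    """Return m[loc -> n]: a new memory that agrees with m everywhere except at loc."""
    new_m = dict(m)   # copy, so the old memory m itself is left unchanged
    new_m[loc] = n
    return new_m

m = {"a": 3, "b": 1}
m2 = update(m, "a", 7)
print(m2["a"])  # 7, the value that was just written
print(m2["b"])  # 1, any other location keeps its old value
print(m["a"])   # 3, the original memory is untouched
```

Returning a fresh dictionary instead of mutating the old one mirrors the mathematical reading, where the updated memory is a new function and m still exists.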
So now you know what parts this abstract machine consists of, but you do not yet know how the machine works and how it executes programs. To describe this, we use a formalism that we've already seen in the first video: a transition system. We define what happens when the machine executes by defining transitions from one configuration (or state) to another, and these transitions correspond to executing the commands of our program.

To define such a transition system, we first need to define what the configurations of the system look like. Each configuration of our abstract machine is a triple ⟨c, r, m⟩ consisting of the control stack c, the result stack r, and the memory m. Let's be a bit more precise about what can be in these three elements.

The control stack can be one of two things. It can be nil, which means the stack is empty and there is no command left to execute; this typically happens at the very end of the execution of a program. Or it can be i·c, an instruction i followed by the rest of the stack. This means the instruction i has been pushed on top of the stack, and because it is a stack, the machine looks at the top first, so i is the instruction that will be executed first.

I should also tell you what these instructions i can be. An instruction i can be any program P, and this is the same P that I used earlier when we defined the syntax of the SIMP language.
It can also be an operation, and again this is a symbol we've seen earlier that refers to the operations that we have in our language. It can also be the negation operator, the logical and, a Boolean operator, the assignment symbol, or the keyword if or while. You've seen all these symbols before, so you have an intuitive idea of what they mean, and what we'll do here is define the semantics of actually executing them.

Then we have the result stack, which can be one of a couple of things. Similar to the control stack, it can be nil, which again means that it is empty. It can be a program followed by the rest of the result stack, because sometimes we store programs there. Or it can be a location followed by the rest of the result stack, because sometimes we want to remember a location.

I said we want to use this transition system to describe what an execution of the abstract machine looks like. We model the execution of a program as a sequence of transitions from one configuration to another, namely from an initial state, which describes what we know at the very beginning of executing the program, to a final state. You'll see in a second what this final state means.
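One possible way to picture these configurations is the following Python sketch. It is only an illustration under assumptions of my own: the tuple encoding of programs, the names `Config`, `NIL`, and `push`, and the choice to keep the top of each stack at index 0 are all hypothetical and do not come from the slides.

```python
from typing import Dict, NamedTuple, Tuple

# Hypothetical encoding, chosen for this sketch only:
#   expressions: 3, True, ("read", "l"), ("op", "+", e1, e2)
#   commands:    ("skip",), ("assign", "l", e), ("seq", c1, c2),
#                ("if", b, c1, c2), ("while", b, c)
# Bare strings such as "+", ":=", "if", "while" stand for the operator and
# keyword symbols that can sit on the control stack on their own.

NIL: Tuple = ()  # the empty stack

def push(item, stack: Tuple) -> Tuple:
    """Return item . stack, a new stack with item on top (index 0)."""
    return (item,) + stack

class Config(NamedTuple):
    control: Tuple            # instructions that still need to be executed
    result: Tuple             # intermediate values, remembered programs and locations
    memory: Dict[str, int]    # partial function from locations to integers

cfg = Config(
    control=push(("read", "a"), push("+", NIL)),  # (read a) . + . nil
    result=push("l", NIL),                        # l . nil, a remembered location
    memory={"a": 3},
)
print(cfg.control[0])  # ('read', 'a'), the instruction that is executed first
print(cfg.result)      # ('l',)
```

Immutable tuples are used here so that a configuration is a plain value, which fits the idea that a transition produces a new configuration rather than modifying the old one.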
So the initial state looks as follows. It is a triple, as we've seen on the previous slide: ⟨C·nil, nil, m⟩. We have our command or program C followed by nil, so there is nothing else on the control stack; we start with an empty result stack, so this is nil; and we start with some memory m, which could be empty or could already contain some values.

What we want to reach is a final state ⟨nil, nil, m'⟩, where we have executed all commands, so only nil is left on the control stack, and where we have hopefully used all our intermediate results, so we also have nil on the result stack. We again have some memory m', which is typically different from the memory we started with, because otherwise the program wouldn't have had any side effects. To put this in words: we execute the program C in a given memory state, which is the symbol m, and we stop the execution when both stacks are empty.

Now the question is how we get from the initial state to the final state, and this is what we need to define next. As you can guess, whenever we define a transition system, we define the configurations and a set of transition rules. The transition rules tell us how to get from an initial state to a final state, that is, how to really execute a program in SIMP. The rules are denoted with an arrow, →, and they tell us how to get from one triple ⟨c, r, m⟩ to another one with an updated control stack, an updated result stack, and an updated memory.

I'm now going to define the transition rules for all parts of the language, starting with its simplest element, the arithmetic expressions. So I'm first going to define how to evaluate expressions. Let's start with the simplest possible expression, one that is actually already evaluated because it is just a
number. In this case we have a number n on top of our control stack, followed by the rest of the control stack c, along with some result stack r and some memory m. To evaluate this very simple expression, we pop it from the control stack, so only c remains, and push it onto the result stack instead, so that we can use it for some operation that follows afterwards. The memory m stays the same as before, because this step does not write anything to memory. As a rule: ⟨n·c, r, m⟩ → ⟨c, n·r, m⟩. In words, we pop n from the control stack and push it onto the result stack r, so that we can use it for the next computation.

A slightly more complicated expression is one that reads from memory. In this case we see on top of our control stack an expression that says: read the value at location l. It is followed by some other commands, and we have some result stack and some memory. To evaluate this, we look up the value at location l. We again pop the read expression from the control stack, so that only c remains, and we put a value n onto the result stack, where n is what we read from memory at location l. So we can only apply this rule if m(l) = n: ⟨(read l)·c, r, m⟩ → ⟨c, n·r, m⟩ if m(l) = n. In words, we read the memory at location l and push the value onto r. This is very similar to the first rule, except that instead of a constant we have a value read from memory.

So now you know how we can get values, but expressions also need operations, otherwise they are kind of boring. Let's have a look at how we can evaluate complex expressions that involve operations. Let's say that on top of our control stack we have some expression e1 followed by an operator and then
some expression e2, and this is the top of the stack, which has some other commands c below it. Then we have the result stack and some memory. To evaluate this whole expression we do three things. First we need to evaluate the expression e1, so we put it on top of the control stack. Then we need to evaluate the expression e2, so we put it next. Once these two are evaluated and their values are on the result stack, we execute the operation, and so as not to forget about it, we put the operator onto the control stack as well. After that comes whatever follows in the program. This only affects the control stack, so the result stack and the memory stay as they are: ⟨(e1 op e2)·c, r, m⟩ → ⟨e1·e2·op·c, r, m⟩. In a sense, you can think of this rule as unwrapping the expression: we take the instruction that contains the expression, disassemble it into its components, and put these components onto the control stack in the right order.

Once this is done and we have evaluated e1 and e2, for example using the first two rules, we at some point reach a situation where just the operator is left on top of the control stack and where the result stack holds the values n2 and n1, potentially followed by something else. We then execute the operation by removing it from the control stack and putting its result, which we call n, on the result stack. None of this affects the memory, so it stays the same m: ⟨op·c, n2·n1·r, m⟩ → ⟨c, n·r, m⟩, where n is the result of applying the operator op to n1 and n2. For example, if we add two numbers, then op is plus and n is the sum of n1 and n2, which we put on the result stack.

All of this is for integer expressions. It is very similar for Boolean expressions, so I'm not going to write it down explicitly
here, but you'll have access to the complete set of rules in the ILIAS course, where I'll upload a page with all of them.

One question that often comes up is why the order of the values n2 and n1 is like this and not the other way around. It is simply because, starting from the configuration with e1 op e2 on the control stack, we evaluate e1 first, so its value is pushed onto the result stack first and is then the top of the stack. Then we evaluate e2, and because we cannot just sneak its value in somewhere below, it also goes on top of the stack. This means the values look like they are in the wrong order, but we correct for this by knowing that the second operand is the one on top of the stack whenever we evaluate a binary integer operation.

Given the set of rules that we've just defined, we can now write down the semantics of expressions in SIMP. We say that the value of an expression e in some state m, so evaluating e in this state of the memory, is a value v if and only if there is a sequence of transitions ⟨e·c, r, m⟩ →* ⟨c, v·r, m'⟩. We start with the expression e on top of our control stack, with some result stack and the given memory m. Then some sequence of transitions, and I'm using the reflexive transitive closure →* here, leads us to a state where we have evaluated the expression, so only the rest of the control stack is left and the value v is now on top of our result stack. In principle this may update the memory, so let me put m' here and not just m. Because evaluating expressions in SIMP does not have any side effects, m' is actually always equal to m for SIMP. But you could of course think of a slightly more complex language where evaluating arithmetic expressions does have side effects, so where
it can also write to the memory. For example, think of the ++ operator in many languages, which does not only yield a value but also updates the memory. In that case m' would not be equal to m, because evaluating an expression could change a stored value. This is the semantics of integer expressions, and again all of this is very similar for Boolean expressions, so I'm not going to write it down again.

To make all of this a little more concrete, let's have a look at an example. I'm going to give you an expression in SIMP and an initial memory, that is, a state in which this expression is going to be evaluated, and the question for you is what the result of evaluating this expression is. To answer that question, you should write down the sequence of transition rules that we can apply to compute the value of the expression. I'm going to show you the question and then the solution, but I recommend stopping the video once I've explained the question, so that you can first try it yourself and look at the solution afterwards.

So here is the example. The question is: what is the value of the following expression? The expression e is read a plus read b, and we evaluate it in the state m where a is mapped to 3 and b is mapped to 1. Before you think about the transition rules, make sure you understand the intuition. This is basically adding a and b, which are values stored at memory locations, and you can look up in the memory that these values are 3 and 1. So the result will obviously be 4, but the question is how to show this formally using the transition rules we've just seen. You should try to do this yourself before you look at the solution.

All right, so let me show you the
solution. To evaluate this expression, we put it on top of our control stack. So we have read a plus read b, followed in this case by nothing else, because we just want to evaluate this expression. We start with an empty result stack, nil, and with the memory m as defined above: ⟨(read a + read b)·nil, nil, m⟩. Now we use the transition rules that we've just defined.

The first step is to decompose the expression into its components. We need the first operand of this binary plus, which is read a, so we put this on the stack. It is followed by the second subexpression, which is read b, then by the operator itself, which is plus, and then by whatever else was on the stack, which here is nothing, so nil. This affects neither the result stack nor the memory: ⟨(read a)·(read b)·+·nil, nil, m⟩.

Next we go through our control stack one by one and evaluate these parts, starting with read a. We pop read a from the stack, so only read b, plus, and nil remain. We do not just pop it and forget about it; we read the value of a in memory, which is 3, and put this result on the result stack. So instead of just nil we now have 3 on top of nil, and the memory stays as it is: ⟨(read b)·+·nil, 3·nil, m⟩.

Next we do the same with the value of b. We remove read b from the control stack, so only plus and nil remain. The value of b is 1, so we put it on top of the result stack, above the 3 that is still there and the nil at the end of the stack: ⟨+·nil, 1·3·nil, m⟩.

What's left is to perform the operation. We remove the plus from the control stack, so the control stack is empty afterwards, and we put on the result stack the result of applying the plus operator to 3 and 1, because these are the two values that we have on the stack, and three plus one gives
four. So at the end we have 4 on the result stack, ⟨nil, 4·nil, m⟩, which means that the answer is 4, because that is the value that remains on the result stack after we've evaluated the entire expression.

Cool, so now you know how to evaluate expressions. Next we'll look into how to execute commands, because SIMP does not only have arithmetic expressions but also more complex commands. The simplest command that we've introduced is skip, which does exactly what the name suggests, namely nothing. To express this, we have a transition rule that says: whenever skip is on top of our control stack, we can just remove it, and everything else stays the same: ⟨skip·c, r, m⟩ → ⟨c, r, m⟩.

A slightly more interesting command is the assignment. On top of our control stack we have an assignment l := e, which says: write to location l the value of evaluating the expression e. Below it there may be something else on the control stack. To execute this kind of command, the abstract machine needs to do two things: first evaluate the expression e, and then write the result into location l. In the abstract machine we do this in multiple steps. We put the expression e on top of the control stack, and next on the stack we put the assignment symbol, so that the actual assignment is performed afterwards, followed by whatever else was on the stack before. We also need to remember the location where we should store the result of evaluating e, so we put the location l on our result stack. That way, once we have evaluated e, we still know where to write it: ⟨(l := e)·c, r, m⟩ → ⟨e·:=·c, l·r, m⟩. In words, we push the location l onto the auxiliary stack.

Once we have evaluated e, we reach the situation where the assignment symbol is on top of our control stack. So let's have a look at
what we do in this case. We have the assignment symbol on top of the control stack, and on the result stack we need to have some value n, which is the result of evaluating e, and then the location l, followed by whatever else is on the result stack. We then remove the assignment symbol, so only c remains on the control stack, and we remove both n and l from the result stack, so only r remains. Now we do the actual assignment: we update the memory by taking the old memory m and adding the new mapping from l to n, which is essentially writing the value n to location l. As a rule: ⟨:=·c, n·l·r, m⟩ → ⟨c, r, m[l ↦ n]⟩, where m[l ↦ n] is the memory m updated so that l maps to n.

So now you know how to handle the skip command and how to handle assignments. Let's next have a look at the chaining command that we've seen before, where we use a semicolon to combine multiple commands. If a chain of commands c1; c2 is on top of our control stack, what does the abstract machine do? As you might guess from the syntax, we first execute c1 and then execute c2. So we put c1 first, followed by c2, followed by whatever else was on the stack before: ⟨(c1; c2)·c, r, m⟩ → ⟨c1·c2·c, r, m⟩. This is again just a rule that decomposes a syntactic construct combining two commands into parts that we can then execute one by one.

Next let's have a look at if commands. On top of our stack we have an if with some Boolean expression b and two branches: the then branch, where we want to execute command c1, and the else branch, where we want to execute command c2. This is followed by something else on the control stack, and we have a result stack and of course some memory. The first thing we have to do when we see an if like this is to evaluate the Boolean expression, so we put it on top of our control stack, and
then we need something that reminds us that we are here because of an if statement, so we put the if keyword next, followed by whatever was on the stack before. Of course we also need to remember what the two branches of this if are, and we use our result stack for that purpose: we put c1 and c2 on top of our result stack, so that we know what to do once we have evaluated the Boolean expression: ⟨(if b then c1 else c2)·c, r, m⟩ → ⟨b·if·c, c1·c2·r, m⟩. This is similar to what we did with the location l in the assignment rule, which we put on the result stack in order to remember it for later. Here we store the commands for later, and later means once we have evaluated the Boolean expression b.

Now let's have a look at what happens once we have evaluated this Boolean expression. There are two cases. In both, only the if keyword is left on top of the control stack, followed by whatever was there before, and because we have evaluated b, its result is on top of our result stack. In the first case this result is true, so we have true on the result stack, followed by c1 and c2, which we put there before, followed by whatever else was on the result stack. As you might guess from knowing the semantics of if statements in other languages, we take the then branch: we forget about c2, discard the value true because we have now used it, and put c1 on top of the control stack instead. The remaining result stack and the memory stay as they are: ⟨if·c, true·c1·c2·r, m⟩ → ⟨c1·c, r, m⟩.

Very similarly, if we get into the same situation but with false instead of true, we execute c2, or rather schedule it for execution by putting it on the control stack, so that it is executed in the following transitions: ⟨if·c, false·c1·c2·r, m⟩ → ⟨c2·c, r, m⟩. So essentially these two rules say that we execute either the then branch or the else branch.

So now if you remember the
different commands that we initially defined for SIMP, you may wonder about one that we have left out so far: the while command, which is used to express loops. Suppose we have a while on top of the control stack, so something like while b do C, with some r and m. What we do is at first similar to what we did for executing if statements: we decompose the syntactic command into the components that need to be handled. The first thing to do when reaching a while command is to evaluate the Boolean expression, so we put it first. Then we put the while keyword onto the control stack to remember that we still need to continue with this loop. In order not to forget the body of the loop, we put that on the result stack, and on top of it we also put the Boolean expression again. The reason is that once we have executed the body of the loop, we go back to evaluating the Boolean expression, because this is a loop, so we may have to do this again and again. The rule is ⟨(while b do C)·c, r, m⟩ → ⟨b·while·c, b·C·r, m⟩.

This first rule just decomposes the syntactic command. Next we define two more rules, for the cases where the Boolean expression evaluates to true and to false. In the true case we have while on top of our control stack and true on top of our result stack, followed by the Boolean expression b and the loop body C that we put there using the previous rule. If this is the case, the program should first execute the body of the loop, so we put C on the control stack. Afterwards we are essentially back at the beginning of the first rule, because we are back at the start of the loop and again want to figure out whether we need to enter it or not. A simple way to express this is to just put another while b do C
command onto the control stack after the body, followed by whatever was on the stack before: ⟨while·c, true·b·C·r, m⟩ → ⟨C·(while b do C)·c, r, m⟩. Let me make this C a little larger, so that it is clear that the lowercase c is the remaining control stack and the uppercase C is the body of the loop.

So this was the case where the Boolean expression is true. Of course it may also be false, and hopefully at some point it is false, because otherwise we have an infinite loop. If we reach the state with false on top of the result stack, followed again by b and C, what do we do? Well, if the expression that controls a loop is false, we do not enter the loop, so we simply skip over it. This means we continue with whatever else is left on the control stack, discard false, b, and C from the result stack, and leave there what was there before. The memory stays untouched in this case as well: ⟨while·c, false·b·C·r, m⟩ → ⟨c, r, m⟩.

Excellent. Using the set of transition rules we've defined, we can now write down the semantics of a complete SIMP program, that is, the semantics of SIMP commands. It is the following. Given a program C that we execute in some state m, we say that the program terminates successfully and produces a resulting state m' if there is a sequence of transitions ⟨C·nil, nil, m⟩ →* ⟨nil, nil, m'⟩. We start with C on top of our control stack and nothing else, with an empty result stack, and with the memory state m. Then some number of applications of our transition rules eventually leads us to an empty control stack, so we've executed the entire program C, and an empty result stack, so we've used all intermediate values that we may have pushed onto it, and we may have updated the memory into m'. If there is such a sequence of transitions, we say that the program C executed in state
m terminates successfully and produces the state m'.

Okay, that was a lot of transition rules. To make this more concrete, let's again have a look at an example. Again I invite you to think about the example yourself, flip back through the slides to look at the transition rules, and see whether you can use them to compute what happens when we execute the program, and only then look at the solution that I'm providing.

Here's the example. It is a program C that goes as follows: while the value at location l is larger than zero, do the following. We update the value of f by taking the old value of f and multiplying it with the current value of l; then, after the semicolon, we update l by subtracting one from the old value of l. We also need to define in what state we execute this program, so here is the initial memory, where l maps to 4 and f maps to 1.

To define what happens when this program is executed, let's write down the transitions. Because I'm a bit lazy, I'm going to use some placeholders: the loop condition is going to be called b, and the two commands in the body of the loop I'll just call C'. This is just to save some effort while writing.

To say what the semantics of this program is, we start by putting our program C on top of an empty control stack, with an empty result stack and the memory m that we see above: ⟨C·nil, nil, m⟩. Now we apply one rule after the other until we hopefully reach a state with nil, nil, and some updated memory.

So what do we do here? We start by looking at the outermost command, which in this case is a while command. We take the Boolean condition of the while command, which I've lazily called b, and put it on top of the control stack, followed by the while keyword and then by nil. Just going back to the corresponding slide: we're
basically using this rule here to decompose the while command. Part of this rule also says that we need to remember b and the loop body, in this case C', and to do this we put those two on the result stack. The memory m stays the way it is: ⟨b·while·nil, b·C'·nil, m⟩.

Now let's think about what the next transition could be. On top of the control stack we now have the expression b, which is: the value of l is larger than zero. To evaluate it, we use one of the rules that we defined earlier for expressions, namely the one that decomposes an expression into its components by saying that we first evaluate the first subexpression, then the second, and then perform the actual operation. In this case the first subexpression is read l, the second is just the value 0, and once we have evaluated these two subexpressions we perform the actual operation, which here is a comparison of the two values. The rest of the control stack stays the way it is, so we still have the while keyword followed by nil, the end of the stack. The result stack doesn't change in this transition, so we still have b, C', and nil, and m is still what it was before: ⟨(read l)·0·>·while·nil, b·C'·nil, m⟩.

Next we need to evaluate what is on top of our control stack, which is reading the value at location l. We have seen a rule for that, which says: look it up in the memory. We pop the read from the control stack, so only 0 remains on top, followed by all the rest, and we put the result of reading l from the memory on top of the result stack. This is 4, because 4 is the value that l maps to in the memory above. Below it we still have all the rest of the result stack, so b, the loop body C', and nil, and our memory is still unchanged: ⟨0·>·while·nil, 4·b·C'·nil, m⟩.

Now this goes on and on like this, applying one transition rule after another, and in the exercise you
will have the chance to go through this example and other examples in more detail. Here I'm just writing dot dot dot, but you're invited to try it out and then maybe ask about it in the exercise session. Eventually you hopefully reach the state ⟨nil, nil, m'⟩, with two empty stacks and with the memory updated so that l maps to 0 and f maps to 24. As you've maybe figured out by now, this program computes the factorial function, and the result, 24, is stored at location f.

All right, this brings us to the end of this video. We've now defined an abstract machine that can execute SIMP, and this is one way of defining the semantics of the language. As you've seen, some of it is pretty mechanical. For example, if we have a complex expression or a more complex command on top of our control stack, we first disassemble it into its components, which is maybe a bit tedious. So in the next video we'll see a more compact way of defining the semantics of a language, which is structural operational semantics. Thank you very much for listening, and see you next time.
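As a recap, the transition rules from this video can be sketched as a small Python interpreter. This is an illustration under my own assumptions, not the lecture's reference implementation: programs are encoded as nested tuples, stacks are tuples with the top at index 0, and the names `OPS`, `step`, and `run` are hypothetical. The rules for negation and logical and are left out, just as they were not written down in the video.

```python
import operator

# Hypothetical encoding: ("read", l), ("op", sym, e1, e2), ("skip",),
# ("assign", l, e), ("seq", c1, c2), ("if", b, c1, c2), ("while", b, body).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       ">": operator.gt, "=": operator.eq}

def step(c, r, m):
    """Apply one transition rule to the configuration (c, r, m)."""
    i, c = c[0], c[1:]                      # pop the top instruction
    if isinstance(i, int):                  # number or truth value: move to result stack
        return c, (i,) + r, m
    if isinstance(i, str):                  # bare symbol left by a decomposition rule
        if i in OPS:                        # result stack is n2 . n1 . r
            return c, (OPS[i](r[1], r[0]),) + r[2:], m
        if i == ":=":                       # result stack is n . l . r
            return c, r[2:], {**m, r[1]: r[0]}
        if i == "if":                       # result stack is value . c1 . c2 . r
            return ((r[1] if r[0] else r[2]),) + c, r[3:], m
        if i == "while":                    # result stack is value . b . body . r
            if r[0]:
                return (r[2], ("while", r[1], r[2])) + c, r[3:], m
            return c, r[3:], m
        raise ValueError(f"unknown symbol {i!r}")
    tag = i[0]
    if tag == "read":                       # look up the location in memory
        return c, (m[i[1]],) + r, m
    if tag == "op":                         # e1 op e2  becomes  e1 . e2 . op
        return (i[2], i[3], i[1]) + c, r, m
    if tag == "skip":
        return c, r, m
    if tag == "assign":                     # remember the location on the result stack
        return (i[2], ":=") + c, (i[1],) + r, m
    if tag == "seq":
        return (i[1], i[2]) + c, r, m
    if tag == "if":                         # remember both branches
        return (i[1], "if") + c, (i[2], i[3]) + r, m
    if tag == "while":                      # remember condition and body
        return (i[1], "while") + c, (i[1], i[2]) + r, m
    raise ValueError(f"unknown instruction {i!r}")

def run(program, memory):
    """Start from (program . nil, nil, memory) and step until the control stack is empty."""
    c, r, m = (program,), (), dict(memory)
    while c:
        c, r, m = step(c, r, m)
    return r, m

# First example: read a + read b in the state a -> 3, b -> 1.
expr = ("op", "+", ("read", "a"), ("read", "b"))
print(run(expr, {"a": 3, "b": 1}))      # ((4,), {'a': 3, 'b': 1})

# Second example: the factorial loop in the state l -> 4, f -> 1.
b = ("op", ">", ("read", "l"), 0)
body = ("seq",
        ("assign", "f", ("op", "*", ("read", "f"), ("read", "l"))),
        ("assign", "l", ("op", "-", ("read", "l"), 1)))
print(run(("while", b, body), {"l": 4, "f": 1}))   # ((), {'l': 0, 'f': 24})
```

Note that `run` simply loops until the control stack is empty, so for a program whose loop condition never becomes false it would not return, which matches the remark about infinite loops above.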