Hi, welcome back to Analyzing Software Using Deep Learning. This is part three of this module on using sequence-to-sequence models for analyzing software, and in this third part we want to look at another application of sequence-to-sequence networks: interpreting Python programs. Again, this is based on a research paper from 2014, one of the very early applications of this kind of model to software, and if you're interested in more details, please have a look at that paper.

The overall idea of this application is to see if we can interpret a program. Interpreting a program basically means taking the source code, executing it (or reasoning about what would happen during execution), and telling us the output of the computation. In principle, we know that neural networks are able to express arbitrary computations, so a neural network should also be able to interpret a program given as source code. The question behind this approach is: can you really interpret a program using a neural network?

Why would you want to do this? After all, if you have the source code of a program, you could just interpret it with a real-world interpreter; for Python, you could simply call the Python interpreter and run the code. One reason you may want to learn a model instead is that real-world interpreters are pretty complex pieces of software, and it's easy to get them wrong, so a learned model may be an alternative way of executing a program. In practice, it's not clear that this is really how people want to implement programming languages, but it's a very interesting question how far we can push the power of neural networks and whether they can actually help execute or interpret programs. So, how do we do this interpretation of programs using a sequence-to-sequence model?
So again, the idea is to formulate the problem as a translation problem, where an input sequence gets translated by the model into an output sequence. The input sequence in this case is the sequence of characters of the source program: we look at the source code one character at a time and take this sequence of characters as the input. The output is the sequence of characters of the program's output. Whatever the program prints, for example to the console, is considered the output, and we want the model to predict this printed output, again character by character.

Doing this for arbitrarily complex programs, which can print all kinds of outputs and may have effects beyond printing to the console, would be very, very hard. So what is done in this work is to look at a restricted set of programs that can actually be evaluated with a single left-to-right pass through the code, using a constant amount of memory. This is a huge simplification, and it of course means that the approach, as we see it here, is unlikely to work on more complex programs. But applying these restrictions is a way to see whether a model is able to predict the output of a program at all.

Let's look at a concrete example of how this is supposed to work. What you see here is one of these simple programs that are in scope for this work. It is a Python program with a couple of statements: it has a few variables that mostly deal with numbers, there is even a loop, and at the end there is a print statement that takes the result of some arithmetic expression and prints it to the console. What the model is supposed to predict is the output of this program, i.e., what is printed to the console. The output in this case happens to be 25011, and we want the model to predict exactly this output.
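To make this concrete, a program of the kind in scope here might look like the following sketch. The variable names and constants are illustrative, but the structure matches what is described: assignments, arithmetic, a single non-nested loop, and a final print statement, and this particular program does print 25011.

```python
# Illustrative example of the restricted program class:
# assignments, arithmetic, one non-nested for loop, a final print.
j = 8584
for x in range(8):
    j += 920          # after the loop: j == 8584 + 8 * 920 == 15944
b = (1500 + j)        # b == 17444
print((b + 7567))     # prints 25011
```

The model sees the characters `j`, ` `, `=`, ` `, `8`, `5`, … as input and must emit the characters `2`, `5`, `0`, `1`, `1` as output.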
Just to show that this is a really hard task, let me show you another example where I've obfuscated the characters used in the source code and in the output, to illustrate what the neural network really sees. If you look at the previous example, it looks like Python code: you see the numbers, you can do the math, and you know what the result is. But the neural network doesn't have all the pre-existing knowledge of programming languages that you probably have. What the neural network really sees is this gibberish of input characters and output characters that, at least initially, do not make a lot of sense to the model. The question is whether it can really learn to predict the output from the input.

To train a model that takes a piece of Python code and predicts what it is going to print, you need training data, and as usual, we want a lot of it. The idea here is to automatically generate Python programs, so that we get many programs while at the same time controlling which language features they use and what properties they have. Specifically, the Python programs generated in this work focus on a few language features, just a subset of what the full Python language provides, which keeps the scope of the work within limits so that the model has a realistic chance of actually learning to predict the output. The authors use arithmetic operations like addition, subtraction, and multiplication; variable assignments; if statements; and for loops, but no nested loops, which simplifies the task a bit. At the end of each generated program, there is a print statement, which is what produces the actual output of the program.
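A minimal generator for this restricted program class might look like the following sketch. This is hypothetical, not the paper's actual generator, which is additionally parameterized by program length and the number of digits in the operands; the function name and choices here are made up for illustration.

```python
import random

def generate_program(rng):
    """Sketch of a generator for the restricted program class:
    an assignment, one non-nested for loop, and a final print.
    (Hypothetical; the real generator also controls program length
    and operand size, and mixes in if statements.)"""
    lines = []
    var = rng.choice("abcde")
    lines.append(f"{var} = {rng.randint(1, 9999)}")
    # a single, non-nested for loop updating the variable
    op = rng.choice(["+", "-", "*"])
    lines.append(f"for i in range({rng.randint(2, 9)}):")
    lines.append(f"    {var} = {var} {op} {rng.randint(1, 99)}")
    # every generated program ends with a print statement
    lines.append(f"print({var} + {rng.randint(1, 9999)})")
    return "\n".join(lines)

print(generate_program(random.Random(42)))
```

Generating programs this way gives you an unlimited supply of training examples whose difficulty (loop counts, operand sizes) you can dial up or down.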
Now, given these automatically generated Python programs, the authors execute them, which gives you some behavior. To do this, they use a traditional Python interpreter, i.e., some existing implementation of the Python language, which will print something for each of these programs, and the printed characters are what we take as the output sequence that the model hopefully learns to predict. The actual model works the same way as what we've seen before: the input sequence is given to the encoder, which produces a context vector, which is given to the decoder, which then produces the output sequence. I'm not going to repeat this again for this particular application; instead, let's look directly at the results of this approach.

What the authors find is that they can achieve a prediction accuracy between 36% and 84%, which means that the model works, sort of. It's not perfect, of course, but in many cases it can actually accurately predict the output that is going to be printed by one of these simple Python programs. The accuracy depends heavily on the size and complexity of the programs: programs without loops, for example, are easier than programs with loops, and longer programs tend to be much harder to predict than shorter ones. To give you an example of an inaccurate prediction, here is one reported in the paper. This program consists of just three lines, where a variable e is manipulated in a loop, and then the value of e after the loop, the result of this computation, is printed.
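Obtaining the ground-truth output for a generated program can be sketched as follows: run the program through the real interpreter and capture whatever it prints. Here, this is simplified by using Python's own `exec` with stdout redirected; the function name and the hard-coded example program are illustrative, not from the paper.

```python
import io
import contextlib

def ground_truth_output(source):
    """Run a generated program with the real interpreter and capture
    the printed characters, which become the target sequence."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})  # real Python semantics provide the label
    return buf.getvalue().strip()

source = "j = 8584\nfor x in range(8):\n    j += 920\nprint(j + 9067)"
target = ground_truth_output(source)
# The training pair is (characters of source, characters of target).
print(list(source)[:6], "->", list(target))
```

Each (source, target) pair is then fed to the seq2seq model character by character, exactly as in the translation setting from the previous parts.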
Now, what this code really prints is 95,007, but the output that the model predicts is 94,103. What you nicely see in this example is that the model gets sort of close, but not quite, to the right result, and apparently this is a pattern that the authors have seen in a number of examples: the result is roughly correct, but not exactly. This is sort of unsurprising, because these neural networks are always based on probabilities and approximations, so getting close to the right result without quite reaching it is something you would expect.

Overall, this means that this kind of model is able, at least sometimes, to interpret a Python program. Of course, this is mostly a toy application, or a nice research idea, and it's not yet clear how it will impact practitioners, because you can instead just interpret Python programs the old way by running the Python interpreter. But it's nice to see that, in principle, these sequence-to-sequence architectures are actually able to interpret at least some programs and predict what they are going to produce.

All right, this is the end of part three of this module on sequence-to-sequence models and how to use them for analyzing software. We've now seen what these sequence-to-sequence models are. You've seen one application that was about predicting API usages, and another one about this sort of crazy idea of predicting what a Python program produces. In general, there are many, many more applications of sequence-to-sequence models for analyzing software by now: for example, people have looked into predicting what code changes developers might apply, or how developers may want to fix a bug. There are many more such applications, because these models turn out to be pretty powerful and can be applied to various tasks in software development. Thank you very much for listening, and see you next time.