Hi, and welcome back to the program analysis course. In this lecture, we will talk about random testing and fuzzing, which are techniques to automatically test a program. The two terms, random testing and fuzzing, mean more or less the same thing, and I will use them interchangeably in this lecture. Let's get started with the question of what automated testing actually is and why we need it. As you know, practically all software has some bugs, and some of these bugs are pretty bad, so you want to find them. One very popular way of doing this is testing. Now, if you start implementing something, the first thing you probably do is to manually test the software. But at some point, you've exhausted your time budget for implementing this piece of code, and you just can't do any more manual testing. This is where automated testing comes into play, because it tries to reduce the human effort involved in testing. There are two ways of doing this automation. One is to automate the test execution. This means that someone has written a test, for example a unit test, and this test is then regularly executed, for example as part of a regression test suite. This is one good way of automating testing, but it's not the one I want to talk about here. The form of automated testing that I want to talk about here is about test creation. Here, the idea is that you're not even writing the test yourself; instead, you have a tool that automatically generates tests, and then you can, of course, also automatically execute these tests. So in this lecture, we will focus on this part of automated testing, where it's about automatically creating tests. Now, to automatically create or generate tests, there are many different kinds of approaches, which can be roughly categorized as you can see on this slide. One category of these approaches is black box testing, where "black box" basically means that the program under test is treated as a black box.
So we are not looking inside. We do not know what's inside, because we do not do any program analysis on the program. Instead, we may, for example, just feed some random values into the program and then see what happens, without really looking at what happens inside the program itself. The other extreme is white box test generation, where we use a heavyweight static analysis, or maybe a heavyweight dynamic analysis, to understand what the program is actually doing, and then try to use that knowledge to generate inputs to run the program. For example, a white box testing approach might look into the conditions that trigger specific paths through a program and then try to change the inputs in such a way that a particular path you want to trigger is actually executed. Now, white box testing is pretty heavyweight because it involves a lot of program analysis, which means you cannot do a lot of testing in a given time budget, whereas black box testing is very naive, maybe even stupid in a sense, because it doesn't really know anything about the program. The approach in the middle is called gray box testing, because it takes a bit of a look into the program by doing a lightweight analysis, for example of the execution of the program. One specific example of such an analysis might be a dynamic analysis that looks at how much coverage a particular input achieves, that is, which statements or branches or maybe paths of the program are actually covered if you run a particular input, and then you can use this information to change the input so that you can maybe get more coverage. We will look at all of these approaches in this course. Here in this lecture, we will focus on the first two, black box and gray box testing, and then in the next lecture, we will look at a form of white box testing that involves a more heavyweight program analysis.
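To make the black box idea concrete, here is a minimal sketch in Python (the lecture does not fix a language, and `parse_age` is an invented toy program with a seeded bug): we feed random strings to the program and only observe whether it crashes, without analyzing its internals at all.

```python
import random

def parse_age(text):
    # Toy program under test (hypothetical): parses an age from a string.
    # Seeded bug: crashes with ZeroDivisionError when the parsed age is 0.
    value = int(text.strip())
    return 100 // value

# Black box random testing: generate random inputs, run the program,
# and record which inputs crash -- without looking inside the program.
random.seed(0)
alphabet = "0123456789 -"
crashing_inputs = []
for _ in range(2000):
    candidate = "".join(random.choice(alphabet)
                        for _ in range(random.randint(1, 3)))
    try:
        parse_age(candidate)
    except ValueError:
        pass  # malformed input, rejected as expected
    except ZeroDivisionError:
        crashing_inputs.append(candidate)  # a real crash found by fuzzing

print("crashing inputs:", sorted(set(crashing_inputs)))
```

Note that the tester never inspects `parse_age`; it only distinguishes "ran fine", "rejected the input", and "crashed", which is exactly the black box view.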
One property that is common to all the approaches we discuss here, and actually to almost all approaches that are popular for automated test generation, is that they use some kind of feedback from test executions. So instead of just generating inputs and then seeing what happens without really acting on that knowledge, they execute the program with the generated inputs, observe what is happening, and based on these observations try to generate better inputs that hopefully trigger some other behavior. This kind of feedback is very valuable, and it is actually used by many of these automated test generation approaches. Now, I keep talking about "the program", and I haven't really said what I mean by that. The reason is that there is not just one answer, but many possible answers. Depending on what exactly you want to test, the program may just be a single function that is tested in isolation. It may also be a class and all the methods that this class offers, or it may be an entire library that consists of different classes and a whole set of APIs provided by that library. And of course, the program may also be a standalone tool, where the input that you give to the tool can, for example, be a file that the tool reads, or maybe some kind of input stream. The ideas we are discussing in this lecture can in principle be applied at many of these different levels. But for the specific examples that we discuss, we will of course always pick one of these definitions of what the program really is. So after this short introduction to automated testing and random testing and fuzzing, let me now give you a brief outline of what follows in the rest of this lecture.
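The coverage-feedback loop described above can be sketched as follows in Python (all names are invented for illustration; real gray box fuzzers instrument the compiled program instead of using a hand-written coverage function): inputs that cover new branches are kept in a corpus and mutated further, so the feedback from each execution guides the generation of the next inputs.

```python
import random

def program(data):
    # Toy program under test: returns the set of branches it executed,
    # standing in for coverage instrumentation.
    covered = {"entry"}
    if len(data) > 2:
        covered.add("len>2")
        if data[0] == ord("A"):
            covered.add("starts-A")
            if data[1] == ord("B"):
                covered.add("AB")  # hard-to-reach branch
    return covered

def mutate(seed):
    # Randomly overwrite one byte of the input.
    data = bytearray(seed)
    data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

random.seed(1)
corpus = [b"xxx"]        # initial seed input
global_coverage = set()
for _ in range(5000):
    child = mutate(random.choice(corpus))
    cov = program(child)
    if not cov <= global_coverage:   # feedback: new branch covered?
        global_coverage |= cov
        corpus.append(child)         # keep the interesting input
print(sorted(global_coverage))
```

Without the coverage feedback, hitting the nested `"AB"` branch by pure chance is very unlikely; keeping the input that reaches `"starts-A"` and mutating it further makes the deeper branch reachable step by step.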
So in the second video, we will look at a tool called Randoop, which implements an idea called feedback-directed random test generation. This is essentially a way to do black box testing on classes in object-oriented languages, where we execute the methods of these classes and, without looking inside the execution of these methods, still use some feedback from the executions to generate better tests. Then, in the third video of this lecture, we will look into a gray box fuzzing approach called AFL, which is pretty popular and is a way to test entire applications, entire tools, by randomly mutating the inputs while looking at the coverage these inputs achieve when they are fed into the program. All right, this is already the end of this very first video, in which I've given you an introduction to random testing and fuzzing. In the next two videos of this lecture, we will look into more details. Thank you very much for listening, and see you next time.
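To preview the feedback-directed idea, here is a small sketch in the style of Randoop, written in Python (Randoop itself targets Java; the `SimpleStack` class and its seeded bug are invented for illustration): sequences of method calls that execute without errors are kept and extended further, while sequences that raise an exception are reported as potential bugs.

```python
import random

class SimpleStack:
    # Toy class under test. Seeded bug: pop() on an empty stack raises
    # IndexError instead of being guarded.
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)
    def pop(self):
        return self.items.pop()

# Feedback-directed random test generation: grow call sequences, keep
# those that execute cleanly, and report sequences that trigger errors.
random.seed(2)
ok_sequences = [[]]   # sequences known to execute without errors
failing = []
for _ in range(200):
    seq = random.choice(ok_sequences) + [random.choice(["push", "pop"])]
    stack = SimpleStack()         # replay the sequence on a fresh object
    try:
        for call in seq:
            stack.push(1) if call == "push" else stack.pop()
    except IndexError:
        failing.append(seq)       # feedback: this sequence exposes a bug
    else:
        ok_sequences.append(seq)  # feedback: worth extending later

print("failing sequences found:", len(failing))
```

The feedback here is purely black box: we never look inside the methods, only at whether a sequence of calls succeeded or raised an exception, and we use that observation to decide which sequences to extend.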