held by Mike Sperber. He's already very ready for that. He's head of a software company in Tübingen in Germany, and he's going to talk about getting software right with properties, generated tests, and proofs, and the main thing here is functional programming. One tiny thing you might not know about him is that in 1986 he won a federal competition on IT, so give him a warm round of applause for that and also for his talk. Thank you very much. Is anybody actively using the induction loop feature in the first couple of rows? I know somebody who would like to know, not right now. Anyway, let me get one shameless plug out of the way. If you find the contents of this talk interesting, we're running a developer conference in Berlin in February called BOB, which is very friendly and very nice, very tiny compared to this one, and we'd love to see you there. Another thing: this is an introductory talk. So if you were expecting the latest developments on proof tactics, in fact if you know what a proof tactic is, then all you might get from this talk is mild amusement, and I won't be mad at you at all if you go for one of the more exciting talks, okay? Or leave at any time if this material is not for you; that's perfectly fine. Speaking of introductory talks, here's a piece of code written in the language that I will use for this talk, and it's called Idris. Who's written an Idris program before? Very good. Okay, oh, there's one person back there. That means if any part of this program is not clear to you once I'm done explaining it, it's also not clear to two or three hundred other people in this room, and I would love to have your help: interrupt me, ask a question, any time in the talk, if there's anything here that's not clear. Even though it's meant to be introductory, it will get quite technical at times. So let me try explaining this one.
So this is a classic example in functional programming that I use often in my talks, about animals on the Texas highway. If you look there, the central definition says data animal; that's the data definition of animals. In this particular version of the Texas highway, there are two different kinds of animals. There are armadillos, that's where it says Dillo there, and then there are parrots, for some reason, on the Texas highway. Does that make sense? Two different kinds of animals, and you see that definition? Yeah, nod, that greatly helps me. If you look at those two definitions for Dillo and Parrot, well, the arrows are kind of funny, but you can see that Dillo and Parrot have two properties each. One of the properties of an armadillo is its liveness, and up there at the very top you see the definition of liveness: it says liveness means dead or alive. So an armadillo can be dead or alive on the Texas highway. And there's also the weight. This colon thing is the type signature for the constructor for armadillos. So it says there's a liveness going in, there's a weight going in, and then it constructs an animal. And for a parrot, there's a string, every parrot speaks, right, so that's the sentence the parrot says, and also the weight, and it also produces an animal. And up there you can see the definition of weight. For simplicity's sake, I'm saying that weight is a type. That's kind of unusual for Idris, but you don't have to worry about it; you can see there that weight is just the same thing as an integer. And if you look down there where it says A1, A2, and A3, those are three examples of animals. It says A1 colon animal just to say that A1 is an animal, so Idris is a language that always has type declarations, and it says A1 is Dillo alive 10, and that means it's an armadillo, it's still alive, and it weighs, let's say, 10 kilograms.
The second one is dead, a little bit heavier, weighs 12 kilograms, and the third animal is a parrot that, well, it's a pirate's parrot, obviously, and maybe weighs three kilograms. Okay so far? So if you have any question about any of that, then please ask away. So what happens to animals on the Texas highway is, you know, people drive cars and they run them over. So there's a function down here, and, well, we're doing functional programming, so that shouldn't worry you at all. What's important here is that it says there's an animal going in and an animal going out, and really what this means is that this animal object up there is not really the animal; it is the state of the animal at a given time. So for run over animal, you can see the type signature that says an animal goes in and an animal goes out, and what it really means is that the state of the animal before it gets run over goes in, and the state of the animal after it gets run over comes out. And then, well, we know there are two different kinds of animals, and that means that for the definition of run over animal we need what are called equations. There are two different equations. The first equation says what happens to armadillos when they get run over: an armadillo has a liveness and a weight, and there's something going on here called pattern matching. The second equation applies when there's a parrot, and it has a sentence and a weight. On the right-hand side you can see that when an armadillo gets run over, all that means is that the liveness turns to dead. We're constructing a new armadillo object, and it's dead, and it has the same weight as before. And the equation at the bottom says that when we run over a parrot, it turns really, really quiet. So, classic example so far. We're going to return to that example at the very end. Right now, it's just to illustrate the language that we're doing things in, and we're going to do a lot of things without complicated programs.
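The slide itself is not part of this transcript, so the following is only a rough sketch of the definitions described above, written in Python rather than Idris. The class and function names are transliterations of the spoken names, the parrot's sentence is a placeholder (the talk does not quote it), and modelling a quiet parrot as an empty sentence is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Liveness(Enum):
    DEAD = "dead"
    ALIVE = "alive"


# As in the talk, a weight is just another name for an integer.
Weight = int


@dataclass(frozen=True)
class Dillo:
    liveness: Liveness
    weight: Weight


@dataclass(frozen=True)
class Parrot:
    sentence: str
    weight: Weight


# An Animal value is the state of an animal at a given time.
Animal = Union[Dillo, Parrot]

a1 = Dillo(Liveness.ALIVE, 10)
a2 = Dillo(Liveness.DEAD, 12)
a3 = Parrot("Ahoy!", 3)  # placeholder sentence, not taken from the talk


def run_over_animal(animal: Animal) -> Animal:
    """Return the state of the animal after it has been run over."""
    if isinstance(animal, Dillo):
        # Same weight as before, but now dead.
        return Dillo(Liveness.DEAD, animal.weight)
    if isinstance(animal, Parrot):
        # Assumption: "really quiet" is modelled as the empty sentence.
        return Parrot("", animal.weight)
    raise TypeError(f"not an animal: {animal!r}")
```

Because the classes are frozen, `run_over_animal` cannot modify its argument; it builds a new state, which mirrors the "animal goes in, animal comes out" reading of the type signature.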
So now I'm going to jump around a little bit. Just the other day, two weeks ago, I was teaching a course on architecture, and somebody said, well, there's this problem: I'm building a domain model, I'm putting the domain model in a database, and somebody comes and has new requirements, and that always ends the same way. I put a new column in the database, and, you know, seven, eight, nine, and it just goes on and on, and as the software gets older and older, more columns get added and the old ones seem a little stale. And so, how can we build models that are flexible? Now here's something completely different, you might think, but it is sort of the key to building flexible models. Does anybody recognize this? Does anybody associate a word with this? Very good. Depending on what state you went to school in, you might remember that this is a property called associativity, and it means that we can associate either the A and the B first with the parentheses, or the B and the C. If you take away one thing from this talk, it's associativity; knowing what that is, is one of the most useful things in software development. Of course, this is just a generic equation, and we really need to be more specific, namely that we're dealing with numbers and addition. And you might know that it's not just addition that's associative; multiplication, for example, is associative too. So here's a little mathy stuff there at the beginning. You see that upside-down A; we just say for all, and what that means is for all A, B, and C. Then this funny epsilon-shaped letter means element of, and that funky N means the natural numbers, so zero, one, two, three, the whole numbers from zero on up.
So what that means is that for all natural numbers A, B, and C, the associative property holds when you add them up. But, well, it says numbers and addition, and it doesn't just hold for numbers and addition. In fact, associativity is everywhere around us, specifically when we program. So here's another example, dealing with lists. The two pluses that you see there are just list concatenation. So you concatenate two lists, and of course you can concatenate three lists by just using that double plus in any order. And that's also associative. It doesn't matter if you first concatenate the B and the C and then tack the A onto the front, or if you concatenate the A and the B and tack the C on at the end; you always get the same result. So lists and concatenation also have this associative property. And here's something that I've always found very, very enlightening, which is that you can construct images that way. Well, you don't see it here. So here's an image from a fellow researcher of mine in functional programming, Brent Yorgey, and he has a great library out called diagrams for constructing diagrams out of parts. And this really is what associativity is about: it's about operators that construct things out of parts. As you can see here, there are different shapes: there are the black rectangles, and there are different rectangles that denote the towers of Hanoi. We're not really going to deal with the towers of Hanoi here; the important thing is that the image consists of several parts. In classic object-oriented programming, when you do graphics, you have a canvas, and you might draw pixels on that canvas, you know, a square-shaped or circle-shaped bunch of pixels. But what we're doing here is treating an image as a data type. And the definition is not important.
What is important is that there are a couple of functions that construct simple images. So here's a function that you might imagine, called star, and it constructs stars. You can see up there that there's a type declaration that says the star function accepts an integer, it accepts a mode, whatever that is, it accepts a color, and it produces an image. And we can call that star function with the arguments 200 and solid and gold. So mode is solid or outline, and then we have a color. And we get an image, and that image is an object; not particularly exciting. But we might have another function called polygon. Polygon takes two integers that denote the size of the polygon and the number of vertices, and also whether it's an outline or whether it's solid, and a color. For example, if we call it with 180, again, that's the size, and five, we get a five-cornered polygon, and we get that as an outline, and it's in red. Now, the idea here is that, just as we can combine two numbers or two lists, we can combine two images. Maybe the most intuitive way of combining two images is just sticking them beside each other. So there's a function there called beside, and it takes an image, and it takes another image, and it produces an image. And this is exactly what we're thinking about when we talk about associativity. We're talking about a binary operator that produces the same kind of thing that went in. So, for example, we could stick those two images next to each other. We could also imagine an operator called above that just puts one image above the other image. And we can combine these two things. Here it really is important that the same kind of thing comes out: an image goes in, another image goes in, and an image comes out. So we could, again, call above on the result of beside and make arrangements. So here's a tiling arrangement for your bathroom or something like that.
Now, beside and above are two possible operators, and you might already be thinking about associativity, but really the more fundamental one is overlay: you put two images on top of each other. And again, overlay has the right type. An image goes in, another image goes in, and another image comes out. If we take the gold star and the pentagon and put them on top of each other, then it looks like this. And we can then formulate an associativity property. It might not quite look the same, because I wrote overlay in front rather than between the operands. We could also write it in between, but this is just to show you that it's the same idea. So it doesn't really matter if we first take two images A and B and superimpose those two, and then put the result on top of C, or if we do it in the other order. Does that make sense so far? Okay. No? Do you have a question? Yeah, good point. The point is that this implies there's probably some notion of transparency involved. Yes, there is. But then you have associativity. And really what it means, very good question, is that if you think of this image in terms of the color at certain coordinates, then you need to think about how to combine the two colors from the constituent images. You can imagine that there also has to be a combination operation for the colors, and that also needs to be associative, as a prerequisite for the overlay operation to be associative. Does that make sense now? Thank you. Great question. So anyway, since this associativity property is not just restricted to numbers, as we may have learned in school, it really makes sense to generalize it, and that means that when we talk about associativity, we always have to name two things. We have to say what set we're operating on and what the operation is. And the combination of those two things has a name in mathematics.
And it's not the best name, but it's called a semigroup, right? But, you know, if you drop it in certain circles, they'll think that you're an expert on mathematics. You might try that. So just to go over that: you have some set S, and that S might be image, or it might be the natural numbers, or something like that. And we have an operation that I'm just going to call circle here. Then, if we take any a, b, and c from that set S, we can use circle as an operator and we have that associativity property. And for that circle, you can put in overlay, you can put in beside, you can put in above, you can put in plus, you can put in times, or you can put in the list concatenation operator, the double plus. Okay. And associativity is great. It's really my favorite property, because it means that when we have a whole lot of things that we combine, we can parenthesize in any way we want. We will get the same result no matter which way we parenthesize them. And that really means we can leave out the parentheses when we write an expression that involves only the circle operator, if it's associative, because the parentheses don't matter. And that makes it instantly easier to read, I think. It also has practical uses. If you do big data processing, and you have large data sets that span several machines or several hard drives or several data sources, and you're combining them with an associative combination operation, it just means you can rearrange that combination according to the load in your compute cluster. That makes it a very useful property when you're doing big data processing with MapReduce-based frameworks. That's a practical application, but I think associativity is much more useful when you use it for designing your domain model.
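As an illustration of the two points just made, here is a small Python sketch (not code from the talk). The helper names `is_associative` and `combine_in_chunks` are made up for this example. The first helper only checks the property on a handful of sample values, which is evidence rather than a proof; the second shows why associativity lets a combination be regrouped into chunks, as a cluster scheduler might do.

```python
from functools import reduce
from itertools import product
import operator


def is_associative(op, samples):
    """Check (a op b) op c == a op (b op c) for all triples of samples."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(samples, repeat=3)
    )


numbers = [0, 1, 2, 3, 10]
lists = [[], [1], [2, 3], [4, 5, 6]]

addition_ok = is_associative(operator.add, numbers)
multiplication_ok = is_associative(operator.mul, numbers)
concatenation_ok = is_associative(operator.add, lists)  # + on lists concatenates
# Subtraction is not associative: (1 - 2) - 3 differs from 1 - (2 - 3).
subtraction_ok = is_associative(operator.sub, numbers)


def combine_in_chunks(op, items, chunk_size):
    """Combine a non-empty list chunk by chunk, then combine the partial results."""
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    partials = [reduce(op, chunk) for chunk in chunks]
    return reduce(op, partials)


data = [[1], [2, 3], [4], [5, 6, 7], [8]]
all_at_once = reduce(operator.add, data)
in_chunks = combine_in_chunks(operator.add, data, chunk_size=2)
```

For an associative operation, `all_at_once` and `in_chunks` agree regardless of the chunk size, because chunking only changes where the parentheses go, never the order of the items.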
And I talked in the beginning about how to avoid always adding more database columns. One way of doing that is to view your domain model not as something that has more and more properties, but as building blocks that you combine into larger building blocks, the same way that we combine images from simpler images. So here's one of the great papers from functional programming, one of my two or three favorites, from Brent Yorgey, and it's called Something, Something, Theme and Variations. You can see that it is about images, and these images get superimposed with an operation that is just like overlay, and that title is eminently Googleable. Now, it has a funny word there. It doesn't say semigroup; it could say semigroup theme and variations, but it says monoid theme and variations. And a monoid, even though it sounds kind of fancy, is actually not much more complicated than a semigroup. It's a semigroup that also has a special element called the neutral element, and whenever we combine something with the neutral element, it doesn't matter if we do it in front or at the back, we get the same thing back.
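To make the neutral element concrete, here is a hedged Python sketch. The `Monoid` class is a made-up helper for this example, and the image model (a dictionary from pixel coordinates to color names, with missing pixels meaning "transparent") is a deliberately crude assumption, far simpler than what the diagrams library does.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Monoid:
    combine: Callable[[Any, Any], Any]  # must be associative
    neutral: Any                        # combining with it changes nothing


addition = Monoid(lambda a, b: a + b, 0)
concatenation = Monoid(lambda a, b: a + b, [])
# Toy overlay: pixels of the top image win, everything else shows the bottom one.
overlay = Monoid(lambda top, bottom: {**bottom, **top}, {})


def neutral_law_holds(monoid, x):
    """The neutral element may go in front or at the back; x comes back either way."""
    return (monoid.combine(monoid.neutral, x) == x
            and monoid.combine(x, monoid.neutral) == x)


def combine_all(monoid, items):
    """Combine any number of items; the neutral element covers the empty case."""
    result = monoid.neutral
    for item in items:
        result = monoid.combine(result, item)
    return result


star = {(0, 0): "gold", (1, 0): "gold"}
pentagon = {(1, 0): "red", (1, 1): "red"}
stacked = combine_all(overlay, [star, pentagon])  # star ends up on top
```

One practical payoff of the neutral element shows up in `combine_all`: it gives a sensible answer even for an empty list, which a bare semigroup cannot.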
So of course the neutral element with respect to numbers and addition would be zero. The neutral element with respect to lists and concatenation would be the empty... I always hear several voices, that's wonderful, thank you. And the same thing goes for overlay and beside and above: you can imagine that an empty image that consists only of transparency can work as the neutral element. So all of these things that I showed you that are associative, they're not just semigroups, they're also monoids. As I said, as long as you remember associativity, that's the important thing, but often you also find a monoid, and monoids in the wild are just everywhere. We've seen them for numbers and lists and images. Music forms a natural monoid; you can describe musical structure with monoid operations. You can treat animations along the time axis and define a monoidal combination of animations. A famous example in functional programming is financial contracts. If you were here last year for a talk of mine, we talked about semiconductor fabrication routes, which sounds very concrete, but they also form a monoid. The properties themselves that we'll see form a monoid. All kinds of things; they're everywhere around you. And these are really the key towards making flexible domain models, because in almost any domain model you can find a monoid just by looking for building blocks and for ways of combining those building blocks into larger building blocks. So let me get back. I said you can use associativity, or this monoid thing, to guide your design, and I haven't really made that concrete yet, so I stole a couple of pictures from Brent's paper. You remember the beside and the above operations, and those are fine for arranging things along the vertical and the horizontal axis. The way that they work is they put a bounding box around every picture, and then they
arrange the bounding boxes either beside each other or above each other, so there's nothing more involved to it. That works great when your picture happens to be a square that's aligned with the axes. It doesn't work so well if your picture is rotated, because the bounding box then is too big, and if you want to attach anything in just about any direction, then there's going to be a gap in your picture. So beside and above are not particularly good operations as the basis for an image library. The overlay operation is much better, but that leaves open the question of how you can arrange several pictures so that they are beside each other or so that they just touch. And Brent came up with this idea of an envelope, a technical idea. The red dot there is the origin. If you give me a vector starting at the origin, I will tell you how far you have to go along that vector so that I can draw that blue perpendicular line that's just outside the shape, and that's called an envelope. Envelopes are wonderful. If you ship each picture not just with the visuals that you see, but also with a function that describes the envelope, then you can use that envelope to arrange pictures in the horizontal and the vertical, but also in the diagonal, by just drawing vectors so that they touch. Does that make sense? It's a slightly more complicated idea. And Brent goes through the motions of using that inspiration he's getting from the monoids; he's saying everything must be a monoid, absolutely, and uses that as a guiding principle through the library. I'm not going to go into technical detail on how that works, but it's a very pleasing paper to read, and it results in a beautiful library that's great fun to use. That means, though, that you also have to find a monoidal combination operation for the envelopes; we've already seen how we can combine the pictures themselves,
but we also need to combine the envelopes, and fortunately that's pretty easy. If somebody gives you a vector in a certain direction, then the combined envelope is just the maximum of the envelopes of those two pictures. If you combine that ellipse and that square, you can see that I'm just going to have to go to the maximum of those two numbers in order to be just outside the composite shape that comes out of superimposing those two things. So that's great. Now, I introduced these properties as a mathematical thing. I said there's this fancy upside-down operator that says for all, and we might say for all images. Now we can also formulate these properties as code, and that's really where additional magic is. So, for example, with the associativity property, there's not much of a difference except that image one and image two are now in typewriter font, so we could put those in a program. But there's still that mathematical stuff on top. In a functional language, and in a lot of other languages too by now, we could also take the top line and translate that into code, and it might look like this. So that's what it looks like in Idris. It's not quite the same, but maybe we recognize the structure. We say there's a property, and the property is just called overlay associative, so we give it a name. Idris is still primarily an ASCII language, so we just say for all there instead of the upside-down A, and then it says arb triple, arb image, arb image, arb image, and that means for all arbitrary triples of an arbitrary image, another arbitrary image, and another arbitrary image. So triple is three things, and we're going to call these three images image one, image two, and image three. That funky backslash there, that's a lambda in Idris, and then the overlay property means that if we overlay one way and we overlay another way, according to associativity, we get the same result.
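The Idris property from the slide is not preserved in this transcript. As a loose analogue, a property can be written in Python as an ordinary function from generated inputs to a boolean, together with a tiny "for all" driver. Everything named here (`arbitrary_image`, `arbitrary_triple`, `for_all`, the dictionary-based image model) is an assumption made for this sketch, not the talk's code and not the API of any QuickCheck library.

```python
import random


def overlay(top, bottom):
    """Toy overlay on images modelled as {(x, y): color}; top pixels win."""
    return {**bottom, **top}


def arbitrary_image(rng, size=3):
    """Generate a small random image; each pixel is present about half the time."""
    colors = ["gold", "red", "blue"]
    return {
        (x, y): rng.choice(colors)
        for x in range(size)
        for y in range(size)
        if rng.random() < 0.5
    }


def arbitrary_triple(gen1, gen2, gen3):
    """Build a generator of triples from three generators."""
    return lambda rng: (gen1(rng), gen2(rng), gen3(rng))


def overlay_associative(images):
    """The property itself: both ways of parenthesizing give the same image."""
    image1, image2, image3 = images
    return (overlay(overlay(image1, image2), image3)
            == overlay(image1, overlay(image2, image3)))


def for_all(generator, prop, tests=100, seed=0):
    """Return the first generated input that falsifies prop, or None if all pass."""
    rng = random.Random(seed)
    for _ in range(tests):
        sample = generator(rng)
        if not prop(sample):
            return sample
    return None


counterexample = for_all(
    arbitrary_triple(arbitrary_image, arbitrary_image, arbitrary_image),
    overlay_associative,
)
```

The shape matches the spoken description: a name for the property, a generator for arbitrary triples of arbitrary images, and a function body stating the equation. Passing 100 random tests is evidence for the property, not a proof of it.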
Do you recognize that structure? It's the same thing: we're writing structurally the same thing that we wrote in mathematical notation, now as a piece of code. And the great thing is that once we've written it as a piece of code, we can manipulate it in a program. There are different ways of manipulating it, but one of the most useful ones comes, again, from another great researcher in functional programming: John Hughes came up with something called QuickCheck. So if there's another thing you take away from this talk, it's to Google QuickCheck. Whatever language you use, it doesn't have to be Idris; in fact, I had to hack together a QuickCheck for this talk. But basically any other language is going to have a QuickCheck, whether that language is a functional language or whether it's Java or Python or R or something like that; you can always get a QuickCheck for that. And I'm going to try and demonstrate this QuickCheck thing not by thinking about the design so much, but by demonstrating a property of something that's very error-prone. So the idea is that we want to have a representation for sets of natural numbers, and we're going to represent those sets of natural numbers by a list of intervals, so by a list of ranges, if you will, between two numbers. Now I'll try to explain that. Up there at the top there is a type definition. It says ISET, interval set, is a type, and that type is defined to be just a synonym for a list of pairs of natural numbers; that's what those round parentheses with the comma in the middle mean. Okay? And just to see what that means, there's a function there. I've elided the definition, but what's important about it is its type signature: it takes an interval set, and it produces a list of all of the members of that set.
And you can see a demo thing here that I typed in before the talk. If I apply i to list, the brackets there just mean a list, and we feed in a list of intervals, and those intervals are from zero to three, from five to seven, and from nine to ten, respectively, and they're all inclusive. And you can see down there a list of all of the members. The first interval is from zero to three, so it has the numbers zero, one, two, and three. The next one goes from five to seven, so it has the three numbers five, six, seven, and the last one goes from nine to ten, so it has the two numbers nine and ten there. Does that make any sense? Again, a slightly more complicated example. So let's see. The way I want to do it is I want to have the interval set structured in a certain way. I don't just want any list of any pairs of numbers to denote an interval set, and therefore here is a function that describes what it means to be a valid interval set, right? So for example, in order to have efficient processing, we don't really want two intervals in one interval set to overlap. We want them to be disjoint, and we also want them to be ordered so we can have efficient operations for certain things. So let's go through this. There's an isValid function; it just tells you whether an interval set is valid or not. There are three different cases here, which is why there are three different equations. The first equation is for the empty interval set: the empty brackets mean that the list representing the interval set is empty, and then we're going to say true; the empty set is perfectly fine. The next case says our interval set consists only of a single interval, and that single interval goes from low to high.
Well, that interval set is valid if low comes in front of high, right? They shouldn't be the other way around. Does that make sense? Somebody, can you nod at the back a little bit? You're still there? Okay, thank you. Great. Then it becomes a little bit more complicated. This is the third case, when there are at least two intervals in the interval set. The first one goes from low one to high one, and the second one goes from low two to high two. Those double colons separate the first element of a list from the rest, and then there's the rest of the list. Again, we want each interval to be ordered so that the lower number is on the left; that's where it says low one is less than or equal to high one. Then it says there should be a gap between two consecutive intervals, otherwise they should be one interval, which is why the high from one interval should be separated from the low of the next interval by at least one. And then we're going to say that we also want the rest of the list, including low two and high two, to be valid too. So far so good? Okay, so this is probably the second more complicated piece of code. So anyway, we might imagine a union function, and the union function is, guess what, a monoid with respect to interval sets. Two interval sets go in and another one comes out, and if you've written that kind of thing before, you might notice it's probably a little tricky with that fancy validity condition that's there. So how can we get this right? Well, what we do is we write down properties. Of course we could write down associativity; I'll leave that as an exercise. Another one is just a very simple property that says that for all pairs of two arbitrary interval sets, we want the union of those two interval sets to be a valid data structure.
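The two functions described so far translate fairly directly. This is a Python sketch of what the talk describes in Idris; the snake_case names are transliterations, and reading "separated by at least one" as `high1 + 1 < low2` is an interpretation of the spoken description.

```python
from typing import List, Tuple

# An interval set is a list of inclusive (low, high) pairs of natural numbers.
ISet = List[Tuple[int, int]]


def i_to_list(iset: ISet) -> List[int]:
    """List all members of the interval set, interval by interval."""
    return [n for (low, high) in iset for n in range(low, high + 1)]


def is_valid(iset: ISet) -> bool:
    """Intervals must be well-formed, ordered, and separated by a gap."""
    if not iset:
        return True                      # the empty set is perfectly fine
    low1, high1 = iset[0]
    if len(iset) == 1:
        return low1 <= high1             # low must not come after high
    low2, _high2 = iset[1]
    return (low1 <= high1
            and high1 + 1 < low2         # at least one number between intervals
            and is_valid(iset[1:]))      # the rest, including (low2, high2)


members = i_to_list([(0, 3), (5, 7), (9, 10)])
```

With this reading, `[(0, 3), (4, 7)]` is rejected even though the intervals do not overlap: they touch, so they should have been stored as the single interval `(0, 7)`.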
We want the union function to preserve validity. Okay, makes sense. So here's another property. I already told you that there is this function i to list, which just gives us a list of the elements of an interval set. That's also a representation for sets, and we can use that representation as a model. You see there, for all pairs, again, of arbitrary interval sets, we take the union, it says i union, i set one, and i set two, and we convert that to a list. What we could also do instead is convert each individual set to a list and then just merge those two lists, and that should yield the same result. So in a way, we're just giving a very simple model for our interval sets, right? Those two criteria would be kind of nice to have in order to get our implementation correct. And I already got started on this before the talk. Looks like this. No, it doesn't look like this; we'll get to that later. But like this. So here's what I came up with. There's all this other code there, ignore that, but there's i union, and it says i set, i set, i set, you see that? And then there are two equations that say that if the first set is empty, I'm just going to give you the second one, and if the second one is empty, I'm just going to give you the first one. Classic things when you have union or concatenation operations or something like that. And now in the third case it gets tricky, right? Again, the main thing you need to understand is that it's tricky. In the third case, both sets have at least one element. That element is the interval from low one to high one in the first set and from low two to high two in the second set. And then there's the rest.
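The two properties can be sketched in Python as functions that take the union implementation under test as a parameter. This is an illustration, not the talk's Idris code: `model_union` and `from_list` are hypothetical helpers standing in for a trusted reference, and "merge those two lists" is read here as a sorted union without duplicates, which is an assumption.

```python
def i_to_list(iset):
    """List all members of an interval set of inclusive (low, high) pairs."""
    return [n for (low, high) in iset for n in range(low, high + 1)]


def is_valid(iset):
    """Well-formed, ordered intervals with a gap between neighbours."""
    if not iset:
        return True
    low1, high1 = iset[0]
    if len(iset) == 1:
        return low1 <= high1
    return low1 <= high1 and high1 + 1 < iset[1][0] and is_valid(iset[1:])


def prop_union_valid(union, iset1, iset2):
    """The union of two interval sets must itself be a valid interval set."""
    return is_valid(union(iset1, iset2))


def prop_union_correct(union, iset1, iset2):
    """The union must have exactly the members of the merged member lists."""
    merged = sorted(set(i_to_list(iset1)) | set(i_to_list(iset2)))
    return i_to_list(union(iset1, iset2)) == merged


def from_list(numbers):
    """Rebuild an interval set from its members (slow but obviously right)."""
    iset = []
    for n in sorted(set(numbers)):
        if iset and iset[-1][1] + 1 == n:
            iset[-1] = (iset[-1][0], n)   # extend the current interval
        else:
            iset.append((n, n))           # start a new interval
    return iset


def model_union(iset1, iset2):
    """Reference union that goes through the list-of-members model."""
    return from_list(i_to_list(iset1) + i_to_list(iset2))


def naive_union(iset1, iset2):
    """Deliberately wrong: just concatenates the two interval lists."""
    return iset1 + iset2
```

A usage example: with `a = [(0, 3), (9, 10)]` and `b = [(2, 5)]`, `model_union(a, b)` is `[(0, 5), (9, 10)]` and satisfies both properties, while `naive_union(a, b)` fails `prop_union_valid` because its intervals are out of order and overlap.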
And I already put in a little bit of code, and I said, well, if low one comes after high two, then we want to start with low two, high two, and then continue with the union. In the other case, if low two comes after high one, then we're going to start with low one and high one. And in the remaining case, it means that neither interval comes before the other, and therefore we need to merge the two intervals at the beginning. Does that make remote sense? Don't worry, we'll get back on solid track. So we just take the minimum of the lows of those two intervals and the max of the highs of those two intervals, and we do this. Now the great thing is, I told you about this tool by John Hughes called QuickCheck, and we can load this into Idris. And then here comes a REPL. I hope I'm doing this right. So we want QuickCheck. And what was it called? It was called Prop Union Correct. Well, very small font, but you can see here it says 100 tests. That is what QuickCheck does: it takes your code version of the property and automatically generates a lot of tests for it. And that is super effective at weeding out bugs. So it says the thing that you wrote is correct: it always produces interval sets that, when you take the list, give you the right result. But there was that other criterion called Union Valid. And there, and this is really the better part of QuickCheck, when it fails, it says it's falsifiable. It says there is a counterexample. So here it says: I generated nine random tests, and I found one where the result is not valid. And the great thing is that we can then go and cut and paste this example. So I could say i union this, remove the comma in the middle, and call this. And what we can see is two and four, one and one, and three and five. And what's not valid about this? By the way, this is randomized.
So this always goes differently, and I have to look at it too. So then it says, well, those two intervals should really just be merged; they should be just one interval. Right. And so it didn't do that correctly. And the reason for that, maybe you saw it, is that it ran into one of those two cases here, where it says if low one greater than high two or low two greater than high one. Remember that I told you there needs to be a gap of at least one between them? Remember? And here's an off-by-one error: low one greater than high two says they can still be right next to each other. Right. And this is what happened here. We need to make sure that there's that gap in here, so I can fix it like this. Load it again. And oh no, there's still a counterexample. And we can try that out. And that's great: we get test cases that show where the bugs are. And in this case, well, what happened here? They still overlap. So, can you see it? You can see that the first two intervals must run into that last case, right? Because they overlap: there's the interval from zero to three and the interval from zero to five. They overlap, so we need to get to that case. And so it merged them, and then somehow it didn't merge the result with the six and the seven that's there. Well, if you look at it, it must have done this, and what it did is it then went on with the rest there. Let's have one more look. What actually happened? So there it is. It merged those, and then you can see that it ran into a symmetry problem here. Well, maybe you don't see it, but, you know, this is tricky stuff. I couldn't do this by myself.
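The second counterexample can be replayed in a few lines. This is again a Python sketch under assumptions, not the Idris code from the demo: the function name union and the flag symmetric_merge are hypothetical. The flag switches between a merge case that always continues with the rest of the first set and one possible symmetric repair that continues on whichever side contributed the larger upper bound.

```python
from typing import List, Tuple

Interval = Tuple[int, int]      # inclusive bounds (low, high); assumed representation
IntervalSet = List[Interval]


def is_valid(s: IntervalSet) -> bool:
    """Non-empty intervals, sorted, with a gap of at least one between neighbours."""
    non_empty = all(lo <= hi for lo, hi in s)
    gaps = all(hi1 + 1 < lo2 for (_, hi1), (lo2, _) in zip(s, s[1:]))
    return non_empty and gaps


def union(a: IntervalSet, b: IntervalSet, symmetric_merge: bool = True) -> IntervalSet:
    """Union with the gap comparison fixed; the merge case is selectable."""
    if not a:
        return b
    if not b:
        return a
    (lo1, hi1), rest1 = a[0], a[1:]
    (lo2, hi2), rest2 = b[0], b[1:]
    if lo1 > hi2 + 1:    # a real gap: the second set's interval comes first
        return [(lo2, hi2)] + union(a, rest2, symmetric_merge)
    if lo2 > hi1 + 1:    # a real gap: the first set's interval comes first
        return [(lo1, hi1)] + union(rest1, b, symmetric_merge)
    low = min(lo1, lo2)
    if hi1 >= hi2 or not symmetric_merge:
        # Put the merged interval in front of rest1. This is only safe when
        # hi1 is the larger bound, because rest1 is known to start after hi1 + 1.
        return union([(low, max(hi1, hi2))] + rest1, rest2, symmetric_merge)
    # Otherwise hi2 is larger, so the merged interval belongs in front of rest2.
    return union(rest1, [(low, hi2)] + rest2, symmetric_merge)


if __name__ == "__main__":
    first, second = [(0, 3), (6, 7)], [(0, 5)]
    print(union(first, second, symmetric_merge=False))  # [(0, 5), (6, 7)], not valid
    print(union(first, second))                         # [(0, 7)]
```

With the asymmetric merge, the merged interval (0, 5) lands directly in front of (6, 7); the second set is already exhausted, so nothing ever compares the two again.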
So you can see here that it just tacks the result onto i set one rest, whereas the maximum of high one and high two might violate the consistency criterion if it's the wrong one. And then it runs into one of the other cases, and I've never seen this tricky one. Does that make any sense? But can you see that the last one should be symmetrical? Can you see that? Okay. So we'll try to make it symmetrical. Do it like this. So this only works if high one is at least high two; we really need to make sure of that, because then the maximum of those two numbers is high one. Does that make sense? And then it's perfectly valid to tack it onto i set one rest. In the other case, high two is greater, and we need to go and do something different: rip this out here, stick it in front here. And now it's symmetrical. Okay. So let me load this. And it passed the tests. Okay. Live. Great. Thank you. I did practice getting it correct. Right. But, you know, this kind of stuff always gets me. I mean, with old age especially, this kind of stuff always drives the sweat onto my forehead. Right. You know, there's off-by-one, there's, I don't know how many cases there need to be, and QuickCheck is the kind of thing that weeds out the bugs. And even though it weeds out the bugs in a different order each time, it always weeds them all out. Okay. So it's a great tool, and I recommend that you try it. It generates tests from properties. Okay. Where are we? So let me give you a couple of real-world examples. If you're using X Windows, there's a tiling window manager, xmonad. It's already a couple of years old, and they don't do much development on it anymore. That's because it's correct. Right. And why is it correct?
Well, it's because they wrote down a lot of properties for the geometry and the tiling algorithms and verified them using QuickCheck. Don Stewart, one of the authors of xmonad, graciously wrote a couple of blog posts on a simplified version of xmonad, and I loosely translated them into Idris. So here's a very simple idea of just a window manager. It doesn't do geometry. It just has several workspaces, and each workspace is a stack of windows. So here's a data type called a stack set. It's parameterized by a type called window. We'll see later why there's a type parameter and why it doesn't just say what the windows are. And then it says there's a constructor stack set, and there are two fields in there. One is called current: that's the number of the workspace that's currently active. And then there's stacks, which is a map from the number of the workspace to the list of windows that sit in that workspace. Right. Again, the technicalities are not particularly important, but there's a bunch of operations that operate on this window manager configuration; really, the details aren't important. So you could create an empty stack set. You could say, well, I have the number of a window that I want to get to the front, so please rotate the stack set around so that I can see it. Peek means maybe I can get the topmost window that the user's currently looking at. Rotate means I'm just going to rotate the workspaces around in either the left or right direction; that's what that ordering argument is for. Push means
I push a new window onto the current workspace; insert means I insert a window into one of the other workspaces; delete means I delete a window; shift also means I shift something with the windows. It's not really important what they do. But you can imagine, again, just as with the interval sets, that there is a validity criterion, an invariant, that should hold for these operations. And it's very simple. It says, if you have a stack set with some windows in it, I'm going to tell you whether that stack set is consistent. For that, I'm going to say, first, that the number of the current workspace should not be higher than the number of window stacks that there are. And the other condition says that a window should not be in several of the workspaces. Right. And then, with this definition (all those function definitions aren't very complicated), I can go and write a whole bunch of properties. And if you understand, say, the second one, prop view, you understand all of them. It says: for all pairs of a natural number, that is, a stack index, and a stack set, if I call the view function, which is one of the operations, I want the view function to produce a consistent stack set. And then it goes on to do that for all the other ones. At the bottom here you can see some prerequisites that need to hold for the property: the invariant only needs to hold if the number of the window is actually smaller than the size of the stack set. Otherwise, I think, the function just returns what went in there. So that's a very efficient way to invent properties: think of some invariant that shall hold in your data structure. If you know Idris, you can sometimes encode that in the types, but often that's kind of tedious, and you can just write it down as a
property and then have QuickCheck check it for you. It's not particularly exciting for this simple definition, but you can imagine that the actual definition, when you have tiling window management going on, is much more complicated than the one that you just saw. But you can keep those same properties, right? There still needs to be some consistency invariant, that if you have tiling, the windows don't overlap, and things like that. That should be obvious: write those properties down, check them using QuickCheck, and that will weed out a lot of the bugs. Here's an example from our practice. A couple of months ago we were tasked with migrating a giant Visual Basic 6 application. It had a password-checking function; you can see here a Visual Basic 6 type signature. The property that we wrote was: if we create the hash from the password and compare it with the hash that's in the database, then they should come out the same. And to our surprise, that property failed when we ran it through QuickCheck, and we had to correct it, because that password hash is restricted to 11 characters by some restriction in the database schema. And so that means that you can use QuickCheck not just to check the correctness of things that you already know, but to actually develop a model for what goes on in your software, which you don't always know very well. So that's what we did there. Another example: for a large industrial client, we needed to write a synchronization application. When two mobile devices would meet as strangers, they would exchange data, and they all needed to look at the same device configuration data. We didn't want them to exchange all the data every single time; we just wanted them to exchange the data blocks that the other side was missing. And again, there are great algorithms for this based on Merkle trees. They're pretty complicated; you have to do
a lot of bit fiddling with that, but fortunately the property for that is pretty easy to write. So here's the property. The synchronization algorithm works on sets of blocks, whatever a block is. So you can see the property here: for all pairs of sets of blocks, they're called BS1 and BS2, block set one and block set two, if we union those two, then we get all the blocks in the system; we call that all. Or we can call the synchronization algorithm, and that will give us two new block sets, BS1 prime and BS2 prime, and those block sets are the ones that get transferred to the other side. Okay. And the criterion here just says: if we take the ones that we have and union them with the ones that we get, that should be all of them, and that should be the same for both sides. And also we want the algorithm to be efficient, so we don't want it to transfer blocks needlessly: we want to make sure that the blocks that we have and the blocks that we get are disjoint, that they don't have any elements in common. Otherwise we could make that algorithm trivially correct by just transferring all the blocks every single time. And I can tell you, I sweated one or two weeks over this algorithm, and it was really hard to write, but this one test weeded out all the bugs that I found along the way. So that is just super, super effective. John Hughes has a couple of papers on hard bugs that he found. He found a bug in a distributed database called Mnesia, and that bug was dependent on opening the database, closing it, and opening it again. So this is not the kind of bug that you find by just writing a bunch of smart unit tests, right? If you did anything shorter in the beginning, if you just opened the file and then did some lookups there, that would not manifest the bug. You really need to close it and then open it again. You know, have you tried turning it off and
on again? But then the database breaks in this case. And here's another example, called Mysteries of Dropbox. You can imagine that with Dropbox you really want certain properties to hold, right? And it turns out they never worried about writing properties down, but John Hughes did, and found a couple of bugs. So here's one; it's kind of hard to read. Client one writes a into a file that was previously empty (that funky turnstile there means empty) and then deletes the file. Another client sees the a in the file and replaces it with a b. And then client one goes and writes c into the file that it previously saw to be empty. And then, unfortunately, even though you would imagine that you should see either b or c in that file, Dropbox deleted it. I think they've fixed that bug now. So it goes. Oskar Wickström has a couple of great, pretty recent blog posts on properties in a screencast editor that I highly recommend. So this is a great tool for finding bugs, but it's not the same as having a proof, right? You can still imagine very subtle bugs that are not covered by QuickCheck. QuickCheck just generates randomized tests, so that is not the same thing as making sure that there aren't any bugs. So the great thing about Idris, and the reason I chose it for this talk, is that Idris allows you to not just encode properties in the language; it also allows you to encode proofs in the language. So here's the associative property for the list concatenation operation. If you look at the top, that has the definition of that function from the Idris standard library. It says plus plus: in goes a list, in goes another list, out comes a list. And then it says, well, if you concatenate the empty list with any list, that is just that list, right? Do you see that? The second one says, well, if we concatenate a list that starts with the element x and goes on
with xs, then we just pull the x in front and concatenate the rest, right? So that's a classic recursive definition of list concatenation in functional programming. And now here's something really strange in Idris. Here's the type declaration for a definition, again in the standard library, called appendAssociative, and it says: if you have a list a, a list b, and a list c, then, in the type, it says, oh, the associative property should hold, right? And so this is a statement of that property. That's wonderful, but it's not the same as a proof. And writing proofs, who loved that in math? Oh, you're good. I didn't, I'm sorry. So the great thing about Idris is that it helps you write down the proofs. I'll show you how that works, just really briefly. So here's just what I showed you on that slide. I can load that in there, and it says, well, you're not done, you didn't write a proof for that property. But in Idris you can just push a bunch of buttons, and I love that. So I can push one button, and it says, oh well, you should write a proof of that form; you have lists a, b, and c. And I can push another button that says, well, you're doing this on lists, and if you're writing anything on lists, you always need to distinguish between the two cases of the empty list and the list that consists of a first element x and further elements xs. And then it says, well, write down something. But then I can tell Idris, well, I'm too lazy, I'm not going to write anything. So I can just push a button, and Idris wrote this. You can see me, I didn't type this, right? I just pushed a button, and it says Refl. What is Refl? What could that be?
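For readers who want to try this without Idris, the same statement and proof can be sketched in Lean 4. This is a stand-in, not the Idris code from the demo: app is a hand-written copy of list concatenation, and app_assoc is a hypothetical name. The shape mirrors the proof developed in this part of the talk: a case split on the first list, reflexivity for the empty case, and the recursive call (the induction hypothesis) for the other case.

```lean
-- Hand-written list concatenation, recursing on the first argument
-- (an assumed stand-in for the `++` definition shown on the slide).
def app {α : Type} : List α → List α → List α
  | [], ys => ys
  | x :: xs, ys => x :: app xs ys

-- Statement of associativity, followed by its proof.
theorem app_assoc {α : Type} (l c r : List α) :
    app l (app c r) = app (app l c) r := by
  induction l with
  | nil => rfl                        -- both sides reduce to `app c r`
  | cons x xs ih => simp [app, ih]    -- unfold one step, then use the recursive fact
```

In the first case both sides compute to the same term, which is why reflexivity is enough there.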
Well, you can ask it what Refl is, and it says, oh, you can see here, it landed here: Refl is a sort of built-in proof that says that two things are equal if they're identical, if they're the same, right? And that kind of makes sense in the first equation, because the first equation of appendAssociative corresponds to the first equation of plus plus. Can you see how it corresponds? The first list is empty in the first equation of appendAssociative, and the first list is empty up there with plus plus. Can you see that? Okay. And then it just says, well, then, not quite obviously, but the way that the definition works, it comes out just right. What's really important is that Idris accepts that proof. The second one is slightly more tricky, but again we can get help, because we know that appendAssociative is this recursive function; it recurses on the first argument. So we're just going to do the same thing in the proof, and you can tell Idris that it should use that fact, if you will. So here's the recursive call, and again, I'm too lazy, so I push a button. If I push that button, it also puts in Refl, and it loads. So it might be a mystery to you how it works, but this is a proof of the associative property of list concatenation in Idris. And since Idris helps you write it, it's kind of fun to do that, oddly enough even for somebody who doesn't usually enjoy proofs. So the way that you program in Idris
(we haven't done that a lot in this talk) is that you put a lot of information in the types, and the more information you put in the types, the better Idris will get at figuring out the correct definition, and you don't have to do it by yourself. Okay, so that's really nice. So we got that. And these kinds of proof-assistant systems, such as Idris, have been used in a lot of real-world systems. One prominent example is seL4, a version of the L4 microkernel. It has a long history, but important properties of that kernel have been verified. It runs in the security enclave on iOS, and even though it's written in C, it provably does not have buffer overflows or a lot of the nasty things that are responsible for a lot of security exploits. CompCert is another example, which is a verified... I should mention, this has been verified with the help of a tool called Isabelle, also great fun to use. There's a project called CompCert, which is a verified C compiler, which is important for a lot of certified software, where, you know, the source code might be certified, but how do you know that the compiler generates correct code?
And you know because it's been proven to be correct. And even there you can cheat sometimes. For example, a register allocator is very complicated and very hard to prove right. But what you can do is write a checker that checks that the register allocator did its job well, and you can verify the checker. So there are tools for that. We've seen Idris, and there are a number of other tools, and they're getting more and more mature, and they really are great fun. But, you know, switching down a gear a little bit, there are lots of useful properties that you can look for in your programs. Commutativity might be useful: that you can switch the two arguments of an operation. Also, if you have relations, you might remember from some math class that there are properties like reflexivity, symmetry, antisymmetry, and transitivity. Reflexivity says that a is always related to a. Symmetry says that if a and b are related one way, they need to be related the other way too. Antisymmetry intuitively would seem kind of the opposite, but that doesn't make sense. It just says that if two things are related both ways around, then they must be the same; so, for example, orders like less-or-equal are antisymmetric. And transitivity just says that you can form chains of your relation. So that's a little dictionary of useful properties that you can look for. Let me close with one fun, fancy property that you've probably seen somewhere, and that property is called functor. You might have seen, in your programming language, in your list library or in your stream library, that there's a function called map, right? And, you know, even Java has that and has had it for many years. And what map does, you know, in Java for example, it says Stream where it might be lists, right?
It says, well, if I have a list of a's, I can apply a function to each element of that list. But you can generalize that. It doesn't have to be lists. It could be an optional of a's, for example; you could also apply a function to the value that's in there. So you can generalize that notion, and then it's a functor. And of course in Idris you can write down equations for functors. Ignore the technicalities here, but if you pick out where it says functor identity, the middle row says g v equals v, which means g is the identity function: when you feed in v, you always get v back. And when you use map with the identity function, you apply the identity function to each element of your list, or whatever it is, and then you always get back the same list. And here it just says you get function composition: if you apply one function and then another function, and you do that either inside or outside the map, you should also get the same result. So just as there's associativity with monoids, with functors there are these laws. And you might think, well, where would I look for a functor? I've never seen a functor except for the ones on streams. And a couple of weeks ago in a training somebody said, well, you always start with that animal example; shouldn't you look for a functor there? And, you know, sweat broke out on my forehead, and I was like, where is that going to go? But we came up with this. So if you go back, you can see that what you need for functors is a type parameter, right? And so you just look for a place to stick a type parameter, any place here at all. And if you look at Dillo and Parrot, they both prominently have this weight thing, right?
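The two functor laws stated a moment ago, identity and composition, can be spot-checked on concrete values. The following Python sketch is only an illustration under assumptions: map_list and map_optional are hypothetical stand-ins for map on lists and on optional values (with None playing the role of the empty optional), and the helper names are made up for this example.

```python
from typing import Any, Callable, List, Optional


def map_list(f: Callable[[Any], Any], xs: List[Any]) -> List[Any]:
    """map for lists: apply f to every element."""
    return [f(x) for x in xs]


def map_optional(f: Callable[[Any], Any], value: Optional[Any]) -> Optional[Any]:
    """map for optionals: apply f to the value if there is one."""
    return None if value is None else f(value)


def identity(x: Any) -> Any:
    return x


def compose(f: Callable[[Any], Any], g: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Function composition: first g, then f."""
    return lambda x: f(g(x))


def identity_law_holds(fmap: Callable, value: Any) -> bool:
    """Mapping the identity function gives back the same structure."""
    return fmap(identity, value) == value


def composition_law_holds(fmap: Callable, f: Callable, g: Callable, value: Any) -> bool:
    """Composing inside the map equals mapping twice, outside."""
    return fmap(compose(f, g), value) == fmap(f, fmap(g, value))


if __name__ == "__main__":
    increment = lambda n: n + 1
    double = lambda n: n * 2
    print(identity_law_holds(map_list, [1, 2, 3]))
    print(composition_law_holds(map_list, increment, double, [1, 2, 3]))
    print(composition_law_holds(map_optional, increment, double, 5))
```

Checks like these only sample the laws on a few values; stated as properties over generated inputs, they are the kind of thing a QuickCheck-style tool can run many times.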
And so that seems more important than the other two properties, which are specific to a particular kind of animal. And so the thing to do is, well, you can see I replaced uppercase Weight by lowercase weight and made that into a type parameter, and I can then provide a functor implementation down there. And you might think, what is that good for? Well, I don't know. Well, one thing that you could do is provide a different representation for weights. Another thing: if you look at the type for run over animal, it says animal weight, arrow, animal weight, and weight is a type variable. What that type signature tells you is that run over animal does not know what weight is, and that means that the weight cannot change as a result of that function. If you see that in the type signature, you get an immediate small benefit. So you get a benefit even with silly examples such as this one. And that really brings me to the end. So in your software, in your domain model, look for a combinator: look for a function that will combine two things into a bigger thing. See if you can make that thing associative, and look for a neutral element, and very often you will find one. Make it a monoid. You know, say monoid a couple of times and you'll remember it. Generally, write properties for the operations in your software. Test those properties using QuickCheck. If you feel like it and you have a lot of time, prove them correct. If you've found the monoid, find the functor next. And it might take time. You know, I'm very old, as you noticed at the beginning, so it gets easier over the years, and it will just seem like a regular staple of your arsenal when you program. And of course, when the important properties in your program have been written down, have been tested with QuickCheck, or even proven, then you can sleep much more soundly than maybe you currently
can. Thank you very much.
Thank you, Mike. So I see we have three minutes for questions. Maybe that's two or three questions. If you have any, come to the microphones, please. Do we have a question from the internet? No, not yet. So, microphone two.
Right. Hi. So QuickCheck generated a hundred tests. What can we say about the quality of these tests? Are these enough tests? Can I say the program is correct with a hundred tests? Are these tests good?
Yeah, very good question. So, what can we say about the quality of the tests? Indeed, if you really do industrial-strength applications of QuickCheck, QuickCheck comes with a bunch of tools that let you look, for example, at the distribution of the individual examples generated. And, well, you didn't quite see me do that, but for your domain objects you will typically write generators that will generate those examples, and you can reason about the distribution of those, and you absolutely should do that, because otherwise you might miss large areas of your test space. But there are tools, and they help you do that. And even if you don't do that, I found a lot of bugs in my software even without worrying about that. But if you go beyond that, look at the distribution.
Thank you.
Next one, please. Number two.
Let's say I've hacked a program, for example in Java or C# or whatever. How do I apply what I learned so far? Where do I start when I have an already completed C# program? How do I apply QuickCheck to that?
So just pragmatically, because it's written in C#? Is that the question?
So, well, I have to be very concrete here. I mean, if you can think of properties, right, one way to do that: in C# you can link with F#, and there's a QuickCheck version for F# called FsCheck. And FsCheck, even though it is itself written in F#, can also be used from C#. So you have two options: you can write your tests in a slightly more awkward fashion in C#, or you could just link your code with an F# test suite and write them down there. And there is a fairly reasonable Java QuickCheck, I hear. Another idea would be to use the slightly fancier QuickChecks that exist for Scala and Clojure, and I'm sure there's one for Kotlin as well, and link that against your Java code. Does that answer your question?
And so, whatever language I use, I have to find out what is the correct implementation of QuickCheck?
Yeah, yeah. But as I said, a fun thing I usually do in training is I just look up QuickCheck: somebody calls out a language, you know, and I'll look up QuickCheck PHP or something like that, and there is one, sure enough, right?
Thank you. I didn't know about it before.
All right, thank you, and thank you, Mike, again for showing us the way to sleeping soundly. Thank you.