OK, thank you very much. Why Functional Programming Matters. Mary and I put this talk together around that title: we're going to look back over the history of functional programming, and also at some of the papers that have inspired us in particular. I'm going to take you on that journey with me, although I'm the one giving the talk. So, Why Functional Programming Matters is a very well known paper, and it's very popular; for many people coming into the field, it's the first paper that they read. Of course it's not when functional programming started. Functional programming started much earlier than that. In fact, you could already write functional programs back in 1940, so let's look at what functional programming looked like in 1940. Let's start with booleans. Who needs booleans? What is a boolean for? Making choices. If they're going to make choices, they've got to have at least two things to choose between, so let's give them two arguments, we'll call them X and Y. And now, true, we'll just make the first choice, and false, we'll make the second. So, back in the 40s, people thought let's represent booleans this way. And well, do these really work as booleans? If so, we should be able to define if then else. And yes, we can, it's easy. We define if then else to take a boolean, the then branch and the else branch, and it just calls the boolean to make the choice between the then and else branches. So that's great. We've got functions that behave like booleans, we don't need booleans. Well, if we don't need booleans, what about numbers? Who needs them? Let's talk about positive integers here. What's a positive integer for? Well, counting loop iterations. Let's see if we can count loop iterations without numbers. So, what I want to do now is I want to define two to be a function that takes a loop body and a starting value and just runs the loop twice. So we apply the function twice to the starting value. So that's going to be my representation of two, and one is going to apply the loop body once, and zero is going to apply it zero times, and so on. So, you can see that for every positive integer, I can construct a function that I claim corresponds to it, but is this actually a good way of representing integers? Can I recover a normal integer given one of these functions? Well, sure I can. All I have to do is call the function and give it a loop body, this is Haskell notation, a loop body that just increments a value, starting from the value zero. So, now if I sit in a Haskell interpreter and I call my two function and I give it the increment function and zero as arguments, what do I get? Two, I've converted back to a normal integer. So, there we are. So, I can represent natural numbers this way. Can I do anything with them? Well, look, I can add them together.
I can do that like this. How do I add m and n? Well, I'm going to want to construct a loop that iterates f m plus n times, and I can do that just by iterating f n times and then m times. Okay, that's addition. Cool. Oh, what about multiplication? Well, if sequencing loops gives me addition, nesting loops gives me multiplication. Right, so here I'm going to iterate m times, that's the outer loop, an inner loop that iterates f n times. How many times do I call f? m times n. Okay, does this really work? Well, let's try it out. So, I can sit in a Haskell interpreter again, say let's add one to multiply two by two, that should be five, and then five times I'll iterate the increment function on zero. Sure enough, we get five. Cool, so I can represent integers and I can do arithmetic on them. Well now, there was a time when every talk on functional programming had to include the factorial function. Here it is. Here is the factorial function, à la 1940. Okay, so factorial n is, here's my if then else. So that's me using functions to model booleans, and if n is zero then I'm going to return that function I called one. Otherwise, I'm using my arithmetic implemented by functions to multiply n by a recursive call of factorial with decrement n. Does it work? Sure, in the Haskell interpreter I can take factorial of one plus two times two, that's factorial of five, and that many times I'll iterate incrementing zero, and what do I get? 120, that's right, and it's not even terribly slow. So there we are, there's factorial. Oh, there are a couple of auxiliary functions that I haven't shown you: is zero is one of them, that's really easy. How do I tell if one of these functions represents zero? I iterate a loop body that returns false n times, starting from true. So if n is zero I'll get true, if it's anything else I'll get false. And the other one I didn't show you was decrementing, and the less said about that the better. But you can do it, it works, I promise you. So there we are. We see that booleans, integers and actually any algebraic data type, any data structure, can be entirely replaced by functions. That's cool. Why? Well, the guy who came up with this was Alonzo Church. These are the Church encodings of values, and what Church wanted to do was to show that functional programming could be a foundation for mathematics. So you didn't need anything else to start off with. You could just start off with functions and then construct everything else that you might want from them. Which is interesting theoretically, but this is not just of theoretical interest. It turns out that early versions of the Glasgow Haskell compiler actually implemented data structures this way. Not numbers and booleans of course, but other data structures were implemented by functions in this way. Because it turned out that for a while there in the early days this was actually the fastest way of doing it, believe it or not. I remember the guy who discovered that had a different functional language compiler, which he built from scratch, and to save time he had implemented data structures using the Church encodings, and then one day he decided to do it properly. And so he built in data structures as we know and love them, and the code went slower. That was Jon Fairbairn. So I should warn you, I've shown you the Haskell interpreter evaluating these things. If you try it at home you're going to get an error message. GHCi says that the occurs check failed, we can't construct an infinite type, blah. Okay, that sounds kind of serious.
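In modern Haskell, the Church encodings described above look roughly like this. This is only a sketch: the names true, false, zero, one, two, toInt, add and mul are ours, not necessarily those on the slides.

```haskell
-- Booleans: a boolean is a function that chooses between two arguments.
true, false :: a -> a -> a
true  x _ = x
false _ y = y

ifThenElse :: (a -> a -> a) -> a -> a -> a
ifThenElse b t e = b t e

-- Numbers: the number n is a function that applies a loop body n times.
zero, one, two :: (a -> a) -> a -> a
zero _ x = x
one  f x = f x
two  f x = f (f x)

-- Recover an ordinary integer by iterating an increment, starting from 0.
toInt :: ((Int -> Int) -> Int -> Int) -> Int
toInt n = n (+1) 0                -- toInt two == 2

-- Sequencing loops gives addition; nesting loops gives multiplication.
add, mul :: ((a -> a) -> a -> a) -> ((a -> a) -> a -> a) -> (a -> a) -> a -> a
add m n f x = m f (n f x)         -- iterate f n times, then m more times
mul m n f x = m (n f) x           -- m iterations of an inner n-iteration loop
```

With these definitions, toInt (add one (mul two two)) evaluates to 5, matching the example in the talk. Factorial, is zero and decrement are left out of this sketch because, as explained just below, they need extra type annotations before GHC will accept them.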
It's a little hard to tell just from that message what the problem is, though. But don't worry, GHCi is very helpful. It tells you it was expecting this type but it got this one. This is the first time I've had a chance to use a three point font on a slide. Okay, so now you can see what the problem was, I suppose. No? Don't worry, there's more. These are the types of the relevant bindings. So what's going on here is that the tricks I've shown you work very well in an untyped language, and you can do them in Haskell too, but they go a little bit beyond what the Haskell type checker can figure out for itself. So you have to give it a little bit of help. You have to add these two lines in red in order to tell the compiler what the correct type of the factorial function is. And if you do that, then all the examples I showed you just go through perfectly. So that's great: in the 40s we could write functional programs. But we couldn't run them. That was kind of a shame, wasn't it? So then, round about 1960, John McCarthy built the first Lisp implementation, and suddenly it was possible to run functional programs for the first time. Here is the factorial function in Lisp. It's pretty similar to the 1940s version. So Lisp wasn't a purely functional language, but nevertheless you could do a lot of what we think of as functional programming today. Right from the start it had the map function, which was called maplist in Lisp. So you could map the factorial function down a list of five numbers and sure enough you would get the list of factorials. So higher order functions were in there from the start. And now, in the early 60s there was a lot of work going on in the research community to do with functional programming. But I want to skip to a paper from 1965 which is a real classic: The Next 700 Programming Languages by Peter Landin. So Peter Landin was working for Univac at the time, and he was aware of no less than 700 different special programming languages already by 1965. Why were there 700 languages? Well, they were used in over 700 application areas. And Peter Landin thought that even 700 programming languages would be far too many. All that was needed was one, his programming language, which he called ISWIM, for "if you see what I mean", plus 700 different libraries. That was his idea. So at this time every application area would have its own programming language, and that was unnecessary, as Landin saw. So he presented ISWIM, which was a very nice idealised language. Here is the factorial function in ISWIM, we haven't forgotten that. And it was hugely influential as an early functional language. But I'm not going to focus on ISWIM so much as on another point that Peter Landin made very strongly. Peter Landin said it's not only important that we should be able to express programs, it's important that we should be able to reason about them. And he was very interested in equivalences, laws that relate different programs, like this one that says that if you map F over the reverse of a list, that's the same as reversing the result of mapping F over the list. Now, this law is nearly true in Lisp. It's not actually true, because if F has side effects then we're going to be performing the calls in a different order, and so we may end up with different results. But Landin was very opposed to laws that are nearly true. He thought a law should hold always, without question, and he even discusses in his paper whether or not that's a good thing.
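As a sketch, the map/reverse law can be written down as an executable Haskell equation; the property name here is ours.

```haskell
-- Landin's example law as an executable property.  In a pure language it
-- holds for every f and xs; in Lisp it is only "nearly true", because f
-- might have side effects and the calls happen in a different order.
prop_mapReverse :: (Int -> Int) -> [Int] -> Bool
prop_mapReverse f xs = map f (reverse xs) == reverse (map f xs)
```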
So you could imagine somebody coming along and saying, well, what's the point of having this law? What's the point of having two programs that do exactly the same thing? Wouldn't it be better if they did different things, because then you'd have a choice? No, Landin thunders: expressive power should be by design rather than by accident. Yes, Mr Landin. So this is really the first paper that emphasised laws so much, and I think it's a tremendously important idea. So let's skip on a bit further again. There was lots more research going on, and then in 1977 John Backus won the Turing Award. Now, John Backus is somebody that everybody who studies computer science has heard of, because of Backus normal form for grammars, but he won the Turing Award because he was the man who led the development of the first Fortran compiler. Now, even by 1977 it was quite popular among computer scientists to be a bit rude about Fortran, but that doesn't mean that one should forget what an enormous achievement that first compiler was. It was the first compiler that could generate better machine code than a person could write, and in those days, when computers were the expensive part, all people cared about was efficiency. So Fortran made high level language programming possible, and that was a huge step forward. So it was a very well deserved Turing Award, and when you win the Turing Award you get to give a lecture and write a paper about it. So of course John Backus could have written the paper about how they developed the Fortran compiler. I'm sure that's what people expected. I'm sure it would have been a great paper. He wrote one like that later. But in his Turing Award lecture he didn't talk about Fortran at all. He presented this talk, Can Programming Be Liberated from the von Neumann Style?, which was a manifesto for functional programming and a description of his idealised functional programming language. And this paper is a wonderful read. The opening paragraphs are just fantastic. I've got some of his text on my slides. He starts off: conventional programming languages are growing ever more enormous, but not stronger. I think he had Ada in mind in particular at that time. Inherent defects at the most basic level cause them to be both fat and weak. It's terrible, isn't it? What are these defects? Their primitive word-at-a-time style of programming inherited from their common ancestor, the von Neumann computer. So he was very negative about word-at-a-time programming. And he said, because the processor operates on one word at a time, our programming languages operate on one word at a time. And that creates this bottleneck between what we want to do and our data. And by this bottleneck he was thinking of the bus, basically, that connects the processor to memory. And he said, I call this the von Neumann bottleneck. This paper is where that term comes from, which has since become so very well known. So next time you're going to buy a computer, don't ask how fast the bus is. Ask how fast the von Neumann bottleneck is. What else was wrong with them? Their inability to effectively use powerful combining forms for building new programs from existing ones. What combining forms was he thinking of? Well, I'm going to illustrate a couple of his combining forms with diagrams. So in these diagrams, boxes are going to represent functions mapping inputs to outputs, and the inputs are coming from the right. Might surprise you a little bit at first.
So one of Backus's combining forms he called apply-to-all, which he wrote as alpha f, and the idea here is that if you've got a function f you can apply alpha to it and get a function that takes a list of inputs and just applies f to each one. It's map. So that was one of the forms that he thought was important. Another one he called construction. So here the idea is that you have a number of different functions, f1 to f4, these coloured boxes, but you give them all the same input. By making a construction of these functions you get something that, when given an input, passes it through each of the functions and makes a list of the results. What else was wrong with conventional languages? Their lack of useful mathematical properties for reasoning about programs. So Backus was also very concerned about laws. Here's one of his laws. This represents a composition of a construction with another function g, and if you look at the diagram it's pretty obvious this will do the same thing as this, provided g has no side effects. It doesn't matter whether we apply g once and then duplicate the result, or whether we duplicate the input and then apply g to each copy, and that tells us that those two terms at the top and the bottom, in Backus's programming language FP, must always be equal. And what about the von Neumann bottleneck, the word-at-a-time programming? Well, Backus proposed replacing programs like this. This computes the scalar product of two vectors a and b, and it's written in Algol 60. So rather than write this program, which manipulates the vectors one word at a time, Backus proposed writing a scalar product function which first of all takes a pair of vectors, transposes them into a vector of pairs, applies multiplication to all those pairs and then folds addition to add all the results together, and that gives you the scalar product. Actually he wouldn't have written it like this, he would have written it like this, a little bit of APL envy there perhaps. But the point is that this example demonstrates very powerful ways to compose simple components into complex programs. So this really was a landmark paper. The fact that John Backus, the father of Fortran, devoted his Turing Award lecture to a manifesto for functional programming inspired a generation of researchers, myself among them. If you haven't read this paper, google "John Backus Turing Award" today and read it, you will not regret doing so. I should say maybe that the first half of the paper is really the part that has stood the test of time and been tremendously influential. The second half of the paper was Backus's ideas for doing IO in functional languages, which have not had the same impact. It's the first half of the paper, really, that I absolutely strongly recommend. Everybody should read that. So this paper prompted a huge amount of research by people who were inspired. Among them, Mary's PhD supervisor, Peter Henderson. So let me tell you about some of the work that he was doing when we first met him. Peter had two loves: functional programming and the drawings of Escher. And he decided that he would like to combine the two. And if you look at this Escher print, which is called Square Limit, you can see that it has some kind of a recursive structure. And recursion is the bread and butter of functional programming. So Peter thought, wouldn't it be great to construct this picture by functional programming? So in those days you did computer graphics by sending drawing commands to a plotter, which moved a pen on a piece of paper.
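Going back to Backus's combining forms for a moment, here is a rough Haskell rendering of apply-to-all, construction and the whole-value scalar product. These are Haskell stand-ins for his FP notation, and the names are ours.

```haskell
-- Backus's "apply to all" (alpha f) is just map; "construction" applies a
-- list of functions to one shared input and collects the results.
applyToAll :: (a -> b) -> [a] -> [b]
applyToAll = map

construction :: [a -> b] -> a -> [b]
construction fs x = [f x | f <- fs]

-- The whole-value scalar product: pair up the two vectors, multiply each
-- pair, then fold addition over the products.  No word-at-a-time loop.
scalarProduct :: Num a => [a] -> [a] -> a
scalarProduct xs ys = foldr (+) 0 (map (uncurry (*)) (zip xs ys))
-- e.g. scalarProduct [1,2,3] [4,5,6] == 32
```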
Yes, really, it was that long ago. So graphics programs would be very much word at a time, one drawing command at a time. But Peter said, let's not do that. Let's let a picture be a value. So here's a value, fish. And as soon as a picture is a value, then you can define functions that combine pictures. For example, you can make this one by overlaying the original fish with rotate, rotate fish. So here rotate is a 90 degree rotation. And if you do it twice, then we rotate the fish around. And then they fit together. The fact that they fit is Escher's genius. So that's cool. Here's something else we can do. So here we've got the original fish and two smaller copies. This one is the original fish. This one is fish two. And fish two is a 45 degree rotation of the original fish. And it's also shrunk a little bit in order to fit in the same box. So it's actually been rotated 45 degrees and then flipped around the y axis. And then this is fish three, which is a further rotation of fish two. And now we've got a fish and some smaller fish. That reminds us of the original print, doesn't it? Let's keep going. What else can we do? Well, here we've taken that smaller fish, fish two, and just put it together four times in four different rotations. And we get this nice little picture of fish swimming in a circle. So Peter went on to define more and more operations for combining pictures, like quartet, that would take four pictures and draw them in the four quarters of a square, or cycle, that would take one picture and draw that in the four quarters of a square, but with a rotation applied to make them kind of swim around. And if you have an implementation of these things, it's great fun to play with. You can just try whatever you like. Here we've taken the t image, with a fish and two little ones, and just cycled it around. Here I've made a quartet of this thing. Escher never drew this. But it doesn't matter. We can make our own pictures, whatever we feel like. So in order to construct the final picture, Peter wanted to make an edge. So here we've made a quartet of two empty pictures, that's what nil is supposed to be, you can see the gaps up there, and that t picture in one of its rotations. And you can see that here we've got larger fish on the inside and smaller fish on the outside. But really we'd like to have even smaller fish up here, wouldn't we? So let's take this picture. Let's call it side one. And then let's make this picture, where we replace the empty pictures by smaller copies of side one. So that code does that. And you can see that by continuing in this way, we're going to be able to get smaller and smaller fish approaching the side of the picture. We need a corner as well. Let's use this picture as the corner. So once again, I've got a corner with three empty squares around it. And now if I call this corner one, and put copies of that in the empty squares, we'll get corner two, where the fish are getting a bit smaller, and by repeating this process once again, I can make smaller and smaller and smaller fish approaching the corner. So Peter constructed Square Limit at the end from the picture u, this side and this corner, in a nonet. I haven't shown you the definition; that's a square divided into nine little squares. So we've got a corner in each corner, and we've got the side on each of the sides, and then the u thing in the middle. And what does that give you? Square Limit. Do you remember the original picture? Not bad, eh? That's pretty close.
So this was a delightful, fun exercise. But Peter didn't stop there, because he still had to implement pictures somehow. Let's see, how can we represent pictures? Hmm, well, do we need some kind of complex data structure? No, let's let pictures be functions. So Peter's idea here was to say, well, when we draw a picture, we have to end up, at the end of the day, with a list of drawing commands. But let's let a picture be a function that constructs that list. And we might want to draw a picture in a variety of places and in a variety of sizes. So a picture will just be a function that takes three vectors, A, B and C, that specify a box on the plane. Okay, so vector A specifies where the bottom left hand corner is, and then B and C specify the edges of the box. And then a picture will return drawing commands that draw itself within that box. So the really nice thing about representing pictures as functions is that it made picture combinators really easy to define. So here's how you might overlay two pictures. You just draw them both in the same box and take the union of the drawing commands. Here's how you put picture P beside picture Q. So, just to make this easier to understand: we take the box A, B, C, we divide it into two boxes, you figure out what the vectors there need to be, and then we draw P in the left box and Q in the right box and take the union. You can even rotate a picture by fiddling with the vectors. Okay, so if we take this square and rotate it, we can see that its bottom left hand corner is going to end up here, and we have to swap the axes B and C and negate the B one. That gives us a 90-degree anticlockwise rotation. So Peter was able to define his picture combinators really easily, and thanks to those simple definitions, he could also easily prove laws about them, like this one: that if you put P above Q and rotate it, you get the same result as if you rotate P and rotate Q and put them beside each other. And that's a little bit of simple algebra, because the implementations of these functions are so simple. And Peter studied those laws very much, and he says this in the paper: it seems there's a positive correlation between the simplicity of the rules, the laws, and the quality of the algebra as a description tool. In other words, by making sure that the functions he defined had these nice properties, he got a language for describing pictures that was more expressive and nicer to use. So I think this is a really great example of functional programming at its best. And we've seen a number of examples of ideas coming back again and again. The idea of programming with the whole value, the whole picture, the whole vector, the whole array, not word by word. Of defining combining forms like map and construction and overlay. And of using the algebra, the laws, as a litmus test to guide the design of these combining forms. And another idea, dating back to Church, of using functions as representations, which is tremendously important. So I want to very quickly now hop ahead to a paper from 1994, which is a bit less well known, which I really like and which meant a great deal to me. It has the somewhat clumsy name Haskell versus Ada versus C++ versus Awk versus dot dot dot. So this paper originates from a big DARPA project in software prototyping. Software prototyping was very trendy at that time, so DARPA wanted to support work on it.
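Returning to Henderson's pictures for a moment, a minimal, self-contained sketch of the pictures-as-functions idea might look like this in Haskell. The names are ours, Henderson's actual definitions differ in detail, and the arrangement of rotations in cyclePic is only one plausible choice.

```haskell
-- A picture is a function from a bounding box, given by three vectors
-- a (bottom-left corner), b and c (the box edges), to the drawing
-- commands (line segments here) that render the picture in that box.
type Vec     = (Double, Double)
data Segment = Segment Vec Vec deriving Show
type Picture = Vec -> Vec -> Vec -> [Segment]

plus :: Vec -> Vec -> Vec
plus (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)

neg, half :: Vec -> Vec
neg  (x, y) = (-x, -y)
half (x, y) = (x / 2, y / 2)

-- Draw both pictures in the same box and take the union of the commands.
overlay :: Picture -> Picture -> Picture
overlay p q a b c = p a b c ++ q a b c

-- Split the box in two along b: p goes in the left half, q in the right.
beside :: Picture -> Picture -> Picture
beside p q a b c = p a (half b) c ++ q (a `plus` half b) (half b) c

-- Split the box in two along c: p goes on top, q underneath.
above :: Picture -> Picture -> Picture
above p q a b c = p (a `plus` half c) b (half c) ++ q a b (half c)

-- Rotate 90 degrees anticlockwise just by fiddling with the box vectors:
-- the new corner is a + b, and the axes are swapped with b negated.
rot :: Picture -> Picture
rot p a b c = p (a `plus` b) c (neg b)

-- With those in place, quartet and cycle are one-liners.
quartet :: Picture -> Picture -> Picture -> Picture -> Picture
quartet p q r s = above (beside p q) (beside r s)

cyclePic :: Picture -> Picture
cyclePic p = quartet p (rot (rot (rot p))) (rot p) (rot (rot p))
```

Because every combinator is a few lines of vector arithmetic, laws like "rotating P above Q equals rotated P beside rotated Q" follow by simple equational reasoning over these definitions.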
And you couldn't get DARPA funding to write Haskell programs for real use in those days, but Paul got some DARPA money for using Haskell to build software prototypes. And many other people got money from the same project. So then at some point DARPA wanted to evaluate how well people were doing. So they gave everybody the same problem to prototype, and they were to record how much code they had to write and how long it took. The problem was a geometric one. It involved defining regions in the plane with complex shapes, and DARPA being DARPA, these regions had names like weapon doctrine and slave doctrine, and there are points in here called hostile aircraft and so on. So the prototype was supposed to input a description of this stuff and then analyse it and output things that would say, you know, the commercial aircraft was in the engageability zone and so on. Mark Jones got the job of implementing this prototype in Haskell. You can see that the key thing here is you have to work with regions with complex shapes. You've got to represent regions somehow, and there are lots of different shapes of them. It could be a very complex data structure. How do you think Mark Jones represented a region? A function from point to Bool. And that made the combinators on regions extremely easy to define. I mean, the outside of a region, it's just, you know, the negation of what the region would return. The intersection or the union, they're really easy to define. And as a result, Mark's prototype was very, very tiny. I don't know if you can read this, but this is Mark's code. He wrote 85 lines of code. The Ada solution was over 760 lines. C++, 1100 lines. And there were some others that approached 85 lines. Many of them didn't work, though. Mark's did. In fact, when Mark's solution was submitted, the evaluators at first thought he hadn't solved the problem: he's just written down a specification, where's the code? But it was actually executable. What I think is a real shame is that Mark had written 29 lines of type signatures and type synonyms that the type checker would have inferred by itself. So his solution could have been 29 lines smaller. So anyway, when these results came in, the evaluators thought there's something funny going on, they're cheating. You might see that Haskell appears in this list twice. So what's the second one? The evaluators gave the same problem to a graduate student somewhere else, without telling Paul, and they just said, okay, you've got eight days to learn Haskell, then see if you can solve the problem. And the student did. The second smallest solution. How about that? It's almost twice as large as Mark's, but well, he only had eight days to learn Haskell. And it was the second best solution. So you would have thought at this point DARPA would say, wow, Haskell is clearly the best prototyping language. But they didn't. They said it's too cute for its own good; higher order functions are just a trick, probably not useful in other contexts. Oh well, so much for that. Let me go back in time a little bit. I want to talk about another pair of classic papers that really inspired me. These were the two papers, both of which came out in 1976, that introduced and publicized lazy evaluation. Lazy evaluation had been discovered before this, but these were the two papers that put it on the map. And one of the authors was Peter Henderson, Mary's supervisor.
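For a flavour of the region idea from the prototyping study above, here is a sketch along those lines; the names are ours, and Mark Jones's actual code differed.

```haskell
-- A region is just a predicate on points: a function from point to Bool.
type Point  = (Double, Double)
type Region = Point -> Bool

-- An example primitive region: a disc of radius r centred at the origin.
circle :: Double -> Region
circle r (x, y) = x * x + y * y <= r * r

-- The combinators fall out almost for free.
outside :: Region -> Region
outside reg p = not (reg p)

intersectR, unionR :: Region -> Region -> Region
intersectR r1 r2 p = r1 p && r2 p
unionR     r1 r2 p = r1 p || r2 p

-- e.g. (circle 2 `intersectR` outside (circle 1)) (1.5, 0)  ==  True
```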
So I was really inspired by the idea of lazy evaluation, because it meant that the whole value that we've been talking about could be infinite. So a program might compute a whole value that would be the infinite list of natural numbers, or the list of iterations of a function. So here's f applied zero times, one time, two times. How many iterations? All of them. Now, of course, this doesn't mean that any single program would compute all of them, but the choice of how many to compute is left to the consumer. This definition can compute any number of iterations of f, as many as you want. It's up to the consumer how many you get. And I was very excited about the possibility of separating the decision of how many things to compute from the code that told you how to compute them. Here's an example of a consumer for numerical methods. I had studied numerical analysis, without very much success, a few years earlier, but one of the things I had learned is that numerical methods often take the limit of a sequence of approximations. So let's just make that a consumer. Limit gets an infinite list of approximations and returns the first one it finds where the distance between approximations is less than epsilon. So now we've got a part of many numerical algorithms, but expressed as a reusable, separate function. Here's the Newton-Raphson square root algorithm. That starts from some value, let's say 1.0, and computes successive approximations. Here's the next function that does that. So I can compute Newton-Raphson square roots just by iterating the next function until I reach the limit. And so I'm able to separate the computation of the approximations from taking the limit. Here's code for computing a numerical derivative. So the derivative is the limit of the slope of the graph as points get closer and closer together. So let's compute a series of h's, by starting from 1 and making them smaller and smaller. And then let's map a slope calculation over that and take the limit. And the convergence check can be the same code, even though the algorithm for computing the approximations is different. So I thought this was great. Actually there's something even cooler you can do. So differentiation is done by taking the limit of a slope as h gets smaller and smaller. Integration is done by taking the limit of a sum of areas as h gets smaller and smaller. Okay, this is the trapezium rule for integration, we've all seen that. So in both cases we've got something, an h, that is getting smaller and smaller. Okay, think back to your numerical analysis courses. When you do something like this, the result that you get can be expressed as a plus b times some power of h, where a is the right answer and b times h to the n is the error term. So suppose we take two successive approximations. Now we've got two values, we know what h is in each case. We've got two simultaneous equations for a and b. We can just solve them, right? And if we solve them, what do we get? Well, we get b, which we don't really care about, but we get a, which is the right answer. So we can take a sequence of approximations and just take the first two and compute the right answer from them; who needs the rest of them? Okay, well, it's not quite as brilliant as that, because this formula is only an approximation. What happens is that we get a much more accurate answer than either of these two, but it's not completely right. But never mind, we could take the next two and do it again. So we can make a new sequence of approximations which converges faster.
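Here is a sketch of these numerical pieces, loosely following the definitions in Why Functional Programming Matters; the names and details may differ from the paper and the slides.

```haskell
-- The consumer: take the first approximation whose distance from the
-- previous one is at most eps.  Intended for infinite, converging lists.
within :: Double -> [Double] -> Double
within eps (a : b : rest)
  | abs (a - b) <= eps = b
  | otherwise          = within eps (b : rest)

-- The producer for Newton-Raphson square roots: iterate the improvement
-- step lazily, and let `within` decide how many approximations to force.
newtonSqrt :: Double -> Double -> Double -> Double
newtonSqrt a0 eps n = within eps (iterate next a0)
  where next x = (x + n / x) / 2

-- Numerical differentiation: map a slope calculation over a lazy list of
-- ever smaller h's; the same consumer takes the limit.
easydiff :: (Double -> Double) -> Double -> Double -> Double
easydiff f x h = (f (x + h) - f x) / h

differentiate :: Double -> (Double -> Double) -> Double -> [Double]
differentiate h0 f x = map (easydiff f x) (iterate (/ 2) h0)
-- e.g. within 1e-3 (differentiate 1.0 (\v -> v * v) 1.0)  ~~>  about 2

-- The "even cooler" step: if approximations behave like a + b * h^n and h
-- halves each time, eliminate the error term from successive pairs.
elimerror :: Double -> [Double] -> [Double]
elimerror n (a : b : rest) =
  (b * 2 ** n - a) / (2 ** n - 1) : elimerror n (b : rest)
```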
It's a better numerical method, and we can write a function that does that. So here's a really fast function for computing derivatives. We make the initial approximations in the same way as before. We improve it a couple of times, improving away the first order error term and the second order error term. Now it converges really quickly. Then we take the limit. Isn't that sweet? Everything is programmed separately. Everything is reusable here, thanks to whole value programming. If you do this with integration, which is a much more expensive operation, you get a really good numerical integration algorithm very, very easily. So this is some of the stuff that appears in my paper, Why Functional Programming Matters. And the basic idea that the paper uses again and again is that of connecting a consumer and a producer with lazy evaluation. So the consumer demands values and the producer generates them. So I've shown you how we can do that with numerical approximations. In the paper I also show that you can represent a search space in the same lazy manner. And then the consumer becomes the search strategy that decides which parts of the search space we need to compute and explore. And in the paper I use that to implement the alpha-beta algorithm for noughts and crosses. My implementation was extremely slow, so this was about the most complex game I could solve at the time. But I was able to program the alpha-beta heuristic as a separate function, whereas all the presentations I had seen beforehand had to mix the alpha-beta part with the construction of the game tree. And I've used this idea again and again during my career. There's one more paper that I want to show you that uses the same idea. It's the QuickCheck paper. So for those of you who haven't seen QuickCheck, the idea is that you test your program by writing general properties like this one. So here's a property of reverse that just says, for any xs that is a list of integers, if we take xs and reverse it twice then we get back the original list. And if you give that to QuickCheck, it'll generate 100 random values by default, run the test for each of them, and in this case all the tests pass. So that's great. What's more interesting is what happens if the property is not true. So here's a wrong property that says that if you reverse a list you get the list itself. Of course that's not true for all lists. And in that case QuickCheck quickly finds a random list for which it's not true, such as this one. But you can see that that could be hard to debug. So then it goes on to simplify the failing test, and finally ends up reporting a minimal case that fails, in this case just the list 0, 1. This is the smallest list that falsifies this property. Why is it smallest? Well, if we removed either the zero or the one, we'd get a one element list that is its own reversal. Why is it zero and one? Well, if we replaced the one by zero, so we had zero, zero, we'd have another list that is its own reversal. So this is the simplest case that fails. And these are the cases that QuickCheck presents to the user for further debugging. So how does it work? Well, the property defines a space of all possible tests. And then the QuickCheck search strategy is a consumer that traverses that space, running the tests, first of all at random and then systematically, to find the smallest possible case. Okay, so I think in the interests of time I will skip over some hardware stuff, except to tell you... let's talk about hardware for a little while.
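The reverse properties look roughly like this as ordinary QuickCheck code; this is a standard example, and the property names are ours rather than the ones on the slides.

```haskell
import Test.QuickCheck

-- The property described above: reversing twice gives back the original.
prop_reverse :: [Int] -> Bool
prop_reverse xs = reverse (reverse xs) == xs

-- A deliberately wrong property: a list is rarely its own reverse.
-- QuickCheck finds a random counterexample and shrinks it to a minimal
-- failing case such as [0,1].
prop_wrong :: [Int] -> Bool
prop_wrong xs = reverse xs == xs

main :: IO ()
main = do
  quickCheck prop_reverse   -- passes 100 random tests by default
  quickCheck prop_wrong     -- fails, reporting a shrunk counterexample
```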
So something else that we'll all remember from around this time is the Intel Pentium bug. Here's a calculation you could do on any processor. We take some number, 419-something, divide it by another one, 314-something, then multiply it by the 314 one again. So this should give us back the original number, and we subtract it from itself, and of course we should get zero, which you do on modern processors and you should have got on Intel's processors as well, but on the faulty Pentiums you got 256. So this wasn't good, of course, and for a while Intel tried to pass it off, saying it was very unlikely to happen and it's nearly right, but in the end they had to replace enormous numbers of Pentiums. It cost them $475 million, which is a lot of money even for Intel, and of course then they had all these flawed Pentiums. What did they do with them? Simple, they put them into key rings, and every member of staff got one of these key rings at the time. I've seen one of these; on the back they say: bad companies are destroyed by crises, good companies survive them, great companies are improved by them. So what was the improvement? The improvement was that they recruited this man, Carl Seger, who went to Intel and built his own lazy functional language, FL, which was specially designed for doing proofs about hardware and had proof facilities built in, and using that he built his Forte system, which has thousands of users inside Intel. This is what they use to verify Intel microprocessors and make sure that a Pentium bug like that will never happen again. So FL is used as a design language, a specification language, for scripting all the formal verification tools and so on, and it's also the language that the tools prove things about. So functional programming is really at the heart of Intel's approach to formally verifying circuits. So why hasn't there been another Pentium bug? At least in part it's thanks to functional programming. That's good to know. On the subject of hardware, I also want to just mention a language called Bluespec, which is Haskell for hardware, basically. This is the brainchild of Arvind at MIT, and it combines Haskell for the large scale structure of a hardware description with atomic transition rules from which the compiler can infer parallelism, to generate Verilog that goes into industry standard tools for synthesis after that.
So Bluespec lets you write your hardware in Haskell, basically, and the code you get is often faster than handwritten code, because, as functional programming always does, by simplifying the task of just making things work it gives designers more time to use better algorithms, and it makes architectural changes to designs and so on very easy to do. There's a very nice paper about this by Nikhil, who is the CTO of the Bluespec company, and there's a reference there. And of course Bluespec is essentially a kind of Haskell, so it has QuickCheck. This is work by Matthew Naylor and others in Cambridge, and you can take the Bluespec QuickCheck and generate and shrink tests on the chip. And this blows away hardware designers. I mean, hardware designers are used to maybe having some support for testing on a chip, but having the whole process run on the chip, where the chip just outputs "by the way, I don't work for this input", that's just like magic. So I think that's really cool. So we've made a very long journey, starting from Church numerals and coming all the way to Haskell and QuickCheck running on a chip, but throughout this whole journey there are a number of ideas that have come back again and again and again: programming with whole values, not word at a time; powerful combining forms that satisfy simple laws; and using functions as representations. And I think these four ideas pervade good functional programming, and they've delivered a whole lot of value over the years. So I think we would have made Backus proud. Thank you. Certainly, the table of numbers... yeah, I can hear you... testing, two, three, five, seven... okay. I'm curious why the number of lines of code is used as the criterion for judging goodness rather than development time. Development time? According to development time Haskell still does quite well, but Lisp... that's true, Lisp blew Haskell away completely. The difference, if you read the paper, is that the Haskell one worked. Okay, thank you. But did the Lisp one work? No. Well, if they had kept going until it worked, then provided that took zero time to do, it could have won, but my understanding is that it wasn't anything resembling a complete solution. But you might ask as well, you know, what about lines of documentation? If only Mark had written less documentation, perhaps it could have been much quicker. It's not really a very scientific study, but it's a nice story. That paper, by the way, is still on the web at the Yale functional programming group's website, so one can download it and read it. Any other questions? Oh yes. Yes, so one area where a great deal of research effort is being spent at the moment is on enriching the type system. So a lot of people are very excited about dependent types, and they're going into Haskell, so I guess we're going to find out how useful they really are. I know a lot of people are enthusiastic about that. I'm not as enthusiastic as many others, because I see that there is perhaps a point where you get diminishing returns; there are costs as well as benefits for a more powerful type system. But that's clearly something where a lot of work is being done, and whatever comes out of it, it's going to be interesting. The other things that I wonder about are, first of all, there's more and more interest in constructing software to meet specifications, and our work with QuickCheck is a part of that, right? We have the specification and then we test programs against it. We're not doing proofs, but nevertheless we're identifying inconsistencies between the code and the specification. And the problem that we face
is getting the specification from somewhere, and I think this is a general problem: many people who develop software don't really know what it's supposed to do. Not precisely; not precisely enough to formulate a specification and even test that the code meets it. So if we can't develop specifications, then the whole process of trying to develop software that meets specifications is facing a roadblock. So I think finding ways to develop specifications is one very important problem, for functional programming and all kinds of programming really. And the other thing that I think about is this. When I got started in functional programming, we were in the early days of the software crisis, which was named in 1968, and everybody was worried that software was growing in size so rapidly and our capacity to develop it was not. So when I started out in functional programming, I remember going to a summer school in 1981, and people like David Turner presented functional programming, and it was beautiful, and David said there's no point fiddling with software development, with programming languages; we've got to make a radical revolution, that's what functional programming does, and it will give an order of magnitude improvement in productivity. That's what we need: no factors of two or any of that nonsense, an order of magnitude. And I think the order of magnitude has been delivered. If you look at today's Haskell software and compare it to software in Pascal or C from that time, I think it's easily an order of magnitude shorter, easier to develop, quicker to develop and so on. But the software crisis is still with us. Where is the next order of magnitude coming from? How could something be an order of magnitude more expressive than Haskell? I don't have a good answer to that. I'm excited about the whole area of program synthesis, because that seems to me to be one of the few ideas that offers any hope of another order of magnitude, but if I had a plausible idea of how to get that order of magnitude, that's what I would be working on. Joe Armstrong has been quoted as saying that Erlang never got a static type system because they didn't know how to do it, and recently added that people did offer some ideas, but they broke down because they couldn't really agree on how to enforce a type system on message passing. So what do you see as the future of type systems for concurrency? Is that possible, or is that another problem?
Well, there are strongly typed concurrent programming systems, where you have typed channels, for example. You have to do your concurrency a little bit differently from Erlang, but you can certainly type concurrency, so the fact that you're passing messages doesn't necessarily mean that static typing won't work. The other thing that Erlang was designed to cope with was hot code upgrade, and that is a lot more challenging, because for the first time, you know, when you start running a program you don't know what types might even be defined in code that will be hot loaded later. So coming up with a type system that can deal with that is a much more challenging problem, and any such type system is bound to be a lot more complex. So I don't think that the Erlang designers made a mistake in excluding types; they had good reasons for excluding them. But that doesn't mean that types wouldn't be valuable in that context too. One of the things I would like to do, when I have time, is to build a front end for the BEAM with Haskell-like syntax and a Haskell-like type checker. I think you couldn't do everything that you wanted to do on the BEAM in such a language, but you could do maybe 98% of it, and then drop down into Erlang for those things that you can't type. So I think that could be really useful. Again, that's when I have time. Okay, maybe we're ready for... I guess we're not ready for coffee... ready for the next talk.