Hello everyone, nice to be here. I'm not going to talk that much about Plutus itself, or Marlowe; I'm going to talk about the slightly wider context of assurance within Cardano. We've heard a lot today about the importance of building stuff that works, and I want to talk about that same topic, but slightly more broadly within Cardano, not Plutus and Marlowe specifically. So my talk is "Show Me the Evidence", because it's all about evidence-based approaches to making sure that our software does what we want it to do. This is a very difficult space; the challenges in the cryptocurrency space are legion. There are so many ways we can fail, and not just the obvious ones. We can have flawed protocol design; Aggelos is here to make sure that we don't do that, but that's one way to fail. You can have an incorrect implementation of a design like Ouroboros; it's very easy to do that. There are all the typical software mistakes that occur throughout the industry, and I don't just mean the crypto space, I mean the general programming mistakes that people always make. Amateur cryptography failures are another way to fail, and it's very easy to fail that way. I like to say that amateur cryptography is like amateur brain surgery: not something you should engage in. You can fail in this space by missing performance deadlines. The Ouroboros protocol has real-time timing constraints; they're measured in seconds rather than microseconds, but nevertheless there are deadlines. Systems can fail by collapsing under load: when lots of people try to use the system and hit its maximum capacity, it can fall over, and it's very easy to write systems that do that.
Then there's failure to scale. If we look at Ethereum, it works, we can argue about the details, but broadly speaking it works. Yet as soon as CryptoKitties took over, it could not scale to the capacity that people wanted to use it for. That's another way a system can fail: by failing to scale to meet the capacity needs of all its users. And then there's denial of service, not just distributed denial of service but ordinary denial of service; that's another very easy way that software can fail here. Then there are more esoteric things that could nevertheless cause a cryptocurrency to not succeed, like vulnerability to economic attacks. This is a really interesting one. This is where the costs borne by the people running the system are not properly compensated by the fees charged to the people using the system, and if those don't match up, people are going to say, well, I'm not going to bother to run the system, I'm losing money doing so, and then the system can fail. Bitcoin has a major problem there: the UTXO set gets bigger and bigger, and there's no compensation for storing it. So there are interesting economic attacks on these systems. Then you can have social or voting collapse, which is why we do governance research: the system can fail to make progress because no one can agree, or there are no mechanisms to help people agree. And then macroeconomics, and the list goes on and on. All of these problems are soluble, but they all need different specialisms: cryptographic design is one area, software engineering another. You need experts in all these different areas; it's a highly multidisciplinary space. That's what makes this space really interesting, really challenging, and very hard. And the sad thing is that, across the software industry as a whole, the quality does not live up to the reputation, I think.
When was the last time you had a software update on your phone or your laptop, to fix some critical bug on Patch Tuesday or whatever? We get these patches all the time, because those are flaws in software that was already deployed, and that's actually the stuff of relatively high quality, where they bother to patch things at all. Most random apps that you're using, or other bits of software behind the scenes that you never see, are full of problems all over the place. And just the other day, the O2 network went down. These kinds of failures are everywhere. So normal industry-standard software, which is what we deal with and interact with most of the time, is not great. In less illustrious company I might put it more strongly, but I can safely say that I consider "industry standard" to be a badge of shame, not a badge of pride. Industry-standard software is not high assurance. We don't have any assurance that it does what we think it does. It's not terribly reliable; it has design flaws and bugs. And if you follow industry-standard software development practices, you get industry-standard results. If we do the same thing, this should not surprise us. Doing the same thing over and over and hoping that next time it will be better: unless you do something different, that's not going to happen. And this is a real problem for the cryptocurrency space, because you can't do Patch Tuesday in the cryptocurrency space; they will steal the money before next Tuesday. Before you've had a chance to find the flaw, write the patch and deploy it, the money is gone. Someone at the back was saying that something like five billion dollars overall has been lost on Ethereum smart contracts that were written badly. They will steal your money before you have a chance to patch it. So assurance and quality in the cryptocurrency space is even more important than your mobile phone network working, perhaps. Well, it depends.
I mean, people's lives are at risk when mobile phone networks don't work. But here, there's lots of money at stake. If we believe that it's real money, then we ought to treat it as such. So this leads on to the question of assurance. I would say that most software products follow the faith-based assurance approach, which is to say: I have total faith in my team of crack engineers, they know what they're doing, and they will produce good results. That is to say, you don't really have any assurance approach at all. You're just crossing your fingers and hoping that you can patch the software before something terribly bad happens. Or you can say, well, it's open source and people will find the bugs, so it'll be fine. And it's probably true, they will find the bugs, but you don't know who will find them first. It's quite likely that the people who want to steal the money or take the system down will find the bugs before the people who send you bug reports do. So that is not a great form of assurance either. What's the alternative? The alternative is evidence-based assurance. We have evidence-based policy and evidence-based medicine, and it's the same idea here: you want evidence that the thing does what it's supposed to do. So what does that mean in this case? It means objective evidence, and objective evidence about the software artifacts. This is not about the process; this is not ISO 9000, or 9001, someone will know which. Those are standards for how you build things, the process by which you do it, but that's not what I'm talking about. I mean assurance about the objects that we run, the software that we actually produce at the end. In a sense it doesn't matter how it was made, so long as we produce evidence that it does what it's supposed to do.
And that evidence should be objective, and it should be reviewable by any sufficiently expert third party. That holds us to a standard. It's not that you can't produce good software without producing evidence, but having to produce evidence forces you to take an approach that produces high quality. So I think it's very important for us to take this approach of generating evidence, and to ask our users to demand it of us. Then, of course, the development process itself does have to generate that evidence about the artifact it produces. So let's go into more detail: what does that mean, and how are we doing it, or approaching it, within Cardano? What does good evidence look like? We use well-founded computer science techniques. Phil Wadler was talking earlier about 80 years of academic research and programming language design. There are vast amounts of good computer science that do not get used in everyday industry-standard software, because people think it's too slow, it's too hard, they just want to do stuff quickly. But if we want to do things well and produce evidence about the code we've written, there are lots of techniques out there that generate evidence about the software artifacts. Key among these are formal specifications, not just long documents with lots of words. When we say formal specifications, we mean specifications that use mathematics, generally, or a combination of prose and mathematics. The key idea of a formal specification is that it says concisely what the system is supposed to do. It has to be concise because it has to be comprehensible, at least by a large enough group of experts. It's no good if your specification is as big as your implementation; that doesn't help you check that it's right, because you can't fit either of those things in your head.
A formal specification is expressed in a precise mathematical notation, like the one you've all got in front of you. This is without any prose around it to explain what it does, but it's a precise formal specification in mathematical notation. That's the approach we want to take for many parts of our system where we want to generate evidence: we want specifications for as many bits of our software as possible, or at least the important components within the system. So then you've got a specification, and you've got some code that runs, the real thing, and you want to link those two together. You want evidence that the implementation and the specification agree with each other in some sense: that one is an implementation of the other. There are different notions of exactly how they relate, but you want evidence that gives you reasonable assurance that the implementation meets the specification. There are two basic approaches to that. One is the gold standard: mathematical formal proof. That is great, but even though we've made a lot of improvements over the years, it's still slower and more expensive than just hack, hack, hack. So there's another approach, which is a compromise: testing evidence. That is not as good as proof; it's not the gold standard. But it can be really quite good, especially when compared to industry standard, which, remember, is right down there at the bottom. Testing is a lightweight method, much cheaper and quicker than the proof gold standard, and much higher quality than industry standard. So it's a compromise, but at least it's a compromise between perfect and pretty good, rather than between pretty good and industry standard.
So it's a compromise on the right side. OK, so those are the two basic ways: the gold standard, and lightweight methods, particularly testing, but testing in a rigorous manner, which I'll talk about a little later. So within Cardano, how do we approach this? As people have discussed this morning, and as Aggelos talked about, we start with a lot of peer-reviewed research. That's the starting point for us in Cardano, in many areas of the system. This includes things like the Ouroboros protocol design, of course, but also game theory and incentives, and the various other areas of research that Aggelos talked about this morning. So that's the starting point: some really great ideas, research papers. Then, as I said, we want to write formal specifications, in a concise mathematical style, of what it is we're actually going to build, having got the ideas from all these papers. And then we want to design our software with that specification in mind, so that they can be compared later. It's not that there's a spec, and then we go and write the software and try to compare them at the end; that doesn't work. It turns out, from many years of people trying to do them totally separately, that it's too hard to compare an implementation designed completely independently of its specification. You can co-evolve your implementation and your specification, but they have to be done together to some degree; otherwise they're just incomparable. And then at the end, we want evidence that links our implementation with our specification. For Cardano, we are taking a dual-track approach. Remember, I said there's the gold standard, and then there's the pretty good. We're doing both, and we're doing them on very different time scales.
So we are doing the fast approach, the lightweight formal methods approach, and then, in certain areas where it's particularly critical, we're following up with higher standards of evidence using more formal approaches. I'll give examples of those a little later on. Before I go through one particular example in more detail, the ledger layer and the ledger rules, there's a very interesting problem here in our industry in general, and we see it quite concretely within IOHK: there is a massive gap between the output of academic computer science research and what ends up in products. It's interestingly symbolized here within the room. Aggelos is sitting over here, and Phil and Manuel are sitting over there, and they're at opposite ends of computer science. To the rest of you, those might seem very close together, but being within that sphere myself, I know that even they are quite a long way apart. Their notions of proof look different: one is a logician's proof and one is a mathematician's proof. That might sound like the same thing to you, but a logician looks at many mathematical proofs and goes, oh, I wouldn't do it that way, and the same in the other direction. They're not wrong; they're just different. And if you imagine they're that distance apart, now what about the engineers? They're in the next street over. There's that distance apart in terms of the language, the culture, what we focus on. These are surprisingly large gaps, and this is a problem we have throughout the entire industry. So IOHK funds lots and lots of research, as Charles and Aggelos were saying this morning.
But you can't just write papers and then chuck them at a bunch of engineers and say, all right, go implement that. That does not work out. So what have we done to bridge that gap? One of the things I'm most proud of having helped IOHK with is putting in place a team in between: people who can understand the needs of engineers and the low-level details, and who can also understand, at least to a sufficient degree, the papers that have been written. Philipp, who's sitting over there, is the head of our formal methods group, and that group exists exactly to bridge this gap. This is not a new idea; it has been done before, for example at Microsoft Research. Microsoft introduced an entire group to help bridge the gap between their engineers in Redmond, hacking out code all day, and the ideas from academia they wanted pushed through into their products. We're doing the same thing on a much smaller scale within IOHK, and so far that seems to be working out pretty well, I would say. OK, so that's part of our story. But let me go into a bit more detail about how we're doing this concretely for an important part of our system: the ledger rules. The ledger layer is the layer that sits above the blockchain and says what data is allowed to go on the chain. What makes a valid ledger? Can you add this new transaction onto the end of this ledger? Is that valid or invalid? And there are lots of rules for that, because it's not just transactions that go on the chain; there are all kinds of other bits and bobs, and you need to specify very clearly what those rules are. Then, as I said, we have a whole chain of evidence that we want to produce. So what is this chain of evidence? Credit to Phil over here for the top-quality artwork. At the very high level, we've got specifications and implementations.
We want to link those two and have evidence, of some reasonable quality, that they are in some sense equivalent. This is not going to be gold-standard proof-style evidence; it's going to be testing and some manual checks, but it's nevertheless reasonably good evidence, and in principle third parties could repeat these checks. So on the far side we have our specification document. The next step across is a Haskell executable specification, and we need to produce evidence that these two things are linked. Then at the far end we've got the Haskell implementation, the real implementation, and again there's a link between that and the Haskell executable specification. We'll go into a bit more detail about each of these steps in turn, but that's the big picture: the specification document on one side, which is a PDF with lots of rules; at the far end, the real implementation that actually runs and operates the system; and in between, something I'll come to in more detail. But let's start by thinking about the specification. Specifications are not God-given. They don't just appear in a moment of enlightenment. They are iterated, in the same way that programs are developed. You start by scribbling something down and you go, oh no, that's not right, let's do it again. You share it with your colleagues and you think, and you go round and round. Halfway through, you've written some of it down in a more precise way, and then you go through multiple rounds of review, talking to other people, and it iterates and iterates. So this is still very much an iterative, incremental process of producing a specification, but it has to be produced in this formal mathematical style. I'm not going to go into the details of exactly what's on this slide; it doesn't matter.
The point is more that there is an iterative process of writing these things and reviewing them. And then finally, eventually, you've done enough iterations and enough rounds of review, and you say: OK, this is pretty good; we agree this is actually our specification. I like these ones because they fit on one slide. These are the rules for UTXO and UTXO with witnesses, and they fit on one slide, which is actually very nice. If you want me to read off what one of these things means, let's look at the example. In figure 10, this is the rule that says you need witnesses for a transaction to be valid: each input has to be signed. Or rather, you have to have a signature corresponding to each input that signs the entire transaction body. Someone who did first-year mathematics and had read this through a couple of times could turn it into English in their head and go, yeah, that sounds right to me. So let's read it off. It says: for all i in the transaction inputs, so for each input in the transaction, there exists a verification key and a signature in the set of witnesses such that that signature and that verification key verify the transaction body, and the address in the UTXO entry for that input equals the hash of the verification key. So that is exactly what you need for the witness to be a witness for that input: the hash of the verification key has to correspond to the address, and the signature and verification key together do verify the transaction body. That's what witnesses are, in one line. And what's really good about that is it's really compact. It's so small that you can fit it on one slide and, more importantly, you can fit it in your head.
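To give a feel for how directly that reading translates into code, here is a toy sketch of the witness rule in Haskell. All the names and the stubbed "cryptography" are hypothetical stand-ins, not the real Cardano types: hashing a key is the identity, and a valid signature is just the key plus the length of the body.

```haskell
-- Toy stand-in types; the real spec uses actual key, signature and hash types.
type Addr   = Int      -- the address an input spends from
type VKey   = Int      -- verification key
type Sig    = Int      -- signature
type TxBody = String

-- Stubbed crypto, purely for illustration.
hashKey :: VKey -> Addr
hashKey = id

verify :: VKey -> TxBody -> Sig -> Bool
verify vk body sig = sig == vk + length body

-- The rule: for all inputs i, there exists a (vkey, sig) pair in the
-- witness set such that the signature verifies the transaction body and
-- the key hashes to the address that i spends from.
witnessed :: TxBody -> [Addr] -> [(VKey, Sig)] -> Bool
witnessed body inputAddrs wits =
  all (\addr ->
         any (\(vk, s) -> verify vk body s && hashKey vk == addr) wits)
      inputAddrs
```

The "for all ... there exists ..." of the mathematics becomes `all` and `any` almost word for word, which is exactly the kind of close correspondence the rest of this section relies on.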
And multiple people, with different backgrounds and experiences of the problem, can fit it in their heads and say: yes, that is actually what we want. It's no good if people don't understand these things; then it's just squiggles on a piece of paper. It has to be in a precise language, but a language that enough people can understand, enough experts at least. And that is a challenge. But enough people can understand it and say: yes, not only are these self-consistent rules, they are sensible rules; those are the rules that, in fact, we want. So it's very important that the language in which we express these rules is one that people can understand. It's no good if we write it in some obscure theorem prover that only three people can read. It has to be readable. All right, so that's the first bit. That's Charles and Aggelos over on the far end there saying: yes, these are the ledger rules that we want. Not only are they self-consistent and make some kind of sense, they are the rules that we in fact want to deploy on the chain, and they can read them and understand them. So OK, we've got our specification, and as you see, a lot of effort goes into getting the right specification. The other steps are actually a lot easier. The next one is the link between the specification document and the executable specification. The executable specification is very important because it's what we will use later for testing. This is our step from this style of rule on paper to something that actually runs, so it's an important transition. And we do it by writing the rules in very similar styles but in different languages. Here we go: the top one is the one I showed you a second ago. Well, actually, it's the first of the two rules I showed you earlier; this is the main UTXO transaction rule, without witnesses.
The other rule was the witnesses one. The other part, in the grey, is the Haskell executable specification. They are supposed to be equivalent, but one of them we can actually run, while the other is more readable to other people. So this step requires people to read both and say that, yes, they are the same as each other. That might sound a bit dodgy: how can we really check? But they're written in such a way that they correspond very closely, and we can compare the two by code review; they're not that large, only a few pages to compare. So let's see how they match up. We don't have time to go into every single bit here, but the key point is how they correspond. Each of these things at the top is a predicate, and each corresponds to the Haskell code. Look at the first one: the transaction inputs are a subset of the domain of the UTXO. In the Haskell code, it says the transaction inputs of the tx are a subset of, using an operator equivalent to the mathematical subset operator, the dom of the UTXO. So those correspond quite closely. The question mark, BadInputs, is an additional detail in our executable version that says: if it fails, why did it fail? The mathematical version doesn't say that; it just says the transaction must satisfy all of these things, otherwise the rule doesn't hold. In the executable one, it's useful to know why a failure happened, because that lets us get good test coverage later: we can make sure we generate ledgers that cause each individual failure to happen, so it's useful to be able to classify those failures. All right, so that's one of them. And the same with the next one: they correspond.
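Here is a hypothetical sketch of that executable-spec style: each predicate becomes a check that, on failure, reports why it failed (the BadInputs-style classification the mathematical rule omits), and the transition becomes a function that only fires when every predicate holds. None of these names come from the real codebase; they are toy stand-ins for the shape of the rules.

```haskell
-- Classified failures, so tests can target each one individually.
data PredicateFailure
  = BadInputs          -- some input is not in the current UTxO
  | ValueNotConserved  -- value consumed /= value produced
  deriving (Show, Eq)

type TxIn = Int
type UTxO = [(TxIn, Int)]  -- each entry: (input id, value held)

data Tx = Tx { txIns :: [TxIn], txOuts :: [(TxIn, Int)] }

-- "txins tx is a subset of dom utxo", rendered executably.
inputsExist :: Tx -> UTxO -> Bool
inputsExist tx utxo = all (`elem` map fst utxo) (txIns tx)

-- Value consumed must equal value produced (ignoring fees in this toy).
valueConserved :: Tx -> UTxO -> Bool
valueConserved tx utxo =
  sum [ v | (i, v) <- utxo, i `elem` txIns tx ]
    == sum (map snd (txOuts tx))

-- The transition: check each predicate in turn, reporting the first
-- failure; if all hold, spend the inputs and add the new outputs.
utxoRule :: Tx -> UTxO -> Either PredicateFailure UTxO
utxoRule tx utxo
  | not (inputsExist tx utxo)    = Left BadInputs
  | not (valueConserved tx utxo) = Left ValueNotConserved
  | otherwise = Right ([ e | e@(i, _) <- utxo, i `notElem` txIns tx ]
                         ++ txOuts tx)
```

Because each failure is a distinct constructor, a test generator can deliberately build ledgers that trigger `BadInputs` and ledgers that trigger `ValueNotConserved`, which is the coverage point made above.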
Each of those predicates at the top corresponds to one of the predicates in the Haskell code. There's a little bit of translation, but they're really pretty close to each other, as I think you can see. Same again. And then finally, this one at the bottom. The ones above the line were predicates, things that must be true. The thing below the line is a state transition: if you were in this starting state, then so long as all the predicates are true, you move to the final state. And again, these match up very closely, particularly because we're using Unicode Haskell operators that correspond to the special operators in the mathematics. So we get this very close correspondence, and the manual process of checking that these two things are the same is doable, to a reasonable degree of reliability. Humans have to do it, but it's doable, and anyone could repeat the process. So that's the hairy one in the middle doing that, because that's me, and Philipp over there, and various other people. And then the final step is between the Haskell executable version and the real implementation. Now we're in a setting with two programs, both in Haskell; they both run, they both can be evaluated. And we want to check that the real implementation, which has lots of detail and concurrency and performance and real cryptography, corresponds properly with the executable specification. This we do using testing. The people in the middle there are saying: yes, the tests pass. So let's talk about what those tests are. The tests use the following idea. At the top, we've got the executable specification; at the bottom, the real implementation. We're modelling both here as functions. Of course, the real implementation is more complicated than a function, but it still goes from some state to another state, so it's a very similar idea: inputs and outputs.
What we need to be able to do is say: for any state of the real implementation, we have an abstraction function that takes it to a corresponding state of the executable specification. So long as we can do that, we can generate hundreds of thousands of transitions for the real implementation and the executable specification. What does that mean in this case? It means ledgers. The test generates an example ledger, feeds it to the executable spec, and feeds the same ledger to the real implementation. Starting in equivalent states, they should end up in equivalent states, and you can tell they're equivalent by applying the abstraction function and checking that the results are equal. So what that tells you is that, not for all possible ledgers, that would be proof, but for many, many correct and incorrect ledgers, if you feed the same one to the real implementation and the spec, they give the same results: they end up in equivalent states. That's not proof; it doesn't cover all possible ledgers. But with generators we can produce as many ledgers as we like, and so long as we get good coverage, that kind of testing evidence is really pretty good, and it really finds the bugs. This technique has been used for decades; it's well established and very effective at finding bugs. It's not as good as proof, but it's really pretty good. So that's the story with the testing. Beyond that, I'm not going to go on too much longer. Beyond testing, as I said, we are taking this dual-track approach: in certain cases, where it's really important, we are using more than just lightweight methods and testing. In particular, there are at least four examples where we're doing this.
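Before the four examples, the testing property just described can be sketched minimally in Haskell. This is a hand-rolled, self-contained sketch: exhaustive generation over small ledgers stands in for QuickCheck's random generators, and the toy state types are hypothetical, not the real implementation.

```haskell
type Tx        = Int
type SpecState = Int   -- the spec's state: just a running total

-- The "real" state carries implementation-only detail (a tx counter).
data RealState = RealState { total :: Int, txCount :: Int }

-- One transition of the executable spec.
specStep :: SpecState -> Tx -> SpecState
specStep s tx = s + tx

-- One transition of the real implementation.
realStep :: RealState -> Tx -> RealState
realStep (RealState t n) tx = RealState (t + tx) (n + 1)

-- The abstraction function: forget the implementation-only detail.
abstractSt :: RealState -> SpecState
abstractSt = total

-- The property: feeding the same ledger to both, starting from
-- equivalent states, must end in equivalent (abstracted) states.
agreesOn :: [Tx] -> Bool
agreesOn ledger =
  abstractSt (foldl realStep (RealState 0 0) ledger)
    == foldl specStep 0 ledger

-- In place of random generation, exhaustively check many small ledgers.
allAgree :: Bool
allAgree =
  all agreesOn [ [a, b, c] | a <- [-3..3], b <- [-3..3], c <- [-3..3] ]
```

In the real setting the generator produces whole valid and invalid ledgers, and the abstraction function projects away concurrency and caching detail rather than a mere counter, but the shape of the property is the same.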
For the Cardano wallet, which we started rewriting from scratch at the beginning of this year, following a formal-spec approach with the spec we published back in the spring, we are now following up by formalizing the spec that we did on paper in the Coq theorem prover, and proving the properties that we previously only tested. So that's going along the same path as before, but to a higher standard. What's really nice is that so far we've not found any new bugs. We did find bugs when we ran the tests, with the QuickCheck testing approach, the same technique I showed a second ago; that did discover bugs in our implementation. But so far the formal proof has not found any new ones, and that's actually a good thing: it shows how effective the testing was. The proof is, of course, a higher standard of evidence. We're going to do the same thing with the ledger rules; we haven't yet, but it's on our list. In addition, we have people working on formal approaches to the development of new versions of Ouroboros. We are doing the lightweight, do-it-quickly approach, but we also have people working on the slower, more formal approach. The way we're doing that is by re-expressing Ouroboros in terms of a process calculus, and then formalizing that process calculus in the Isabelle theorem prover. We will prove properties about it, and prove relationships between the abstract version and more concrete versions. So again, it's linking lower-level implementations with higher-level specifications. And James Chapman, in the other track earlier, was talking about the last one: formalizing Plutus Core in Agda and proving properties about Plutus Core, in particular progress, preservation and decidability.
I'm not going to go into the details there, but Agda is not exactly a theorem prover; in this case, though, we are using it as one. And this, again, gives you a very high standard of evidence, even better than testing. So before you entrust billions of dollars to these systems, you should say: show me the evidence! And then get independent expert review of that evidence. We are trying to follow through on that. And I think what's important is that people who are interested in these systems and want to use them should demand this, not just of us, but of everybody else. Hold our feet to the fire, and everyone else's, and say: show me the evidence. I want to see that your system does what you think it ought to do. And we are trying to do just that. Thanks very much.

So today we've seen a lot of interesting work from the Plutus and Marlowe teams on fleshing out the Cardano ecosystem. I would like to look towards the future. I would like to ask: what is the significance of this moment in time? In 2100, when the history books on programming are written, what will be the import of this work? I think that if we've done our jobs well, it's going to be the seed crystal of a dramatic change in how we write software. Not just software for blockchains, but software in general. Right now, the cost of an error is relatively small. You do have bugs that can cost you a lot of money; for instance, if your rocket blows up, that will cost you a lot, but that happens pretty rarely. The costs of these bugs are spread over time, and in some sense over space, because you have bugs affecting many small things rather than a few big things. Additionally, the organizations that have large, costly bugs typically have insurance to deal with them. Smart contracts are changing this. The DAO did not have insurance; I mean, who was going to insure it? It was very expensive compared to the amount of code involved.
It was dozens of lines, not hundreds of millions of lines. And so the costs are getting larger and more immediate. Today's work aims to ameliorate this problem by giving us ways to write better software: better languages, formal methods, machine verification. But the ultimate user is not academia, it's not NASA; it's smart contract writers, all smart contract writers. What that means is that many orders of magnitude more people will be using these techniques than before, and so more people will be exposed to them than ever. What are the implications of this, not for Cardano, but for programming more broadly? I think there are at least three. First, the normalization of these techniques will lead to higher software quality overall, partially through using them directly, but also because, as people who use these techniques regularly already know, they change how you think about writing software. Your thoughts themselves become more rigorous, and it changes the kinds of thoughts you can have. Second, as there are more users of these particular tools, that will generate demand for newer and better tools. Again, partially directly through demand, but also because you'll have more people exploring the frontiers of what's possible; you'll get better things just because more eyeballs are looking for them. And third, there will be more incentive to educate people. It's been said that you need a PhD to use dependent types to prove anything about your programs or your programming languages. That's no longer really true, partially because YouTube has lots of great courses that people have taught, and also because we have books like Type-Driven Development. If you go to hacker spaces around the world, people are reading Type-Driven Development and having little study groups, and it's just average people. The work today will drive even more changes like this.
You'll get more popular books on formal methods, not just on functional programming but on theorem proving and the tools associated with it. You'll also get more blog posts, more tutorials, more online courses and MOOCs on all of these things. And you'll get more web playgrounds for people to experiment with. You can go to Repl.it right now and run Haskell, which is pretty cool, but what's next for Repl.it? Will it be Agda and Coq? Who knows? This won't happen overnight, but I think it's probably inevitable. And so I would like to thank the Plutus and Marlowe teams for planting the seeds of this change and for helping to bring about this very interesting future. I would like to invite all of you to join with us to build that future. And I can't wait to see what you build with Cardano. Thank you.