So this is great. It's back to my roots. I was a student here in 1990, but I was actually born in London. I'm half British, which you won't hear in my accent any more. Brooklyn changed that. The first day I arrived at UCL, I was telling my mom, I'm going to be doing my orientation session at the University College Hospital, which is across the street. They have a big auditorium, and that's where they're sending all the students. And she said, well, just be careful because last time you were there, someone assaulted you with forceps. Turns out I was born at UCL, and then ended up going to college at UCL, and I had no idea. So this is really a fantastic honor for me, and I'm truly very happy to be here and be doing this at UCL. All right, so today's presentation is about the consensus mechanism of Bitcoin. And we're going to go into whatever technical depth is appropriate for the audience. So I need to get an idea. How many people here are programmers or have a computer science background? Very good. How many people are completely new to Bitcoin? Okay. And, oh, there's an overlap there. Very interesting. Excellent. All right, so we're going to keep it quite technical. Bitcoin is a system that is made out of a network protocol, some core cryptographic functions, and a set of game-theoretical equilibrium systems that dynamically adjust, which basically means there are some economics happening on a global scale. And these economics influence the operation of the network protocol. What we're going to talk about today is the invention that lies at the center of Bitcoin. Like many technologies, Bitcoin isn't entirely novel. In fact, the core idea in Bitcoin is to take five or six technologies that existed in the 70s, 80s, and 90s, and mash them together to create something completely new. All of the constituent components in Bitcoin existed about 15 years before Bitcoin actually showed up, but no one had combined them in this particular way. 
So what's really novel about Bitcoin is its architecture and the design characteristics that make it work. Think of it as a recipe. All of the ingredients were there, but no one had ever thought of combining them in this particular way. One of the most important components of Bitcoin is the use of cryptographic hash functions, specifically SHA-256. So how many people here are familiar with hash functions? At least understand the basics. Okay, great. I apologize in advance for my handwriting. I type fast. I write horribly. SHA-256 is an algorithm that takes an input, and this can be an input of any size, any form of data as input. Then it mixes it up, and it produces a fixed-size output, which is 256 bits long. You can just take data in here, and it will always produce a 256-bit hash, as it's called. A useful way to think of this is a fingerprint for the data. Now, when you put data into SHA-256, you have no way of predicting what's going to come out. What does come out looks random. If you do any kind of statistical test on the output of SHA-256, it will appear random. However, it's deterministic, meaning that if you put the same thing in the top, you get the same thing out the bottom, always. Which means that if you have this fingerprint, and someone gives you a piece of data, you can put the data in, verify the fingerprint, and you know that they applied SHA-256 to it. It can be used to prove possession of a certain data set. This particular technology, in 1997, Adam Back created a system for anti-spam called Hashcash. He said, well, a lot of people are sending spam, so how about we use this function to make them do some work, so that before they can post a message on this message board, they have to do a few hundred thousand SHA operations in a row. And then, at the end of that, you produce a fingerprint. Now, it's fairly easy then for someone to check that against a specific value. And if they check it, they know that the other person has done the work. 
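The properties described above, a fixed-size, deterministic, seemingly random fingerprint, are easy to check for yourself. A minimal sketch using Python's standard library:

```python
import hashlib

# Hash the same input twice: the "fingerprint" is deterministic.
data = b"Hello!"
digest1 = hashlib.sha256(data).hexdigest()
digest2 = hashlib.sha256(data).hexdigest()
print(len(digest1))         # 64 hex characters = 256 bits
print(digest1 == digest2)   # same input, same output, always

# Change a single character and the output lands in a completely
# different part of the 256-bit space.
print(hashlib.sha256(b"hello!").hexdigest() == digest1)  # False
```

Any input, of any size, produces the same fixed-length fingerprint; only an identical input reproduces it.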
As a result, what that means is that if you want to post a message on the message board, you have to do maybe half a second of computing before you post the message. Now, if you're a legitimate user of this message board, half a second of computing is nothing. If you're a spammer and you want to send a million messages, half a second of computing means you have to do half a million seconds to send a million messages, and suddenly it changes the economics. This mechanism is called proof of work. And what Satoshi Nakamoto did was to use this proof-of-work mechanism in order to establish consensus on a global decentralized network. And so this is at the core of Bitcoin. SHA-256 had existed for years. Hashcash had existed since 1997. What Satoshi Nakamoto did was combine this with a peer-to-peer network that operates very similarly to how BitTorrent operates, or many other peer-to-peer networks, to create a digital cash system. So let's say you have a 256-bit fingerprint. You put data in, something comes out. It's seemingly random, but deterministic. What are the chances the first bit is going to be zero? Anyone? Fifty percent. So half the hashes that are going to come out of this will have the first bit as zero. What are the chances the first two bits are zero? Twenty-five percent. What are the chances the first ten bits are zero? One in 1,024. So it gets exponentially more difficult as you increase the number of zeros in the front. So if I say, I want you to generate a random number using SHA-256 by putting some input into it, and the criteria I have, rather bizarre criteria if you think about it, is I want you to produce a random number, but I want that random number to have a certain pattern to it. The pattern is it has to have zero bits at the beginning of the number. In numerical terms, you can describe this as the number has to be smaller than a certain value. So effectively, take the case where the first bit has to be zero. 
What you're saying is that the number has to be less than two to the 255. If you say the first two bits have to be zero, you're saying the number has to be less than two to the 254. So what I'm saying is generate a seemingly random number that is smaller than a specific value. That value is called the target, specifically in Bitcoin we call it the difficulty target. Because the lower that value is, the harder it is to find one of these numbers. Now how do you find one of these numbers? Let's take a typical example. SHA-256 is often used to fingerprint documents. So you can create a fingerprint that allows you to verify that a document hasn't been modified. Typical example of that, let's say you signed a contract with someone. You take the PDF, you throw it through SHA-256, you get a fingerprint. Now, any PDF that produces that same fingerprint is the exact same PDF that you originally had, and you can verify that. If you want to download software from a website, and that software is extremely sensitive, security-sensitive code, you'll often see at the bottom of the website, it will say, verify that the software package you download has this fingerprint. And you can know that not a single bit in that software has changed. This is due to a characteristic of SHA-256, which is that the output changes dramatically, even if you change just one bit. So this is a cascade effect. You change one bit in the input. What you get out is not just one bit different. It is in a completely different part of the 256-bit space. So you never know where you're going to land. So let's take a simple example. I take this phrase, hello, exclamation mark. Anybody who has a laptop can try sha256sum or shasum. Any Mac, any UNIX-based system, any Windows-based system has that function in it. Type shasum and it's going to give you a hash. You tell it to use the 256-bit algorithm. You type in capital H, E-L-L-O, exclamation mark. 
It's going to give you a specific fingerprint. That fingerprint is specific to this phrase. You change the exclamation mark, add a space, change the first letter to a lowercase h. A completely different fingerprint. But as long as you enter the exact same phrase, the result will be a specific fingerprint. Anybody did that on their laptop? Okay, what's the result? Just give me the first few digits. Anyone in the world can take the phrase, hello, exclamation mark, put it into SHA-256, and they will get A8D19153. How do I change that? If I wanted to produce different fingerprints from this, I could introduce some change to the input. How about adding a number? I can take the same string and instead add a number to it. Hello, zero. Please help me out here if you don't mind. 0E0978C. It's 64 characters long. So, completely different output. And I can keep going. Hello, one. No need to do it, but completely different output. Each time I do this, I'm going to get a different result. Now, you remember before I said, I want my target to be... Let's say the target is that the first four bits are zero. Well, this one fulfills it. The first hexadecimal digit is zero, which means that this particular input produces a value less than the target. That was relatively easy to do. Once again, the probability of finding a hash where the first hexadecimal digit is zero is not that low. I can just run a few iterations, and I'm going to get at least one value less than the target. If you gave me the string, hello, and you said, find me a specific random number or any number that you add to hello, and when you add it, it produces a hash that starts with zeros. How do I find that number? I have to brute-force it. I have to try every possible number. So I would start, hello, zero, hello, one, two, three, four, five, six, seven, eight, and just keep going in a loop as fast as I can in order to produce hashes. At some point, a hash would pop out that would meet my difficulty target. 
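That brute-force loop is only a few lines of code. This is a sketch, not real mining, but it is exactly the search just described: append an incrementing number to the string and hash until the result falls below the target.

```python
import hashlib

def find_nonce(prefix: str, zero_bits: int) -> int:
    """Brute-force a number ("nonce") so that
    SHA-256(prefix + nonce) starts with `zero_bits` zero bits."""
    target = 2 ** (256 - zero_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{prefix}{nonce}".encode()).hexdigest()
        if int(digest, 16) < target:
            return nonce  # found a hash below the difficulty target
        nonce += 1

# ~65,536 attempts on average for 16 leading zero bits:
nonce = find_nonce("Hello!", 16)
print(nonce, hashlib.sha256(f"Hello!{nonce}".encode()).hexdigest())
```

There is no smarter way in: the only lever you have is trying nonces faster.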
As soon as I have that hash, I can show it to you. I can do two things. I can show you the hash and the number. I can say, for example, hello, 39,137 produces a hash that has two zeros in the beginning. I'm just guessing. If I do that, can you verify it? Well, of course, you could take, hello, 39,137, plug it into the hash function, come out with a fingerprint, and if the first few digits are zero, you're like, oh, that is correct. What have I just proved? What I've proved is that I did several tens of thousands of SHA operations to find that. There is no way I can produce that result without doing the work. There is no way I can produce that result without searching through the space of fingerprints in a brute force manner. And based on the difficulty, one bit zero, 50%, two bits zero, 25%, three bits zero, 12%, four bits zero, 6%, etc., you can estimate statistically, simply from the pattern, approximately how many operations I need to do on average before I find it. Who has access to a blockchain right now and can look up the latest block? Anyone? Look up the block hash of the latest block. Can you read me out the beginning? This is the fingerprint of the current block in the blockchain. In order to produce this fingerprint, a miner used the header of the block, which is a known set of data. It represents all of the transactions that are in the block, the date and time, and a few other datasets. It's a standard data structure. Think of the header as the hello part. That's the useful payload, if you want, that we're trying to validate through proof of work. In addition to the header, they introduced a random number. This is called the nonce. If you take one look at this, what that means is that when you take that block's header, and the specific nonce that was found by this miner, it produces this fingerprint. These are hexadecimal digits, 17 times 4 bits. So 68 bits at the beginning of this hash, the first 68 bits are zero. 
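The asymmetry is the whole point: verifying costs one hash, while producing the proof cost an exponential number of attempts. A small sketch of that, with a hypothetical `verify` helper:

```python
import hashlib

def verify(prefix: str, nonce: int, zero_bits: int) -> bool:
    # A single hash operation, versus ~2**zero_bits hashes on
    # average to find the nonce in the first place.
    digest = hashlib.sha256(f"{prefix}{nonce}".encode()).hexdigest()
    return int(digest, 16) < 2 ** (256 - zero_bits)

# The expected number of attempts doubles with each extra zero bit,
# following the 50%, 25%, 12.5%, 6.25% pattern from the talk:
for k in (1, 2, 3, 4):
    print(k, "zero bits -> 1 in", 2 ** k, "hashes on average")
```

That statistical pattern is what lets anyone, from the fingerprint alone, estimate how much work went into finding it.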
What is the probability that the first 68 bits are zero? You can do the math. But I can estimate approximately how many hashes that took. At the moment, the Bitcoin network is producing 500 petahashes per second. That is 500 quadrillion hashes per second, and it still takes, on average, ten minutes to find a block. 500 petahashes per second times ten minutes, on average, is how many hash operations the miner needed to do. If I see this hash, I know there is no way to shortcut this function; the only way a miner was able to find that very special nonce that produced this result is by performing quadrillions upon quadrillions of hash operations. In order to perform these quadrillions upon quadrillions of hash operations, they had to use a very important, precious resource, energy. So laws of thermodynamics tell me that in order to run the SHA algorithm, you have to flip bits. You have to expend energy. Expending energy dissipates heat, and therefore I know two things about this miner. One, they are running some serious hardware, because they are able, within a period of less than ten minutes, to do quadrillions of hashes. This hardware didn't exist three years ago. This hardware, I can tell you from the specs we see, is probably a 20-nanometer fabrication chip that is designed to do just SHA-256 and nothing else. The way it works is by having hundreds of thousands of parallel SHA-256 engines packed as densely as possible. These chips consume enormous power. You can heat your house with them. You can toast bread with them. If you don't dissipate the heat, they will melt. These chips consume electricity. In fact, you can estimate approximately how much electricity is consumed in terms of watts per gigahash. How many watts of electricity do you consume to do a billion calculations of hashes? That metric is the efficiency of this mining equipment. 
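You can actually do that math, and it checks out. Using the figures quoted in the talk:

```python
# Network hash rate times average block time gives the expected
# number of hashes behind each block (figures quoted in the talk).
hashrate = 500e15        # 500 petahashes per second
block_time = 10 * 60     # ten minutes, in seconds
expected_hashes = hashrate * block_time
print(f"{expected_hashes:.1e} hashes per block")

# Consistent with 68 leading zero bits, which take ~2**68 hashes
# on average to find:
print(f"{2 ** 68:.1e}")
```

The two numbers come out in the same ballpark, around 3 x 10^20, which is why 68 leading zeros on a block hash is, by itself, credible evidence of the network's hash rate.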
Just by looking at this number, I know that the miner has expended enormous amounts of effort, which translates into enormous amounts of electricity and heat, which means that they had to pay someone to give them electricity, which means they incurred a financial cost. This is really important. You will hear people say that Bitcoin wastes electricity. Bitcoin does not waste electricity. Bitcoin uses electricity to underpin the security function, because it creates an economic system whereby, in order to participate, you have to incur cost. The only reason you would incur that cost is for the possibility of reward. The possibility of reward is determined by whether your block meets the consensus rules. Which means you spend money, and if you play fair by the rules, you get money back. If you spend money and try to cheat, you don't get money back, which means you lose money. Therefore, it doesn't pay to cheat. That simple game theoretical equilibrium is the core of the Bitcoin consensus algorithm. Every ten minutes, a miner has to take a whole bunch of transactions, maybe a thousand transactions that are currently outstanding, unconfirmed, ready to be included in the blockchain. They have to then perform consensus rule validation, meaning they have to look at each one of these transactions, and say, does it meet the rules? The rules are the same for everyone. There is a list of maybe thirty or forty rules that every transaction has to comply with, and there is a list of thirty or forty rules that every block has to comply with. They must construct a block with transactions that match all of these rules. Once they construct that block, they are then going to try and find a nonce that makes its header fingerprint look like this. They are going to spend a lot of electricity trying to find that nonce. 
If they then find that nonce and send it to the rest of the network, and the rest of the network says, sorry, doesn't meet the consensus rules, they just wasted all that electricity. They have to make sure that that block meets the rules, which means that every transaction in it meets the rules. Every transaction is properly signed, has never been spent before, is properly structured. The amount in the outputs doesn't exceed the amount in the inputs, the outputs are properly formatted. It doesn't spend a mining reward in less than a hundred blocks. There is a whole list of rules. That way, you align the interest of the miner, who is wasting electricity, or using electricity, with validating the consensus rules. In every block, the miner gets to write themselves a check. When a miner is constructing a block, the first transaction they put in that block is a transaction that pays them reward. It pays them 25 bitcoins. That is a unique transaction in bitcoin, because most transactions in bitcoin have inputs and outputs. The coinbase transaction, as it is known, has no inputs. That means it is writing a check from nowhere to myself for 25 bitcoins. Every miner will put a check in the beginning of the block that says, Pay me out of thin air 25 bitcoin. That 25 bitcoin is new, never existed before, and is created. Here is a simple question. Why 25? Why doesn't the miner write themselves a check for 26 bitcoin or 30 bitcoin? None of the other miners would agree to trust that. Imagine a scenario where the miner gets greedy. Instead of writing a check to themselves for 25 bitcoin, they write a check for 26 bitcoin. They construct the candidate block, fill it with transactions, and put the first transaction to pay themselves 26 bitcoin. Now they have the header, and they have to search for this number, the nonce. They do all of this work and produce a hash. They find the proof of work. They then disseminate that block to all of the other miners. What do the other miners do? 
They validate the block against the consensus rules. One of the consensus rules is, you can only pay yourself a reward at the correct rate based on what block you are at. If it is 2009 to 2012, the first 210,000 blocks, then it is 50. If it is from then until around August 2016, then it is 25. At a very specific moment on a specific block, that equation is going to produce a result of 12.5 bitcoin. The reward halves every 210,000 blocks. Shortly after August 2016, after 420,000 blocks have been mined, on the very next block, everyone will expect to see a reward of 12.5. If you write yourself a check for 25 at that point, your block will be rejected. Who sets the validation rules? This is a really good question. The validation rules are defined by the software, the reference implementation. If you want to know what the current validation rules are, you read the software code for the Bitcoin Core reference implementation. It is a C++ program that contains functions that do things like block valid, transaction valid, and these functions evaluate things based on a simple set of rules. You might ask, where are these rules documented? The answer is nowhere. The rules are whatever the core implementation says the rules are. The rules are only documented in C++. In fact, this is one of the tricky parts of consensus. The rules are whatever the core implementation does, including the bugs, including every bug ever found since 2009. If you want to write a competing implementation, you have to simulate every bug that was found in the code since 2009, and process the blocks in exactly the same way, because you have to reprocess them from the beginning until today. That means you have to fail where Bitcoin Core failed, and succeed where Bitcoin Core succeeded. Bugs and all. Whose definition of C++? Yes, so with that software code comes a long list of very specific dependencies. 
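The reward schedule described above fits in a few lines. This is a sketch for clarity (the real implementation works in integer satoshis with bit shifts, not floats):

```python
def block_subsidy(height: int) -> float:
    """Reward a miner may claim in the coinbase transaction:
    50 bitcoin at launch, halving every 210,000 blocks."""
    return 50.0 / (2 ** (height // 210_000))

print(block_subsidy(0))        # 50.0  (the 2009-2012 era)
print(block_subsidy(210_000))  # 25.0  (the era described in the talk)
print(block_subsidy(420_000))  # 12.5  (expected around mid-2016)
```

Every validating node computes this same number from the block height, which is why a 26-bitcoin coinbase check gets the whole block rejected.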
Dependencies on underlying database code, C++ versions, Boost, and various other libraries that are used. Until recently, OpenSSL libraries. Now they've changed that a bit. There is a whole edifice of dependencies that constitute the reference code. Yes, right, so let's talk about that. First of all, let's explain how the decision is made. I've said until now that when miners receive a new block, they validate it by the rules. Then they decide if they're going to accept it. How do they decide? How do they vote? How is consensus actually effected on the network? Consensus is effected on the network based on a decentralized mechanism of voting by means of the longest (greatest cumulative difficulty) chain. Let me explain what I mean by that. The blockchain is a sequence of blocks. Each block within it contains the hash of the previous block. Each block's hash is computed over its header, and the header contains the hash of the previous block. So each block contains something that, if it changed, would change its own hash, then the next block's hash, and that would ripple all the way through the blockchain. This all links back to the first block. In January 2009, Satoshi Nakamoto mined the first block. They did not use the Bitcoin consensus algorithm to mine this first block. They kind of jury-rigged it so that it would come out properly, because there was no consensus mechanism at the time. The second block was mined based on consensus. The first block is actually embedded in every version of Bitcoin. Every piece of software that implements the Bitcoin consensus algorithm has within it, statically defined, the Genesis block, including the Genesis block's hash, its fingerprint. 
If you are looking at the Bitcoin blockchain and you have the Genesis block and you have its fingerprint, and someone gives you a block and says, this is block number two, you can verify it because block number two contains a reference to the hash of the previous block, which is the Genesis block, and you have that as a constant in your code. Then you can verify that the second block was mined properly, because its hash should contain a number of zeros in front, which means that the proof of work was done. Now you validate the second block. If someone gives you the third block, you would then do the same process. You would validate all the transactions within it. You would validate that it links to the second block, and you would validate the third block. If you do this process, after three or four days you are going to get to block 366,000. Then the network will tell you that you already have the latest block. You have essentially rebuilt the entire chain from Genesis to today. You can independently verify that every block you have is correctly linked to every previous block, all the way back to the Genesis block, which means you have the authentic blockchain. This property is really important. This property means that in a distributed system, nodes can join and leave at will. One of the important properties of Bitcoin is what happens when you turn your back. What happens when you leave the network, stop paying attention, and then you come back? You come back and the network tells you you are a thousand blocks behind. But you can then retrieve each one of these blocks, rebuild them on top of what you knew to be the truth before, and arrive at the same truth as everybody else in a way that is completely irrefutable, that requires no appeal to authority, that can be verified independently by your node. The blocks essentially get built in this chain that references everything down to the past. So far, so good? Let's say I am at block 366,001. 
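The walk-back-to-Genesis check can be sketched in a few lines. These headers are hypothetical, simplified dictionaries (real Bitcoin headers are an 80-byte binary structure), but the linkage check is the same idea:

```python
import hashlib

def header_hash(header: dict) -> str:
    # Each block's fingerprint covers its parent's fingerprint.
    serialized = f"{header['prev_hash']}|{header['payload']}|{header['nonce']}"
    return hashlib.sha256(serialized.encode()).hexdigest()

def valid_chain(genesis_hash: str, headers: list) -> bool:
    expected_prev = genesis_hash
    for header in headers:
        if header["prev_hash"] != expected_prev:
            return False  # broken link: not the authentic chain
        expected_prev = header_hash(header)
    return True

GENESIS = "00" * 32  # statically defined, like the real Genesis hash
block1 = {"prev_hash": GENESIS, "payload": "tx-set-1", "nonce": 7}
block2 = {"prev_hash": header_hash(block1), "payload": "tx-set-2", "nonce": 9}
print(valid_chain(GENESIS, [block1, block2]))  # True

# Tamper with block1: block2's parent reference no longer matches.
block1["payload"] = "tampered"
print(valid_chain(GENESIS, [block1, block2]))  # False
```

Changing anything in an old block invalidates every link above it, which is exactly the ripple effect described above.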
I validated that up to this block, I can link everything back to the Genesis block. As far as I know, this is the latest block on the network. I am connected to the network, and somebody sends me a message that says, I have block 366,002, so I can add that block to the chain and validate it. What I can also do is calculate how much proof-of-work was in this block, based on how many zeros were in the front of the header. I can add up all of the proof-of-work so far in this chain, and I can estimate the total amount of difficulty represented by all of the proof-of-work that has gone into all 366,002 blocks. That gives me the weight of that chain, which means I can estimate the total cumulative difficulty that is represented by that. Here is why I need that. Imagine this is a network that is distributed across the world, as it is, of course. Imagine that miners are working on this problem in different places in the world. The miners all start with the same data, they include more or less the same transactions, and then they race to find proof-of-work for a block. Let's say that a miner in Canada finds a block, and 200 milliseconds later, a miner in Australia finds proof-of-work for a block. The Canadian miner started construction of candidate block 366,003. At the same time, the Australian miner started construction of candidate block 366,003. When did the miners start constructing these blocks? As soon as they received the previous block. Think of this as a race. You are trying to find blocks, but if someone tells you there is a new block, that means you just lost the race. Upon receiving a block from someone else, you validate it as quickly as possible, and then you know the race has started again. You immediately stop doing what you were doing before, construct a new block on top of it, referencing the hash of the previous block, and start trying to find proof-of-work as fast as possible. This is a race. Milliseconds matter. 
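The "weight" of a chain can be sketched directly from that idea. Assuming, as a simplification, that each block's work is about 2^k hashes for k leading zero bits:

```python
# Sketch: a chain's "weight" is the cumulative work of its blocks.
# With a target of 2**(256 - k), finding a block takes about
# 2**k hashes on average, so each block contributes ~2**k work.
def chain_work(zero_bits_per_block: list) -> int:
    return sum(2 ** k for k in zero_bits_per_block)

chain_a = [68, 68, 68]      # three blocks at 68 leading zero bits
chain_b = [60, 60, 60, 60]  # four easier blocks
# The shorter chain can win: "longest" really means heaviest.
print(chain_work(chain_a) > chain_work(chain_b))  # True
```

This is why the talk says "longest difficulty chain" rather than simply counting blocks: one block at 68 zero bits outweighs many blocks at 60.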
The moment this block was broadcast across the network, the Canadian miner and the Australian miner both start their engines and start hashing as fast as they can. They have constructed a candidate block, stuffed all the transactions in, calculated the header, and started throwing random numbers next to it in order to feed that hash function and produce proof-of-work. Eight and a half minutes later, the Canadian miner finds a nonce that gives them proof-of-work. Their header is going to be slightly different than the Australian miner's. The timestamp is going to be different. They might be off by a couple of seconds. The clocks are not perfectly synchronized. They may have received transactions in a different order. Maybe they saw the Australian transactions first, and then the Canadian transactions arrived a few milliseconds later. The order in which they added them to the block was slightly different. All of these things completely changed the fingerprint of the header, and completely changed the outcome of the proof-of-work. Effectively, they are working on blocks that are slightly different. Maybe just a few bits different. Maybe a few kilobytes different. They are both racing. At eight and a half minutes, this miner finds a block. At eight and a half minutes and a hundred milliseconds, this miner finds a block, but on opposite sides of the world. What do they do as soon as they find a block? They tell everyone about it. Now, the race is to make their block the one that the world knows about. They have won the race, but they need to make sure the world knows they won the race. That is what matters. They start propagating. They start sending that block on all of the nodes that are connected to them as fast as they can. Imagine now you have these nodes, and you have these two blocks pop, pop, and they start propagating. Every node receives the block, says, OK, a new one has been found, sends it to all of their peers after validating it. 
They each receive it, validate it, and send it to all of their peers, and propagation starts. A ripple, like dropping two stones on opposite sides of a lake, creating these ripples across the network. Effectively, half the network is being painted green, while half the network is being painted red. All of the nodes that are closest to Canada see the green block. They assume this is now the latest block. All of the nodes that are closest to Australia see the red block. They assume this is the latest block. We have a race condition. In distributed system terms, we have a race condition. This race condition now has to be resolved. In Bitcoin terms, this is called a fork. Both of these blocks are valid. They meet all of the consensus rules. They have sufficient proof of work below the target. They are fully validated by everyone. But only one can survive. At the moment, there are now two versions of the Bitcoin blockchain across the world. Two competing versions of history. Only one can survive. At this moment, you have this fork. What happens next? There is no voting mechanism in Bitcoin. You don't get nodes going, hey, I think it's green, or I think it's red. Voting happens through the application of mining power. Essentially, what happens now is, what does a miner do when they receive a block? They start working on the next one as fast as they can. When you receive a block and you start building a child of that block, what you are effectively doing is voting for that block. Your vote is your hashing power. You put your computing behind that block. You say, this is the longest chain as far as I know. I am going to build on this. At this point, you have all of the nodes that receive this one building on top. All of the nodes that receive this one building on top. The race is on, but now there are two race tracks, a fork. What happens next? Someone finds a block. 
Seven and a half minutes later, someone who had this block as their parent, by complete coincidence, doesn't matter, finds a new block. They immediately broadcast it to the entire world. Everybody who had the Australian block before sees the new block and says, okay, new block, I have its parent, connects it, validates it, and immediately starts building on top of it. What about these people? Everybody who is on this side and suddenly sees this block, looks at that block, starts validating, looks at the parent hash and says, oops, this is a child of red. I thought green was winning. I was wrong. The fact that a child of red won means that green was not the longest chain, meaning that every miner who thought green was the longest chain now has a new picture of reality. They now know that the rest of the network thinks red is the longest chain. They are on the wrong side of the fork. So what do they do? They orphan this block. They basically say, that was a momentary mistake. Our history deviated into a parallel universe for ten minutes where we were wrong. But now we know better. What happens to all of the transactions that were in that block? Well, if you think about it, most of the transactions that were in this block are also in this block. There's probably very little difference between them. So the first thing that happens is all of these miners go, could somebody tell me what the red block looks like? Because I've never heard of it. So they go out on the network and they ask for the specific parent fingerprint. They say, please give me 5D165. I've never seen this one. And someone on their network says, here. So the red block propagates fully. Now, most likely they've already received it. So this is a bit of a cheat, right? Which means that even if you got this one first, you also got this one eventually. Maybe a few seconds later. And to avoid duplicating communication, what you do is you put this one on the shelf. 
You say, I think we're green, but this red one also is a candidate. So I'm going to put it on the shelf there, just in case I find myself on the wrong side of history. And then when you find out you're on the wrong side of history, you go back to your shelf and you say, turns out it was red. You take green, you pull out all of the transactions that are in it, and you put them back in the queue. Then you take red, you start checking off the list, all of the transactions that are already in red, and see if there's anything left, which is the difference between the two. And you keep those in the queue. You then check off all of the transactions that are in orange, the new block, and then you start building on top of orange. So this side of the network now converges onto orange, and they also join the race on top of orange. Okay, let me just repeat that. Race condition, eventual convergence, triggered by the discovery of the next block, forcing a longer chain validation. This is now the longest chain. Therefore, this wins. Therefore, this is not on the longest chain. Therefore, this loses. The entire network then converges on the longest chain, and then they vote again by building on top of this a new child and continue going. Yes, questions? Well, the difference in proof of work will be very, very minor. But it's still a way of determining which one. There are other protocols for consensus. The problem with that is that it can cause various weird effects in the network. So this model of eventual convergence after ten minutes works. There is another protocol called GHOST, G-H-O-S-T. It does some interesting things that allow, for example, the proof-of-work of orphaned children to still be counted, because you still did the work, and to get maybe a partial reward for that. That's not Bitcoin. Bitcoin works with this very simplistic algorithm. There are many other competing consensus mechanisms. Yes? Well, so here's the question. Each one of these has a check that pays 25 Bitcoins to someone. 
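The reorganization step just described is essentially a set difference. A sketch with hypothetical transaction ids:

```python
# When a miner switches from the orphaned (green) block to the
# winning (red) side, only the transactions unique to the orphan
# go back into the queue of unconfirmed transactions.
green_txs = {"tx1", "tx2", "tx3", "tx4"}  # orphaned block
red_txs = {"tx1", "tx2", "tx3"}           # winning block, mostly overlapping

back_to_queue = green_txs - red_txs
print(back_to_queue)  # {'tx4'}
```

Because the two competing blocks drew from nearly the same pool of unconfirmed transactions, the difference is usually tiny, and almost nothing is actually lost in a reorganization.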
The checks are only valid if they are on the longest blockchain. So this check is on the longest blockchain; the entire network knows that the other check is no longer on the longest blockchain. Essentially, that transaction never happened. It disappears from history. The truth is the longest chain; the other chain never happened. Now, here's where a critical consideration comes in. When you earn a reward check for mining a block, you can't spend it for 100 blocks. Why? Because for one block, history is fickle. After 100 blocks, if your transaction is still in the chain, it is history. Think about the Bitcoin blockchain as geological strata. You are drilling a core sample in the ice in Antarctica. The top 10 centimeters are slush: they come, they go, they melt, the wind blows stuff around; you can't tell anything. You go three meters down, you are looking at 100 years of history, and that layer hasn't moved in 100 years. You go 300 meters down, and you are looking at the Cretaceous era; that millimeter-thin layer hasn't moved in millions of years. The way that happens is that layers get deposited on top and compacted, and the deeper they are, the harder they are to change. Bitcoin's consensus algorithm creates a geological history. At the top, the wind is blowing and things are very fickle. You go 144 blocks down, which is one day old. The probability of a block changing after 144 confirmations is vanishingly small. But the algorithm allows it. You can change block two. All you have to do is produce a competing longest chain that has more cumulative proof-of-work than the 366,000 blocks up until today. In ten minutes. Because you have to do that before the other chain gets one block longer. This is how the cumulative work... They discard them pretty quickly. You can always retrieve the alternate block. If a node is publishing the longest chain, it also has the history behind it, so you can always retrieve that.
That is an implementation detail, but it is a good question. This is a very important consideration: who is bandwidth-rich? There are two important considerations in mining: how cheaply can you buy electricity, and how low-latency a network connection can you achieve? If you have cheap electricity, you mine as close to it as possible. Keep in mind that the efficiency of mining equipment is bounded on the upper side by Moore's law. We are already seeing 14-nanometer chip fabrication. Bitcoin mining is approaching the edge of Moore's law faster than desktop computing. Why? Because there is a $3 billion economic incentive behind it, which doesn't exist in desktop computing anymore. Bitcoin mining is now driving the development of silicon fabrication, which is shocking. What that means is that you cannot squeeze more efficiency out of the silicon itself. Now it becomes a matter of how densely you can pack that silicon in a chip, how densely you can pack the chips on a board, how densely you can pack the boards in a rack, how quickly you can suck heat out of that, and how quickly you can push kilowatts or megawatts of energy into that rack. It is a data-center game. Then your economic efficiency depends on the cost of your inputs, namely electricity, the cost of your operations and your ability to maintain the hardware, and your ability to propagate blocks faster on the network. Latency is enormously important, which means that, at the moment, most mining has migrated to China. The reason is that subsidized coal-fired electrical power in China is, I think the ironic term would be, dirt cheap. As a result, economic concentration happens there. However, China has bandwidth issues, and latency is a big problem. As the block size increases, it puts the Chinese miners at a disadvantage. If you have a one-megabyte block and you are trying to propagate it to eight nodes, it takes a certain amount of time.
If you take that one-megabyte block and increase the size limit to eight megabytes, it takes you eight times as long. And while you are propagating a block, someone else is beating you to it. How often does a fork happen? Approximately once a day, on average. A two-block fork, maybe once a week, maybe once a month. If you start seeing a three-block fork, something really weird is happening on the Bitcoin network. Why? Imagine the two competing sides created two blocks. Then the two competing sides of the network started building on top. Again, by coincidence, two blocks were discovered sufficiently far apart on the network to propagate to equal parts of the network, and sufficiently close together in time that neither could overwhelm the other. Then everybody starts building on top. Again, by coincidence, two blocks were found simultaneously at opposing sides of the network. The probability of that happening once is small; twice, rare; three times, exceedingly rare; and so on. It is an exponential function. That is the basis of the consensus algorithm. Alright. How often do forks that are longer than this happen? In March of 2013, Bitcoin experienced a two-block fork. Eight minutes later, Bitcoin experienced a three-block fork. Ten minutes after that, Bitcoin experienced a four-block fork. At this point, people started getting worried; statistically, we were now in a three- to four-sigma event. Then Bitcoin experienced a five-block fork, followed by a six-block fork, followed by a seven-block fork. That is when the emergency alert message went out. In March 2013, Bitcoin had an "oh, shit" moment, because this isn't supposed to happen. So what had happened? This is a really interesting case study in the mechanics of the network. That day, or actually about a week before, a new version of Bitcoin had been released.
This new version of Bitcoin used a new database for storage of the blocks. Instead of Berkeley DB, it used Google's LevelDB, to improve efficiency, indexing, and various other characteristics. Databases are not part of the consensus mechanism. Except they are. There is a difference between the implied or defined consensus rules in the reference implementation and the runtime consensus that is exhibited by the network dynamically, as a behavioral artifact. Bugs affect the exhibited runtime consensus, and all that matters is runtime consensus. So what happened? Berkeley DB had a bug: it was limited to 1,024 file descriptors. That bug could not be triggered while everyone was running Berkeley DB, because no one could create a block with more than about 1,000 transactions; if they did, it would crash their own machine and they would never propagate it. When version 0.8 came out and machines were now mining on LevelDB, one of those machines was able to create a block with 1,200 transactions. That block could not be consumed by any node running Berkeley DB. They would start processing the transactions to validate them, opening file descriptors as they went. They would process the first 1,024 transactions, then attempt to validate transaction 1,025, choke on it, crash, and restart. They would restart, join the network, ask for the latest block, receive the exact same block, start validating, choke on transaction 1,025, crash, reboot, ask for the block, validate, choke, crash, reboot. The problem was that half the network had updated to LevelDB and half the network was on Berkeley DB. The network suffered a complete bifurcation, almost perfectly 50-50 balanced, and one side could not move forward. They couldn't move to the next block, because every time they got on the network, they would try to validate the same block. So one side mines another block, and this side chokes. And another block, and this side chokes. And another block.
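The accidental rule disagreement behind that fork can be caricatured in a few lines of Python. This is a sketch, not real node logic: the validator functions are invented for illustration, while the 1,024 limit and the 1,200-transaction block are the figures from the talk.

```python
# Two validator versions disagree because of a resource limit,
# not because of a deliberate change to the consensus rules.

OLD_NODE_LIMIT = 1024  # the Berkeley DB resource limit described in the talk

def old_node_accepts(block_tx_count: int) -> bool:
    """A pre-0.8 node chokes on any block needing more than its limit."""
    return block_tx_count <= OLD_NODE_LIMIT

def new_node_accepts(block_tx_count: int) -> bool:
    """A 0.8 node on LevelDB has no such limit (simplified)."""
    return True

big_block = 1200  # transactions in the block that triggered the fork

# Half the network accepts the block, half rejects it: a bifurcation.
fork = new_node_accepts(big_block) and not old_node_accepts(big_block)
```

The point of the sketch is that the two implementations agreed on every block anyone had ever produced, so the divergence was invisible until a block crossed the hidden limit.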
The March 2013 event resulted in a 26-block fork, which should not happen during the life of the universe. This was a perfect example of runtime consensus being different from the explicit consensus rules, and it was caused by a bug. It led to an emergency overnight summit, let's call it optimistically a summit: that's when everyone is screaming on IRC and running around as if their hair is on fire. This happened around midnight in the US, and by morning the mining operators had all gotten on board. They did some emergency upgrades, downgrades, patches, etc. By running code that anticipated this mistake, they were able, after 26 blocks, to shift all of the hashing power back to the first fork and create a 27-block chain that invalidated the 26-block chain that had gotten in front. That wiped out all of those transactions, which were reloaded on the other side, and the network continued. What's really fascinating about this example is that if you made a transaction on the network, it got delayed by 26 blocks, but it got processed eventually. Not a single transaction was dropped. There were a couple of double-spends: people spent transactions on both sides of the fork, and they did this as a proof of concept; then they refunded the merchants for their lost money. This was one of the events that gives us an opportunity to study the consensus mechanism. On July 4th, 2015, five days ago, we had a six-block fork event. It was caused by the introduction of a new standard for processing consensus rules, called BIP-66, Bitcoin Improvement Proposal 66. BIP-66 had been in planning for more than a year. It requires that, after a certain point, all transactions have Elliptic Curve Digital Signature Algorithm (ECDSA) signatures encoded in a very strict way. Previously, anything OpenSSL accepted as a valid signature was a valid signature.
The problem with that is that OpenSSL has bugs, and this leads to another phenomenon called transaction malleability. In order to fix that, a proposal was introduced about a year ago to require that all signatures be strictly encoded in a specific way. It narrowed the range of possible signatures, tightening the consensus rules, which means that some transactions that were previously valid would now be invalid. This is a soft-fork event, meaning it is backwards-compatible. The proposal was put to a vote, using the blockchain mechanism itself to do an automatic upgrade. Every miner puts a version number in their blocks, saying: this is a version-two block. If a miner supported BIP-66's strict encoding of signatures, they would signal this to the network by making all of their blocks version-three blocks. Then they count. They look at the blocks coming in and ask: of the last 1,000 blocks we have seen on the network, approximately ten days of blocks, what percentage are version three? Once that percentage goes above 75%, BIP-66 is in effect, meaning that everybody should create transactions with strict signature encoding, because the network is about to make a major transition. This is the grace period: everybody should switch over to BIP-66 at that point, because it has been voted in by the network. Once 95% of the previous 1,000 blocks, that is, 950 blocks, are version three, the transition completes. Not only is BIP-66 encoding required, but non-strictly-encoded transactions under the old consensus rules are now considered invalid. Every node should not only produce correctly signed transactions; it should reject incorrectly signed transactions after that threshold. That threshold was reached on the morning of July 4th. In the afternoon of July 4th, we discovered something interesting: miners who received blocks were cheating.
What they were doing was building the next block without validating all of the transactions in the previous one. Why? Because it takes time to validate transactions, and this is a race. Because they weren't validating transactions before proceeding to mine the next block, when someone in the minority, the 5% who had not yet upgraded to BIP-66, created an invalid transaction and mined it into a block, that block was accepted: 75% of the miners started mining the next block without validating it. Then the rest of the network validated it, found the invalid transaction, and rejected that block. So part of the network progressed one block, part of the network kept rejecting that block, and a fork emerged. It lasted six blocks. The miners who were taking the shortcut lost $50,000 in Bitcoin rewards over those blocks, plus the cumulative power they consumed. They were building blocks, doing the proof-of-work, and those blocks were rejected by consensus because they did not conform to the new BIP-66 standard. Because they were taking a shortcut, because they were not validating to consensus rules, they found themselves on the wrong side of consensus. In Bitcoin, you can have opinions: you can believe in big blocks, you can vote for small blocks, you can vote for BIP-66, you can vote against BIP-66. The one thing you can't do is go against consensus. If you go against consensus, you're burning electricity for nothing, and you will pay for that. That is the basic game theory of consensus. On July 5th, we had a three-block fork. On July 6th, we had a four-block fork. On July 7th, we had another three-block fork. BIP-66 is still causing instability in the Bitcoin network, but over the next couple of weeks we expect things will get upgraded, we will see stability, and the network will converge on strict encoding of signatures. In the meantime, what's interesting about this scenario is that while all this is happening, the network is self-healing.
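The earlier claim, that naturally occurring forks become exponentially rarer with depth, which is why these multi-block forks were alarming, can be put in a toy model. The per-block tie probability below is an assumed round figure (roughly one single-block fork per day, or one per 144 blocks), not a measured value.

```python
# Toy model: if a simultaneous-block "tie" happens with probability
# p_tie at any given height, a natural fork of depth k requires k
# independent ties in a row, so its probability falls off as p_tie**k.

p_tie = 1 / 144  # assumed: about one 1-block fork per 144 blocks (one per day)

def natural_fork_probability(depth: int) -> float:
    """Probability of a naturally occurring fork of the given depth."""
    return p_tie ** depth
```

Under this model a one-block fork is an everyday event, a two-block fork is a roughly weekly-to-monthly event, and anything deeper is so unlikely that its appearance signals a bug or an attack rather than coincidence.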
So when these forks happen, they get resolved by eventual convergence, and as soon as they are resolved, all of the transactions get properly re-sequenced and the network continues. This is an extremely resilient mechanism for a massively decentralized system. Any questions? No? Theoretically, you could. The protocol allows it. You would have to not only sustain 51% of the hash rate; you would then have to do 366,000 blocks' worth of proof-of-work in ten minutes. More realistically, you can change one or two blocks in the most recent history. Maybe three. No, because every node can fully validate from the genesis block up, no matter what you present to it, based on the rules that were in existence at each point in time. If you present a completely valid alternate history with sufficient proof-of-work to a Bitcoin node, it will validate it all the way from the genesis block to the present. In fact, every node, when it starts, only knows the genesis block. The first thing it has to do is synchronize with the network, and it does that independently, painstakingly verifying everything from the genesis block to today; it reconstructs the entire chain independently. Every node does this. It takes four to five days. The blockchain is about 40 gigabytes at the moment, I think, depending on whether you're indexing all of the transactions or not, and it's growing fast. The actual synchronization takes quite a bit of bandwidth and quite a bit of time. Questions? Yes? Do they use the same mechanics as Bitcoin? Not all of them. The vast majority of altcoins operate using what we would call Nakamoto consensus: longest-chain proof-of-work, usually determined by the SHA-256 algorithm. Some use a SHA-3 algorithm, or the scrypt algorithm, or other forms of hashing, but they still implement Nakamoto consensus in terms of longest-chain proof-of-work.
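The longest-chain rule of Nakamoto consensus just described can be sketched in a few lines. The representation is invented for illustration: each chain is simply a list of per-block work values, whereas real nodes compare cumulative difficulty over validated headers.

```python
# Minimal sketch of Nakamoto consensus chain selection: among known
# chain tips, follow the one with the most cumulative proof-of-work.
# "Longest" really means "heaviest," i.e. most total work.

def chain_work(chain):
    """Total work of a chain; here, just the sum of per-block work."""
    return sum(chain)

def select_best_chain(chains):
    """Pick the candidate chain with the greatest cumulative work."""
    return max(chains, key=chain_work)

green = [10, 10, 10]      # the tip a local miner had been building on
red = [10, 10, 10, 11]    # a competing tip that just grew by one block

best = select_best_chain([green, red])  # red wins; green gets orphaned
```

This is the entire decision rule: no identities, no messages, just an objective comparison of accumulated work that every node can make independently.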
But there are altcoins that use other forms of consensus: modified consensus that takes orphaned children into consideration, for example, which is the GHOST protocol I described before; we have some experimental implementations of that. There are consensus mechanisms that, instead of proof-of-work, are based on proof-of-stake or delegated proof-of-stake. And we're seeing all kinds of new consensus algorithms being dreamt up. How many of those can scale to a global level of security that is resistant to global attacks? So far, one: Bitcoin. But that doesn't mean another one can't scale. What's difficult, however, is that if you try to scale a consensus algorithm today, you have to reach scale before you are attacked at scale. You have to build an industrial infrastructure of hashing or mining, a user-adoption base, or an economic base that is big enough to resist attack before you are attacked. Bitcoin did this by everybody ignoring it for a couple of years, because they didn't think it was important. By the time everybody noticed and thought, okay, maybe this is important and worth attacking, it was already strong enough that it couldn't be attacked. Since then, the strength of the network has outpaced adoption, demand, and attacks, making it extremely resilient. The problem is you can't do that again, because if you have a really innovative consensus algorithm and people think it's going to be valuable enough to join and mine for, they're also going to think it's valuable enough to attack, and there's no flying under the radar anymore. This is not just an algorithm. This is now a completely new scientific discipline. Consensus algorithms will be an entire computer science curriculum in the future. This is a completely new branch of distributed computing. It is extremely important, and it is now six years old. This is the birth of a new scientific discipline.
We've gone from one scientific paper, the Satoshi Nakamoto paper, to about 140 peer-reviewed academic journal papers written on consensus algorithms last year alone. This is now a thriving scientific discipline, with dozens and dozens of researchers around the world working on it. Consensus works in waves, and this is an important concept to understand. There's what I would call process consensus, which is a process of debate and proposal that happens among the development community. It starts with Bitcoin Improvement Proposals, discussions on GitHub, Bitcointalk, the Bitcoin developers mailing list, and various other places, where people suggest changes, slight or major modifications to the rules. The reason for that discussion is to enable smooth software transitions in runtime consensus. You gather, debate, and try to reach process consensus, which means you have enough people in the development community agreeing with you. You then do a lot of testing in order to validate the software. You provide a reference implementation that implements the change. You demonstrate that reference implementation on testnet, which is a parallel Bitcoin blockchain. You run a battery of tests; the other developers poke holes in it, find bugs, suggest improvements; and at some point you reach consensus. Then it is implemented in the Bitcoin Core reference implementation. Now you've reached reference consensus, which means it is introduced as a release of the software. In order to do that, you have to get a broader set of consensus. That software is released, and in order for it to actually go into the network, people have to upgrade. Now you have to have consensus among the constituencies. People think that miners are the only constituency, but there are actually five consensus constituencies in Bitcoin.
The software developers, who are making the reference implementation; the miners, who are implementing the runtime consensus by mining blocks; but also the exchanges. Each exchange that converts Bitcoin into other currencies is running nodes that validate transactions, and they choose which version of the software they run. The wallets: each wallet company or piece of wallet software out there creates transactions that must be validated by consensus rules, and if they are doing centralized wallet processing, they are also running nodes that must validate transactions based on consensus rules. And finally, merchants and merchant processors, the economic engine of Bitcoin. Merchants, either directly or through processors, are running nodes to validate transactions. They do the strictest validation possible, because they are the ones who hand out a plasma TV for this mythical magic internet money. If they don't validate a transaction correctly, they are out one TV and don't have Bitcoin to show for it. What happens if the miners go off on their own and the merchants, exchanges, and wallets choose a different version? Well, the miners create Bitcoin when they mine. A hundred blocks later, they try to spend that Bitcoin. Only they can't spend it, because they can't buy anything with it: the merchants are on a different chain, so according to the merchants, their transaction never happened. They can't convert it into currency to pay for their electricity, because the exchanges are on a different chain; according to them, the transaction never happened either. And all this time, they have been mining empty blocks, because the wallets are on a different chain, producing transactions based on different consensus rules. It's not so easy to shift consensus in Bitcoin. In fact, what we are seeing over time is that it is getting harder and harder to modify the consensus rules.
This is a process that we see in distributed systems and protocols. I call it ossification. I don't know if that's an industry-standard term, but the idea is that, after a while, as the protocol gets embedded in more software systems and more developers learn how to use it, it gets embedded in hardware, in systems that are not updated or maintained often enough, and it becomes harder and harder to change. A great example of that is IPv4. IPv4 got so embedded that we've now spent almost twenty years trying to upgrade it to IPv6, and it is resisting its own successor. It has become incredibly difficult to upgrade IPv4, because you have it embedded in fridges, light switches, wireless access points, things that don't have interfaces and can't be upgraded, or don't have enough memory to be upgraded. IPv4 became ossified. The best protocol doesn't win; the protocol that's good enough and achieves network scale first wins. The consensus rules of Bitcoin today can probably absorb change at the core protocol level for a couple more years. After that, most of the innovation has to move to protocol layers above, just like most of the innovation on the internet moved from IP to TCP to HTTP and to protocols above HTTP. Each of the layers below gradually became ossified and could not be changed dramatically. You can't go and change TCP/IP today; it's simply impossible. Let me take one more question, and then we're going to wrap, because we're running late. How many transactions can you get done in ten minutes? That is a capacity limitation, which is artificially constrained by the maximum size of a block. A block today can have a maximum size of one megabyte. At one megabyte, a block can fit a few thousand transactions, depending on the size of each transaction, which is variable.
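That capacity figure can be checked with back-of-the-envelope arithmetic. The 1 MB limit and the 10-minute block interval are from the talk; the 250-byte average transaction size is an assumption, used only to get a rough number.

```python
# Rough block capacity arithmetic under the constraints described above.

MAX_BLOCK_BYTES = 1_000_000     # 1 MB block size limit
AVG_TX_BYTES = 250              # assumed average transaction size
BLOCK_INTERVAL_SECONDS = 600    # one block roughly every ten minutes

txs_per_block = MAX_BLOCK_BYTES // AVG_TX_BYTES          # a few thousand
txs_per_second = txs_per_block / BLOCK_INTERVAL_SECONDS  # a handful of tps
```

With those assumptions you get about 4,000 transactions per block and roughly 7 transactions per second; larger average transactions push the figure down toward the lower end of the range.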
The average processing capacity is between three and seven transactions per second with the current constraints. These are artificial constraints. There's a big debate going on in Bitcoin at the moment as to how and when to raise the capacity limit. For the time being, blocks are coming in about 60-75% full, meaning that there's still excess capacity to fill with low-fee transactions. The vast majority of transactions get confirmed in the next available block, on a best-effort basis. Occasionally, when the network is under stress, it may take two or three blocks for a transaction to be processed. Transactions don't have an expiration date; as long as the network knows about them, they will eventually be confirmed. They are valid forever, so you can keep retransmitting a transaction until there's sufficient capacity in the system to absorb it. It's fairly resilient in that way. The proposals at the moment are to increase the block size limit to 8 megabytes by January 2016, an eight-fold increase, followed by a doubling every four years: 8, 16, 32, and so on. Kept up to 2032, that gets us to a 20-gigabyte block. If we keep approximately the same transaction size, that means a 20,000-fold increase in transaction capacity: you are looking at approximately 100,000 transactions per second, which is the global capacity of the Visa network. That depends on whether you need to do all of the transactions on the blockchain, because there's a big argument that you can actually do a lot of transactions off the blockchain; you don't need to put every transaction on the blockchain. There are many circumstances where you can do incremental transactions between two parties, a technology called payment channels, and then do an eventual settlement of the net difference of the sum of those transactions as a single transaction on the blockchain.
What that does is take the trust capability of the network and provide it as an attribute to a layer above, which can leverage it without flooding the blockchain with transactions. There is a base number of transactions you can actually record on the blockchain, but each one of those could represent hundreds of transactions that happened off the blockchain between two parties. So the actual capacity may be a lot higher. It's a bit like monetary measures: M0 is the base amount of physical currency in existence. The amount of cash in the economy is less than 3% of the total amount of money in the economy, but because that cash gets used again and again, it enables a lot more economic activity; each unit of currency has velocity. In a blockchain, you can think of the base transaction rate as the capacity of the base mechanism, but with overlay networks you can magnify that and increase the velocity of each transaction. Does that answer your question? So, essentially, you're not expecting to sell a couple of plasma TVs in each block; that would be dealt with by some higher protocol? Well, at the moment you could, if you wanted to, just about. But if you have more... I'm thinking in terms of... You have to predict: is Bitcoin the currency with which you buy aircraft carriers? Or is Bitcoin the currency with which everyone, two or three billion people, buys a cup of coffee every single day? And then the secondary question is: if Bitcoin is the currency with which billions of people buy a cup of coffee every day, do all of those transactions happen on the core blockchain, which means we need to massively increase capacity? Or do many of them happen on overlay networks with eventual settlement, which still preserves the decentralized nature and the transactional trust, but doesn't flood the blockchain? There are competing schools of thought on this.
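The settlement idea behind payment channels can be sketched as simple netting. The amounts and the structure below are invented for illustration; real channels use signed commitment transactions between the two parties, not a bare running sum.

```python
# Netting sketch: many small off-chain updates between two parties
# settle as ONE on-chain transaction for the net difference.

def settle(channel_updates):
    """Each update is a signed amount from A to B (negative = B to A).
    Only the net sum ever touches the blockchain."""
    return sum(channel_updates)

# A hundred coffee-sized payments of 5 units each, plus one 20-unit refund...
updates = [5] * 100 + [-20]

on_chain_amount = settle(updates)  # the single settlement amount
on_chain_tx_count = 1              # one blockchain transaction instead of 101
```

This is the velocity argument in miniature: the blockchain records one transaction, yet it anchors the trust for a hundred and one economic events.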
So we necessarily must scale up capacity, and the question is not whether we will use technique A, B, or C, but rather how quickly we can use A and B and C in parallel to increase capacity. We're seeing overlay networks, block size increases, optimization of block propagation, pruning of transactions off the blockchain to reduce the footprint on disk, and optimizations in validation and processing. All of these things are happening at the same time. It's a bit of a philosophical issue at the moment; we don't know where Bitcoin is going as a transaction-processing environment yet. Your suggestion is that it won't be the whole thing; it will be a supporting layer for other things? Not necessarily. I think that's more likely, but it could very well scale to support hundreds of thousands of transactions per second. The other context in which all of this is happening, of course, is Moore's law. If bandwidth, storage, and compute capacity continue to increase along Moore's law, then in the next 20 years, as Bitcoin reaches mainstream adoption, you could actually support billions of users with quadrillions of transactions. You just have to move a lot more data on much beefier computers. We don't know exactly where it's going. We'll see. This is a software engineering problem, and that's why it's exciting. Hopefully, some of that research will be done here at UCL. Thank you so much for coming today.