Okay, my name is Sergio Demián Lerner. I'm an Ethereum fan and have been part of the community since its beginning, when I did the first security audits of the Ethereum code and design. Currently I work for a company called RSK Labs, which is developing an Ethereum-compatible sidechain to Bitcoin. But today's talk is about research I did a couple of years ago that applies to Ethereum, to RSK, and to any other blockchain platform. The talk's name is "The Blockchain Virus: Can a Blockchain Pay to Replicate?" Can a blockchain itself pay for all the resources it needs to survive, like an organism? It could have been called "How to Reward Full Nodes", or it could have been called "How to Create a Reward-Based Peer-to-Peer File Sharing System Controlled Entirely by Smart Contracts, with No Additional Blockchain".
So first of all, I'm going to talk a little bit about decentralization, because we wouldn't be here if we didn't appreciate decentralization, and any past attempt to build a cryptocurrency or smart contract platform that was centralized just failed. So this is what we must stand for; this is what we must try to keep. Why is decentralization so important? Well, it brings power back to the people. That means a lot of things: people being able to audit their payments, disintermediation, open access, no central control, no monopolies, no central point of failure, and of course censorship resistance and corruption resistance. These are very, very important properties of our networks. So, how do we promote decentralization?
Well, first we have to identify and measure it, and then we can see how to improve it. One of the ways is increasing the number of independent implementations, developers, and full nodes. Another way is reducing the cost of running a full node, because as the network scales the cost of running a full node increases, and if I have to depend on a third party to know about my transactions, then the network is going to be centralized. And last, the actual full node topology is very important: if I take one node out and the network is disrupted or disconnected, then of course this is not a good decentralized network. But we can see that in all of these ways of improving decentralization, it's very, very important to have full nodes. We cannot have decentralization without full nodes.
So what is a full node anyway? A full node provides a lot of services to the network. It broadcasts blocks and transactions. It also hides the nodes' IPs, so we don't give an attacker the possibility to gather all the information about the network and do partition attacks or denial-of-service attacks. A node also provides filtering services for light clients. And of course the most important function: it has a copy of the state and the blockchain to serve to new nodes that want to bootstrap and sync with the network.
But what are the incentives to run a full node currently? There are none. Currently it is quite cheap, so we do it, but in the future we don't know. So anything we can do to reward full nodes will foster decentralization. How can we do this? Well, there are very simple ways, but most of them do not work. For instance, we can create peer-to-peer paid services: I create a micropayment channel from one node to a peer, or use probabilistic payments, which also work, and then I can just ask for a block and you give me the block.
I make you a very small micropayment. Well, the problem with this setup is of course that we can have proxy nodes, and proxy nodes can charge a little more and just forward the request to another node. They don't contribute to decentralization; they just make the network more expensive. Some other cryptocurrencies, some other networks, rely on something called masternodes. These masternodes regularly ping nodes so they can vote on which nodes have to be rewarded. This of course has a lot of problems with centralization: these nodes are centralized, and they also gather a lot of information about the IPs of the nodes, which is something we want to avoid.
So I will present a third way of doing this, and the interesting thing is that this is a communication between smart contracts and full nodes without any other user or external system intervention. Basically there is a smart contract that will say something like this: "Give me a proof of work that proves with high probability that you store on your own hard disk a copy of the blockchain, or that you are so irrational that you are willing to buy a lot of hardware just to fake this proof, spending more money than you would by just buying the hard disk to store the file. And I will pay you if nobody else finds that you have cheated."
So if full node A submits a proof in transaction data, then another full node B can come and say, "You know, this proof is fake", and the smart contract can evaluate the proof. The proof must be able to be evaluated with a very low amount of gas. The contract can then keep A's pre-deposit and reward B for providing that useful information. To summarize, we want a node to be able to prove to all remaining nodes in the network that it is storing the blockchain. The benefits are obvious. We can detect proxy nodes.
We won't pay proxy nodes, and we don't require any special change to the network for the nodes that don't want to participate. Maybe the drawback is that we are not actually proving we are serving the blockchain; we are proving we have a copy of the blockchain. But since we have a copy of the blockchain and we are in sync with the network, we are almost surely serving it, and the cost of serving it will be small.
To manage to do this we need some cryptographic protocols. These are the previous attempts to do this kind of thing. We have provable data possession, which allows a verifier to send some data to a server and then challenge the server to see if this data is still there. We also have proof of retrievability, where the verifier can also have certain assurance that it will be able to recover this information in the future. These two are called proofs of storage. I will present a very simple protocol, which I called proof of unique blockchain storage. It is a proof of storage scheme, but it also allows you to ensure that the other party, the server, has an actual copy of this file. That means that this file will be related to some identity, so the same file cannot be related to two identities.
It's interesting that this year Filecoin presented a very similar proof that they call proof of replication. I also published this a couple of years ago and it's kind of the same idea, but theirs is worse, and I will tell you later why it has many problems. In a nutshell, we take the file whose possession we want to prove and we encode it together with an identity, and this identity will generally be your node address, your Ethereum address or RSK address.
So what we need is that decoding is fast, but it only has to be fast enough not to interfere with your network operations. On the contrary, encoding and updating will be much slower operations, but again not so slow as to interfere with operations. The property is that when you encode a file and you encode a second file, there is nothing shared between these encodings. So the server will engage in a challenge-response protocol with a verifier, and it will use the encoded data, not the decoded data, just the encoded data. And it must be irrational from the economic point of view to do anything else, like relaying to another party or having a supercomputer or whatever.
So this is how the peer-to-peer protocol goes, without smart contracts. The honest verifier says: "Give me the hash of N randomly selected blocks of the file that has been encoded with your node address, and please do it in less than one second." The honest prover has the information, so he just brings the information into memory and hashes it, and it works. Now if there is a malicious prover, he can think of two things. First, he can think, "Okay, I'm going to relay the same query, the same challenge, to another party." But the problem is that the response he will receive is related to another identity, so he's not going to get paid; he's just going to be paying another party. Second, if he wants to redo this encoding, the problem is that it takes a lot more than one second.
Okay, so in a sense he cannot fake a proof. To create this asymmetry we need something I call practical asymmetric-time encoders, or PATEs. We have a program which is an encoder and another program which is a decoder, for a certain Turing machine, and we define the steps a program performs as the number of steps it requires to run on a random input until it halts. And we will ask that this steps function is mostly uniform. So we have two programs, and if I encode an input and then decode it, I get the same input back. We define the asymmetry ratio as the number of steps it takes to encode divided by the number of steps it takes to decode.
The desired properties are, of course, that the programs must be small, and that if there is any other party that computes the same mapping, his encoding must be roughly equally efficient. To guarantee this we need to rely on some complexity assumptions or number-theoretic assumptions, to make sure that the encoding we pick is one of the fastest encodings possible for that mapping. Also, as I said, we want the decoding and encoding speeds to be practical for our application, and we want the highest possible asymmetry ratio, because that is what is going to protect us from the attacker trying to encode on the fly.
Well, there is a function that has these properties. It's called the Pohlig-Hellman cipher, a private-key cipher. It's very, very old, one of the first number-theoretic encryption systems. In our case it is used the opposite way around, so decryption would be like encoding. It works like this: to decode a value, we just raise it to the third power modulo a big prime p, and to encode it we raise it to a value u, which has approximately n bits and is the inverse of three modulo p minus one.
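The encode/decode pair just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's implementation: the 448-bit prime 2^448 - 2^224 - 1 is only a stand-in for the 2048-bit prime mentioned in the talk (chosen because 3 is invertible modulo p - 1), and the names `P`, `U`, `encode` and `decode` are made up for this sketch. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Stand-in parameters (assumption): a 448-bit prime instead of the
# 2048-bit prime used in the talk.
P = 2**448 - 2**224 - 1

# Cubing is only invertible if 3 has an inverse modulo P - 1.
assert (P - 1) % 3 != 0

# Encoding exponent u: the inverse of 3 modulo P - 1.
# It is about as long as P itself, which is what makes encoding slow.
U = pow(3, -1, P - 1)


def encode(x: int) -> int:
    """Slow direction: a full-length modular exponentiation."""
    return pow(x, U, P)


def decode(y: int) -> int:
    """Fast direction: a modular cube."""
    return pow(y, 3, P)


x = 123456789
assert decode(encode(x)) == x
```

The round trip holds because 3u is congruent to 1 modulo p - 1, so raising to the power 3u leaves every value modulo p unchanged.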
So we get this encoding-decoding property. Using the standard square-and-multiply algorithm, encoding takes about 1.5 times n multiplications, while decoding takes only three multiplications. So the asymmetry ratio here is n divided by two, and we can make this asymmetry ratio as large as we want just by taking a larger prime.
So what are the practical parameters we can choose that serve this purpose? I tested it on my laptop with libgmp. This is a little bit technical, but please follow me. For n equal to 2048, exponentiation takes 2.8 milliseconds while three multiplications take only 3.6 microseconds, so the theoretical asymmetry ratio is approximately 1,000 and what I measured is approximately 777. Now think about the Bitcoin blockchain. We get a one-megabyte block every 10 minutes and it takes only 11 seconds to encode it. So it works perfectly for Bitcoin, and it works perfectly for RSK and for Ethereum too. And decoding is really fast.
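As a rough cross-check, the figures quoted here can be reproduced with a little arithmetic. The sketch below assumes that one chunk is n = 2048 bits (256 bytes) and that one megabyte means 2^20 bytes; the variable names are illustrative only, and the two timings are the ones measured in the talk.

```python
n_bits = 2048
t_encode = 2.8e-3   # seconds per exponentiation (figure from the talk)
t_decode = 3.6e-6   # seconds per three multiplications (figure from the talk)

# Operation-count ratio: 1.5 * n multiplications versus 3, i.e. n / 2.
theoretical_ratio = (1.5 * n_bits) / 3

# Ratio of the measured timings.
measured_ratio = t_encode / t_decode

# Encoding one 1 MB block, one 2048-bit chunk at a time
# (assumption: 1 MB = 2**20 bytes).
chunk_bytes = n_bits // 8
chunks_per_block = 2**20 // chunk_bytes
block_encode_seconds = chunks_per_block * t_encode

# Decoding throughput in bytes per second.
decode_bytes_per_second = chunk_bytes / t_decode

print(theoretical_ratio, round(measured_ratio))
print(round(block_encode_seconds, 1), round(decode_bytes_per_second / 1e6, 1))
```

Under these assumptions the operation-count ratio is 1024, the timing ratio is a little under 778, one block takes about 11.5 seconds to encode, and decoding runs at roughly 71 megabytes per second.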
You get about 71 megabytes per second, so that is not going to interfere with your node operations. If you want a higher asymmetry ratio, you can take n equal to 20k and you are still within bounds that you can use for encoding your blockchain. So this is the one-time blockchain encoding step. You take your blockchain or your file and split it into n-bit chunks. Then you scramble each one with your node address (this scrambling can be very simple, it can be just XORing with a full-domain hash of the node address), and then you encode it with your slow function.
So this is what a typical proof of storage looks like. You have some kind of memory-bound problem, because you want to prove you have a lot of data, and you transform it into a disk I/O problem like this: the verifier sends you a seed, from that seed you derive a number of positions in the file, and then you just access those positions and build a hash. Well, the problem with all these constructions is that they depend on the disk technology. If I use an SSD, the parameters are completely different than if I use a hard disk, and these characteristics also change over time.
So this is what we really do. We first do the step of encoding and creating this asymmetry. Then we receive from the verifier, which in this case is the PoUBS contract, the proof of unique blockchain storage contract, a seed zero. With seed zero we derive some indexes, some parts of the blockchain or the file, and we bring all of these into RAM. So we are independent of the access technology. This is the challenge-response preparation phase. Then, about 20 seconds later, we receive from the smart contract a second seed, seed one, and from seed one we derive the indexes in RAM from which we take small pieces, small blocks, and hash them.
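The one-time encoding step and the two-seed challenge can be sketched as follows. This is an illustration under assumptions, not the actual PoUBS code: the 448-bit prime is a stand-in for the 2048-bit one, SHAKE-256 stands in for the full-domain hash, SHA-256 is an arbitrary choice for deriving indexes and building the response, and every function name here is hypothetical.

```python
import hashlib

P = 2**448 - 2**224 - 1            # stand-in prime (assumption, not 2048 bits)
U = pow(3, -1, P - 1)              # slow encoding exponent
CHUNK = (P.bit_length() - 1) // 8  # plaintext bytes per chunk, value stays below P
ENC = (P.bit_length() + 7) // 8    # bytes needed to store one encoded chunk


def mask_for(address: bytes) -> int:
    """Per-identity mask; SHAKE-256 stands in for a full-domain hash."""
    return int.from_bytes(hashlib.shake_256(address).digest(CHUNK), "big")


def encode_file(data: bytes, address: bytes) -> list:
    """One-time step: split into chunks, XOR-scramble, then slow-encode."""
    mask = mask_for(address)
    encoded = []
    for i in range(0, len(data), CHUNK):
        x = int.from_bytes(data[i:i + CHUNK].ljust(CHUNK, b"\0"), "big")
        encoded.append(pow(x ^ mask, U, P).to_bytes(ENC, "big"))
    return encoded


def decode_file(encoded: list, address: bytes, length: int) -> bytes:
    """Fast direction: cube each chunk, then remove the identity mask."""
    mask = mask_for(address)
    plain = [(pow(int.from_bytes(y, "big"), 3, P) ^ mask).to_bytes(CHUNK, "big")
             for y in encoded]
    return b"".join(plain)[:length]


def indexes(seed: bytes, count: int, modulus: int) -> list:
    """Derive `count` pseudo-random positions below `modulus` from a seed."""
    return [int.from_bytes(hashlib.sha256(seed + j.to_bytes(4, "big")).digest(),
                           "big") % modulus
            for j in range(count)]


def prepare(encoded: list, seed0: bytes, ram_size: int) -> list:
    """Preparation phase: seed zero picks the pieces to bring into RAM."""
    return [encoded[i] for i in indexes(seed0, ram_size, len(encoded))]


def respond(ram: list, seed1: bytes, k: int) -> str:
    """Response phase: seed one picks k pieces already in RAM and hashes them."""
    h = hashlib.sha256(seed1)
    for i in indexes(seed1, k, len(ram)):
        h.update(ram[i])
    return h.hexdigest()


blockchain = b"block header and transactions " * 40
copy_a = encode_file(blockchain, b"node-address-A")
copy_b = encode_file(blockchain, b"node-address-B")
ram_a = prepare(copy_a, b"seed-0", ram_size=8)
answer_a = respond(ram_a, b"seed-1", k=4)
```

In this sketch the two encoded copies share no chunk, so a response computed from node B's copy is of no use to node A, while either copy decodes back to the same data with a cheap cube per chunk.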
Well, we have transformed this memory-bound problem into a CPU-bound problem. But there is an important problem here: we cannot do this from a smart contract, because the block interval can be about 10 or 20 seconds. So the number k, the number of pieces we have to hash to prevent an attacker from doing this encoding on the fly, would be very, very high, on the order of 20,000 or maybe more. So the proof we get is very large, and the smart contract won't be able, with the gas limits, to evaluate whether it is correct or incorrect. So we changed the last part and transformed the CPU-bound problem into a memory-bus I/O problem. This has been done in Permacoin, for instance, with the same procedure. Instead of taking a large number of blocks to hash, we take only 40, but we require that the hash we obtain has proof of work. So we will try many, many times, deriving new seeds from seed one and a nonce, trying for maybe 10 or 20 seconds, until we get one that has this proof-of-work property.
So this is how the whole protocol works, and it's very, very simple. In step zero we have this one-time encoding, but full nodes also register with the PoUBS smart contract. They don't have to reveal their IP.
They just have to register an RSK or Ethereum address to be paid. They also make a pre-deposit, because they are going to be penalized if they present an invalid proof. In step one the PoUBS contract derives seed zero from the block hash and gives the nodes enough time to bring all these pieces, or one big piece, into RAM. In step three the PoUBS contract derives seed one, and now the competition begins. It's not a competition, it's cooperation actually, but the full nodes will try to get this proof of work: from all proof candidates they try to find one whose hash is lower than the proof-of-work target. Then they submit these proofs to the blockchain in normal transactions. There is plenty of time, like six blocks ahead, to prevent censorship, so that they can do that.
In step six there is a deadline for this, and then an external challenge phase begins. In this phase full nodes try to see if the other nodes are cheating. They pick the nonce and the hash that have been committed by other full nodes and re-evaluate those proofs. Since there are only 40 elements, it takes about a hundred milliseconds, so they can evaluate every other full node's commitment. If one of them finds that someone has cheated, it just sends a fraud proof to the blockchain, to the PoUBS smart contract, which is just the full expansion of the proof and shows whether the original commitment was correct or incorrect. In step nine there is a deadline for fraud proof submission, and then the rewards are paid. Of course, they are only paid to the nodes that behaved honestly, and because the reward is shared, full nodes are also incentivized to detect cheaters. Now what's the difference?
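The proof-of-work search and the re-evaluation by other full nodes can be sketched like this. It is a simplified illustration under assumptions: the prime is a 448-bit stand-in, the 40 pieces are picked from the whole encoded store rather than from a RAM region selected by seed zero, SHA-256 and SHAKE-256 are arbitrary choices, and all function names are hypothetical rather than taken from the PoUBS contract.

```python
import hashlib

P = 2**448 - 2**224 - 1            # stand-in prime (assumption, not 2048 bits)
U = pow(3, -1, P - 1)              # slow encoding exponent
CHUNK = (P.bit_length() - 1) // 8  # plaintext bytes per chunk
K = 40                             # pieces hashed per proof candidate


def encode_chunk(chunk: bytes, address: bytes) -> bytes:
    """Scramble one plaintext chunk with the identity, then slow-encode it."""
    mask = int.from_bytes(hashlib.shake_256(address).digest(CHUNK), "big")
    x = int.from_bytes(chunk.ljust(CHUNK, b"\0"), "big") ^ mask
    return pow(x, U, P).to_bytes(CHUNK + 1, "big")


def picks(seed1: bytes, nonce: int, n_chunks: int) -> list:
    """The K chunk positions selected by seed one and a nonce."""
    sub = hashlib.sha256(seed1 + nonce.to_bytes(8, "big")).digest()
    return [int.from_bytes(hashlib.sha256(sub + bytes([j])).digest(), "big") % n_chunks
            for j in range(K)]


def digest_of(pieces: list, seed1: bytes, nonce: int) -> int:
    """Hash of the selected pieces, as an integer to compare with the target."""
    h = hashlib.sha256(seed1 + nonce.to_bytes(8, "big"))
    for piece in pieces:
        h.update(piece)
    return int.from_bytes(h.digest(), "big")


def prove(stored: list, seed1: bytes, target: int, max_tries: int = 100_000):
    """Prover: cheap lookups in the stored encoding until a hash beats the target."""
    for nonce in range(max_tries):
        selected = [stored[i] for i in picks(seed1, nonce, len(stored))]
        d = digest_of(selected, seed1, nonce)
        if d < target:
            return nonce, d
    return None


def is_fraud(plain: list, address: bytes, seed1: bytes,
             nonce: int, claimed: int, target: int) -> bool:
    """Challenger: re-encode only the K selected chunks for the prover's address."""
    pieces = [encode_chunk(plain[i], address)
              for i in picks(seed1, nonce, len(plain))]
    return claimed >= target or digest_of(pieces, seed1, nonce) != claimed


plain = [bytes([i]) * CHUNK for i in range(64)]
stored_a = [encode_chunk(c, b"node-A") for c in plain]
target = 2**256 // 8
nonce, claimed = prove(stored_a, b"seed-1", target)
```

The prover only submits `nonce` and `claimed`, which keeps the commitment short. A challenger pays for 40 slow encodings per check; at about 2.8 milliseconds each that comes to roughly 112 milliseconds, which appears consistent with the figure of about a hundred milliseconds quoted in the talk.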
I said before that this protocol can be used to prove storage of any file, not just the blockchain. The blockchain is the easiest one, because the node already has a copy of the blockchain, so maybe it can more easily evaluate things. But it can be used for any file, to create a file sharing system in which you reward nodes just for holding the data for you, in multiple copies. You can just create a contract that creates a hundred copies of the same information.
But there are some differences between Filecoin's proof of replication and proof of unique file storage. Proof of replication depends on the access time, so it's difficult to tune for all kinds of disks, and it gives you typical asymmetries of about 10 to 100, so it's very easy for technology to change and then an attacker is able to fake proofs. On the contrary, proof of unique file storage does not depend on access time and has typical asymmetries between 1,000 and 10,000. You can also combine them: you can take the 10,000 asymmetry ratio of proof of unique file storage and gain 10 times more from proof of replication.
Well, the summary is that decentralization is very, very important in these networks. By setting up a full node reward for the first time, we can incentivize the creation of new full nodes and keep the number of full nodes even if we require them to store 10 times more information. Proof of unique file storage and proof of unique blockchain storage allow you to prove that you have this unique copy of the file or of the blockchain. The important thing is that this can be done right from a smart contract with almost no gas consumption. Also, the proofs are very short.
It's just a hash and a nonce. So this could be implemented in Ethereum or in RSK tomorrow. We are working on an implementation, so when we open-source it you will be able to play with it. So thank you very much, and thanks to the organizers.