The next presentation is from Emery, who is quite a regular at our local hackerspace and works on various topics related to Nix and Nim and such. Today he will show us a proposal for an encoding for robust immutable storage, as you can see behind me. I'm particularly looking forward to this because immutable data structures and content-addressed storage are also a personal interest of mine and, let's say, a functional programmer's delight. So I'm quite interested in what you have to say. Welcome, Emery.

Okay, thank you. Right, so first I want to talk about the problems that ERIS can help solve, and then I want to talk about what content addressing is. I'll explain ERIS and the implementation, cover some use cases, and then there are a few open questions that I want to mention.

So, the problem of data. First, if you look at how we refer to data, you have to go back to the 1960s and Ted Nelson's Project Xanadu, which was the precursor to our web. This was the first hypertext project, and it established a lot of the terminology we use for the web now. In 1981 the 17 rules were published, and these were the requirements for Xanadu. Rule 10 was "every document is uniquely and securely identified," and rule 12 was "every document can be rapidly searched, stored and retrieved without user knowledge of where it is physically stored."

Then in the early 90s the web came along, and the IETF defined the URI: a compact string of characters for identifying an abstract or physical resource. This was split into two categories, the URL and the URN. The URL is a subset of URI that identifies the resource via a representation of its primary access mechanism, rather than identifying the resource by name or some other attributes. So in this example you have the primary access mechanism, then a server DNS name, and then whatever path you follow on the server to get to your article.

But URLs don't have to refer to anything in particular; all the Wikipedia instances have these random URLs. And more and more you see these really ugly URLs. This one was from a privacy-related mailing list, and they just pack all this garbage and server state into URLs now.

The L in URL stands for location, not link; for a long time I thought it was link. I put together these slides in the Netherlands and I got this page, and then in Germany it looks like this. If you're in India you can't visit Pakistani sites; if you're in China,
you can't visit a lot of other things.

Yeah. And the other problem is link rot: domain registrations get more expensive over time and eventually lapse. There was a study this year on the New York Times, and they found that 25% of all deep links — a deep link being a link not to a website but to a specific page on a website — were completely inaccessible: 6% of those from 2018, 43% from 2008, and 72% from 1998. Half of all articles that contain deep links contain at least one rotted link.

And the thing is, once the link rots, someone can just buy the domain name. So in July: there was a video hosting site called Vidme. I don't know when they went down, but news sites had been embedding videos from Vidme in their articles, and a porn site bought the domain name, so you had all this porn appearing in those articles. And of course, if you're reading this Twitter garbage, you might as well watch some porn, so I'm sure they made money doing this.

The other URI category is the URN. A URN is a subset of URI that is required to remain globally unique and persistent even when the resource ceases to exist. All these standards have a URN format. Books have the International Standard Book Number, and you can put that in URN form. A bank account is a good example, because if you close your bank account the bank doesn't recycle your IBAN; it's persistent. But that's not really what Xanadu said we would need.

Then we had magnet links, and magnet links really do concretely refer to very specific data. This was a de facto standard from 2002, and it's what we usually use for file sharing. So this is a typical BitTorrent magnet link, and this one is for Direct Connect — you can use magnet links with different protocols. This Fanimatrix example is a good one, because the torrent was released in 2003 but the website went down in 2006. The website lasted three years, but the torrent has been active for 18 years. And magnet links are usually content-addressed, but not always.

So then, what is content-addressed data?
Content-addressed data is data addressed not by location but by the content of the data itself. You're referring to the content, so if the content changes you have a different address. Content-addressed data is therefore necessarily immutable.

In both of the magnet link examples you have this sort of garbage in the middle, and that is a hash digest. The two protocols use two different hash functions, but they look similar. This is probably review for a lot of people, but: a hash function is a one-way function where for any input you feed in, you get a fixed-length output, and the same input always results in the same output. If you look on the left-hand side: change any bit of the input and you get a different output. The output should be indistinguishable from random data, so you can't actually tell whether something is a hash, or what the input data was. Again: fixed output length for any input.

Now, one million bytes is a reasonable size for an image file, but 32 bytes is a reasonable size for a hash digest, so digests can't actually be unique — you have far more possible inputs than outputs. But if one byte is 8 bits, then 32 bytes is 256 bits of binary, which gives 2^256 possibilities, a relatively large number. Even so, if you exhaustively search for two inputs that map to the same output, you only have to do about 2^128 hashes on average to find one match. The Bitcoin network hashes at about 128 × 10^18 hashes per second, but even if you used the Bitcoin network to do this, it would still take you about ten trillion years. And that's not feasible anyway, because Bitcoin uses somewhere between 100 and 200 terawatt-hours per year for hashing, and 100 terawatt-hours is about eight and a half megatonnes of crude oil. So you can't actually find collisions without destroying the Earth.

So what are some existing content-addressed systems?
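To make those hash-function properties concrete, here is a minimal Python sketch (an illustration, not from the talk) using BLAKE2b from the standard library — the same hash function ERIS uses later on:

```python
# Demonstrates the hash-function properties described above,
# using BLAKE2b from Python's standard library.
import hashlib

data = b"one million bytes of image data..."  # any input, any length
digest = hashlib.blake2b(data, digest_size=32).hexdigest()

# Same input, same output: hashing again gives an identical digest.
assert digest == hashlib.blake2b(data, digest_size=32).hexdigest()

# Flip a single bit of the input and the fixed-length output
# changes completely and unpredictably.
flipped = bytes([data[0] ^ 0x01]) + data[1:]
print(digest)
print(hashlib.blake2b(flipped, digest_size=32).hexdigest())
```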
Back in the 2000s we had a lot of file sharing options. Napster used the MD5 hash function, Gnutella was SHA-1, FastTrack — which you might remember as Kazaa or Morpheus — used UUHash, and then there were NeoModus Direct Connect and Advanced Direct Connect, which used Tiger tree hashes.

UUHash is interesting. In 2001 computers were much slower and hard drives were much noisier, and usually you shared your computer with your family. So if your hard drive was constantly making this grinding noise, your parents were going to ask what's going on and tell you this has to stop. So they had an optimization where you only hashed the first 300K of a file with MD5, you ran the end of the file through CRC32, and then for every megabyte in between you did another 300K of CRC32, and combined those together. But the problem was that there were companies doing market research on the FastTrack network, and if something got popular they would download it, corrupt the pieces that weren't being hashed, and push it back out. This was the end of FastTrack. FastTrack was the most popular file sharing protocol in 2003, but eventually, yeah, half of it was garbage. The important lessons are that you always have to hash the entire file, and, because you cannot trust people, you really should be able to verify the file in pieces, not all at once.

Then BitTorrent was a big improvement. It's still one of the most popular file sharing protocols, and it's a very good network protocol, but it's good for transferring files and not much else, unfortunately. BitTorrent was an improvement because you take all the data in the torrent, break it into pieces, and hash each piece; you put the hashes in your BitTorrent info file, and then you have one hash to refer to the info file. Then, last year, a new BitTorrent version was released, and now every file in the torrent has its own tree of hashes. This makes it easier to share files between different torrents, and also to produce new torrents that update a set of files — previously you would have to rehash the whole thing; this is much faster. In the new torrent format, you break every file up into fixed-size pieces, take a few of those pieces, and hash them together to create a level of a tree; then you keep going up, hashing each new level of hashes, until you only have one, and this is the root of the file.
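That per-file hash tree can be sketched in a few lines of Python. The 16 KiB pieces, SHA-256, and binary branching do match BitTorrent v2, but the reduction here is illustrative and skips v2's rule of padding the leaf layer to a power of two:

```python
# Sketch of a per-file Merkle tree: hash fixed-size pieces, then hash
# groups of hashes level by level until a single root hash remains.
import hashlib

PIECE_SIZE = 16 * 1024  # 16 KiB pieces
ARITY = 2               # child hashes combined per tree node

def merkle_root(data: bytes) -> bytes:
    # Level 0: hash every fixed-size piece of the file.
    level = [hashlib.sha256(data[i:i + PIECE_SIZE]).digest()
             for i in range(0, max(len(data), 1), PIECE_SIZE)]
    # Reduce: concatenate and hash groups of hashes until one remains.
    while len(level) > 1:
        level = [hashlib.sha256(b"".join(level[i:i + ARITY])).digest()
                 for i in range(0, len(level), ARITY)]
    return level[0]  # the root hash that identifies the whole file
```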
And then there's Git, which was first released as "the stupid content tracker" — I don't know what Microsoft calls it now, but it was originally developed for developing Linux. It's not totally original, but it was the most successful of these kinds of version control systems. If you don't know what Git is: it allows people to work on one code base simultaneously without trampling on each other, and there are extensions for storing large files, for things that are better stored as something other than source code. Git is kind of peer-to-peer: you don't have to have a central host for anything, and it's possible to copy files between machines this way. It's definitely the most widely used source code management tool. Unfortunately it uses SHA-1, which is not the hash function of choice these days. If you've ever used Git, you've seen these sorts of strings, and those are hashes, so you may not realize it, but this is very much a content-addressed system — and Git is proof that this scales well.

Scuttlebutt is worth mentioning. Scuttlebutt is a friend-to-friend social network: there are no servers, no centralization. You can post public messages for whoever wants to read them, and you can send private messages to other people, but you can also publish Git repositories through it, and this is the best tool I've seen for decentralized Git. Patchwork is the main Scuttlebutt client, and it looks like this: you have some text and you have an image. Any time you see something like an image, it's done with a blob, and all blobs are hashed. When you see a message, you can see: okay, this message mentions blobs I don't have — so you ask your friends. You tell your friends what blobs you want, your friends tell their friends what blobs they want and what their own friends want, and eventually everything replicates quite nicely. It's very robust, the way Scuttlebutt works. But the problem is that all these blobs are visible one or two degrees away from the origin, so you can start Scuttlebutt, let it run for an hour or two, and then just iterate over your blob cache and see what people have been posting for the last few years. You can encrypt the blobs, but this is not usually done, because it's supposed to be a public social network. And unfortunately the blobs are just hashed in a flat manner, one hash per file.

Then there's IPFS, which is fairly well known. IPFS defines protocols for how you store things and how you encode things: if you have large files you break them into blocks, you can put files into a Unix-like file system, and they have mutable references using public/private keys. IPFS does define a network protocol for synchronizing data, and you can use it to host websites, which is sort of the killer feature of IPFS. You can see in this URL — it's a URL, but you do have the content-address hash in it. But I don't like IPFS so much, because there's a variable block size, and they define multiple hash functions for referring to blocks — they define ten officially, but I think you can use two or three within the system.

So what, in my mind, would be an ideal standard for encoding content-addressed data?
You should have a common hash function. You want to encode data in a way that can be shared between applications and used with different protocols, and you want your encoding standard not to be biased towards any one method of transport. But it should also be privacy-respecting somehow.

So, there's ERIS. The simplest explanation of ERIS is: an encoding of arbitrary content into a set of uniformly sized, encrypted, and content-addressed blocks, as well as a short identifier. We use the URN format; an ERIS URN looks like this. ERIS came out of GNUnet: Christian Grothoff wrote a paper on ECRS, which describes what would become ERIS. pukkamustard wrote the ERIS standard; he did this for openEngiadina, which is a local knowledge platform for use in the Swiss Alps. Then I started working on ERIS because of the DREAM project, which was about building collaboration tools using CRDTs, and the CRDTs would be transmitted through ERIS.

So what is ERIS, specifically? ERIS is built with two cryptographic primitives: one is the ChaCha20 stream cipher and the other is BLAKE2b. BLAKE2b is inspired in part by ChaCha20, so they're closely related, and both are quite fast. We already have a number of implementations in relatively high-level languages.

If you want to encode data in ERIS, you first break it up into fixed-size blocks. To your final block you append a byte with the highest bit set to one, and then the rest is zeros. Then you hash each block with BLAKE2b, and now you have a key. (You can optionally salt the hash function, but I'll explain that later.) You treat this hash as a key: you key a ChaCha20 stream with it, XOR the plaintext block with the keystream, and now you have a ciphertext block. Then you hash the ciphertext — the encrypted block — and now you have a reference to the block.

If the data doesn't fit into one block, you construct a Merkle tree of hashes from the key and reference pairs: you concatenate those together, and when you get to the end you pad out these blocks with zeros, then hash and encrypt the tree blocks and keep reducing them until you have only one pair, a root pair. So at the bottom you have the blocks of the file; each of those blocks has a reference and key, and those form one level of the tree, and you keep iteratively reducing the tree until you have one node at the top.

When the blocks are in encrypted form, you only need a 32-byte reference — the hash of the encrypted block. If you want to actually access the data, that's done with a 66-byte identifier: one byte is the block size, the second byte is the tree depth, then you have the root reference — the hash of the encrypted form of the root block, which you can request from storage — and then you have the key, which is the hash of the inside, the plaintext, after it's been decrypted. Once you have the key, you can decrypt the encrypted block and verify its contents. Encoded, it looks like this, in base32. It's ugly, but it's not actually bigger than these stupid links you get in your emails these days.
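A minimal sketch of the per-block scheme just described, under stated assumptions: BLAKE2b from Python's hashlib and ChaCha20 from the pyca/cryptography package. The helper names are illustrative, and this is not a conforming ERIS implementation — the real spec also defines padding, tree construction, and the convergence secret:

```python
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms

BLOCK_SIZE = 1024  # ERIS uses 1 KiB or 32 KiB blocks

def encode_block(block: bytes) -> tuple[bytes, bytes, bytes]:
    assert len(block) == BLOCK_SIZE
    # 1. Hash the plaintext block with BLAKE2b: this becomes the key.
    key = hashlib.blake2b(block, digest_size=32).digest()
    # 2. Key a ChaCha20 stream with it and XOR the plaintext (a zero
    #    nonce is tolerable here only because each key encrypts exactly
    #    one block; the real spec pins down these details).
    cipher = Cipher(algorithms.ChaCha20(key, b"\x00" * 16), mode=None)
    ciphertext = cipher.encryptor().update(block)
    # 3. Hash the ciphertext: this is the reference used to fetch it.
    reference = hashlib.blake2b(ciphertext, digest_size=32).digest()
    return reference, key, ciphertext

# To read later: fetch the ciphertext by `reference`, verify it by
# re-hashing, then decrypt with `key` to recover the plaintext block.
```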
So there are only two choices of block size. The first is 1 KiB. Because of padding, one block can store up to 1 KiB minus one byte, since at least one byte has to be used for padding. If you want to go past that, the trees have an arity of 16 — you can refer to 16 blocks from one block, because 64 bytes goes into 1 KiB sixteen times. So with a one-level tree you can store up to 16 KiB minus one. This size is intended for metadata — RDF, short pieces of data — and with a block size like that you can put multiple blocks into one IP packet.

The other block size is 32 KiB, so 32 KiB minus one in a single block, and the arity of those trees is 512: you can fit 2^9 block pairs in one tree block, which gives 16 MiB minus one for a one-level tree. This is intended for bulk data, for large pieces of data. 32 KiB still fits in a 16-bit integer, so in theory you could put one block into an IP packet — if you have jumbo frames enabled, and you're sure they're enabled, and so on.

To measure the overhead, I took one of these Library Genesis zip files with a thousand PDFs inside it. The one I looked at was 1.7 gigabytes. Encoding it into 1 KiB blocks produced 61 megabytes of metadata — ERIS-only metadata — but with 32 KiB blocks it's only three megabytes. Then I unzipped the zip and repacked it into an EROFS image. EROFS is what Huawei uses for the system partitions on their phones; it's a SquashFS replacement, more or less. This image was 16 KiB larger than the zip file, but its encoded ERIS form was smaller by four megabytes. So if I had three megabytes of overhead before and now have four megabytes less, that's about six megabytes of deduplication inside the blocks.

How is it possible that zip and EROFS didn't find this duplication themselves — why would it come out smaller than the zip? With SquashFS, your image usually lives in memory, but if you write it to a block device: imagine every 4 KiB block in the SquashFS compresses to 3 KiB. On average you're going to have to read two physical blocks to get one logical block out of the SquashFS. What EROFS does is compress until it reaches the smallest possible size for a physical block, then stop, clear the compressor, and start again. If you align the compressed blocks like this, you reduce the read amplification on the physical layer — but it also means you can easily deduplicate blocks, which wouldn't happen in a zip or a SquashFS.
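As an aside, the tree-capacity arithmetic from the block-size discussion above is easy to check; a small sketch, assuming the 64-byte reference–key pairs described earlier (the function name is illustrative):

```python
# Each tree block holds (block_size / 64) reference-key pairs, so a tree
# of a given depth addresses arity**depth content blocks; one byte of
# content is always lost to padding.
def max_content_bytes(block_size: int, depth: int) -> int:
    arity = block_size // 64          # 64 bytes per reference-key pair
    return arity ** depth * block_size - 1

print(max_content_bytes(1024, 0))       # single 1 KiB block: 1023 bytes
print(max_content_bytes(1024, 1))       # one-level tree: 16 KiB - 1
print(max_content_bytes(32 * 1024, 1))  # one-level tree: 16 MiB - 1
```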
So anyway — what about privacy and security? You can see that it's easy to store and transport the data, since it's encrypted. But the problem is that this is convergent encryption: the same data will always encrypt the same way. So you have the problem of a confirmation-of-a-file attack. Imagine I'm in a conspiracy and I have four co-conspirators, and I send each of them a PDF, but each one is slightly different. I know that if they encrypt and publish their PDF through ERIS, I know what the hashes of each PDF will be, so I can monitor the network and see: has this file appeared? — and I know exactly who leaked it. Or if I seize some hard drive and I know what the hashes of the encrypted form would be, I can check for their presence.

Slightly less serious is the learn-the-remaining-information attack. Imagine I have surveys that are ERIS-encoded; if there are only a few million or a few billion variations of what a survey could be, I can just generate all of those and check for their presence in an ERIS data store.

But we can solve this by salting the inner hash function with a 32-byte key. The outer hash function stays the same, but for the inner hash function you have to have this 32-byte key to produce the correct key to encrypt or decrypt a block. And if you think about it, you could serve files using different keys, if you encode the data, keep the block references, and correlate the different block references with file offsets in some sort of lookup cache. Anyway, the important thing is that you no longer have to trust your storage, because you can verify everything that comes off of it. This is good if you want to create stateless storage — read-only, or write-once, read-many storage.
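A sketch of that salted variant: the inner hash that derives the encryption key is keyed with a 32-byte secret, so only parties holding the secret can predict the ciphertext of known data, while the outer hash (the block reference) is unchanged. The names here are illustrative, not taken from any ERIS implementation:

```python
import hashlib, os

convergence_secret = os.urandom(32)  # shared out of band with your peers

def block_key(block: bytes, secret: bytes = b"") -> bytes:
    # With an empty secret this degenerates to plain convergent
    # encryption; with a secret, an observer can no longer mount a
    # confirmation-of-a-file attack against known plaintexts.
    return hashlib.blake2b(block, digest_size=32, key=secret).digest()
```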
So what are the disadvantages of using ERIS? We don't have any replication mechanisms defined; we see that as a bit out of scope. We want to explore it, but that's something for a different spec. You could use Bloom filters — send someone a Bloom filter and say, send me any block that matches this filter — or you could just send a list of the hashes you want; I think IPFS uses lists like this. You can also put ERIS data into IPFS and use IPFS to replicate it. We also don't have any mechanism for correlating content to ERIS URNs; that is of course complicated, and we want to do it too, but that's a different story. If you lose the URN to some data, you're not going to be able to recover it — that's a potential problem. There's also no permanent storage format defined; if you want to write ERIS directly to a block device, I think I'll look at Venti from Plan 9, see what it does, and maybe reuse some of that. And you're maybe going to want to garbage collect data, which is not something we've worked through yet.

So, some use cases. pukkamustard, who wrote the spec, is interested in RDF, and he says: most existing RDF content is location-addressed — the URIs are pointers to hosts that hold the content. If the host goes down, the content is no longer available. This happens frequently enough to seriously undermine the robustness of RDF. This is an example from the W3C: a graph saying there's a Dr. Eric Miller, and he has an email address. But you can see that inside this graph they have all these URLs, and there isn't a guarantee that w3.org is reachable or will persist.

Something I'm working on with a group in Bengaluru called Janastu is Papad, an audio-visual publishing platform for low-literacy users, without the barrier of knowing how to read and write. It's a system for storing oral knowledge and then annotating and making correlations between these recordings. You create a recording, and at different points in the recording you can add an annotation, add tags, add images — so that if you aren't really able to read but you can navigate using images, you have access to this knowledge. And we want this to work offline, so it would be nice to have a content-addressed system that can replicate content in a mesh.

I've also looked into using ERIS as an intrinsic feature of operating systems, specifically for dynamic linking and dynamic loading. I had the problem that I want to make an operating system where you can start programs without having a file system, but I was using Nix to build these programs. You can do this on the Genode OS framework, which is a microkernel OS developed on the other side of the river here, and it works: you just widen some fields in the base library and tweak a few things here and there. You don't really have to go into the dynamic loader and make it understand ERIS; you just have to make it simple enough that it will go out to a backend that does understand ERIS. This is nice to experiment with, but it's never going to happen for glibc, because glibc is too complicated — just don't go there; stuff is going to break and people are going to be angry.

This works really nicely in Nix, because you just have to define a hook that you inject into your packages. Once the package is built, you go into the ELF header and patch the file paths to point to ERIS URNs: go into the header, collect all the dependencies that are listed there, make sure there are no cycles, replace those paths in the binary, and then write a file recording the final ERIS URN and all the mappings between the URNs in the binary and the file system paths of the build machine. Then, when you want to build the actual runtime system, you collect all these dependencies so you have a complete closure. The dependencies can be installed in their encrypted ERIS form, or you can just use ERIS as a key that you then look the binaries up with.
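As an illustration of the first step of that hook — reading the dependency list out of the ELF dynamic section — here is a sketch using the pyelftools library. It only reads the entries; actually rewriting paths in place is left out, and `lookup_urn` is a hypothetical helper, not part of any ERIS tooling:

```python
from elftools.elf.elffile import ELFFile

def collect_needed(path: str) -> list[str]:
    # Collect the DT_NEEDED entries (the declared dependencies)
    # from a binary's dynamic section.
    with open(path, "rb") as f:
        elf = ELFFile(f)
        dynamic = elf.get_section_by_name(".dynamic")
        if dynamic is None:
            return []  # statically linked, nothing to patch
        return [tag.needed for tag in dynamic.iter_tags()
                if tag.entry.d_tag == "DT_NEEDED"]

# mapping = {dep: lookup_urn(dep) for dep in collect_needed("./a.out")}
```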
But if you think about it, you're no longer really dynamically linking things, because the link is absolute, not dynamic — you're really just deferring linking. Which raises the question: what is the future of package management, how would ERIS fit into it, and what are the optimizations here? Of course, you can do nothing and just use Debian forever. But one proposal is called ALLVM, where you take all the packages in your system and compile them to the LLVM virtual instruction set rather than your native instruction set. You create one binary for everything, and the binary is multi-call: if the binary is called with argv[0] set to different names, it selects different behavior. So you make these giant binaries, and then you have a bootstrap system that starts a JIT compiler, and you just JIT everything. If you were using ERIS, you would take this mega-binary and try to align everything on 32 KiB, so you can hopefully deduplicate between different releases.

Less radical, more conservative, is distri, where all the packages are SquashFS images. Each package distributed in SquashFS form contains all the dependencies of the package, and when you install a package you fetch the SquashFS image and mount it at some path in the file system. If you wanted to store these packages in ERIS, you would want to use EROFS instead of SquashFS and make sure you align the blocks. Actually, EROFS without any tweaks already works pretty well, but that's what you would do.

So I'll do some demos. The first is how you would use ERIS just to produce checksums, then file encryption, and then I'll show a file-locker example. I have a utility called erissum, and I'll just take some of these Library Genesis PDFs. This acts like sha256sum or md5sum or b2sum — but I don't discriminate, so it also does the BSD output format. You can also use it for file encryption: there's a utility, erisencode, and you give it a file that the blocks will go into. So I can say "Hello, Datenspuren," encode it, and I get this URN; if I have the URN, I can decode the message. And if you need plausible deniability, you can pack other messages into the same block file.

Okay, and then the upload demo. Here I have a chat bot — a Tox bot — running, and if I send it a file... oh no, that's too big... I upload a file, and it encodes it into its own database of ERIS data and gives me a URN back. It has an integrated HTTP server, and when I make a request to the HTTP server it can decode the content. But if the bot doesn't log and the HTTP server doesn't log requests, then it's sort of a zero-knowledge file hosting service, which is relatively trivial to implement.

So then, open questions. I think it's worth questioning whether something like this makes it harder or easier for archaeologists in the future. If everything's encrypted, of course, that's a problem — we could potentially lose a lot of information this way. On the other hand, if everything's content-addressed, we may be able to put things back together. I don't know; it's something worth thinking about.

There is a legal problem: the EU has really got this obsession with child pornography, and they want to make it mandatory for — what is it — number-independent messaging services to scan for child porn. I think there's a loophole now so that you can do this without violating the privacy regulations, but they may make it mandatory, and as I showed with the upload example, that could be a problem in that case. And I think there's another problem: we say we don't like censorship, but I think we kind of like it, and we may go in a direction where
censorship-resistant software is not as popular. But we'll see. Anyway, we have a mailing list for the standard, which isn't finalized yet, but I think it will be soon. The code I've written is on Sourcehut, and I'm on IRC as emery, if you have any questions. But yeah, that's it.

Thank you, Emery. We have some time for questions, and we also now have a mobile microphone that can move around, so if you have any questions, just give us a signal. I was thinking of questions myself during the talk, and I think most of them were answered when you moved into the use cases. I have grown a bit more paranoid now, though — that scenario you described, giving specifically crafted files to each co-conspirator and then seeing who uploads them — that's an angle I hadn't considered. So that was quite interesting.

Regarding the legality that you touched on at the last point: if they really do make this illegal, then we just put all the data we want to store into some steganographic images. Life will find a way.

Yeah. Well, it's interesting, because an EU court threw out data retention rules — they said you couldn't retain data indefinitely, because you have an obligation to keep people's data private. So I think this scanning for child pornography will ultimately be illegal. But of course, it's legal until a court says it's not, so you always have this gap between which interpretation is valid. We have a question up there.

Maybe I missed the point a little bit, but I'm not sure about what you said regarding censorship — that there is no way to accomplish censorship using this method. As far as I understood, the URN is just some kind of reference, a pointer to some file. Why is no censorship possible? If, using this pointer, I can't reach the content, that could be censorship as well, couldn't it?

Yes, it would be. I mentioned censorship because if you are passing these URNs that reference data through an encrypted channel, then the data itself is safe, so long as the data isn't known outside your communications. But it's harder to censor the data itself, because it's content-addressed, so it's location-independent. And we want to develop this for sneakernets and things like that.
So in that case, I think you have robust protection against censorship.

I'd have two requirements: deniability for the host, and billability, so I can charge for it.

Deniability, I think, is difficult. I mean, you can salt the encoding, so it's harder to see what you might have stored, but deniability is hard — especially if it's write-once, read-many, there's no garbage collection, and you can't remove things. But charging for things, I think you could do. If you had the data-locker scenario, where you're just storing data for people, then you can charge simply in terms of how many bytes they're requesting, how many bytes they're pulling from you.

Just, that point needs to be trustless. I need to prove that I really stored it for a certain time, and they of course pay for that — they want me to prove that I really stored it.

Yeah, that's something I haven't really thought about — proof of storage. I don't know much about that. You'd have to look and see what other people are doing there.