Hi, I'm Holger Krekel, and I'm very happy to be here. I've been involved with the Python community since about 2001, and I've given quite a few talks at Python conferences, mostly EuroPython, but also some smaller ones. And I still get nervous in front of large audiences. Somebody once explained this very intelligently: imagine yourself in an archaic situation with lots of people staring at you. What does that mean? It probably means they're going to kill you, because in nature you don't normally have many people doing that. So I think this nervousness is very natural, and telling myself that is a way to soothe myself a bit: okay, that's how it is, and then I try to get over it. I'm sure those of you giving your first talk feel this even more strongly.

Apart from being heavily involved with Python, the PyPy project, pytest, and many other tools, in the last two years I've been going around other communities: mostly Node.js people, a bit of Erlang, a bit of Haskell, and also small meetups that focus not on one language but on concepts. I'd like to tell you some stories about what I learned in that time, under the general topic of a more effective and decentralized web. And for that I'm going to take you a bit into the past in order to talk about the future.

So, this is some real rocket science that happened almost 50 years ago: Apollo 11, I think, which landed on the moon. It was obviously rocket science, because you had to steer rockets. And the speed record for human travel still lies with those missions, something like 35,000 kilometers per hour; it hasn't been beaten yet. In fact, I'll argue later in the talk that around that time, in the 60s and 70s, the rocket-science advances actually slowed down. From 1685 on, the number of scientific papers doubled every 15 years.
You may relate this to Moore's law of doubling transistors: the number of scientific papers doubled every 15 years since 1685. But this leveled off beginning in the 70s; we don't have that increase anymore. I'm going to get back to why that is at the end of the talk.

So who was doing this rocket science? Who was programming the rockets to land and to fly? That was Margaret Hamilton. She was the lead programmer, and in the famous photo she's standing next to the source code, I guess written in assembler. She led the whole programming effort for steering the rockets. By the way, if you print out the high-level source code of Firefox or Linux, you get a stack something like 15 meters high.

Another interesting thing: in the 60s, more women than men were actually programming. Can any of you say why that changed? They marketed computers to men, yes. Actually, if you look into feminist research, there's a general observation that the more professionalization and money that flows into an area, and you all know a lot of money went into programming, the more men come to dominate it. Basically, programming at the time was considered by men to be a lowly thing, like typing on a typewriter, so it couldn't be that important; they didn't realize what programming actually meant. So at that time, Margaret Hamilton was one of the leading rocket scientists.

Another thing happened leading up to a big invention in 1974, and here I'm going to talk a bit about networking. In 1939, telephones had pulse dialing. When you have a phone number of several digits and you dial the first digit, it goes to a relay and switches a cable; the next digit goes to the next relay and switches a cable; and so on, digit by digit.
So dialing your phone number meant switching a physical cable across the continent to a certain endpoint, another telephone. In the US that was all operated by one company. They had this big network, were in total control of all the relays and hardware, and spent a lot of effort making those relays and the whole network very reliable.

Then, in 1974, a rocket-science invention changed things, and it was completely despised by most researchers, maybe 98% of them. They said it made no sense: "We are the researchers doing the real research into phone networks. This stuff, where you take a modem and use the telephone network to build an overlay network, first the ARPANET and then the Internet Protocol, cannot possibly be more effective, because it's using the same phone network. If you're doing something on top, how could it possibly be more effective?"

What the inventors did was assume that we establish connections between routers, which our phones connect to, and then send data packets. We don't switch a cable across the continent and then send audio over it; we chunk the audio into packets, and each packet gets a telephone number, but a higher-level one: the IP address, the Internet Protocol address. By doing that you gain a big advantage that wasn't obvious to most researchers at the beginning of the 70s: you avoid all the setup costs. Pulse dialing takes quite some time just to set up the connection across the continent. If you're already connected and you send a data packet with a virtual telephone number, the line is already there; you just send the packet on, and each router decides how to forward it. There are no setup costs.
The lines are already there, connected, and the model they envisioned was a very distributed network: lots of nodes, all interconnected. It came from the military, as you know; if some nodes go down, you can just route around them, because the nodes are aware of their connections, and if a packet arrives and some route is down, they can route it somewhere else. That was the idea.

Now, it turned out this was a bit of a hippie dream: everything is connected, everybody can talk to everyone, all the nodes are autonomous, and so on. What actually happened? Fast forward to 2009: there's a picture of internet traffic in the UK in 2009, where the larger the sphere, the more traffic goes to that telephone endpoint, because the IP network still uses the same idea as the original telephone network. You're still addressing some kind of endpoint, saying "connect me to, or get me some data from, this remote endpoint."

Why did this happen? Why don't we have an evenly distributed network, but instead lots of star networks where many people all access one "telephone"? I think the main reason is economies of scale. Companies realized, and Google was one of the pioneers here, that if I am the endpoint everybody talks to, I need to handle all the load, and I need to be very reliable, but I also get a complete overview of who is accessing what. In terms of tracking and injecting advertisements, that turned out to be very profitable. It was profitable because as you scale up and more people phone you, your costs do not increase the way your profits do from knowing more about people.
One guy, Adam Ierymenko, who has done a lot of practical network research, put it very clearly last year. The effect I just described means there is a complexity tax: the more nodes you have to handle, the more load balancers and so on you need, so more users means additional cost. But the complexity tax is regressive. Going from one user to 100 costs something; going to 10,000 adds more cost, but less per user; going to a million, less still per person; and at 500 million the per-user cost is even lower. That's why it's regressive: you pay less and less per additional user, while your income from advertising per user stays roughly constant. So economies of scale and the network effect mean the bigger you get, the more money you can make, and there's a tipping point past which you become very profitable. I'm going to get back to Ierymenko later; he describes this history of networking in much more detail.

Then there's Jeff Hammerbacher, who was one of the first Facebook researchers, among the first 100 employees, I think. He quit in 2011, looked around, and said in an interview: look at Silicon Valley; the best IT minds are focusing all their efforts on how to make people click more ads. And I think it's understandable: a lot of money was to be made working there, very good working conditions, and so on. But arguably it's not really rocket science. I mean, it's rocket science to a degree, but it's focused on one particular purpose; it's not about getting us into space, flying cars, whatever.
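The regressive complexity tax described above can be sketched in a few lines of Python. This is a toy model, not Ierymenko's actual figures: the cost exponent 0.7 is a made-up illustration of "cost grows sublinearly with users", while revenue grows roughly linearly.

```python
def cost(n_users, exponent=0.7):
    """Hypothetical infrastructure cost for n_users (arbitrary units).

    The sublinear exponent is an assumption for illustration only.
    """
    return n_users ** exponent

# Per-user cost shrinks as the user base grows, while per-user ad
# revenue stays roughly flat -- that gap is the profit at scale.
for n in (100, 10_000, 1_000_000, 500_000_000):
    print(f"{n:>12} users: cost per user = {cost(n) / n:.6f}")
```

Running this shows the per-user cost falling by orders of magnitude as the user count grows, which is the tipping-point dynamic the talk describes.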
Okay, so what we ended up with is what I call million-to-one architectures, or maybe even billion-to-one architectures. We spend a lot of effort on making very scalable websites, on making the complexity tax as regressive as possible, and lots of startups are trying to tap into this and become the mediator of communications. The thing is, people want to connect to each other. The future is distributed in the sense that we want to be able to talk to whomever we want: to watch videos on YouTube, to interact on Twitter, peer-to-peer in the social sense. On a social level we are interacting among ourselves, but there's this mediator in the middle that monetizes these interactions; the whole network effect is driven by people's desire to interact in a distributed and self-determined manner.

And I think, to a degree, the thing that is very big in the Python community currently, big data, the whole NumPy/SciPy stack, which is getting a lot of us good jobs, is also focused on computing over large amounts of data collected from human interactions. That's just a description; you can probably see that I'm a bit critical of this development, but it is also simply what has happened so far.

Now let's go into space again. Within 11 years, Elon Musk, who also does Tesla cars and many other efforts, wants to get humanity to a Mars base; he wants to have a Mars station there. And now I ask you: do you think that TCP/IP and HTTP, the good old protocols, HTTP is about 20 years old and IP about 41, are going to work on Mars? Can you use Gmail from Mars as a web app? Patience. Patience. With a lot of patience, yes. I think it would revive the idea of email.
So it's clear that if you think about networking, and about rocket science again and going to Mars, then the protocols we have are not going to work. Now you might say: well, we can just find a special solution for Mars. We do something else: good old email servers and a bit of chat, and when people chat inside the Mars colony, they just talk directly to each other. You forget about HTTP and DNS and SSL and all that; you just don't use it. That's certainly a somewhat pragmatic solution.

But I argue that we have Mars already, here in Bilbao. We have Mars in many places in the world where connectivity is bad: the more people connect, the less bandwidth you have, things disconnect, packets don't go through, and so on. And we're getting more and more devices everywhere. In 1981 there were a few hundred computers on the internet; now there are billions. And we're still using the good old telephone model, a bit higher-level than the pulse dialing of 1939, but still the same model. Except it's not actually true anymore: when you connect devices, whether Apple devices or all kinds of Internet of Things gadgets, there are all kinds of protocols spoken locally. You have synchronization mechanisms between your mobile and your laptop, and they don't go via HTTP to somewhere very remote and synchronize there. Although sometimes you do have to: when you meet someone and want to transfer a file, it somehow seems more practical to go all the way up to California and exchange the file there, instead of just sending it across the room. But Apple especially, I think, is doing more and more in this area of peer-to-peer syncing and discovering each other on local networks, because it's just much more effective. Uploads are very bad, basically.
Usually you have big download links but very slow upload links, so transferring a big file through a small upload link is very annoying and ineffective. But these synchronization mechanisms are, a lot of the time, proprietary, and there's no web standard for them.

So there is a group, mainly in Berlin I think, that says: when we went from a few hundred devices to billions, maybe things changed. Our endpoints got more and more powerful; they have more CPU and lots of local connectivity. Maybe this model of phoning around the world all the time is not really how we want to organize data. That's something going on a lot in the mobile world, where Python is not a big part, but it's happening there: people say, I want a local application that works even when it's offline, even while I'm chatting, and at some point I get a connection and then I synchronize. It's not the client-server model of constantly phoning somewhere; it's about thinking offline-first and thinking about synchronization.

If you look at a number of very successful projects from the last 10 years, many use synchronization and replication mechanisms that don't follow this client-server paradigm. Many of you use Git, for example, and Git is not like Subversion was before, constantly contacting a server to find things out. Git said: our laptops are fast enough and have enough storage; we can store the whole copy, do everything locally, commit locally, and just synchronize later. Offline first. And BitTorrent said: what's the point if 50 of us here are now downloading some video stream of Guido's keynote from yesterday?
What's the point of everybody making that phone call back to California to get the video? We have the data here; we're just not talking to each other. BitTorrent organized a network distribution protocol: when I access a remote resource, I chunk it into blocks, compute hashes, and register that I can provide those hashes. The next person comes along and says: I need the data for these hashes, who has them? Then we start talking to each other: one person downloads, and the others get it locally. That's a much more efficient network protocol, and at one point BitTorrent accounted for 50% of internet traffic. Git is certainly well known; I won't talk much about the others, but they use similar techniques.

Do you know what the central data structure behind virtually all of these techniques is? Sorry, again? Merkle trees, exactly. The Merkle directed acyclic graph; I'll talk about this in a second. Merkle trees are a way to chunk data into objects, and then, instead of giving each one a mutable name, you compute a cryptographic hash.

A short recap on what a cryptographic hash is: it's a computation that takes an arbitrarily long chunk of data, performs some computation, and out comes something like 32 or 64 bytes, which act as a unique identifier in the following sense. You can always easily compute the hash from the data, but given the hash, you cannot reconstruct the data. It's a one-way computation, and there's cryptographic mathematics behind it that makes this possible. Of course, there have been weak hashes, and designing secure hashes is a research topic in many places, SSL and elsewhere.
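The one-way property just described can be demonstrated with Python's standard library. This is a minimal illustration using SHA-256 from `hashlib`; the input string is arbitrary.

```python
import hashlib

# A cryptographic hash maps arbitrarily long data to a short, fixed-size
# digest. Computing data -> hash is cheap; recovering data from a hash is
# computationally infeasible; and any change to the data changes the digest.
data = b"some arbitrarily long chunk of data"
digest = hashlib.sha256(data).hexdigest()

print(len(bytes.fromhex(digest)))  # SHA-256 digests are 32 bytes

# Changing even one byte of the input yields a completely different digest.
other = hashlib.sha256(b"Some arbitrarily long chunk of data").hexdigest()
assert other != digest
```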
So we do have reasonably safe, fast cryptographic hashes. Given that, we can chunk our data into blocks and compute each block's hash. Then, in a tree, we can have a hash of two hashes, and finally a root hash that combines all of them. When I modify one data item, only the path up to the root changes: if I modify data block 004, then the F hash changes, the B hash changes, and the root hash changes, but the rest of the tree, the A, C, and D hashes, stays the same. It turned out that this data structure is what Git uses, what Riak, a replicated database, uses, and to a degree what BitTorrent and many others use. But this all lives in implementations; there is no web protocol for it.

Allow me, please, a little personal comment here: Merkle is not the same name as the German Chancellor's, and I'd like to take the opportunity to say that I totally disagree with the politics of Mrs. Merkel.

Another development: if you think about the Merkle tree, it's an immutable data structure. As soon as you change something, you get a new root hash, just like in Git: a new data structure, basically, and the root reference you hold identifies your complete tree. This concept of immutability has been around for many years; functional languages used immutable data structures back in the 80s and 90s. But because of scalability, replication, synchronization, and similar tasks, immutable data structures have become very popular. In virtually every language that gained popularity or was invented in the last 10 years, immutability is king: you switch from a mutable situation to an immutable one, and then you build mutability on top. In JavaScript it's very, very popular, I found out; Facebook made a library called Immutable.js.
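The "only the path to the root changes" behavior can be sketched in a few lines. This is a naive binary Merkle tree for illustration, not Git's or IPFS's actual object format; the block contents are made up.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper: SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Root hash of a binary Merkle tree over the given data blocks."""
    level = [h(b) for b in blocks]          # leaf hashes
    while len(level) > 1:
        if len(level) % 2:                  # duplicate last hash on odd levels
            level.append(level[-1])
        # each parent is the hash of its two children's hashes
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block-001", b"block-002", b"block-003", b"block-004"]
root_before = merkle_root(blocks)

blocks[3] = b"block-004-modified"           # change one leaf...
root_after = merkle_root(blocks)
assert root_after != root_before            # ...and the root hash changes
```

The unchanged leaf hashes can be reused between versions, which is why Git, Riak, and BitTorrent-like systems can synchronize by exchanging only the parts of the tree that differ.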
There are also very good, I think well-tested, Python libraries like Pyrsistent, which provides an immutable dictionary, an immutable list, a set, and so on. When you add an item to an immutable dictionary, you get a new dictionary object, but it's stored efficiently internally: it's not a full copy of the whole thing; it only takes the part of the tree that actually changes. And the effect of immutable data structures is not only that it becomes easier to reason about a whole data tree; programming also gets safer, because when I pass an immutable data structure to some library, I can be sure it cannot do any damage: it cannot just mutate my dictionary. So this idea of immutability is, I think, very beneficial apart from replication: for arranging an application, for designing APIs, for getting some certainty. And you can do this today, in Python and in JavaScript; other languages have more first-class support for it.

You know, one of the reasons I fell in love with Python in 2001 was this great idea of namespaces. I came from C++, and when I worked with Python's introspective abilities I saw: the module has a dictionary that maps names to class objects and function objects. I can go to a function object, which has a dictionary of function attributes; I can go to a class, which has a dictionary mapping names to function objects, the methods. It was basically namespaces all the way down. And I thought that was great: I could get a complete perspective on how the whole code relates to itself. It was certainly one of the reasons I became very interested in writing a Python interpreter myself, and one of the motivations behind co-founding and contributing to PyPy.
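The update semantics that Pyrsistent provides can be sketched with plain dictionaries. This naive version copies the whole mapping, whereas Pyrsistent shares structure internally for efficiency; the point here is only the behavior: the original is never mutated.

```python
def assoc(d, key, value):
    """Return a new mapping with `key` set; the original is left untouched.

    Naive sketch of persistent-map semantics (a real persistent map
    like Pyrsistent's PMap avoids the full copy via structural sharing).
    """
    new = dict(d)
    new[key] = value
    return new

m1 = {"a": 1}
m2 = assoc(m1, "b", 2)

assert m1 == {"a": 1}            # the original mapping is unchanged
assert m2 == {"a": 1, "b": 2}    # the "update" is a new object
```

Because `m1` can never change underneath you, any function you pass it to cannot do damage, which is exactly the safety argument made above.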
Now, what has bothered me for the last 14 years is this statement from the Zen of Python, I don't know if it was Guido or Tim Peters: "Namespaces are one honking great idea -- let's do more of those!" I always wondered: what is this "more of those"? What could it be? Every year, a couple of times, I've thought about it: what more can we do there, what's the next step? And I think mutability versus immutability gives a clue. Thinking about immutable data structures and immutable namespaces could provide a real advance in what namespaces mean and how stable they are. If I have a reference to a namespace that recursively contains data structures, then this reference can represent a complete snapshot, like a Git commit, of all the contained namespaces. It's totally stable; nobody can change anything about it while I hold this reference.

I think it's worthwhile to try some rocket science here and find out how this could be beneficial. For example, suppose many Python applications used immutable data structures. It might well open up possibilities for removing the GIL, because if you know some code is working on an immutable data structure, and you're in a function working with local variables, you know nothing can modify the base you're working on. It's a vague idea, not a thought-out thing, and I'm happy to discuss whether it's possible. But even if that doesn't work out, working with immutable data structures, and maybe also namespaces, is beneficial for writing larger applications and for reasoning about a program.

So, finally, I'd like to introduce you to the InterPlanetary File System, IPFS. I got to know it two months ago.
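The "namespaces all the way down" idea from the previous section is easy to see for yourself with a few lines of introspection. The `Greeter` class here is a hypothetical example, not from any library.

```python
# Modules, classes and instances are all namespaces: dictionaries
# mapping names to objects, which you can introspect directly.
import json

class Greeter:
    lang = "en"
    def greet(self):
        return "hello"

# Module namespace: the module's functions live in its __dict__.
assert "dumps" in json.__dict__

# Class namespace: methods and class attributes live in the class __dict__.
assert "greet" in Greeter.__dict__
assert Greeter.__dict__["lang"] == "en"

# Instance namespace: per-object attributes live in the instance __dict__.
g = Greeter()
g.name = "world"
assert g.__dict__ == {"name": "world"}
```

An immutable namespace, in the sense speculated about above, would make each of these `__dict__` views a frozen snapshot rather than a live, mutable mapping.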
I talked to the lead developer, Juan Benet, in Berlin, and some of the thoughts I'm presenting here are motivated by his storytelling. IPFS aims to bring all the goodness we have in proven implementations of synchronization mechanisms into a new hypermedia protocol that substitutes for HTTP.

So let's look at the difference between the two. An HTTPS URL, with the location that follows, quotesbig.net, and then the path, is a phone number. I'm going to some destination; I want a secure conversation, so I do SSL and all kinds of things; I phone some computer that is somewhere, ask it to authenticate itself, and then I access the path and see what's there. And that's an inherently mutable operation: everything in between can change. The server can serve a new file, the location can resolve to a different IP address, the SSL certificates can change, and so on. So on one hand it's a phone conversation; on the other, it's a very mutable thing.

Now contrast this with the basic new idea that IPFS introduces. Instead of HTTP, we use a new protocol, IPFS, with a content hash that points to the root of a Merkle tree. In that Merkle tree, just like in Git, you can have directory objects, file objects, and so on. But once you have the content hash, you don't need to phone anywhere. I can ask any of you: does anybody have the data behind this hash? I don't need to make a phone call and verify that it's the correct sender; I can just ask around. So instead of addressing a location, I'm addressing data.

Of course, the question is: where do I get this content hash from? I said it's something like 32 or 64 bytes, and we already have trouble remembering phone numbers. So IPFS needs to introduce mutability on top of immutability.
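The difference between addressing a location and addressing data can be made concrete with a toy content-addressed store. This is a hypothetical sketch, not the real IPFS API: here `store` stands in for whichever peers happen to hold the data.

```python
import hashlib

# Location addressing asks: "what does host X serve at path P right now?"
# Content addressing asks: "who has the data whose hash is H?"
store = {}  # stand-in for "any peer on the network"

def put(data: bytes) -> str:
    """Store data under its own hash; the hash IS the address."""
    addr = hashlib.sha256(data).hexdigest()
    store[addr] = data
    return addr

def get(addr: str) -> bytes:
    """Fetch by content hash; the hash lets us verify we weren't cheated,
    so it doesn't matter which peer answered."""
    data = store[addr]
    assert hashlib.sha256(data).hexdigest() == addr
    return data

addr = put(b"hello, decentralized web")
assert get(addr) == b"hello, decentralized web"
```

Because the address commits to the content, any peer at all can serve the data, which is exactly why no phone call to a particular authenticated server is needed.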
And IPFS is, again, an overlay network, just like IP was an overlay over the phone network. So we can continue to use the Internet Protocol, and for the time being, until we have something better, even the DNS system. If you have an IPFS daemon installed, you can use the first link, the codespeak.net one: it triggers a DNS lookup that gets you, in this case not directly an IPFS link, but a so-called SFS link, because I have an identity, a public key, and when I publish mutable names, I sign them. The mapping from name to immutable object is signed by my public key; that's the SFS link. So the second link you see is still a mutable link, and the third link, the IPFS link, is an immutable one, because now I have a concrete content hash, and using the IPFS protocol I can ask around for the data. If you install the current binary, where IPFS and IPNS are implemented, you can use the IPFS protocol already through your localhost daemon, which handles all the connections.

Now, just like the internet, the IPFS architecture is an hourglass model. I didn't get around to making a nice drawing here, certainly not as good as the ones on Monday. The point is: with IPFS links as the new hypermedia protocol, the web continues to function. HTML, CSS, all this great stuff keeps working, but instead of phone numbers, HTTP, SSL, and all that, you can, that's the idea, use IPFS and IPNS, and suddenly we get all the goodness of being able to talk to each other. Naming is currently done by the self-certified naming system, this public-key idea. I'm going to run a session tomorrow where I can show more and we can play with the details; it's not something you can describe in a minute or so. And the central data structure is the Merkle DAG.
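The "mutability on top of immutability" idea, a signed mapping from a human-friendly name to the current content hash, can be sketched as follows. This is a toy: real IPNS uses public-key signatures, while this stdlib-only sketch substitutes an HMAC as a stand-in "signature", and the key, name, and hash values are all hypothetical.

```python
import hashlib
import hmac

SECRET = b"publisher-private-key"  # stand-in for real asymmetric key material

def publish(name: str, content_hash: str) -> dict:
    """Sign a 'name -> current content hash' record so peers can verify it."""
    msg = f"{name}:{content_hash}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return {"name": name, "points_to": content_hash, "sig": sig}

def verify(record: dict) -> bool:
    """A peer accepts an updated mapping only if the signature checks out."""
    msg = f"{record['name']}:{record['points_to']}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["sig"])

rec = publish("codespeak.net", "some-content-hash")
assert verify(rec)                 # peers accept the signed mapping

rec["points_to"] = "tampered-hash" # a forged update...
assert not verify(rec)             # ...is rejected
```

The mutable name can be re-published to point at each new root hash, while every individual content hash stays immutable, which is the split between IPNS and IPFS links described above.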
At some point you get the content hash, and then you have the same data structure Git uses, which also means you get a versioned web. On some level, what this offers is the same step you take when you move from working on source code on a shared network drive, without any versioning, to version control, where you get much more precise references: who changed what, what the previous version was, and so on.

The exchange of the actual data blocks, the leaves of the Merkle tree, is done by BitTorrent or a BitTorrent-like mechanism. So again, I don't need to make a phone call somewhere; I can just ask around: does anybody have this data here? For finding peers, IPFS currently implements a classic data structure called a distributed hash table, and also multicast DNS, which should, for example, work in this room. So there are two different ways to find peers who can give you something, and multicast DNS is also used by Apple, for example. You get a globally distributed network, just like file sharing had 10 years ago, and you also get local connectivity to the peers around you. And you don't need to worry, because when you ask for a data block, you have the hash, so nobody can cheat you. Compared to HTTP today: if I asked some random nodes for the contents of a Google search, I could get anything, right? I don't really know whether the answer is authoritative.

The routing, the distributed hash table, is going to be explained by Nicholas Tollervey, over there, this afternoon, at quarter past three or so I think; that's why I'm not going into it. He programmed one in Python and knows quite a bit about this.
So we have this kind of architecture, and for networking we, for the time being, just use the Internet Protocol, because, well, it's 41 years old, it's a phone protocol, but it's there, everybody has it, and we have all kinds of software stacks for it.

I'd also like to introduce you to another thought from Adam Ierymenko. He's a practitioner in building distributed networks, and he saw a problem: when you build a peer-to-peer network, like the one Nicholas is going to describe this afternoon, one question is, what are your initial peers? Whom do you start talking to? Now think about the offline-first situation: everything is battery-powered, people are constantly switching on and off, and nothing is as reliable as the 1939 phone network. In that situation there's a trade-off: if you disconnect for two days and then reconnect and everything has changed, it takes a long time to regain any kind of reliable connectivity to your peers. He says that's the reason why, in practice, many peer-to-peer networks, just like Skype, introduce stable nodes that are always there, that are operated. But then, of course, these nodes are owned by somebody, and they can actually disrupt the network. The question he puts up in his, I would say famous, blog post "I Want to Believe", because he wants to believe in complete decentralization but says in practice it doesn't completely work, is a rocket-science question: can we construct a "blind idiot god" of the internet? Stable nodes, but built so that they don't get to see very much: you have the pyramid, but without the eye. The nodes coordinate things, but as a node you don't really know what's going on. Part of that is of course end-to-end encryption and such, but then there's the metadata question.
Lastly, I'd like to cite a very famous inventor and researcher, Buckminster Fuller. I think instead of blaming Google or Facebook, who have also been doing great stuff, provided great services and open source, and brought many of us into the position of earning good money, so attacking them and saying something like that is probably not a good idea, I agree with Mr. Fuller that the better way to get a new system is to just build it, and let it be better. IPFS has the chance to be simply a much more effective protocol for exchanging data. You can implement Git on top of IPFS; you can implement Bitcoin on top of IPFS; and people are doing that. You can implement all kinds of things on these new protocols, and it's also going to help get actual new protocols, not just good implementations, into the web, so we don't depend on proprietary ones.

Finally, I said I would get back to the discussion of the rocket-science situation: Margaret Hamilton programming the rockets to the moon, and the fact that the number of scientific papers stopped growing exponentially. We certainly still have a lot of innovation, and have had since the 70s, but David Graeber makes a very good historical argument, with lots of data and facts, that innovation has in fact leveled off; something like real rocket science is not really happening anymore. Universities are much restricted by which companies give them money, there's a lot of administrative overhead, he talks about bureaucracy, and so on. He says the last 50 years haven't actually brought the advances people in the 60s thought would happen, compared to what we had between 1910 and 1960 or 1970: electrification in many places where it didn't exist before, or people flying at the highest speed they ever
flew but then we stopped any kind of space program and many other examples kind of contradict the idea that we are living in a time where everything just exponentially grows all the time it grows in very certain directions like you have new million-to-one architectures like new technology that actually helps there but in terms of getting an overall picture and like making real technical advances that don't relate to monetization but just make everything more effective not so much is happening so I think the situation that we had in the 70s where very few people invented the IP protocol and brought a lot of change might be actually happening again with people like IPFS maybe it's not IPFS but something along the lines combining these ideas turning this into a web standard and also using this for all kinds of languages in a non-proprietary open source way is hopefully the future Thank you very much Is there any question? Please tell me I convinced you that cannot be the praise Hi Holger, thanks for the talk I just didn't quite understand the role of the blind idiot God Could you explain that again for me? 
Yes, the blind idiot God the idea is that if you have a peer-to-peer network like the one that Nicolas is going to explain in the afternoon then you have basically lots of peers and you segment these peers into sections and then everybody talks to their neighbors and all kinds of stuff but these nodes that are everywhere there's like millions of nodes and you just have contact to a few nodes so if you want to address something like you want to store something in the distributed dictionary that's also what IPFS is doing then you need to conduct these maybe 20 nodes that you don't know about but now if you turn online because your battery is gone or the wifi is gone or something and then you turn online again then many of these nodes might just be gone so you try to contact them and you don't get any connectivity and then maybe the 19th node actually knows something but that node is only responsible for a very small subset of the whole network so the blind idiot got idea says that how can we do stable nodes that are operating all the time so we have some kind of we have low latency you can just directly ask them you know they're going to be there and they can quickly give us information about the shape of the whole situation but how can we do this without leaking too much knowledge and stable nodes because I mean Google and CDNs they are actually internally of course completely distributed networks like if you look up the IP address of google.com you get 20 addresses and of course Google has in their network a complete coordination and they distribute all the information across the globe so you actually have this you have this architecture you have these central nodes that are stable and that tap into the actual distributed network that Google is operating but they are actually not blind because we actually connect to them and they provide the applications so if we then do something like if we then do something like having end-to-end encryption then the nodes in the 
middle they don't really know that much anymore but it remains unclear because you have metadata information still who talks to whom so basically the construction of something better that doesn't leak so much knowledge in the middle and the mediators that's the idea of the blind idiot god but I guess that Pope Francis which I cited at the first slide wouldn't approve of Yes, question here I like actually very well the idea and I think it makes a lot of sense for the showcase the example you said like transferring data, big videos and so but I wonder actually because we have also a lot of so you presented something that would replace basically HTTPS and the current protocols but we have a lot of things coming up with API requesting things going live having very short ping, very short latency a lot of very small calls and so and I guess it's not really adapted to that so do you see like parallel, do you see some adaptation possibilities? Yes, I think it's a very good point I mean how do you actually talk to entities and they have some storage and you have, you access some API because you modify resources and stuff basically I don't think that something like IPFS is going to replace everything you know even now it's using for having human readable names it has users currently DNS and I think it's going to coexist for a long while the like I'm using this presentation is actually I'm not sure if you can see that but it's served from my local IPFS demon you know and I'm using HTTP here so it's all work in progress I mean people are working on this they're thinking about the problems and they're trying to make a stable implementation and they're using currently HTTP for example one of the things is can you do a Firefox plugin that allows to on the one hand allow you to use IPFS in your web page but if you don't have it you have some kind of pulley fill that actually redirects to some available nodes because the IPFS network is also currently relayed into the HTTP DNS 
network under gateway.ipfs.io so you can actually just access the central sites and they provide you a view into the whole situation so I think it's going to be for a long while it's going to be a hybrid situation and the question about API access well the thing is that our endpoints are so powerful and like in 10 years our phones and laptops and so on will be very very powerful I think somebody computed that in 50 or 60 years our mobile phones will be able to do what the NSA data center is doing today I mean the more intelligent these devices get the more you don't want to actually delegate stuff somewhere else and just exchange data and you get into architectures where we can just talk to each other and arrange some state change in a very secure manner because you have like versioned state changes and it's a bit hard to actually foresee in what time frame this will I mean Elon Musk says we have 11 years left and we already have 41 years of IP so you need to get rolling Hi From here it seems that IPFS is for distributing public data but currently if I put something private on the internet then probably Facebook or Google or NSA can see it but I'm maybe more concerned that my competitor or my ex doesn't see it Are there any solutions to this problem? 
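A side note on the gateway relay mentioned above: the mapping from content hash to HTTP is just a URL scheme, so it can be tried from any language. A minimal sketch in Python, assuming the public gateway.ipfs.io relay named in the talk (a local daemon typically serves the same paths):

```python
from urllib.request import urlopen

# Public relay from the talk; a local IPFS daemon usually exposes
# the same gateway paths on http://127.0.0.1:8080.
GATEWAY = "https://gateway.ipfs.io"

def gateway_url(content_hash, gateway=GATEWAY):
    """Content addresses map onto plain HTTP as /ipfs/<hash>."""
    return f"{gateway}/ipfs/{content_hash}"

def fetch(content_hash, gateway=GATEWAY):
    """Fetch immutable content by hash. Because the address is the
    hash of the bytes, any honest gateway returns identical data."""
    with urlopen(gateway_url(content_hash, gateway)) as resp:
        return resp.read()
```

This is what makes the hybrid situation workable: the same hash resolves through the peer-to-peer network when you run a daemon, and through a central gateway when you don't.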
Yes, good question. I have to admit that, if I think about it, the way I would do this now is: through IPFS I publish some contact points, and then I would currently use HTTPS to perform some kind of authentication, so that I know I'm talking to the correct endpoint, and I provide my authentication credentials to that endpoint and build on top of that. So I would still get my private data over HTTPS/SSL as we currently have it. And I think it's true: IPFS and IPNS are first of all something for public data, and I'm not sure what is going on in terms of arranging end-to-end-encrypted device-to-device communication. Say my mobile device wants to talk to the mobile device of my partner somewhere in Germany. I really want to talk to that device, because that's where she is chatting and I'm chatting, so on some level it's fine that this is essentially a phone call; I really want to talk to that specific device. In that case I would just use IP, or some overlay network on top of it that provides benefits.

I have the microphone already, so thank you for the talk, my name is Wojtek. It's a really great concept, this InterPlanetary File System, but there is a part that I'm obviously missing. It seems like the principle is chunking: you partition your data into smaller chunks, calculate a cryptographic hash, and then, to reduce the amount of information exchanged between the nodes, you look for chunks that already exist in the local network, which is like looking up locally cached data. But you can go to extremes: if you chunk the data down to the level where each integer has its own hash, then instead of sending the number one you would be sending its hash. Is there a sweet spot, or any research on how low you should go, on what amount of data you should be chunking into?

If you have a large data blob and you PUT it into the localhost IPFS daemon, it chunks it for you, using a chunk size that people have found to work, just like the default window sizes in TCP/IP: there are defaults and they are used. It's a higher-level structure; you basically don't have to think about how it's done. Okay, content-defined chunking, okay, so you see, I don't know everything about what's going on; Thomas there knows more. I think this is all very much in flux, which numbers make sense, and you need real networks to actually see the effects of how you do your chunking and your BitTorrent-style exchange. I've been using it for a while now and it has worked very smoothly for me, but that's certainly a big topic, and there's an IRC channel, #ipfs on Freenode, where you can ask such questions. Sorry, too much information.

Hello, I'm here. Continuing with the last question: DARPA has done a lot of research on delay-tolerant networks, I don't know if you know about that. There's an RFC, the Bundle Protocol, RFC 5050, there's a lot of research, and there are even some fools like me who did a PhD on that kind of stuff. These bundle protocols, the delay-tolerant networks, focus on not having an end-to-end path from one point of the network to the other, for example with satellites. What?
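The chunk-and-hash scheme discussed in this exchange can be sketched as follows. This is a toy version with fixed-size chunks only (IPFS defaults to 256 KiB blocks and also supports content-defined chunking); the hash-per-integer extreme from the question is visible here too, since a 64-character hex digest costs far more than the single byte it would replace:

```python
import hashlib

def chunk(data, size=256 * 1024):
    """Split a blob into fixed-size chunks (256 KiB mirrors the
    IPFS default block size)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def address(blob):
    """Content address: the hash of the bytes themselves."""
    return hashlib.sha256(blob).hexdigest()

def put(data, store, size=256 * 1024):
    """Store each chunk under its own hash and return the hash list.

    Identical chunks map to identical hashes, so they are stored
    (and transferred) only once -- the de-duplication that makes
    looking up chunks 'like cached data' in the question."""
    hashes = []
    for c in chunk(data, size):
        h = address(c)
        store[h] = c
        hashes.append(h)
    return hashes
```

The hash list itself is what a higher-level object (a file node in the Merkle DAG) would record; reassembly is just concatenating the chunks looked up by hash.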
You don't have a connection between all the satellites all the time, or, for example, on mobile phones you lose the network connection, that kind of thing. How do you see all these protocols fitting together?

I think that really relates to how you build a stable distributed hash table, DHTs, which all of the file-sharing protocols and some other protocols have been using. There has been a lot of research. Last month I met David Dias, a researcher in Lisbon who deals with exactly these questions: how can you make this effective when the default case is things disconnecting and reconnecting? There has been a lot of research and actual practical work over the last 10-15 years, but the main change with IPFS is that it ties all of these ideas together. It takes proven ideas, all around Git and Merkle DAGs, and tries to get the whole stack lined up so everything works together in a very stable way. That's exactly the challenge, and it's probably not totally workable right now, but it's a very active project. It's not like the whole world is working on it, just as in the 70s; it always depends on people actually doing things. Regarding Python: there is a Go implementation, there is an upcoming Node.js implementation, and there is a Python skeleton implementation, basically nothing. So it would probably be interesting to start getting some of these things working in Python, because then you can start using it. I want to use it for my own projects as well, so for now I would just run the daemon written in Go, interface with it from Python, and use the network that way. But of course it would be nice if you could embed it directly into Python applications.

Last question: don't you see a problem with versioning the web? I mean, sometimes it's good if you can see how people have updated their terms and conditions of an application.

Yes. Versioning is not enforced. If you have a self-certifying namespace and you publish a new name-to-content-hash mapping, then, like in Git, you would have something like a commit object that points to the last commit. But you don't have to do that: you can just make a brand-new name that doesn't contain the history of the old one, so it's forgotten. Versioning is not enforced; it's still your choice, when you create the data, how you want to do it.

Okay, thanks again. From the organization, we want to thank all the keynoters, and there is a present I want to give you.