To complement what we have studied so far, let us look at one example where the game-theoretic ideas discussed in the previous modules are used in practice: peer-to-peer file sharing. The slides are adapted from a similar course, CS186 at Harvard University, and the images I am going to use in these slides are from the book by that course's instructors.

So, what is peer-to-peer file sharing, and how is it useful in practice? On the left-hand side we have shown the traditional client-server model for downloading files, and on the right-hand side we have peer-to-peer file sharing. In the traditional model there are typically one or a few servers and many users. Because each server has a fixed uplink bandwidth, the download rate per user goes down as the number of users increases. In peer-to-peer file sharing, there is no fixed server and no fixed client. Each user plays a dual role, acting as a server as well as a client: whenever users get some pieces of a file, or a whole file that they have downloaded, they also act as a server and share it with the other users. That is the principle of peer-to-peer file sharing. In an ideal peer-to-peer system, if the number of users increases, the number of servers and the number of clients increase proportionately, and there is no effect on the download rate per user. That is one of the advantages of peer-to-peer, and it is used in various applications.

In this slide, we are going to look at some of the advantages of peer-to-peer file sharing and some terminology that we will use in the rest of the module. The first advantage, as we have said, is that if the number of users increases, the number of servers also increases, and therefore the system is scalable.
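To make the scalability argument concrete, here is a minimal sketch of the per-user download rate in the two models. It assumes an idealized setting that is not spelled out in the lecture: a single server whose uplink is split evenly among the users, and a swarm in which every peer contributes the same uplink. The function names and the numbers are illustrative only.

```python
def client_server_rate(server_uplink: float, num_users: int) -> float:
    """Per-user download rate when one server's fixed uplink is split evenly."""
    return server_uplink / num_users


def ideal_p2p_rate(peer_uplink: float, num_users: int) -> float:
    """Per-user rate in an idealized swarm (assumption: every peer uploads
    at peer_uplink, and the total capacity is shared evenly)."""
    total_upload = peer_uplink * num_users  # capacity grows with the swarm
    return total_upload / num_users


if __name__ == "__main__":
    # Hypothetical numbers: a 100-unit server uplink, 1-unit peer uplinks.
    for n in (10, 100, 1000):
        print(n, client_server_rate(100.0, n), ideal_p2p_rate(1.0, n))
```

In this toy model the client-server rate shrinks as 1/n, while the idealized peer-to-peer rate stays constant, which is the sense in which the lecture calls the system scalable.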
The download rate is not seriously affected even if lots of users are trying to download files. The system is also failure resilient. Imagine a situation where there is only one server: if that server goes down, nobody can download the file. In peer-to-peer file sharing, every user has some copy of the file, so even if some of the peers disappear or go down, the other users are very likely still able to download the file from some other peers.

Now some terminology. In the peer-to-peer world, by protocol we will mean the rules by which messages can be sent and actions can be taken over a network. We will discuss a few peer-to-peer protocols, and each of them lays down the rules for how messages are exchanged and how the different users act. A client is a process that sends messages and takes actions. Of course, a client is also a user who is trying to download some files, but at a more abstract level it is the process that generates those messages and takes those actions. Oftentimes we will also refer to a reference client, which is the standard client prescribed by the protocol. This is a very specific implementation, and whenever we talk about game-theoretic properties, we use the reference client as the benchmark and ask how other clients can behave differently. Peers are the individuals who run these clients, using or misusing the protocol in some way; in some sense they are the players in this game-theoretic setup.

Let us look at the historical development of peer-to-peer technologies. The first peer-to-peer technology to come up was Napster, which came about in 1999 and closed down two years later. It had a centralized database that listed all the files, and the users could download music from each other.
The database kept track of who had which files; that was the purpose of maintaining it. Whenever some user was trying to find some music, they did not search among the other users directly. They asked the database which users had that music, and then downloaded it from them. Napster closed down primarily because some of this music was copyrighted, and sharing it peer-to-peer violated those copyrights.

The second system, which was relatively more popular and is still used even today, came about in 2000 and is called Gnutella. How does Gnutella work? A peer first gets a list of IP addresses of other peers from a set of known peers. So there is no server anymore; the peers get the addresses from other known peers. And how does the file sharing work? If a specific peer A wants a certain file, it broadcasts a query to its known peers, and this query is answered by some other peer, say B, if B has the desired file. A is the requester and B is the user who has the file. Once B has responded, A can download the file directly from B. But notice that A downloads the whole file in one go.

Now, if you look at Gnutella in the light of game theory, and let us assume that there are only two players, each player can either share or free ride. Free riding means that the player does not share any file and only downloads files from others. The game matrix shows how much utility, or benefit, the players get under each strategy profile. Imagine that the two players hold complementary files: player one is trying to download, say, a music file, and player two is trying to download some other file, say a book.
If both of them share, both get some positive payoff, denoted by 2: each player's utility is 2. If both of them try to free ride, neither shares anything, so neither gets any file. The interesting case, as before, is when one of the players free rides and the other shares. The player who shares gets a negative payoff, because its upload bandwidth is being used up while it does not get the file it was looking for. The free-riding player gets the file it desires and, since it spends no upload bandwidth, gets the file a little faster. Therefore it gets a higher payoff.

This situation is very similar to the neighboring kingdoms' dilemma, if you remember. In this game, free riding is a strictly dominant strategy for both players. That gives the sorry state of the strongly dominant strategy equilibrium (free ride, free ride), which does not give either player any payoff. But that is the predictable outcome. Gnutella suffers from this problem because it takes no account of the strategic behavior of the players.

And this is not just in theory. There has been a study by Adar and Huberman, as you can see in the references. The figure shows the rank ordering of some 33,000 different hosts running Gnutella. You can see that only a very small number of hosts share a huge number of files. Perhaps they do not pay for the upload bandwidth, or maybe there is some other reason, such as their own altruism. But these hosts are far fewer than the total number of hosts using the system.
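The dominance claim in the two-player sharing game described above can be checked mechanically. The sketch below is only an illustration. The lecture gives the payoff 2 for mutual sharing and no payoff for mutual free riding, but it only says that the lone sharer's payoff is negative and the lone free rider's payoff is higher, so the values -1 and 3 used here are assumptions chosen to match that description.

```python
STRATEGIES = ("share", "free_ride")

# PAYOFFS[(s1, s2)] = (utility of player 1, utility of player 2).
# 2 and 0 follow the lecture; -1 and 3 are assumed illustrative values.
PAYOFFS = {
    ("share", "share"): (2, 2),
    ("share", "free_ride"): (-1, 3),
    ("free_ride", "share"): (3, -1),
    ("free_ride", "free_ride"): (0, 0),
}


def utility(player: int, own: str, other: str) -> int:
    """Utility of `player` (0 or 1) when it plays `own` and the opponent plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def strictly_dominant(player: int, candidate: str) -> bool:
    """True if `candidate` beats every alternative against every opponent move."""
    return all(
        utility(player, candidate, other) > utility(player, alt, other)
        for alt in STRATEGIES
        if alt != candidate
        for other in STRATEGIES
    )


if __name__ == "__main__":
    for player in (0, 1):
        print(player, strictly_dominant(player, "free_ride"))
```

With these numbers, free riding is strictly dominant for both players, so (free ride, free ride) is the dominant strategy equilibrium even though both players would be better off at (share, share).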
Out of these 33,000 hosts, just one-third share any files at all, and most of those share a much smaller number of files than the top hosts; note that the y-axis is a log axis. Almost two-thirds of the population is not sharing anything. They are just free riding. So this threat of not sharing and free riding is real.

Now, one can argue: why not stop this by designing the reference client appropriately? Maybe you can design a client that does not allow people to choke or unchoke their upload links. But there are many different client developers, and it is not very difficult to write a new client that still takes part in the file sharing but allows people to choke their upload bandwidth.

To summarize Gnutella: by 2005, 85% of the peers were free riding on the system, and today Gnutella carries less than 1% of worldwide peer-to-peer traffic. This data is from 2013, but I believe the numbers are almost the same even today. There were a few other peer-to-peer systems like Gnutella that also did not account for the strategic behavior of the users, and they met the same fate. They are not really popular or widely used.

Here comes the next protocol, BitTorrent, which has been quite popular since 2001; today almost 85% of the peer-to-peer traffic in the United States goes through BitTorrent. It is typically used for sharing large files, for instance software. In particular, open-source software distributions are shared using BitTorrent clients. What is the difference between BitTorrent and its predecessors? The key innovation that BitTorrent brought in is to break a large file into smaller pieces. The advantage is that you can now treat the exchange as if it were a repeated game.
Instead of giving the whole file in one go, you divide a large file into, say, 100,000 pieces, and it becomes a 100,000-round repeated game. The driving principle goes as follows. The BitTorrent client says: if you let me download, then I will reciprocate. This client is running on every peer's system, and all of them use the same principle. So a peer shares only with those other peers who are letting it download.

Here is a slightly more detailed view of the underlying engineering of BitTorrent. Typically you will find a torrent file, which carries a lot of information: the name of the file you are looking for, its SHA-1 fingerprint, the file size and, among other things, an important item called the tracker URL. You typically find this torrent file on certain websites. Suppose you are looking for a specific Linux distribution. You search for its torrent file, and you will possibly hit a directory that lists the links to such torrents. You download the torrent file, and it has all this information, including the tracker URL.

Now, what is a tracker? A tracker is something like a planner: a centralized entity that keeps the list of all the peers, and this list is updated dynamically. As the name suggests, it tracks all the peers who have the specific file named in the torrent file, and it keeps their IP addresses and port numbers. If a new peer comes in, the tracker adds an entry to its list of peers, and if some peer goes down, the tracker deletes it. So the tracker dynamically tracks the status of this file on the internet. That is what the tracker and the torrent file look like.

Now, what does the BitTorrent client, or the protocol, use? BitTorrent uses what is known as the optimistic unchoking algorithm.
As I said, the tracker is a centralized entity that controls this traffic: it tracks the connections between peers, their upload speeds and various other parameters. It is essentially tracking everything. Now, the reference client, the standard client that BitTorrent uses, sets a specific threshold r on the upload speed. How this threshold is set depends on the implementation; in the most common case, it is the third-highest speed in the recent past. But suppose the threshold is given to you. The client then uses this rule: if peer j uploaded to peer i at a rate greater than or equal to r, then peer i unchokes j in the next period. Peer i is just saying: if you let me download at a rate above the threshold, then I am going to unchoke you and share files with you. If the rate is less than r, then j is choked, so it is not allowed to download files from peer i.

Fair enough, this is just a simple tit for tat: if you allow me to download, then I will allow you to download, otherwise not. The algorithm could have stopped at this point, but then there is a problem: what to do with peers that have just appeared? It is unfair to choke a peer that has just arrived, because it has no files. How can it share? How can it upload at a specific rate? That is where the optimistic unchoking part comes in. If peer i never unchokes such peers, they will remain choked forever.
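The threshold rule described above, before the optimistic part is added, can be sketched in a few lines. This is a simplified illustration and not the actual BitTorrent implementation: the function names, the peer labels and the rates are hypothetical, and the threshold is taken to be the third-highest observed rate, as in the common case mentioned in the lecture.

```python
def third_highest(rates):
    """Threshold r: the third-highest rate seen recently.
    Assumes at least one rate; with fewer than three it falls back to the lowest."""
    ranked = sorted(rates, reverse=True)
    return ranked[min(2, len(ranked) - 1)]


def unchoked_peers(upload_rates, r):
    """Peers j whose upload rate to peer i was at least r get unchoked."""
    return {j for j, rate in upload_rates.items() if rate >= r}


if __name__ == "__main__":
    # Hypothetical rates at which each neighbor uploaded to peer i last period.
    rates_to_i = {"A": 50.0, "B": 30.0, "C": 20.0, "D": 5.0, "newcomer": 0.0}
    r = third_highest(rates_to_i.values())  # 20.0 for these numbers
    print(sorted(unchoked_peers(rates_to_i, r)))
```

In this example the newcomer has uploaded nothing, so under the pure threshold rule it is never unchoked. That is exactly the problem the optimistic part is meant to address.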
How peer i unchokes them is essentially a heuristic. Every three time periods, peer i optimistically unchokes a random peer from its neighborhood that is currently choked, leaves that peer unchoked for three time periods, and then chokes it again. Peer i keeps doing this until the new peer reaches the first state, where it has a sufficient amount of the file to upload. From then on the usual rule applies: the peer is unchoked if its upload rate is at least r and choked if it is less than r.

This optimistic unchoking algorithm is a slight variation of the repeated game we discussed; it is just running the tit-for-tat protocol. The interaction between the seeder, who is sharing the file, and the leecher, the client who is downloading it, is nothing but a repeated prisoner's dilemma, and the strategy of the seeder, that is, of the reference client, is nothing but tit for tat. Tit for tat is a standard strategy for the repeated prisoner's dilemma. There are two actions, cooperate and defect. Tit for tat says: if you cooperate, then I will also cooperate. The moment you start defecting, I will also defect, until the point you start cooperating again.

Let us look at one small illustration of how this works. Here you can see a bunch of peers who do not have any file and two peers who have all the files. If the reference client were running without the optimistic unchoking part, none of these new peers would get any file. Initially you can see that it is sending in a burst of three, and that is due to the optimistic unchoking part. Once some of the pieces are received by these users, they also start transmitting to other users. And once they start transmitting, they move into the first category, where their upload rate is above the threshold, and therefore they will also get files from other users.
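The tit-for-tat strategy mentioned above can be sketched as a small repeated prisoner's dilemma. This is a toy model under stated assumptions, not BitTorrent itself: "C" stands for uploading (cooperate) and "D" for choking (defect), the per-round payoffs reuse the illustrative values 2, 0, -1 and 3 (the last two are assumed, not from the lecture), and tit for tat is assumed to cooperate in the first round.

```python
# Per-round payoffs (row player, column player); -1 and 3 are assumed values.
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}


def tit_for_tat(opponent_history):
    """Cooperate first (assumption), then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """A free rider: never uploads."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the total payoffs of both players."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)  # each strategy sees the other's past moves
        move_b = strategy_b(hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 100))
    print(play(tit_for_tat, always_defect, 100))
```

Under these assumptions, two tit-for-tat players cooperate in every round, while a free rider gains only in the first round and gets nothing afterwards. This is the intuition for why splitting the file into many pieces discourages free riding.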
This is showing pictorially how the optimistic unchoking algorithm in BitTorrent works. Let us allow the animation to complete; we will come back and look at it at the end of the module.

What strategic behaviors are possible in the BitTorrent system? Even though it is quite robust compared to Gnutella and other peer-to-peer file sharing systems, it still allows for certain strategic behaviors. One is how often to contact the tracker: the moment you contact the tracker, it gives you some peers, so you can exploit the optimistic unchoking part and maybe avoid uploading files altogether. The second is which pieces to reveal, because different pieces are in different demand. Some files, or some pieces of a file, may be very much in demand, while other pieces are not so much in demand. If you follow an exchange policy where you give some pieces to another user in exchange for certain other pieces, which pieces should you reveal? There are several other strategic choices that go into the nitty-gritty details of how the peer-to-peer system works: how many upload slots to open, which peers to unchoke and at what speed, and what data to allow others to download. By and large, the goal is always to minimize the upload and maximize the download speed, with some sort of balance between the two.

There are several attacks on BitTorrent; in fact, papers have been written on how you can fool BitTorrent. The first of them is BitThief. As the name suggests, it steals bits from a BitTorrent system. The goal remains the same, to download files without uploading. How does it do it? It queries the tracker very frequently.
The way BitTorrent trackers work is that if you ask the tracker, it gives you a list of peers from which you can download. So suppose you query the tracker too often, which means that you are not using the reference client but some other client that does this. You then get a long list of peers that can potentially give you the files you are looking for, and if your neighborhood grows very quickly, you can exploit just the optimistic unchoking part. The new peers do not know you; each of them considers you a new peer and optimistically unchokes you, so you can download the file you are looking for and never upload. This can be partly fixed by modifying the tracker. You can make the tracker a little smarter: if a request comes from the same IP address again within 30 minutes, the tracker blocks it. But of course this can be bypassed by using a different IP address every time, or by randomizing over IP addresses. BitThief comes from a paper in HotNets 2006. If you are interested, you can take a look at it.

The second strategic behavior is what is called the strategic piece revealer. First, how does the reference client work? We are going to assume that all the other peers, the neighbors, are using the reference client. The reference client tells its neighbors about its new pieces, and it requests the rarest pieces first. For instance, suppose there are 10 pieces and most of the peers are looking for pieces 1 and 2. Then you already know that pieces 1 and 2 are in very high demand. Now suppose you are a strategic piece revealer and you have both pieces 1 and 2. Should you reveal them, or should you deal with some other user who is looking for piece number 8?
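The rarest-first request rule of the reference client, mentioned above, can be sketched as follows. This is a simplified illustration under assumptions not taken from the lecture: each neighbor truthfully advertises the set of piece numbers it holds, ties are broken by the lower piece number, and the function name and data are hypothetical.

```python
from collections import Counter


def rarest_first(my_pieces, neighbor_pieces):
    """Return the missing piece advertised by the fewest neighbors,
    or None if no neighbor offers a piece we still need."""
    # Count how many neighbors advertise each piece.
    counts = Counter(p for pieces in neighbor_pieces.values() for p in pieces)
    candidates = [p for p in counts if p not in my_pieces]
    if not candidates:
        return None
    # Fewest holders first; ties go to the lower piece number (assumption).
    return min(candidates, key=lambda p: (counts[p], p))


if __name__ == "__main__":
    # Hypothetical advertisements from three neighbors.
    neighbors = {"A": {1, 2, 3}, "B": {2, 3}, "C": {3, 4}}
    print(rarest_first({3}, neighbors))
```

Because the rule relies on what neighbors advertise, a peer that misreports which pieces it holds can change what others believe to be rare. That is the lever the strategic piece revealer uses.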
Suppose further that you are looking for piece number 5, and that all these other users have piece 5. Whom should you trade with? Notice that BitTorrent is nothing but a kind of bilateral trade between one specific peer and another. They use optimistic unchoking and the other mechanisms, but it is a peer-to-peer deal. Instead of advertising that you have pieces 1 and 2, you advertise as if you had piece 8, so that you can connect to the peer who is looking for 8 and get 5 from it. You get what you are looking for, and in the process you hold back pieces 1 and 2, which you know are in high demand. In the end you can keep other peers interested and recover any piece you are looking for, because you hold the rarest pieces. That is the strategic piece revealer. It is something like an auction: you are keeping your monopolies. It has been seen in experiments that the strategic piece revealer, which is the solid line here, downloads all the files almost 12 percent earlier than the reference client.

So let us summarize. Peer-to-peer file sharing demonstrates the importance of game theory in computer systems. Early systems like Gnutella, and those even before it, were very easily manipulated. BitTorrent was the first to bring in the idea of repeated games, by breaking files into pieces and using tit for tat. This was quite successful, and it is the principle on which current peer-to-peer systems work, but it still has some vulnerabilities. By and large, it is a very successful example of incentive-based protocol design.

Let us go back and check our peer-to-peer file sharing animation. You can see that all the files are with all the peers now.
If you now add a new peer, you can see that all the peers start optimistically unchoking it. So, yes, this is a nice animation of these peers.