Except the bullets looked all fucked up. Okay, I'll assume you're all familiar with existing methods of transferring files on the internet, but basically what I was delving into here is that FTP has mostly been relegated to anonymous FTP, where you have a central file distribution server and have to give out a large number of files, and HTTP has sort of evolved into a general-purpose file transfer protocol as well. One of the problems is that these current methods are entirely client-server, so the server must have sufficient bandwidth to handle all traffic requests.

Wait, that wasn't even the first slide. Okay, this is Gimpy. Can anybody lend me their laptop? You'd rather just sit there and watch me stand up here and mess around with my iBook and its open source software. Alright, let me try to get the fucking first slide to show up. Hey, it changed. There we go.

Alright. Here is the basis of what my protocol is. It's a next-generation file distribution architecture, basically trying to merge the concepts of peer-to-peer protocols and client-server protocols. It's intended primarily as a replacement for anonymous FTP for large distribution sites, like, say, Debian's master distribution servers. More than that, which I'll get into later, it's a way of replacing large networks of mirrors where someone manually has to choose which mirror to get a file from. So it provides a much more effective way of coordinating a large number of servers for large file transfers. And, yeah, I'll get into that later. But primarily it's for when you have a large number of anonymous FTP mirrors transferring files. That's what the second bullet there is: it uses server networks rather than individual servers that function as islands and use rsync or something along those lines to sync content.
Primarily, it's intended as a quasi side-by-side replacement for BitTorrent. It provides the full client-server experience while still utilizing BitTorrent-like content distribution, where peers take part in all transfers and upload content to other peers. I assume you're all familiar with BitTorrent, I hope. No? Not at all? If someone isn't, please raise your hand and I can give a more detailed explanation. I'll get into that. He asked about seeding, and I'm actually taking a much different approach to that.

So here's a little bit of background on what BitTorrent is. PDTP is very similar, and I'll get into the exact differences in a bit, but as I explained earlier, the whole idea of BitTorrent is you have a tracker that various clients can tell which pieces they need and which pieces they have, and then other clients can look up which pieces need to be transferred where, and transfer pieces of a single file to each other. I take the same approach in PDTP: I break files into pieces which are hashed and transferred between clients.

Let me go into a little bit of why I started writing this. In early 2003, I started writing a C library for BitTorrent. As I was writing that, I ran into a number of fundamental design flaws in BitTorrent, which basically drove me insane. The first one is that BitTorrent serves static file groupings: if you want to put a torrent on the internet, you have to generate a torrent file from a preset group of content, and because of that, all data within a torrent is entirely static. That just does not work for something like the web, where you do have dynamic content or dynamically generated content. Or say you have a large distribution server where people need to pick individual files out rather than the entire FTP repository: if you were getting updates for a Debian system or upgrading your ports tree on FreeBSD, you probably don't want everything on the entire server.
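[Editor's note] The break-into-pieces-and-hash idea described above can be sketched as follows. The piece size and function name here are illustrative, not values from the PDTP specification:

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumption: the talk doesn't state PDTP's actual piece size


def hash_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Split a file's bytes into fixed-size pieces and SHA-1 hash each one.

    Peers can then verify every piece independently as it arrives from
    another client, instead of trusting the sender or re-hashing the file.
    """
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]
```

A downloader would compare each arriving piece's hash against this list before writing it to disk.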
So that is the main problem I saw. One of the other problems is that there still isn't web browser integration for BitTorrent, and people cannot use BitTorrent to serve websites or any sort of dynamic content like that. And probably the main problem, which I'm sure any of you have experienced if you've tried to download a popular torrent, is that BitTorrent uses HTTP for its tracker protocol, and uses it asynchronously: a client makes an HTTP request whenever it has completed a piece transfer or needs to find a new piece to download or upload. Because of that, the web server can get exhausted by too many clients trying to transfer a torrent at once. While Bram Cohen promises limitless scalability for BitTorrent, the problem is a web server just can't handle the load of, say, 10,000 clients all making several hundred requests every second. So I have taken a different approach, which I will get to in a bit.

I'm not sure if any of you have heard about BitTorrent 2, but one of the main things they'll be moving to is hash trees, where you can check different parts of the file rather than just a set number of pieces. Right now with BitTorrent, every time you resume a transfer, it hashes the entire file again to see which pieces you have and which you don't, unless you have a client like Azureus that tracks that externally, and it's incredibly slow and annoying. So they're trying to introduce that feature to speed it up. And oddly enough, they decided to go to UDP for their new protocol. I have absolutely no idea why, because how do they intend to manage dropped packets and things along those lines? I have absolutely no idea, and I think it's just going to end up needlessly complicated.

So what PDTP provides is dynamic directory mappings that are updated on the fly whenever the files exported by the server change. The server I'm writing automatically hashes files as they become available.
I know there are various scripts for BitTorrent that will use a program like dnotify to automatically generate torrent files, but if you have a large master distribution server for something and you have, say, tens of thousands of files, that's tens of thousands of torrents you'd have to run at once. And BitTorrent has issues running more than one torrent at once on most servers, due to scalability issues.

The other nice thing about PDTP is I provide all sorts of components, like a proxy server that would allow you to use PDTP behind a firewall. It would also allow several users behind, say, a NAT gateway to share a single incoming port and automatically direct incoming connections to the proper computer on your local network. I'll get into how that works in a little bit. And as I mentioned before, BitTorrent uses HTTP asynchronously and connectionlessly, and BitTorrent 2 will move to UDP; PDTP uses a lightweight TCP-based transaction protocol, which I think is the best choice for this job.

Okay, this gets into your tracker/seeder question here. As you're all aware, BitTorrent uses trackers, which are basically either Python or PHP scripts, usually talking to a backend database. Then you have to run at least one BitTorrent client, called the seeder, that has all the pieces you wish to serve to the network. PDTP is structured quite a bit differently. There are two different components to the protocol which handle the tracker and seeder functionality, but they also do different things and are structured a little bit differently. I'll get into those in a bit. I don't think OpenOffice can read the name of this slide. Okay, never mind.

If any of you have ever tried to implement a BitTorrent client of your own, one of the problems you're going to run into is that Bram Cohen has not documented the protocol.
He has about a two-paragraph description of how it generally works, but for the actual mechanics of how BitTorrent functions, you're not going to find any documentation besides the code itself. I have an entire IETF Internet-Draft, in HTML and RFC 2629 XML format, which is on your handy-dandy DEF CON CDs if you have those with you, which you probably don't, and most of you seem to be half asleep anyway. But if you really care, you can pop in that CD and check out the entire protocol specification. It's also on the web at pdtp.org. And we have applied with IANA for port assignments; BitTorrent kind of jacked some ports and calls them theirs.

All right, so here's where we get into the actual components of the protocol, of how to build a PDTP network, and I have some diagrams of this coming up. There are two primary components of the system. The first is a server which acts like a BitTorrent tracker. Now, the nice thing about this is you can have multiple servers on a single PDTP network. So where before with BitTorrent you might exhaust the resources of the one system doing the serving, with PDTP what used to be an FTP mirror is now another PDTP server, and it manages a subset of the available clients and routes pieces to and from each other.

Then, instead of a seeder, there are hubs. How this actually functions is that all the servers connect to the hub, rather than having a seeder connect to the tracker. Every network must have a hub, so if you have a server disconnected from a hub, it won't serve any files, and it's basically useless. If the server is connected to a hub, you will be able to access all the content of the network, as opposed to BitTorrent, where you might run into a dead torrent that has no seeders. The hub manages the directory listings and the serving of the initial pieces to the network, much like a BitTorrent seed.
An additional component of the system is piece proxies, which can fetch pieces from the hub and cache them to serve to the network. So if the hub's upstream bandwidth is exhausted, you can use piece proxies to mitigate that. And that's basically what I just said. I'm sorry, this is all really boring intro; once I get to the diagrams, it should be a little easier for you to figure out how this all comes together.

Okay, there are two types of proxy servers: the piece proxies I mentioned earlier, and the actual proxy servers, which I mentioned at first. The proxy servers use dynamic protocol translation to allow multiple clients behind a firewall, or otherwise just on a common corporate network, to share a download from a single server. So if, say, one person on a large corporate network is grabbing a software update, they will actually serve it to all the other clients on the local network. If Windows Update or something were to eventually make use of this, it would basically replace SUS, in that one machine can fetch an update and then every other computer on the network that needs that update can get it from the first one that fetched it. The proxy server also allows people behind a firewall to make outgoing connections to other computers or receive connections from other peers.

And this is what clients can basically do. They can browse available files. They can transfer files. And they're either passive, behind a firewall without a proxy, or active, where they can make both incoming and outgoing connections. That becomes really important, because you need a large number of active clients on the network for it to be able to function. I did look into some ways of getting around that. I don't know if any of you know about BEEP. Yeah, okay, one guy knows about BEEP. How about, who knows about JXTA? One person knows about JXTA. Okay.
There are several ways of getting past the incoming/outgoing connection problem with firewalls using protocols like BEEP and JXTA, but apparently no one here knows about them, so I'll just move along.

Alright, here's what I really wanted to get to. Here's how it all comes together. This is the smallest possible configuration of a PDTP network you can have: a single hub and a single server that serves several clients. The server manages who's getting what from whom, and the hub actually has the pieces of the content the clients are interested in. If there are no clients on the network that have an available piece, it will be fetched from the hub. Otherwise, the server will try to locate the piece on the best available client and tell that client to transfer it to the other client, much like BitTorrent. There's actually a very complex ranking algorithm that goes into that. You probably don't want me to delve into it, because it's a lot of set theory, and I'm sure you don't want to hear about that right now; you'd much rather go out and get trashed or something. Does anybody actually want to hear about that a little, or not? No? Okay.

Alright, this is the next step up, and this is where we actually get into probably something a little more interesting, which would be confronting the lawyers. Well, this configuration still has clients fetching from the hub, but okay, I'll get into that part a little later. In this one, there are just multiple servers. Multiple servers let you use more than one system: say you had a big network of FTP mirrors, you could designate several of those as servers, and you would have a single hub managing all the clients on the network. And in this configuration, the piece proxies are being used to basically shield the hub from the clients themselves.
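[Editor's note] The routing decision described above, where the server prefers the best-ranked client that already holds a piece and falls back to the hub, can be sketched like this. The data model (field names, score semantics) is illustrative, not PDTP's actual one:

```python
def pick_source(piece: str, clients: dict, hub: str = "hub") -> str:
    """Choose where a requested piece should come from.

    `clients` maps a client id to a dict with keys:
      "pieces" - set of piece ids the client holds,
      "score"  - the client's current peer ranking,
      "active" - whether the client can accept incoming connections.
    Prefer the highest-scored active client that has the piece;
    only if nobody has it does the hub serve the initial copy.
    """
    candidates = [
        (info["score"], cid)
        for cid, info in clients.items()
        if info["active"] and piece in info["pieces"]
    ]
    if not candidates:
        return hub  # no peer has it yet: fetch from the hub
    return max(candidates)[1]  # best-ranked holder of the piece
```

This is the shape of the decision only; the talk notes the real ranking algorithm is considerably more involved.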
So the upstream bandwidth of the hub will not be utilized except when transferring a piece the piece proxies don't have, or for serving directory listings. In this configuration, the only bandwidth that's really being taxed is the bandwidth of the servers and the piece proxies; the hub's bandwidth consumption is largely minimal.

Okay, I'm not sure if you were... What can I do to liven this up? Everybody is just zoned out there. I'm sorry it's boring, but it's the subject material; there's not really much I can do with it. Does anybody have any questions? Any thoughts? Anything? What? The what? Yeah. No, I haven't read that. The main paper I have read on that is the one Bram Cohen wrote on the properties of randomly connected networks. Okay, somebody else has a question over here, so let me have that. Do you want to just come up here and use the mic? Because I can't really hear you over all of this.

The question I had was about a paper that suggested a number of potential flaws, or vectors for attack from the Microsoft perspective, in content distribution networks. One of the things they pointed at was the servers that host the directories, the databases. If you have a database of all the content you can find in the network, then if a client can find where that database is, the lawyers can find where that database is, and that's where they would focus their efforts: bringing lawyers against the database of your directory server.

Yeah. I'm not sure if you saw it on the last diagram, but in the configuration I had there, the hub server, which manages the directory listings and the available pieces, is actually entirely shielded from the client network by the piece proxies and the PDTP servers.
And, okay, if you're asking how someone with, say, illicit content being served through a hub would shield it from the lawyers or the company that wants to take it down: what the lawyers would have to do is take down all the actual servers, the PDTP servers and the piece proxies. So if you can find people to run those for you, who will just take them down on request, you can actually keep the address of where the actual content is hosted separate. All right, okay. Where did any of this go again? I have a picture of a motorcycle up here. Oh, it's done now? Oh, yeah, which I can't see. I've just got a picture of a motorcycle up here is the thing. All right.

So, yeah, this is kind of what I was talking about before: Bram Cohen's paper. Bram Cohen, who developed BitTorrent, wrote a paper on the advantageous properties of random networks, and that's what he depends on to handle things like clients who aren't really contributing to the network. He uses an algorithm called choking, where a client that is doing a good job transferring pieces on the network gets unchoked, and a client that is just leeching pieces off the network and not uploading anything becomes choked.

PDTP uses a weighted scoring algorithm instead: it keeps track of every piece that a given client has uploaded, and every time a transfer finishes, it weights those values by the score of the destination host to compute a new score. So clients that have been successfully uploading a lot of content will be ranked higher, and clients who may be slower will have a lower score. Therefore, if a client experiences a slow upload to a crappy client with a low score, that value will be weighted lower in the average. That's basically how it tries to select the best peers.

The actual protocol format is binary, like most modern protocols. It's fully bi-directional, and it supports synchronous and asynchronous operations.
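[Editor's note] One hedged reading of the weighted scoring described above can be sketched as follows. The transcript is vague on the exact formula, so this is an interpretation, not PDTP's actual algorithm: each finished transfer folds the observed throughput into the uploader's score, with the sample weighted by the destination's own score, so a slow upload to a known-bad peer drags the average down less than a slow upload to a good one. All constants are illustrative:

```python
def update_score(uploader_score: float, throughput: float,
                 dest_score: float, smoothing: float = 0.8) -> float:
    """Recompute an uploader's score after one finished transfer.

    The new score is a weighted average of the old score and the
    observed throughput, where the throughput sample's weight is
    scaled by the destination host's score: transfers to low-scored
    (likely slow or flaky) peers count for less.
    """
    weight = (1.0 - smoothing) * dest_score
    return (smoothing * uploader_score + weight * throughput) / (smoothing + weight)
```

Note the limiting behavior: a destination with score 0 leaves the uploader's score unchanged, which matches the intuition that a bad peer shouldn't be able to tank a good uploader's ranking.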
Who here knows what ASN.1 is? Okay. It's used primarily by the X.509 public key architecture. It's a standardized binary format, and I did consider using it at first, but I decided it was far too complex; given the long series of input validation errors in both OpenSSL and Microsoft's SSL implementation, I decided it's probably too complex for its own good for me starting from scratch.

So I developed my own binary protocol format. It's pretty simple. You have a length value, then a serial number: whenever you have a synchronous transaction, you give it a unique serial number, and when you get a reply, it will have the same serial number. An opcode is just a value that determines which command you're executing on the server; if you look down at the bottom, the first bit of that distinguishes between a request and a reply. After that come an arbitrary number of binary objects, which can be strings or integers, basically. That lets you have variable arguments to any given operation. And that's the format of the objects: they just have a length and an identifier for what they are, then the actual payload. One thing, if you look at this: the length of the entire transaction is the first thing given, and then every object has its own explicit length. So there's no need to compute lengths, which dramatically decreases the complexity of input validation.

It also uses existing standards for authenticating clients or servers on the network, servers to hubs, or proxies to hubs. Who's familiar with HMAC here? Seeing as few people have been raising their hands every time, okay. It uses the HMAC construction with the Secure Hash Algorithm to authenticate. One of the main things I've seen missing in Internet file transfer for years is integrity validation.
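[Editor's note] The frame layout described above (total length, serial number, opcode with a reply bit, then length-prefixed objects) can be sketched like this. The field widths and the reply-bit position are guesses for illustration; the real wire format is defined in the PDTP Internet-Draft:

```python
import struct

REPLY_FLAG = 0x8000  # assumption: high bit of a 16-bit opcode marks a reply


def pack_frame(serial: int, opcode: int, objects: list) -> bytes:
    """Serialize one transaction: total length, serial, opcode,
    then a series of length-prefixed binary objects."""
    body = b"".join(struct.pack(">I", len(o)) + o for o in objects)
    payload = struct.pack(">IH", serial, opcode) + body
    return struct.pack(">I", len(payload)) + payload


def unpack_frame(frame: bytes):
    """Parse a frame back into (serial, opcode, objects).

    Every length is explicit in the frame, so the parser never has
    to derive a length itself, which is the input-validation win
    the talk describes.
    """
    total = struct.unpack_from(">I", frame, 0)[0]
    serial, opcode = struct.unpack_from(">IH", frame, 4)
    objects, off = [], 10  # 4 (length) + 4 (serial) + 2 (opcode)
    while off < 4 + total:
        (olen,) = struct.unpack_from(">I", frame, off)
        objects.append(frame[off + 4:off + 4 + olen])
        off += 4 + olen
    return serial, opcode, objects
```

A request and its reply would share the same serial number, with only the reply bit of the opcode differing.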
I'm sure many of you have heard about when Sunsite got broken into, and the University of Alberta got broken into, and the master distribution server for SSH was trojaned, and nobody realized it at first. PDTP provides a way for clients to get a signed X.509 certificate for the server. It will also, at the option of the server administrator, compute and apply DSS signatures to files, which are given to clients upon request. So after a file finishes downloading, the client can compute the DSS signature for that file, check it with the server's signing key, and see that it matches the pre-computed signature given by the server. So there's a formalized mechanism to prevent file tampering; hopefully you won't run any Trojans using PDTP.

Another feature, and this hasn't gone to IANA yet: it has its own decentralized search system. Who's familiar with Hotline? Okay, more people. All right. It's sort of based on the idea of Hotline, where you had trackers that kept track of servers. What it allows is for clients to query large numbers of servers at once by sending a single PDTP search-request UDP datagram, which they'll resend a few times if the server doesn't reply; if they still don't get any reply, they'll just assume it's down. After the client has sent that datagram and the server has finished its search, the server fires back another UDP datagram saying the search is complete, along with a key for fetching the search results; the client then connects by TCP to fetch the results. The nice thing about this is I will be keeping a master list of all trackers, and the trackers only keep track of the addresses of servers. Nowhere in the first two tiers of that entire system is the actual content being searched for exposed. So hopefully no one will have any sort of legal recourse to try and shut a system like that down.
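[Editor's note] The post-download integrity check described above can be sketched as follows. PDTP, per the talk, wraps the file digest in a DSS signature checked against the server's X.509 certificate; since the Python standard library has no DSS, a plain digest comparison stands in here for the signature verification step:

```python
import hashlib
import hmac


def verify_download(file_bytes: bytes, expected_digest: bytes) -> bool:
    """Recompute the finished file's SHA-1 and compare it to the digest
    the server published.

    In PDTP proper, `expected_digest` would arrive inside a DSS
    signature verified with the server's signing key; this sketch only
    shows the final comparison. hmac.compare_digest is used so the
    comparison is constant-time.
    """
    actual = hashlib.sha1(file_bytes).digest()
    return hmac.compare_digest(actual, expected_digest)
```

A client would refuse to execute or hand over any file for which this check fails, which is the trojan-prevention mechanism the talk describes.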
And yeah, that's what I just said about how the search process works: they send a UDP request, the server responds if it's up, and then the clients can fetch the search results by TCP.

These are the actual implementations that I and a few other people have been working on. They're all available on SourceForge. There's a client library, designed for use in both single- and multi-threaded applications, written in C. Everything is portable across Win32, Mac OS X, Linux, any BSD you want, and Solaris. I'd like to support a few additional Unix operating systems like HP-UX and IRIX, but I don't have access to those systems, so I can't really do testing. If anybody wishes to help with development and volunteer an HP-UX or IRIX system, that would be great.

The server components are all part of a project called Squall, which is also available on SourceForge. There's a Squall server, which is a PDTP server, Squall Hub, Squall Piece Proxy, and Squall Proxy. And again, everything is portable, except the server is limited to NT-based Windows; it won't run on Windows 95 or 98, because it uses file system monitoring features found only in NT.

I'm also in the process of developing a GUI client written using Qt 4. They just released the developer preview of Qt 4, which has a number of features I really need for it. Does anybody know Qt, or should I just not go into this? One guy. The main feature it uses that is not present in Qt 3 is cross-thread signals and slots, where you can emit a signal in one thread and invoke a slot in another. And then, because I have far too much not-invented-here syndrome, I'm developing my own Apache Portable Runtime-style application library, which supports a lot of features that APR doesn't. Who here would consider yourself an avid programmer? Not many people; you three people, raise your hands. Are any of you familiar with continuations? Okay, I just won't go into that.
But USRI provides a cross-platform implementation of continuations and file system monitoring, so it's got a lot of cool features.

Getting involved: considering not many of you are programmers, you probably don't want to do that, but we're certainly looking for documentation writers, if you're at all interested, or testers, yeah. I'll sign you up, all right. I know everybody loves writing documentation, but basically right now all we have is Doxygen output, which probably no one here knows what that is. Three guys? Wow, you guys are good. Who knows what JavaDoc is? Okay, a few more people. Doxygen is basically an open source counterpart to JavaDoc for whatever language you want. But the main thing we need is unit testers, and there are no programmers here, so I'm probably not going to find any. I had one more slide, but it seems to have vanished. All right.

Okay, so that's the basic overview. Do you have any questions? Two minutes. No questions on anything? You, I've got a question, you've got a question. Okay, Yavitch, come up here and we'll get you mics for your question.

Now, is the actual file that's being distributed on the PDTP server itself? No, it's on the hub. Oh, so there's actually a central file for this? It's not like BitTorrent, where it's actually on a client's file system? Um, well, the clients will fetch it initially from the hub and then transfer it to each other, or they can fetch it from the piece proxy. So in order for it to be served over PDTP, it has to actually be uploaded first to the server? Yeah. Um, one of the nice things is, all you have to do to put the file on the hub is drop it into a directory that you're exporting. So there's no generating a torrent file: you just upload it to that directory, and the server will automatically pick it up through file system monitoring and generate a hash list for it.

All right. My question is, how does PDTP bypass stateful inspection firewalls to establish the connection and transfer the file?
You have a proxy server that can be used to make outgoing connections and also receive incoming ones. Everything's keyed by IP, so it only allows one connection per host, so anybody behind that firewall who wants to transfer the same file from the same server pretty much needs to use the proxy server. But yeah, it will route the incoming connections: since the proxy server has complete access to the whole wire protocol from the client to the server, it knows which host is going to be making an incoming connection, or which is going to need to make an outgoing connection, and it can route the connection appropriately.

All right. Yeah, multiple questions. All right, last one. Yeah. The client library basically runs each transfer in its own thread, and then it allows you to multiplex directory listing traffic or multiple file transfers through that connection, so you can still browse all the content on the server while multiple transfers are going. It's all written in C, so it has its own server that you can run on any Windows, Linux, BSD, Mac OS X, or Solaris system. Thank you.