I would like to share with you how I wrote a Python client for HTTP/3 proxies. I will give you a short introduction to the latest HTTP version. We will look at proxies in general. And we will also look at how all of this fits into the Python ecosystem.
My name is Miloslav. I'm from Prague. And I work for Akamai, where lots of my work over the last five years was related to HTTP/3. HTTP/3 is important to Akamai because we run one of the largest CDNs, with more than 300,000 servers. But today, I won't talk about the CDN. I will talk about proxies. Last year, in 2021, Akamai built a brand new network of proxies. And as all software needs testing, my job was to write a client that would connect to these proxies and test them.
If you are wondering how such proxies could be useful, you can look at your iPhone or Mac and search for iCloud Private Relay. Private Relay is described in a paper from Apple. And if you look at that paper, you will find that it's using the so-called MASQUE protocol. MASQUE is a draft from the IETF. So I won't talk here about some proprietary service. We will look at modern, open standards for proxies. MASQUE means Multiplexed Application Substrate over QUIC Encryption. I guess that it just means somebody wanted a nice acronym. MASQUE proxies are normal HTTP proxies, just with one important difference. They are using HTTP/3 instead of the common HTTP/1.
So if MASQUE proxies are just HTTP/3 proxies, the question is, what is HTTP/3? What makes it so special that we talk about it? HTTP/3 is HTTP over QUIC. OK. So the real question is, what is QUIC? And that's a little bit more difficult to explain. And we don't have much time. But I will try to give you a short introduction. QUIC is an alternative to TCP on top of UDP. You know, TCP is a protocol that has powered the internet since the 80s. And it's implemented in operating systems. So your apps can open a TCP socket, write data to it, and the data will get to the remote side, complete and in order.
And your apps do not need to handle that, because it's what your system does. But TCP is not the only internet protocol. For practical purposes, one other can be interesting. It's UDP. UDP is dumb compared to TCP. With UDP, you send a datagram, and you cannot be sure whether it gets to the other side or not. So approximately 10 years ago, some people at Google got an idea. What if we built something like TCP, just better, improved hopefully, using the primitives that UDP has? And they did it. They made QUIC.
Now, with TCP, you can add TLS for secure connections. With QUIC, you don't have to, because QUIC is encrypted by default. It has TLS built in. So it's always fully encrypted. That's one of the important advantages. And now, I'm getting back to HTTP. HTTP/1 and HTTP/2 use TCP, with TLS, hopefully, for transport. HTTP/3 uses QUIC, meaning UDP. Besides that, it's quite similar to HTTP/2.
An important advantage of QUIC is that it's multiplexed. Multiplexing means that you can open multiple parallel requests over a single connection. You may know that even HTTP/2 is multiplexed. You open one TCP socket and send multiple parallel HTTP/2 requests over it. But internally, those requests have to be serialized into one TCP stream. So if something bad happens there, like a packet is lost, everything in that stream is blocked and has to wait for that one lost packet. QUIC supports multiplexing at the transport layer. So each request gets an independent stream. And if one packet of one stream is lost, then only that one request is blocked and the others stay unaffected. And this is a very important property for proxies, because you are very likely to use one proxy to connect to many different origin servers, to have many different parallel connections.
OK, so we know that MASQUE proxies are just HTTP/3 proxies. But "proxy" is quite a broad term. So before we dig deeper, I would like to clarify what kind of proxies I'm talking about.
We should distinguish forward and reverse proxies. Reverse proxies are quite common, because they are used by websites. For example, when I visit the EuroPython website, it's quite likely that my requests go through one or more reverse proxies. But I, as a user, am not aware of that. I don't care. They are not my proxies. It's the business of the other side. By contrast, forward proxies are chosen by the user. I explicitly connect to a proxy and ask it to act on my behalf. I can use that proxy for all websites that I visit. The website does not need to know that I'm using the proxy. It's my proxy.
Another source of confusion could be the difference between HTTP proxies and SOCKS proxies. SOCKS proxies are low level, quite simple. They are great for local networks, development, private networks. For example, SSH clients can open a local SOCKS proxy that tunnels your traffic through an SSH connection. But I would not expose a SOCKS proxy over the public internet. The protocol is simple. It does not support encryption. There are only basic authentication options. Compared to SOCKS, HTTP proxies are much more powerful, as they get everything from the HTTP ecosystem. Do you want encryption? Just use HTTPS instead of HTTP. Do you want authentication? There are many options in HTTP. You can employ all your favorite tools or toys that you have in HTTP. I don't know, you can add a reverse proxy in front of a forward proxy. Why not? It's all HTTP.
MASQUE proxies are HTTP forward proxies. I won't talk about SOCKS proxies, and I won't talk about reverse proxies.
I would like to show you what a proxied request looks like. But it would be quite difficult to read binary data on a slide, and HTTP/3 is a binary protocol. So I will use HTTP/1 in my examples. The principles remain very similar. I'm sure that most, if not all, of you know what an HTTP request looks like.
I open a TCP socket, for example, using netcat, write my request to it, and eventually read the response. Simple. That's all. A proxy is a server that can return resources from third-party hosts. By the way, these third-party hosts are often called origins. So we say that the proxy is between me and the origin. So in the proxy use case, I connect to a proxy, for example, one running on my localhost, and I request a resource from an origin. And then the proxy issues a request on my behalf and forwards the response back to me. This is called proxy forwarding, and it's almost useless today. These days, all important websites use HTTPS, meaning that the traffic is encrypted using TLS, and we don't want some proxies to peek into our traffic.
So to support encrypted traffic, proxies need a completely different mode of operation. This mode is called tunneling, and in the tunneling mode, I connect to the proxy as before, but I do not send my request. Instead, I ask the proxy to set up a tunnel. We have a special HTTP verb for that called CONNECT, and when the proxy has set up the tunnel, it responds with 200 to indicate success. From now on, everything I send to the tunnel is tunneled to the origin, and everything from the origin goes unmodified back to me.
Now, you should be asking, where is the HTTPS? I have just told you that the tunneling mode is used for encrypted traffic, and the example is still plain HTTP. The truth is that we can tunnel any protocol. The proxy does not understand the data that we are sending. It's just bytes for it. So I can tunnel plain HTTP, as in my example, I can tunnel TLS for secure connections, or I can tunnel any other protocol. In the tunneling mode, HTTP is there just to set up a connection. Once the tunnel is established, the payload can be anything. And when I say anything, it includes other proxy connections. This is called onion routing. I can connect to one proxy and ask it to set up a tunnel to a second proxy.
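
To make the handshake concrete, here is a minimal sketch of the CONNECT exchange over HTTP/1, using only the Python standard library. It is an illustration, not the client described in this talk: the function names `build_connect_request`, `tunnel_established`, and `open_tunnel` are hypothetical, and error handling is reduced to the bare minimum.

```python
import socket


def build_connect_request(host: str, port: int) -> bytes:
    """Serialize an HTTP/1.1 CONNECT request asking a proxy for a tunnel."""
    authority = f"{host}:{port}"
    return (
        f"CONNECT {authority} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """Return True if the status line carries a 2xx (success) status code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and parts[1][:1] == b"2"
    )


def open_tunnel(sock: socket.socket, host: str, port: int) -> bytes:
    """Ask the peer on `sock` for a tunnel to host:port.

    Returns any bytes that arrived after the response head, because they
    already belong to the tunneled payload.
    """
    sock.sendall(build_connect_request(host, port))
    buffer = b""
    while b"\r\n\r\n" not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        buffer += chunk
    head, _, rest = buffer.partition(b"\r\n\r\n")
    if not tunnel_established(head):
        raise ConnectionError(f"proxy refused the tunnel: {head[:60]!r}")
    return rest
```

After `open_tunnel` returns, the same socket carries raw bytes to the target, so it can be wrapped in TLS or used for plain HTTP. Chaining is just repetition: calling `open_tunnel` a second time on the same socket sends the next CONNECT through the tunnel that already exists.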
Then through the first proxy and second proxy, I can connect to a third proxy, and so on, as many times as I want. This is great for your privacy, because each proxy sees only the IP addresses directly before and after it, and not everything.
But let's get back to MASQUE proxies. I have told you that MASQUE proxies are HTTP/3 proxies. Actually, they should also support HTTP/2 as a fallback for networks where UDP is blocked. MASQUE proxies work in the tunneling mode only. It's logical. It doesn't make sense to implement some legacy forwarding mode just because of a small fraction of unencrypted traffic. The MASQUE specs explicitly mention onion routing, so it's there by design. There are some other goodies that normal proxies do not have, but I won't go into that today.
OK. I want a client that supports MASQUE, meaning that I need support for proxies and for HTTP/3. What are my options? What do we have in Python? Python has batteries included, so we have an HTTP library in the standard library, but probably, no, not probably, surely the most popular option today is Requests. Unfortunately, both of these libraries support HTTP/1 only. For HTTP/2, I can use a nice library called HTTPX, but I am not aware of any ready-to-use client that would support HTTP/3.
I carefully say ready-to-use because we have so-called sans-IO libraries, and we have them for all major HTTP versions, including the latest one. Sans-IO, or bring-your-own-IO, libraries are distilled protocol implementations. They can convert your requests to bytes, they can convert bytes to responses, but they don't transfer anything over the internet. That's your responsibility, that's something you have to add. Sans-IO is great if you don't want to start from scratch, but I'm not so sure that it's such a great idea to write protocol implementations in Python, especially if you care about performance. And we usually choose HTTP/2 or HTTP/3 because of performance.
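
As a rough illustration of the sans-IO idea, the toy class below turns a request into bytes and turns received bytes into a response without ever touching a socket. It is a hypothetical sketch written for this example, not the API of any real library, and it only handles responses that carry a `Content-Length` header.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Response:
    status: int
    headers: Dict[str, str]
    body: bytes


@dataclass
class ToyHttp1Client:
    """Hypothetical sans-IO HTTP/1.1 client: bytes in, bytes out, no sockets."""

    _buffer: bytes = b""

    def request_bytes(self, method: str, target: str, host: str) -> bytes:
        """Serialize a request; the caller decides how to send it."""
        return f"{method} {target} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")

    def receive_data(self, data: bytes) -> None:
        """Feed bytes that the caller read from wherever it likes."""
        self._buffer += data

    def next_response(self) -> Optional[Response]:
        """Return a complete response, or None if more bytes are needed."""
        head, sep, rest = self._buffer.partition(b"\r\n\r\n")
        if not sep:
            return None  # header block is still incomplete
        lines = head.decode("iso-8859-1").split("\r\n")
        status = int(lines[0].split(" ", 2)[1])
        headers: Dict[str, str] = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers.get("content-length", "0"))
        if len(rest) < length:
            return None  # body is still incomplete
        self._buffer = rest[length:]
        return Response(status, headers, rest[:length])
```

Because the class never performs IO itself, the same object works whether the bytes travel over a plain socket, an asyncio stream, or a tunnel through a proxy. The caller only has to move bytes between `request_bytes`, the transport, and `receive_data`.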
We should look at libraries like nghttp2, which is a C library that implements HTTP/2, obviously, or at projects like nginx or curl. We have competing HTTP/3 implementations. For some reason, two of them are called quiche, so just be careful to distinguish them. But in any case, I think that we should consider our options for wrapping these libraries into Python.
At this point, I would like to make clear that I'm speaking about very low-level layers, something that most people don't care about at all. Users want something like HTTP "for humans", but that's not my topic today. I'm looking at the internals of libraries that would allow me to combine protocols for proxies and origins.
So let's finally look at what I made, what my MASQUE client looks like. The core of my client is based on the aioquic library. Honestly, I did not have many options here, as it's the only QUIC implementation in Python. And as aioquic is a sans-IO library, I had to add some IO, so I took asyncio from the standard library. Together, I got a simple HTTP/3 client. At this layer, there is nothing related to proxies. The proxy-related logic is one layer above. I have a proxy client that uses the HTTP/3 client to tunnel bytes through an HTTP/3 proxy. At this layer, I have something like netcat, but with proxy support. I can write bytes into it and read bytes from it. But that's still not enough. I want to test real HTTP requests. So let's add one more layer. I took h11, which is a sans-IO implementation of HTTP/1, and combined it with my proxy client. And at this point, I can finally tunnel HTTP/1 traffic through HTTP/3 proxies.
Now, do you see any pattern in my class structure? I see that I have two layers. And at each layer, I have a sans-IO library, once for HTTP/3, once for HTTP/1. So at both layers, I had to combine it with some IO. In one case, it was from the standard library. In the other case, it was my own class.
And together, in both cases, at both layers, I got something that speaks HTTP, a very low-level HTTP client. Looking at this pattern, I was considering how to properly generalize it. What if I want to support HTTP/2 proxies? What if I want to support HTTP/2 traffic? What about other combinations? What about SOCKS proxies?
Two years ago, I gave a talk at EuroPython about HTTP/3, where I said that there is only one HTTP. Its concepts remain the same between versions. HTTP/1, 2, or 3 are just implementations mapping the one HTTP onto TCP or UDP. And the new RFCs from this year follow exactly this design. We have one generic RFC about HTTP semantics, and then RFCs for each of the versions. And I think that my next client should be structured exactly like that. I should have an interface saying what HTTP should do. And then I can implement it for different versions, typically by combining a sans-IO library with some IO, or maybe, for better performance, by wrapping some C implementation.
To support proxies, the design should have injectable IO. I don't want hard-coded sockets. So I think that I should have interfaces for TCP-like streams and for UDP-like flows. Obviously, the most common implementation will just wrap sockets. But I will be able to plug in an implementation that tunnels traffic through a proxy, be it TCP or UDP. And by the way, we can even tunnel UDP traffic through an HTTP proxy.
There are a few design details, or maybe lessons learned, that I would like to mention. The first one, maybe the most obvious one, is that the main advantage of HTTP/2 and HTTP/3 is that these versions are multiplexed. And to make use of that, I think that most of the code should be asynchronous. Another small, but I think important, detail is that QUIC is implemented in user space, meaning in your code or in a library that you take with you.
So whenever you open an HTTP/3 connection, a QUIC connection, there has to be some background task running that does the networking stuff, like sending packets, acknowledging packets, retransmitting packets, and so on. And last but not least, I somehow learned that it's not easy for me to write low-level asyncio code. There are some protocols that push you into a certain design. So maybe next time I would consider something like Trio, or maybe AnyIO, or something like that.
A logical question is, now that I have shown you what I did, where can you find it? I tried to hack some very simple support into HTTPX. It was very limited, and I haven't published it, because it's really tough. HTTP/3 alone is complicated. And my task was even more difficult because I had to support different protocols for proxy connections and origin connections. So there is not much overlap between the existing libraries and the code that I wrote. So if you are interested in this, I will be really happy to discuss it and look at how to do that. But so far, I cannot share much, unfortunately.
OK, so I shared my vision and my lessons learned. So it's time to conclude the talk with what you can remember. Please remember that HTTP tunnels are simple. You just send a CONNECT request, and from then on, everything is tunneled through. Simple. Proxies are not complicated, but HTTP/2 and HTTP/3 are different from the first version of the protocol. So the existing abstractions may not be sufficient in all cases, especially if I want to support combinations of protocols for proxy and origin traffic. And speaking about abstractions, do not forget that HTTP itself is an interface, and the HTTP versions are its implementations. We can implement them using sans-IO libraries, which are great. They helped me a lot. But I would not forget about native C or C++ implementations if I care about performance. And that's all from my side. Thank you for your attention. Thank you for letting me share my experience.
Thank you, Miloslav.
So we started a little late. Next on the agenda is the coffee break. So we are going to take some questions now. So you can queue up. OK, we have one. And I think we will even have remote ones. So please ask the question.
Hey, thank you for the great talk. I have a question, because you said that QUIC is implemented in user space, and it probably makes sense, because then you can easily adapt it to Windows or Linux or macOS. But then the question is, are they going to implement it in kernel space? Or maybe they are leveraging things like io_uring on Linux to make it faster or more efficient. Do you know?
What I know is that it started in user space because it was started by Google, so you get it into Chrome. So it was everywhere, on Google servers, on Akamai servers, on your laptops, because everybody is using Chrome. And that helped to develop the protocol. Then it was standardized by the IETF. That's, by the way, why we have two QUIC versions, Google QUIC and IETF QUIC. And I noticed in some talks that people from Apple or from Microsoft speak about these protocols. And I think that iOS has some networking APIs, but I don't know the details. But it's quite likely. If I had to guess, I would not be surprised if it got into the kernel at some point.
Because the problem is that you will have a lot of interrupts with these UDP connections. But if you used something like io_uring, then maybe you would not have them.
I know about these performance and interrupt issues, but this is the kind of stuff that I don't understand in depth. We have lots of people who spend a lot of time on that. That's definitely something that has to be solved. But even with all these things, with the numbers that we saw for QUIC, people choose it for a reason. It works even in user space.
So we have time for one short question, if someone has one. And you, OK.
So you mentioned QUIC is underlying HTTP/3, but QUIC is a general transport.
Is there anything other than HTTP using it yet?
I think there are some attempts, but I don't remember them. So this is just an approximate idea. But I think that there was some kind of effort to standardize QUIC as a general protocol. And it was taking a lot of time with no clear direction. So at some point, somebody said, just standardize HTTP/3. That's something that everybody will be using. But if you, for example, look at the code, you see the layers. This is the QUIC layer. You open a stream. You write data to it. And then you have the HTTP layer, which says here is a frame with headers, here is a frame with data. So, for example, simple demos can easily send HTTP/1 over QUIC. And it works perfectly. And maybe it's even valid according to the specification. You can use it. You can write it. It works. But I am not aware of other use cases at the moment. But it's definitely possible.
OK. Thank you very much, Miloslav. Please show that you really liked the talk by clapping really hard.