Let me say hello. I'm Toby. I'm a researcher at the University of Hamburg in the northern part of Germany. The other hat that I'm wearing is a GNOME hat: I am also active in the GNOME project, and within GNOME I'm trying to improve the privacy and security aspects of your computing. In this slot I'm here to present research that we've conducted in our lab in Hamburg.

So, good to have experts here, that's amazing, brilliant stuff. So what was the fix, just push the button? Perfect. Probably with the right tension on the button. Right, this is the computer stuff. Actually, I tried to push a new line of research with this, which is theological computer science, because in computing everything is non-deterministic and you need to sacrifice children and have daemons and everything. But that's for another talk.

As I said, I have a few hats: one is this research hat from my university, the other is my GNOME hat. In this very slot I want to talk about some research we've done on what we call core internet protocols,
regarding tracking users, that is, the privacy aspects of these protocols. I have to name my colleagues. I will refer to a few papers that we've written and published, so if you want to know more about the graphs or strategies that I present, you'll find the pointers in the slides.

So, tracking in internet protocols. In order to understand what the issue is, what the scenario is, let me ask you to briefly think about what happens when you type some address in your browser and hit enter, in the next few milliseconds or seconds. We could do that interactively as a quiz, but we don't have so much time, so I'll tell you what happens. We have many round trips back and forth. It starts with DNS, and eventually you have the IP address. Then you open a TCP connection, and then you send your TLS request, because we want to push encrypted communication forward. Eventually, down below, you have your payload. This is what you actually want to send to the other party; the stuff before is the overhead involved in connecting to the web server in this case.

Arguably, bandwidth is not so much of an issue these days.
I mean, of course you could always want more, and you're never satisfied with the amount of bandwidth you have. But arguably latency is much more important these days than bandwidth. Let's just make that an assumption, right? If an academic doesn't have any data to back things up, we just assume that it's the case, so we just assume that latency is an issue. So people want to reduce the number of round trips, the number of times back and forth, thinking that this makes the application more snappy, and users are happier when they have snappy websites.

In fact, I think it was the motivation in the paper that introduced QUIC. When Google introduced QUIC, I think they conducted research on their YouTube platform, and I remember something like Google claiming, and now I'm making up a number, thirty percent more revenue when clients connect via QUIC, because then people are less likely to prematurely abort the connection and not watch the video and the advertisement. So maybe it's not 30%, maybe it's 13, but even then saving one or two round trips does have an impact on entities like Google that terminate loads of connections. Saving a few round trips is good.

Anyway, this is the standard connection establishment as of today. And what do people do to reduce the number of round trips? People do things like this. There are a few mechanisms: one is TCP Fast Open; there's TLS session resumption, which you may be more familiar with; and then there's QUIC, which will be the base for the HTTP/3 protocol. So these are mechanisms to reduce the round trips, and we had a look at the privacy properties of these protocols. Because, sure, Google is interested in making your connection faster, giving you a more pleasant experience while browsing the videos.
But maybe, just maybe, it's a hidden tracking mechanism that nobody thought of, and it's the grand evil plan, and now everything comes together and they know everything. So we had a look.

But before discussing these in more detail, let's have a look at them. This is QUIC, and we don't really need to understand each and every line item there. What we will see is that there's a connection establishment made by the client, and at the end of the day we get some token from the server, which we can then use in our subsequent request. So this is the initial connection establishment, and this is the resumption when you come back later. Then you save one round trip, and you can do that because you share state with the server. You have established this state, which is encoded in this cookie or token, whatever you want to call it, and eventually you send it back so that the server knows who you were, or who you are, and can then link your state. This is a very brief overview of QUIC. Of course there are many more aspects to it, but this is the core aspect that we were interested in.

TCP Fast Open works similarly. In TCP there's the three-way handshake: we send our SYN, then we get the response back, and then we send the ACK. Modern systems only push the payload with this third packet, with the ACK. Interestingly enough, there's nothing preventing you from already sending your payload in the initial SYN. But it turns out in real life people are bad, and on the internet people are especially bad, so the servers don't hold state for your initial SYN.
In fact, servers are so afraid of keeping state that they don't even want to hold the state of your connection, your TCP connection ID, and instead tend to use something like SYN cookies, which is a mitigation for a DoS attack so that the server does not need to maintain even these few bytes of TCP session state. They would be even more afraid of keeping 1500 bytes of payload from your initial TCP SYN. Having said that, Facebook thought it'd be clever if we still found a mechanism to send payload in our initial packet.

So how does that work? Facebook thought it'd be reasonable that if you established a connection once (this is regular TCP, this is Fast Open), then you're deemed good enough to come back a second time with the payload already in the first packet, assuming that if you're a botnet or whatever, you possibly cannot establish a full connection. It's a trade-off they made. They say: if you came once and established a connection once, then we trust that you're good enough to come back a second time, and we're happy to accept your payload in the first packet.

So how does that work? You may see there is an addition compared to regular TCP. You see the SYN and the ACK, and you see here the cookie. The cookie is an opaque blob, a few bytes, I think something between 8 and 16 bytes, that the server sends you. And then you can see here, in the Fast Open case, we not only send our SYN flag.
We also send these few bytes back to the server, and the server checks that the cookie matches its expectations and then accepts the data that you've pushed here. Here's the payload that we want to send.

There are a few technical details, such that this cookie of course does not require the server to maintain state, because that's what we were afraid of, right? So what does this mechanism do? That's arguably the clever part of it: it's a MAC over the metadata. In the case of Linux, it's the MAC over the source IP address and the target IP address, maybe the ports, I forgot. When you come back, the server just needs to recompute the MAC of the metadata and see whether it matches the cookie, and in that case you're good enough. All right, so this is Fast Open, and the attentive reader may already see some problems with these schemes regarding tracking and so on, but we'll come back to that later.

Let's first talk about session resumption. The idea is very similar: you establish a connection once, and then you ought to be good enough to come back. In the case of TLS it's a bit more complicated because we have negotiated encryption keys. There are a few mechanisms, actually two main ones within TLS 1.2, and then there are pre-shared keys in TLS 1.3, that allow you to tell the server that you've been here before: I am the client, I've had a connection with you before, here's my state. Please, server, let's reuse that state. Is that okay with you?
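Before going further into resumption, the Fast Open cookie described above can be sketched in a few lines. This is only an illustration of the idea of a MAC over connection metadata that the server can recompute without storing anything. The names `make_cookie` and `check_cookie`, the 8-byte length, and the use of truncated HMAC-SHA256 are assumptions made for this sketch; it is not the construction the Linux kernel actually uses.

```python
import hashlib
import hmac
import ipaddress
import os

COOKIE_LEN = 8  # assumption: the talk mentions something between 8 and 16 bytes


def make_cookie(server_key: bytes, client_ip: str, server_ip: str) -> bytes:
    """Derive the cookie from metadata only, so no per-client state is stored."""
    metadata = (
        ipaddress.ip_address(client_ip).packed
        + ipaddress.ip_address(server_ip).packed
    )
    # Hypothetical choice of MAC: HMAC-SHA256 truncated to COOKIE_LEN bytes.
    return hmac.new(server_key, metadata, hashlib.sha256).digest()[:COOKIE_LEN]


def check_cookie(server_key: bytes, client_ip: str, server_ip: str,
                 cookie: bytes) -> bool:
    """Recompute the MAC when the client comes back and compare."""
    expected = make_cookie(server_key, client_ip, server_ip)
    return hmac.compare_digest(expected, cookie)


server_key = os.urandom(32)  # the only secret the server has to keep
cookie = make_cookie(server_key, "192.0.2.10", "198.51.100.1")

# First connection: the server hands out `cookie`.
# Later connection: the client sends it along with the SYN and the payload.
accepted = check_cookie(server_key, "192.0.2.10", "198.51.100.1", cookie)
```

Note that for a fixed key and address pair the cookie is always the same bytes, which is exactly the property that makes it usable as an identifier by anyone who can see it on the wire.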
That's roughly how session tickets work. There are these two mechanisms, session IDs and session tickets. Let's start with the session ID one. Everybody remembers the TCP Fast Open cookie I showed before; that was only a few bytes, right? If you want to find an abstraction, it's only your ID, your running number among the connections that have been established, so that the server can look it up in a database. This is what session IDs are: you get 16 bytes or so that identify you, and then the server has to pull the state from somewhere, like from a database, and reuse that state.

It turned out servers are lazy. They don't want to maintain the state, so they serialize all the state of the TLS connection into a relatively large thing they call a ticket, and the client sends this ticket back each and every time it wants to connect, trying to coerce the server into deserializing the state, taking it from there, and resuming the connection. That works well. And with TLS 1.3, which everybody ought to use, these tickets are even encrypted; that was not the case before. So that's good.

How does that work? This is TLS 1.2. We have our ClientHello, we have the ServerHello. We indicate that we have support for session tickets and that we want one, and eventually we get this session ticket. When we re-establish the connection, we send the session ticket back in our ClientHello. So we tell the server: here's the ticket that you've given me before, would you please reuse that state? And then we can save the relatively expensive negotiation of a shared key. So that's right.
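The difference between the two mechanisms can be sketched roughly as follows. This is a toy model, not TLS: the class names, the JSON encoding, and the state fields are all assumptions made for illustration. Real session tickets are encrypted as well as authenticated; to stay within the standard library, this sketch only authenticates the serialized state with an HMAC.

```python
import hashlib
import hmac
import json
import os


class SessionIdServer:
    """Session ID style: the server keeps the state and hands out a short ID."""

    def __init__(self):
        self.cache = {}  # server-side state, one entry per client session

    def issue(self, state: dict) -> bytes:
        session_id = os.urandom(16)
        self.cache[session_id] = state
        return session_id

    def resume(self, session_id: bytes):
        return self.cache.get(session_id)


class TicketServer:
    """Ticket style: the state travels with the client, the server keeps a key."""

    def __init__(self):
        self.key = os.urandom(32)

    def issue(self, state: dict) -> bytes:
        blob = json.dumps(state, sort_keys=True).encode()
        tag = hmac.new(self.key, blob, hashlib.sha256).digest()
        return tag + blob  # 32-byte tag followed by the serialized state

    def resume(self, ticket: bytes):
        tag, blob = ticket[:32], ticket[32:]
        expected = hmac.new(self.key, blob, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None  # forged or corrupted ticket: fall back to a full handshake
        return json.loads(blob)


# Hypothetical session state; a real server stores negotiated parameters and keys.
state = {"cipher": "TLS_AES_128_GCM_SHA256", "secret": os.urandom(48).hex()}

id_server = SessionIdServer()
session_id = id_server.issue(state)

ticket_server = TicketServer()
ticket = ticket_server.issue(state)
tampered = ticket[:-1] + bytes([ticket[-1] ^ 1])
```

In both variants the client returns the very same bytes it was given, which is what lets the server pick up the old state, and also what lets anyone who saw both connections link them.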
That's a good mechanism, and if you listened carefully, and if you have the ethical mindset, then you notice the problems: you get the state from the server, and the server needs to somehow reuse the state from before, so there's nothing preventing the server from linking these requests, right? By definition the server needs to know who you were before in order to reuse the key material that you've established before. In the case of TLS it's the session ticket, obviously; it could be the session ID too, but that's not as prevalent. In the TCP Fast Open case we have this cookie. Do you remember, we had these eight bytes that we get from the server, and it turns out that if you send them back to the server then, surprise, surprise, the server learns who you were.

Even worse, a network-based attacker, a passive attacker, like me now when I'm snooping on all your wireless connections, can see what token you receive from the server and can then see you later when you send this token back. At least in the TCP case and in the TLS 1.2 case, because these tokens are sent almost as plain text, and similarly with the QUIC token that I mentioned before.

So that's a problem, especially in the TLS case, because we wanted to use TLS in the first place to protect our privacy. And now what happened? Now we're leaking who we were not only to the server, which may be fair enough, right? If you're connecting to Facebook, maybe you're not so much concerned that Facebook learns who you were. But even worse, your network-based attacker can now re-identify you based on this privacy-preserving protocol that you're trying to use, which is TLS. It's a complicated problem because we have this inherent conflict between what we're trying to achieve in terms of performance and what we're trying to achieve in terms of privacy. You could say now:
Well, I don't use any of these mechanisms, and then I'm good. And you're right, you could do that. You could simply not use TLS resumption. In fairness, though, you'd have to configure all your libraries to not do that, because they tend to give you the advantage of a quicker connection establishment. But you could do that, you could opt out. What we ought to try, though, is to make the simplest of these tracking attacks, if you want to call them that, harder, so that the passive network attacker is not able to link your requests. That would already be a good step, right? Then we would save some of our privacy when we send these cookies or tokens back to the server.

So how prevalent is that? How much of a problem is this resumption case really? This is the Alexa Top Million, I think, yeah, one million. It's this proprietary top list of websites made by Amazon. What we can see here is that only 4% of the internet, let's call it the internet, is not using any form of session resumption. Conversely, the rest does. We can say that the internet uses TLS session resumption, not only to make your experience better but also to save some CPU cycles and round trips, as we've learned before. So potentially these are all tracking you through the mechanism we've seen before.

And what does it even mean to track? How big of an issue is this tracking problem? We've measured for how long servers advertise that they can track you. This is my negative interpretation; their interpretation is for how long this key is valid. And we have also measured how long clients actually reuse this token, this tracking cookie if you want to call it that. It turns out that the self-advertised tracking time is roughly 24 hours. There are a few exceptions.
I think Facebook, for one, had long tracking periods. And when you reconnect, you get a cut-off here at about a thousand minutes or so, where clients tend to not send it back, for whatever reason.

By investigating the clients more closely, we see this behavior: clients keep the session ID, which is this relatively short token, for this long, and the session ticket, which is this rather large state, for that long. We see anything from 30 minutes to a day, and a day is the longest. Then the question is: can we extend this period? Clients are not necessarily dumb. They know it's a problem, that you can be tracked, so clients try to prevent you from being tracked. What do they do? They let this token time out, in this case in 30 minutes, or here in one day. And then the question is: could we, as a smart attacker, somehow prolong this period? Could we somehow track users for longer than that?

Yes, it's possible if we think of an attacker like this, a third party, your Google AdSense or whatever. You have a website A and a website B, A being, let's say, Facebook and B being, say, eBay, and both include content from the third party, say Google AdWords or whatever it is. Because you're browsing website A, you're establishing a connection to Google, and Google gives you their session ticket.
This is TLS 1.3 now, by the way; I thought I'd throw in some more modern stuff. This is a TLS 1.3 handshake. Then, when you connect to website B, to eBay, you also establish a connection to Google, and because you're reusing your session ticket (in TLS 1.3 it's the pre-shared key), Google can link your two requests together. And again, your browser might extend the period for which this token is valid.

This is a measurement of how long you can actually track users using this method. It's based on data we acquired from real browsing behavior. This is the reconnection time, when users reconnect to a website for whatever reason, not necessarily directly but also indirectly, and we've measured that you can track users for a very long time. You'll find details in this paper; it's quite interesting research.

So what do we do to prevent this? Well, how about we do not send plain text tokens in the first place? Wouldn't that be nice? Yeah, it would be so good to have more encryption on the link layer. Whoo, everybody here should be in favor of that. So how about we encrypt these tokens using some mechanism that we don't have yet? In fact, this QUIC token is not yet encrypted; how about we just encrypt that? The Fast Open cookie is not encrypted yet; how about we encrypt that? With TLS it's easy, though, because TLS 1.3 encrypts our session ticket. That's good.

Let's not focus on this one, because we've talked a bit about QUIC. Let's talk about the Fast Open case. With Fast Open we have the problem that we are on the TCP layer, and TCP has no encryption yet. At the same time we are required to not incur state on the server side, because it ought to be cheap for the server.
So our options here are limited. What we came up with is to combine these two layers, that is, the TLS layer and the TCP layer, to send the token over the encrypted channel. That's the idea, and it then looks like this: you have your TCP connection establishment, you send your TLS ClientHello, and you get this TCP Fast Open cookie back in the encrypted channel. Then you have the problem of feeding it back to Linux, because somehow Linux needs to know about this cookie. So we introduced two new sockopts. These are socket options we needed to introduce to make this happen: a sockopt to get and set the cookie on a socket. On the server side, we need to actually generate this cookie for the client to connect back with. And this is the patch. It's 200 lines, more or less, so it's easy enough to get into Linux. It's not terribly complicated to bring more privacy to these protocols.

On the TLS side we've used wolfSSL. They have a booth here also, I think in H. Good guys, I can highly recommend talking to those people, they're very nice. We've patched wolfSSL such that the cookie comes over TLS, and this patch is also roughly 200 lines, with minimal overhead.

We have some more ideas for how to further increase performance and privacy of these protocols by combining these layers. The core idea is to somehow pre-fetch data and send it over an encrypted link, and there are some more protocols we could treat that way.

My time is up, I was told. If you have any comments or questions, you're very welcome to contact me via email or personally. I don't know whether we have time for questions, I fear not, but I will stick around. Thank you very much.