Now: towards a more trustworthy Tor network, by nusenu. The talk will give examples of malicious relay groups and current issues, and how to tackle those to empower Tor users for self-defense, so they don't necessarily need to rely on the detection and removal of those groups. So without further ado, enjoy, and we'll see each other for Q&A afterwards.

Thanks for inviting me to give a talk about something I deeply care about: the Tor network. The Tor network is a crucial privacy infrastructure without which we could not use Tor Browser. I like to uncover malicious Tor relays to help protect Tor users, but since that does not come without personal risk, I'm taking steps to protect myself from those running those malicious nodes, so I can continue to fight them. For this reason, this is a prerecorded talk that does not use my own voice. Thanks to the people behind the scenes who made it possible to present this talk in a safe way.

A few words about me. I have a long-standing interest in the state of the Tor network. In 2015, I started OrNetRadar, a public mailing list and website showing reports about new relay groups and possible Sybil attacks. In 2017, I was asked to join the private bad-relays Tor Project mailing list to help analyze and confirm reports about malicious relays. To get a better understanding of who runs what fraction of the Tor network over time, I started OrNetStats. It also shows you which operators could deanonymize Tor users because they are in a position to perform end-to-end correlation attacks, something we will describe later. I'm also the maintainer of ansible-relayor, an Ansible role used by many large relay operators. Out of curiosity, I also like engaging in some limited open-source intelligence gathering on malicious Tor network actors, especially when their motivation for running relays has not been well understood.
To avoid confusion with regard to the Tor Project: I am not employed by the Tor Project, and I do not speak for the Tor Project. In this presentation, we will go through some examples of malicious actors on the Tor network. They basically represent our problem statement and motivate us to improve the status quo. After describing some issues with current approaches to fighting malicious relays, we present a new, additional approach aiming at a safer Tor experience by using trusted relays to some extent. The primary target audience of this presentation is Tor users such as Tor Browser users, relay operators, onion service operators (for example, SecureDrop), and anyone else who cares about Tor.

To get everyone on the same page, a quick refresher on how Tor works and what types of relays, also called nodes, there are. When Alice uses Tor Browser to visit Bob's website, her Tor client selects three Tor relays to construct a circuit that will be used to route her traffic through the Tor network before it reaches Bob. This gives Alice location anonymity. The first relay in such a circuit is called an entry guard relay. This relay is the only relay that sees Alice's real IP address, and it is therefore considered a more sensitive type of relay. The guard relay does not learn that Alice is connecting to Bob, though; it only sees the next relay as its destination. Guard relays are not changed frequently: Alice's Tor client waits up to 12 weeks before choosing a new guard, to make some attacks less effective. The second relay is called a middle (or middle-only) relay. This is the least sensitive position, since it only sees other relays and does not learn anything about Alice or Bob; it just forwards encrypted traffic. The final relay is called an exit relay. The exit relay gets to learn the destination, Bob, but does not know who is connecting to Bob.
The exit relay is also considered a more sensitive relay type, since it potentially gets to see and manipulate cleartext traffic if Alice is not using an encrypted protocol like HTTPS. Although exit relays see the destination, they cannot link all sites Alice visits at a given point in time to the same Tor client to profile her, because Alice's Tor Browser instructs the Tor client to create and use distinct circuits for distinct URL-bar domains. So although this diagram shows only a single circuit, a Tor client usually has multiple Tor circuits open at the same time. In networks where Tor is censored, users make use of a special node type called a bridge. The primary difference is that bridges are not included in the public list of relays, to make them harder to censor. Alice has to configure Tor Browser manually if she wants to use a bridge. For redundancy, it is good to have more than one bridge, in case a bridge goes down or gets censored. The bridge Alice uses also gets to see her real IP address, but not the destination.

Now that we have a basic understanding of Tor's design, we might wonder why we need to trust the network when roles are distributed across multiple relays. So let's look into some attack scenarios. If an attacker controls Alice's guard and exit relay, they can learn that Alice connected to Bob by performing an end-to-end correlation attack. Such attacks can be passive, meaning no traffic is manipulated, and therefore they cannot be detected by probing suspect relays with test traffic. OrNetStats gives you a daily updated list of potential operators in such a position. There are some restrictions a default Tor client follows when building circuits to reduce the likelihood of this occurring. For example, a Tor client does not use more than one relay from the same /16 IPv4 network block when building a circuit. Alice's Tor client would never create the circuit shown here, because the guard and exit relays are in the same /16 network block, 192.0.0.0/16.
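To make the /16 restriction just described concrete, here is a minimal sketch of the check in Python. This is an illustration of the rule, not Tor's actual implementation; the IP addresses are placeholders.

```python
import ipaddress

def same_slash16(ip_a: str, ip_b: str) -> bool:
    """Return True if two IPv4 addresses fall into the same /16 block,
    mirroring the restriction a default Tor client applies: no two
    relays from the same /16 in one circuit."""
    block_a = ipaddress.ip_network(f"{ip_a}/16", strict=False)
    return ipaddress.ip_address(ip_b) in block_a

# Guard and exit in the same 192.0.0.0/16 block: such a circuit is disallowed.
print(same_slash16("192.0.1.10", "192.0.2.20"))    # True  -> disallowed
# Addresses from different /16 blocks: allowed in one circuit.
print(same_slash16("192.0.1.10", "198.51.100.5"))  # False -> allowed
```

This is why the number of distinct /16 blocks an attacker spreads relays across matters: more blocks means more circuits in which two of their relays can legally appear together.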
For this reason, the number of distinct /16 network blocks an attacker distributes their relays across is relevant when evaluating this kind of risk. Honest relay operators declare their group of relays in the so-called MyFamily setting. This way, they are transparent about their set of relays, and Tor clients automatically avoid using more than a single relay from any given family in a single circuit. Malicious actors will either not declare relay families or pretend to be more than one family. Another variant of the end-to-end correlation attack is possible when Bob is the attacker, or has been compromised by the attacker, and the attacker also happens to run Alice's guard relay. In this case, the attacker can also determine the actual source IP address used by Alice when she visits Bob's website. In the case of large suspicious non-exit relay groups, it is also plausible that they are after onion services, because circuits to onion services do not require exit relays. Onion services provide location anonymity to the server side. By running many non-exits, an attacker could aim at finding the real IP address / location of an onion service. Manipulating exit relays is probably the most common attack type detected in the wild; it is also the easiest attack type to perform. Malicious exits usually do not care who Alice is or what her actual IP address is; they are mainly interested in profiting from traffic manipulation. This type of attack can be detected by probing exits with decoy traffic, but since malicious exits have moved to more targeted approaches (specific domains only), detection is less trivial than one might think. The best protection against this kind of attack is using encryption. Malicious exit relays cannot harm connections going to onion services.

Now, let's look into two real-world examples of large-scale and persistent malicious actors on the Tor network.
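Before we get to the examples, here is roughly what the MyFamily declaration mentioned above looks like in a relay's torrc. The fingerprints below are placeholders; every relay belonging to the operator carries the same line, listing all family members:

```
# torrc excerpt: declare all relays of this operator as one family.
# Placeholder fingerprints; each relay of the family includes this line.
MyFamily $0123456789ABCDEF0123456789ABCDEF01234567,$89ABCDEF0123456789ABCDEF0123456789ABCDEF
```

A Tor client that sees this declaration will never use two of these relays in the same circuit, which is exactly the transparency malicious operators avoid by omitting or splitting the setting.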
The first example, tracked as BTCMITM20, is in the malicious-exits business and performs sslstrip attacks on exit relays to manipulate plaintext HTTP traffic, for example rewriting Bitcoin addresses to divert Bitcoin transactions to the attacker. They were detected for the first time in 2020 and had some pretty large relay groups. On this graph, you can see how much of the Tor exit fraction was under their control in the first half of 2020. The different colors represent the different contact infos they put on their relays to pretend to be distinct groups. The sharp drops show events where they were removed from the network before adding relays again. In February 2021, they managed over 27 percent of the Tor network's exit capacity, despite multiple removal attempts over almost a year. At some point in the future, we will hopefully have HTTPS-Only Mode enabled by default in Tor Browser to kill this entire attack vector for good and make malicious exits less lucrative. I encourage you to test HTTPS-Only Mode in Tor Browser and notify operators of websites that do not work in that mode. If a website does not work in HTTPS-Only Mode, you also know it is probably not safe to use in the first place.

The second example, an actor tracked as KAX17, is still somewhat of a mystery, and that is not the best situation to be in. They are remarkable for their focus on non-exit relays, their network diversity (over 200 distinct /16 subnets), and their size: they are the first actor I know of that peaked at over 100 gigabits per second of advertised non-exit bandwidth, and they have been active for a very long time. Let's have a look at some KAX17-related events of the past two years. I first detected and reported them to the Tor Project in September 2019. In October 2019, KAX17 relays got removed by the Tor directory authorities for the first time. In December 2019, I published the first blog post about them. At that point, they were already rebuilding their infrastructure by adding new relays.
In February 2020, I contacted an email address that was used on some relays that did not properly declare their relay group using the MyFamily setting. At the time, they said they would run bridges instead, so they would not have to set MyFamily. Side note: MyFamily is not supported for bridges. I was not aware that this email address was linked to KAX17 until October 2021. In the first half of 2020, I regularly reported large quantities of relays to the Tor Project, and they got removed at a high pace, until June 2020, when the directory authorities changed their practices and stopped removing them, because they didn't want to scare away potential new relay operators. In July 2020, an email address joined a tor-relays mailing list discussion I had started about a proposal to limit large-scale attacks on the network. Now we know that email address is linked to KAX17. Since the Tor directory authorities no longer removed the relay groups showing up, I sent the information on over 600 KAX17 relays to the public tor-talk mailing list. In October 2021, someone who asked for anonymity reached out to me and provided a new way to detect Tor relay groups that do not run the official Tor software. Using this methodology, we were able to confirm KAX17 with a second detection method. This apparently also convinced the Tor directory authorities, and in November 2021, a major removal event took place. Sadly, the time span during which KAX17 was running relays without limitations was rather long. This motivated us to come up with a design that avoids this kind of complete dependency on the Tor directory authorities when it comes to safety issues. And as you might guess, KAX17 is already attempting to restore their foothold again.

Here are some KAX17 properties. After the release of my second KAX17 blog post in November 2021, the media was quick to use words like nation-state and advanced persistent threat.
But I find it hard to believe such serious entities would be so sloppy. Since they claimed to work for an ISP in every other email, I looked into their AS distribution. I guess they work for more than one ISP. This chart shows the autonomous systems they used, sorted by the number of unique IP addresses at each hoster. So, for example, they used more than 400 IP addresses at Microsoft to run relays. These are not exact numbers, since the chart only includes relays since 2019, and there are likely more. If we map their IP addresses to countries, we get this. Do not take this map too seriously, as the GeoIP database used was severely outdated and such databases are never completely accurate, but it gives us a rough idea. To be clear: I have no evidence that KAX17 is performing any kind of attack against Tor users, but in our threat model it is already a considerable risk if even a benevolent operator does not declare their more than 800 relays as a family. Good protections should protect against benevolent and malicious Sybil attacks equally. The strongest input factor for the risk assessment of this actor is the fact that they do not run the official Tor software on their relays. There are still many open questions, and the analysis of KAX17 is ongoing. If you have any input, feel free to reach out to me.

After looking at some examples of malicious actors, I want to briefly summarize some of the issues in how the malicious-relays problem is currently approached. It is pretty much like playing whack-a-mole: you hit them and they come back; you hit them again and they come back again, over and over. And while you're at it, you're also training them to come back stronger next time. Malicious actors can run relays until they get caught / detected, or are considered suspicious enough for removal by the Tor directory authorities. If your threat model does not match the Tor directory authorities' threat model, you are out of luck, or you have to maintain your own exclusion lists.
Attempts to define a formal set of "do not" requirements for relays that the Tor directory authorities commit to enforce have failed, even with the involvement of a core Tor developer. It is time for a paradigm change. The current processes for detecting and removing malicious relays are failing us and are not sustainable in the long run. In recent years, malicious groups have become larger, harder to detect, harder to get removed, and more persistent.

Here are some of our design goals. Instead of continuing the one-sided arms race with malicious actors, we aim to empower Tor users for self-defense, without requiring the detection of malicious Tor relays and without solely depending on the Tor directory authorities to protect us from malicious relays. We aim to reduce the risk of deanonymization by using at least a trusted guard or a trusted exit, or both. We also acknowledge that it is increasingly impossible to detect all malicious relays using decoy traffic; therefore, we stop depending on the detectability of malicious relays to protect users. In today's Tor network, we hope not to choose a malicious guard when we pick one. In the proposed design, we would pick a trusted guard instead. In fact, this can be done with today's Tor Browser if you set trusted relays as your bridges. Another supported configuration would be to use trusted guards and trusted exits. Such designs are possible without requiring code changes in Tor, but they are cumbersome to configure manually, since Tor only supports relay fingerprints and does not know about relay operator identifiers.

But what do we actually mean by trusted relays? Trusted relays are operated by trusted operators. These operators are believed to run relays without malicious intent. Trusted operators are specified by the user. Users assign trust at the operator level, not the relay level, for scalability reasons and to avoid configuration changes when an operator changes their relays.
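As a rough illustration of why this is cumbersome by hand, such a configuration can be expressed with standard torrc options; the IP address and fingerprints below are placeholders, not real relays:

```
# Use a trusted relay as a bridge; it then serves as the entry into the network.
UseBridges 1
Bridge 192.0.2.10:9001 0123456789ABCDEF0123456789ABCDEF01234567

# Restrict exits to trusted relays, identified by fingerprint.
ExitNodes $89ABCDEF0123456789ABCDEF0123456789ABCDEF
StrictNodes 1
```

Every relay has to be listed by fingerprint, and the list has to be maintained manually whenever the trusted operator changes their relays, which is exactly the problem operator-level identifiers are meant to solve.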
Since users should be able to specify trusted operators, we need human-readable, authenticated, and globally unique operator identifiers. By authenticated, we mean they should not be arbitrarily spoofable, like the current relay contact info is. For simplicity, we use DNS domains as relay operator identifiers, and we will probably restrict them to 40 characters in length. How do authenticated relay operator IDs (AROIs for short) work? From an operator's point of view, configuring an AROI is easy. Step one: the operator specifies the desired domain under her control using Tor's ContactInfo option. Step two: the operator publishes a simple text file at the IANA well-known URI, containing all of her relay fingerprints. If no web server is available, or if the web server is not considered safe enough, DNSSEC-signed TXT records are also an option for authentication. Using DNS is great for scalability and availability thanks to DNS caching, but since every relay requires its own TXT record, proof validation will take longer than with the URI-type proof. Operators that have no domain at all can use free services like GitHub Pages or similar to serve the text file. For convenience, Eran Sandler created a simple-to-use ContactInfo generator, so relay operators don't have to read the specification to generate the required ContactInfo string for their configuration. For the authenticated relay operator ID, the url and proof fields are the only relevant fields. There are already over 1,000 relays that have implemented the authenticated relay operator ID. OrNetStats displays an icon when an operator has implemented it correctly. Out of the top 24 largest families by bandwidth, all but eight operators have implemented the authenticated relay operator ID already. On the right-hand side, you can see a few logos of organizations running relays with a properly set up AROI.
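A minimal sketch of checking the URI-type proof could look like this in Python. The well-known path and the one-fingerprint-per-line file format follow my reading of the ContactInfo sharing specification; treat both as assumptions to verify against the current spec before relying on them.

```python
import urllib.request

def parse_fingerprints(text: str) -> set[str]:
    """Parse a proof file in the assumed format: one 40-character hex
    relay fingerprint per line, '#' lines treated as comments."""
    return {
        line.strip().upper()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }

def fetch_aroi_proof(domain: str) -> set[str]:
    """Fetch the operator's published fingerprint list from the
    well-known URI (path assumed per the ContactInfo sharing spec)."""
    url = f"https://{domain}/.well-known/tor-relay/rsa-fingerprint.txt"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_fingerprints(resp.read().decode("utf-8", "replace"))
```

A relay's claimed AROI is then considered proven only if its fingerprint appears in the file served under the domain it claims, so the operator must actually control that domain.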
The most relevant distinction between lines that have that checkmark icon and those that do not is that the string in lines without the icon can be spoofed arbitrarily. This graph shows the largest exit operators that have implemented the AROI. I want to stress one crucial point about AROIs: authenticated must not be confused with trusted. Malicious operators can also authenticate their domain, and they do. Whether a given AROI is trusted is up to the user, but using AROIs instead of contact info for assigning trust is crucial, because contact infos cannot be trusted directly without further checks. This graph shows what fraction of the Tor network's exit capacity has implemented the authenticated relay operator ID over time. Currently we are at around 60 percent already, but guard capacity is a lot lower, at around 15 percent. The reason is that exits are operated mostly by large operators and organizations, while guards are distributed across a lot more operators. There are over 1,800 guard families, but only around 400 exit families.

How does a Tor client make use of AROIs? Current Tor versions do not know what AROIs are and primarily take relay fingerprints as configuration input. So we need some tooling to generate a list of relay fingerprints starting from a list of trusted AROIs. We have implemented a quick-and-dirty proof of concept that puts everything together and performs all the steps shown on this slide to demonstrate the concept of using trusted AROIs to configure a Tor client to use trusted exit relays. It is not meant to be used by end users; it is merely a preview for the technical audience who would like to see it in action to get a better understanding of the design.
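One way such tooling could translate AROIs into fingerprints is sketched below: ask Onionoo (the Tor network status API) which relays claim a given operator domain in their contact info, then intersect that claimed set with the operator's published proof. This is a hypothetical illustration of the idea, not the actual proof-of-concept code; the Onionoo `contact` and `fields` parameters are used as I understand them from the Onionoo protocol documentation.

```python
import json
import urllib.parse
import urllib.request

ONIONOO = "https://onionoo.torproject.org/details"

def relays_claiming_operator(domain: str) -> set[str]:
    """Return fingerprints of relays whose contact info mentions the
    operator domain. Claims alone prove nothing: any relay can put any
    domain in its contact info."""
    url = f"{ONIONOO}?contact={urllib.parse.quote(domain)}&fields=fingerprint"
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = json.load(resp)
    return {r["fingerprint"] for r in data.get("relays", [])}

def trusted_fingerprints(claimed: set[str], proven: set[str]) -> set[str]:
    """Keep only fingerprints that both claim the AROI and appear in the
    operator's published proof file; the result can feed a torrc
    ExitNodes line."""
    return claimed & proven
```

The intersection step is the important part: a fingerprint is only accepted if the claim in the Tor network data and the proof under the operator's domain agree.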
The current proof of concept performs all proof checks itself, without relying on third parties. But since there are a lot of reasons for doing proof checks centrally instead, for example by the directory authorities, I recently submitted a partial proposal for that to the Tor development mailing list, to see whether they would consider it before I proceed with a more serious implementation than the current proof of concept. I find it important to always try achieving a common goal together with upstream first, before creating solutions that are maintained outside of upstream, because that leads to better-maintained improvements and likely a more user-friendly experience if they are integrated upstream. Here is a link to the mentioned tor-dev email for those who would like to follow along.

To summarize: after reviewing some real-world examples of malicious actors on the Tor network, we concluded that current approaches to limiting the risks bad relays pose to Tor users might not live up to Tor users' expectations, are not sustainable in the long run, and need an upgrade to avoid depending on the detectability of malicious relays, which is becoming increasingly hard. We presented a design that extends current anti-bad-relay approaches and does not rely on the detection of malicious relays, using trusted authenticated relay operator IDs. We have shown that most exit capacity has implemented AROIs already, while guard capacity is currently significantly lower, showing a lack of insight into who operates Tor's guard capacity. When publicly speaking about modifying Tor's path selection in front of a wide audience, I also consider it my responsibility to explicitly state that you should not change Tor configuration options that influence path selection behavior without a clear need according to your threat model, to avoid potentially standing out. Using trusted AROIs certainly comes with some trade-offs of its own, like, for example, network load balancing, to name only one.
Thanks to many large trusted exit operators, it should be feasible in the near future to use trusted exits without standing out in a trivially detectable way, because it is harder (in the sense of taking longer) to statistically detect that a Tor client changed its possible pool of exits if it only excluded a smaller fraction of exits. Detecting Tor clients that use only a subset of all guards takes a lot longer than detecting custom exit sets, because guards, compared with exits, are not changed over a long period of time. And finally, Tor clients that make use of trusted AROIs will need a way to find trusted AROIs. Ideally, they could learn about them dynamically in a safe way. There is an early work-in-progress draft specification linked on the slide.

I want to dedicate this talk to Karsten Loesing, who passed away last year. He was the kindest person I got to interact with in the Tor community. Karsten was the Tor metrics team lead, and without his work, my projects OrNetStats and OrNetRadar would not exist. Every time you use metrics.torproject.org, for example the so-called Relay Search, you are using his legacy. Thank you for listening, and I'm really looking forward to your questions. I'm not sure I'll be able to respond to questions after the talk in real time, but it would be nice to have them read out so they are part of the recording, and I'll make an effort to publish answers to all of them via Mastodon should I not be able to respond in real time. I'm also happy to take tips about unusual things you observed on the Tor network. Do not underestimate your power as a Tor user to contribute to a safer Tor network by reporting unusual things. Most major hits against bad-relay actors were the result of Tor user reports.

Okay, thank you very much for this very informative talk. And yeah, so we will switch over to the Q&A now. And yeah, thanks again, very fascinating. So, we have collected several questions from our IRC chat. I'm just going to start.
If bridges don't need the MyFamily setting, isn't this a wide-open gap for end-to-end correlation attacks? For example, a malicious actor could somehow make their relay popular as a bridge? Yes, bridges are a concern in the context of MyFamily. For that reason, it is not recommended to run bridges and exits at the same time in current versions of Tor. But future versions of Tor will get a new and more relay-operator-friendly MyFamily design, which will also support bridges. This will likely be in Tor 0.4.8.x at some point in 2022.

Okay, thanks. Besides what kind of attack there is: are there statistics on who these attacks are coming from, or from which country they are coming most? The background here is that there are rumors about NSA-driven exit nodes. I don't know about any general statistics, but I usually include the autonomous systems used by certain groups when blogging about them. There are some autonomous systems that are notorious for being used by malicious groups, but malicious groups also try to blend in with the rest by using large ISPs like Hetzner and OVH.

Thanks. Is using a bridge that I host myself safer than using a random guard? This is a tricky question, since it also depends on whether it is a private bridge, that is, a bridge that is not distributed to other users via BridgeDB. I would say it is better not to run the bridges you use yourself.

Okay. What is worse: KAX17, or a well-known trusted operator running 20 percent of Tor's exits? Currently, I would say KAX17.

Okay, and I think that's the last one for now. Isn't anonymity decreased or changed while using a trusted relay list? Yes, this is a trade-off that users will need to make. It heavily depends on the threat model.

Okay, so I think we have gathered all the questions and they were all answered. So thank you again for...