Hello and welcome to my presentation on ContraCorona. This is joint work with Wasilij Beskorovajnov, Gunnar Hartung, Alexander Koch, Jörn Müller-Quade, and Thorsten Strufe; my name is Felix Dörre, and I will be presenting our work on how we can improve privacy for contact tracing systems. Currently in Germany we have an official contact tracing app that uses the Google/Apple Exposure Notification framework and has been downloaded by roughly 40% of the population. However, as seen in the graph on the right, only about 20% of newly detected infections result in a key being shared with the contact tracing app and contacts being warned. So only about one in five infections uses the contact tracing app to warn other people. In this talk I would like to cover the Google/Apple Exposure Notification framework as background, the privacy problems it leaves open, our goals, how we improve privacy over that system, and how we model the security of our approach to build a better contact tracing system. First, a bit about the Google/Apple Exposure Notification framework and how it works cryptographically. When a user installs the app, the app draws a random key once per day, from which it derives short-term keys that are broadcast and rotated roughly every 15 minutes. Under normal operation, the apps record all keys broadcast in their proximity by regularly listening to incoming Bluetooth communication, and store them together with metadata such as signal strength.
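To make the key schedule concrete, here is a minimal sketch of a daily key and the short-term identifiers derived from it. The derivation shown (plain SHA-256 over key and epoch number) is a simplification I chose for illustration; the real GAEN scheme derives a rolling proximity identifier key via HKDF and then AES-encrypts the interval number, but any one-way, per-epoch derivation conveys the idea.

```python
import hashlib
import os

EPOCHS_PER_DAY = 96  # 24 hours / 15-minute rotation interval

def daily_key() -> bytes:
    """Draw a fresh random daily key (one per day)."""
    return os.urandom(16)

def short_term_id(day_key: bytes, epoch: int) -> bytes:
    """Derive the identifier broadcast during one 15-minute epoch.

    Hypothetical derivation for illustration only; the real GAEN
    scheme uses HKDF plus AES, not a bare hash.
    """
    return hashlib.sha256(day_key + epoch.to_bytes(4, "big")).digest()[:16]

tek = daily_key()
ids = [short_term_id(tek, e) for e in range(EPOCHS_PER_DAY)]
assert len(set(ids)) == EPOCHS_PER_DAY  # a distinct identifier per epoch
```

The important property is that anyone holding the daily key can re-derive all 96 identifiers of that day, while the identifiers alone look unlinkable.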
When a participant later becomes infected and is tested, so the infection is detected, they receive some authorization such as a transaction number, and then they upload the secret keys they used to derive the short-term identifiers, all their secret keys for the last 14 days, to a backend. All apps regularly pull the secret keys from the backend, expand them again into the short-term identifiers, and check whether they have seen any of them. If they have, they accumulate the signal strength to approximate the risk, and if the risk is sufficiently high, the user is warned. The apps themselves implement only the user interaction and the backend communication; everything else, such as storing these identifiers and summing up the risk, is done by the operating system itself, so the apps do not have direct access to the Bluetooth interface. Now, what are the privacy problems with this approach? When a person is infected, they need to upload their secret keys, and thereby they make all the short-term identifiers they used over one day linkable. My 14 secret keys themselves are not linkable to me; they are mixed with everyone else's when uploaded. But when someone has met me twice during the same day, they can see that those two encounters were the same person. This is not necessary for them to be warned. Additionally, they learn which of their encounters during the day, and how many of them, led to a warning. So if they have been in contact with two infected persons, they learn that there were two distinct persons and which encounters they were. If they were to write down at what point in time they met which person and where they were, they could later look at their record of short-term identifiers, see which ones have matches, and determine where exactly they were infected. Again, this is not information they would necessarily need.
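The matching step described above can be sketched as follows. The sketch assumes the simplified hash-based derivation from before (the real framework weights exposure duration, Bluetooth attenuation, and day of exposure, not a bare signal-strength sum); `seen` stands for the device's local record of heard identifiers and their signal strengths.

```python
import hashlib

def expand(day_key: bytes, epochs: int = 96) -> list[bytes]:
    """Re-derive all short-term identifiers from an uploaded daily key
    (illustrative hash-based derivation, not the real GAEN one)."""
    return [hashlib.sha256(day_key + e.to_bytes(4, "big")).digest()[:16]
            for e in range(epochs)]

def should_warn(seen: dict[bytes, float],
                infected_day_keys: list[bytes],
                threshold: float) -> bool:
    """Accumulate signal strength over matching encounters; warn if the
    total crosses the threshold. Simplified risk model for illustration."""
    total = 0.0
    for key in infected_day_keys:
        for ident in expand(key):
            total += seen.get(ident, 0.0)
    return total >= threshold
```

Note that this local matching is exactly what reveals to the user which of their recorded encounters belonged to infected persons, the leakage discussed above.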
They would only need the information that they have been in contact with an infected person and that they should probably quarantine. So our goal is to improve the privacy of infected participants. One could argue that only few participants should be infected, but as the pandemic goes on, many people become infected, and they are potentially the more vulnerable participants in the system. We also hope that increasing privacy could help convince people to voluntarily share their diagnosis keys. And we want to do this in a way that does not compromise the privacy of the other participants: of course we do not want a central authority that can build up contact graphs or movement profiles. Finally, we want to give a security model, so that we can see exactly which information is leaked about the contacts even in the ideal functionality of contact tracing: what we would require, and what we are willing to leak. Our approach is a hybrid between a fully decentralized and a centralized approach. We have some central logic, but it is split into three distinct functionalities that are ideally operated by independent parties, so that they cannot collaborate and share more information than required by the interfaces we specify. The users in the system draw a new secret key once per day and register it with the submission server, and they do that through some kind of anonymizing proxy. This is something that is overlooked in current applications: even though the contact tracing itself may be pseudonymous, the moment you interact with the central infrastructure without additional precautions, you implicitly leak your IP address, and IP addresses can be resolved to natural persons.
So we want an anonymizing proxy here, operated by an independent party. The apps, as I said, upload their regularly generated identifiers and then query the system for warnings. Let's see how an infection is reported. When a user gets a positive test, authorized for example through a TAN, as in most current contact tracing apps, the user uploads all the identities that they recorded. So they do not upload the identities they broadcast themselves, but the identities they recorded, in order to get all those people warned. In that way, our approach is dual to the approach of the Google/Apple Exposure Notification framework. These uploaded identifiers are then translated by the matching server and the warning server into warning identifiers, with one warning identifier per person and day for each person that should be warned. So even if you are warned by two people, in the end there will be only one warning identifier for you on the warning server, and you cannot tell how many people have warned you, thereby protecting their privacy. Now a bit more on our security model. As is common nowadays, we compare ourselves with an ideal functionality and later prove that no environment can distinguish whether it is executing with an adversary and the real protocol, or in an experiment with a simulator that uses the adversary and has access to an ideal functionality, where the ideal functionality only reveals the information necessary for the functionality we want to achieve. The environment in our case is allowed to choose the physical scenario: which persons are in contact with each other at what point in time, and at what point in time which persons become infected and then warn other participants.
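The deduplication into one warning identifier per person and day can be sketched as follows. The `registry` mapping is a hypothetical stand-in for the translation the matching and warning servers perform; the point is only that collecting results in a set hides how many infected contacts reported the same person.

```python
def warning_identifiers(reports: list[list[str]],
                        registry: dict[str, str]) -> set[str]:
    """Translate uploaded encounter identifiers into warning identifiers.

    Each report is the list of identifiers one infected user recorded.
    `registry` maps each short-term identifier to the (hypothetical)
    per-person, per-day warning identifier registered on upload.
    Using a set means a warned user sees one warning identifier even
    when several infected contacts reported them.
    """
    warned = set()
    for report in reports:
        for ident in report:
            if ident in registry:
                warned.add(registry[ident])
    return warned

# Alice was recorded by two different infected users on the same day:
registry = {"a1": "warn-alice-day5", "a2": "warn-alice-day5"}
reports = [["a1"], ["a2"]]
assert warning_identifiers(reports, registry) == {"warn-alice-day5"}
```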
We model time as epochs that are simply incremented over time: long-term epochs, the one-day intervals that are usually used, and, nested within them, short-term epochs, which determine how often the identifiers are rotated. Now a bit more about how the contact graph is modeled. This is a contact graph as the environment would specify it. We model it as a directed graph: even though one would at first think of contacts as something undirected, in the real world the Bluetooth messages that we use to model "being in contact" might only go through in one direction. That is why we chose not to restrict this graph to an undirected one. In this example, Bluetooth messages from A reach C, but not the other way round, for whatever reason. There are also nodes B and D in this graph, and those nodes are corrupted: the adversary can statically corrupt some participants, and all corrupted participants then collaboratively try to break the privacy of the other participants in the system. Given this contact graph, some information about it needs to be leaked to the adversary, because we cannot hide it anyway, and in order to later be able to prove security, we need to make this leakage explicit. For example, the adversary can always find out about the edge between the corrupted nodes B and D, and that it goes only in this direction, because it can simply let the corrupted party B send a Bluetooth broadcast with some arbitrary value; if party D receives that value, the edge must be present. So in the leaked contact graph, all edges between corrupted nodes are just copied over.
The edges from honest nodes are also copied over, but as shown in the graph on the right, those nodes are drawn with dashed lines, indicating that the adversary does not learn the full identity of those nodes but only a pseudonym that is switched every short-term epoch. The environment can set a new contact graph every short-term epoch, and the pseudonym for A is chosen fresh every time, so the adversary cannot tell which node from the previous contact graph it represents. As you can see, there are also edges outgoing from the corrupted nodes, but these carry even less information, because for each outgoing edge the target node is split. Instead of a single node for A, we create one copy per corrupted sender, since the adversary cannot distinguish whether the broadcasts from B and D reach the same honest node or two different honest nodes. We do model, however, that the adversary can detect how many honest nodes the corrupted broadcasts reach, because when node B later gets infected, the number of warned participants will go up, and that is something the adversary will be able to notice. Now a bit more detail on the real/ideal modeling. We compare the protocol execution in the real world, where the adversary interacts with parties that execute the protocol as specified and with an environment that gives them their inputs and can exchange arbitrary messages with the adversary interactively at any time, against an ideal world on the other side, where the adversary is wrapped in a simulator. All messages that the adversary saw in the real world, the simulator needs to reproduce in the ideal world, so that the adversary believes it is in the real world. Now, how does the contact graph play into all this?
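The leakage rules above can be summarized in a small sketch. This is my own illustrative encoding, not the paper's formalization: corrupted-to-corrupted edges are copied verbatim, honest parties appear only under fresh per-epoch pseudonyms, and the honest target of each corrupted sender's edge is split per sender.

```python
import itertools

def leak_graph(edges: set[tuple[str, str]],
               corrupted: set[str]) -> set[tuple[str, str]]:
    """Build the leaked contact graph for one short-term epoch.

    edges: directed (sender, receiver) pairs of the real contact graph.
    corrupted: statically corrupted parties, visible by their identity.
    """
    pseudonym = {}                 # honest party -> fresh pseudonym
    counter = itertools.count()

    def pseud(p: str) -> str:
        if p not in pseudonym:
            pseudonym[p] = f"pseud{next(counter)}"
        return pseudonym[p]

    leaked = set()
    for s, r in edges:
        if s in corrupted and r in corrupted:
            leaked.add((s, r))                       # fully visible edge
        elif s not in corrupted:
            leaked.add((pseud(s), r if r in corrupted else pseud(r)))
        else:  # corrupted sender, honest receiver: split the target
            leaked.add((s, f"{pseud(r)}@{s}"))
    return leaked
```

Calling `leak_graph` again for the next epoch with a fresh pseudonym table models that the adversary cannot link honest nodes across epochs.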
The environment regularly specifies a new contact graph and sends it to a special designated party, PMAT, which only forwards it to an ideal functionality FMAT that is present even in the real model. We specify FMAT as an ideal trusted program to model the way we expect the physical world to behave: the environment can specify which parties are in proximity to each other by specifying such a contact graph. When a party P2 then broadcasts an identifier, FMAT looks up in the current contact graph which parties are in proximity and forwards the broadcast to those parties. In the ideal world, we replace the whole protocol execution with an ideal functionality FCT, which just receives the inputs from all the parties. So while the parties on the left execute the protocol as specified, the parties on the right simply forward all their inputs to FCT and receive their outputs directly from it. You may notice that the ideal functionality FMAT that was present in the real world is missing in the ideal world. This is because the contact graph that the environment sends to PMAT is forwarded directly to FCT, and FCT can then use the contact graph to look up which parties are in proximity to each other, and thereby which party should be warned of which infection, to best model the ideal functionality we would want from contact tracing. In order for the simulator to simulate all the messages the adversary sees in the real world, it needs the leakage graph we discussed earlier, so FCT explicitly sends this leakage graph to the simulator.
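A toy version of the physical-world functionality makes the broadcast forwarding concrete. The class and method names here are illustrative placeholders, not the paper's notation; the sketch only shows FMAT's job of delivering each broadcast along the current contact graph's edges.

```python
class FMat:
    """Toy model of the physical-world functionality: forwards each
    broadcast to the parties currently in proximity to the sender."""

    def __init__(self) -> None:
        self.edges: set[tuple[str, str]] = set()  # directed (sender, receiver)
        self.inbox: dict[str, list[str]] = {}     # receiver -> heard identifiers

    def set_graph(self, edges) -> None:
        """Called when the environment updates the contact graph
        (via PMAT) for the current short-term epoch."""
        self.edges = set(edges)

    def broadcast(self, sender: str, identifier: str) -> None:
        """Deliver `identifier` to every party the sender reaches."""
        for s, r in self.edges:
            if s == sender:
                self.inbox.setdefault(r, []).append(identifier)
```

In the ideal world this machinery disappears: FCT receives the contact graph directly and decides who gets warned, without any identifiers being exchanged at all.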
As a conclusion: the Exposure Notification framework proposed by Google and Apple is a great first step compared to centralized contact tracing, where all encounters are stored centrally and a central authority can access this data. But even this framework leaves something to be desired. We present an approach that protects privacy better, especially the privacy of infected participants. As current discussions about how to further develop these contact tracing apps show, every piece of information that is theoretically there will eventually be used; if we can protect such information cryptographically, we might prevent such additional privacy intrusions. To show that our protocol provides better privacy, we give a security model and show that our protocol is secure in this model. While in the conference version we focus on proving security against honest servers, in the full version, available under this link, we give more detail on how to prove security for the case when those servers are corrupted. Thanks for your attention.