All right, in that case I shall introduce myself, which is fine. I am part of the organizing team for the FOSSASIA Summit and have been since it first came to Singapore in, I think, 2014. My day job is as Chief Privacy Officer and Chief Technical Officer at an analytics company here; my responsibility is for the technology and for the information policy and practices. I'm also President of the Singapore Chapter of the Internet Society. I have a computer science degree, and I'm a dancer, a runner, and a ham radio operator. But what I'm going to talk about today is a way to improve Bluetooth-assisted contact tracing, and particularly Singapore's approach. Singapore's TraceTogether program has been the most successful in the world and has reportedly shaved days off the contact tracing process, which translates into fewer people ill, fewer people killed, less economic disruption, shorter lockdowns, and so on.

The approach I'm calling Aonyx. Aonyx is actually a genus of otter, and as Singaporeans will certainly know, there are two troops of otters living in Singapore. Some of them you see here, taking in a golf match at the Tampines Golf Course. The otter has become the mascot of the TraceTogether program, because of the "TT" in its name and the fact that they're a popular Singapore animal. So I have chosen Aonyx as the parent genus name, so to speak, for my related project to improve privacy and freedom.

I'll start with the problem, because this is surprisingly poorly understood in some places, and in a lot of places it's denied just because it's uncomfortable. When an epidemic breaks out, the initial response, in order to be effective, relies on contact tracing. You can do without it, but you will pay for that in illness, in economic impact, and in death. So the rationale is that an effective response relies on contact tracing, and this achieves a number of things.

The big thing is that it interrupts transmission. If you identify carriers and isolate them, then they're less likely to infect other people. The same thing can be achieved with lockdowns, but of course the impact is much, much smaller if you're isolating one or two or a thousand individuals than if you're isolating an entire population.

You also get to alert potentially exposed people. I had this discussion with one of the staff at the cafe where I usually work, and for her it was helpful that she had received an alert that one of the customers in the cafe had tested positive, which meant that she might have been exposed, and it meant that she could decide not to visit her grandparents and risk giving them a disease that, two years ago, prior to vaccines, would have severely risked their health or their lives.

It also means that you detect infected people quickly, and therefore can treat them in a timely way, to the extent that treatment is possible. And it provides a source of epidemiological data to allow continual refinement of the response.

There are, however, some temptations. Once you've got this need for the state to get involved, to compel people to cooperate on contact tracing, different groups are tempted to respond in some rather unfortunate ways.
One of them showed up in one of the Scandinavian countries, which actually built and deployed a contact-tracing support app that did not work by using Bluetooth encounter proximity data as TraceTogether does, but instead just used the smartphone's location services to measure the location minute by minute and report, in real time, to the health authority the location of every single person who had the app installed. And this happened in a country where the EU's GDPR is in fact the law, and they did this without talking to their privacy regulator. Sure enough, the privacy regulator heard about it, ordered them to shut it down and delete the data, and they did, and they then built another app based on Bluetooth. But it is a risk: "hey, we have to solve this problem, we've got technology, quick, let's just use it," without thinking about the consequences.

Surprisingly, the reverse temptation comes up as well. The health authorities in New South Wales, even though the federal government had built a contact-tracing support app based on TraceTogether, took the attitude of "no, no, no, our grandparents did contact tracing without technology, therefore we will too." The fact that manual tracing takes several days, and that that means infected people get to infect more people, didn't apparently occur to them. As a result, New South Wales and most of Australia had much, much longer lockdowns than Singapore experienced.

There's also the anti-everything, anarchist type of person, who doesn't like TraceTogether, tracing support, masks, hand washing, vaccines, anything at all, because an authority proposed it and therefore they will oppose it, rather than rationally evaluating whether the idea was a good one. This group exists; it's real enough that it's most of the market for "alternative facts." Their existence limits public health policy responses and therefore limits public health outcomes, which has proven terribly dangerous for the populations of the US and the UK in particular.

A bit closer to home is the techno-libertarian ideal that, hey, we can get rid of centralized government by putting everything into apps and decentralized systems. On this occasion, this infected Apple. Without going all the way into it: for TraceTogether to work quickly, efficiently and well on iPhones would have needed a change from Apple. Apple did initially cooperate on preparing such a change, but about halfway along suddenly decided that actually they much preferred to hate governments, and instead produced the Exposure Notification system, which is specifically designed to not be useful for contact tracing. The only thing it does is allow a person who has been near someone who has published the fact that they are infected to learn that they might have been exposed; it is designed to make it impossible to provide any information to contact tracers. So it's a great anti-authority thing, but it turns out not to be true that a bunch of people with apps on their phones can replace the functions of a contact tracing team in a health authority. Surprise, surprise, and see also my earlier slide.

So the question is how to strike a balance. The approach that was taken in Singapore, and I think it's wise, was to decentralize the proximity logging: the proximity information that's detected stays on your phone or token until or unless your diagnosis is positive.
If it's not, then after 25 days it just gets deleted. However, they did decide to centralize the contact information, to allow contact tracers and ambulances and others to contact the person quickly and effectively. Notice that they've centralized all contact information, not just the contact information for people who've tested positive. It is, in fact, that difference that my proposal addresses.

The other thing that happened was that a bit later on they realized it was desirable to also record ID numbers. The reason for that was not well explained, but it's tied up with what happens when you have either a blood sample or a nasal swab tested for SARS-CoV-2. The sample is placed in a sealed bag which has a sticker on it with your name, your date of birth and your ID number, and a barcode with the same information. It does not have on it your TraceTogether identifier. What flows through the health protocols is your name, your date of birth and your ID number. So what putting your ID number into TraceTogether made possible was faster connection of test results with the work of the contact tracing teams, so they could contact the suspected contacts faster, get them isolated faster, reduce onward infection, and reduce the economic impact.

Some people get a bit upset about this and think of it as scope creep: "hey, we agreed to this, and now you've taken this other thing." That's true as far as it goes, but it's also the case that rational data minimization means you start with the smallest thing you need and only go back and ask for the next thing when you have a clear reason for using it, as is the case here. So the fact that the scope expanded isn't by itself evidence of a problem. In fact, in the particular case of TraceTogether, every step was carefully justified and carefully controlled.

So what we're left with is a centralized database which contains the name, the phone number, the ID number, and the TraceTogether identifier for something like 90% of Singapore's population. That's a pretty scary thing; ideally it shouldn't exist. It exposes two risks. One is an insider-abuse risk: in theory, some sort of criminal inside government could get their hands on the database and use it as part of some scheme to unlawfully track individuals. The other is a breach risk: every time a database exists, there's a risk that an outsider will get their hands on it. If we eliminate, or at least drastically change the form of, that central database, we can reduce both of these risks, and that's what I'm going to propose.

The basic observation is that contact tracers only ever access a tiny fraction of the contact and ID database; it's less than 10%. If that were not the case, if they needed to access all of it at some point, then this approach wouldn't work. But the fact that throughout the life of the program the contact tracers will only ever access a small fraction of the data means there's a possibility to introduce a different way of thinking about the problem. And that is to insert an honest broker, which receives the registration information but doesn't hand it over to the authorities unless they're willing to state on the record that they need it for contact tracing purposes.
And when they do that, the broker will notify the person whose details were accessed that they've been accessed, and it will also periodically, perhaps daily, publish access statistics.

So, where do we find... sorry, before I get to that: having an honest broker in place doesn't forcibly prevent insider abuse, but it does introduce some involuntary transparency. An insider who's thinking of abusing the system now knows that the abuse is going to be visible, and visible to the person whose data they're accessing, or people, if they access many people's data. That tends to disincentivize the act; it's about discouragement, about changing the potential abuser's motivations in the first place. If the fact that they're going to be caught is part of the calculation, they might not do it. As a side effect, it also necessarily shapes the database into a particular form; I'll get into how in a moment. You wouldn't pay the cost of that shape if you weren't building an honest broker, but if you are, then it's a handy side effect.

So, what are we going to use? We're going to use something that the Free Software Foundation calls Treacherous Computing; it's usually called Trusted Computing, built around the Trusted Platform Module, the TPM. I feel the FSF is sometimes a bit, you know, excited with its language, but I think they got this one rather right. The whole point of the TPM was to limit your phone or your PC's ability to work for you, so that an immensely powerful billion-dollar corporation could decide what you could do with your phone, which is bad. However, we can use exactly the same technology to limit what a government can do with your data on their computers, on behalf of the population, which I would suggest is excellent.

Concretely, I was talking of course about Netflix. Netflix will talk to the TPM in your phone or your PC, and they will decide whether or not to send the stream you want to watch to your computer based upon whether the TPM can determine that you're running a player that they trust. This is why Firefox has EME in it, so it can play Netflix: Netflix will not send the stream unless trusted code is running, and it uses the TPM, where one is available, to check that.

So, let's turn that around. Let's have the database live in a government computer that contains a TPM, and have the app decide not to send data to the government computer unless that computer can prove it's running the same broker code that has been published. Granted, you're relying on the app to perform the authentication and the withholding on its side. However, the broker can be authenticated independently of the app. That is, third-party experts can directly query the broker online and verify that it really is running the same code that the government body has published, and can explore the program and its function.

So, how does all this work? Oh dear, I don't know if you can see the diagram properly; the lines are a little bit thin. The app is at the top left, the broker at the top right. The idea is that the app sends a registration message to the broker at the time that you register, and perhaps more frequently. That message includes an ephemeral TraceTogether identifier; the phone number, encrypted with a key that only the health authority can read, not the broker; and the ID number, likewise encrypted with a key that only the health authority can read. So the database stays in the broker, and the broker does not have the key.
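To make that concrete, here is a minimal sketch of what the app-side registration message might look like, assuming PyNaCl's SealedBox for the public-key encryption; the function name, the field names and the sample values are my own illustration, not anything from the TraceTogether codebase:

```python
import base64
import json

from nacl.public import PrivateKey, PublicKey, SealedBox

def build_registration(moh_public_key: PublicKey, ephemeral_id: str,
                       phone: str, national_id: str) -> bytes:
    """Encrypt the contact details so only MOH, never the broker, can read them."""
    box = SealedBox(moh_public_key)  # anonymous public-key encryption to MOH

    def seal(value: str) -> str:
        return base64.b64encode(box.encrypt(value.encode())).decode()

    return json.dumps({
        "ephemeral_id": ephemeral_id,   # links the broker's record to this app
        "phone_enc": seal(phone),       # opaque ciphertext to the broker
        "id_enc": seal(national_id),    # opaque ciphertext to the broker
    }).encode()

# Demo only: in the real design the MOH private key would live in a hardware
# security module at the Ministry of Health and never go near the broker.
moh_key = PrivateKey.generate()
message = build_registration(moh_key.public_key, "TT-3f9a",
                             "+65 8123 4567", "S1234567D")
record = json.loads(message)
phone = SealedBox(moh_key).decrypt(base64.b64decode(record["phone_enc"]))
assert phone == b"+65 8123 4567"
```

The round-trip at the bottom is only there to show that the ciphertext is recoverable with the right key; the broker itself only ever stores the opaque blobs.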
So even if the broker is breached, the data is meaningless.

In the event that a contact tracer is willing to state on the record that they need to identify a specific individual as a potential contact of an affected person, they send a request to the broker. The request identifies a particular ID, and it also provides a reason, a message that has to go to the user. What they get back is the encrypted phone number and the encrypted ID number, and people within the health authority have the key to decrypt those. The broker immediately notifies the app, and includes the reason message. So the fact of the health authority having accessed the user's data is something the user becomes aware of immediately, using the Android or iOS notification facility, or the third-party equivalents if you're running Android without Google services. And then, similarly, the broker can, through a variety of mechanisms, publish aggregate statistics each day: "today the contact tracers looked at 10,000 people's records," or, for some reason, a million. These are accountability aids for activity that is normally invisible, because the data is normally just in the possession of an IT team; here it's in the possession of a broker whose software we can examine and whose behavior can be tightly limited. There's a sketch of this disclosure path below.

The development team's procedure is to create and publish the source code for the broker; this is just a simple service running in Python or something. They then create and publish a system image to run inside a secure enclave. Microsoft, Google and Amazon all provide the ability to have virtual servers that run inside enclaves with encrypted RAM. So this is the key: a simple service in Python that runs inside encrypted RAM. Build the system image that contains the broker and publish that; it can be examined by third parties, along with the broker's code. Put the image hash into the app build configuration, and then build and release the app.

Because of the way the Trusted Platform Module works, the app can now put itself in the same situation that Netflix is in with respect to your phone or your PC. The app connects to the broker and demands proof that the broker is running the code whose system image has a particular hash. And if the TPM can't provide that proof, or if the proof doesn't check out, then the app does not provide the registration information in the first place.

So, this whole thing does depend on the app; you are still cooperating with the dev team. There's a whole question about whether the code you're running and the code that's published are the same thing, and that's a large, existing problem that I won't get into, other than to say that there has been progress on it during the last two years. But in particular, the broker can be tested directly, without relying on the app, by anybody with the curl command-line utility.
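Here is the promised sketch of the disclosure path, a minimal illustration assuming an in-RAM dictionary keyed by ephemeral ID; `notify_app`, the `push_token` field and the daily counter are illustrative stand-ins, not a published interface:

```python
import collections
import datetime

# Broker state: everything lives in enclave RAM, nothing is ever persisted.
registrations = {}                          # ephemeral_id -> record dict
daily_access_count = collections.Counter()  # date -> number of disclosures

def notify_app(push_token: str, reason: str) -> None:
    """Stand-in for an iOS/Android push notification to the data subject."""
    print(f"notify {push_token}: your registration data was accessed: {reason}")

def tracer_lookup(tracer_name: str, ephemeral_id: str, reason: str) -> dict:
    """Release ciphertexts only against an on-the-record, user-visible reason."""
    record = registrations[ephemeral_id]
    notify_app(record["push_token"], reason)             # the user learns immediately
    daily_access_count[str(datetime.date.today())] += 1  # published in aggregate daily
    # Only ciphertext leaves the broker; MOH decrypts with its own key.
    return {
        "phone_enc": record["phone_enc"],
        "id_enc": record["id_enc"],
        "accessed_by": tracer_name,  # stays on the record
        "reason": reason,
    }
```

The design point is that the notification and the statistics counter sit on the only code path that can release data, so nobody can reach the ciphertext without also triggering the transparency.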
This architecture also gives rise to a novel disaster recovery procedure. The database must be kept in encrypted RAM, which sounds hard, but it isn't, because the apps already need to contact a central service daily to get new BlueTrace ephemeral identifiers. The BlueTrace mechanism works by using a new identifier every 15 minutes; the app doesn't have the means to generate those identifiers itself, so it has to contact the backend once a day to get them. So you could easily put that within the broker, and indeed secure the system key in the broker, and have the app provide its registration data as part of that daily poll. Which means that if there's a disaster and the server disappears for some reason, okay, you're blind for 24 hours; but as apps connect to re-fetch their new ephemeral IDs, they also resupply their contact data. Recovery is automatic in the event of a total loss.

This also drastically simplifies the database. There's no persistent storage; everything is stored in RAM inside a small number of secure enclaves at hosting providers. Therefore there's no need for key management for persistent storage, which is a whole area of complexity. There's no need for storage consensus at all: the nature of the data is such that the latest update wins, and you don't need to keep track of, say, password-change attempts. The redundancy design is therefore simpler, and you don't need DBAs at all. It's not just that you don't have to hire them; you also don't have to trust them to do things securely. You just don't have a DBA anymore. It's not a conventional way to architect the application, but it makes a whole category of problems, security and operational, disappear.

Likewise, broker code evolution. If you need to make a new version of the broker, great, you just do it. You update the apps; the apps talk to the new broker, and initially the new broker has no data. That's fine: within 24 hours the new broker is fully populated, as apps contact it to fetch their ephemeral IDs and update their contact data along the way. Once all the updates have been taken up, you can just delete the old version of the broker, and its data goes away.

Likewise, data purge. If a user hasn't collected new ephemeral identifiers for 25 days, then they're already not transmitting, because they've got nothing to transmit. Therefore the broker can delete their data automatically at that point, without requiring user action. It's not deleting proximity data at this point; it's deleting the contact and identity data. If the broker hasn't heard from the phone for 25 days, fine, just throw away the registration information. So even if you fail to delete your account before ceasing to use the app, if your phone dies or is stolen or whatever, your data just disappears. And this doesn't require special operations by a human being or an operations team; it can be built in a few lines of code inside the broker, as below. Just throw everything away after 25 days of not hearing from you. And this also means that cleanup of persistent copies is never required because, of course, there are no persistent copies.
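A minimal sketch of that purge, assuming each in-RAM record carries a `last_seen` timestamp refreshed on every daily poll; the names are illustrative:

```python
import datetime

RETENTION = datetime.timedelta(days=25)

def purge_stale(registrations: dict, now: datetime.datetime) -> None:
    """Drop any registration whose app hasn't polled for 25 days.

    Because the store is latest-update-wins and lives purely in RAM,
    deletion really is deletion: no persistent copies to chase down.
    """
    stale = [eid for eid, record in registrations.items()
             if now - record["last_seen"] > RETENTION]
    for eid in stale:
        del registrations[eid]
```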
On accountability: users know straight away if their data is accessed, because the broker alerts their phone immediately. So if you get an alert that says your data's been accessed, and then you don't get a phone call or a text from a contact tracer telling you why, you might start asking questions. If many people start getting this sort of thing and start asking each other, then perhaps journalists get involved and the alarm is raised. The publishing of aggregate data is actually only an additional protection, not the primary protection mechanism: useful, but not critical.

Non-problems; I won't go into many of them, for lack of time. There's a potential problem in the fact that, even though the content of the RAM is encrypted, a malicious service provider could monitor the bus lines between the CPU and the RAM and work out which pages of RAM are being accessed. Moxie Marlinspike at Signal has demonstrated that, in the case of his contact matching for Signal, this presents a threat. So he implemented something called oblivious RAM, which means that no matter what operations occur, someone who has hooked up a device to monitor the bus doesn't actually learn anything from the access patterns. So far as I can tell, that risk doesn't apply here. Incidentally, a lot of this idea comes from Marlinspike's work for Signal.

Now, outstanding problems. By a happy irony, I got sick with COVID in the week that I had planned to actually write the prototype, ten days or two weeks ago, and therefore I haven't written it. I've studied a lot of code and I've designed in detail, but I haven't written a line of code. So that's a fairly important outstanding problem. Clearly there are side-channel attacks; granted, the data is encrypted and the risk is quite small, but there are theoretical risks. Once again, Marlinspike has addressed these, with the use of LFENCE and retpolines. It does mean a change of language: doing all this in Python would be a performance disaster, so it would be C, or perhaps Rust.

There remains a problem with the MOH keys being disclosed or used without authorization. But I'd point out a couple of things. One, it's much, much easier to protect a key than it is to protect an entire database; you'd have a hardware security module within the MOH environment, the Ministry of Health, for that purpose. And two, abuse would be detected; it's the same thing as before. If an adversary gets control of the key, or of the security module that contains the key, and begins extracting lots of data from the broker, then all of the affected people will know about it in real time and will start to make noise about it. And the intruder knows this, so this also tends to disincentivize the intruding behavior in the first place. It's not a perfect solution, but it's a much smaller risk than it looks.

My only real next step is to implement a prototype. And that is, in fact, the end of my presentation.