 Please welcome our next speaker, Nick Dorian, on Geolocation Homomorphic Encryption. Alright, good morning. Can everyone hear me correctly? Okay, awesome, awesome. Okay, so today we're going to talk about geolocation and homomorphic encryption. My name is Nick Dorian. You can find me on Twitter, GitHub, maybe the darknet using that screen name. And this is my first DEF CON, so thank you all of you for being part of my first DEF CON. Okay, so first a little bit about me. I currently work at McKinsey. Kind of has this kind of very like traditional old-fashioned vibe to it, but currently we use, not to make a whole pitch, but you know currently we use React, Node, Docker, kind of modern stuff. Plus if your grandparents have ever been like, why don't you stop playing video games and get a real job? Like your grandparents will know what you do now. But I like to think that I have this hacker ethos from my previous roles with one laptop per child, Code for America, kind of like open source groups. And I think despite working on some very serious projects, kind of my more famous work is Fortran.io, if any of you have been there. But anyway, today we're not going to talk about any of that. We're going to talk about this really interesting idea that goes to something we use every day, mapping and geolocation. And pretty much all the time you're doing this, probably you did this today already. You get a GPS signal down to your phone or device. You ask a server where's the nearest place that I'm trying to get to, and it comes back with a bunch of results. But also if you think about it, it's a little bit creepy that, especially if you have Google location history on, I can go back years and figure out what restaurant that I go to, what was I doing this day, that kind of thing. So, yeah, so is anyone like a mapping geospatial developer person in real life? Okay, great. Okay, so if you've done this before or even if you haven't, there's really two different ways that people go about doing this. The first one is make a client-side app. Everything's client-side. And to find places that are near you or in that area, you have to download all the geodata down to the client, and then they make some sort of calculation based on a JavaScript spatial library. And there are two reasons that people, especially companies, don't like to do this. One of them is that if it's a big data set, for example, let's say you were trying to figure out the precinct where I vote. That needs to be very accurate, needs to be very precise, and just downloading the data set for my state would be hundreds of megabytes. So that's not something we can do on the client-side. And then for something like the Google example that I showed earlier, if I'm looking up burrito places, Google doesn't want to share its full data set of burrito places in my area. So instead, what most people do is the server option, this option B, where this client gives up its geolocation data, sends it over, and then the server decides what's good for that. So this works for all these different examples I mentioned before, where there's a lot of data, you can't really download all the data, or it's proprietary information. The weak point is you have to give up your location, and it makes sense because how are they supposed to tell where you are or what's near you if you haven't given them your latitude longitude? It's like not possible. But what if there was a third option? So I'm not a mathematician or a cryptographer by training. I assume we do have a number of math and cryptography people here. But when I was reading about this a couple of years ago, I was surprised to hear about this thing, which is apparently a real thing called homomorphic encryption. And when I tried to describe it to someone, it sounds a little bit like magic. But it's just advanced math that took people a long time to figure out, and I'm not prepared to fully understand. But essentially what it means is people can do calculations on numbers while they're encrypted without ever knowing the underlying value. So there are two classical examples of why you might try to do that, which is what I have been hearing before I came into this mapping problem of it. So the first example is like a stock example. Like say you're about to sell many thousands of dollars of stock, and there are different fees and things that are going to be incurred by different people in your supply chain. But the moment you get the word out there, I'm going to sell $3,000 of this stock. Other people are going to start speculating against you. Like there's all these micro-traders or high-frequency trading. So you don't want to leak that information while you're doing the calculation, but you do want to know what the final calculation is. So the concept in that example is you might send the number encrypted, they do math on it encrypted, and then you decrypt the result on your side. The other example that I've seen, this is a commercial product, but I don't really remember who's it is, is that if you think about it, Google has access to all the plain text of all of your emails because when you do a search, Google needs to be able to search that text. And so there's this particular software or service where they encrypt all of your emails on the server, and you send kind of an encrypted search query into that to search your email without revealing what is your email and what are you searching. So this is relatively new in terms of like math. The first partially homomorphic system was described in the 1970s. So the reason it's called partially homomorphic is that it can do addition or it can do multiplication, but it can't do everything in general. So it was only in 2009 that some researchers who now work at IBM published the first fully homomorphic in crypto system. So if you look at the code, it kind of goes down very far down to like assembly of different logic gates and things like that. So it's very intimidating, and we're not going to talk about that today. Instead I use something called the Piliar Crypto System. Is there a cryptographer here who can correct my pronunciation of Piliar Crypto System? Oh thank goodness, okay. I get to be the expert, this is great. Okay, I watched a YouTube video this morning and they said Piliar, so let's go with it. So why are we not using homomorphic encryption magic all the time? Because then everyone would have things. First thing is people are already pretty okay with sharing data. Like you didn't feel uncomfortable telling Google that you were in Las Vegas. And especially if you trust the business with their money. Like the example I gave earlier of the stock market like makes sense like in terms of like an academic paper. But then it's like wait a minute, why am I doing thousands of dollars of trades with a company that I don't even trust? Like what if they get the result and then they wait five seconds and then they broadcast it but they've actually done something. So it seems like a really interesting idea but because of the practicalities and then also because it's very expensive on a computation speed and time level, it hasn't really taken off as like a general purpose thing. But the example we're going to talk about today I think is a little bit more possible or more relevant use of homomorphic encryption. And it's involving your geolocation and the example that I'm going to do is sort of like a treasure hunt or geocaching kind of example. So here you are on the road and you're like hey I think that I've solved the puzzle I've got the end of the road. I think I've found it. But for whatever reason you're being watched you don't want to give away your location. You're like being very secretive. So the server can't tell you where the finish line is either because otherwise you would say hey I'm in Las Vegas am I winning and they'd say no you're only winning if you're in Colorado and then they'd be like okay great now I'm going there. So the server can't reveal the answer and you for whatever reason don't want to share your coordinates. And then there's this other person here with the spy device that I'm saying is watching your connection or something trying to see if someone's going to share where they are or where the finish line is. So in order to make this work like I said before I chose the Pillier crypto system not because it's partially homomorphic so it can do addition and multiplication by other like non-encrypted numbers but it can't do something like x squared multiplying a number by itself just because of however it is architected. There's some other systems out there while I was preparing these slides I saw there's a web assembly library. The main reason behind me choosing the library as I did was I wanted there to be a server and a client that could communicate using the same thing and the bottleneck on that is finding you know some cryptographer who's made like a JavaScript implementation of this. So I found the Pillier crypto system this goes back to the 1990s or so and there's a few different implementations out there I was pretty happy with how it worked out here. And like I mentioned there's limits to what you can and can't do within these systems. So okay I'm going to show you some code now I hope there were I don't yeah here's what I'm going to show you some code now so you're on the treasure hunt and you've loaded this library you create a key pair so I know I think a number of us have used PGP, GPG so if you really like public-private key pairs before now you can have public-private key pairs for each one of your numbers that you use so now you have a lot more keys in this example I generated a key with only 128 length but I've since updated the GitHub repo now it's 1024 length you can pretty much set it to any size it just takes a little longer for it to generate your public-private key so you're at this location you know where you are just with latitude and longitude and you generate a key and then you want to send up the latitude and longitude encrypted now for whatever reason the JavaScript library was using only accepts them as positive integers I think I asked earlier and there's some mapping people here but not a lot and generally the way things we do latitude and longitude these days is we have negative 90 be the south pole positive 90, north pole negative 180 be the international dateline going west and then wrap around to plus 180 so I had to make everything into positive integers so I have like a little ackee thing there and then encrypt them so that gets set up to the server and the latitude and longitude are now 617 digit numbers like it looks just like a number but it's actually encrypted kind of like when you send a PGP message there's a blob of crypto text and you're like what is that this is kind of a number and then the pillar crypto system represents your latitude or your longitude and in fact I assume some of you have done JavaScript trying to work with the Twitter API and already the number of tweets is above the limit of a JavaScript number so if you deal with a 617 digit number there's a special JavaScript library called big number that helps you deal with these crazy big numbers and another thing is there's some kind of nonce or something in the system so that the server can't just guess your latitude longitude and generate the same number with the same public key instead they're just going to keep generating lots of different numbers and not be able to find out if this is bigger or smaller than it so in this case I'm going to kind of spoil the source code for you in my example so the main location is Colorado because it's a rectangle and it should be a good simple example so if you're inside Colorado I should be able to measure your offset to the north, south, east and west and two of them should be positive and two of them should be negative right if you're in the center kind of this rectangle diagram here is what I'm talking about you should have north be positive south be negative and so on wait a minute, what if I send my location to the server and it calculates the difference from its target location it's basically telling me how far away I am from the target so I also multiply each one of them by a different random offset so you might know which direction you're in relative to the rectangle but you don't know how far away you are from it because it's randomized all of them so the only one who can actually decrypt and parse the response is the initial user so they get it back, they find out that they're still south of the target area and oops but they don't have to reveal that to anyone else and even if our hacker friends go over and look at the server even if they get inside the server and look at the logs and the database see any kind of information the server ever had they're not able to decrypt the original latitude and longitude so I try to make a Jupiter notebook for all of you but whenever when my code hit Jupiter it created kind of a catastrophic error this is like a 90s astronomy joke I thought of this morning did anyone okay anyway if someone wants to look at my code I have a bitly link and a long link if someone wants to see that but you might say wait a minute Nick in the first example you're just showing places that were near me I don't want to do a treasure hunt I don't want to do that I want to find places that are near you and this is something I'm really interested in finding a really high tech crypto way to handle it except Chipotle's isn't really a good example so I actually last year met with someone who's in the Syrian American Medical Society and they were saying that there are these hospitals that they have to keep track of and send information and supplies back and forth but they don't want there to be a list anywhere where you can just download a KML file and cause problems for people and because this is high risk and I thought because there's a risk that they're being targeted by Russia it's probably better if this list is never created and also because this is just an experiment I didn't want to like dive into it but as like a demo I'm like this is kind of something we should try to aspire to is there a way that people can keep track of their resources and know what's the place closest to them without revealing everything so I can't use the Pillar crypto system anymore because it only uses addition and subtraction there's actually a really funny error if you try to multiply it's a good luck with that error nice nice implementation by whoever wrote that library but it led me to think about this problem of like if the server can calculate the distance and sort the closest one to the furthest one for you they could do that for themselves as well they could create millions and billions of points all around the world and find out which point is closest to you and say I know where you are and if the server can't sort by distance how is it going to do anything other than send you all the data and all the distances and if you've ever spent some time with like a protractor and a compass you know that if you get the distance from a few different points you can triangulate where they are so this would actually be really kind of interesting cryptographically but practically if someone sends you three requests like oh what's near me what's near me now I know all of your locations so instead there's something called a geohash has anyone seen a geohash before it's been on xkcd a couple times so the idea behind a geohash is there's 26 letters and 10 numerals so a total of 36 options and on what you do is you divide the world up into a six by six grid and then you divide that square up into a six by six grid on and on until you have the level precision that you want right now in Las Vegas is this geohash it's something you can kind of share kind of quickly and copy back and forth and if you want it to be less precise you just take letters and numbers off so I'm interested in maybe the right way of implementing the system is for you to tell the server this is my current geohash can you give me the locations near me organized by a geohash that's a little bit longer than my current location if that makes some sense so you're telling them this is how much I'm willing to share in terms of location how much further are you willing to go on your locations when you share them with me then I thought about it a little bit more and I realized there's a problem if I'm in one of those geohash squares somewhere like in the center then I just send that one geohash square and I'm fine but if I'm at the corner I have to send and my search radius is big enough to go into the neighboring grids then I have to ask for four grids and then it'll be very obvious to whoever is listening in then I'm somewhere near that corner if I'm asking for all of them so instead I think probably the right way to implement this would be for the client to suggest a random offset so sometimes it's going to be one or two or three or four grid squares but it's meaningless because you just randomly offset that location so it's not clear to the server why they've chosen that offset or where they are within that space so future ideas right now I just told you I had the example of Colorado which is just a rectangle I think there's a potential for other shapes just even using addition, subtraction and scalar multiplication to do like a y equals mx plus db to make a triangle right so mix of triangles and boxes to make up larger more interesting shapes there's some different JavaScript libraries that I mentioned that could support this and other geo calculations I know there's a talk later today about post-quantum encryption do we have any post-quantum crypto fans here so one thing that that is using lattice based encryption which for reasons I don't fully understand again not a mathematician lattice based encryption is really good for homomorphic applications like this so as that gets more popular and more widespread these homomorphic applications could also benefit from that there's also something if you do research into homomorphic encryption there's like a voting system has anyone seen like a homomorphic voting system thing before so like we said before you're able to do calculations on numbers you can't see so there's a crypto system okay yeah I'm running out of time but yeah there's a crypto system where you can add up the votes totals without knowing people's votes okay I'm sorry I'm out of time do we have time for questions or that's like it okay that's it but I still am alive as a person I'm here you can figure out where I am thank you so much