Alright, without any further ado, this is Ann Kim with Selfie or Mugshot. Today I'll be presenting Selfie or Mugshot. Is there a way to make it less echoing? Hello? Hello? Okay, that's a little better. So I'll be presenting Selfie or Mugshot, and it's the idea that you can actually reconstruct genetic information from just facial information. And the TLDR is you could take a fun snapshot like this and then reconstruct genetic information and then maybe go to jail, or get pwned in some other way. So who am I? Hello? I'm Ann Kim, and I'm a graduate student at MIT studying computer science and molecular biology. My thesis and startup work is in clinical trial optimization using federated learning and blockchain. My hobbies include running, slacklining, learning Yas Flute, and paranoia. So everyone has a side hobby, but my side hobby is reading Nature Genetics papers. So I was reading this fun one, and it was about genome-wide mapping of global-to-local genetic effects on human facial shape. So essentially, connecting your craniofacial shape with genetic information. Sounds really nifty, except for the context of news today. So there are a lot of people blindly taking genetic tests, just trying to figure out how white they are or their ancestry. And then you also have people who are taking a lot of obnoxious selfies, as well as an environment where you have surveillance just all over. You know, a Canadian mall even. Anarchy, right? So today I'll be imparting onto you my paranoia. And I'll be first giving a preface on basic genetics, as well as telling you about how this can be potentially abused by insurance companies or your employers, as well as criminal investigators. And then also scoping out your paranoia and telling you: when will this be possible? How many sheeple do you need to take selfies of themselves or do Ancestry.com? And how expensive will it be to create a patsy out of you? But never fear.
I'll also be giving you some solutions of how you can use personal discretion, secure computation, as well as regulation in order to protect yourselves. So a little bit about my friend DNA. The whole genome sequence was done in about 2003, and it's three billion base pairs in your human genome. Wonderful, right? So much. But only 2% is actually useful for protein coding. And between you and me, it's only like a 0.4% difference. And what you're paying $200 for is 0.016% of your whole genome, which makes up your genotype for 23andMe. But even with this small amount of DNA, you can do a bunch of stuff. You can figure out your ancestry, where you came from, missing relatives that were estranged some time ago. You can figure out genetic risk factors that are going to get you in the night. And you can also figure out future risk prediction for family planning for all your brood. Genetic information and techniques have advanced so much that you can actually use facial information for phenotyping. And this is especially useful for children who have different types of diseases that might not have symptoms in their group. So you can have faster diagnosis, like diagnosis before they can actually speak, and you can also have more precise diagnosis, because oftentimes with sequencing, you have a lot of genetic variants and a lot of sequencing errors, and it can be really hard to decouple the two from each other. And this is overall much cheaper, because it's much cheaper to just take a photo of someone as opposed to paying like $600 for a whole genome sequence. Another benefit of genetic sequencing is literally catching criminals. So this method has been used since like 1987, and it's been used recently for the Golden State Killer. And if you're not familiar with the talk or not familiar with the crime, yesterday almost-humory BJ somewhere over there gave an excellent talk about the techniques used to actually catch him.
But in brief, this was a man who had a lot of strings of murders as well as rapes between 1975 and 1986. The police had all this DNA in this cold case, and they decided to get creative and upload it to this website called GEDmatch. And this is a website where you can upload your genetic information for free, and then figure out who all your estranged relatives are. The police uploaded the DNA posing as this man, and then they found all the third and fourth cousins of this man who had uploaded their own DNA. And as a result, they were able to match the warrant for his arrest. The techniques for doing this have been done since 1987, and it's about a one in a trillion false positive rate. The methods are endless for this. You can use a polymerase chain reaction, or PCR, in order to amplify any minute amount of DNA. And there are a bunch of other methods. I won't bore you with them, but the flavor of investigators these days is short tandem repeats, or STR. And what this is, is a technique where you take all your genetic information and instead of doing a whole sequence, because that's too expensive for criminal investigations, you just look at 13 sites and call it a day, where you can sort of identify them based on these 13 sites. And below you have an STR of 15 sites, as well as the amelogenin marker that tells you what gender this person was. Good enough. Let's think about a dystopian scenario for all this information, because after all, we're at DEF CON, right? So the risks of DNA and identity are two-fold. One, someone can frame you for a crime, make you a patsy, or two, genetic discrimination. So in order to put you out as a criminal, they have to first capture your DNA, replicate enough of it, and then plant it at the scene of a crime. For genetic discrimination, we have laws like the Genetic Information Nondiscrimination Act of 2008, or GINA.
And this provides that your insurers, as well as your employers, can't actually discriminate against you based on your genetic information. But could your employers actually use your photo to discriminate against you, or your insurers maybe, and reconstruct your genetic information? I don't know, getting scared yet? So let's go through all the questions that you might be having, like how could someone actually frame me for a crime? I've done nothing wrong. How much would it actually cost? How long would it take? Can my discretion, my family's discretion, computers, or regulation protect me? Could your employers or insurers actually get away with this heinous crime? Should you actually be scared? So there are three steps for framing someone for a crime. First, you have to capture their DNA. Second, you have to replicate enough of it. And third, plant it in a crime scene. So there are a lot of different sources of DNA. One is that you could actually just take a sample of their cheek or maybe some skin or some hair. Potentially you could hack into some of the databases that have one of the 15 million humans in the United States who have done genotyping or genome sequencing. And this has exploded recently. In 2017, tons of people maximized on the Amazon Black Friday sale on 23andMe kits. And it's estimated that about 100 million people will have had some sort of genetic testing by 2025. Another source of DNA, though, believe it or not, is the selfie. Now, you might be wondering, how can you possibly get genetic information from just a photo of yourself? Let me science you guys. So in this really exciting paper I read in Nature, these researchers took 2,329 Europeans, and you might be saying this is a very small sample and why just Europeans?
So the reason why they took only 2,300 people and particularly chose only Europeans was because they wanted to reduce the variance, so they narrowed it down to just Europeans, and they wanted to boost the statistical significance of any signal that they would get from this very small cohort. The methods that they used were stereo spectral photography. So what this is, is that they have 2 cameras and it captures the depth in your face in order to get a geometry of your face. What was particularly interesting about this method is that they found the most variation around your nose. And using hierarchical spectral clustering, they were able to divide your face into 68, no, not 68, 38 associated SNPs or single nucleotide polymorphisms, which are sites of high variation in your genetic information. So essentially there are like 38 sites in your face that map perfectly to 38 sites in your genetic information. Well, what is the extension of this? Say, I don't know, you have like 50 million diverse people. So now you're not just talking about Europeans, you're talking about Asian people, black people, purple people, European people, and you have 50 million of these diverse people not only doing spectral imaging, which you can only capture depth with, but you can also get color. And this would be captured maybe with iPhone X or whatever like next generation photography devices we have. And then, oh, there's a question. Yeah, is this the reason for iPhone having light photos? I don't think that that's their initiative. I mean, I'm not privy to that information, but that's a good hypothesis, certainly. So from all of this information, you could actually get a mapping of 50 million people's faces to their genetic information. And with that mapping, you can actually project some sort of prediction of genetic information from just a face.
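As a rough illustration of what "a mapping from faces to genetic information" could mean computationally, here is a minimal sketch. It is not the method from the paper. It assumes each face has already been reduced to a short list of measurements and that the genotype at one SNP is coded as 0, 1, or 2 copies of a variant. The nearest-centroid approach, the function names `fit_centroids` and `predict_genotype`, and every number below are made up for this example; none of it is real data.

```python
from collections import defaultdict


def fit_centroids(samples):
    """Average the face measurements seen for each genotype.

    samples: list of (measurements, genotype) pairs, where measurements is a
    list of floats and genotype is 0, 1, or 2 (hypothetical coding).
    """
    sums = {}
    counts = defaultdict(int)
    for measurements, genotype in samples:
        if genotype not in sums:
            sums[genotype] = [0.0] * len(measurements)
        for i, value in enumerate(measurements):
            sums[genotype][i] += value
        counts[genotype] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


def predict_genotype(centroids, measurements):
    """Return the genotype whose average face is closest (squared distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, measurements))
    return min(centroids, key=lambda g: squared_distance(centroids[g]))


# Toy training set: two invented nose measurements (in mm) per person.
training = [
    ([34.0, 51.0], 0), ([35.0, 52.0], 0),
    ([37.5, 54.0], 1), ([38.5, 55.0], 1),
    ([41.0, 57.0], 2), ([42.0, 58.0], 2),
]
centroids = fit_centroids(training)
guess = predict_genotype(centroids, [38.2, 54.1])
```

With more people and more measurements the same idea would be repeated per SNP, which is why the talk stresses the size of the paired face and genotype dataset.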
And the particular genetic information with this huge scale of information would be a genotype of 602,000 SNPs, enough to fake a 23andMe genotype. And if that's not scary enough, you can actually use this technique called imputation to reconstruct all three billion base pairs. And imputation is just a fancy statistical method where you use information about the likelihood of certain SNPs correlating with others to reconstruct the rest of your genome. So it looks like our evil mastermind has captured your DNA from just your Instagram selfie. But how do they replicate it? The cost of replication is quite high. It's 10 cents per base pair, and that really explodes when you're trying to build your genome base by base, which currently would cost around $300 billion. You could also do it with CRISPR. And with CRISPR techniques, you'd get some sort of template DNA and just edit the sites that make them unique. Recall there's like 0.4% variation, so this one costs a little cheaper, you know, a measly $120 billion. There are other techniques that Mr. Brimley actually looked at two days ago, so if you are interested, look at his talk. And there's also an initiative from the George Church group as well as other biologists who are actually trying to scale the human genome through synthetic editing. I mean, it's like you're saying 200 million and 120 million, not billion. Oh, whoops. I'm sorry. It's million. Okay. My bad. Oh, then it seems like it's much cheaper than I thought it was, but still pretty expensive. A little tired. Okay, but very expensive and potentially getting cheaper with this initiative of the genome project, right? And this is a way they're using sort of the template of the Human Genome Project to scale back something that was once a $1 billion genome sequencing project to a thousand-dollar genome, which is what we currently have. And so they're trying to transfer those innovations to the recreation of genetic information.
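The audience correction can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the talk (three billion base pairs, 10 cents per base pair, about 0.4% variation, and the corrected $120 million CRISPR total). The per-edit cost at the end is not stated anywhere in the talk; it is simply what those figures imply, and the variable names are chosen for this example.

```python
GENOME_BASE_PAIRS = 3_000_000_000   # from the talk: three billion base pairs
CENTS_PER_BASE_PAIR = 10            # from the talk: 10 cents per base pair
VARIANT_PER_THOUSAND = 4            # from the talk: about 0.4% differs between people
CRISPR_TOTAL_USD = 120_000_000      # from the talk, after the audience correction

# Integer arithmetic (cents, per-thousand) avoids floating-point rounding.
synthesis_usd = GENOME_BASE_PAIRS * CENTS_PER_BASE_PAIR // 100
variant_sites = GENOME_BASE_PAIRS * VARIANT_PER_THOUSAND // 1000

# Derived, not quoted: the cost per edited site that the CRISPR total implies.
implied_usd_per_edit = CRISPR_TOTAL_USD / variant_sites

print(f"base-by-base synthesis: ${synthesis_usd:,}")
print(f"sites to edit:          {variant_sites:,}")
print(f"implied cost per edit:  ${implied_usd_per_edit:.2f}")
```

So building the genome base by base at 10 cents each comes to $300 million, which agrees with the correction from the floor that the totals are in millions, not billions.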
Alternatively, even though it's not billions of dollars, millions of dollars is pretty expensive. Alternatively, a cheaper way to get genetic information is simply to steal it and just amplify it. So abandoning everything that I said about, like, facial recognition and, like, photos, what's much cheaper is to actually steal your hair or steal a cup and then amplify it using a polymerase chain reaction or PCR. So it looks like they've captured your DNA and they've also replicated it for either millions of dollars or less than a hundred dollars if they actually have access to your physical body. And then planting it in the crime scene is, you know, use your imagination. You, like, bribe the cook or the butler or something, and you steal their genetic information. I'm imagining your target is very rich or something because why else? I don't know. But then you plant it in the crime scene. So some of the answers to the questions of can someone actually frame me for a crime? It would cost a lot of money currently, millions of dollars, not billions, thank you for the correction. And then it would also take lots and lots of time, potentially. This might be an initiative that might be in the future, so if you have any sort of imminent crimes that you're planning, you'll have to wait on that. Should you actually be scared? That might be a religious question. So addressing the other questions, can my discretion, my family's discretion, computers or regulations save me? And if they can't, will my insurers and employers actually get away with such a crime? So this was from the proceedings of the American Society of Microbiology, and they said, genomic surveillance is neither hope nor hype, it's reality. So let's dive into this reality and figure out how to protect your DNA. There are three different ways I recommend. Discretion, secure computation, or regulation.
So with discretion, you have to not only cover yourself, but also your family members, because if your family members, including your third and fourth cousins, upload any of their genetic information, you're in big trouble. You might be the Golden State Killer. So don't associate any of your DNA with your identity. So if you have to do 23andMe or Ancestry.com, I would recommend using a pseudonym, getting the kit shipped to a shared office space, doing a number of different ways to pay for it with prepaid debit cards or something, and whatever you use in order to hide your digital identity when you look at your results. Second, do not upload your DNA, because it is subject to whatever the privacy policies are of whatever website you're using. So beware, and then try not to leave stray hairs behind because that's a really easy vector of attack. Potentially, you can also consider makeup or plastic surgery. So recall that I mentioned the technique for this was using stereo spectral imagery, and if you have an unusually thick beard, you might be safe until they actually learn how to see past it with new cameras or something. A fun thing to use to hide your identity is makeup. So recall World War I, we have a lot of ships and they're trying to shoot other ships. So in order to hide them, we use something called dazzling, and this was a technique in order to obfuscate the speed of the ships and their exact locations. Mixed success, and then by World War II it's obviously abandoned, but it's pretty successful for facial recognition technology, and this is just an example of CV Dazzle. An alternative to protecting your DNA is using maybe computers. You might have your genetic information either on a website or on your own personal device. Please use whatever encryption methods that you know how to handle, and then maybe if you wanted to actually peek at it, or if you have to decrypt it, that sounds pretty bad.
Maybe you could use some sort of homomorphic encryption to learn on it or federated learning or some sort of hardware integrated computation where you actually store the data and it manages the keys for you, say a trusted execution environment or something. Alternatively, if you still want to upload those cute cat photos, but you don't want anyone to know that you have a cat, or you don't want anyone to know that your face is like a cat because you're Hermione Granger or something, you might want to consider image poisoning using GANs or generative adversarial neural networks. The way this works is that say I have a face, I have a cat face, and I want to fool people, particularly whoever I'm up against, or the machine learning algorithm for discriminating whether it's my face or not, and fool it into thinking it's guacamole. With a generative adversarial neural network, what you have is you have two neural networks. One is generative and one is discriminative. This one is trying to generate examples of cat photos, and the other one is trying to discern cat or guacamole. They go back and forth, and at each iteration, the generative neural network will actually make small perturbations in the image in order to fool it. At some threshold, you'll be satisfied with this image, and it'll look fairly like a cat, but it'll actually be registered as guacamole. New Instagram filter, maybe. Another way of protecting your DNA is through regulation. You have your Fourth Amendment rights, you have the privacy policies of whatever company you're using, you have California's 2016 Electronic Communications Privacy Act if you live in California, and you have GINA, the Genetic Information Nondiscrimination Act of 2008. The Fourth Amendment. You have the right to be secure in your person, this could extend to your DNA, and you have the right against unreasonable search and seizures.
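The image-poisoning idea described earlier, nudging an image in small steps until a classifier changes its answer, can be sketched in a few lines. This is not a GAN: to stay short it swaps the generator network for a simple iterative sign-of-gradient perturbation against a fixed toy logistic classifier. The weights, the four "pixels", and the names `cat_probability` and `poison` are all invented for illustration.

```python
import math


def cat_probability(weights, bias, pixels):
    """Toy discriminator: logistic score that the image is a cat."""
    z = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))


def sign(x):
    return 1 if x > 0 else -1 if x < 0 else 0


def poison(weights, bias, pixels, step=0.02, max_steps=50, threshold=0.5):
    """Nudge each pixel a little per round until the score drops below threshold.

    For a logistic model the gradient of the logit with respect to a pixel is
    that pixel's weight, so stepping against sign(weight) lowers the score.
    Pixels are clipped to the valid range [0, 1].
    """
    adversarial = list(pixels)
    for _ in range(max_steps):
        if cat_probability(weights, bias, adversarial) < threshold:
            break
        adversarial = [
            min(1.0, max(0.0, p - step * sign(w)))
            for p, w in zip(adversarial, weights)
        ]
    return adversarial


W = [2.0, -1.0, 1.5, -0.5]   # invented classifier weights
B = -2.0                     # invented bias
cat = [0.9, 0.2, 0.8, 0.4]   # invented 4-pixel "cat photo"
adv = poison(W, B, cat)
largest_change = max(abs(a - c) for a, c in zip(adv, cat))
```

Here the classifier starts out calling the image a cat and ends up below the threshold after changes of well under 0.2 per pixel. A real attack would target a real model, and a GAN would additionally train the perturbing network instead of using a fixed rule.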
So in the case of the Golden State Killer, potentially he had the right to protect his genetic information and prevent it from actually having some sort of unreasonable search or seizure. Or in any other example, that's adversarial. You also have the terms of use and privacy policies of whatever website you're using to get your genetic analysis. So this is an example of GEDmatch, the company that outed the Golden State Killer, and you can see that they do not make any promises of securing your genetic information. So whoever their third cousin was who did not read the privacy policies, that was on them. 23andMe is a little better, so they actually require that there has to be a warrant for them to give up your genetic information, kind of better. You also have California's Electronic Communications Privacy Act. So through this act, the government can't coerce a custodian or steward of your electronic information in order to make an arrest without a search warrant. So what this means is for Gmail, if you put all your emails in Gmail and the government does not have a warrant for your arrest, and they ask Google and they try to coerce Google into giving them your data, then Google doesn't have to. But obviously, if they have a warrant, then they have everything. And this could obviously extend to genetic information that is electronically stored. For GINA, this is a little better. So the Genetic Information Nondiscrimination Act says that it prohibits discrimination on the basis of genetic information with respect to health insurance and employment. That seems like pretty good coverage to me. You know, they acknowledge that genetic information is extremely valuable, so it should be shared for insight to a certain degree, but obviously this could be abused by your insurer or employer who could see your genetic information, predict that you're going to get Huntington's disease, and not want to hire you because you're at risk.
The next clause, although genes are facially neutral markers, I don't think it's very neutral after the talk that I just gave, so this is quite concerning that Gina has not been updated since 2008 with these findings that were published in 2018 about how there's actually some correlation with facial information and genetic information. So I'll leave you with sort of a to-do of trying to fix this or, you know, awareness that Gina does not currently protect you with, you know, your face. Thanks. Please take no more selfies and stay paranoid. I'm Ann Kim. If you have questions, please, like, go to the mic and ask them or you can also ask them just you or something. Okay. I'll go to another. It seems like the scenario of planting DNA in a crime is a little bit... Contrived? I mean, the really scary thing is targeted genetic attacks with chemical or biological factors that one knows are, you know, that certain segments of DNA are susceptible to. Yeah, for sure. That's definitely a concern, but... Please repeat the question. Oh, okay. Sure. So the question concerned targeted biological attacks using, I guess, metabolites and your genetic information, right, that you have a certain threshold for certain poisons. The problem with that, though, is that you can't target people enough. It's by threshold. So you would actually target not only them, success, but also you would target anyone who has a threshold lower than them of poison. And I'm not sure how that would play out in whatever your attack would look like, but it might be, like, half of them all dead or something. That might implicate you in a lot of trouble. Good question. Was there any diseases that modified genetics that selfies wouldn't really be able to do, such as, like, any initial deformity or... Right. So those are changes after genetic information, so they don't change genetic information at all. 
But that also reminds me of, like, epigenetic changes, and epigenetic changes are actually on top of your genetic information and might not necessarily come out in some of this research, but also in your data set. And you're accounting for not only genetic information, but also genetic expression in epigenetics. Oh, Bennett. So, like, how much... If your face was deformed, like, are there ways you can, like, deform your face easily that would, like, fool these... Yeah. So... So the question was, is there a deformity that, or maybe you guys potentially could fool any of these genome mappers? And the answer is yes. It's any sort of nose job could potentially fool these genome mappers because a lot of the variation in your face is actually captured in your nose. And that sort of makes sense, like, from an evolutionary biology perspective that, like, you know, a lot of this is connected to how you've survived and stuff. So nose jobs. If you use nose jobs and makeup, Hollywood's doing it right. Yes, Mr. Brimley? Actually, to answer the other question, there's a study out, and I'm not sure if you're familiar with it, they actually took a jungle of makeup and ran it through the facial recognition software, and right now, it beats it. So, develop on us for a lifetime, Posse. Thank you. You? So, all of the protections that you listed were basically from the United States. Yes. Is there any effort to make international regulations or some kind of a... Yes, so the question was about international regulations, and for Europe, they have some nifty thing called GDPR, and there's actually a very specific clause that talks about genetic information, but currently in its state, it does not address facial information because this is extremely new research. Yes? I explained it quickly out of the legal side of things that facial... 
the Fourth Amendment protection for your face is probably not going to be that strong because of the thing where they say, if it's a biometric, because it's going to be used to identify you, then it's not protected by the Fourth Amendment. It's really... Okay, so to reiterate, the Fourth Amendment does not protect you. No, well... I don't know how the courts are going to go. Oh, the Fourth Amendment will protect you, depending on your lawyer. Questions? You might want to use the microphone. Just a suggestion. Cool. Are you going to use the microphone? He just asked. Okay, yes. Yeah, so the PCR seems like a viable attack vector at this point. I'm thinking like a biological evil maid attack that takes some of your hair, uses a $100 PCR to amplify, plants it at a crime scene. Is there anything about PCR-amplified DNA that makes it obviously different than native DNA? Oh, well, it depends on the material that they're using, and they can certainly... I'm pretty sure that the synthetic material that they're using and the natural material of your body is quite similar. I don't think you can actually distinguish them, but I'll have to double-check. Good question. Do you want to use the mic? Oh, a comment. Methylation? What do you mean by that? PCR isn't methylated, right? Oh, yeah. Okay, so there's actually... there's a lot of differences. So PCR is different for a couple of different reasons from your DNA. There's not only epigenetic factors like methylation and acetylation, but there are also factors like certain sugars that are attached to your DNA post-processing, or at least for certain RNA and proteins that might be evident in synthetic DNA. Yes. Okay, any more questions about... Oh. I was just curious, your selfie thing here uses 3D facial images. Yes.
But if you're talking about the Instagram selfie as being a singular point of data, which obviously wouldn't have 3D information, how reliable would it be to pull genetic information from a single picture or would you have to correlate several to sort of triangulate a 3D image from an entire spectra account? Yeah, you'll have to have a lot of... So the question was about the difference between hierarchical spectral clustering and, like, stereo imagery as opposed to your run-of-the-mill Instagram photo. And the answer is that although currently with, like, small populations and small ends, you need this stereo spectral, like, photography, when you have enough information, you can actually have some sort of, like, mapping or projection from different, I guess, not planes, but, like, mapping from different forms of photography. So you can have the mapping of these stereo images to the mapping of iPhone X images to the mapping of just regular photography, right, if you have enough data in each form. So yes, it doesn't really... Like, currently it does matter that you're using stereo imagery, but in a future where you have enough information, like enough photos, it doesn't matter. Yes? So the next time you go to a dating site, you may be able to find the DNA profile of your potential face. Oh. Really scary. So the note, the comment was about DNA and how you can actually genetically discriminate against your dates. Sounds like a startup idea, guys. Yes? So if STR only uses 13 sites, doesn't that make it a lot cheaper if you know that the police are going to use STR to image the DNA? So the thing about STR is that they're looking at 13 particular sites and it is not just SNP reads, but it's like, it's not single reads, but it's sticky ends of DNA. And the way that works is that you have an enzyme, and an enzyme will cut at a particular string of DNA, and then you'll figure out the lengths of all these cuts in order to describe what the identity of this person is. 
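To make the "13 sites" idea concrete, here is a minimal sketch of comparing two STR profiles locus by locus. An STR profile is usually written as a pair of repeat counts (one per chromosome copy) at each locus. The locus names below are the 13 original CODIS core loci, but every repeat count is invented for illustration, and the function names and the `min_loci` threshold are hypothetical, not any lab's actual rule.

```python
def matching_loci(profile_a, profile_b):
    """Count loci typed in both profiles where the two alleles agree.

    Allele order does not matter, so (14, 17) matches (17, 14).
    """
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1 for locus in shared
        if sorted(profile_a[locus]) == sorted(profile_b[locus])
    )


def is_consistent(profile_a, profile_b, min_loci):
    """Hypothetical decision rule: call it a hit if enough loci agree."""
    return matching_loci(profile_a, profile_b) >= min_loci


# Invented repeat counts at the 13 original CODIS core loci.
evidence = {
    "CSF1PO": (10, 12), "FGA": (21, 24), "TH01": (6, 9.3), "TPOX": (8, 8),
    "VWA": (16, 18), "D3S1358": (15, 17), "D5S818": (11, 12),
    "D7S820": (9, 10), "D8S1179": (13, 14), "D13S317": (11, 11),
    "D16S539": (9, 12), "D18S51": (14, 17), "D21S11": (29, 30),
}

# A second invented profile that differs at two loci.
suspect = dict(evidence)
suspect["FGA"] = (22, 24)      # different genotype
suspect["TH01"] = (7, 9.3)     # different genotype
suspect["D18S51"] = (17, 14)   # same genotype, alleles listed in reverse

agreeing = matching_loci(evidence, suspect)
```

With these made-up profiles, 11 of the 13 loci agree, so whether this counts as a match depends entirely on the threshold chosen.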
So even if you only... So I think what you're getting at in your question, his question was about STR and how it's only 13 sites and how maybe you could only affect these 13 sites and it'd be very cheap, but any sort of perturbation beyond the 13 sites or any sort of difference beyond the 13 sites might accidentally get cut by these enzymes and then it'll be in the read. Yeah. That's what any sort of restriction enzyme digest that even if you affect certain places where you think it's going to cut, the other variation might not be enough. Mm-hmm. Is everyone scared? Yes? In terms of the DNA of crime, compared to their testing, would you actually need to have an exact match of someone's DNA? No, you don't. Okay, so this is really f'd up. When I read it, I was like, oh my god. So the question was about whether you need a perfect match in order to implicate someone in crime and the answer is no because oftentimes the sequencing has high enough variation and enough errors that good enough is good enough, unfortunately. Yep. So even if it's 13 sites, they might only need 10 to implicate you in a crime. Yes? If you have a good enough lawyer request a better DNA test, so let's say I'm implicated for a murder that my brother did. Sure. He goes to jail instead of me. Is there a way to go get my own test and submit it as evidence or do we have to use what the government uses? I think it's important to know about how you can in a court of law get a better genetic test in order to get higher fidelity, like, matching for the crime. And it might come down to Mr. Brimley's point of money and I'm not quite sure how the courts handle that exactly. Any comments from the audience? Any lawyers? Yes. So I think, I mean in that case you're trying to prove the government wrong. Can I get access to this and look at what they use and look at the test? So the answer is that it comes down to proving the government wrong and you might not be able to get the tests. Did I? Yes. Fair phrasing. Nice.
How do you get access? Did you hear that answer? So to clarify it comes down to how the court proceeds and what your access is to whatever genetic test they had. Right? Yes, maybe. So that can vary by state? Yeah, potentially. Shit. I know. It looks like California sort of has some stuff like this. This guy. This is pretty good. How's Mark's brother? Another question. So last year there was an FBI director who was in here giving a talk about how the genetic sequencing in these sites by 23 and we are used and that a lot of this is outsourced to foreign firms that don't fall under a lot of these rules and regulations. And it was suggested that the Chinese had made heavy investments in DNA to outsource DNA testing from international sources. Okay, so the comment was about it was a comment, not a question right? Maybe? It's just like how does this affect the legality of things? Is there any protection? Is there no hope? I'm not an expert on international law so I can't answer that question. The question was about how you would litigate in an international outsourced... How it should be regulated? Yeah, well currently in certain like clinical settings you can't export any DNA and then for government regulations like in China for example you can't export any genetic information or if you do what they do in China is they take the server that has all the genetic information or the hard drive and they ship it over and they send over like some sort of Chinese custodian who will watch you as you do whatever genetic analysis and they'll like take this device back because they don't trust any sort of like FTP or whatever. Yeah, isn't that funny? If you're having a coordination between China and Canada they'll like fly over this like giant machine with like someone who like custodians and like watches you. Yep, super safe though I guess. Yes, that question. Do you know, is there any attempt for the Supreme Court to hear the California Act? Have you heard anything about that? 
Oh, with the Golden State Killer? No, no, no. I never liked the California Act. It was developed by the California Supreme Court. Yes. Have you heard of anything about hearing the case? With the state standards? I had a note about this. Carpenter versus the United States that the police needed a warrant to obtain information on the location of an individual and phones from a phone company. So I think it has been upheld. Maybe. I'm not quite sure what it held. My notes are not that clear. But there is some stuff happening federally. Yes. No, it's correct. Sorry, I'm not an expert of law. I'm just a computational biologist who has a lot of paranoia. Any other questions? Comments? One last question. Sure. What about your brother? We're going to get there. We'll take that data and make 3D representations to falsify video eventually. Oh, okay. So taking genetic information and turning into facial information, right? Yeah, like, hey, I need to make this person look like a Korean. That is extremely difficult because from the sequencing information there's not a lot of... For whatever reason, the mapping is not equally equal both ways. If you're going from genetic information to, like, one, your face is weathered by whatever you've experienced in your life and it's not reflected in your genetic nor epigenetic information to a certain degree. And then, like, two, whatever sequencing you're using usually is not high depth enough to account for whatever error to accurately represent your face.