So without further ado, I'm going to introduce John Dunlap. He's currently a pentester at GDS. I'll let him explain everything that he's about as he goes through his DNA encryption discussion.

Alright. Oh, I don't think I'm hot. There we go. Hey everybody, who's ready to encrypt some stuff in living organisms? My name's John Dunlap. I am a security researcher, reverse engineer, pentester, the whole nine yards of security. I work for a firm called GDS Security. We have offices in New York and London. They don't do much with DNA, but I thought I would, just on a lark. I'll tell you more about how I got interested in that and in using DNA as a side channel for encrypted messages.

Now let's be clear. I'm a hacker. I'm a researcher. I'm a programmer. I'm curious, stubborn, perhaps foolhardy. I am not a biologist. I'm not a geneticist. I'm not a lab professional. I'm not a full-time cryptographer. But none of this has dissuaded me from synthesizing DNA and making synthetic life forms. Maybe it should, but it doesn't.

So what was the impetus for this talk? Well, I've had a long-term interest in biology. I mean, like many hacker-type people, I have a wide range of, like, special interests. You know, deep hobbies. I really liked reading about biology. There was a local sort of biohacker space that was offering classes on doing PCR. I went and did some gene editing on E. coli to implant glowing fluorescent protein. You can see this is the result of my experiment. These are E. coli bacteria inserted with green fluorescent protein. Actually, no, it's red cherry fluorescent protein. So I began talking with the people who ran the space about, like, you know, I'm a security researcher. I'm a security engineer. I think I have a project that we can do. I'm willing to become a member of the space and pay the monthly fee if you guys will help enable me to do some crazy security stuff with these living organisms. And I did.
You know, and just a note on that: if you have a local community biohacker space, you know, this is a new idea in life. Please support them. Please be responsible. Shout-outs to Mike and Will at GenSpace for helping me out with all this. And that's my local biohacker space, by the way, GenSpace in New York. There are so few of these, and you'll see in a second why I couldn't have done this without a community space to do it in. We'll talk about it.

So what we're going to attempt here is a form of DNA data storage, which is not a new thing. I didn't invent a lot of these ideas. I'm just looking for a DIY implementation of it that someone with a minimal molecular biology background could implement on a minimal budget and do successfully. And if you want to think about the dream attack here, what we're looking to do is take, you know, the codes, maybe the launch codes, and put them into a bacterium, where your local TSA agent is not going to think to look for them. You know, oh, hey, this is just my bottle of E. coli. Think nothing of it. They can't ask you to decrypt that hard drive. And even if they were smart enough to sequence your DNA, they wouldn't know where the heck to look for it. It's such a big space to search for that data. And it's encrypted. That's the other thing we're going to do here.

So we're going to attempt to implement biological encryption. You know, living stego, if you will. Now, people have been trying this on a research and experimental level for a while. There are some great papers, going back to the 90s even, suggesting this and going through the viability of various encryption schemes, all the way from, you know, one-time pads to RSA and that kind of stuff, or, you know, doing public key cryptography in DNA. The vast majority of it is sort of theoretical. There are groups working on this in practical terms, but there are limitations. We'll talk about that in a second.
It's not really the same as what I'm trying to do. And there are definitely limitations that hold you back from doing this. You know, if you read my slides later on, you can take in the breadth of the research. A lot of it proposes using something called a DNA chip, which is sort of a hybrid between something you buy and something you cook up in the lab, and which lets you have a stable binding of the DNA that can be easily read from and written to. It's too expensive for me, so I'm disregarding it entirely. And this sort of side channel attack, or steganography in DNA, was proposed as far back as 1999 and continues to be talked about. Here's what one of those DNA chips looks like, by the way.

But in general, there are a lot of crypto papers, papers from cryptographers, less from biologists, talking about how you would do this, the data science of whether it's viable, how you would implement the encryption, but not a whole lot of practical lab results. And again, I get the sense that people are doing this, but a lot of the actual experimental data isn't public. Like, for instance, I'm aware that there's a lot of DNA data storage research going on, like at IBM, and their practical results are not public.

So my idea is to do it small and do it simple. Because I don't have a lab of my own or funding or a team of grad students or microarrays. I'm just going to cook some cryptography up in an E. coli culture in a way that can be replicated by DIY people. And the way I'm going to do that is to edit a plasmid (we'll talk a little bit about what that is in a second) via restriction enzyme digestion, which is the classic gene editing technique that goes back to the 70s with Watson and Crick. And we'll sequence the culture and see if we can get the plain text back out. And then maybe giggle at the results madly, because this is just too cool. So what we're going to want to do is first construct something called an oligo.
And for our purposes, simplified in this talk, we're just going to get a little piece of DNA that we're going to implant into our host organism. If you're going to get DNA synthesized, if you're going to have some, like, arbitrary DNA made up for you, the various DNA shops that will build the DNA for you have guidelines on what oligos should be. We'll talk about the practicalities of how that falls into my project in a second. But there are a few no-goes for what can go into DNA.

And we're going to put our oligo into a plasmid. And a plasmid you can think of as like a DNA donut. For our purposes, in our host organism, a plasmid is like a little ring of DNA that exists outside of the chromosome, which is nice for a couple of reasons. Mainly that it's unlikely to directly negatively affect the life cycle of the host organism. And also, bacteria share these plasmids between each other rather promiscuously, so it's a very easy environment to do gene editing in. Bacteria normally just swap out these plasmids on their own. So inserting them forcibly into the bacteria is actually not going to be abnormal to their life cycle. If you've ever heard of antibiotic-resistant bacteria, one way that property is sort of passed between bacteria is via plasmid transfer. Here's what a plasmid looks like under a scanning electron microscope. Again, this donut structure. And here are some bacteria sharing plasmids via what's called bacterial conjugation. They actually sort of reach out this long kind of tendril and swap stuff out. It's pretty freaking cool.

So step one, we're going to pick an amenable model organism. In my case, I picked E. coli because it is cheap, easy to work with, readily available, and I've worked with it before. Next, we're going to pick an amenable encryption strategy. We want something that works with a small amount of data, that's low complexity, high integrity, and low cost, and that I can do in a community lab. Okay?
Affordable DNA synthesis. And if you're a little confused when I keep talking about synthesis, you can just pay a company to make you arbitrary DNA. You can order DNA in the mail, which is freaking cool. Every time I explain that to people, they're like, no, really? But size is a factor. Anything over about 60 nucleotides gets radically more expensive. And things like self-complements or palindromes, basically, any time the complementary DNA nucleotides are repeated backwards on each other, create a situation where you can get something like a hairpin that will ruin your synthesis. Repeating patterns can be a problem for sequencing. We want to avoid all that stuff.

So for my purposes, the classic encryption algorithms are just too big. The key size on its own is not going to work. It's too complex. There are papers that attempt to do this in DNA, but it would involve sequencing a lot more DNA than I want to spend money on. So I decided to keep things classic, keep it simple, and go with a one-time pad. It can be small. It can be perfect. I don't want to mess with key sharing for DNA. And the likelihood of anyone even finding where the encrypted data is, is pretty low. And since our message sharing is going to be so slow anyway (we're talking about mailing or walking DNA back and forth between our crypto partners), the key-sharing problem with one-time pads isn't really going to be a practical problem.

So here are just some nucleotides. We have to find a way to encode our message. We've got to turn ASCII text into nucleotides. So we're going to convert ASCII into ACGT. And the exact method by which we do this determines the density of data we can encode. Let's check that out. So we can map 0, 1, 2, 3 onto adenine, cytosine, guanine, and thymine. I had no idea before that a base-4 digit is called a quat, by the way. I think that's really funny.
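The mapping of 0, 1, 2, 3 onto A, C, G, T can be sketched in a few lines of Python. This is an illustration only, not the speaker's tool: the function names, the most-significant-quat-first ordering, and the default of four quats per character are assumptions made for the example.

```python
NUCLEOTIDES = "ACGT"  # quat 0 -> A, 1 -> C, 2 -> G, 3 -> T (mapping from the talk)


def text_to_dna(text: str, width: int = 4) -> str:
    """Encode each character as `width` base-4 digits, written as nucleotides.

    `width` (quats per character) and the digit order are assumptions.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code >= 4 ** width:
            raise ValueError(f"{ch!r} does not fit in {width} quats")
        digits = []
        for _ in range(width):
            digits.append(NUCLEOTIDES[code % 4])  # least significant quat first
            code //= 4
        out.append("".join(reversed(digits)))     # store most significant first
    return "".join(out)


def dna_to_text(dna: str, width: int = 4) -> str:
    """Inverse of text_to_dna for the same width."""
    if len(dna) % width:
        raise ValueError("sequence length is not a multiple of the width")
    chars = []
    for i in range(0, len(dna), width):
        value = 0
        for base in dna[i:i + width]:
            value = value * 4 + NUCLEOTIDES.index(base)
        chars.append(chr(value))
    return "".join(chars)


if __name__ == "__main__":
    encoded = text_to_dna("A")  # 65 is 1001 in base 4
    print(encoded, dna_to_text(encoded))
```

With four quats per character there are 4^4 = 256 possible values, so every ASCII code fits without a custom lookup table.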
There is no standardized term for a base-4 byte either, because there are bytes for normal bits, so bits to bytes, and then trits to trytes, but quats to "quytes"? I think everyone just gives up at that point. I found only fragmentary data on what you're supposed to call that. So we have a base-4 digit, and then we have to decide, and this is the most important thing, the byte width. How big our bytes are going to be is going to affect how much data we can encode, and also, like, the size of our ASCII table. Some people call a base-4 quat byte a crumb. I've run into that. That's really weird. And here's our comparison. So if we use a 4-quat byte, we can get the entire ASCII table. 3 is acceptable. 2 is kind of sketchy. So for simplicity's sake, I went with 4.

Next, we have to design a primer. And so the easiest way to think about this is that when we make our oligo, we're going to have encrypted data, right? And on each end of that data, we need some sticky bits that are going to help anneal that data onto our plasmid. So we have, like, most of a plasmid ready to go, just bought from a lab, and we're just going to, like, stick on our data that's synthesized. So what we want is a 20-nucleotide primer as a binding site and a 20-nucleotide reverse primer. And it's going to be in the shape of primer binding site, payload, and reverse primer. And then, you know, I had to check with my lab to make sure that that format would work. And then I had to check with the synthesis people to make sure that they were okay with it. Lots of advising going on.

If you check with each of the synthesis companies, they'll give you some advice on what's good for them. I use a company called IDT in my case. Usually they have some pretty elaborate, like, JavaScript toys to try and give you an idea of whether the sequence you've given them works. But it's not bulletproof. So you want to avoid, like I said before, repeating sequences, which can cause some problems.
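A rough sketch of the oligo layout described here (primer binding site, payload, reverse primer) plus a crude screen for the problems mentioned in the talk. Everything beyond the layout and the roughly 60-nucleotide cost threshold is an assumption for illustration: the function names, the homopolymer limit of 4, and the 6-nucleotide stem length are arbitrary choices, and the screen is far less thorough than a synthesis vendor's own checks.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MAX_LENGTH = 60  # from the talk: cost climbs sharply above about 60 nt
MAX_RUN = 4      # assumed limit on identical bases in a row
STEM = 6         # assumed minimum stem length worth flagging


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def build_oligo(binding_site: str, payload: str, reverse_site: str) -> str:
    """Lay out the oligo as binding site + payload + reverse primer site."""
    return binding_site + payload + reverse_site


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    longest = run = 0
    previous = ""
    for base in seq:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def screen(oligo: str) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if len(oligo) > MAX_LENGTH:
        warnings.append(f"length {len(oligo)} exceeds {MAX_LENGTH} nt")
    if longest_run(oligo) > MAX_RUN:
        warnings.append(f"homopolymer run of {longest_run(oligo)}")
    for i in range(len(oligo) - STEM + 1):
        window = oligo[i:i + STEM]
        # A window whose reverse complement appears at or after it can pair
        # with that region (hairpin) or with a second copy of the oligo.
        if reverse_complement(window) in oligo[i:]:
            warnings.append(f"self-complementary stretch: {window}")
    return warnings


if __name__ == "__main__":
    print(screen("ACGAATTCAC"))  # contains the palindromic site GAATTC
```

Note that two 20-nucleotide flanks already use 40 of the roughly 60 inexpensive nucleotides, which is one reason a compact encoding and a short message matter.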
Self-complements can cause some problems. There's a good explanation of the self-complement problem right there. Basically, each nucleotide has a complementary sort of letter, and it can cause the DNA to fold in and sort of bind with itself, which is bad. You also have to do some analysis on annealing temperatures. Depending on exactly what base pairs you put on, it affects the annealing temperature. If the primer and the reverse primer have radically different annealing temperatures, that can throw off the entire experiment.

Then, once you've done all that, you can put your little oligo into BLAST. The National Institutes of Health has a website where you can search to make sure that your oligo isn't part of some pathogen or something like that. Here are the results for the oligo that I most recently had synthesized. It's such a small pattern. It comes up in all kinds of other fragments from other organisms. Nothing too scary in there.

I automated this process with a program. It converts ASCII into our base-4 representation and applies a one-time pad. You can configure the byte size and decryption as well. It's up on my GitHub. It's still marked private right now. I'm going to set it public right after this talk. Let's do a POC on that. We're going to convert some leetspeak into DNA because that makes me laugh. A commercial synthesis company actually took this same leetspeak and synthesized it for me and made DNA without any complaints, which also amuses me. What the tool doesn't do is some of the more robust structural checks on the DNA, like self-complement and temperature. I use external tools for that. That's on the to-do list for the tool itself. Also, again, when you order from synthesis companies, they will do some of that for you. For our one-time pad, you need a key that's the same size as the message. It's user-specified in the tool.
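A minimal sketch of a one-time pad applied to a nucleotide-encoded message. The talk does not say how the tool combines message and key, so the per-quat addition modulo 4 used here is an assumption, as are the function names; the only property taken from the talk is that the key must be the same size as the message.

```python
import secrets

NUCLEOTIDES = "ACGT"  # quat 0 -> A, 1 -> C, 2 -> G, 3 -> T


def random_key(length: int) -> str:
    """Generate a fresh key of `length` nucleotides from the OS random source."""
    return "".join(secrets.choice(NUCLEOTIDES) for _ in range(length))


def _combine(data: str, key: str, sign: int) -> str:
    # Assumed scheme: add (encrypt) or subtract (decrypt) the key quat mod 4.
    if len(data) != len(key):
        raise ValueError("one-time pad key must be the same length as the message")
    return "".join(
        NUCLEOTIDES[(NUCLEOTIDES.index(d) + sign * NUCLEOTIDES.index(k)) % 4]
        for d, k in zip(data, key)
    )


def otp_encrypt(plain_dna: str, key_dna: str) -> str:
    return _combine(plain_dna, key_dna, +1)


def otp_decrypt(cipher_dna: str, key_dna: str) -> str:
    return _combine(cipher_dna, key_dna, -1)


if __name__ == "__main__":
    message = "CAACCGAC"            # an already-encoded payload (hypothetical)
    key = random_key(len(message))  # never reuse a key for a second message
    cipher = otp_encrypt(message, key)
    print(cipher, otp_decrypt(cipher, key))
```

Because the ciphertext is still a string over A, C, G, T, it can be dropped between the primer sites like any other payload, though it should still be screened for repeats and self-complements before ordering.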
Believe me, you'll have plenty of time to figure out new keys in between one-time messages in this process, because it's like a two-week turnaround to get the entire E. coli gene editing thing going. If you're not familiar with one-time pad encryption, you need a book of unique keys that are never reused. Having that going between two labs, it's doable. The plain text we're going to use is elite feat, because we accomplished something cool. The key is going to be CripKid, because we're using this script to become a genetic engineering script kiddie.

Here are the help and arguments for my tool. You can load the key from a file. You can do different byte sizes and whatnot. If you want to do this at home, if you need to fit more information, you can use a smaller byte size, et cetera. Here's what it looks like being run. It spits out some nice nucleotide encodings, which is cool. Here's a decryption. It decrypts to elite feat.

Once I had that and pasted on my primer and reverse primer, I could order the DNA for synthesis. Except not. This is where the community lab thing comes in. For better or for worse, at least in the United States, in our system, you can't just order stuff from labs anymore, which is kind of sad, but I guess it's for the best. If you're ordering DNA from places like IDT, they're going to ask for your credentials as a biologist, what lab you work for, et cetera. Which has some pluses and minuses, but it sucks if you're a hacker. In my case, without the relationship with the community lab, I would be unable to order this DNA, period, even though there's not much I could do with it without a lab in the first place. There's a lot of stuff in science you can't order without access to a legitimate lab, which kind of sucks. And it would be pretty tough to do the synthesis on your own, because it requires some nasty chemicals that are hard to handle. You'd end up having to do it in a lab in the first place, unless you're very brave.
So next up, we're going to be editing our E. coli. And, you know, the high-level protocol here: we're going to use a restriction enzyme to edit our plasmid. Same technique Watson and Crick used in the 70s. We're going to snip the plasmid, leaving some sticky ends, which our synthetic DNA may attach to. If you want to see lab protocols for how to do that, I'll give you some suggestions. Genscript.com is a really great site for checking out genetics protocols. They list out stuff like this, or how to do really accessible versions of PCR. One of the reasons I did this version of the experiment is because it's very well documented. There are many protocols on it. There's lots of advice on it. People do it in college all the time.

But let's show some cool pictures, because it's not real until you see it, right? So here's the DNA I ordered from IDT. It comes in little vials like this. Here's some more. You know, that's leetspeak-carrying DNA. PCR machine; I had to do some amplification on this. Plates, little carriers for the vials. More plates and PCR gels, trying to amplify the stuff pre-sequencing. And here's Genscript's protocol on this. A little too nitty-gritty to go through in deep detail. But you add a bunch of reagents together, do some centrifuging, and there's a temperature cycling step that gets the plasmid to accept our DNA. Tons of variations on this.

But once we have this done, we want to verify that the DNA is in our E. coli. We want to send it out for sequencing. And again, this is another thing you're going to have trouble doing without access to a legit lab. A lot of places will sequence your human DNA. This is a little different from getting stuff like 23andMe, right? Like, asking people to sequence bacteria is a little different.
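To make the "snip the plasmid, leaving sticky ends" step concrete, here is a toy model of a single restriction cut on a circular sequence. The talk does not name the enzyme used, so EcoRI (recognition site GAATTC, cutting after the first G on the top strand) is assumed purely as a familiar example; the function names are also assumptions, and only the top strand is modeled.

```python
ECORI_SITE = "GAATTC"  # assumed enzyme; the talk does not say which one was used
CUT_OFFSET = 1         # EcoRI cuts the top strand between G and AATTC


def find_sites(plasmid: str, site: str = ECORI_SITE) -> list[int]:
    """Start positions of `site` on a circular sequence, including wrap-around."""
    doubled = plasmid + plasmid[:len(site) - 1]
    return [i for i in range(len(plasmid)) if doubled.startswith(site, i)]


def linearize(plasmid: str, site: str = ECORI_SITE, offset: int = CUT_OFFSET) -> str:
    """Cut a circular plasmid at its single recognition site.

    Returns the top strand of the linear product. With EcoRI the new ends
    carry the 4-nucleotide AATT overhang that an insert can anneal to.
    """
    sites = find_sites(plasmid, site)
    if len(sites) != 1:
        raise ValueError(f"expected exactly one site, found {len(sites)}")
    cut = (sites[0] + offset) % len(plasmid)
    return plasmid[cut:] + plasmid[:cut]


if __name__ == "__main__":
    toy_plasmid = "TTGACGGAATTCCATGCA"  # hypothetical 18-nt "plasmid"
    print(find_sites(toy_plasmid), linearize(toy_plasmid))
```

Checking that a plasmid has exactly one site for the chosen enzyme matters: more than one cut would drop a fragment instead of simply opening the ring.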
They want to know that you know what you're doing, and they're also going to charge you more if you don't know what you're doing, because you can send them a plate of bacteria and ask them to figure it out, but they're going to charge you a lot more than if you have something very purified for them. At that stage, once you've done that, you're free to go interpret your results. The lab's going to send you basically chromatograph readings. They're going to turn out to be a bunch of nucleotide strings. It's your time to check and see if things went wrong or right. Results look kind of like this. That's where, if you had the repeating patterns or the self-complements, things could have gone very badly, because again, the repeating patterns tend to throw the sequencing machine for a loop. It makes the head kind of think like it's gotten stuck or something like that. Once you have done all that, you can take your secret data anywhere. Anywhere that E. coli is accepted, so I don't know if that's the airport or not.

How much did this cost me? About 300 bucks for a lab membership. Synthesis cost me about 40 for a pretty good quantity of the DNA. I did pay about a couple hundred bucks for lab expendables like enzymes and that kind of thing. That's more than enough to last me for a while. Sequencing cost me a bit more because I paid for a primo sequencing kind of situation. About 800 bucks, but if I did it again, and I'm continuing to do it right now, it costs less every time because I don't have to buy most of that upfront stuff.

What's the method? What's the point? No one is ever going to guess your data is in that bacteria. No one's going to ask you to decrypt it. It's a non-obvious storage medium. There will never be any key escrow on your bacteria. It would take a very dedicated effort for the NSA to backdoor your bacteria. You get to birth synthetic life, which is pretty cool. But it is pretty expensive. It's slow. Keeping one-time pad books sucks.
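Once the sequencing lab returns nucleotide strings, pulling the payload back out can be sketched as a search for the two known flanking sequences. This is an illustration under assumptions: the flank sequences and function names are hypothetical, reads are treated as clean strings of A, C, G, T, and real reads with ambiguous bases or sequencing errors would need a tolerant alignment instead of exact matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def extract_payload(read: str, left_flank: str, right_flank: str):
    """Return the bases between the two flanks, or None if they are not found.

    The read may come from either strand, so its reverse complement is
    tried as well.
    """
    for candidate in (read, reverse_complement(read)):
        start = candidate.find(left_flank)
        if start == -1:
            continue
        start += len(left_flank)
        end = candidate.find(right_flank, start)
        if end != -1:
            return candidate[start:end]
    return None


if __name__ == "__main__":
    left, right = "GATTACAGGC", "CCTTGAGTCA"  # hypothetical flanking sequences
    read = "TT" + left + "CAACCGAC" + right + "AA"
    print(extract_payload(read, left, right))
```

The recovered payload would then go through the inverse of whatever pad and encoding were applied before synthesis.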
There is a problem: since our synthetic DNA, or whatever, does not provide any evolutionary advantage to the bacteria, to E. coli, there's a high chance of it generationally just doing away with it, evolving away the plasmid you put in there, because it has no reason to keep it. Bacteria sort of evolve pretty quickly. You can't really do this without the involvement of several labs. But if you're really dedicated, you can. The cost per byte is a little high compared to on-chip AES, just saying.

What am I going to do with this in the future? I'm going to try this out in new model organisms, put it in stuff like yeast genomes with CRISPR, try to implement some more elaborate algorithms, yet cheaply. Then maybe a viable method to package a living colony for stealthy transport. Do some longer-term testing to see how long our encrypted information is capable of living in the bacteria without being corrupted. There's a lot of talk in the literature about error-correcting code implementations to make sure that your stuff doesn't go bad. Making the lab people less mad with fewer begs for help. Promising idea: with yeast, you might be able to make a dried yeast or something where our information stays alive pretty well.

I hope you guys had fun. If you're a well-informed molecular biologist, I'm so sorry. This is going to be like when I watch CSI: Cyber or something. Yeah. I'm a hacker. I dabble. That's what I do. Please get involved in the biohacking community if you're not already. This stuff is pretty accessible with people to teach you. Just make sure you're around people with the right credentials to teach you to handle lab equipment safely, so you don't get bacteria in your eye or something. There's also a lot of fun bioinformatics work to be done. If you're thinking of starting a community lab, please do that. The world needs more of those. Please do hit me up on GitHub or Twitter if you're interested or have questions about my project. In general, have a happy time biohacking.