I'm going to start and just remind people a little bit of what DNA is. DNA is the hard drive, the memory in every cell of every living organism that has the instructions for how to make that cell. But it's a chemical molecule, it's not magic, it's not special. It's in fact four different kinds of molecules that can be stuck together in a chain, a larger molecule that is a chain, and you can put those four in any order. And if you can read that back, you have a sequence of characters, if you want to think of it like a digital code. You can see on the right a little representation of what those four molecules are. In the middle is the famous double helix structure that nature uses to store a sequence of them stably in every cell. But if you read out the information that's there, it's just like a ticker tape of letters, each one being one of the four possibilities. And there are three billion of those, and that mere three billion letters defines your genome and all the instructions to make a living human or a living ant or yeast or any organism you can imagine. It's sort of like this, but it's incredibly small. We have a big data revolution in genomics, and this next slide illustrates why that's happened. Ten years ago, the cost of sequencing a genome of one person or one living organism was about the same as the price of this, which is famously the most expensive house in London. And ten years later, the cost of sequencing that one genome was the price of this, which is a season ticket to see Arsenal Football Club, which is one of the top UK clubs. And in another ten years, the cost will be the cost of going to watch one terrible club play one game. So the price is plummeting, and so scientists are very excited and they do more and more genome sequencing. And then after they've done that and they've done their experiments, they want to keep the data safe.
And unfortunately, that's where my job comes into it, because they send that data via the internet to my institute, the European Bioinformatics Institute, and they ask us to store that information. So our PR people give me a picture of a mountain and a graph, and I don't even know which graph it is, because all our graphs just go up exponentially. So that's one of the databases, or maybe all of the databases. We have a 60 petabyte storage system now that we're filling up. It's nearly full. And that gives us a bit of a headache on how we go about storing that data, because our budget does not go up exponentially. But if you'd like to put pressure on someone you know to give us more funding, please do. So we buy more and more computer servers and more and more hard disk drives to store this information. And we have headaches doing this. How do you capture exponentially increasing data and serve it back to the whole world on an essentially flat budget? Why do you store this data? Do people really care? Well, people really do care. And we just have a little live demo. What we have here is showing you, in real time, the hits on our website. We have about 100 times more hits on our website than CERN does on theirs. This is the people in real time using the website. If anyone's got a smartphone and they're quick, hit this QR code or that URL there. And if we're lucky, we'll see Davos light up. I don't know if that will work. Probably your phones might not be registered as appearing in Davos. But we get thousands of hits a minute, millions of hits a month. One thing we realized is that all this information we're storing, it's about DNA, but the DNA we're storing information about is itself a digital storage medium. It's a sequence of, not zeros and ones, like in your computers and smartphones, but a sequence from a discrete alphabet of four letters. And if we could manipulate some DNA, we could put a message in there ourselves and we could use DNA to store it.
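To make the idea of a four-letter digital alphabet concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per base, purely for illustration. The talk does not describe the code in this form, and the function names `bytes_to_dna` and `dna_to_bytes` are hypothetical.

```python
# Illustrative assumption: two bits per base (A=00, C=01, G=10, T=11).
# This is NOT the code used in the experiment described in the talk.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Decode a base string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError on any letter outside ACGT
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


message = b"Hi"
encoded = bytes_to_dna(message)
print(encoded)                      # CAGACGGC
print(dna_to_bytes(encoded))        # b'Hi'
```

Any file is just bytes, so a mapping like this could in principle carry text, audio or a PDF alike. A practical code would need more than this, for example ways to cope with read and write errors.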
DNA is a really good way of storing information. It's been used for hundreds of millions of years by life on Earth. Evolution has used that as its hard disk drive. Maybe we could use that. So we devised an experiment to see if this was a feasible way of archiving and storing information. One of the things we needed to do was decide what information to store and what kind of code to use. So we had to invent a code that would store information. We didn't just want to store information about genes and processes in a living body. We wanted to store any kind of digital information, just as your computer and your phone and your iPod can do. There were various problems we had to overcome. We can't make very long pieces of DNA as humans yet. So we had to invent a code that could make a message out of many short fragments of DNA. We had to devise a code that could store any information. We had to be aware of problems that would come and errors that might occur in the writing or the reading of that information, just like digital TV transmission or mobile phones have error correction in them. So we devised a code that would do that. We picked some bits of information to store and we thought, what would be high value information you'd want to store a long time in a DNA format? Maybe poetry: the sonnets of William Shakespeare, all 154 sonnets as a text file. We put in, actually not the picture of Martin Luther King, but an excerpt from his I Have a Dream speech, which I hope. So I want to emphasise, not a transcript of what he said: the actual recording of what he said in an MP3 format, recoded into our DNA code, written into pieces of DNA. Because we're molecular biologists at heart, a PDF copy of the Watson and Crick paper from 1953 describing the helix structure of DNA in living cells. And we encoded those and we had that made into DNA by the Agilent company in California, and it came in a test tube exactly like that one. It wasn't nearly full.
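The two requirements mentioned above, building a message from many short fragments and tolerating read or write errors, can be sketched as follows. This is an assumption-laden illustration, not the scheme used in the experiment: the fragment length, the step size, the majority-vote rule and the names `split_into_fragments` and `reassemble` are all hypothetical choices made for the example.

```python
from collections import Counter


def split_into_fragments(seq: str, length: int = 8, step: int = 2):
    """Cut seq into overlapping (start_offset, fragment) pairs.

    Because step < length, positions away from the ends appear in
    several fragments, which gives redundancy against errors.
    """
    last_start = max(len(seq) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail is covered
    return [(s, seq[s:s + length]) for s in starts]


def reassemble(fragments, total_length: int) -> str:
    """Rebuild the sequence by majority vote at each position."""
    votes = [Counter() for _ in range(total_length)]
    for start, fragment in fragments:
        for offset, base in enumerate(fragment):
            votes[start + offset][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)


original = "ACGTACGTTGCA"
fragments = split_into_fragments(original)

# Simulate one misread base in the middle fragment (position 5 overall).
start, frag = fragments[1]
fragments[1] = (start, frag[:3] + "T" + frag[4:])

print(reassemble(fragments, len(original)) == original)  # True
```

The stored offset plays the role of an index that says where each short fragment belongs, and the overlap means a single bad base is outvoted by the other fragments covering the same position. Positions near the two ends are covered by fewer fragments in this toy version, so they are less well protected.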
In fact, when it arrived, I opened the box up and I held it up and I thought something had gone wrong because it was empty, and my more skilled molecular biology friends had to explain to me that that tiny smudge of dry dust sticking to the bottom of the tube was the actual DNA, and it was a tiny, tiny speck. If the whole thing was full, we would have a petabyte of information in there, and it's hard to imagine what something the size of your finger with a petabyte of information in it is like. So in other terms, I've just paced out the size of the stage and by my calculations, if you laid out CD-ROMs all over the stage, you get about 1,000 of them on here. If you did that 1,000 deep, so it'd be up to about here somewhere, this whole stage this deep in CDs, that's a petabyte. So you can either have that much information stored in that format or something the size of your finger in DNA. So it's really compact, that's why earlier on I reminded you that DNA is very small. And we devised our entire experiment from information, on the top left, on my computer, my laptop computer. We sent that to Agilent in California, they synthesized the DNA for us, they sent it back by courier, thank you FedEx, and it came back to us in Cambridge. We did a bit of purification. In fact it went over to our German laboratories because that's where we have our sequencing facility, and it was read in a DNA sequencing machine exactly the same as used for human genomics or any genomics experiments around the world. And we recreated the computer files back on my laptop, and it doesn't make for a very good image but they come out exactly the same. They weren't just similar: every single bit, every single zero and every one was correctly reproduced.
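The stage-full-of-CDs comparison can be checked with rough arithmetic. The sketch below assumes a typical CD-ROM capacity of 700 MB, a figure that is not given in the talk, and uses decimal units throughout.

```python
# Assumption: a typical CD-ROM holds about 700 MB (not stated in the talk).
CD_CAPACITY_BYTES = 700 * 10**6
CDS_PER_LAYER = 1_000   # "about 1,000 of them" covering the stage
LAYERS = 1_000          # stacked "1,000 deep"
PETABYTE = 10**15       # decimal petabyte

total_bytes = CD_CAPACITY_BYTES * CDS_PER_LAYER * LAYERS
fraction_of_petabyte = total_bytes / PETABYTE

print(total_bytes)            # 700000000000000
print(fraction_of_petabyte)   # 0.7
```

Under that assumption a million CDs hold about 0.7 petabytes, so "a petabyte" is a fair round figure for the comparison.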
So we have essentially a proof of principle that we don't just have to marvel at the way nature has evolved a system for storing information in DNA in every living being, but we can use a very similar kind of system and use the same chemical molecule that's so good for storing information. I have a few science bits, but I'm always warned, like Stephen Hawking said, have no equations, and I should probably have no graphs, but I can't resist. We did various studies in order to get our scientific paper published. What I'm showing on the left, that I don't expect anyone to fully understand in the one minute I'm going to devote to it, is the fact that for the size of our experiment, our coding system is reasonably efficient and works quite well. As you store more and more information, there are some technical issues you have to deal with about how you reconstitute that information, and it gets marginally less efficient, and we were asked, would it still be viable for larger amounts of data? And what we showed is that from our experiment, which is sort of on the left hand side of the graph, over to all the information on any device on the entirety of planet Earth, we're still pretty efficient. That's what the three zettabytes data point represents, and we remain adequately efficient even up to many, many orders of magnitude more information. So basically it does scale up quite well, but if you want an easier image to take home with you, you could get all the information in the whole world encoded in a DNA format in the back of one, for Americans, SUV, and for the English, in the back of one estate car. So you don't need many, many data centers all over the world. All that information in principle would fit in one vehicle. Can we get the information back out? Well, we can do that. That's pretty efficient. That's what this revolution in DNA sequencing technology, that's brought you the season-ticket-priced genome, has brought. Actually, the latest devices look like that.
Actually, I have a slightly more up-to-date one here that can go around the room. This device could sequence, well, they're in beta testing at the moment, but within months people will be able to sequence a whole genome in a day or so. I'm very happy for it to go around the room but I do need it to come back to me at the end. The one on the screen has a USB plug which would plug straight into your computer, and they had to admit they couldn't actually design that, but it does come with a USB cable to plug it directly into your computer or maybe your smartphone. We can read DNA sequences pretty quickly. That's not the problem. We can make lots and lots of copies of them. That's really easy. It's actually better than a photocopier. A photocopier, you have to run it once every time you want one more copy. Copying DNA is really easy and it works exponentially quickly. You start with one and you make two copies. From those two you can make four, and you don't need a different machine for every one of those. It's just one very standard piece of laboratory equipment that can very rapidly grow exponentially many copies of what you started with. So copying it is not the problem. Once I have my message I can distribute it to all of you very easily, but writing that in the first place is very difficult. Humans are not good yet at writing the first new copy of DNA. Once we have one we can make more copies of it, but making that first one is difficult. This is Agilent's facility in Palo Alto. They need a clean room. They need a very, very complicated machine, a bit like an inkjet printer but more complicated, and it's slow and it takes a while. This is very much the rate-limiting step in the procedure we're working with at the moment. It takes too long and it's very expensive. So the idea of having all the information in the back of one vehicle, there isn't enough money on earth at the moment to do that.
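The contrast drawn above between a photocopier and DNA copying comes down to linear versus exponential growth. A minimal sketch, assuming an idealized process in which every copy is perfectly duplicated in each cycle (real laboratory copying is less than perfectly efficient); the function names are hypothetical:

```python
def copies_after(cycles: int, start: int = 1) -> int:
    """Copies present after a number of perfect doubling cycles."""
    return start * 2**cycles


def cycles_needed(target: int, start: int = 1) -> int:
    """Smallest number of doubling cycles giving at least target copies."""
    cycles = 0
    copies = start
    while copies < target:
        copies *= 2
        cycles += 1
    return cycles


print(copies_after(10))          # 1024
print(cycles_needed(1_000_000))  # 20
```

A photocopier run 20 times yields 20 extra copies, while 20 ideal doublings yield over a million, which is why copying is not the bottleneck and writing the first molecule is.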
I want to finish off with some ideas about why this is a good thing to do, and depending on your age you will recognize or not recognize some of these, but what they illustrate is a whole bunch of digital media or potentially digital media that have been used. I have used every single one of these. If you've got any of these things lying around in your house or your office or in your garage or whatever, you won't be able to read them anymore. I'm pretty sure about that. Within a few years these things become obsolete, and that's a problem if you're trying to do long-term archiving. I would argue that no one on the earth is currently long-term archiving digital information, yet most information is now being created digitally. Currently it's stored digitally. It's observed, looked at on your screen digitally, but we can't archive that information. Some things only exist in a digital form. Movie companies do things entirely digitally. Now they shoot things digitally. They mix it all up digitally. They do different versions for the airplane or the 3D movie or the 2D version of the movie. They show it in the movie cinema digitally. When they want to make an archive copy, then they make an analog celluloid copy because they know how to do it. That's not a good way to go. But if we go through them, how long will hard disks work? How long will memory sticks work for? How long will DNA work for? It works a very long time. How do we know that? We've done the experiment. We've looked at mammoth DNA that's 20,000 years old. Neanderthal DNA is 40,000 years old. The bison is 60,000. The current record is ancient horses. 700,000 year old DNA sequences have been successfully read from samples found. That wasn't even a carefully done experiment. That was a dead horse that lay there somewhere cold. If you do something careful with it you can make the DNA last much longer. We have the facility already, essentially. It's not difficult. It's not complicated.
All you need is somewhere that's cold and dark and dry. And even those aren't entirely necessary but they're the best conditions. The global seed vault in Svalbard in Norway already exists. What they actually store there is seeds. But this is a facility that costs almost nothing to run. There are no staff there. It's in the Arctic Circle. As long as you shut the door, it's dark in there, it's freezing cold, it keeps it dry. It's very cheap to run a facility that can store this kind of information. If you want to do it yourself, your refrigerator is just perfect. And if you're really conservative your freezer is just perfect. Will we have a technology to read it back? Well, we will because it's DNA. As long as we have humans who are technologically advanced, we will be able to read DNA. We're changing the machine every year or two as the technology has improved so much. But they can all do the same thing. They can all read DNA. As long as there are people, there will be DNA readers. The same is not true of a floppy disk drive or probably a USB memory stick. What are we going to store in the long term? Well, it's very expensive. So if it's very expensive and it's going to last a very long time, you're going to start off with things that have a very high value. Maybe the presidential records in the US perhaps are considered to have very high value, or information about where nuclear waste has been dumped. It's really important to keep that information safe for a long time in a format where anyone can read it back. But as the technology gets cheaper, what do you and I think? When it's down to a few dollars, maybe your family photographs or something you would spend a little bit of money to have in a really safe format. Put somewhere out of harm's way where you know that future generations will for sure be able to read it back. I'm just going to finish off by thinking on a little bit further. Money is digital information these days.
This is actually not a real coin. That's a pretend Bitcoin. But Bitcoin is a form of money that now really only exists on computers and with cryptography. That's something we can easily store in DNA. And I wanted everyone to be able to go home with a real reminder of what I've been talking about today. My kind assistants are going to come amongst you now and I'm going to finish speaking. What we have done is we have bought a Bitcoin and we have encoded the information of that Bitcoin into DNA. Please take one home. If you follow the link that's here you can get a pointer to a web page that describes our project. It gives you a pointer to the technical description of the encoding method we've used. As for a DNA sequencer, at the moment you don't need to own your own. You can send it off to commercial services to do the sequencing for you. Whoever gets there first and decodes it, the Bitcoin is yours for the taking. The first person can claim it as theirs. Once someone's swept it up, then it's not available to anyone else. So yes, it is a race. Good luck with that. I'm going to finish there. Thank you very much for listening.