Good morning. I'm Jay Fiedel. This is Think Tech. Think Tech Talks, as a matter of fact. And we're talking about tech this morning with Sarah Winaker, who is a, let's see, a geneticist, right? Am I right, Sarah, in California? And she has spoken to us before about the genetics of COVID-19. But today we're going to talk about something else that's related to that. So, Sarah, can you tell us a little about your work in the area of DNA? Yes, so I was a research geneticist, a molecular geneticist, at the University of California for 25 years. Our focus was the identification of genes that are associated with specific human genetic diseases, such as muscular dystrophy, Huntington's disease, dwarfism, and a number of others. So it started out in terms of identifying those genes, and then we moved to figuring out how changes in those genes actually cause disease. This is very exciting. Back to the days of Watson and Crick and Rosalind Franklin. You remember that story, don't you? She got the short end of the stick, didn't she? Yes, she did. There was a piece on television the other day, and they were talking about the work of Watson and Crick on the double helix. And they didn't mention Rosalind Franklin. I said, see? There, it's happening again. She's getting the wrong end of the stick. But a minute later, they brought her in, that part of the story, and they started talking about how she was badly treated and wasn't given credit for her work on the double helix. But everybody knows today. Yeah, that's right. She didn't get the Nobel Prize, however. No, no, it's never really been properly corrected. And Watson and Crick get all the credit for it. Too bad. Anyway, so we have the double helix. And we have DNA and we can do genetic mapping. And we can do that for anything with DNA. What in the world has DNA? What doesn't have DNA? Oh, that's a good question.
Most living organisms have DNA as their basic genetic material. There are exceptions, and the current virus is one of those. Its genetic material is actually an RNA molecule. Normally, in our cells, the DNA is in the nucleus. It then gets transcribed into RNA that gets carried out to the rest of the cell, which is where all the proteins are made and all the good stuff goes on. But this virus itself is just made of RNA and proteins and lipids. Well, in dealing with the virus and the whole genetic research around the virus and the epidemiology, China has developed tracking systems galore. And those tracking systems are being propagated not only in Wuhan, but all over China. And then you have other places in the world that are trying to do the same thing, including Israel, which is working on an app, or maybe has actually deployed an app, which will allow tracking. And of course, the U.S. has an app by way of a consortium between Google and Apple. I mean, they're working on an app that will do tracking. And this all suggests that the old day of, what do you want to call it, molecular privacy is, like, over. And going forward, surveillance will mean that there'll be a database, a national database, I hate to say this, but a government database, or at least a database of large consortia, that will have our DNA. And it's just sort of inevitable when you see all the surveillance and tracking and all the high tech that's going into trying to figure out the epidemiology of coronavirus. So I would like to talk to you this morning about how ultimately, inevitably, governments and large consortia will be able to collect DNA samples, collect DNA data, how they will keep the data, and how they will retrieve the data when they want to find out who we are and where we are. It's pretty scary.
But can you give us a look into the future about how such a database would be established? Well, let's first talk about which types of databases are out there right now. Oh, okay. Yeah. So there are so-called national databases. These are, for the most part, forensic or criminal databases. The US has one. It's called COVID. No, it's not called COVID. COVID on the mind. Very similar. It's called CODIS. It stands for Combined DNA Index System. And basically it has collected DNA involuntarily from convicted felons. Okay, so this is a national DNA database just for forensic and criminality purposes. And something like 54 countries have a very similar type of national database. How do they get the DNA from the convicted felons? Officially, what is it, a swab? Is it a blood sample? What is it? Yeah. It's generally just a swab, like a buccal swab. So you can scrape off the cells in the interior of the mouth and put them into some sort of buffer that bursts the cells open and releases the DNA. That's sort of a simplified explanation, but it's a very simple, straightforward kind of collection. Okay. And then what? Now you have the released DNA. How do you get from that to a database? Yeah. So then you separate the DNA from all the other portions of this soup that results once you've lysed the cells, run that DNA through a DNA sequencer, and then store that data, which is really just a hugely long string of A's, C's, T's and G's, which are the nucleotides. So you can store that data in the database. So let me ask, what is the sequencer, and how do you get zeros and ones out of the sequencer? So you're not necessarily getting zeros and ones. You're identifying the sequence of the nucleotides. So A, C, G, T, A, on and on and on and on, for billions and billions of individual data points. So how does the sequencer work? Does it work with light, with some kind of radioactivity? For the most part it's with fluorescent dyes.
So a different fluorescent dye for each nucleotide. Then you run it on what's called an electrophoretic gel. It has a negative charge at the top, and DNA is negatively charged, so that forces the DNA down through capillary action. And then you can just read it off. Okay. And this is all automated in the sequencer? Yeah. And now you have the entire human genome in a number of hours. Okay. Yeah, but we don't have the time to do that. When you're collecting for the existing databases, it must be a lot faster than that. Yeah. So the national DNA databases... our genome, I'm getting an echo there. Our genomes have some variability built into them. So naturally, your DNA is not identical to mine, and is not identical to the neighbor's up the street. And so we have been able to identify which areas of the genome are most variable. And a collection of just 13 of those sites can distinguish one individual from another. So most of the national DNA databases don't sequence the entire genome. They're just using these very small snippets of DNA. So they don't need to, and they can distinguish you from me, from everyone else on the planet, with, what did you say, 13 variables there? Yeah. That's incredible. Yeah. Okay. So now we have that reduced to data, and the data is in a database. What happens with the database? If I have a sample of DNA taken, for example, from a crime scene, right, and I want to find the person who has the same DNA, how does that work? So then we need to do the same sort of process: isolate DNA from the evidence left at the crime scene, perform sequence analysis, or what's called PCR, on that DNA, and then just search the database for a match. And I am looking for a match of the 13 variable characteristics from one person to another.
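As an illustration of the database search just described, here is a minimal sketch, not CODIS's actual software. It assumes each variable site is stored as a pair of repeat counts (one per chromosome copy), which is how forensic short-tandem-repeat profiles are usually written. The sample IDs and repeat counts are made up, and only three sites are shown instead of 13 to keep the example short.

```python
from typing import Dict, Optional, Tuple

# A profile maps a site (locus) name to the pair of repeat counts found there.
Profile = Dict[str, Tuple[int, ...]]


def normalize(profile: Profile) -> Profile:
    """Sort each allele pair so (16, 14) and (14, 16) compare equal."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in profile.items()}


def find_match(evidence: Profile, database: Dict[str, Profile]) -> Optional[str]:
    """Return the ID of the first stored profile agreeing at every site, else None."""
    target = normalize(evidence)
    for sample_id, stored in database.items():
        if normalize(stored) == target:
            return sample_id
    return None


# Toy data: hypothetical sample IDs and invented repeat counts.
database = {
    "sample-001": {"D3S1358": (15, 17), "TH01": (6, 9), "FGA": (21, 24)},
    "sample-002": {"D3S1358": (14, 16), "TH01": (7, 9), "FGA": (20, 22)},
}
evidence = {"D3S1358": (16, 14), "TH01": (9, 7), "FGA": (22, 20)}

print(find_match(evidence, database))  # sample-002
```

A real system would also have to cope with partial or mixed profiles from degraded evidence, which this exact-match sketch ignores.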
That is, matching someone in the database to the DNA collected at the crime scene. Sounds like this would be a perfect job for AI, wouldn't it? Oh, yeah. Yeah. A lot of it is very robotic. Okay. Yeah. So that's the national DNA database as it stands now. The national DNA database is not something, currently anyway, that has collected DNA from the majority of citizens and sequenced the entire genome, which is three billion base pairs. But then, would you want the entire genome if you can distinguish on the basis of 13 characteristics? It depends what you're utilizing the DNA database for. So there are others. These are not national DNA databases, but there are pharmaceutical companies and biotechnology companies that are collecting DNA from individuals, sequencing the entire genome, and then correlating that information with medical and health records. So there is a company, for instance, in Iceland, called deCODE genetics, that has collected DNA, on a voluntary basis, from the majority of their citizens, and correlated that with health records. Sorry, I was getting an echo. Yeah, okay. So therefore they can determine if, in this particular family, say, Alzheimer's is running in the family. They have sequenced the genomes of individuals in that family, and if they find one specific mutation that only occurs in those individuals with Alzheimer's, then you might begin to suspect that that DNA mutation is contributing to the disease itself. So you're making a medical examination using the samples from this family in order to learn what type of mutation causes the disease, not only in this family, but anywhere in the human species. So then that can be expanded to other families. And there may be multiple mutations, different mutations in the different families, or there may be a very common mutation that contributes to the disease. Well, let me go a step further, just on this one example you're talking about.
So now I find that there's a certain mutation in the genetic makeup of the people in the family, and thus in a much larger population. And I say, hmm, I would like to solve this problem. I would like to solve this problem for the human race. I know the little bugger in there that's creating this problem. So I'm going to do gene splicing of some kind. And I'm going to fix this problem and make a medical technique to splice the genes of everybody on the planet who has this mutation. Is that being done? Can that be done? Or is that aspirational only? I think rather than doing that, which would be really prohibitive, what this information does is allow us to identify what those genes are that cause disease, and what potential therapeutics we can use to treat it. Based on knowing what the function of that gene is and what kind of cellular pathway it's affecting, you can kind of fine-tune: okay, what kind of medications can we use to not alter the DNA itself, but alter the process and sort of correct the error? So it's mostly a collaboration now between these companies and pharmaceutical companies. This company that I was referring to in Iceland, called deCODE, has actually been bought by a US company called Amgen, which is a pharmaceutical company. Oh, sure. That big company, been around a long time in San Diego. Yeah. Yeah. So they are utilizing that information for a good purpose, to help resolve some very common public health issues. So there are two sides of the coin. That's very noble, and it will really benefit human health worldwide, that kind of study. But the flip side is that this information currently is coded so that the actual individual identifier information isn't known. But we do know that databases can be hacked and all sorts of things can be found out. So there is a risk of violating privacy.
Just for one detail: what format is this information in, and what kind of database programs are customarily used? I suppose you could put it on a big spreadsheet, you could do that, or better yet, for searching, you would want to put it on a sophisticated database. But which ones are in play for this? Definitely not a spreadsheet. I mean, this is a huge amount of data. When you're talking about the human genome, each individual's genome is three billion base pairs long. Oh, we don't have that big a spreadsheet. No, and that's just one person's genome. And then to be able to look across genomes and find areas that are similar to one another, and correlate the specific mutations with disease, it's got to be a very, very sophisticated computer-based software program. And you would need a big computer for the storage of all this information. My laptop's not going to do it, is it? No, no, it's not. This company deCODE has got, I don't know about warehouses, but massive numbers of computers to store the data. Yeah, it's very interesting about electricity, and it's sort of interesting because Iceland has a lot of geothermal energy, which is also why a lot of these Bitcoin companies are stationed in Iceland, because it's cheap. Yes. Well, you know, I was in Iceland a year ago, for vacation, and I must say that the people are very cool and rational. The whole place works like a fine clock. And you're right about the geothermal. They use it for hot water everywhere. It's a perfect solution. They're very conscious of climate change and they're very conscious of technology.
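To put a rough number on the storage question raised above, here is a back-of-the-envelope sketch. It assumes the three billion base pairs quoted in the conversation and a bare 2-bit encoding per base (four letters fit in two bits). The cohort size of 300,000 people is a hypothetical figure chosen only for illustration, not a number from the interview.

```python
BASES_PER_GENOME = 3_000_000_000  # figure quoted in the conversation
BITS_PER_BASE = 2                 # assumption: A, C, G, T packed into 2 bits each


def genome_megabytes(bases: int = BASES_PER_GENOME,
                     bits_per_base: int = BITS_PER_BASE) -> float:
    """Size of one packed genome in megabytes (10**6 bytes)."""
    return bases * bits_per_base / 8 / 1_000_000


def cohort_terabytes(people: int) -> float:
    """Size of `people` packed genomes in terabytes (10**12 bytes)."""
    return genome_megabytes() * people / 1_000_000


print(genome_megabytes())         # 750.0
print(cohort_terabytes(300_000))  # 225.0
```

This is a lower bound for the finished sequence only. Raw sequencer output, with overlapping reads and quality scores, is typically far larger, which fits the remark that a laptop will not do.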
And I find this whole thing about the DNA database an example of how advanced they are and how open their thinking is: incredibly intelligent and creative, and, as we say, rational, and really concerned about the greater good. So now, okay, Amgen buys this. Now it's in the American universe, so to speak. And Amgen is a private company. I mean, it's a public company, but it's a business, yeah, not the government. Right. How sure can we be this would not fall into the hands of somebody who was up to mischief? I don't know how sure we can be. That is one of the arguments against having these sorts of databases: as careful as we are to protect that data, there are always going to... When you look at, for instance, Experian, something like 80 million people had their data, you know... so there is this... It's gotta happen. It's gotta happen. So really, as with most things in life, it's a balance between the benefit to mankind and the potential pitfalls. And the problem I see is, you know, I'm into making systems that work. And if you want to have a system like this, the power of it medically and, oh God, in every way, demographics, whatever you want, is huge, huge. But it's never going to be as useful if you have people coming in on a voluntary basis, or let me say a loosely voluntary basis. Because if you went to people on the street right now and said, would you mind if we took a swab, because we want to put it in this database, and Amgen, whatever, just a small percentage of Americans would agree to do that. I think a lot of them would say no. And the question is how you incentivize them. There's got to be a way to incentivize them if you want to have a robust sample of data. Any ideas about that?
Well, let me tell you, for another type of DNA database that we're all familiar with, these genealogical databases like 23andMe and Ancestry, they have a letter of consent that you sign, if you so choose, to allow your DNA profile to be used for research. 80% of people agreed. So their genomes can then be used for research and sold to pharmaceutical companies. And so I don't know how many people would be opposed to it or not. But you're right, you've got to incentivize them in some way. And it certainly has to remain voluntary, even though, of course, the power of these things would be that much more significant if it were compulsory. But I hope we don't go down that road. Well, you said it has to remain voluntary, but in the world we're living in, it may not. In fact, I think in China it is not necessarily voluntary at all. Other places too. I think a dictatorship can find lots of ways, not voluntary, to require people to provide it. Yeah, I guess when I was saying "has to," I meant that it should remain voluntary. For example, I'm just thinking loosely, but this is a problem that you and I can, I'm sure, solve right here on the show. Oh, yeah. So I have a public health system. Everybody gets free medical care. It's really wonderful. Everybody loves it. But one of the conditions to that is, if the baby is born in my public health system, we're gonna take his DNA or her DNA. Who's gonna say no? And we're gonna put that quietly in our database. Everybody who's born in that country is gonna be in the database. I think that would work fine. And if you didn't want to give your DNA, fine, go give birth somewhere else.
You know, at home, for example. I think there will be ways where governments or large medical organizations can find a way to get to you. And for the most part, as I said, I mean, they are being utilized for the general good. Yes, they are. But let's take a look at the general bad for a moment. Yeah. Let me, you know, this is Dickens and the Ghost of Christmas Future. How could a mischievous government, or a mischievous hacker for that matter, misuse this data, the genome for a given individual or group of individuals? I mean, let your imagination fly. Okay, well, if we're talking perhaps about committing crimes and framing individuals for that crime: say you find out that your neighbor is in this database, and you go collect material from his trash can, or cigarette butts or something, and sort of scatter that around the crime scene. And that will be matched up to his or her DNA. So you can commit the crime, but frame somebody else for it. That would be one sort of nefarious reason. Another, probably more likely, flaw or disadvantage would be for... oh, I'm getting that echo again. Yeah. For employers or insurance companies to be able to tap into that database and determine whether their potential hires, or potential individuals that they would insure, are, say, prone to stroke or heart disease or some other disease that the company would have to pay increased health premiums for, or something like that. So that is a real concern, probably a more likely concern. Yeah, because you would have a digital divide between the people who have good genes and maybe not-so-good genes, good mutations, not-good mutations. And the ones that didn't have the good mutations, you know, the master race, so to speak, hate to use that term. But there it is.
Those people would have a disadvantage from the day they were born, or the day somebody got that data. Yeah. And I guess that's one of the concerns they have now about potentially using these immunity cards, that people would be more likely to be hired if they were immune to this COVID. So that's not really based on DNA, but it is sort of a biomarker. Well, Sarah, what do we do? Of course, what I get from this conversation is we do have the technology. We have the technology to get the data from the DNA. We have the technology to store the data, to search the data, and so forth. I mean, it's all already there. The only thing missing is grand amounts of data, which someday may be available by one incentive, or non-incentive, or another. So, this is a hard question: what can we do, at least in concept, to prevent the data from getting into the wrong hands and being used for purposes that are really not acceptable from our moral, ethical point of view? Yeah. Well, I mean, the most important thing, I guess, is the entry of this data and the collection of the samples themselves: they have to be dissociated from personal identifier information. So it's one thing to have the sequence itself and the genetic profile, the DNA profile, in the database. But if you can't connect that to an actual individual, then the concerns about privacy are much diminished. Of course, it's less valuable too. Let me offer a thought and see what you would say. You talk about the sequencer, you talk about the machinery, if you will, both biochemical and otherwise, that takes this data and puts this data in the database. Would it be possible, do you think, to have a reliable form of signature that goes into the data, into this huge, billion-field database, to say this was taken by Amgen, for example.
And this is legitimate. And we're real. And if you have this data, you see this data, you know this data has a certain amount of reliability, because it's real. And you can extend that idea to keeping it secret and keeping it private, and only from people who have volunteered to give it. In other words, mark each sample, mark each record, with a signature indicating its authenticity and the quality of its collection. What do you think? That's easy enough to do. I mean, every one of these samples is given an identifier number. But that number is not, or should not be, traceable to an actual individual. Right. So it's anonymous. Yeah. And somebody has another database linking that number with the identity of the donor. Keeping those databases separate would be... So who's working on this? Where can I look at the progress being made, both in the technology, the medical side, and on the information technology side, and on the privacy side? Somebody has to be working on this because the implications are so huge. Yeah. I mean, I keep referring to this company in Iceland, and they've done it. They established this database back in the early 90s. And so they have been through a lot of these sorts of ethical dilemmas. It's called deCODE, "de" and then capital "CODE". And you can look up a number of articles on the ethical implications, how they've designed their database, what they're utilizing it for. That's a good place to start, I think, because it's a very good example of the good that can come out of this. But there are other issues now. For instance, they in Iceland have been able to identify X number of women that carry a specific mutation that causes breast cancer, BRCA2. So they can theoretically go back and unmask and identify the specific women that carry this mutation.
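The two ideas discussed here, an anonymous sample number kept apart from the donor's identity and a signature marking each record as authentic, can be sketched as follows. This is a toy illustration under stated assumptions, not how deCODE or Amgen actually store data: the two dictionaries stand in for separate databases, `SIGNING_KEY` is a hypothetical secret held by the collecting lab, and the HMAC-SHA256 signature is one possible choice of authenticity mark.

```python
import hashlib
import hmac
import json
import secrets
from typing import Dict

research_db: Dict[str, dict] = {}  # sample ID -> signed genetic record, no names
linking_db: Dict[str, str] = {}    # sample ID -> donor identity, kept separately

SIGNING_KEY = secrets.token_bytes(32)  # hypothetical key held by the collecting lab


def sign(record: dict, key: bytes = SIGNING_KEY) -> str:
    """HMAC-SHA256 over a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def enroll(donor_name: str, profile: dict) -> str:
    """Store the profile under a random ID and record the link separately."""
    sample_id = secrets.token_hex(8)  # random, so it reveals nothing about the donor
    record = {"sample_id": sample_id, "profile": profile}
    research_db[sample_id] = {"record": record, "signature": sign(record)}
    linking_db[sample_id] = donor_name
    return sample_id


def is_authentic(entry: dict, key: bytes = SIGNING_KEY) -> bool:
    """Check that the record has not been altered since it was signed."""
    return hmac.compare_digest(entry["signature"], sign(entry["record"], key))


sid = enroll("Jane Doe", {"BRCA2_variant": True})  # made-up donor and profile
print(is_authentic(research_db[sid]))         # True
print("Jane Doe" in json.dumps(research_db))  # False
```

Anyone holding only `research_db` sees profiles and random IDs. Re-identification requires `linking_db` as well, which is why keeping the two apart matters.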
But now their hands are tied from being able to tell these women, because they initially submitted their DNA with an informed consent that they would not be identified. So that's another dilemma. So there are all kinds of layers of ethical issues that revolve around these DNA databases. They've only just begun. Yeah, we're sort of in our infancy. And, you know, thank goodness we don't have a national database, other than for forensic and criminology purposes. It would be bad enough to have a national non-DNA database. But you add DNA to it, it really gets bad. They've really done a lot of good, though, even in terms of this current virus, being able to sequence the virus. Again, going back to Iceland, the same company offered to isolate the virus from any infected individual. And they're able to trace how mutations occur in the virus and where those viruses are coming from, and they're just getting a lot of information that can inform public health. Well, we're out of time, Sarah. I just want to ask you one more question. For the average person watching this, that person trying to wrap their head around what's going on here and trying to appreciate the technology and its implications, what message would you leave? Would it be a message of, don't cooperate, stop this if you have a chance? Or would it be a message of, let's work together to make a better medical world, a better democracy, if you will? Yeah, I think it has more potential for good than it does for bad. I think we can solve, or at least gain some insight into, a lot of common diseases that affect people worldwide, cancer and Alzheimer's and heart disease and stroke, which would be a huge benefit to mankind. I think that people just need to be aware of the potential pitfalls and do what they can to protect their privacy.
For instance, you know, individuals that are submitting their DNA to 23andMe, pay attention to that research clause. If you don't want your information to be used for research purposes, or to potentially be sold to another company, don't sign that. I think 80% of people have agreed to that. But I bet a large portion of those haven't really given it another thought. Yeah, it's moving. It's changing. It's got implications everywhere, including political. Well, thank you. Thank you so much, Sarah Winiker. We really appreciate you coming on. I hope we can do this again soon. I want to follow your science. Okay, thanks so much, Jay. Aloha, Sarah.