Tonight I'm happy to present Jason Fletcher. Jason is a professor of public affairs with appointments in sociology, agricultural and applied economics, and population health sciences, as well as the director of the Center for Demography of Health and Aging at the University of Wisconsin-Madison. Prior to coming to the UW in 2013, he held appointments at Yale and Columbia University, and he has published over a hundred academic articles. A health economist by training, he has worked to integrate genetics and social science over the past decade, culminating in his book The Genome Factor. Please help me welcome Jason Fletcher.
Thank you, Amanda and Roberta. Those were just warm welcomes, and I love this type of talk. I've given many, and these discussions always bring up really interesting and new questions, so I look forward to that. But before we do that, I have to talk for a little while, so that's what I'll do first. I want to thank Senator Herb Kohl, who gave a gift to the La Follette School a few years ago to do exactly this: outreach to the state, to take the ideas from all the institutions in the state and spread them around, to spread the Wisconsin Idea to all corners of the state. So tonight I want to outline some of the advances coming from the genomic revolution and highlight how these advances have resurfaced critical questions on law, policy, and ethics around privacy, discrimination, and human flourishing. So that's a lot. So I want to take us all the way to the end in some way like that, which is to say, here we are.
Over the last 15 years there have been over 90,000 new findings in the human genome linking specific variants to disease, and the NIH updates this catalog monthly because of the fast pace. This is taken from over 4,000 publications in the last 10 years, and what you're seeing are human chromosomes with little dots next to them, where each dot is a distinct finding relating a human genetic variant to an outcome. An outcome could be cardiovascular disease, Alzheimer's disease, height, and so on. So there's been a lot of rapid progress in genetic discovery, and this is part of the genetics revolution, the genomics revolution, I'm going to describe. And I come to this having spent some time with a colleague, Dalton Conley, who has a PhD in sociology and a PhD in biology after being the dean at NYU. He did that during the evenings. And we've reflected back on the last 10 or 15 years of integration of these genetic findings into the social sciences, the social sciences being economics, political science, sociology, and many others. And this book was published two years ago, which now, unfortunately for us, means it's in the history section of the library. I noticed that it's in this library too, and checked out, so I appreciate that. But it's dated. And what we realize now is that our speculations on the faraway future, what's going to happen in 20 years, have already happened. We were way too modest. We thought we were talking about the future, but we were talking about the present. The other regret of ours is not spending a lot of time on policy.
And that's really what I'm going to talk about today: to frame it around a call to action, a needed call to action, a call to action that was needed 10 years ago, to anticipate the set of findings I'm going to describe and the rapidity of these findings in policy domains, and how these findings are going to necessitate or generate new policies and new thinking about how these genetic findings work in our lives. So let me start off by reminding us about a few genetic facts that you'll want to take with you through the talk. I only have a few slides. There's a couple of facts I want to remind you of. One is the amount of content we're talking about when we talk about the human genome. It's 3 billion little bits of information. I'm going to call them letters throughout; more technically, nucleotides. This is what's undergirding the spiral staircase of the DNA structure, where if you squint you'll see A's and C's and T's and G's. Those are the letters I'm going to describe. But there's 3 billion pairs of them. There's about 1.5 billion miles of DNA in your body. That's 3,000 round trips to the moon. So there's a lot of content. So that's fact 1: lots of information. Fact 2: a lot of this information is exactly the same in all humans, forever and ever. Over 99.5% of all these letters are identical for all humans. So look around. A lot of common humanity in the DNA. And 98% of the letters found in human DNA, the human genome, are identical to chimps. That then makes the point that although we're very similar, almost completely so, where there are differences, some of those differences can really matter. Otherwise we'd be chimps, right? So just as an example, a clarifying example, you see some letters here, C's and A's and T's and G's.
So if you have 3 billion pairs and you zoom in to 20 of them, you read C's and A's and T's and G's in normal cells, and then you come down to this line and you see all the same C's and A's and T's and G's except for one. And that one difference between these two people decides genetically whether you have normal blood cells or sickle cells. So that's an example of the point, which is that some of these letters really matter. That's one example of a letter that really matters to human lives. The final point, and then I'll move on from genetics, and you'll see an asterisk here anticipating maybe some of the Q&A about gene editing. But before we get to gene editing in humans, for the purposes of this conversation, I want us to at least believe for the moment that these letters that I'll be describing are permanent in your lifetime. If you were genotyped now, you would have the same letters you had when you were an embryo. You'll have the same letters your whole life. You'll have the same letters when you die. That's what I mean by permanent. And again, asterisk, we can get to gene editing later. For most of human history, they're permanent. Okay, so genetic facts, that's what I just described. I want to now build up a revolution. Nothing I've said so far is revolutionary. Those are facts that have always been true. So the revolution actually comes from two aspects. One is technological improvements in measuring these facts. And the second one is what is done with these measurements, meaning the amassing of enormous data sets of human DNA and the application of statistical algorithms to find new discoveries. So those are going to be the two features that, in combination, I'm going to describe as a revolution.
And the implication of that, and that's what the call to action is, is that when you have such a dramatic technological improvement and the use of statistical applications on those improvements, there are many new policy questions that should have been decided, or at least discussed, 10 years ago. So we're way behind, and that's the call to action part. But I think it's still reasonable, and will generate a good discussion, I hope, today, to ask what to do with this revolution. So what are the improvements, the technological improvements? That's my first claim. To have this revolution, I need to defend the claim that there have been important technological improvements. Okay, so I'm not there yet. I'm reminding us all that in your daily lives, as you've lived your life, you would have noticed that prices have changed over the last two decades. And in fact, you would not be surprised to know that college tuition fees are among the items we buy that have changed the most over the last 20 years. So go back to 1997, take that as your starting point: for college tuition fees, the price paid has increased by 150% over two decades. That's a lot, right? College tuition is in the news a lot in terms of its price changes over time. Education more generally is an area of our lives where we've experienced price changes, as are child care and medical care. I'm just reminding you of what you already know to be true and have experienced, but you'll see where I'm going in just a minute. So these are prices that have increased. We've also had experiences in the recent past, in the last 20 or 25 years, where some of the things we buy have gotten a lot cheaper for what you get for them. Television is the prime example here: over the last 20 years it has declined in price by something like 90% for what you get.
If you go back to the first big-screen TVs and how much they were, thousands and thousands of dollars, and they weren't that great. Now they're hundreds of dollars and they're much better. That's the idea. Okay, so this is all normal, what we've experienced in our lives. Now let me move to something that's totally unprecedented, which is the price for sequencing human DNA. Over the same period of time, the last 20 years, the cost per human genome sequenced has fallen on this scale. So this is now totally differently scaled, where the sloped line is Moore's Law, which is computational capacity doubling every 18 to 24 months. So how much you pay for a computer and what you get for it. This phenomenon called Moore's Law, from the 60s, has been true for a long time and was itself unprecedented until what I'm about to tell you. So this is really fast. Computers have gotten really cheap per dollar spent. The computational power is quite impressive, doubling every 18 months to two years. Okay, that's Moore's Law. This line is the cost per human genome over the last 20 years, up until almost the present. And this line shows a reduction in price of 100 million times. So what cost, 20 years ago, hundreds of millions of dollars to sequence one person's genome is now, now being here, in the $500 to $1,000 range, where the projections are $50 within the next couple of years. So going from $100 million to $50 over the same period of time as this last slide is what I mean by unprecedented. Okay, so there are going to be some implications of that rapid change over time. And one implication, and I'll read some of this if you can't see the type, is that there have been private companies: 23andMe, AncestryDNA, MyHeritage, and Family Tree DNA are the ones I've listed. Those are the four biggest private companies that are genotyping individuals. You can see some growth patterns on this slide. So in 2015, 23andMe had about 800,000 genotyped samples. Fast forward a couple of years, 2 million.
Last year in August, 5 million. Now, 9 million. AncestryDNA has had an even more rapid increase. They went from 7 million in August of last year and, in three months' time, doubled to 14 million. This one is the upstart. It went from zero a year and a half ago to two and a half million. And this one is floundering, only going from 900,000 to 2 million over the course of a couple of years. So these are four private companies. I list here that they had a sale at Thanksgiving 2017 and over the weekend sold 1.5 million DNA tests. They haven't released the numbers for last Thanksgiving, but it was on that order. And so here's a recent one, as of two months ago: more than 26 million people have now taken an at-home DNA test. Here's the growth. And then I pulled one of the quotes. If the pace continues, these data sets combined will be 100 million people within 24 months. And this is all because it's so cheap. Let's go back to this point about it being cheap. You can't do this if each test costs $10,000. You can if each test costs $50. And there's a projection from an academic paper; here's the quote. There are different assumptions one could make. Even the slowest increase over time says that by 2025 they envision at least 100 million. The upper projection, as of a couple of years ago, is 2 billion human sequences analyzed. So, unprecedented. I'm going to use that word quite a bit, on the technology and on the amassing of, in this case, private data. There's a government version of this. Many governments across the world are enlisting their own research subjects. I've got a few. In Iceland, it's everybody. Everybody's in one data set. There are 300,000 people on that island, and they're all in one data set. And there are large collections in the low to mid-100,000s in many, many countries, and in individual health systems like Vanderbilt and this eMERGE group. So, government efforts. The U.S. version of a government effort that's recent is called All of Us.
Eventually they'd like to have one million individuals contribute their DNA. Madison is one of the many sites. So there's an effort in Madison to be part of this All of Us initiative in precision medicine. You'll notice that one million is about a weekend of AncestryDNA sales. It's going to take the federal government a lot longer than that to collect this information. So that's the main point I want to make about technological improvements. And the main implication is that these reduced costs allow, over a very short period of time, the amassing of just enormous, unprecedented scales of human DNA in single data sets. Now, what would one do with these data sets that are now in the tens of millions of people? And the answer is analyze them. Use statistical methods to comb through the genomes of millions of people and consider predictive analyses. So let me describe a little bit of that next, which is the workhorse here. And I'm going to try to move quickly through the technical details. The workhorse is called a genome-wide association study, which is: step one, huge data set; step two, scan the whole data set point by point, location by location, and ask the question, if you have an A versus a C here, an A versus a T here, a G versus a C here, location by location, are you more likely to be X? Are you more likely to have cardiovascular disease? Are you more likely to have Alzheimer's disease? Are you more likely to be tall versus short? And so on and so on. And you do this hundreds of thousands or millions of times, point by point across each chromosome. And what that leads to is just a huge number of potential findings to comb through. So that's what I'll describe briefly, which is that when you have all these findings, you're going to visualize them, because it's difficult to comb through hundreds of thousands of findings and figure out what the key results are.
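The location-by-location scan described above can be sketched in a few lines of Python. This is only an illustration on simulated data, not the method used in any of the studies mentioned in the talk: the function names, the sample sizes, the single "causal" site, and the simple statistic (sample size times squared correlation) are all assumptions made for the example. Real studies use regression models with covariates and much stricter significance thresholds.

```python
import random


def association_score(genotypes, outcomes):
    """Return n * r^2 for one site: a simple stand-in for a per-site test statistic."""
    n = len(outcomes)
    mean_g = sum(genotypes) / n
    mean_y = sum(outcomes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, outcomes))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_y = sum((y - mean_y) ** 2 for y in outcomes)
    if var_g == 0 or var_y == 0:
        return 0.0  # no variation at this site, so nothing to test
    return n * cov * cov / (var_g * var_y)


def simulate(n_people=2000, n_sites=200, causal_site=57, effect=1.0, seed=0):
    """Simulate letter counts (0, 1, or 2 copies) per site and one outcome per person.

    Only `causal_site` (a made-up position) influences the outcome.
    """
    rng = random.Random(seed)
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_people)]
        for _ in range(n_sites)
    ]
    outcomes = [
        effect * genotypes[causal_site][person] + rng.gauss(0, 1)
        for person in range(n_people)
    ]
    return genotypes, outcomes


def scan(genotypes, outcomes):
    """Step two of the scan: ask the same question at every site, one at a time."""
    return [association_score(site, outcomes) for site in genotypes]


def tallest_building(scores):
    """Index of the site with the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    genotypes, outcomes = simulate()
    scores = scan(genotypes, outcomes)
    print("strongest site:", tallest_building(scores))
```

Plotting `scores` against site position would give the kind of picture the talk turns to next, with one tall bar at the simulated causal site and low bars everywhere else.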
So here's an example, a cartoon of what an output of one of these genome-wide association studies is, which is called a Manhattan plot, and what you're looking for are tall buildings in Manhattan here. And when you see areas way up here, you say, oh, there might be something in the genome that is a true genetic discovery on the outcome of interest. So that's step one: scan letter by letter all the way through all the chromosomes and ask, what about here, what about here, what about here? Is there a difference by genetic variant in the outcome? And then there are hundreds of thousands of findings. You're going to summarize those in some way, which I'll describe a little later, and then use them to predict the outcome out of sample. So you ask, if you have all these A's and C's and T's and G's, are you more likely to have Alzheimer's disease or less likely? And so on. So those are the two steps. Manhattan plot, visualize the findings, and then push them all together in some summary way to get out-of-sample prediction. So that's what I'm going to work through in a couple of steps. Okay, so for example, here's a genome-wide association study. The outcome here is the inability to smell the distinct odor produced in the urine after eating asparagus. In this case, when you run hundreds of thousands of analyses, you find nothing except one place. Okay, so this is the Manhattan plot with one huge skyscraper and then nothing else. Think of this as a relatively simple, straightforward genetic architecture of that outcome, the asparagus outcome. So that hopefully gives you a sense of what's behind this. This is about as simple as it's going to get: one place, and nowhere else, is where it matters. And this is going to be very unusual. More usually you're going to find, so these are different allergy outcomes: peanut allergy, milk allergy, egg allergy. More likely, when you do these analyses, you're going to find a lot of buildings.
The genetic architecture is quite complex in most of the outcomes that we think are important. And so there's not one site for peanut allergy. There are quite a few relatively large buildings in this Manhattan plot, and they're spread all over the genome. So this is the more typical genome-wide association output, which is lots of buildings. And in fact, this is the workhorse from my first slide. There have been 4,000 of these studies that have together identified 90,000 buildings. And that quantity of findings means there's a new study every day, because there are 4,000 and there have been 10 years, so it's like one a day. So here's the one from two days ago. This one was all the way back in January, so it's old news. This one is as of two days ago. The outcome here is a measure of mood instability, but it's the same thing. You get an outcome, you scan the whole genome, and you see in this case a lot of tall buildings, mostly throughout the whole genome. There are not huge areas in the genome where there are no tall buildings. There's still genetically complicated architecture here. And so on and so forth. You've seen a couple of these Manhattan plots. In general, what you'll see is a lot of buildings, a lot of complicated genetics that goes into most outcomes that we care about. So that's not so useful, meaning all these buildings. There's a lot of complication. So what's the actual next step? Let me refer you back to pushing all the results together and give that a name: what happens is you summarize all these results and you push them all together into something like a credit score. These are called polygenic scores, and they look like this in the population. Some people are at very high risk for the outcome. Some people are at very low risk, and most people are in the middle. So almost all of them look like a bell curve.
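The "credit score" idea above can be made concrete with a small Python sketch. A polygenic score is commonly computed as a weighted sum: for each site, the number of copies of a given letter a person carries times a weight estimated for that site, added up across sites. Everything specific here is an assumption for illustration: the weights are random numbers rather than estimates from any real study, the population is simulated, and the function names are hypothetical.

```python
import random
import statistics


def polygenic_score(genotype, weights):
    """Weighted sum across sites: per-site weight times letter count (0, 1, or 2)."""
    return sum(w * g for w, g in zip(weights, genotype))


def simulate_population(n_people=2000, n_sites=200, seed=1):
    """Made-up per-site weights and made-up genotypes for a small population."""
    rng = random.Random(seed)
    weights = [rng.gauss(0, 1) for _ in range(n_sites)]
    people = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_sites)]
        for _ in range(n_people)
    ]
    return weights, people


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return [(s - mean) / sd for s in scores]


def share_within(z_scores, limit=1.0):
    """Fraction of people whose standardized score lies within +/- limit."""
    return sum(1 for z in z_scores if abs(z) <= limit) / len(z_scores)


if __name__ == "__main__":
    weights, people = simulate_population()
    z = standardize([polygenic_score(p, weights) for p in people])
    print("share within one standard deviation:", round(share_within(z), 2))
    print("share beyond two standard deviations:", round(1 - share_within(z, 2.0), 2))
```

Because each score adds up many small contributions, the simulated scores pile up in the middle with thin tails on either side, which is the bell-curve shape described in the talk.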
And now we're talking about something that can be and is used in the clinic. Now we're talking about something where you walk in and the clinician pulls up their Epic screen and says, okay, I know your height and your weight and your blood pressure and so on, and I also see over here that you've been genotyped and you have a high score for cardiovascular risk. So now this can be and is used in the clinic. And this could be like a red light, green light thing, or it could be a number, just like your cholesterol has a number, and above it you're at high risk and below it at low risk. Same thing here. So now we're in the clinic with these polygenic scores. So far this is just regular genetics. I mean, it's fast moving, it's complicated, but nothing so far probably gives you a sense that there's a social genomics revolution. So I'm about to do that. But the two steps again in these kinds of analyses are: scan the whole genome, look for tall buildings, and then take all that information, smoosh it all together into a credit score, and then you can use that credit score in the clinic. Okay, so those are the two steps. I'll entertain a question right now. I know this might get me in trouble, but if there's a clarifying question, especially if I've said something that was not so clear, because it's going to be hard to keep following the rest of the talk. Yes, please.
You said most of the people are in the private ones. Are they doing this, or is it the government that's doing this?
The answer is both. It's not either-or. It's both-and. And they're combining forces. There'll be government data sets combined with 23andMe.
Those four agencies or companies that you showed before that are doing that, are they virtually doing the exact same thing? It doesn't matter which one you go with?
Let me get back to that. So remind me in Q&A to give you a more precise answer there.
But the workhorse method, of "I'm going to try to find a genetic signal for something," is the same across all the companies. Yeah.
So how does the data on the health of an individual get linked to the DNA test?
Great. So that's a great question. It depends. For 23andMe, they give you surveys. And if you decide to answer them, then that's their data. If you're in the clinic, they could genotype you and also have your blood pressure, right? So "it depends" is the answer.
So does the data set know which is self-reported and which is clinically reported?
The investigator would, and would want to be careful in those cases, definitely. But it is what it is, in the sense that if I go back to this, this is an example. This is 400,000 people in what's called the UK Biobank, which is a large survey of individuals. So this is all survey information on indicators of mood instability. And with 365,000 people, I'm going to go letter by letter and look for places where, among these 365,000 people, if you have an A, you're more likely to indicate mood instability than if you have a T, or a G versus a C, and so on and so forth. But this is an example of a survey. This is neither a private company like 23andMe nor a health group. This is a government-funded survey in the UK. Okay. I'm feeling relatively comfortable here now. So let's move on to the fun part: social genomics. Okay. Here's a genome-wide association study for educational attainment. Point two. This was two and a half years ago. So, same thing. And this one, published in the journal Nature, found 74 locations in the human genome where if you have an A versus a T, or a C versus a G, you're more likely to go to college. And here's what the Manhattan plot looks like. Lots of tall buildings here. This doesn't look like the asparagus example, where there's almost nothing in the genome. There are a lot of tall buildings here. And this is of 300,000 people. And like I said, this is two and a half years ago. So that's step one, the GWAS.
Six months ago, Nature Genetics, another publication, got the new version of the same thing, educational attainment. But now it's not 300,000 people. It's 1.1 million people. And now it's not 74 locations. It's 1,200 locations in the genome. So the buildings have gotten bigger. Okay. So that's step one, the genetic architecture of whether you go to college. Step two, let's get a credit score. Now, you're not going to the clinic for this one. This one you can do in the privacy of your own home. Because when you submit your DNA to 23andMe, they'll send it back to you. They'll send your A's and C's and T's and G's back to you. And you can upload that file to this place, among many others, and they'll send you your credit score for years of schooling. So there's a lot that just happened here in these two slides, right? We've moved from maybe our comfort area of health indicators to social indicators, like years of schooling. And we're no longer in the clinic. There's no touch from a clinician in any of this. You can send your DNA to 23andMe, a private company, and you can upload it to this nonprofit site. This is from NYU and Columbia. And they'll send you back your polygenic score.
Why is the... And without being too technical, I'm not very specific. But why is the 14 years not in the middle of the... Which makes me a developer?
That's you. You're not as polygenically high as the other ones. So some people have high risk of going to college, and some people have low risk of going to college. And that's the way it is. So let me do some variation on this theme. Okay, so hopefully you're thinking, oh, maybe we should have some policies about the use of these things. We'll get there. And so this is not an education score, but this is just a point to say, and this date shows you that this is over a year and a half ago, that private companies now, with no clinical touch, for $199, will send you through the mail your diagnosis for breast cancer.
And here's a private company that I like to put up, because if you send them your DNA, it's going to unlock your insights on nutrition, which sounds fine, health, fitness, ancestry, even the wine you might prefer. Spotify will create a playlist based on your DNA. So there's one set of policies about, do we want any guardrails on private companies in the use of DNA? Okay, so let me summarize where I'm going to be going for the next few minutes. The themes here are: one, the expansion of large-scale genetic examinations into both health and anything else, anything measurable. To your question about where the data comes from, anything on a survey can be analyzed exactly the same as years of schooling or cardiovascular risk, and there can be a polygenic score for it. Theme two, decentralization of knowledge. There's no guidance necessary here. There's no gatekeeper, which we would traditionally think of maybe as the medical establishment, or your clinician would be the gatekeeper for getting your biological sample, analyzing it, and telling you what it means. None of that is necessary anymore, and it has not been necessary for years. This is not in the future that I'm describing. This is in the past. And then the policy questions are under the umbrella of the allowance of new directions in discrimination, meaning what used to be group-based discrimination: discrimination based on age, based on sex, based on race, ethnicity, and so on. Now it does not need to be group-based anymore. It can be personalized to you. What's your risk of Alzheimer's disease? So let me give you a couple of examples. There are two areas in your life that cannot discriminate against you: your health insurer and your employer. This is from the GINA Act, from George W. Bush in 2008, way back when it was really expensive to do this, if you remember the graph, the initial figure. But, nonetheless, this has survived for 10 years.
It was almost gutted by one of the repeal-and-replace efforts on Obamacare about a year and a half ago. By one vote, we had this almost destroyed. But there are two places, and only two places, in your life where there's a legal remedy for genetic discrimination, meaning that if an employer can be found to discriminate against you based on your genetics, you can take legal action, and the same for your health insurer. Now we should start filling in what I did not say, which is, what about your long-term care insurance? What about your educational system? What about your criminal justice system? What about anything? Pick a name. It is not covered. This is the only thing that's covered. So, policy question: what should be covered? Maybe a little bit more fun is, what are the implications, and we take this up in the book a little bit, for dating and marriage when genetics is so cheap? So we speculated, this was in 2017, that there could be web-based dating where you fill out surveys and you provide your eye color and whatever, and here's your profile. Now it's so simple to add polygenic scores for education, or Alzheimer's risk for that matter, to your dating profile. We got the same kind of chuckles two years ago when we described that. Then you look and say, well, that was two years ago. There are dating services that do exactly this, and here are two of them. There's no legal remedy for being discriminated against here if you have high Alzheimer's risk as a 20-year-old. That was the fun part of the show. Now something a little bit more serious, which is the reason I reminded you of the fact a long time ago, when we first started, that you have the same letters now that you did when you were a fetus. Let's think about that a little bit more.
What else might we actually want to predict about people, besides their schooling and so on, or besides their health and so on, and whom might people want to target for these predictions? In particular, what do you want to predict about your fetus or baby or young child? The letters are the same. The predictions are the same. You would have had the same polygenic score as a fetus as you do now. Some are easy; some are really easy to detect. Others are pretty easy to predict, like adult height and weight. But what about other elements of these embryos' future lives that we might want to predict? And there's a question of, well, would parents ever want to know that? Here's the 1939 World's Fair patent from Dr. Seuss, the Dr. Seuss, of his technology to form predictions of the facial features of your children, which is: you take a picture of mom, you take a picture of dad, and you overlay them. So this didn't work. This was a failed patent, or a failed invention. Too many girls with mustaches. That was then, 1939. But there was maybe a demand for this. And then here's an example from today, and by today I mean two years ago. So here are real faces versus faces predicted only based on A's and C's and T's and G's, using technology from two years ago. And remember, this is a face that you could predict for a fetus, because the letters are the same as they are in the adult now. So here's the policy topic, the question of what forms of discrimination we would like to rule in versus rule out for future generations. It used to be that maybe there was the desire to make these assessments of fetuses or young children, but there wasn't the technology. Now that's no longer true. Now it's cheap and early. So here's what you can do in an IVF clinic, which is biopsy a single cell on day four.
And that cell, because like all the cells in that future body it has the same genetic content, is where you can get a years-of-schooling polygenic score, on day four. And so on. I mean, you can get everything on day four that you can get as a 30-year-old or as a 50-year-old.
The one earlier said 14 years of college. The person doesn't have to have had 14 years of college. You may have never gone to college, but your capacity is like a 14-year college person. Is that what you're saying?
It's a prediction, based on your A's and C's and T's and G's, of how far you're going to go in school.
You could be not going to school and still be an engineer yourself.
That's right. Fair enough.
You're predicting it's like 14 years for an individual, and for others it would be 20 years, and other people would be 12.
Totally right. But the point here is how early one can get this information. Again, the technology years ago would have been to wait until the child is born. Now it's not. Here's an ad I found a couple of weeks ago showing where we are. Of course, everyone probably already knows that people do this regularly for genetic conditions. That's typically done. They have clinics in Los Angeles and New York and Texas, not yet Madison or Wisconsin more generally. But their next service is eye color. And this is describing, for those who don't know, that there is essentially zero regulation on what can or cannot be ascertained from a fetus. Like I said before, technological constraints used to be there, which is that maybe you really wanted to know what your baby's eye color would be, but there was no way to find out from a four-day-old embryo. Now that's no longer true. Now the only constraint is whether potential parents ask and pay for this information. And here's an example of the first set. This particular company does sex. They've done that for a while. And now they've added, they're announcing, eye color. So it's a big hot thing. Here's our new thing, eye color.
Was there a question? Yeah. So as we advance this technology, is it possible that you could also change the strength of their immune system or their IQ or athleticism? Yeah, good question. Ask me that again in Q&A, because there are two parts to your question. One is about editing this information, which I have not talked about. This is only taking the information as given and predicting based on that information. So let me come back in Q&A to the editing part, because that's a little bit of a different story. This is just what can be done, as of years ago, on human embryos: taking a cell, extracting the DNA, and making a prediction from the A's and C's and T's and G's on many things, but here it's eye color. And the point here is that in IVF clinics more generally the question would be which of these embryos you want. Normally, embryo selection would be about genetic risk. It would be screening out embryos that had a high likelihood of dying, or of failure to implant, or of having a genetic condition that might be deadly. So that's old news. The new version of this is, well, what else would we want to predict in embryo selection? I do. Yep. So there could be embryos where you predict a certain color and it turns out not to be the case. Totally right. Yes. How good the predictions are will depend on the parents, but for eye color it's very accurate; for IQ, less so, right? In the back? No. So the question was about CRISPR. I'll get to CRISPR a little bit later as a gene editing tool. I think it's useful to think about two different parts of the question here. One is, taking as given that there's DNA in people, we can extract that and use it for prediction, and that's all I'm going to be talking about for the first 45 minutes or so. CRISPR and other gene editing technology is where you take the DNA and you change it.
So I'm not talking about the changing part. I'm just taking it as given, because it's actually really easy to just take it as given and predict, and, just like you said, for some conditions like sickle cell it's very highly accurate. Blue eyes, very highly accurate. Sex, very highly accurate. As you move down toward more complicated outcomes, like years of schooling, that's less accurate. The question is whether we want that information to be given to parents to use in selection, not whether it's 100% accurate. That is an important question, but the policy point I want to make is that right now there is zero policy around this. Do we want that to be the answer, zero policy, and just let the market decide what parents want to know? That's where we are right now. And so, the future. This is where, like I described, in the book we thought, oh, this is going to take years and years to develop. I think it's a little clearer to us now that the future is not one where this is tailing off and fewer and fewer people are getting their DNA tested. Quite the opposite: the data sets are amassing much quicker than they were a few years ago, and because of that there are new outcomes to predict. 23andMe gives pretty regular updates about the outcomes that they're predicting, giving you new information, and that's only going to go further in that direction. We haven't experienced the following thing yet, but it's another important part of policy: assume for a minute that 23andMe's data gets hacked and publicly released. What safeguards would we like in place for that type of information? Right now the answer is that there are no safeguards except for employment and health insurance, through the GINA law; those are the two places that cannot discriminate against you.
So let me wrap up by reminding us of a couple of recent events that are related to privacy and discrimination. First, a few years ago 40 million customers got their credit reports hacked, right? And more recently Marriott had the information of 500 million guests hacked. The thing here that I want to scare you with is that the information that was hacked in these previous cases was names, Social Security numbers, credit card numbers, addresses. If you go down that list, you'll recognize that those are things that are highly private, and you don't want them released, but they are changeable and they're not very predictive of your Alzheimer's risk. A hack of 23andMe is the opposite: the information is not changeable and it is predictive of your Alzheimer's risk. Not yours as a group, your Alzheimer's risk. The other part about privacy, and this is from a paper in Science a few months ago, is who is caught up in this privacy question. You might think it's only people who've given their data to 23andMe. No, no, no, it is not. This is related to a question that might come up in Q&A about criminal justice, about the finding of serial killers from cold cases. It's the same kind of genre here, which is that you don't need to be in the data set yourself if your third cousin is in the data set. That's how the Golden State Killer was caught. It wasn't the killer's DNA in the data; it was his third cousin's DNA that was in the data, and that was able to cast the net: well, we know it was this person's third cousin, so find all the third cousins; it's got to be one of them, right? So it's a similar thing here. This is in the journal Science a few months ago: with a data set of about 1.3 million, not a very large data set, they asked how likely it is that we could find somebody's cousin in this data set, and then they projected this out going forward. And that's the punchline. The algorithms that they produce in this paper make the following suggestion: they
could implicate, meaning find, nearly any US individual of European descent in the near future. These data sets are of a size where everybody in the US is implicated in the data set, not just the people who gave their own saliva sample. So it's not privacy as an individual; it's everyone's privacy that's at risk. I'm going to skip through a couple of these employment things so we have more time. This is just another example that was recent when I put it up here; now it's a year old. A DNA test for 190 diseases for newborns. And the point here is that it's essentially unregulated and it's outside the clinic. This is something where you deal directly with the company. It costs $649 and it detects 190 diseases and health traits without any clinical interaction. So let me broaden it back up to policy. Privacy: who owns your contributed genetic data? When you send it to 23andMe, you are now a part of a large research study, and there's no getting out of it, is the short answer. Is that where we want it to be? And here is the more general privacy question about this third cousin issue: if your sister contributes to 23andMe, and therefore you're in 23andMe, how do we weigh whose privacy is at stake when we're talking about biological family members? When one person can make the decision for their whole family of whether they're in the data set or not, that is an interesting, kind of novel question about privacy in this context. Access and representation: are these tests going to be covered by insurance? Maybe, maybe not. This next one is going to take us in a different direction momentarily, and I invite follow-up in Q&A: representation. What I mean by that is two things. In the whole world, something like 15% of the population is of European ancestry, give or take, if you allow me some leeway on that number. If you asked what proportion of the subjects are of European ancestry in all these GWAS, all the GWAS findings
that we have, want to guess? Almost 90%. And the answer for African ancestry is like 2%. So right now none of the findings are for anybody except individuals of European ancestry. So think about the clinic again, a clinic interaction where you are an individual who is not of European ancestry and might be faced with a clinician making treatment choices based on someone who is not you. So that's representation. The other point: of all those European-ancestry individuals in the data, what are the top three countries they're from, which represent like 80% of all the data themselves? I bet you are going to get two of them. I want to see if you can get all three. Not Germany. Germany is an interesting case. Germany does not do this. You will remember why, so I won't have to go further there. Germany is actually on the very low end of anything about DNA and ancestry. So let's start over. Three countries. US and UK are the two correct ones. What's the third? Australia, no. No, not France. Not Canada. We're not getting closer, so I'm going to tell you: Iceland, because all of them are in the data. Everybody in Iceland is in one data set, so that makes it a powerful data set. So the US, the UK, and Iceland. How representative of the environments of the world are those three places? Not very. Discrimination, getting back to this policy theme: what interactions between us and people who are not us do we want protected by law? Right now what we've decided is that there are two entities, health insurers and employers, that cannot discriminate against us or they face legal consequences. Everyone else on that list can. You can add colleges, you can add K through 12, preschools, etc. Everything else is on the list of: do we want these institutions to be able to discriminate for or against you? Should we use genetics to make clinical decisions? Maybe we'll get back to that in Q&A. I put that in quotes because this is what precision medicine is: you take genetic information and you make
treatment decisions partly based on it. And how do we feel about using that same approach elsewhere? My typical experience with crowds such as yourselves suggests a lot of enthusiasm for precision medicine. You've had your DNA genotyped, and your clinician says, you know, you have this A right here in your genome, and people who have that seem to really respond well to this treatment. Most people say, yeah, that sounds great. What about educational interventions? It would be the same thing: you have these A's and T's and C's and G's, and people who have those seem to really respond to this type of educational intervention. Now we're talking about young children instead of older adults, but nonetheless it's the same kind of issue. Do we want protections or no protections? And then finally, on this side: in what ways would we like interactions to be protected or not protected between potential parents and clinicians, in thinking about embryos and fetuses, about sharing genetic information? Do we want any guardrails on that? Right now the answer is no, zero guardrails on what types of information can be elicited. Do we want that to be non-zero? So that's another major policy question. I'll just flag for the audience that UW-Madison is taking this seriously. We're currently hiring three more people, in addition to the four on this list, who are really interested in this area of social genomics. So stay tuned. It's an investment from the university in really building up this area of interest. I'll very thankfully acknowledge all the funders I've had at the UW and around the US, and now officially open it up for the main Q&A. Thank you. Who determines the definition, in your field, of a valid data set? I mean, and this is an analogy only, is there a licensing authority? Is there something like in the hospital business, where the commission knows, some entity in your particular field of work that
determines that? You're just shaking your head. No, no, no, I'm anticipating the end of the question, and the answer is going to be no. Now, there's a scientific standard. I think you're envisioning a case, maybe, where these private companies have proprietary ways of sequencing human DNA that only they know how to do. There's a standard protocol with low error, so it's not the measurement anymore. The measurement of human DNA sequences is done very cheaply and very precisely, so that's not what we should be worried about so much. There's a very low error rate compared with any other kind of measurement we would take of somebody. You keep referring to data sets, which is great, but the way you measure sometimes determines the result, and that's what I'm getting at. That's a fair question, and there's not one answer to it. One answer would be the UK Biobank, which is a survey of 500,000 Brits. They filled out thousands of self-reported bits of information about their income and their years of schooling, and they also contributed a saliva sample, and by doing so they're in one data set of survey outcomes with genetic information. That's one example. Another example would be 23andMe, which is not a representative sample. It's a collection of people who've paid to be part of research; they contributed their saliva, and many of them have also answered survey questions. Third, there are medically based consortia where they would attach genetic information to your electronic health records, so that would be another data set. So there's not one answer, there are many answers to your question, but it's a great question. I wanted to make one point and also share a real-life Wisconsin story. The point was that 23andMe, if I remember correctly, is now owned by the ex-wife of the Amazon executive. I'm sorry, I don't remember her name. So this is already really market-based, it feels like. A year ago, on my
health insurance, there was a question about going in to get extra screening, basically an incentive plan. If you come in and you do this through UMR, it was a lot of money; you save, I think, around $800 a year. But then as I went through the questions, so you go and you get the blood test, you could do these things, there was a little box that said it gives permission to share your genetic information with the company, I think it was for two years, and it seemed really highly suspicious. So I called the company and I said, well, I don't know if I consent to this, and they said, oh, well, you already checked the box, you can't withdraw your consent. And I hadn't filled it out all the way. But basically it seemed to me they're bribing people to give away permission to their genetic information. Maybe that was an over... I don't know. I would be curious if you had any thoughts on that. Yeah, no, that's exactly what it is. I mean, it's worse than that: when you do 23andMe, you pay for the privilege of contributing your DNA to a research study. You're right that in the US it is, I want to say, close to impossible to withdraw your genetic data from a company like 23andMe. That's not true in the EU. They have different laws; they have a law that if you ask to get out of a data set, the company must remove you or face legal consequences. So it's not that it couldn't be different. It is different in the EU. It's not different here; it is very difficult to get your DNA out of these data sets once it's in. And you're right, 23andMe has been a private company, and I don't see that changing, but they're a mix, because they get NIH money, National Institutes of Health money, to do some of these analyses. Their latest one is for diabetes, and because their data set is 10 million people, they have a lot more information than a single institution like UW Health, which does not have 10 million people who have been genotyped. So 23andMe is a private company, will be a private company,
but they're also dipping into federal health dollars. You have to get the mic first. I just have a question getting back to the third cousin situation. Is there a difference if you are half siblings, from different parents? Can you clarify what you mean by difference? Well, I mean, because the genetic material comes from both parents. So if it's the same parents for a couple of kids, and then one parent stays the same and the other parent changes, can we somehow feel a little more comfortable if we're from a broken home, perhaps? No, the answer is no, because you just need one person in your family who is biologically close to you for you to be in the data. We're comfortable with our sixth cousin being in the data because that person has many, many people who are also sixth cousins, but you as a person have fewer sisters or half sisters. So once you're determined to be at that level of closeness, it really narrows the range of who the person not in the data could possibly be. But again, this is not to say that right now every single person in the country of European ancestry is identifiable because of the data sets. The Science paper said it was only 60% of the people in the country, but that's with a small data set. As the data set gets larger, it's just going to catch more of us in there, as our biological relatives and their biological relatives contribute their information. I asked before about the four different companies and how they may do things differently, and I also wanted to ask: why do you think most people are doing this? Why would they be sending their information to these companies? Are they looking to find out their history, where their ancestors came from and that sort of thing, more than their DNA? I can't answer for everybody. We kind of skipped over the first question I usually get asked, which is whether my information is in 23andMe,
because the answer is yes, I did it, but I did it a long time ago, like three years ago. As for why, the answer is going to differ for a lot of people, and this is where I should make a couple of contrasts between the different companies, because people are interested. Even after they've gone through the hour of scared straight, there's still a lot of interest, and not for a bad reason. Let me just distinguish 23andMe versus AncestryDNA. Those are the two big ones; there are some minor players, but you're probably not going to use those. AncestryDNA is not going to ask you a lot of survey questions. They're not in the business of trait predictions or health predictions. They are in the business of locating your ancestors: they take your DNA, they combine it with their catalog of genealogical information, and they map where your family has been and in what year. So AncestryDNA is really about ancestry and places: where was your lineage? 23andMe is a little bit of ancestry, and how they are distinguished from AncestryDNA is that they'll also give you predictions of health information and traits. By traits I mean the asparagus thing, which they will tell you, but also health, like Parkinson's risk. And I'll say one more thing. 23andMe is moving very rapidly in the direction of asking you over and over, are you sure you want this information, because they're moving into Parkinson's and, more recently, breast cancer. They don't ask you so much about asparagus; you're just kind of opted into those silly ones. But for disease conditions like Parkinson's, they ask you over and over, are you sure you want this information? So they're sensitive to the fact that people might not want to know everything that they could tell you. So I think that's basically your question. Thank you. Recently we've had a controversy in Wisconsin about rape tests, the backlog of five years and whatever, and how
it's being resolved. Is technology doing the same thing to that issue? In other words, cheaper, faster, the private sector maybe doing it? I can't speak to that. You're asking a little bit more of a specific question, I think, about private companies getting into the game of rape kits and assays themselves, and I can't speak to that. Where I thought you were going was the extent to which cold cases are going to be solved, and old rape kits in particular will be useful. For a rape or murder that's 10 years old, if the biological sample was kept, the fact that the data sets have grown so much over time makes a new analysis really productive. I used to say dozens, but I think it's more than a hundred cold cases of rape and murder that have led to new convictions just based on this: you had a biological sample, and now there's a third cousin in the big data set of millions of people, and that locates the person you want to find. And the fact that these data sets are now of many millions of people means there are a lot of third cousins, so it's less and less likely that you won't find someone related to this old biological sample from a rape or a murder and so on. I was just taking a government-funded crime lab with a five-year backlog and asking whether what the private sector is doing, the knowledge base and techniques and computer power, is making this quicker and quicker for law enforcement, so they could do it at their local police station. Yeah, they could send these to 23andMe, right? There's kind of a serious point there, that, yeah, it's really cheap to do this. The law enforcement part of this is not trying to do a polygenic score or a test algorithm and so on. They just want to locate someone who has similar DNA in the data set, so it's really easy to do.
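The claim that bigger data sets make it "less and less likely" to miss a relative can be put as a small piece of arithmetic. This is a back-of-the-envelope sketch, not the method of the Science paper mentioned in the talk: it assumes each relative is in the data set independently with the same probability, and the count of 200 third cousins and the coverage shares are purely hypothetical inputs chosen for illustration.

```python
# Hypothetical sketch: chance that at least one relative is already in a data set.
# Assumes each relative is included independently with the same probability,
# which is a simplification and not the model used in the Science paper.

def chance_of_a_match(num_relatives, share_in_data):
    """Return P(at least one of num_relatives is in the data set)."""
    if not 0.0 <= share_in_data <= 1.0:
        raise ValueError("share_in_data must be between 0 and 1")
    if num_relatives < 0:
        raise ValueError("num_relatives must not be negative")
    # P(no relative is in the data) = (1 - share) ** n, so take the complement.
    return 1.0 - (1.0 - share_in_data) ** num_relatives


# Made-up inputs: 200 third cousins, and a data set covering a growing
# share of the relevant population.
hypothetical_third_cousins = 200
for share in (0.001, 0.005, 0.01, 0.02):
    p = chance_of_a_match(hypothetical_third_cousins, share)
    print(f"share in data {share:.3f} -> chance of a match {p:.2f}")
```

Under these assumptions the chance of a match rises quickly with coverage, which is the qualitative point in the talk: a person does not need to contribute a sample to be findable, because only one of many relatives has to be in the data.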
It's just finding. I don't... Oh, actually, I do know one more thing. Of the four companies I mentioned at the beginning, one of them, the smallest one, FamilyTreeDNA, has been in the news because they have said, we will proactively cooperate with the FBI and crime labs to take our private samples and see what kind of matches we can find. The other companies, I don't think, have taken the same stance. You need a mic. You need a mic. Wait, Tom. So, my question is, is there any proposed legislation or regulation? And knowing how the sausage gets made, it's so difficult to get things through. Who are the major organizations that are lobbying for or against regulation? The against is the thing, right? So, I mean, to my untrained eye here, you take GINA, the 2008 law under which health insurers and employers cannot discriminate or they face legal consequences, and you add other things. But what's the constituency? It's not the private companies; this is where they make their money. So it's, you know, us people. But it's occurred to me, over the different discussions of this type that I've had, that people are not of the same mind. People will have different ideas about, on the IVF side, what would be reasonable, what you would like to know. On the criminal justice side, there's a weighing of potentially solving fewer murders if you really put constraints on the use of DNA. And in schools and so on, you know, if I told you that dyslexia, for example, might be highly genetically predicted, and that it might be useful to have those predictions in mind as a child goes to preschool or kindergarten, some people would be swayed: maybe we should use that information in educational settings, about autism, ADHD, things that we might diagnose, quote-unquote, too late through our current
diagnosis strategies of asking parents and children and teachers. So those are some summaries of discussions I've had, which suggest to me that we're not of one mind about what we would actually want to do. What I'm describing is some possible places where we might want to think more. So, I was going to ask, do you think that any of the companies that you've mentioned would actually be trustworthy with the data, given that there are no regulations? And secondly, I know you're here to teach more people about these social implications; do you think that in the near future there would be enough people to actually make a difference and try to propose a law or regulation? Great questions. So the first one is about private companies, what they're going to do with the data, and trust. You know, they have some incentives not to do terrible things with the data of you as customers. So that might put some guardrails here, but they would be unofficial, versus GINA, which officially says health insurers and employers cannot do the following things. So I think these are good questions that I won't have the answer to, and people in the audience would have different answers themselves about the level of trust we would have in these private companies, or the government for that matter. And the second part, about the social implications and getting people together for action, is related to the point that was raised right before you: what are the next steps, and who are the people who are thinking about it? I hope I was able to get across a main point, which is how quickly these things are moving, and that in itself means that we are behind. It could be that our answer is that we like the way it is and we're okay with the direction it's going, but we haven't officially given that answer. There's not been a vote that says we have this answer. And then there's a whole other lane, and maybe I'll bring this up before someone else does, which is that gene editing is also
really in the news, and it's gotten a lot more policy attention. It's interesting: the policies have shifted rapidly. A few years ago the answer was never on human embryos, and never will the federal government give any money for gene editing of human embryos. That was like a couple of years ago. Now it's yes on human embryos, and yes, our federal government will fund this if the embryos are not viable. So that's something that's changed over the last year. And I'll point out this other issue, that US law is not world law. It does not govern every country, and some other countries are very proactive at both gene editing and the polygenic scoring that I've been describing, and at amassing even larger data sets, especially enrolling individuals with very high IQ, in the case of China, to really push hard on these issues, and the US government does not have oversight there. I know that with 23andMe, if you have your genome done by them, they will give you a list of people whose genomes closely match yours and what percent of the genome matches. Does that mean those people are just in their data bank, or do these private companies share data with each other or with the government, or vice versa? So here are the answers. When 23andMe gives you your relatives, those are within their data. And secondly, they do not share between the different companies or with the government. In fact, it's the opposite. They're in a race right now over who's going to be the one data set that wins over all other data sets, and those four companies are even trying to provide free testing to get there, because, like Twitter or Facebook, there's only one of them, and being the one big data set is all that matters. So these four companies are really competing against each other and will not be sharing information unless some join forces against another, right? That's the only way they would do that, because the whole asset of these companies is their data. It's not
the technology. Everyone has the same technology. Everyone has the same statistical tools. It's only the amount of data they have, the number of people that they've genotyped. Actually, we do have time for one more, if there's another question. Anybody else? Okay, kind of going back to the beginning of your presentation a little bit, where you were talking about the genome, and how things are being added every day and it's going really fast. So here's my question. I got my 23andMe done during Thanksgiving of 2017. Because of how fast they're moving, they're adding more and more, and I'm assuming that means they're finding new genes and new SNPs, correct? So if I got 23andMe done now, in 2019, would my raw data be different from my raw data from 2017? Basically, no. Raw data never changes. It would be the same if you had had this test when you were an embryo. If they find a new SNP, the newness here will come in a couple of different varieties. New relatives, not because they were born but because they were included in the data set, right? That's where you get a new relative on 23andMe. Likewise, you can get a new genetic finding because they've gotten a data set that's big enough to find something new that they couldn't have found before. So if I'm running my raw data through something like Strategene or Codegen or whatever, as they evolve I can go back and get a new health report, and something new might be on it that wasn't there a year ago? They're adding traits. Diabetes is relatively new, something that was not there before and now is. So they're adding outcomes, or traits, or phenotypes. The precision of the findings is increasing because they are getting more people and they're doing more surveying, and you're getting new relatives because more people are in there. But your A's and C's and T's and G's, to answer your question, are the same. Okay, great. Wonderful. Jason, thank you so much for such a lively, wonderful conversation tonight. Thank you.