All right, I have 10 o'clock sharp, so let's get started. Welcome, everyone, to the Science Circle's continuing series of panel discussions on science topics. I would like to remind everyone that the Science Circle is a grant-funded nonprofit, and these presentations are videotaped and uploaded to YouTube, so I will ask you to please be on your best behavior. Today we're going to be looking at recent advances in nucleic acid sciences. That's RNA and DNA, which in the last 20 years or so have seen really jaw-dropping displays of the power that we hold in our hands with this technology. To dive into this with us today, we have Marianne Clark, a.k.a. Max Chattanoir, who teaches biology at Texas Wesleyan University, and Steven Gager, who is currently a scientist at Corteva Agriscience and does education and research in molecular biology.
If you will direct your attention up front to the slide here, I have a very simplistic cartoon of DNA, and I'm going to be using my little pointer. If you look by the word guanine there, you can see the little red dot. That's my laser pointer, so hopefully you all can see that. What you have with DNA is a double helix. The frame of the helix is this phosphate backbone in yellow, two strands of it. The steps of the helix are the nucleotide bases adenine, thymine, cytosine and guanine. These steps are kind of like Tetris pieces, which I think is a good analogy for how they fit together: thymine always pairs with adenine, and likewise cytosine with guanine. And I should say, RNA and DNA are virtually the same molecule. The only difference really is whether there's an oxygen or not. RNA is ribonucleic acid and DNA is deoxyribonucleic acid, right?
So it's lost an oxygen, but otherwise they're pretty much identical. DNA is thermolabile, which means that it melts. Part of this backbone contains sugars, and sugars melt. So you can denature this molecule into single strands. And this is how DNA is sequenced: you denature it into single strands and then, typically using gel electrophoresis, for example, you chew it up and run it through a gel, which creates bands. By reading the bands, you can determine the sequence of the nucleotides. This is a photograph of what a bank of DNA sequencers looks like, so it's really scaled up to an industrial scale.
Here is a little cartoon of messenger RNA. DNA is in the nucleus of the cell, this purple region here. It gets read in a process called transcription, in which a segment of the double helix is unzipped, so to speak, and read by specialized proteins, which then generate an RNA molecule that is a mirror image of the portion of the DNA that's been read, or not even a mirror image, a copy actually. What this mRNA is doing is that its nucleotide sequence codes for a protein. But the nucleus does not have any machinery to make a protein. So the mRNA migrates out of the nucleus into the cytoplasm, where it attaches right here to this molecule, which reads it almost like a cassette machine reading a song off of a cassette tape. It reads the sequence and assembles amino acids out of the cytosol of the cell. This is the cytosol here, which is full of amino acids, and it assembles them into a chain here, which is the protein. So that's what mRNA is: messenger RNA. It's a messenger from the DNA, to get read here and made into a protein.
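
The two steps just described, copying a stretch of DNA into mRNA and then reading the mRNA three letters at a time to build a protein, can be sketched in a few lines of Python. This is only an illustration and was not part of the talk: the function names and the short example sequence are made up, and the sketch assumes the standard genetic code and a coding-strand input that is read from the first AUG.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA copy of a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon; return one-letter amino acids."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-letter gene: start codon, two codons, stop codon.
mrna = transcribe("ATGGAGGTGTAA")
print(mrna, translate(mrna))
```

In the example, GAG codes for glutamate (E) and GUG for valine (V), so the made-up gene yields the three-residue chain M-E-V.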
So by using this biology, we were able to create a vaccine for COVID-19, literally within weeks of the genetic sequence of COVID-19 being determined. And this is really transformative. The way the vaccine works is that it's an mRNA vaccine: the mRNA codes for the COVID-19 spike protein. The spike protein is the protein that attaches to the cell so that the virus can inject its nucleic acids into the cell, which then generates copies of the virus. By injecting the mRNA into your body, it will attach to the protein-making molecule here and will generate lots of copies of the COVID-19 spike protein, which then get released into your system and trigger an antibody response. That antibody response in turn triggers a cellular response and basically just turbocharges the immune system against the COVID-19 virus.
Now, this is a fantastic advance over traditional vaccines, which typically used inactivated viruses to generate the immune response. This new method of messenger RNA vaccines is so much more elegant. The vaccine itself has many fewer ingredients than a traditional vaccine does. It's much easier and less expensive to produce. We've been able to generate millions of doses of the mRNA vaccine in a matter of months, which is unbelievable. Here's a slide showing the machinery involved in scaling up production of the COVID-19 vaccine, just to show you again what it looks like to do this at an industrial scale. And the mRNA vaccine is such a breakthrough that Merck recently announced that it has abandoned its efforts to develop a traditional inactivated-virus vaccine, an acknowledgement that we have entered a whole new world of vaccine production through the development of mRNA vaccines.
Now, another big breakthrough in DNA technology was the ability to sequence DNA, and that ability is fantastically enhanced by the invention of the polymerase chain reaction, PCR. PCR is a method to create many, many copies of a strand of DNA. I also want to point out that PCR is extremely important in forensic applications of DNA technology, because often at a crime scene the amount of DNA left by a perpetrator may be minute, and there's really not enough there to develop a DNA profile of the perp. This can be overcome by amplifying the amount of DNA, making many, many copies of it through the use of PCR.
Now, this slide I have up here is actually a little inapplicable. It's actually about reverse transcriptase, but just ignore that, because it was still the best cartoon slide I found that helped explain the process. If you'll follow my pointer up here at the top, you have denatured the DNA into single strands. You can do that because the DNA is thermolabile: you melt it and it separates into single strands. You can then add a primer, here in green, at one end, and then there is a polymerase molecule, in blue here, which reconstitutes the original strand. Here you see yellow, red, blue: these are all the nucleotides pairing up with their counterparts and reconstituting it. And here you see the polymerase molecule walking along the single strand, reassembling the DNA. Here, second from the bottom, you have reconstituted DNA, and then you can do the whole thing over again: you take this reconstituted DNA, melt it to denature it into single strands here at the top, and repeat the whole process. This is called thermocycling.
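
Because every thermocycle can at best double the number of copies, the amplification described above is exponential. The following Python sketch puts numbers on that. It is an idealized model and not something shown in the talk: the function names and the `efficiency` parameter (1.0 means perfect doubling) are assumptions made for illustration.

```python
import math


def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of cycles: N = N0 * (1 + efficiency) ** cycles."""
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles that reaches the target copy number."""
    ratio = target_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


# One starting molecule, 30 perfect cycles: about a billion copies.
print(pcr_copies(1, 30))

# From 10 molecules to a billion copies with perfect doubling.
print(cycles_needed(10, 1e9))
```

Under this model a few dozen cycles are enough to go from a trace sample to a workable amount, which is why typical protocols run on the order of 25 to 40 cycles. Real reactions fall short of perfect doubling and eventually plateau, so the numbers are upper bounds.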
So by doing this many times over, you can generate big amounts of DNA, which you can then use to create a genetic profile of a perp, for example. I did also want to point out something with respect to the forensic uses, in addition to PCR. In a forensic instance, it's quite possible that a perp, besides leaving DNA, might have left hair behind. In the past, you could not generate a DNA profile from hair unless the hair had a root, because the root has actual cells that you can extract DNA from. That was always a big problem, because hair rarely falls out by the root. It's often just broken off, and you have a hair strand without a root. But recently, in about 2019, a big breakthrough was made, which came out of techniques in paleontology that were used to analyze Neanderthal DNA. This technique can be applied to rootless hair to create a DNA profile, and it's a brand-new breakthrough that holds great promise for forensic applications.
Then, of course, another aspect of DNA science is being able to identify specific individuals from their DNA. I'm not going to get into too much detail about that, because Steven is going to give a little presentation in a few moments about how DNA is used to identify individuals. I would just briefly mention that it is generally done through the identification of short tandem repeats, what are called STRs. These are clusters of repeated segments of DNA sequence. That has been further enhanced through the discovery of single nucleotide polymorphisms, SNPs, or "snips", which can identify an individual even more precisely. Just as an example, through short tandem repeats you can identify an individual to within, say, one in a million, but using single nucleotide polymorphisms you can identify an individual to within one in a trillion.
You can identify an individual so precisely that the odds against the DNA belonging to someone else are greater than the number of all the people who have ever existed in the history of the earth. So these are very, very precise identification methods.
Then another big breakthrough is DNA phenotyping, and I have an example up here on the slide. DNA phenotyping has become possible through the identification of suites of genes that control facial appearance, such as height, lips, and eyes up here, so you can see the kind of images that are created. Through DNA phenotyping, from DNA alone, in the bottom half of the slide, you can generate a composite image of what an individual might look like. And here at the far right, you can see what the individual actually looks like. This is an unbelievable breakthrough in my mind. Not only that, you can age-progress the composite: what he would have looked like in his 20s, in his 40s, what he might look like in his 60s. Here is an example: on the left is an image generated through DNA phenotyping, and on the right is an arrest photo of the actual perp. The hair color, the skin tone, the width of the face, the eye placement, the lips: it's remarkable. In my mind, these DNA phenotypes are often better than composite drawings developed from eyewitnesses to a crime. It's amazing.
But it doesn't always work. Here is a slide of what I consider kind of a failure of DNA phenotyping, especially the one on the right, where I think the skin tone doesn't seem quite right. The actual perp looks much more light-skinned than the DNA phenotype generated, although the eyebrows, the hair, and the chin are all pretty close. So, you know, it could be from sun exposure. There could be a variety of things that alter the skin shade.
The main thing to keep in mind with DNA phenotyping is that it's not so much that it comes up with what the person looks like. What it does do is allow police to eliminate suspects. There's an episode of Forensic Files where the police were searching for a serial killer who would go up to houses and knock on the door, and when a woman answered, he would ask to use the phone, saying that his car had broken down or something. He would get into the house and then strangle her with a telephone cord. He did this five or six times. They were searching for a white perp, because eyewitnesses said he was white and also because serial killers tend to be white. But through DNA phenotyping, they realized that the perp was in fact probably of African descent, and this completely redirected the investigation. That's just an illustration of how DNA phenotyping helps you eliminate suspects and not waste police resources.
Oops. Hang on here. I got my instructions wrong. I also want to mention that in a few minutes Marianne, Max Chattanoir, is going to take us through a deep dive into forensic genetic phenotyping to explain the science behind it and how it actually works.
And then finally, one other huge breakthrough has been the development of genetic genealogy. This was famously used to catch the Golden State Killer. They had DNA left at the crime scenes by the perpetrator. From that DNA, and by using public DNA databases such as GEDmatch, they were able to identify relatives of the perpetrator, the great-great-great-grandparents of the perpetrator, who lived on the East Coast in the 18th century. Through that, they were able to develop a series of genealogy charts. The way that works is this: here's our suspect down here, right?
And let's say that through public genealogical databases you're able to make a DNA link to some distant relative up here at the top, let's say grandparents. You create a family tree. Who did they marry? What children did they have? This is very labor-intensive work, often requiring looking at marriage records, obituaries, things like that. There is software, though; GEDmatch in fact has very sophisticated software of its own to help generate these genealogical charts. As you go down through the generations, you can begin to eliminate people as possible suspects. Maybe they were dead at the time of the crime, or they lived in a different part of the country, and so forth. You get down to the generation that was alive when the crime was committed, and again you can begin to eliminate suspects. You can eliminate the women, for example, if the suspect is male. Eventually you get down to the point where maybe it's one of two or three people: it's either this suspect guy, or this fellow over here on the right, or this cousin over here. Then basically you just check out all three of these people and see who might have been around at the time of the crime. Did he live in the area? Did he know the victim? Things like that. By doing that, you can eliminate everyone except the suspect.
And that's what they did with the Golden State Killer. They discovered that he was alive at the time of the crimes, he lived in the area in Northern California, he was the right age, and so forth. Then they surveilled him at his home and as he went about town, collected an article that he had discarded, collected DNA from it, and matched that DNA to the DNA at the crime scenes. The use of these public databases for this work is very controversial. We will hopefully have an opportunity to discuss this further.
There are actually guidelines for genetic genealogy at the federal level, for use by the FBI, for example. It's only to be used for violent crimes or sex crimes. And for the public databases, I think pretty much all of them, like GEDmatch and 23andMe and Ancestry.com, do not allow law enforcement to access their databases except for people who have opted in. So the users have to opt in to allow law enforcement access to their DNA. Also, the data that law enforcement gets back from these databases does not necessarily identify a specific individual. It identifies a relative or an ancestor of that person, and so forth. But as you can imagine, despite these safeguards, there is a potential for abuse or for other societal issues, and hopefully we'll have a chance to discuss that.
Let me see if I have another slide. Nope, that's it. So those are my slides. I had to rush through that quickly because we have a lot more to cover. So now, after that quick overview of the various breakthroughs in nucleic acid sciences, I'd like to ask Steven to talk to us a little bit more about how individuals are identified by DNA and what sorts of techniques are used. I'm going to go ahead and take down my slide screen to get it out of the way. Okay, take it away.
Thank you, Matt. And I'd ask that people confirm they can hear me in local chat. Excellent. I wanted to step back a little bit in terms of the DNA technology so that we can all understand the common basis for how these different analyses work. Berrigan mentioned STR analysis, DNA fingerprinting. So let's step back a little bit in time and go back to this idea of DNA being a sequence of letters. Matt presented the G-A-T-C aspect of it. What I'll point you to up here at the top of my slide is the polymorphism that underlies sickle cell anemia. There's a protein that gets made.
It helps bind oxygen, and the normal sequence has a glutamate, and that's a G-A-G, as you can see in the slide. Well, there's a random mutation out there that leads to a G-T-G, and that leads to a sickle chain, which is bad: if you have two copies of it you get sickle cell anemia, but it is protective against malaria. Now, scientists wanted to understand the underlying basis of the mutation that leads to sickle cell anemia, primarily from a physician's point of view: if we can understand why it works, maybe we can help correct it. One of the really interesting early diagnostics for sickle cell anemia was that you could actually run the protein in a 2D gel. The A-to-T mutation, which exchanges a glutamate for a valine, actually changes the charge of the protein, so you can run it through an electric field and it will behave differently. That would be a great diagnostic, but it's not as easy as analyzing DNA.
I don't want to get into a lot of the biology of this, but there are these enzymes that we discovered in bacteria, called restriction enzymes. What restriction enzymes do is recognize a particular sequence of DNA, usually very small, six, even four, sometimes eight nucleotides, and actually cut it. If you've heard about the early days of recombinant DNA, this is what people used to recombine DNA, exchange parts, and do a lot of interesting stuff. Well, by serendipity, this A-to-T change was also in a cut site for an enzyme. So what I have here on the slide is a representation that the normal sequence would be cut, but if it changed to a T, that cut site would disappear. What I'm going to do now is move my arrow and bring your attention to this idea on a gel, and again, we're going to come back to DNA electrophoresis, blobs on a gel.
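
The disappearing cut site can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the slide's content: the recognition pattern CCTNAGG (N meaning any base) is the one usually cited for the enzyme MstII in this classic assay, the cut offset is assumed, and the flanking sequence and function names are invented so that fragment lengths can be computed.

```python
import re

# Minimal IUPAC lookup: N stands for any of the four bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def site_positions(seq: str, site: str) -> list[int]:
    """Start positions of every (non-overlapping) match of a recognition site."""
    pattern = "".join(IUPAC[base] for base in site)
    return [m.start() for m in re.finditer(pattern, seq)]


def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the pieces left after cutting cut_offset bases into each site."""
    cuts = [pos + cut_offset for pos in site_positions(seq, site)]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Hypothetical flanks around the codons CCT GAG GAG (normal) or CCT GTG GAG (sickle).
LEFT, RIGHT = "ACGT" * 5, "ACGT" * 3
normal = LEFT + "CCTGAGGAG" + RIGHT
sickle = LEFT + "CCTGTGGAG" + RIGHT

SITE, CUT_OFFSET = "CCTNAGG", 2  # assumed: cut after the first two bases of the site

print(fragment_lengths(normal, SITE, CUT_OFFSET))  # cut once: two shorter fragments
print(fragment_lengths(sickle, SITE, CUT_OFFSET))  # site destroyed: one full-length fragment
```

The single A-to-T change turns CCTGAGG into CCTGTGG, which no longer matches the pattern, so the mutant DNA stays full length while the normal DNA is cut into smaller pieces.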
In essence, if you want to know ahead of time whether someone is heterozygous for sickle cell anemia (again, they may not show a phenotype for this), you can run their DNA. What you would see, like you have in this pedigree, is that the parents would have two different blobs on a gel. That's because one of their DNA strands would have the full-length sequence, representing the mutation that they carry, and the other one would give the smaller blob, representing the fact that it had the original sequence, which was cut in the middle. And this was what the technology was developed for: if you had a pregnancy and you were concerned about whether the child had sickle cell anemia, you could actually do this diagnostic. That's what you see in these theoretical children. If they had two copies of the bad allele, they would have only the blob on top. If they had two copies of the good allele, they would have only the smaller blob. And if they were heterozygous, they would have both. So again, this is an actual diagnostic ahead of time based on the DNA.
Well, this is the concept of what are now called restriction fragment length polymorphisms. Again, the idea is that you have a restriction enzyme, you have a fragment of DNA you can run on a gel, and the length is polymorphic: there are two versions of the length, based on whether you do or don't have a sequence there. That allows you to do this thing on the right, which is that you can take a moderate amount of DNA, run it on a gel, and compare it between different people. This is a sample gel where, if you look at the top, you had a swab with a sample of DNA from a victim who had a crime perpetrated upon her. And then you had two different suspects. So the idea is that you take the suspect DNA and you see if the pattern matches. Sorry, I should say there are several pieces of evidence along with the victim DNA.
And then you match the pattern. You'll notice in this gel, if you look at the pattern, which I'm now covering up, the suspect number one, their restriction fragment length polymorphism matches the suspect number two. So this is kind of the basic idea of how you do this. Now, from a legal point of view, and I want to make this distinction very clear, and Berrigan kind of hinted at this, you don't say this person created the DNA. What you do is set up a mathematical probability that no one else in the world supplied that DNA at the crime scene. Does that make sense, everyone? Ultimately you're not saying specifically that this person left the DNA. You're saying the probability of anybody else having contributed that DNA is basically infinitesimal. And maybe we can get into this later in the discussion, because Shiloh had a question in local chat: if you have a monozygotic twin, an identical twin, or a close relative, and you're trying to distinguish between people, the math does become more complicated, and it's not as easy to exclude a close relative. And it's basically impossible to exclude a monozygotic twin. So I hope that gives you an underlying basis of how this works.
Now, the limitation of this type of RFLP analysis is that you need a pretty good quantity of DNA. You can't amplify from small samples, and the samples don't store for very long. That's the main difficulty. And so, while Berrigan was talking, I went and grabbed a PCR slide, because I want to go back and explain how PCR works. It's this amazing technology: Nobel Prize in Chemistry, I think in 1993, for Kary... Mullis. Mullis, yeah. It's just pretty amazing. I grabbed this from Kimball Biology. This is a free textbook. It actually used to be a textbook used in classes, and he basically put all the information and everything online.
So I'll throw this in local chat for people. What it's demonstrating here is that if you have some sample DNA, and that's what we see in the green, it's got the double helix. What you do is denature it with heat, so the strands separate into single strands. Then you expose it to a short stretch of DNA known as a primer. This primer is a specific sequence that matches a target sequence you're trying to amplify, and it lies down on it. Then, when you cycle, you add the raw material building blocks of DNA, you add a polymerase, and you copy that stretch of DNA. You keep amplifying it, repeating the cycle until you have lots of DNA that you can run on a gel. So the power of PCR is that you can go from very small samples to a detectable amount of sample. But there's another power of PCR too, which is that you can amplify any number of segments you want at the same time. It's incredibly specific: you can say, I'm going to amplify this stretch of DNA, or this stretch, or this stretch. So I hope that explains the basic idea of PCR and why it's so powerful, specifically in forensics.
In this slide, what I'm demonstrating is something known as a short tandem repeat. If you look at the coloration and the letters, you see that the four-nucleotide code of T-C-A-G, sorry, T-C-A-T, is repeated either five, six or seven times. What I have highlighted on the slide are these little arrows that represent primers that sit outside of that repeat and would amplify it. The thing to notice is that on the left-hand side the primer is in the same spot, but on the right-hand side the so-called reverse primer is farther and farther away from the left-hand arrow every time.
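
To see why the reverse primer ends up farther away with each extra repeat, here is a small Python sketch that counts TCAT repeats and computes the resulting PCR product length. It is a made-up illustration: the 20-base flank lengths, the example sequences and the function names are assumptions, not values from the slide.

```python
import re


def longest_repeat_run(seq: str, unit: str) -> int:
    """Number of back-to-back copies of unit in the longest tandem run found in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def amplicon_length(left_flank: int, right_flank: int, unit: str, repeats: int) -> int:
    """Length of the PCR product: both flanks (including primers) plus the repeat block."""
    return left_flank + repeats * len(unit) + right_flank


# Hypothetical alleles with five, six and seven copies of TCAT.
for n in (5, 6, 7):
    allele = "GG" + "TCAT" * n + "CC"
    print(longest_repeat_run(allele, "TCAT"), amplicon_length(20, 20, "TCAT", n))
```

Each extra repeat adds four bases to the product, so alleles that differ by one repeat show up as bands (or peaks) four bases apart, which is what makes the repeat count readable from fragment length alone.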
So this is known as, again, a length polymorphism, but the power of it is that people randomly have different sizes of these repeats. It's not like the sickle cell anemia restriction fragment polymorphism, where people either do or don't have that mutation and that's all you're trying to identify. There's a lot of variability in these short tandem repeats in the population. What that means is that if you take a panel of these, you can come up with a genetic profile where you're specifically tagging the length of each of these repeats at several sites.
Here's an example of this that I grabbed from a website describing these types of things and how they're used in forensics. I'm trying to grab the webpage really quick so that everyone can see it. Here it is; this is from a nice website on nature.com. They're just showing a sample here where you have the crime scene evidence and you have a suspect, and you can say that at these 13 different sites they have this number of repeats. And remember, there are two numbers here, because everyone is diploid at the sites being analyzed. Sorry, the evidence sample has a 15 and a 17. Let me just pull the arrow here. Again, this first column is the evidence sample. Here's the length at this one site for their two strands of DNA. Here's the length of their STRs at this other site. Here's this one. And basically, once you know the representation of these in a population of people, and at this point we have a pretty good representation of it for the world, you can state the probability of somebody having contributed an evidence sample versus not. So when you look at suspect A, you can right away start excluding them, because they have numbers that are different from the evidence sample. Sometimes they're the same, but in many cases they are not.
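
The comparison just described, checking a suspect's pair of repeat numbers at each site against the evidence and then asking how common the matching profile is, can be sketched in Python. Everything numeric here is hypothetical: the three locus names are real STR markers, but the genotypes and allele frequencies are invented, and the probability uses the textbook product rule, which assumes Hardy-Weinberg proportions and independent loci.

```python
from math import prod  # Python 3.8+


def profiles_match(evidence: dict, suspect: dict) -> bool:
    """True if the suspect has the same pair of repeat counts at every evidence locus."""
    return all(
        locus in suspect and sorted(suspect[locus]) == sorted(alleles)
        for locus, alleles in evidence.items()
    )


def genotype_frequency(alleles: tuple, freqs: dict) -> float:
    """p^2 for a homozygote, 2pq for a heterozygote."""
    a, b = alleles
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def random_match_probability(profile: dict, allele_freqs: dict) -> float:
    """Product of per-locus genotype frequencies."""
    return prod(
        genotype_frequency(alleles, allele_freqs[locus])
        for locus, alleles in profile.items()
    )


# Hypothetical three-locus profiles (real panels use many more loci).
evidence = {"D3S1358": (15, 17), "vWA": (16, 16), "FGA": (21, 24)}
suspect_a = {"D3S1358": (15, 17), "vWA": (14, 16), "FGA": (21, 24)}
suspect_b = {"D3S1358": (17, 15), "vWA": (16, 16), "FGA": (24, 21)}

# Invented allele frequencies, for illustration only.
hypothetical_freqs = {
    "D3S1358": {15: 0.25, 17: 0.20},
    "vWA": {14: 0.10, 16: 0.20},
    "FGA": {21: 0.20, 24: 0.15},
}

print(profiles_match(evidence, suspect_a))  # differs at vWA, so excluded
print(profiles_match(evidence, suspect_b))  # same genotype at every locus
print(random_match_probability(evidence, hypothetical_freqs))
```

With these invented frequencies, three loci already give a random match probability of roughly 1 in 4,000; multiplying across a full panel of loci is what drives the figure down to the very small numbers quoted in court.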
And so you can basically exclude them, and then you come up with a probability that this other person, whose numbers match the evidence sample, is the only person who could have contributed that sample. There is a database for this called CODIS. The other really big advantage of this type of analysis is that, one, it's very cheap; you can do it with kits. And once everybody agrees upon a standard for doing this type of technology, you can also have databases that store the data collected with it, and then you can take a sample and try to match it to the database. So anytime you're watching Law and Order or an FBI show, they say, oh, we matched this to a criminal, who then was apprehended later. What's really interesting about this, of course, is that you can connect people to a crime. And I think maybe at the end of this talk we'll talk about privacy concerns, because the thing is that the ability to collect DNA, with the power of PCR and this database, is really not the limiting factor in connecting people to crimes or other types of things.
I do want to talk about a similar technology, because we are talking a little bit about DNA advances in general, and that's this idea of the haplotype. If you string together enough mutations over a stretch of DNA, they end up generating these things that are now termed haplotypes: the haploid chromosome has a series of mutations. This is just a theoretical representation of this, where particular sequences change although the majority of them are the same, and then you have this representation of how they occur in the population. These are mutations that just randomly occur as DNA is replicated in germline cells and propagated in the population, and then people inherit them.
What's really important about these haplotypes is that now that you have these strings of information, you can come up with an inheritance pattern. A haplotype is typically transmitted from parent to offspring. Once a mutation happens, it's that haplotype that's carried from parent to subsequent offspring. So now you have this ability to do genetic genealogy, and that's why I wanted to introduce this, as Berrigan talked about databases, these 23andMe databases and whatnot, or Ancestry.com. But scientists will also use this to look at the migration patterns of ancient humans. That's what's here on the bottom slide: looking at the Y chromosome, looking at these haplotypes, to actually understand human migration patterns. What's really interesting about this is that we can go back on the scale of thousands of years to look at the inheritance patterns of where people came from, where they migrated to, and how they relate to each other. So I think this is also a really interesting, powerful technology.
There's an author, and I meant to look him up and I'm blanking on his name, who actually wants everybody's DNA. He says everybody should put their DNA into a database so that we can all look at our common ancestry. For any of you out there who are Ancestry.com fanatics or fans, you put your DNA in and it'll do this type of haplotype analysis, mutation analysis, and say, this is to whom you're related; maybe you're an eighth or 12th cousin. It gives you the sense of everybody on the planet really being part of one family, and if we could put that idea forth more, that would be very powerful. But of course, this idea of using 23andMe to identify a relative is also the type of thing that these databases can do from a criminal forensics standpoint. That's what Berrigan alluded to, and we'll talk a little bit more about it.
Does anybody have any questions on this idea of genetic inheritance patterns, STRs, and forensics and DNA? Otherwise I'll pass it back to Matt and let Marianne do the next segment.
Yes, this is Matt slash Berrigan. Before Marianne jumps in to talk about the phenotyping, I did want to mention a little bit more about these databases. The CODIS database that you mentioned is the federal database maintained by the FBI. It is generated from DNA specimens obtained from suspects or convicted people, basically collected by the police. The frustrating thing about the CODIS database is how infrequently you actually get a hit off of it from an unsub's DNA from a crime scene, which is why these other tools like genetic genealogy and phenotyping have been so useful, because the CODIS database is incredibly useless. It turns out, which is something I learned recently, that one reason the CODIS database is so useless is that it's heavily weighted toward African American DNA. Most of the DNA collected by the police is from African Americans, often for nonviolent crimes. So when you're looking for a DNA match for a perp that's committed a violent crime or sexual crime, CODIS is not really that helpful.
The other thing I want to mention is that GEDmatch, which is really the database of choice for law enforcement, uses family tree software that was developed by GEDmatch using techniques developed by the Mormon church. The Mormons have an intense obsession with genealogy because, I guess, according to their belief system, families are reunited in the afterlife. Because of this, they have genealogical records going back for centuries, very, very detailed genealogy, and they have developed their own software to manage all of that data.
And so that has been exploited by GEDmatch to develop the software that they use. And GEDmatch was initially used for people to, well, I will say- Well, yeah, let me jump in a little bit because I've actually thought about this. So let me make a general point about this type of technology when it comes to forensics, which is that all you're looking for is a match. And it comes down to the integrity, or the power and the robustness, of your database whether you do or don't get a match. And I totally agree that there are biases in the way these are collected, particularly in America. The United Kingdom has a similar database as well, and it is biased towards people who, like getting their fingerprints taken when going into jail, have had their DNA taken for some reason and then made available to other criminal investigations. So the power of the Mormons' records, as you mentioned, really does come down to the fact that they had generations of very good record keeping that could actually be used by scientists to do what's called linkage mapping for different traits. Linkage mapping allows you to say, oh, this person had a disease and we know what their inheritance pattern is, or which ancestors they got their chromosomes from. If we can now look at the markers that are associated with the people in that lineage of chromosomes and then say, oh, here's a marker that correlates with a disease, then you can start saying, oh, that location on the chromosome is related to that disease. That's kind of this idea of linkage mapping. And that comes back to the idea, of course, like you're mentioning, that the record keeping is really important for making those records useful from a genetic point of view. Yeah, that is a huge challenge with genealogy: a lot of times they'll track down some distant ancestor, but the record keeping is not good through all those generations, right?
You have to go through public documents and so forth, and it's very labor-intensive. Well, there's another complication too, and I don't know how many people out there have done 23andMe at this point, which is that sometimes you get data from the genetics that does not correspond to the parentage record keeping. And so this idea of consanguinity, either related people having children or just an affair that led to progeny outside of the marriage, these are things where scientists really have to grapple with the privacy and the usefulness of data when they come up. So I think it's an interesting point. I just want to jump in to say that the integrity of how these work in matching is basically biased by what sort of data you've collected. And I think that for forensics, and this would be one of these tensions in the public discourse, obviously we would love to have robust law enforcement, but the most robust law enforcement would be a dramatic invasion of privacy if we tried to collect everybody's DNA in order to make the databases better. So I think that is ultimately where we come down on a lot of these things. Now, I do believe that one reason why the GEDmatch database is relevant is that it was made as an open source, and everybody has access to it. Whereas things like 23andMe and Ancestry.com, they protect your data. They specifically only allow and re-disseminate stuff that you say they can. Law enforcement can't access their databases, at least as of the moment, that I know of. So again, I just wanted to throw that in there. I've taught a little bit on these types of topics before. So maybe it's a good topic for discussion later, at the fireside chat. Don't forget. Yeah, that's what I was thinking. I wanna get to Marianne real quickly, and there's a lot more we can talk about with this. And I think we are gonna have to save some of it for the fireside chat.
But while we still have time, I want to make sure that Marianne has an opportunity to talk about the phenotyping because this is also super interesting. So take it away. Okay, can you all hear me? Oh, okay, good. All right. Well, we're gonna be looking at some of the genes that contribute to the facial features that Matt was talking about a little bit earlier. We talk about beauty being skin deep, but your skin is laid over this platform of bone and cartilage and muscle. And so it's sort of bone deep. And these things are all shaped by genes. And you can see in this picture, which has had the skin stripped off, how close the bone and the cartilage come to the surface, especially around the brow ridges and the jawline and the cheekbones and so forth. So the genetic markers that are now being used to predict facial features were first determined by genome-wide association studies, GWAS. And these involve getting the DNA sequences of thousands of people and then looking for specific features that some group of people might have and then looking for genetic markers that are associated with those populations and only with those populations. A lot of those markers are the single nucleotide polymorphisms that Steven talked about a little while ago. And each of those is mapped by a DNA address on a chromosome. For example, this one here has a sort of catalog number of rs974448, and it's on chromosome two, and the DNA address is at about base 222 million, et cetera. And the two variants there are an A at that locus and a G. So this is a single nucleotide polymorphism. So the first genes that were associated with facial features were reported in 2012 by a consortium in the Netherlands. And the facial features are defined by these markers on here. You have the left and the right eye. You have the left and the right ear. You have the nasal prominence, which is this part down here. And then you have the base of the nose, which is up here.
And the distances are measured from the eyes, such as how far the tip of the nose is from the eyes. And then you have the wings of the nose, the alae, which are on either side of the tip of the nose. And of course, you have the mouth, and you have a region called the zygion, which is the widest part of the upper jaw. So those are all different kinds of reference points that you use and associate with different polymorphisms. So the first five genes that were reported were these five here, PRDM16, PAX3, TP63, et cetera. And those are all names that probably mean nothing to you. And a lot of them actually were first found in fruit flies and other organisms. So this is a table from that paper that identifies the single nucleotide polymorphism, the catalog number for that polymorphism, the name of the gene that it is in, the chromosome that it's on, and the base pair address on the chromosome. And then the effect allele, that is, the allele that makes the most difference in that trait. And then the traits are over here on the right. And that includes how far apart the alae, the nasal wings, are from the tip of the nose. PAX3 affects the distance from the right and the left eye to the base of the nose. So these are all little distances that are affected by these polymorphisms. So what most of these facial genes do is encode transcription factors. And these are genes that regulate the activity of other genes, and none of them work only on faces. They work all over the body; they're growth regulators in various body regions. So these are kind of the movers and the shapers of the human body. And they include things like sonic hedgehog, which regulates the anterior-posterior embryonic axis, the fibroblast growth factors and the bone morphogenetic proteins, which regulate the growth of bones and cartilage, some homeobox genes that determine where particular structures are located, and so forth.
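For anyone who wants to see how a single marker gets tied to one of those little facial distances, here is a minimal sketch in Python. One common way an association study scores a marker is to count how many copies of the effect allele each person carries (0, 1, or 2) and fit a straight line of the measured distance against that count. The function name and all numbers are invented for illustration; a real study would use thousands of people, adjust for things like age, sex, and ancestry, and correct for testing many markers at once.

```python
# Illustrative sketch only: invented data, hypothetical function name.

def per_allele_effect(allele_counts, measurements):
    """Least-squares slope of a measurement on effect-allele count (0, 1, 2)."""
    n = len(allele_counts)
    mean_count = sum(allele_counts) / n
    mean_value = sum(measurements) / n
    covariance = sum(
        (count - mean_count) * (value - mean_value)
        for count, value in zip(allele_counts, measurements)
    )
    variance = sum((count - mean_count) ** 2 for count in allele_counts)
    return covariance / variance


# Six made-up people: copies of the effect allele at one SNP, and a made-up
# eye-to-nose distance in millimeters for each of them.
allele_counts = [0, 0, 1, 1, 2, 2]
distance_mm = [30.0, 31.0, 31.0, 32.0, 32.0, 33.0]

effect = per_allele_effect(allele_counts, distance_mm)
print(f"Estimated change per copy of the effect allele: {effect:.2f} mm")
```

In this toy data each extra copy of the allele goes with about one more millimeter of distance. The per-marker effects reported for real facial features are small, which is why many markers have to be combined.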
So for example, this PAX3 gene was first associated with nervous system development in fruit flies. PAX3 is a transcription factor, and its name comes from the paired box, a chunk of DNA that is found in that gene. So here are just a few of the genes that are associated with different facial features. PAX3, with the distance from the eye to the nose. PRDM16, with the length of the nose. SOX9, with the shape of the nose tip and the width of the nasal alae. You can see that some of these, like PRDM16, are associated with more than one feature, like PRDM16 is associated both with the length of the nose and also with, oh, I put that on there twice, never mind, okay. So they all have weird names. So this next slide shows how those polymorphisms are used to generate the algorithms that produce a face from the DNA sample. So this is just one particular polymorphism, rs974448 in PAX3, and this first little mask up here shows where that gene is active in the face, and then these next two show the difference in activity between the two different versions of that gene. And this wireframe down here shows where those effects are located. So the red wireframe is what you get with one allele and the green wireframe is what you get with the other allele. So bunches of these individual effects are used in generating the algorithms that produce a sort of human mask. And this slide just lists a bunch of other genes that are associated with different parts of the face. And you can see there are lots of genes at each of these locations. So there are about 200 genes now that have been identified that are associated with controlling facial features. This is from a recent report in 2020, and what you have here is different quadrants of the face. So in the middle here, you've got the upper and lower nasal region, and then you have the upper region associated with the eyes and this region associated with the jaw and the chin.
And then in the middle part of the face, there are several others. So each of these is color coded around the edges here. And there are a bunch of genes associated with each of these facial regions. And these little boxes here show the variation that you see in the United States on the outside of the ring and in the UK on the inside of the ring. And then over on the right side, it just shows where those genes are located on the different chromosomes. So this shows the chromosomal locations of those differences, and those peaks that you see sticking out show the variation you see at those different regions. So the next thing I want to do is just tell you this little story about how good this ability to identify individuals is. And this is from 1915. So it's probably a little bit better now. So in 1915, there were two employees of the New York Times, and they had their DNA sequenced, and they sent their sequences off to Mark Shriver at Penn State, who had developed the software for producing images from this genetic polymorphism information. And the images were produced for both individuals, one female, one male, and given to their colleagues, and the colleagues were asked if they could identify the individuals that these faces belonged to, and 50 of their colleagues responded. So this was the guy who had donated the DNA for the male face, and none of his colleagues were able to identify him correctly. On the other hand, about a third of the female's colleagues were able to correctly identify her from the faces that had been produced from the genes. This might be a male-female difference, because one of the things I've noticed is that when I first get into class, it's a lot easier for me to remember which of my female students is which than it is for me to learn the males. So I don't know if we just pay more attention to female faces, or female faces are more highly differentiated, or what it is.
So, oh, I'm sorry, that's not 1915. That's 2015, that's 2015, that's a mistake. I was just about to break in to ask you about that. So there's some, yeah. Okay, very good. Thank you for noticing that. Okay, so the next few slides, I think I'm just not going to show you because they're similar to what Matt showed you a little bit earlier. And that's all I have to say. I'll go ahead and run through these others quickly just so you can see what they look like. You know, something I'll just mention while we run through that is that people can now use telomere length to get a sense of somebody's age. So I wonder if that's something that they try and incorporate. Yeah, that's interesting. That's a good idea. Yeah, I think that would make the age progression that much more advanced or precise. And there are some references if you want to read more about this. And I think that it is worth noting with regard to the DNA phenotyping that previously, to generate an image of a perp, you had to have a witness sit down with a forensic artist to generate a composite image of who the witness saw. And those are called composites now because they're actually generated with facial feature software. You basically show the witness a bunch of noses and the witness picks out the nose, and then you show them a bunch of eyes and the witness picks out the eyes, and you show them a bunch of face shapes and the witness picks out a face shape, and you create a composite by doing that, which is very similar to what happens with the DNA phenotyping, right? You look at what does the gene for the nose tell you? What does the gene for the eyes tell you, and so forth? So they are both composites, but the eyewitness composite images are, I think, no more reliable or accurate than those generated through DNA.
Well, I think that's actually been a really interesting topic over the past 15, 20 years: exactly how unreliable eyewitness testimony can be. Yes. And even fingerprints, again, were actually never established as a scientifically valid way of identifying people. It's something that can work. I believe fingerprinting does work, but there's also this question of validating the fingerprint examiner. And so exactly what numbers they can put behind excluding people or identifying someone as the one who left a fingerprint, in terms of evidence, is something that's also now being highly questioned. At least relative to what you have now with DNA technology. Right. Fingerprints are certainly being challenged more by defense attorneys than they have been in the past. Although I have come across a few true crime case stories where the DNA in fact corroborated the fingerprint evidence too. So fingerprints are still pretty good, I think, as a rule. Yeah, and I think the number of people who are in fingerprint databases is very different, probably still bigger than what you'd have in terms of criminally or forensically accessible DNA databases. Yes, and I also think that fingerprint comparison software is much better now than it used to be. Fingerprints are more reliable because they're analyzed by algorithms rather than by the human eye. And I think that's improved fingerprints as well. But again, I don't know how much the audience recognizes your attempt to look like Robert Stack, or maybe Karl Malden, from Unsolved Mysteries. But everyone knows from having watched TV the power of being able to put out a face for people. So, like America's Most Wanted, to say, can you identify this person? And what it does is it casts a net, right?
It allows you to cast a net, and then if you use these other things that can specifically say this person is essentially the criminal and no one else could have been, that's where the power of this DNA forensics of facial features is so amazing. Right, right. I did stand up here because I did want everyone to admire my detective outfit. All right, I loved it. Well, there is a lot more we could discuss here, but we are just a little past the hour. So I think it might be best to go ahead and stop here. And I think we should continue this with our fireside chat later this week. I think we can have a lot more fun getting into a lot more of the issues this power bestows upon us. My real goal here today is that I wanted our students to come away with an appreciation of how powerful this technology really is. And this has really just come about within the last 10 to 20 years or so: our understanding of genetic material, the tools we've developed to exploit it for all sorts of purposes. I just feel like it's underappreciated. People love the space program and people love the internet, but DNA technology is keeping pace. It is going to be revolutionary in our lives and in our grandchildren's lives. And I just wanted to ram that point home before we close out. We can talk about some of the ethical issues too. Absolutely, I think that's something we really need to dive into in the fireside chat. So I hope you all will join us. I think it's scheduled for Wednesday. Is that right? Yes, I think so, Wednesday evening. Let's take a look, I think, Gashunthal. Fireside chat, February 3rd, 4 p.m. SLT. So be there or be square. Thank you everyone for coming. I wanna thank my speakers today, Steven and Marianne, and thank you all for attending and for your comments. And with that, I'll gavel us to a close. Thank you. Thank you, Matt. Thanks for hosting.