OK, ladies and gentlemen, it gives me great pleasure to introduce our next speaker, someone whose appearance here has been much anticipated by the genetic genealogy community: Mike Sager from Family Tree DNA. Now, Mike never expected to be in charge of the Tree of Mankind. He was certainly lumbered with this task, which has become a passion project for him, and he has done such a wonderful job putting it together. Of all the people on the planet, he probably has the greatest overview of how mankind has evolved and split into different branches over the course of time. So I'm really looking forward to this presentation. Please give a warm welcome to Mike Sager.
All right, thank you very much. OK, so today I want to talk about a variety of topics: the Tree of Mankind (not necessarily the FTDNA tree, but the tree in general), some tips and tricks for interpreting the tree, how FTDNA is building the tree, and a little bit of the history behind it. I also want to touch on some of the more notable samples that FTDNA has produced recently. I'm going to try to stay away from a real basics talk, but I do want to cover a couple of things first. The Y chromosome is passed down unchanged from father to son. Because of this, we are able to trace an entire line all the way back up to essentially the root of mankind. So even if no other data existed, I could take every male in this room, sequence them, build a tree, and tell you exactly how everybody is related to everybody else: about how far back in time, and about how closely. Y-DNA is unique in that respect. So just a little bit about it. The Y chromosome is currently about 57 million base pairs long, which may seem like a lot, but it's actually the second smallest chromosome, behind chromosome 21. Again, it is passed down from father to son. When we do Y sequencing, we have to have something to compare sequences against.
So we have what is called the reference sequence. There is actually nothing special or unique about it. It's just a universally accepted sequence that everybody compares against. References are updated regularly because we don't actually know the entire sequence of the Y chromosome, or of any chromosome for that matter. There are regions that are difficult to access, very repetitive, things like that. The latest reference came out in December 2013 and the one before it in 2009, so we may have a new one coming up soon. That actually doesn't have much to do with genealogy, because the advancements in the reference are usually going to be in fringe regions that we aren't using for genealogy. Here is the basic structure of the Y chromosome. This gray part is a very large portion where we basically don't know what's in the reference sequence. These parts in blue up here are called the euchromatin, and this is basically where all of the great stuff for genetic genealogy and tree building comes from. So it's actually about half of the chromosome that we are using. So, SNP names. When you see something like R-M269, the first part of the SNP name is always a prefix that identifies whoever named or discovered that particular mutation. In these examples, BY is Family Tree DNA; it's what we use for our SNP discoveries. We switched to FT when Big Y-700 came around, basically because we were up to about a quarter of a million variants, and a haplogroup like J-BY238713 can be a little cumbersome. The next part is just a sequential number: BY10000 is simply the 10,000th BY marker that was named. This here is the chromosomal location. Again, the Y chromosome is about 57 million base pairs long, and BY10000 simply sits at position 13,354,622. So it's basically the address of the mutation. The entry also describes what the mutation actually is.
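As an illustration of the naming scheme just described, the pieces of a SNP entry (prefix, sequential number, position, and the ancestral and derived bases) can be held in a small record. This is only a sketch in Python: the `Snp` class and the `parse_snp` helper are hypothetical names rather than any FTDNA format, and the pattern assumes a name is simply letters followed by digits. The T-to-A change is the one the speaker gives for BY10000.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical record for one named Y-chromosome SNP."""
    prefix: str     # who named the variant, e.g. "BY" or "FT" for FTDNA
    serial: int     # sequential number within that prefix
    position: int   # base-pair coordinate on the reference sequence
    ancestral: str  # base before the mutation
    derived: str    # base after the mutation

    @property
    def name(self) -> str:
        return f"{self.prefix}{self.serial}"


def parse_snp(name: str, position: int, ancestral: str, derived: str) -> Snp:
    # Simplifying assumption: a SNP name is a letter prefix followed by digits.
    match = re.fullmatch(r"([A-Z]+)(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised SNP name: {name!r}")
    return Snp(match.group(1), int(match.group(2)), position, ancestral, derived)


# The example from the talk: BY10000 at position 13,354,622, T mutating to A.
by10000 = parse_snp("BY10000", 13_354_622, "T", "A")
print(by10000.prefix, by10000.serial, by10000.position)
```

Keeping the name separate from the position reflects the point made in the talk: the name records who found the variant and in what order, while the position and bases describe the mutation itself.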
The ancestral base is a T and it mutates to an A. This is just a sample of the naming entities, so you may see a whole bunch of prefixes: BY, FGC, Y. It is not an exhaustive list, only about half of them; you can find the full list on ISOGG. They range from academics, to citizen scientists looking at Big Y or other NGS data, to consumer genealogy companies like FTDNA and FGC. Typically the tree is built around Big Y or other next-gen sequencing data. After we sequence a sample, it goes through automated variant calling and matching. New SNPs are identified at this stage; FTDNA usually names them within about 24 hours of a sample posting, and then we post those SNP names online. We could talk about that quite a bit, but we are trying to get names out to the public because double and triple naming used to be a much bigger problem. A lot of our SNPs were being renamed by other companies, by interpretive companies, or by other analysts, and it caused a lot of confusion, so we wanted to cut that down. It's still a bit of a problem today, but not nearly as bad. Samples are then placed onto the FTDNA haplotree. This happens before I have really had a chance to look at them. After that, matching variants and equivalent breakers are identified and the tree structure is reviewed; I'll say more about that in a minute.
Some people seem to have a bit of difficulty interpreting the tree. One thing I like to suggest is to view mutations as actual men's names, because for all practical purposes that's exactly what they are. BY10000 had to have occurred in one man in the past, and one man only. You can view all of this clunky information as, say, Mark. Here is an example of that in use. We know that Bob is a descendant of Tom, and we know that Tom and Bob are descendants of all these men here. But you'll see that all these men are stacked up in the same line, the same block.
These are called phylo-equivalents, and that simply means that we do not know the order in which they occurred: everybody tested to date either has all of them or none of them. Now say a tester comes along and tests negative for Bill, Charles and Arthur. What does that tell us? It tells us that Bill, Charles and Arthur occurred more recently, while everybody else occurred first, more distantly. So we can update the tree: Bill, Charles and Arthur go down here in their own block and the others remain up top. More testing may shed light on the order within these blocks, but that's typically how equivalent breakers are dealt with.
So, a little bit of history on the tree itself. In the 90s and early 2000s there were a lot of different researchers and academics pursuing their own trees and phylogenies and naming SNPs on their own, and that led to a whole bunch of confusion within the academic community. So a lot of them came together and formed what's called the Y Chromosome Consortium in 2002, in an effort to unify the tree and make something more universally accepted. This first tree contained 245 variants and 153 branches. Haplogroups A through R are defined. Mr. T over there, you're not known about yet. Haplogroup S is not yet discovered. Actually there's a bit of a caveat: haplogroup T is known about, but it simply sits here at the root of haplogroup K. Typically a new branch is not added to the tree unless it is observed in two men, thus the term haplogroup. Anything observed in just one person is viewed as a singleton. For this initial tree, however, they just wanted to gain some structure, so singleton branches were added, and haplogroup T actually does exist here, but as a singleton, so it was not named yet. OK, all branches today maintain what is called monophyly, which simply means that everyone on a branch shares a common ancestor.
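The equivalent-breaking step described earlier in this passage can be sketched in a few lines of Python. The function name `split_block` and the placeholder names "Dave" and "Ed" are hypothetical (the talk names only Bill, Charles and Arthur), and the sketch assumes the new tester has a clear positive or negative call for every SNP in the block.

```python
def split_block(block, tester_calls):
    """Split one block of phylo-equivalent SNPs using a single new tester.

    block        -- list of SNP names currently stacked in the same block
    tester_calls -- dict mapping each SNP name to True (positive/derived)
                    or False (negative/ancestral) for the new tester

    Returns a list of blocks ordered from older to younger. If the tester
    is positive for all or negative for all, the block cannot be split.
    """
    upstream = [snp for snp in block if tester_calls[snp]]
    downstream = [snp for snp in block if not tester_calls[snp]]
    if not upstream or not downstream:
        return [list(block)]
    # SNPs the tester carries must be older than the ones he lacks.
    return [upstream, downstream]


# "Dave" and "Ed" are placeholders for the other men in the block.
block = ["Dave", "Bill", "Ed", "Charles", "Arthur"]
tester = {"Dave": True, "Ed": True,
          "Bill": False, "Charles": False, "Arthur": False}
result = split_block(block, tester)
print(result)
```

The order inside each resulting block is still unknown, which matches the speaker's remark that only further testing can resolve it.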
So if you look here at haplogroup E, you'll see that everybody in E has the same common ancestor, and that is E, the root of E. The only exception is the oldest haplogroup, haplogroup A. Because we keep discovering older and older branches, many people in haplogroup A are more closely related to people in haplogroup R than they are to other people in haplogroup A. So that's kind of an interesting way to look at it. Haplogroup A is as old as everything else down here combined, and yet we know the least about it. And haplogroup A00, which is the oldest Y chromosome that we know, is even older, and we know even less about it. There are also several interior nodes here that would later be discovered, but not by the YCC. You'll see that under K there are many branches: P, O, N, M, L, and then several singleton branches. The internal structure grouping these is not yet known.
Haplogroup naming. In 2002 they also came up with a nomenclature system for the tree, with two main types, and both are still in use today. The first is haplogroup by lineage, which you're familiar with from names like R1b or E1a1. That's still used by ISOGG, if you go to their website, and it's used by FTDNA, but only internally: it's useful for coding the structure of the tree and moving things around. It's more difficult for actually communicating lineages, because it can change over time. If we used it for some of the people we have in haplogroup J, we'd have haplogroup names that are 30 characters long, and with any change to the structure of the tree the names change, year after year. The most commonly used system is nomenclature by mutation, so something like R-M269 or G-M201. Those will never change, regardless of how the tree structure changes. R-M269 will always be R-M269, so there's less ambiguity and it's a little easier for the community to use.
I just thought this example that they put out was interesting. They show how to do a branch split, and they show, here in haplogroup H, what happens if M52 and M69 split apart. Well, that actually did happen not too long ago, but in the reverse order: M69 has proven to be the parent of M52. In 2003 two of the authors came out with a slight revision, an update to the YCC tree, and in it they expressed their desire to resolve multifurcations into bifurcations. That's just a fancy way of saying that they were looking for this internal structure, to add branches in places where branches could exist but hadn't been discovered. One example they did find was the grouping of haplogroups N and O, so they notated that one, and they notated R2. R2 was discovered, and that's a primarily Arabic haplogroup, actually quite rare. Still missing are haplogroups S and T. At this time the Khoisan and Ethiopian Y chromosomes were believed to be the oldest, and it would still be many years before we found Y chromosomes that are hundreds of thousands of years older, or tens of thousands of years older, than these. The next update wouldn't come for another five years, and basically they doubled the size of the tree. They went from 153 branches to 311 and from 243 variants to 599. Finally haplogroups S and T are documented, and to date no other new haplogroups have been discovered. They did add a couple of intermediary branches: CF, and IJ is finally joined. But they would never find the remaining structure that consumer genealogy would later discover: I and J wind up being grouped with K, then H is grouped with IJK, and there are several other interior nodes that genealogy, rather than academia, would solve. So the YCC ceases to update. In 2005 FTDNA released its first tree, which was basically a mirror of this YCC tree.
Shortly after, within a couple of months, ISOGG put out a tree, and their goal was to centralize the effort and maintain a standard now that the YCC was no longer active. Archived ISOGG trees are readily available from 2006 all the way through today; it's still being updated. This is the point at which the genealogy community begins to outpace academia. I'm not going to spend a whole lot of time on this; these are just some graphics that show the growth of the tree over time. You can see it starts with 100 SNPs a year, 200 SNPs a year, and then we start to get to about 1,000 a year. May 2015 is probably about when I started, or a little bit before then. Then, with the introduction of Big Y and other next-gen sequencing, we're able to grow the tree at a really rapid pace. If we had today on here, it would be somewhere around 215,000 to 220,000 variants on the tree. This graphic shows FTDNA branches over time. As you can see we're just over 25,000; I think if today was on there it would be right around 27,000 branches. So what the YCC was able to do in five years, jumping from 153 to 311, we're doing easily weekly; really, every couple of days we're adding that same kind of growth.
The Big Y-700 was a new product that we came out with. Our goal was to sequence as much of the Y chromosome as we possibly could. For validation we chose 88 samples. For most of these we picked as many different haplogroups as we could, A00 through R. But we also chose 11 samples from one haplogroup, J-ZS1716, which is Bennett Greenspan's haplogroup. FTDNA estimates this to be about 1,000 to 1,500 years old. We wanted to see what Big Y-700 uncovers that Big Y-500 could not. We used SNPs that were called in a minimum of two samples, and we used a man who was one branch upstream, at BY101, as an outgroup to eliminate variants above that level.
What the original Big Y showed was 16 variants and seven branching points. Big Y-700 retained all 16, but it also uncovered an additional eight new variants. Coincidentally, we're covering almost exactly 50% more of the Y chromosome, and we got 50% more SNPs in just this branch. Six of these eight proved to be equivalents to known branches, and two proved to be new branching points. For one of these branching points, if you look under ZS1724, we have S5, S6, and then BY171904. So we had three known lineages below this. We knew, however, that S5 and S6 were each other's true closest matches. We know that because that's actually Bennett and his son, and this BY171 group over here is, I believe, their second or third cousins. The old Big Y couldn't find a variant that would group them, but Big Y-700 did uncover one, which we called FT1. So now they have their own branch on the Y tree. Similarly, under ZS1716, we had what we thought were three lineages: BY170013, ZS1707, and then this lone man out here, S11. Well, we actually uncovered a variant that groups S11 with this BY17003 group, so that is a new intermediary branch that we call FT2.
I just had to include this graphic. I think it's one of the most beautiful graphics of the tree that I've ever seen. It gives you a really nice perspective on the saturation in certain haplogroups and the absolute lack of it in others. Haplogroup R is young. In the Y tree it's one of the youngest haplogroups there is, yet obviously it's the most well-defined and well-tested. And here A0, as old as everything else here, barely gets an honorable mention. But you see that R, I and J really dominate, at least in the growth that has been driven by genetic genealogy. And this is the actual tree structure, the FTDNA tree structure. I just love that graphic. So I'm going to give an example of how the FTDNA tree is built from everyday results. I took an example.
This is from haplogroup J, a branch called Z18271. A wave of Big Ys came in from this group. What I do is take each person's novel variants, plus other variants that are unique to that group and have at least one ancestral value within it. Currently there are seven people in this group; we call them A through G. And then we have these shared variants here. "Ref" just means that they match the reference, they are ancestral; they don't have anything there. This T, these green boxes, indicate that a variant is present. A simple reordering shows that there are three branches within this group that could be added to the tree. First, we have this single variant that is shared between samples E and F, simple enough. Then we have another block that is shared between samples D, B and G. You'll notice that sample B has no coverage; that's what this N means, that he has no coverage for these SNPs. However, because he forms a branch with D, we know he has to have these mutations, so it's not necessary for him to have coverage here. What's actually going on is that D and G have Big Y-700s and sample B was a Big Y-500. So here is the third branch. Now, for a bit of added trickiness: this down here is something I run that gives the total number of alt calls for a SNP across everyone viewed in the FTDNA database. I'm finding three people with this variant, but I'm only showing two here. So where is the other one? Well, it occurs in the two Z18271 men that we mentioned, and it also occurs in this branch, which is very close by. So what exactly does that mean? If we look on our tree, these men are currently placed up here, and they're negative for everything downstream. But they share a mutation with this man. So how could that be? Another way to view this is to search for this mutation in all of these men.
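The "simple reordering" step in this example amounts to grouping variants by the set of samples that carry them. A minimal Python sketch follows. The variant names `v1` to `v3`, the call codes `alt`/`ref`/`N`, and the function names are assumptions for illustration, and the toy matrix reproduces only part of the slide (the variant shared by E and F, and the block shared by D and G where B has no coverage). It is not FTDNA's tooling.

```python
from collections import defaultdict

SAMPLES = "ABCDEFG"  # the seven testers, labelled A through G as in the talk


def row(alt="", no_call=""):
    """Build one variant's calls: 'alt' (derived), 'N' (no coverage) or 'ref'."""
    return {s: "alt" if s in alt else "N" if s in no_call else "ref"
            for s in SAMPLES}


def candidate_branches(calls):
    """Group variants by the exact set of samples carrying the derived call.

    Only variants seen in at least two samples can define a branch;
    a variant seen in one sample stays a private (singleton) variant.
    """
    groups = defaultdict(list)
    for variant, by_sample in calls.items():
        derived = frozenset(s for s, c in by_sample.items() if c == "alt")
        if len(derived) >= 2:
            groups[derived].append(variant)
    return dict(groups)


def no_coverage(calls, variant):
    """Samples that cannot be scored for this variant (no reads there)."""
    return sorted(s for s, c in calls[variant].items() if c == "N")


calls = {
    "v1": row(alt="EF"),               # shared by E and F
    "v2": row(alt="DG", no_call="B"),  # shared by D and G, B uncovered
    "v3": row(alt="DG", no_call="B"),
}
branches = candidate_branches(calls)
for members, variants in branches.items():
    print(sorted(members), variants)
print("uncovered for v2:", no_coverage(calls, "v2"))
```

Sample B is reported as uncovered rather than negative, so, as the speaker explains, he can still be placed on the D and G branch on the strength of other evidence that he forms a branch with D.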
Everything highlighted in yellow here is everybody downstream of Z18290. You'll see that nobody has coverage except for the single Big Y-700 right here. From this we know that everybody in this group has to have this mutation. And knowing that, we know that this SNP is actually upstream of Z18290. So when we add that to our tree, we have a new Big Y-700 branching point downstream of Z18271, and the two branches that we referenced earlier are down here. So there were actually four branches added to the tree. There is a bit of trickiness in comparing Big Y-700 to Big Y-500, but it's really uncovering a lot of neat branches like this. And this is just a low-level one. We're finding a lot of higher-level ones that are proving to be quite interesting. This is just a block tree view of the same branch that was added.
OK, so I want to talk a little bit about what I think are the more notable samples that FTDNA has produced within the last year, year and a half or so. When somebody takes an STR test with FTDNA, we give them a predicted haplogroup. We don't predict very far down; we're very conservative. But E-M2 is actually one of the places that we predict to. Well, we had a man come in from Saudi Arabia who actually split E-M2. It's the first time I'd seen somebody split a branch that we actually predict to; it's that high up in the tree. This is a primarily West African haplogroup. You can see some of the countries that are most common in our database. It is theorized to have had an east-to-west expansion. And so who breaks it? Of course, a man from east of West Africa. Again, as in our previous example, he's negative for all of these mutations, and there are still a couple of hundred mutations stacked up in this E-M2 block.
Early R1b splitters. There was a neat result that came from France, actually. In late 2018, a man split P310 and L151.
Now, these are way high up in the tree. These are the parents of P312 and U106, so that was pretty neat. We have nearly 18,000 Big Ys and tens of thousands, if not hundreds of thousands, of STR results downstream of these branches. But at the end of this year we also found another man who fits into this exact same line. However, the two do not form a branch with each other, so we actually uncovered a third line. Another way to view this: we have three distinct lineages from P310. One has basically our entire database, lineage B has only one person, and lineage C has only one person. In such a heavily tested group, it's really hard to wrap your mind around how amazing that is, and what's still left out there to be found.
Here's another recent split, in what ISOGG calls haplogroup E1a. This is another West African branch, found very predominantly in places like Mali. It's actually very common in the US; it was brought over in the slave trade. FTDNA estimates this branch to be about 51,000 years old with a TMRCA of 18.4 thousand years. And who was it that split this branch? A Saudi Arabian.
I was talking about STRs earlier. For people whose haplogroup we can't predict, we run what is called the SNP Assurance Program, which tests a small number of SNPs to give somebody an actual haplogroup. I've done tens of thousands of these. In 2018 I was scoring a customer's backbone results, and he came back as CT*. A star, or asterisk, simply means that he's negative for everything downstream. This is a visualization of the backbone of the tree: he's positive for BT and CT, and negative for everything here. And F is the parent of IJKLR. So of course we were going to run a Big Y on this person, with his permission. There are only 10 possibilities that could come out of such a result, and every one of them is going to be essentially groundbreaking in terms of the tree.
Among the possibilities: he could be a true CT* and form a new branch down here that is probably 140,000 years old. He could fall under DE, a branch we don't test for in the backbone because nobody ever falls on just that branch; they're either D or E. He could fall under CF, another branch that we don't test for because nobody belongs there. And he could also split any of these groups here, because all of them have equivalents and we test for just one. So he could split D, C, E, CT, DE or CF. Whichever of these occurs, it's going to be quite interesting, and Big Y is about to figure that out. What he actually did was split the root of haplogroup D. FTDNA has estimated that this split occurred about 65,000 years ago. That means a man who lived 65,000 years ago has descendants alive today that the community had no idea about. The fact that there are still people out there like that, even in under-tested haplogroups, is mind-blowing to me. He was found to be derived for only 13 out of about 250 SNPs at the root of haplogroup D. The participant could trace his lineage back to Al Wajh in Saudi Arabia, on the northwestern coast of the Red Sea. This is between Egypt and Jordan. So we ran a Big Y on him, and we also asked him for his most distant known paternal relative so that we could run a Big Y on him too. Unfortunately he was not able to trace his male line back very far, and the best we could get was a first cousin. We ran him and found only one SNP difference. We then added this branch to our tree and called it D-FT75; it could also be referred to as D2. So this branch formed about 65,000 years ago, but everybody in it has a common ancestor about 100 years ago. Haplogroup D is an exclusively Asian haplogroup. It is found nowhere else. There are no instances of it in Europe, in the New World, or in Africa. It's dominant in Japan, Tibet and the Andaman Islands, with very low frequencies in Central and Southeast Asia.
It was conspicuously absent from India; it had not been seen there. But recent studies have actually found very ancient Indian samples to be haplogroup D. This is a view of the tree after we split it. This is the new D root. These are the numbers of SNPs below the root. This is what used to be haplogroup D, M174. And then here is the new branch that we added, with about 700 more mutations in that block. We wanted to find more structure within this group than just a man and his first cousin, so we took a proactive approach and started mining our STR database for people who might belong to this branch. Actually, there is a well-known FTDNA sample that has had the haplogroup label DE* since 2011. It came from a Nat Geo kit. I think there might have been some hesitancy about whether he was really DE, so the enthusiasm wasn't necessarily there. But we decided to take a second look and ran a Big Y on him, and he does belong to the new D2 branch. He's not closely related to these men at all, though: of the 700 SNPs that we used to define D2, he was ancestral for over 250. So this is about a 25,000-year-old split, coming off a 65,000-year-old split. The original two breakers that we identified got a new branch called FT76, and you see the structure here. We decided to do a bit more, and we found another prospective D2 member, from Hawaii. Again, he had very limited knowledge of his paternal history, but you can see his autosomal results are primarily West African. He clusters with the original two Big Ys that we did, but it's still a pretty ancient relationship, as he lacks 58 SNPs that the original two share. We estimate that to be about a 5,000-year-old split. So the original breakers get another new branch: they go from D to FT75 to FT76, and now they're at FT155.
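As a rough check on the figures quoted above, one can assume (and this is an assumption, not FTDNA's stated dating method) that the roughly 700 SNPs in the D2 block accumulated evenly over the roughly 65,000 years since the split at the root of D. Under that simple proportional model, sketched here in Python with hypothetical names, the SNP counts the speaker quotes land near his estimates.

```python
# Figures quoted in the talk; both are approximate.
BLOCK_SNPS = 700       # SNPs defining the D2 block
BLOCK_YEARS = 65_000   # estimated age of the split at the root of D


def rough_split_age(missing_snps):
    """Years implied by a count of block SNPs that a tester lacks.

    Assumes a constant accumulation rate across the block, which is a
    simplification made only for this illustration.
    """
    return missing_snps * BLOCK_YEARS / BLOCK_SNPS


deep_split = rough_split_age(250)    # tester ancestral for "over 250" SNPs
recent_split = rough_split_age(58)   # tester lacking 58 shared SNPs

print(f"250 SNPs -> about {deep_split:,.0f} years")    # roughly 23,000
print(f"58 SNPs  -> about {recent_split:,.0f} years")  # roughly 5,400
```

Because the first tester was ancestral for more than 250 SNPs, the first figure is a lower bound, which is consistent with the speaker's estimate of about 25,000 years; the second is close to his estimate of about 5,000 years.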
One last time, we found another man who had very unique STRs and didn't match anybody closely at all. He could trace his paternal lineage to the Tab plantation, so again the line was brought over by the slave trade. His myOrigins results showed primarily West African ancestry. He came back, again, as related to this D2 lineage, but he was another 25,000-year-old split within it. So in all, we ran five Big Ys. We found a split at the root of haplogroup D, estimated at 65,000 years ago. We found three distinct and independent lineages, and those lineages are 25,000 years old. Within one of these, we found another 5,000-year-old split. And we documented the first cases of haplogroup D outside of Asia, in the Middle East and in Africa. I tried to take a picture of what it looks like on the FTDNA tree. You see this giant block; this is the small block. I couldn't figure out any way to get FT75 to look anywhere close to good enough to be shown, but it would be about three or four times the size of this block. The information over here we've already mentioned.
While we were doing this, another researcher by the name of Haber had discovered this line as well. He put out a paper at the end of 2019 in which he found this new D2 lineage in Nigeria. In the paper they refer to it as haplogroup D0. It's similar to what was done with haplogroup A: when something older than A was found, it was referred to as A0, and when something older still was found, A00. ISOGG, I believe, refers to this as D2 as well. Now, the paper references only three samples, with a TMRCA of about two and a half thousand years. The sequence data for these are still locked up at the moment, so we can't see exactly where they fit in. But there was enough information in the supplemental material that I was able to see where his Nigerian samples fit in with the five that we ran. And so this is an overview of basically everything that we have discovered since then.
This was the old root of D. This is the new split, so now this is D2. We have a 25,000-year-old split. Right here is D2a. Here are the original two. The man from the Tab plantation actually matches up perfectly with these Nigerians, and this whole group shares a common ancestor about 2,500 years ago. But then we have the Russian, who could trace his roots to Syria, and another African-American man, who are 25,000 years removed from everybody else. So there's still quite a bit of diversity in this newly discovered branch. And that is my talk. Thanks.
Thanks very much, Mike. This is absolutely amazing, the work that you're doing. I'm sure there have got to be loads of questions. Is this being published, or how do you communicate it?
We have not published anything yet. Haber coming out with his publication kind of stifled that a little bit. But we're thinking of at least putting out a note to supplement their findings with ours, and we're working on that right now. And of course, publication takes a long time. It's three months, six months to review, and by the time the paper comes out, more discoveries have been made in the meantime.
Are there any plans for a blog, or something ongoing?
Roberta Estes wrote a blog post about it, I believe in late 2019, when we first added the new branch to the tree. It's something you could easily Google: Roberta Estes, haplogroup D. She has a nice blog post about this.
And to what extent... I mean, obviously, this is changing the way people think about the evolution of mankind. To what extent do you work closely with archaeologists and linguists, who are doing the same kind of research from an entirely different perspective? It's obviously going to be very complementary to the work that you're doing with genetics.
Personally, none. I'm dealing with new data all day, every day, and there's not much outside collaboration at the moment.
Where would you like to see that collaboration going forward, or how could it be coordinated?
There could be some very interesting things done. One of the things that we want to do is start analyzing more scientific samples, such as the 1000 Genomes Project and others, and we are working on that. There are a lot of technical difficulties. Again, it's a completely different testing platform, and getting those results into our database in a meaningful way is a bit tricky. The analysis, and getting them in there properly, is a bit tricky.
And that was something that we discussed in Dublin, at Genetic Genealogy Ireland in October, when Lara Cassidy from Trinity College Dublin was presenting the results of 150 ancient genomes from Ireland.
They're putting them out. They really are. I mean, every week, they just keep pumping them out.
It's fascinating, it really is. And it would be great to get them into the Y-DNA Warehouse and somehow find out where they all sit.
A lot of them are very poor coverage samples. You can make a lot of mistakes drawing inferences and getting too aggressive with your analysis of ancient samples. It's not like a customer Big Y, where it's very clean and most of my decisions are very easy. You can make a lot of mistakes, and they can go unnoticed unless the right people look at it. You have to be very careful.
Well, it will be interesting to see what happens when we finally get those ancient genomes into the database because, of course, a lot of them will be radiocarbon dated, and so that will actually help us to calibrate the mutation clock on the Tree of Mankind.
Yeah, exactly.
We have a question here from Derek.
Thank you, Mike. Actually, I have 100 questions, but I've only been allowed one or two. Mike Welch, who you're probably familiar with, has done some very good analysis. What interests me is the L21 branch, which accounts for most of Ireland. We're interested in surnames and clans and septs and so on and so forth.
He has taken 100 surnames where at least three branches contain the same surname, so we can probably take that as the branch for that surname. And 50 of those are Irish-related, the rest Scottish and Welsh, so it's very, very interesting. Do you have any plans to actually do that systematically on the tree?
No. Well, I'm working my way through the limitations, and privacy is always going to be the big one. I'm pretty sure you're referencing Alex Williamson's Big Tree.
Yeah, and Mike Welch has separately done this analysis. And of course, Alex had a great tree, and part of it has fed into the FTDNA one.
Yes. The layout, the formatting, was inspired by his, just the simple block visualization. But none of the data is transposed or shared. Our tree is completely independent.
Here's a thing that would be great to have: could you click on the Irish flag, for instance, and get all the branches relevant just to Ireland?
That would be a lot.
Oh, that would be a lot? Great. And then afterwards do it by province, and then by county. That would be nice.
Jared never wants much. One thing: what about surnames? On the Big Tree you can see the surnames of absolutely everybody, but here you only see surnames if you're within a 30-SNP distance. Why haven't surnames been put on every single branch of the Big Y block tree?
You know, it's privacy. We can't publish the names of participants.
But just the surnames?
I'm not the person to ask about that. I am not. So I'm going to punt on that one.
Sure. Jay has a question down here, and then Donna up at the front.
Mike, thank you very much for your address, but in particular I'd like to publicly thank your bosses for allowing you off campus to come and address us. You were telling me earlier this morning that even while you're here, almost every minute, your backlog of decisions and analyses grows, with new Big Y tests coming through by the hundred, by the day almost.
In the last 10 days, about 2,000 samples have posted. So you'll be anxious to get back to this. But I'm going to RootsTech the week after. Well, on behalf of those 10 days, even more thanks to your boss. Getting to the point, you mentioned backbone testing. I haven't talked to you about this, but it may come up with other surname administrators. I've got one of my flock who's got a backbone test underway. You did touch on it, but could you go over it again, perhaps, to explain exactly what's happening? I think you're talking... It's not an advertised thing, but here we are suddenly getting something like that that we didn't ask for, where you initiated it, am I right? Yeah. Okay, so the backbone that I'm talking about is for people who take an STR test and we cannot confidently predict what haplogroup they are going to be in. And we run a few SNPs for them. Now, when it comes to the construction of the haplotree, the haplotree is automated for the variants that were automatically called by FTDNA. Sometimes I make a call that FTDNA... If FTDNA no-calls it, because they're not confident in it, but I look at it and I'm confident in it, I override that. And the only mechanism I have to do that at the moment is by uploading a result the same way that we upload backbone results. So if you're a Big Y tester and you see a Y haplogroup backbone order, that just means that I've been in your area and basically needed to upload a positive SNP call to get an appropriate haplogroup assignment for you. I'm not clear. You're not clear. Okay, we'll talk. Talk to James later. I've just got a really practical question, actually. With all this work you're doing, with all this analysis, do you do a lot of it manually? Is it automated? Do you run proprietary software? What sort of applications are you using to do all this amazing work? It's not like so many people were doing this. It's not... So it was a lot more manual back in the day.
And some of the things have been automated, such as generating matrices to work with. Back in the day, I used to have to enter every single mutation into the tree by hand: the SNP name, the position, the alleles. So if I added 100 SNPs, I had to do that by hand and it would take about 30 to 45 minutes. It was very, very tedious. Very recently, I've been able to get that less manual. So some of the things have been automated, but branches are still not added automatically. Everything is reviewed by me. So the analysis portion of it is still very manual. But some of the data collection and visualization from my end is a little more automated. Thank you very much for the talk, and I can see every day how big the tree is. I've even got my own particular branch, me and my dad. But my question is, I also see hundreds of emails coming through my project for all the mitochondrial tests that are running through. And I know PhyloTree Build 17 for the mitochondrial tree has been paused since 2016. And with the person who was developing it, I think nothing's happened. I was wondering, are Family Tree DNA going to clone you and put your other self onto the mitochondrial tree? Thank you very much. Honestly, it's been kicked around a bit, but I don't see it happening anytime soon. It's a daunting task. And to be honest, I think the Y tree is much easier. But no, I don't see us trying to venture down and do a similar thing with the mt tree anytime soon. Lots of quick questions from Patty. Mike, you said you're still very conservative in predicting haplogroups, so that most Irishmen just get R-M269. I know from running a county project, it's very easy to see that a man is M222 or M226. I mean, I think you could sell a lot more tests by asking Irishmen, do you want to know if you are descended from Niall of the Nine Hostages and these mythological figures? Any chance of getting more precise predictions off the STRs?
So, I believe it was five, six years ago, FTDNA tried to get more aggressive with their STR predictions. That kind of bit us in the butt a bit. Because if we get it wrong, it's on us. If we predict somebody down here and they order a test based off of that prediction and it's wrong, well, then we have to refund them, comp them, and that's out of our pocket. So, it was decided to go back. We'll stick with the basics, and basically it's more to cover us than anything; the admins can be more aggressive. And everybody will have their particular tree and people will say, well, you can confidently predict this, but I don't see us going back to the more aggressive approach anytime soon. Debbie? Just a very quick question. Are you able to say how many Big Y tests you have at Family Tree DNA? Big Y-700? So, we just crossed 40,000 a week ago and, like I said, we've done about 2,000 since. So, it's got to be around 42 now. So, about 42,000. Well, unfortunately, I mean, we could stay here for another hour or even more with Mike. But unfortunately, we have to call it a day there. We have Ken and Alison Tate talking next about "Distance Is No Object" and how family matters can be sorted out through DNA. But, Mike, you'll be standing around the Family Tree DNA stand for the rest of the day. So, if you have questions, please take advantage of Mike's presence here and come and ask him some questions at the Family Tree DNA stand. And until then, please give him a warm thank you for a fantastic day. Thank you. I know you enjoyed that.