 moving to the second talk of this morning's lecture program. And again, I'd just like to say a big thank you to Family Tree DNA for sponsoring the fourth year of Genetic Genealogy Ireland and all the volunteers from ISOGG, the International Society of Genetic Genealogy, who are helping out here in the lecture area and also downstairs at the Family Tree DNA stand. So a big thank you to our sponsors and to our volunteers. And first up is John Cleary, and John will be talking about current developments in SNP testing for genealogical research. John teaches in a languages department at a university in Scotland, and has previously taught in colleges and universities in Germany, Japan, Malaysia, and the UK. So he has certainly been exposed to languages. He has been involved in educational development projects on teaching modern European languages, which have led him to travel widely in Eastern Europe and Central Asia. And in a previous life, he also worked in a museum and wrote a history of the people who have built and inhabited medieval almshouses. So in recent years, there's been a huge explosion in the number of DNA markers available for testing on the Y chromosome. And as more and more people are taking up these advanced tests, our knowledge of the human evolutionary tree has expanded. Not only that, but the new SNP results, in combination with pre-existing STR data, are creating branching patterns within surname projects and helping our understanding of the evolution of surnames within Ireland. John summarizes these recent advances and shows us where they might lead. So please give a big welcome for John Cleary.
Thank you very much, Maurice. It's a real pleasure to be back here at Genetic Genealogy Ireland for my third time speaking here. And it's very nice to be back in this room. It's a very nice room to present in. And I'd like to say good morning to all of you. Actually, good afternoon now to all of you. 
And thank you for coming along to listen. Some of you may have heard talks I've given either here or in Birmingham about widening a project and SNP testing. And I think what many people have said to me afterwards is thank you very much, but it's quite a complicated area, and it would be useful to have a more basic overview of how to go about exploiting SNP testing for your family history research. So what I'm going to do today is try and go through a procedure for how we can approach using SNP testing on the Y chromosome for this in a more step-by-step way. So this is going to be more of a basic-level talk. But I think it's a good time to do this because, as Maurice says, there are changes afoot. We don't really know what they're going to be yet, but we can smell change in the air. And things have been stable for about three or four years since the introduction of affordable next-generation sequencing tests, which allow you to sequence long chunks of the Y chromosome rather than sample just particular markers. And I think we're just on the verge of seeing big changes that will lead to different ways of doing this and perhaps more effective ways of doing it. And so I'll try and talk about what some of these may be towards the end of the talk. And I'm not going to suggest what the future will be. It's my own speculation, and there are probably people in this room who know more about it than I do. But anyway, I'm going to begin by doing an overview of what the various kinds of Y-SNP tests are. Then I'll look at how we can set about building small projects. And I view anything as a project, whether it's a large surname project with officially named administrators, or just something which you're doing yourself, trying to investigate the history of your own surname by getting a few tests together. 
I'll look at some cases from Irish and Scottish and actually Scots-Irish surname research to illustrate some of the questions and some of the things you can do with SNP testing in furthering research into those surnames. And then I'll give my own idle speculations about the future of Y-SNP testing. So first of all, I'm just looking at the various types of Y-SNP tests. There are three broad types of ways you can test SNPs. Before I go on, are you all familiar with what I mean by SNPs? No. Well, this isn't quite a beginner's-level talk, I'm afraid. But many people in the room are familiar with the STRs, the single tandem, what do they call it, single tandem? Short tandem repeats, thank you. Absolutely, short tandem repeat markers. And then again, Maurice, the single nucleotide polymorphisms, which are the SNPs. And of course, they're two different types of marker on the Y chromosome. So a SNP, in very, very simple terms, is a single mutation in one place on a chromosome, any chromosome. And it pretty well stays there. Most of these are, at least on the historical time scale, permanent. So once one of these marker mutations appears, it gets passed on to all the descendants of the person who has that first mutation. And therefore, they can become markers of genetic descent. So all a SNP is, don't worry about it if you're not familiar with the terminology, is a marker that's passed on. And as it's on the Y chromosome, it is passed on from father to son down the direct male line. And of course, the advantage of this is it enables you to do surname research since, until the 21st century, surnames were also generally passed down along the male line. That may be changing, of course, which will be a challenge for future family history research. And why not? So I'm going to go back to the types of SNP testing. 
So the aim is to discover which of these markers appear on your Y chromosome, or the Y chromosome of the person you're testing. So the best way to do it is to do a sequencing test, in which long chunks of the Y chromosome are actually read, the whole thing. So whatever the sequence of bases on the chromosome is across that stretch, you read it. And you come up with a sequence, a very, very long sequence, millions of the A, G, C, and T letters, which will be your own particular unique sequence. Now, most of us who have Y chromosomes, that could be roughly half the room, will have pretty well identical sequences in our Y chromosomes. But there are small differences here and there, which are these markers of mutations that occurred in historical or prehistoric times and that are passed down through the ancestral line. And so these are what we're trying to find. And so by getting the full long sequence, you then spot where these small, tiny differences are between you and somebody else that indicate that you have different ancestral paths. And also, of course, because mutations can appear down the generations, you may test yourself and your father, for example, and find that each of you has one small difference somewhere on the Y chromosome, which would be a mutation unique to you in that case. So the sequencing then aims to sequence, in theory, the whole chromosomal sequence. But in actual fact, today's tests can't manage that. They don't have the technology to do it with any great degree of reliability. And therefore, what you have is targeted regions in which bits of the chromosome are sequenced fairly fully. Others are kind of partially sequenced, others not at all. So we have a kind of mosaic of sequencing. It's still better than what we had before. It's not quite a full Y sequence. 
Now, I gave a talk in Birmingham at Who Do You Think You Are? this year, in April, in which I talked in more detail about the differences between the different sequencing tests. And if you're interested in following this up, then try and find the recording of that talk online, which Maurice has placed online. And it's a slightly higher-level talk that goes into more detail about the technicalities. Today, what I wanted to do is talk more about how we can use these tests. There are two more forms of SNP test that we're talking about. Single SNP tests have been around for a long, long time. In fact, they were the way in which SNPs were tested for most people doing this kind of research until a few years ago. And this is simply where you try and find a very particular SNP. So you send off a DNA sample to see, do I have this SNP here? A very well-known SNP is called R-L21. Many of you who are Irish men in this room will have this. I do. Many of you will. Probably not all of you. Some of you will not have that. But it's a very, very common SNP. You could do a test to see if you have it. And the testing company will only test the little piece of your Y chromosome where that SNP is found to see if it's there or not there. So it's a single SNP test. And again, they're very, very useful for very particular kinds of questions. But you wouldn't want to try and sequence your whole Y chromosome that way. You'd have to be a billionaire to do that. So they are a very slow and expensive process. So the great thing about next-generation sequencing is it speeds up the whole process and makes it more affordable. Then, more recently, we have the panels of known SNPs. And these are very, very useful because imagine if you do a more extensive next-generation sequencing test, you might discover a whole chain of new SNPs that exist in you and nobody else, which will be your private SNPs, as we call them. 
But most of these actually will be shared with other people that you're related to who may not have tested yet. And of course, you can go to them and say, I did this NGS test, would you do it as well? Do you have $500? They might say no, unsurprisingly. But if you say, do you have $99 or $119 to see if you share these SNPs that I have, that's a more affordable prospect. And it means that you could follow up your own test by more limited testing of other people who are believed to be connected to you and who may share these SNPs, to see whether or not they do share them. And as we'll see later on, this enables you to start building descent trees. So these three types of SNP tests are complementary to each other. They all serve different purposes. You wouldn't yourself take a sequencing test and a panel test. There's no need, because the sequencing test already includes the SNPs found in the panel tests. So be very, very careful. Make sure nobody you're involved with or working with tries to take a panel test having already taken a next-generation sequencing test. But you may get your relatives to do that. And you yourself may want to come back later on to investigate further some of the SNPs you found in your own NGS test. And one way to do this is to take a single SNP test to just make sure that something you've found that maybe has a slight question mark over it actually is a justifiable discovery. So these are, yes, please. When you say NGS tests, do you mean something like Big Y? I'm trying to avoid being company-specific here. Yes, there. Well, I will mention that in a later slide. Maybe the next one, actually, which shows what the two big NGS tests are. But yes, at the moment, there are two companies. I think it's the next slide. Well, here we are. The two companies currently selling commercial direct-to-consumer NGS tests, which are the Big Y from Family Tree DNA, and the Y Elite 2, I think it's 2.1 now actually, from the Full Genomes Corporation. 
And they're both very good tests. They have different strengths and weaknesses. And there's a very sharp debate amongst project administrators about which is better to take. There's probably no doubt that this one is a better test overall. But it will hit your bank balance far harder than the Big Y test. The Big Y saves costs by targeting regions and doing what we call capture testing, trying to capture particular parts of the Y chromosome where useful SNPs are thought to be. And again, the Y chromosome is very up and down. Some parts are very useful. Some are actually not very useful for this kind of testing. So the Big Y aims to get the parts, as far as it can, that are useful. It's a lot cheaper. It's a lot quicker. They turn these around in three or four weeks, whereas these can take several months. So this is for the purists. But again, it's not to be ignored. Because if you want to seriously research your own genetic ancestry, you may find that actually some of the SNPs that are crucial to your line lie in an area that is not tested by the Big Y. And therefore, you would currently need this test in order to find them. Now, I've taken this one. I think this one is good enough for what I was trying to do. And I sort of recommend people I'm working with also to take this one. Because if you catch the Family Tree DNA sales at the end of the year, you can almost get two of these for the price of one of these, which means you can do what's very important in SNP testing research: test two people who come from the same cluster to then see which SNPs are shared and which ones are not shared by the two. Obviously, it'll cost you an awful lot more to do it with this. Eventually, I will do something like this. I'm not going to do this one. I'm going to wait for something better to come along. So I would like to see my whole Y chromosome sequence. I'd like to have my whole DNA sequence, actually. That's what I'm waiting for. But at the moment, these are the two choices. 
And here are two companies again, which do SNP packs and single SNP testing. Family Tree DNA do all three. They do all three types of tests. And they're very responsive at the moment to creating and improving their panel tests based on the discoveries from the Big Y tests. So if you're working at a deep level of trying to investigate a haplogroup, a much older grouping of related individuals with a much older direct ancestor, Family Tree DNA are currently very willing to put together special panels for those groups. And then coming down here, YSEQ, a very small, very mobile company based in Berlin, Germany, who also do single SNP testing at a very low cost. And I go to them for any single SNPs that I want to confirm in the groups I'm working with. And they also do their own SNP panels, using a different technology to Family Tree DNA. So again, there's a rainbow of test types and technologies. But all of these have a role to play in the current ecology of SNP testing. And it's not very easy to see what's going to change, but I think these companies, I would expect, would be the ones who would lead that change as well. So those are the companies which we can use. I'm going to talk about how to go about setting up your own project. So I've got a little flowchart that will show you the steps that you can go through. And the first question you need to ask, if you're interested in testing yourself, or a relative of yours, or a friend of yours, is: do you have a surname project with a large set of STR matches? So there are many very old surname projects that are very large and developed and have a lot of data from the STR tests. If the answer is yes, then can you identify the branches of this family lineage using the data you have already? If the answer to that again is yes, then find a test candidate within each branch. That's relatively simple, because you want to find out what the unique SNPs of each of the branches you've identified are. 
So if you're in this lucky position, you can move fairly quickly to SNP testing. You can do single SNP or panel testing; if you do that, check the reads you get. Or, of course, you can go to NGS testing, in which case you should then move on to get the raw data. You can't do very much with the results as given to you on most of the testing sites. Well, certainly on Family Tree DNA's results page. If you get the raw data files from Family Tree DNA, you can do a lot with those files. So that's just the beginning stage. Most of us are not in this happy position. I certainly wasn't when I first started, and I think many of you, if you're interested in this kind of research, may also not be in the position of already being part of a large STR-based surname project. So what then do we do? Well, you can ask, do you have close STR matches? Now, you can't do the Big Y test in particular if you haven't already done STR testing. So you'll know already if you have STR matches who share your surname. And of course, if that's the case, there's a fair chance that they will be related to you, and you can use SNP testing to investigate whether they are and how closely. If you don't, then the only thing you can do is SNP test yourself. And that's more or less the position that I was in when I tested myself, because I had almost no one I could call a close match, and certainly none sharing my surname. If you do, then test yourself and one of the matches. We get most of the value from these tests by comparing results. So in a sense, if you only test yourself, you're hoping that something else may come up. Chances are, if you don't have any STR matches, you're not going to have any SNP matches either; they'll all be too distant, too far away from you. But it's still a start on the process to test yourself, as I did. It's better still if you have a match that you can test along with you, and then you have a comparison you can set up. 
You may have STR matches, close STR matches, with a mix of surnames. This of course is very, very common for many of us. Often we can identify obvious NPEs, non-paternal events, people who don't have your surname but are clearly connected to you through some kind of slip between sheet and bare in the past. Or alternatively, there may be a whole range of different surnames mixed in that may give a rather confused origin picture, and we'll look at a case like this a bit later on. If that's the case, once again, try and find SNP candidates. And if you've got several surnames with several members each all mixed in together, you'll want to investigate why those several surnames are coming together in your matches list. And therefore, ideally, you'd want a testing candidate from each surname group. If you can see branches within them, try and find a testing candidate from each branch within each surname. Of course, the costs mount up here; this can be an expensive process. You can do it gradually over time. Collect tests, process them, find out what they mean, then contact more testers and gradually build up more and more data. When I first came here two years ago, I showed the people who were here that day a very rudimentary tree chart from the group which I was working on. In the two years since then, we now have more than 50 Big Y results and several SNP panel results in that grouping, and we have a much more elaborate tree, which I'll show you later on. And it's taken two years to reach that, but we're now seeing the results of that long period of investigation. So once you've done the tests, again, you have to get your data and process them, compare them, or of course get the positive reads from the single or panel SNP tests. So we'll look at the data side of things later on. For now, I'm going to start by looking at these different cases. So I'm going to begin with my case number one, which is just about here. 
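The flowchart steps described above can be sketched in code. This is only an illustrative reading of the spoken flowchart, not a reproduction of the slide: the function name, the parameter names, and the three labels for the kinds of STR matches are all assumptions made for this sketch.

```python
def recommend_snp_strategy(in_large_str_project,
                           branches_identifiable=False,
                           close_str_matches="none"):
    """Suggest a next step, following the flowchart as described in the talk.

    close_str_matches is one of "none", "same_surname" or "mixed_surnames".
    These labels are hypothetical and exist only for this sketch.
    """
    if in_large_str_project:
        if branches_identifiable:
            # Branches are already visible in the STR data.
            return "Find a test candidate within each branch"
        # Large lineage, little branching visible in the STRs (case two).
        return "SNP test several members to look for branching"

    if close_str_matches == "none":
        # No close matches: a start, but nothing to compare against yet.
        return "SNP test yourself"
    if close_str_matches == "same_surname":
        # The value comes from comparing two results.
        return "Test yourself and one of the matches"
    if close_str_matches == "mixed_surnames":
        # Several surnames converging in the match list (case three).
        return "Find a test candidate from each surname group"
    raise ValueError(f"unknown match type: {close_str_matches!r}")


if __name__ == "__main__":
    print(recommend_snp_strategy(False, close_str_matches="none"))
    print(recommend_snp_strategy(True, branches_identifiable=True))
```

The three cases discussed in the rest of the talk correspond to three of these return values: no matches, a large lineage without clear branching, and a match list with mixed surnames.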
This is for the surname, the person who has no matches; that's me originally. So when I first did an STR test, I had no matches at all. I did have one person who matched me at 12 and at 25, but he didn't match me at a higher level. I was 18 out of 111. So he wasn't really a close match, although he believed he was. I never really thought that was likely. The surname was wrong as well. So my surname is Cleary. His surname was Gorman. And the one thing that led me to think that there probably was some connection was that both of us trace ourselves back to South Tipperary, so the area between Cahir and the Cork border. So clearly our two families were living in close proximity for quite a long time. They were definitely there in the 19th century. They've probably been there for centuries actually, many, many centuries. So the question was, how did I come to have a close match who was called Gorman? Why do I not have any matches of my own? Well, the second question is actually quite interesting and fairly easy to answer when you think about it. My family probably was not one that had very much emigration to the US. The vast majority of STR testers in the database today are American. I think 70% or so is one of the figures I've heard. And if you're part of an Irish family that's had very heavy migration to America, the chances are some of the descendants will have tested and begun to find relatives back in Ireland as well, if they've tested. In our case, I suspect, we probably haven't had very much of that. And therefore, there are few Clearys related to me in the US who are testing, and therefore a lower chance of finding people who are connected to me through this. Secondly, the Gorman, well, that probably is a family that went to America; the Gorman I'm related to is based in the US. So their family clearly was affected by migration. But all of the Gormans who have tested, quite a few of them, come from America or Australia, and there are none so far from Ireland. 
So that's the first thing. So what I decided to do was set about finding somebody else to test. I'd done my own testing, including Big Y. And I was still a singleton, apparently with no close relatives for about 4,000 years. I was getting a last-of-my-tribe sensation. So what I did was I went and sought a potential tester. And in Tipperary, there's a Cleary family. My father grew up in Clonmel and often visited a cousin, also called Cleary, a cousin of his father. And when I asked him, my father, well, what kind of cousins are they, no one actually knows. First, second, not first, probably not second, third, maybe fourth, maybe just cousins, just cousins. So we don't actually know who the most recent common ancestor is. But this particular line can get back to about 1825. I can get back further, because I've been very lucky that our line happens to know of a gravestone in Rochestown in Tipperary that's guarded by a very fearsome bull. So no one gets near to take it down. And that gives us these two people that connect to him through parish records. And then we get two more generations from the gravestone. So I can get back to the 18th century, which is not the case for many Irish families. And my cousins can get back to this Thomas Cleary. So we know that there are some potential candidates, at least two, who could be the ancestor who connects into my line, and who don't quite have the right dates to be the Thomas Cleary who my cousin descends from. But, as we all know, dates in parish records and birth records, marriage records are not exactly fully reliable. People often seem to be a bit younger than they really were once you find their birth record. And we do have the marriage record of the Thomas Cleary here. And it tells us when he was born, as he was married in 1874. This is his second marriage. And at the time he was 50 years old, at least that's what he was telling people. And so we think that's the probable date. But he's probably not this man. 
This one's almost certainly too young. But he could be this one. It's always possible. So what to do? Well, I invited a descendant from this family to take a 37-marker test. And I actually came last year to GGI and talked about this. And I gave the results of that 37-marker test. It was actually very interesting, because it's not immediately clear if we are related or not. And I'll ask you the same question I asked last year in a moment, to see if you agree with last year's crowd. So these are the STR results. And you see here that we match on 33 of the 37 markers, which is just about on the margin of what Family Tree DNA would say is a match indicating relationship. But unfortunately, on one of the markers that we differ on, we differ by a huge three steps. So technically speaking, what we call the genetic distance between us is the sum total of all our differences, which is six, six out of 37. And that's actually quite a high difference. So this apparent cousin, called Cleary, from the right Tipperary area, still doesn't appear in my Family Tree DNA account as a match. So it's beyond the threshold of relationship. So can I ask you, what do you think then? Do you think that we are related, I and this other Cleary from Tipperary? Given that it's a six out of 37 difference on the so-called stepwise model, which counts all differences, or just a four out of 37 difference on the so-called infinite alleles model, which only counts the number of markers you differ on. So can I ask for a quick show of hands? How many of you think that I and this other Cleary are related in reasonable time? Thank you very much. How many of you think we are not related? Thank you very much. I mean, you're very brave. You can all see where I'm leading with this. Last year, it was 50-50. It's very interesting that the audience did split, pretty well 50-50, saying yes and no. Of course, since last year, I've gone and done a Big Y test on the other Cleary. And I now know the results. 
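The two ways of counting genetic distance mentioned above can be made concrete with a small sketch. The marker names and repeat values here are invented for illustration; only the pattern (37 markers, four that differ, one of them by three steps) follows the talk. It is a simplified model and makes no claim to reproduce how any testing company handles special cases such as multi-copy markers.

```python
def stepwise_distance(a, b):
    """Stepwise model: add up the size of every difference in repeat count."""
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())


def infinite_alleles_distance(a, b):
    """Infinite alleles model: count the markers that differ at all."""
    return sum(1 for m in a.keys() & b.keys() if a[m] != b[m])


# Hypothetical 37-marker results; names and values are placeholders.
markers = [f"marker_{i:02d}" for i in range(1, 38)]
tester_a = {m: 12 for m in markers}
tester_b = dict(tester_a)
tester_b["marker_05"] += 1   # one-step difference
tester_b["marker_11"] -= 1   # one-step difference
tester_b["marker_20"] += 1   # one-step difference
tester_b["marker_30"] += 3   # the single three-step difference

print(stepwise_distance(tester_a, tester_b))          # 6
print(infinite_alleles_distance(tester_a, tester_b))  # 4
```

The same pair of results therefore reads as 6 out of 37 or 4 out of 37 depending on the model, which is why the question of relatedness is hard to settle from these STRs alone.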
What I see in the Big Y test is a lot more resolution on this question. This block of red here is me and the other Cleary. We share all of these. So all these SNPs are shared by us, right down to just one that I have and he doesn't. So in the time since our lines diverged, there's been time for my line to pick up one more SNP, and there have been no more private SNPs in his line. This is based on analysis, by the way, for those who are familiar with the various resources for analysing, from Alex Williamson's Big Tree and YFull.com. I've used both. And my own reading of the BAM files to see which of these SNPs are shared and which not. So this actually is pointing to a very strong picture of a relationship. So the Clearys, I think, definitely are related in historical time. What I can't really do is put a date on it. You see, I've got a little timeline across over here. And I believe it's sufficiently vague that this bottom line is somewhere, anywhere between 1600, 1700, or 1800. I don't think that Thomas Cleary, 1813, is the man. I think probably our common ancestor is going to be a little bit further back and maybe just too far back to find him through parish records. But we can show at least that we are genuinely cousin lines. And whatever my father and his family believed when they were visiting them in Clonmel in the 1950s, they were definitely cousins. And that's good enough, isn't it? That's all we need, really. We don't need to know exactly what number of cousins that is. Now, over here, we see the Gorman line. Yeah, so that's the interpretation. So right over here, I think before 1800, most likely. The Gorman tester has also very kindly Big Y tested this year. And we're still working on his results. They are now in the Big Tree, not yet in YFull.com. And he believed for a long time that Cleary was a Gorman NPE. And I sort of swallowed my pride and said, OK, fine. Probably it's OK, isn't it? 
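The shared-versus-private comparison of the two Cleary results described above comes down to set operations. A minimal sketch follows, with made-up SNP names; the real comparison also has to check whether each position was actually read in both tests, which this sketch ignores.

```python
def compare_snps(tester_a, tester_b):
    """Split two testers' positive SNPs into shared and private sets."""
    a, b = set(tester_a), set(tester_b)
    return {
        "shared": a & b,      # inherited from the common ancestor
        "private_a": a - b,   # arose on tester A's line after the split
        "private_b": b - a,   # arose on tester B's line after the split
    }


# Hypothetical SNP names, arranged to mirror the pattern in the talk:
# everything shared except one SNP found in only one tester.
me = {"SNP_1", "SNP_2", "SNP_3", "SNP_4", "SNP_5"}
cousin = {"SNP_1", "SNP_2", "SNP_3", "SNP_4"}

result = compare_snps(me, cousin)
print(sorted(result["shared"]))     # ['SNP_1', 'SNP_2', 'SNP_3', 'SNP_4']
print(sorted(result["private_a"]))  # ['SNP_5']
print(sorted(result["private_b"]))  # []
```

A long shared block with very few private SNPs on either side suggests a fairly recent split, while many private SNPs on one or both sides suggest a much deeper one.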
And I think we found a Cleary tenant living on the Gorman landlord's farm in the 1800s. Oh, yeah, it's going to go down here. But actually, no, no. The connection, I think, between us must go back a long way. Because these are the private SNPs we found in the Gorman tester. And there's a lot of them. And dating SNP branching is very, very difficult and very fraught with danger. I have to warn anybody who tries this. And we have to have a very wide margin of error around any kind of dating. But my estimate is that our lines divided in the Dark Ages. I have an estimate here of around 600 CE or possibly before. So we are talking about a very, very deep branching between us and them. And very likely we may have lived in the same part of Ireland for all the centuries ever since, which is interesting. Although maybe we both just moved there from somewhere else around the same time; it's possible. So I've actually answered some of the questions then in my first case. I pursued these testing candidates, persuaded them to test, and got results that have now answered questions that were thrown up by previous tests. And what I hope to do in the future is find more Clearys, in particular, to flesh this out. I do know one family in Illinois, who I'm trying to contact. And I hope in time to find more testing candidates. So you may need to do this to take best advantage of these testing types. You need to have a question to pursue. And you often need to make those contacts to find the people who can answer those questions by taking these tests. So to move on then, look at case two. Case two is what happens when you have a, I think I have the whole thing here. We'll just run this through quickly. Yeah, it's when you have, yes, case two is actually the yes that comes off here. It's when you have a large surname lineage based on STRs, but actually not having very much clarity in the way of branching. 
So I'll just quickly show a very interesting surname I'm working on at the moment, or actually hearing about from the person who's working on it, who's very active in this research, which is the Cochrane family. And they trace their origins to Derry, and possibly before that to Renfrew in Scotland. And so there is documentary evidence of a Cochrane who moved from Renfrewshire to Derry in the early 17th century. And we have a very large Cochrane lineage, which has got very, very similar STR values that suggest they also descend from an ancestor with a vintage of around about 1600 or so. And so some of them have got a certain degree of documentation that suggests they may descend from this Derry ancestor. And they've also found Cochranes still living in Northern Ireland, who they've tested. So you see here the distribution of the surname. It's very much a northern distribution. And of course in Scotland, there are two different spellings of Cochrane here, but they tend to be concentrated largely around this west coast area, south of Glasgow, and here's Renfrew here, Renfrewshire. So this Cochrane family then have done very extensive STR testing. And I don't think they have an awful lot to show for it, in the sense that they clearly have got a lot of similarity, so they can demonstrate the existence of the lineage; they clearly are related to each other, whoever the ancestor was. But they have a lot of singleton markers here, which aren't so useful for making branches because they could have occurred at any time between when the first sub-ancestor broke off the branch and the tester himself. There's a couple here like this, for example, which may allow you to construct a possible branch. There's not much here in the way of branching information. So I think this lineage really is ripe for some quite serious SNP testing, particularly using the Big Y test. And some have actually Big Y tested, and they're beginning to get more distinction and a clearer branching pattern. 
So already we have three people with the name Cochrane, or a Cochrane ancestor, and they've already found a division. So there's one branch here and a fairly recent common ancestor here, and this probably is around about 1600. So we expect that more testing will find more branches coming off here, maybe a sub-branch or two. And over here we have some below the man who was in the Cochrane project, I think the man testers here, and who again appears to have fairly similar STR patterns to the Cochranes. But now the SNP testing has shown that he actually connects much further back in time, probably around about 1100 or 1200. So sometime in the medieval period, before surnames were adopted, and probably before the original Cochrane ancestors had also taken that surname. So I think, although this is very much at the beginning, in its infancy, they have that huge STR lineage to draw upon. And I think they're now working to identify more potential testers to try and bring more hierarchy and structure into the tree here. And of course there are many big surname lineages that are going a long way in doing this, one of which is the Irwin surname project. James Irvine at the back here has gone further than most of us in using STR and SNP testing to build a structure, a family structure, for his project. I think the Cochranes have the potential to follow that on a slightly smaller scale, and I think they will do so over the coming years. SNP testing also lets us see who our deeper ancestors are. And of course, you may not be interested in deep ancestry; maybe genealogy is what is driving you. But I think the Cochranes have discovered some quite interesting things, because they discovered that they are also connected to other family lineages who are testing within the same haplogroup, including a large family group called Cooley. The Cooleys do not know where they came from. I'm sure many of you know that Cooley is an Irish name; it's found over in Mayo, I believe. 
It's also an English name, I think with different origins. These Cooleys may come from Lancashire. They also connect with a tester named Hackett, who can trace his ancestry back to Derbyshire, and the surname evidence suggests that he may be of Norman origin. So this could be a group of Norman, or potentially Flemish, origin here. And the Cochranes who are in Scotland are connected to this group, who seem to have possibly, again very hypothetically, more southern roots in the Isles. And going back further still, they've also found that they are connected to another large family lineage, which is the Tucker family from Devon, possibly from Devon. Again, they can prove ancestry back to London and they believe before that they're of Devon origin. But I'm not sure what the evidence is on that so far. And the Tuckers are also closely connected to a Norwegian family. I think we have a representative of this tester here. And for all of these lines, if we draw them back, we find there's a mix of Norwegian, mostly Swedish, and Irish lines. And I'm looking forward to hearing a talk this afternoon, I think, from Peter Sjölund here at the front, who will talk about Swedish and Irish DNA. And I'll look forward to that. So again, this is fascinating. So I think it tells us something, not only about deep origins, but also about potential migration. Do you have a question?
Has your study extended to your...
We have two people in this particular group from Germany. We have none from France. Almost all of the people in this grouping, I'll show a map later on, are from Ireland or the UK or from Norway, Sweden, Finland. And two from Germany. We have some Welsh too, yeah, so we have some Welsh, yeah. And there are very... There are really few testers from France and Germany. I mean, I suspect there will be some from France, but not many testers. Yes. Yes. I can't say much about France at the moment.
I know that we need more French testers, but I've been sort of looking northwards more recently. Yeah, thank you for that.
So then this is case two, then: a large surname lineage, not much distinction in terms of branching within the STR tests. But here, SNP testing is opening up a much more hierarchical tree and actually demonstrating connections, as some of these Cooleys were popping up as matches in the Cochrane tests. Here clearly we can see that the SNP testing sorts the STR results into clear branches with a much older lineage. And therefore we now know that the Cooleys do relate to the Cochranes, but only about 1500 years ago. That's the time scale which we're looking at here. So it's a distant, deep relationship. So that's the second case.
The third case is convergence, what we call convergence between surname families. So this is the case when, here we go, you may have close STR matches showing a mix of surnames. And again, how can you approach this? I'm sure many of you have found this. Those of you who have lots of matches might find actually that there are some with your surname and lots that don't have your surname. Do we ignore the ones that have different surnames? I think we shouldn't. We need to ask questions about why they're in there, as with the Cochranes and Cooleys. The relationship there was clearly a deeper ancestral relationship, and the two sets of STR results have converged since then. And if you have lots of surnames in your STR matches, it may again be a matter of convergence, and SNP testing can sort this out. So I have an example here. This again is from my own family research. Those of you who've seen my talks in the last two years will be bored of hearing about the Kemps and the Cummings, but here they are again. This is from the matches of a relative of mine. His name is Kemp, and therefore there are lots of Kemp matches all the way down, but he was also picking up matches called Jacobs here and here.
And one here called Cummings, and other people in the Kemp group had even more Cummings matches. So Kemp, Cummings and Jacobs seemed to be co-occurring and looked as if they were a close and related group in some way. But the STRs can't tell us whether they are or not. So STR testing isn't able to sort these family groups out, but SNP testing can do that.
A little bit of information about this. This is the Kemp surname distribution: very particular parts of Ireland, Cavan mainly, Limerick and Cork. And then in the UK, it has a very sort of interesting distribution really: common in Scotland, common in the east and southeast of England, and then in Cornwall. Not really in between. So it almost looks as though it's got a coastal connection here. Certainly, there are probably surnames of different origins in the two parts of Scotland. There may be a name brought in from other parts of Europe, because Kemp is also found in the Netherlands and Denmark in particular. So there are variants of Kemp elsewhere in Europe. Here's the Cummings distribution. Again, two distributions in Scotland, a very west coast distribution and a very northeast distribution. And then Ireland again has a very particular pattern that will tell many of us that it's probably associated with Scottish-Irish settlements. So again, many of the Cummings that we find in our matches list do indeed have ancestry from the north, but most of them don't have any connection that they can prove to Scotland. So the Scottish connection is more hypothetical, and it could well be that actually they've adopted a Scottish name but have been in Ireland for much longer, and are therefore not necessarily Scottish just because the name they carry is of Scottish origin.
So how do we approach this, then? Well, we want to try and build trees from SNP results. And again, this is the Cummings STR list now, and unlike the Cochrane one I showed you, there are some very good clear patterns of branching here.
There's a nice shared marker here, and over here more shared markers. So we're able to build some interesting branches for the Cummings, which helped us find testers who then took a Big Y. And just to highlight these, to get through this quickly. And what it meant is, because they also have quite good information about their ancestry here, we could actually begin combining the STR information, the SNP information and the genealogical knowledge that these people had, to begin building trees from all three sources of information. So if you try and build a tree only from SNP data, what you end up with is this, the Cummings tree based on nine Big Y tests and one panel test. And again, it's a nice hierarchy for this branch here. They have a clear sense of branching over the past 300 years or so. But these people here just stretch right back to the original ancestor. And the Big Y testing hasn't been able to differentiate the branches for them. So we need to turn to other forms of information to see where we can find connections there. Certainly, the Big Y test hasn't been able to find the key SNPs that may help us build a branching there. Maybe other forms of NGS test will. But whoops, it's gone again.
This is a tree you can build from STRs. So this is using STR data. And actually you can see that this tree appears to be slightly more fine-grained than the SNP tree. But of course, we can't rely on this, because we don't necessarily know there hasn't been convergence in some of these STR branches. So we need the SNP tests to confirm and refine the branches which we're building with STRs. And once we have the SNP data, you then use that SNP data as the background or backbone to your tree. And then you begin to add in STR information at the ends of the lines. So the STR information can begin to refine what you have where, for example, there are very long branches, here and here, that don't seem to have much differentiation.
And so doing this, and I'll do this quickly because I did a longer version of this in Birmingham, you can begin to build a tree using SNPs, so these are all the SNPs in the branch appearing here, and STRs. So here we have a very common STR that's shared by all people in this branch and seems to have been quite stable for hundreds of years. And all the ones in purple then are STR markers, some of which correlate with certain SNPs and others of which follow on from certain SNPs. And then you can begin to add in information from the genealogy. And there are still questions to ask. So here we weren't quite sure, for example, where this person should fit in, because he shares one of these patterns and not the other. So he could go on either line here. And in the end, with these red dotted lines, this question was solved through further SNP testing and then genealogy, where another person came along, tested the Y, had more genealogy, and then we found that this person here was therefore connected here. There we go.
So this is possible for all large family lineages, if you have STR testing done already, if you have a degree of genealogical knowledge, and if you're able to do several Big Ys across the cluster. We have nine here, which is perhaps slightly more than we needed, but it has given us some very fine data to work with. And we did the same for the Kemps. This is the Kemp family, which I descend from on my maternal side. And as you see, these are all Kemps. There are no Cummings or Jacobs here. And then we worked on the Jacobs families. Well, the Jacobs family all descend from one known ancestor who was a migrant to Maryland in the 1660s. And so all of these people knew they descended from him from the STR testing, but didn't necessarily know how they descended from him. So we're beginning to get some refinement of these lines using STRs, and SNPs here in green, but we do need more of it. So we need to have more Big Ys.
Ideally we'd like to see one from each of these branches, which each descend from a son of the original founder. And then we begin to see to what extent these SNPs we've found here are shared or not shared, and how they help to build up a descent pattern for this Jacobs family. That's their own internal research. What I'm also interested in, though, is working out to what extent these three family groups were related, as the STRs suggested. And in fact, they're not. At least, it's rather like the Cooleys and the Cochranes. Their relationship goes right back to an SNP dated from roughly 250 BC to about 750 CE. So again, a Dark Age connection seems likely here. And what we also find is that the Cummings line here has an older common ancestor than the other two families. In fact, the shared SNP for these two families was appearing round about the time the Cummings family was beginning to break up, so that the most recent common ancestor for each of Kemp and Jacobs was round about the same time, probably roughly 1600 or so. And therefore we now see a separate tree for each of these three families and how they relate to each other. So this tree then is sorting, or the SNPs rather are sorting, the STR data we had before into a clear tree structure. And again, this is possible for your family history research as well. So those are three case studies, then. How are we doing for time, Maurice? Okay, right.
So once you've done your test and have... Okay, once you've done your test, you need to get the raw data and process it. And I've gone into this in more detail in the talk, again, in Birmingham in April. So if you're interested in that, go and look at that. So to save time, I'll jump on through this. I mentioned three types of tests at the very beginning. So again, see these as complementary. Ideally, try and find two members in your family cluster who will do an NGS test if they can afford it, like Big Y or the Full Genomes test.
And then sort your data into trees as I've done, showing the tree patterns of shared and single SNPs. But then you might want to consider retesting some of those key SNPs. Some of them may be a bit dubious or questionable, not quite clear. Maybe they appear in two or three people in your group and not in one person. You need to double-check whether that negative is a real negative, for example. And you can do this through single SNP testing. And then you can expand your research by creating panels of SNPs, which you can offer to other people that you suspect are related to you, to expand your tree. The main issue really is, if you want to get more out of this kind of research, you need to actively expand that tree yourself by finding testers who will help you expand the tree by testing more and more branches.
So just to finish, because I was asked to talk about the future. I'm actually not going to talk about this part. I'm going to talk about the future, go back to my last map here. This is a full flowchart anyway, which will be available. I can send this to anybody who wants to look at this, and it'll be on the recording of the talk. And just to end then, I was asked to talk a little bit about the prospects for the future. And I'm sure the people in this room will know better than I do what may be coming. Family Tree DNA are about to have a conference at which we may have announcements about the new shape of the Big Y test. So there are many challenges in SNP testing. The Big Y has inconsistent coverage, in both the depth of the coverage and how much of the Y chromosome it covers. The other test has issues over expense and the time it takes to turn the test around. And of course the two may be mutually contradictory. So you can speed the test process up by testing less, or you can test everything and make it slower and more expensive.
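As an illustration of the sorting step described above, separating SNPs shared by several testers from SNPs seen in only one, here is a minimal Python sketch. The tester labels and SNP names are made up for the example; they are not results from any of the projects discussed, and real work would start from the result files supplied by the testing company.

```python
from collections import defaultdict


def group_snps_by_carriers(results):
    """Group SNPs by the exact set of testers who carry them.

    `results` maps a tester label to the set of SNP names called
    positive for that tester (hypothetical input format).
    """
    groups = defaultdict(list)
    all_snps = sorted({snp for snps in results.values() for snp in snps})
    for snp in all_snps:
        carriers = frozenset(t for t, snps in results.items() if snp in snps)
        groups[carriers].append(snp)
    return dict(groups)


# Invented example data: three testers from one surname cluster.
results = {
    "tester_A": {"SNP1", "SNP2", "SNP4"},
    "tester_B": {"SNP1", "SNP2", "SNP5"},
    "tester_C": {"SNP1", "SNP3"},
}

groups = group_snps_by_carriers(results)

# Print the most widely shared SNPs first: these sit nearest the root.
for carriers, snps in sorted(groups.items(), key=lambda kv: -len(kv[0])):
    label = "private" if len(carriers) == 1 else "shared"
    print(label, sorted(carriers), snps)
```

SNPs carried by everyone would sit at the root of the cluster, SNPs carried by a subset suggest a branch, and SNPs seen in a single tester are the private ones that may only become useful once another tester turns out to share them. A negative that looks out of place is the kind of result worth retesting before the tree is redrawn.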
And there are questions over the reliability of the calls, and the tools to analyze results, which are not always as user-friendly as they could be. And we also know from NGS testing that there are something like 500 STRs which you can find in your Big Y results and your FGC results. So can the testing companies begin to leverage more of these additional STRs? At the moment, as you know, the largest STR panel is 111 markers, available from Family Tree DNA. Maybe if we have 500-marker panels we get much more fine branching right down through our family networks.
So Family Tree DNA then are going to announce soon, I believe, some changes to the Big Y. I've no idea what they are. These are only my idle speculations, but it could include some of these, or maybe we'd like it to include some of these. Generating the haplotree they have dynamically from results, rather than us having to contact them and say, please add this SNP to your tree. Can they do as YFull.com do with their database and grow a tree dynamically and automatically? Just giving you some of these. Can they increase the amount of the Y chromosome being targeted, so there's a wider coverage for the test? We'd really like this, because we do know that there are useful SNPs on parts of the Y chromosome not being covered. Can they increase the length of each fragment that they read, which means you can then capture more STRs and get more accurate reading of the SNPs they're finding? And of course we'd like better tools for understanding the results to be presented. And of course we want all of this to be for the same cost, or better still less. That's fine. We want all of this, we want it now. So that was fun to do. So we'll watch the announcements next month from Family Tree DNA. I'm sure they won't disappoint us on everything.
More long term, whole genome sequencing has started and people are taking WGS tests. So in other words, rather than target certain parts of one chromosome, we read the whole thing.
And what we call 4x and 10x, that is, tests which read each position an average of four times or read each position an average of 10 times, are already available at lower costs than the Big Y. But these resolutions are not enough for the kind of research we're doing, because we can't be sure that any SNPs discovered will be reliable enough to demonstrate [unclear] SNPs. We need a lot of retesting if we're relying on these resolutions. But when WGS tests with higher resolutions come along, they will certainly, I think, knock the single-chromosome tests out of the way. Now of course, there are huge challenges in building relational databases that can do the many-to-many comparisons. We're talking about three billion base pairs in each of our genomes, and we have to compare three billion with three billion every time a comparison's being done. And this is massive. I'm not sure if it can be done at a reasonable cost yet anyway. But of course in the future it may well be that the tests that people take will all be WGS. So you send your sample in and everything will get read. And then what we may find is ourselves buying bits of that back from the testing company. So if you want to know where your Y SNPs are, then you buy the bits that will give you your Y SNPs. If you want to know your autosomal DNA, again, you can buy that from them as well. So that might be a way in which we see this proceeding in an affordable way. The testing itself is not the expensive part. It's the maintenance of the infrastructure around that. That's the most expensive.
And looking long term, well, not very long term, it's coming: fourth generation sequencing, or different ways of sequencing the Y chromosome, to allow the sequencing to be done as the sequencer reads the sequence. At the moment NGS involves a kind of breaking up of your chromosome into a mosaic or a jigsaw of bits, which are then reassembled by very clever software.
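To put rough numbers on the earlier point about 4x and 10x coverage, here is a small Python sketch. It rests on assumptions that are not from the talk: that the number of reads landing on any one position follows a Poisson distribution around the average depth (real coverage is usually more uneven than that), and that a call needs at least three reads to be trusted, a threshold chosen purely for illustration.

```python
import math


def prob_fewer_than(k, mean_depth):
    """Probability that a position is covered by fewer than k reads,
    assuming read counts are Poisson-distributed with the given mean."""
    return sum(
        math.exp(-mean_depth) * mean_depth ** i / math.factorial(i)
        for i in range(k)
    )


MIN_READS = 3  # illustrative threshold, not a published standard

for depth in (4, 10, 30):
    share = prob_fewer_than(MIN_READS, depth)
    print(f"{depth}x: about {share:.2%} of positions have fewer than {MIN_READS} reads")
```

Under these assumptions, roughly a quarter of positions at 4x would have fewer than three reads, against well under one percent at 10x. That is one way to see why SNPs found at low average depth tend to need retesting before they are used to define branches.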
So future systems may be able to read as they go along, or also have longer read lengths, which allow capturing of all those STRs, which I'm not quite sure about, and more reliable capturing of SNPs, and also of all the difficult areas of the Y chromosome that can't be sequenced very easily. So this may mean some more accurate calling of some of the SNPs that we do have now that may not necessarily be 100% reliable. And this is actually already with us. So the Full Genomes Corporation are already running their whole genome long read pilot. And there's no information about this. Some projects have sent pilot tests to this. So it's at a kind of beta testing stage. But it is actually on the FGC website, and if anybody wants to go and buy it and has, wait for this, $2,750 to spare, then you can take this test. Of course, it will get cheaper and quicker. FGC haven't actually published the specifications of this test, but they're saying that the read length of the fragments is at least 1,000 base pairs, possibly longer. So this isn't the same as what I'm talking about here, but this will be a major step forward in sequencing testing, if only because those sequences which are being read will become much more reliable, once this becomes affordable enough to be mass marketed.
So that's a run through approaching Y SNP testing and the possible future of SNP testing. Of course, SNPs don't just appear on the Y chromosome. They're on every chromosome. They're what you test when you do the Family Finder test. They're what you test when you do a mitochondrial DNA test. And in the long run, I'd like to see an integrated test in which we send in a sample and everything is done, SNPs, STRs, Y chromosome, autosomal and mitochondrial, all done in a single test. Of course, cost is a thing, but the future, I think, is on its way. So thank you very much then, and is there time for questions?
Yeah, sure, we'll have... Thank you, John. Thank you very much.
We have time for one or two questions. We have one question here at the back. Is there another question then as well? Well, let's go to the one at the back first and then we'll reconvene.
Thank you, John. Could you go back to your display of example one, the table panel that you had in example one, which showed an STR sequence panel, with a number of digits displayed in the table, where you represent the GDs involved in your table?
Yes. Or something similar in example one, yeah. Example one. Example one, where it was my own results.
It was your Gorman and Leary table.
That's what I was thinking of, yeah.
Just in reading that series of letters or numbers, digits at the bottom, the double-digit marker, 19-23, how do you measure GDs in a double-digit cell? Where do you read the differences, between the first set of digits or the second set of digits, or both?
In the case...?
How do you count differences in the 19-23 in the middle of the table?
Yes, in the case of this one, you'd read both. So these double ones are multi-copy markers. The same marker exists in two places on the Y chromosome and they are different markers, but they can't be sure which one is in which position. And so they are separate markers, and you would read these as two separate markers. But there's one of them, it's not showing here, the so-called CDY marker, where I believe Family Tree DNA would read any difference in it as one. Is it here? Yeah, here it is, yeah. So if there was a 35, 35, that would be read as one difference, in this particular marker only. But with most of the multi-copy markers, you'd read each one as being one difference, yeah.
I have a question here from James.
Not so much a question, John, but, as you might expect, an observation. But thank you for such a comprehensive overview. Two points you didn't pick up that I expected you might do and might like to comment on.
First of all, the vast improvement that FTDNA made four or five months ago in the haplotree, which has meant that for Big Y test results and for the panel tests, they now have a haplotree that is up to date, and means that the value for money you get for these tests is vastly enhanced. The old fun and games of going through BAM data, I suspect, are now redundant. Not completely, but for many people, it's unnecessary. You might like to comment on that. The other one is specifically on the panel tests, which you didn't go into very much.
Yes, absolutely.
Of course, they don't identify new SNPs. They can only confirm SNPs that have already been identified by Big Y. But one surprising result I find is they identify the existence, though not the identity, of other SNPs, more SNPs, from negative results. [unclear] And that has turned out to be a vastly exciting area, and of particular interest. But the value of the panel test is much greater than I had anticipated. Again, you might like to comment on that.
Yeah, I agree absolutely. If you do a panel test on someone who is closely related to you and they have all your SNPs, then they will be quite closely connected to you and there are probably no more new SNPs to be found. If you do a panel test on someone who has your surname, who is probably a relative of yours, and they share only, say, half of your SNPs, the chances are there are more there to be found. So that person could be worth putting through an NGS test to find those SNPs. So yes, it's like a flashlight or a beacon that shows where there are gaps in knowledge and where new SNPs can be found. The first point about the haplotree, I agree absolutely. Family Tree DNA have made huge improvements to their support for Big Y this year. The improved haplotree is one thing; they're working very hard at getting all significant SNPs discovered by project administrators onto the haplotree.
So people can see an accurate read of where they are. They can actually find their terminal SNP now by tracking down through the haplotree. I think it does depend on how active the haplogroup project is. So R1b is very, very active, with a very excellent haplotree. I don't think it's quite so good for R1a, which I work within. I tend to recommend people to go to YFull.com, who are doing something similar. YFull.com are rather deficient on the R1b side of things, but they are the place to go to for R1a reference. So it's also horses for courses as well, James, yeah.
I defer to you. To give them credit, they...
Yeah. [unclear], yeah, that's great.
Great, well, we have to stop there unfortunately, because we've run out of time, but please show your appreciation for John. So we'll take a short break and then we'll come back for the O'Briens, the DNA of the O'Brien clan.