Okay, well, ladies and gentlemen, we come to the next presentation of the afternoon, and it gives me great pleasure to introduce my friend and colleague Debbie Kennett. Now, Debbie is an honorary research associate in the Department of Genetics, Evolution and Environment at University College London, a member of ISOGG, and co-founder of the ISOGG Wiki, for which we owe her a huge debt of gratitude. She is the administrator of the Cruwys DNA Project, the Devon DNA Project and the mitochondrial DNA Haplogroup U4 Project. She has written two books for the History Press, DNA and Social Networking and The Surnames Handbook, and her popular blog, Cruwys News, was originally set up to publish findings from her one-name study but is now focused on keeping up with all the latest developments from the world of genetic genealogy. Now, what is Debbie going to be talking to us about? Well, cousin-matching autosomal DNA tests first became available in 2009 and are now the most popular of the three tests used by genealogists. And thanks to the power of the large company databases, previously insoluble family history mysteries now have the potential to be solved. It is truly an exciting time to be a genetic genealogist. However, the interpretation of autosomal DNA results can be challenging, though new tools are being developed all the time to help with the process. So what can we expect in the years to come as we move into the whole genome sequencing era, where it's not just 750,000 markers that you'll have tested, it's 3 billion DNA markers? Debbie Kennett is going to tell us everything. Give her a warm welcome.

Can everyone hear me at the back there? Okay, now, I have made PDFs of this presentation available online. I've got lots of links which I've put up. So if you want to read more about any of the things I'm talking about, download the PDF and then you can explore all these topics in much greater detail. So everyone got that? Okay. I'm going to be focusing on autosomal DNA for genealogy.
And for those purposes, what we hope to have are matches in the database, and matches of the right type. Obviously, autosomal DNA can be used for medical purposes, but that's beyond the scope of what I'm talking about today. What I'm going to be looking at are five main areas. Databases and pricing, because that also affects what you get out of the test. Improvements in matching and interpretation of results that may be in the pipeline in the future. Reconstructing the genomes of our ancestors, which is a very exciting possibility in the future. The future of admixture testing, that's all those percentages you get from this and that country and continent. And what are the implications when whole genome sequencing becomes the norm, which is going to happen at some point in the future?

Before we go on to that, I'm just going to run through very quickly what we mean by autosomal DNA, just so that everyone knows what we're talking about. Within our cells, within the nucleus of the cell, we have these structures called chromosomes. We have 46 of these chromosomes, and we get one set from our mother and one set from our father. If you're a male, you have an X chromosome and a Y chromosome. If you're a female, you have two Xs. But the autosomes, the autosomal DNA, are those 22 other pairs of chromosomes. And they have a special property in that they do something called recombination. So this is the chromosome browser from Family Tree DNA. This is my son, and this is his comparison with me and my husband. He gets one entire chromosome from me and one entire chromosome from his father. And we can also follow the process going back one other generation. This is my son compared with his maternal grandparents. When you do this comparison, you can actually see how the segments are broken up. And you can see that, say, on that top chromosome, he's got one little bit from his maternal grandfather and one long bit from his grandmother.
And you also get some chromosomes at the bottom there, chromosome 18, where he's got the whole chromosome from one grandparent. So I think it's actually fascinating to be able to now watch this inheritance process in action by testing lots of your closest relatives. And that's one of the advantages of testing with Family Tree DNA. You actually have this chromosome browser where you can do all these comparisons with your close relatives. Once you get out to the third cousin level, you only end up with small segments in common. And once you get to, say, fourth or fifth cousins, if you share any DNA at all, it's probably only going to be one segment.

Okay, so databases and prices. I'm just stepping back in time a little bit. I first started getting involved in DNA testing back in the year 2007, and I ordered a Y-DNA test for my father. At that time, I didn't even know that it was possible to take an autosomal DNA test and find cousins in databases. I don't know that anyone had ever dreamed that this type of test would be possible. Just later on that year, 23andMe launched what was actually a health test, which was an autosomal DNA test. It cost $999. They were the first company to launch this idea of cousin matching, and that came in in late 2009. Family Tree DNA launched their Family Finder test the following year in February. I got both my parents tested, and at that time the launch price was $249. Since then, Ancestry entered the market in America. The prices started to come down with the increased competition, and then we had a period where the American price was $99 for everyone. But Ancestry have only actually been selling their test in Britain and Ireland since January. 23andMe have had all sorts of problems with the FDA in America. They had to stop offering their health reports at one time.
They've now reintroduced them, but they've now put their prices up. So in the US it's $199, but you can actually buy a separate ancestry test. For those of us in Europe, the prices are never the same as they are in America. 23andMe charge a huge amount for shipping. Ancestry also charge quite a lot for their shipping. So if you are actually looking to buy a test today, oops, sorry, I missed out. Yeah, sorry, let's start again, I'll show the prices later. So the prices are all very different depending on which company you buy from.

So this is actually my attempt to look at the growth in the autosomal DNA market. Back in the year 2000, there were just two companies, Oxford Ancestors and Family Tree DNA, and kit sales were measured in the hundreds. It's impossible to get exact figures because that data just isn't available. So it's more the shape of this graph that's of interest rather than the precision of the numbers that I've got there. The two key points are these. You can see around about 2005, that's when the Genographic Project launched, and they had sales of about 100,000 kits in the first year of the launch. And then we've had sort of gradual steady growth, and then 2013 is what is called the inflection point. That's when all the prices dropped to $99, and suddenly the testing really started to increase. It took 13 years to go from 0 to 1 million. It took one year to go from 1 million to 2 million. And you can see that curve now. My estimate at the moment is that about 5 million people have had their DNA tested with the big three companies and also with some of the other smaller companies. And you can just imagine, if that curve continues at that same rate, we will be at, say, 10 million in just a few years' time, then 20 million, 30 million, 100 million. We'll soon be at a point where more people have had their DNA tested than have not.
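As a rough illustration of the extrapolation described here, the following Python sketch estimates how long the database totals would take to reach the milestones mentioned. It is only a back-of-envelope calculation: the starting figure of 5 million is the speaker's own estimate, while the one-year doubling time is an assumption borrowed from the "1 million to 2 million in one year" remark, and the function name `years_to_reach` is made up for this example.

```python
import math

def years_to_reach(target, current=5_000_000, doubling_time_years=1.0):
    """Years for `current` to grow to `target` if the total keeps doubling.

    Both defaults are illustrative: 5 million is the estimate quoted in the
    talk, and a one-year doubling time is an assumption, not a measured rate.
    """
    if target <= current:
        return 0.0
    return doubling_time_years * math.log2(target / current)

if __name__ == "__main__":
    for target in (10_000_000, 20_000_000, 30_000_000, 100_000_000):
        years = years_to_reach(target)
        print(f"{target:>11,} people tested: roughly {years:.1f} more years")
```

A slower doubling time can be tried by passing, for example, `doubling_time_years=2.0`; the point, as in the talk, is the shape of the curve rather than the precise numbers.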
So that's actually quite exciting to think about, because the more people are in these databases, the more we all get out of these tests. The pricing of the... So the market's actually changed over the years. There have been all sorts of companies coming and going, and it's now really in a consolidated position where we've got four main companies and organisations. So they really dominate the market now. Family Tree DNA, they have a partnership with the Genographic Project, and the Genographic Project is deep ancestry testing, so they're not really relevant for genealogy. 23andMe, their test is primarily a health test. So for genealogy purposes, it's generally a choice between Family Tree DNA and Ancestry.

And the prices I mentioned, they vary... The shipping is one of the big factors to bear in mind. So those are the current prices in euros if you also factor in the amount for shipping. 23andMe, 169 euros, it's just... You just can't use a test that costs that much. Ancestry, they charge £20 for shipping. So whatever that is in euros at the moment, that's the actual price if you pay for a test, plus also the shipping charge. Whereas Family Tree DNA, they've just recently reduced their price to $79, and they charge a small amount for shipping, so that works out at about 88 euros. Each company has a different database, with a different composition. It depends on your goal for testing, but I would say that you need at least one person from your family in both the Family Finder database and the AncestryDNA database. And I tend to use Family Finder for testing extra family members because they've got the chromosome browser, and you can actually do all these fancy comparisons with people. But I think those prices that we're seeing there are actually now at the limit. I don't think they can get much lower than sort of 88 euros. The companies are all fighting for market share.
Once those databases reach a really good size, they've got no incentive to drop the prices anymore. So I think that it's going to be a bit like televisions, where the price practically stays the same rather than going up with inflation.

The other thing that we're seeing now is a growth in third-party databases. And some of these are provided free of charge. There's something called DNA.Land, which is a very interesting new initiative from some scientists in New York at the New York Genome Center. Wherever you've tested, you can upload your results to this database and you can get some free reports. But also your DNA can be used in scientific research. So the more people that contribute to that, the more the scientists can learn. There's a very good website called GEDmatch where, whichever company you've tested with, you can upload your results and you can do comparisons. And now some of the websites that offer family tree software, where you can share your family tree, are trying to join the party as well. So MyHeritage have launched a matching service, and Geni.com have launched a matching service. So I think what we're going to start seeing is that the actual cost of the testing is not important. It's the features that the companies provide, and the databases, and everyone is trying to fight for a share of that market at the moment. So it will be interesting to see how that develops. And we've also got a number of websites where you can put your family tree but where you can't actually add your DNA. FamilySearch did announce that they were going to allow Family Tree DNA Y-DNA and mtDNA results to be uploaded. And that was last year. Nothing has come of that. I don't know why that hasn't happened. WikiTree is quite a nice site where you can upload your autosomal DNA results, Y-DNA and mitochondrial DNA results.
And then there's a whole myriad of other websites, like Genes Reunited and Geneanet, which you can use for your family tree but where DNA results aren't integrated. Now, one thing that seems insane to me is that we've got lots of people working on their family trees. Everyone's uploading their family trees, but it's on different websites and no one is working together. So my vision for the future is that we have one integrated family tree where everyone is working together. Think of Wikipedia. If you want an encyclopedia, where do you go for a first look? You go to Wikipedia. If you want to look at someone's family tree, you've got to navigate all sorts of different websites. For some of them you have to have a subscription to access the trees. We want one worldwide family tree website where everyone can put their results, where it's completely free and where you could also upload your DNA. So we just need someone to come up with this solution. I don't know who it's going to be. None of these sites at the moment are up to the job; none of them can actually do what you can do with your own family tree software at home. But that will come in the future, where we have our family tree and our DNA in the cloud online and everyone's working together on one big family tree.

Now, DNA testing is as much about family trees as it is about the actual DNA matches, and our family trees are still a big limiting factor with autosomal DNA. This is my family tree, and you can see I've got rather a lot of gaps. I've got a number of ancestors who were illegitimate and I don't know who the fathers were. That's always a problem.
I may never ever know the answer to that. But what is exciting now is that we've got a lot of the companies, Ancestry and Findmypast, digitizing all sorts of material, making records available online and indexing it. One day in the very distant future we may find that all the world's records have finally been digitized and indexed, and we're all able to complete our family trees as much as is actually possible, and there are no other records hidden away in parish chests or under someone's bed that haven't yet been discovered. So one day, perhaps in 10,000 years, that might be the reality.

Now, improvements in matching and interpretation of results. Whatever happens in the future, we are limited by our DNA itself. These are some theoretical probabilities of sharing DNA with a cousin. Up to the second cousin level, everyone will share DNA with a cousin. If you test your first cousin or your second cousin and you don't have a match, then you need to start asking some questions of your parents or of his or her parents. But beyond that level, as the relationship becomes more distant, the chances of actually sharing DNA with a cousin start to diminish, and once you get out to the sixth cousin level, about 90% of the time you're not going to match with a sixth cousin. But of course, when you take one of these tests, we have so many of these fifth, sixth, seventh and eighth and more distant cousins that you end up with hundreds and thousands of matches, but generally you can't find any connection with those very, very distant cousins. So if it's so difficult to get matches with our recent cousins, the other possibility, of course, is that we start digging up our ancestors. We had an excellent example of that, if you were here yesterday, in the Earls of Barrymore project, where they have done just that, although they were actually trying to get Y-DNA in that case.
But that is the first known case where we have a genealogist who's trying to do that. So this is really groundbreaking research. But as for whether or not that's going to happen in the future in a lot of cases, I can imagine that the churches are not going to be very happy with lots of crazy genealogists approaching them on an almost daily basis, pleading: please, are my ancestors buried in your churchyard? Can we exhume the body? I desperately need to have his DNA. I don't think that's going to happen. It's probably only going to be some very high-profile cases, and even then it may not be possible. The Queen will not give permission for the Princes in the Tower to be exhumed from Westminster Abbey because she's worried about setting a precedent for people digging up their ancestors. So that's an interesting possibility. But also, the way the testing is done, there is no lab at the moment that offers commercial ancient DNA testing for genealogists, and I think there's a great marketing opportunity there for some enterprising person to set up an ancient DNA lab for all these genealogists who have managed to get permission to have their ancestors dug up and tested. And there are also things like testing hair and teeth, if you've got a... and all that sort of thing. There's no easy way of getting that done at the moment. So that's something we could well see in the future. But if you were able to dig up your ancestor, what are the chances that you would actually have an autosomal DNA match with that person? And again, the chances are not actually that good once you go too far back in time.
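To put rough numbers on that dilution, here is a small Python sketch. It rests on the standard expectation that, on average, a person inherits half of each parent's autosomal DNA, so the average share from any one ancestor halves with every generation. These are averages only: because DNA is passed on in chunks through recombination, the real amount from a distant ancestor varies and can be zero. The function names are invented for this illustration.

```python
def generations_back(great_count):
    """Generations between you and an Nth great-grandparent.

    A grandparent (great_count=0) is 2 generations back, a great-grandparent
    (great_count=1) is 3, and so on.
    """
    return great_count + 2

def expected_share(generations):
    """Average fraction of your autosomal DNA from ONE ancestor at that depth."""
    return 0.5 ** generations

def ancestor_slots(generations):
    """Number of places in the pedigree at that depth (not all need be distinct people)."""
    return 2 ** generations

if __name__ == "__main__":
    for greats in (0, 2, 6, 7, 14):
        g = generations_back(greats)
        print(
            f"{greats:>2}x great-grandparent: {g:>2} generations back, "
            f"{ancestor_slots(g):>6,} pedigree slots, "
            f"average share {expected_share(g):.4%}"
        )
```

On these averages a sixth great-grandparent contributes about 1/256 of your autosomal DNA, while a 14th great-grandparent is one of 65,536 pedigree slots, which is why so many ancestors at that depth end up contributing nothing detectable.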
So going back to at least, well, your sixth great-grandparent, you know, the chances are pretty good you're going to have some DNA, perhaps with a seventh great-grandparent. But once you get too far back in time, you just get to a point where, for most of your ancestors, you don't actually have any DNA from them. It's a rather unusual situation: you wouldn't be here if they hadn't passed their DNA on to that next generation, but it's all gradually been diluted over time. So if you are going to start thinking about digging up an ancestor, make sure you dig up a more recent ancestor, not your 14th great-grandparent. At least you have some chance of success then.

Now, another limitation of the current tests is that some of the matches that we're given by the companies are not real matches. They're false positive matches. And on the other hand, we also have false negative matches: people you match but who don't actually show up in your match list. So generally, if you have a long segment in common, that's likely to be a true segment, but if it's a short segment, those are much more likely to be false positive matches, and that's particularly the case for segments under about 15 cM. The reason for these false positive matches is that the data is not phased, and I'll just explain now what phasing is. If you were to download your raw data, you would see this sort of random collection of all these As, Cs, Ts and Gs, but it's not in a particular order. The first letters aren't all the ones from your mum and the second letters aren't all the ones from your dad. It's in a completely random order. So that has to be sorted, so that you have all the letters from your mum on one side and all the letters from your dad on the other side. The best way of doing that is if you have both your parents tested, but none of the companies at the moment will allow us to phase our data with our parents. But what you can do is you can do very complicated
statistical analyses, and this is from a review paper published a few years ago where they tried to do that. So if you get a string of letters like that, for every position there are actually four possible solutions, but when you start comparing those letters with reference populations, then the possibilities start to drop. So, I've got a pointer here, yes, okay. So this combination here, that's never seen in any reference datasets, and the same with that sequence there. So gradually you get to a point where you can eliminate those combinations because they're just not seen in any population. Then they do all sorts of complicated Bayesian analyses, and you end up with a situation like this, where you've ruled out those two combinations, that one is 7% likely and that one is 93% likely. So this sort of technique is actually now getting very sophisticated, and they're using it on big datasets for medical purposes. So the UK Biobank, I'm part of that project, 500,000 people, and they're using this statistical phasing to do their medical analyses. Ancestry are currently the only company to do phasing, but I think this is something we're going to see in the future. I know Family Tree DNA are supposed to be considering it, and once you have phasing, that will really improve the accuracy of the smaller segment matches.

Now, another problem we have currently with our results is that sometimes you match loads and loads of people on the same segment, and we call these pile-up regions. What we would expect is to have all the matches distributed evenly across the genome, like that. But now that we've got these large databases, what we're actually seeing is that in some places you've got hundreds or sometimes thousands of people all matching in the same place. And you can't possibly have a thousand people all matching because they share the same fourth great-grandparents. It's just not possible. So what this actually means is that these people
are all matching, and it may be because they all share a segment that everyone shares because they're humans, it may be they share a segment because they're Europeans, or it may be there's a little group of people who all share the same segment because they're all from County Mayo. So we have to try and distinguish between the segments we share with a recent ancestor and the segments that lots and lots of people share because of the population structure. The frequency of the haplotype is actually one way that you can do that. If you only match one person on a segment, that's likely to be much more recent than a segment that you match loads and loads of people on.

Now, another problem with interpretation of results. If you match someone at, say, a second cousin, third cousin or fourth cousin sort of level, you can usually be pretty confident, if you've both got good family trees, that you can make the connection and the match is a solid match. But with the more distant matches it becomes very difficult. If you match someone and it's only on one segment, you may be a fourth cousin or you may be a 40th cousin, and we've got no way of telling just from the DNA information on its own at the moment. And this is actually a very interesting study done by some colleagues of mine at UCL, where they used computer modelling simulations. They actually modelled the historical population patterns and looked at what would happen to segments going back over 50 generations, which is something we can't do with our family trees, because no one has got a family tree that goes back 50 generations. But this gives you an idea of the distribution curve. The green bit here, that's 10 generations, so you can see that if you have a 20 cM segment, around about 60% of the time that is going to fall within 10 generations. But that means 40% of the time it's going to fall outside 10 generations, and even way back beyond 20 generations. And then once we get down to these very
small segments, only a very small percentage of those will fall within 10 generations. So we can use these sort of statistics as a guideline, but at the moment we've got no way of telling which are the recent segments and which are the ancient segments. But there is DNA.Land, which is the free website. They have an interesting system where they're assigning some segments as recent and some as ancient, and they're doing this based on the marker frequency within each segment. We don't really have much information at the moment as to how this is working out. I don't even have any matches at DNA.Land. But I think it's an interesting concept, and I'd like to see all the companies doing more of this in future and actually helping us to determine the actual age of the segment.

Reconstructing our ancestors. This is the bit where I think it gets really exciting: reconstructing the genomes of our ancestors. The basis for this is chromosome mapping. So here is my son, and this is a comparison with his maternal grandparents and a third cousin once removed. On the top here, the orange is his grandfather and the green is his grandmother, and you can see that this cousin matches on this segment here. Because we've tested both his grandparents, we know that he's got that segment from his grandfather. So that's the basis of chromosome mapping: being able to assign a segment to a specific ancestor. I'm limited. I only have three first cousins and none of them will test. Two of them won't even let anyone know where they're living these days, they're just paranoid about privacy. But Tim Janzen in America has tested more people in his close family than anyone else in the world. He must have tested several hundred family members now. He's very kindly shared with me his chromosome map here. This is his mother, and he can't actually fit all of the relatives that he's tested on this chromosome map, he's tested so many. But you can just get an idea of what is possible
because now he's actually mapped it out. So with his ancestor Paul Youngman, he knows loads and loads of segments that he's actually inherited from that particular ancestor, and the same with some of the other ones. So he can almost reconstruct Paul Youngman's whole genome here, just with all the testing that he's done. And this is another example. Kitty Cooper did a very nice blog post where she just looked at one particular ancestral couple, and she's tested lots of people who were descended from this couple. I'll just zoom in on the picture there, and you can see that all these people have little bits of DNA from this same ancestor. So you can imagine, if you test every single descendant from a known ancestral couple, you will actually be able to reconstruct quite a sizable part of their genome. What you really want is an ancestor who had lots and lots of children. The more children the better, and the better chance you have of reconstructing the genome.

And once you've started to reconstruct the genome, you can actually make inferences about your ancestors, because those segments of DNA contain genes, they contain markers, and genes are associated with things like family traits. So things like eye colour, height, hair colour and all sorts of things like that. That opens up all sorts of exciting possibilities. We could possibly work out what our ancestors looked like if we can test enough of their genome. This sort of research is still in its infancy at the moment. DNA.Land has just started introducing these trait reports. They're trying to predict eye colour. There's only a couple of studies that they've used, and my eye colour came out as blue, but I have green eyes. But the studies they had only reported blue and brown anyway, so you can only get blue and brown out of this particular report at the moment. But they also give you a nice little chromosome browser there, and they show you the segments of DNA where those markers
are, so if you've made a chromosome map, you can then go back and look and see if you've got those markers and which ancestors actually gave you those particular segments of DNA. And they even give you the different markers there and tell you which ones are likely to... they all seem to have different effects. A lot of genetics now seems to be much more complicated than anyone ever thought. There's not one single marker that gives you your eye colour or your hair colour. It's lots and lots of different markers, and it's much more complicated than anyone ever realised. But as we get more and more data, these sort of things should become much clearer. This is GEDmatch, the other free site. They will also predict your eye colour. They're not so far off. They give me a sort of grey-greeny colour rather than pure green, but they actually test more markers than the DNA.Land one. 23andMe give you all sorts of trait reports, so this just gives you an idea of the sort of thing that we may be able to get. So hair curl, you just have to look at me, and I've got very curly hair anyway. Eye colour, they still think my eye colour is blue, not green. And things like lactose intolerance. I have an interesting trait, the norovirus, which is the...
if you go on a cruise ship, it's a sickness bug there, and I'm supposedly resistant to that. So it'd be nice to actually try and trace back and find out which ancestor that came from, although cruise ships wouldn't have been around in their day. So these are some of the things that we may be able to deduce in the future.

So I can imagine a future where we'll actually be able to reconstruct the faces of our ancestors, and this is something they're trying to do with forensics at the moment. There's a company in America called Parabon, and they have this test called the Snapshot test. If they have DNA from a suspect, they can actually plug it into their database and try and predict what the suspect might look like. And there's also a company called Human Longevity, which is Craig Venter's company in America, and they've got the largest database of whole genome sequences. They've done 10,000 people so far, and they're hoping to do 100,000 by the end of the year. They have done some pretty accurate reconstructions. They've done the testing on some of their employees, and it actually looks pretty accurate, from what I can see of the pictures that they've shown there. So that is actually quite an exciting thought, that one day, for those ancestors who at the moment are just bits of paper, we may be able to get some idea of what they actually looked like, all their different physical features, and how tall they were. And there is...
I don't know how many of you have heard of the People of the British Isles Project, which was able to find very fine population structure within... it's a misnomer, it wasn't the People of the British Isles, it's the People of the United Kingdom, excluding the Republic of Ireland, but there is also an Irish DNA Atlas project. So they've published the regional results, showing that you can distinguish between, say, people from Devon and Cornwall, and people from, say, North Wales and South Wales. But the other component of that project was looking at the faces of the people of Britain. A couple of years ago at the Royal Society Summer Science Exhibition they had the measuring machines there that they're using with the people, and they're taking all sorts of detailed measurements of face shapes, measuring the distances between the eyes, and they're trying to see if there are regional patterns in facial features, and also if there's any link between the different markers and the manifestation of particular features. Daniel Crouch gave a very interesting presentation at Genetic Genealogy Ireland a couple of years ago about the Faces of Britain project, but I haven't yet seen any results and I don't know when the paper's going to come out. But this is the sort of research that we need, so that we've got a baseline and we can actually do this ancestral reconstruction.

Admixture testing. I just wanted to...
didn't want to go into this in too much detail, but this is something that again has advanced enormously just in a few years. When I first tested with Family Tree DNA, with the Population Finder test, this is what I got: 100% Europe, and it was divided into Europe and Western Europe, which was not really very helpful at all. Then in 2014 we got the myOrigins test, and now I am 57% British Isles, and the rest of it is all sorts of different regions of Europe. But again, it's not really very helpful to me. I already know I'm from the British Isles, so I don't really need a DNA test to tell me that, and it actually tells me less than I already know, because it comes up with all these other regions that have nothing to do with my genealogy at all.

But what we are going to see is much more fine-scale resolution. We are expecting an update to Family Tree DNA's myOrigins test. AncestryDNA are supposed to be coming out with a feature that will give a much more regional breakdown. They're presenting in America at the American Society of Human Genetics meeting this weekend. 23andMe, I'm sure, will have other things in the pipeline as well. And we've also got all these third-party tools: GEDmatch, DNA.Land, and MyHeritage is supposed to be coming out with some type of admixture test, and there's also the Geno 2.0 Next Generation test. But one exciting new development is the Living DNA test, which is using the data from the People of the British Isles dataset, and this is the first time anyone is giving regional breakdowns within Britain. So you can actually get your percentages of ancestry from different regions of the British Isles, so the percentages from Devon, South Wales, North Wales, Northumbria and so on. No one at the moment has had any DNA results from this test. It's still very, very early days, but they are actually exhibiting downstairs as well. But this is where the admixture tests will be going in the future, and we'll actually have much finer
geographical resolution, but this is only made possible by very careful sampling. With all of the publicly available data sets, they get people and all they find out about their ancestry is that they're from England, or they're from Ireland, or they're from France, whereas in the PoBI project they've got people with four grandparents all born in the same region, and that, as well as all the advanced analytical tools, is what gives you this very fine resolution. Right, the final thing I wanted to cover was whole genome sequencing, and I'm going to turn to my crystal ball as to what the implications of that might be in the future. At the moment what we're using are microarray chips, and Rebekah Canada has done a very nice blog post where she's actually mapped the SNPs that are used on these microarray chips and looked at the distribution. You can see here that the white areas are parts of the genome where there is no coverage at all, then there are the cold areas where there are just very, very few SNPs, and then you've got a few hot spots in red and yellow where there are lots and lots of SNPs. So this is giving us a sort of scattering of SNPs from across the genome, but there are also areas that aren't covered at all. And when you're trying to match segments, if you've got bits missing where there aren't SNPs on the chip, it may be that it's giving us false positive matches, because the segment will look longer than it actually is, and it may be that there are other SNPs breaking up the sequence which we don't know about. Now, if we move to whole genome sequencing, here's a comparison between the SNP chips and whole genome sequencing. The chips currently used for genetic genealogy target between 300,000 and 1 million known SNPs; those are SNPs that are already known to researchers and that have a frequency in the population of 5% or more. Now there are newer chips available from Illumina that will test up to 5 million SNPs, and
they give you minor allele frequencies down to 1%. You can add extra Y-DNA and mtDNA SNPs to those if you want. But whole genome sequencing allows the whole genome to be sequenced at all 3 billion positions, and that gives us all the rare variants, and will also give us the potential of finding SNPs that are specific to our particular families. So in future what we can expect is to have our whole genome sequenced, and that will just be one test: you'll get your Y chromosome sequence, you'll have your Y SNPs, you'll have your Y STRs, you'll get your full mitochondrial sequence, all in one test. At the moment the cost is prohibitive, it's not something that everyone can afford, but that is where we are going in the future. But what are the implications for relationship estimates? There was one study done a couple of years ago where they looked at this, and in actual fact the results were rather disappointing. They found that they could estimate 100% of second cousins, and then 55% of relationships from the 9th through to the 11th degree, that's 4th to 5th cousins. But we can pretty well do that already with the microarray chips, so what they're saying is that it gives us an improvement of between 5 and 15% in the accuracy of the test. So it's perhaps not surprising that no one's trying to push whole genome sequencing for cousin matching at the moment. But there are other things that you can do with the whole genome data. In a very interesting paper published early this year, they mined the data set of the 1000 Genomes Project and looked at very rare genetic variants, and this is what they said: that very rare SNPs should become important genetic markers for familial relationships and also for population stratification. They found that each person has 40 to 100 new mutations unique to him or her that aren't in the DNA of either of his or her parents. They then also looked at the population level, and each individual has
between, if you're of European or Asian heritage, 14,000 and 40,000 of these rare variants that are unique to you, and Africans actually have a much, much higher level. So what you can do is actually compare the number of rare variants you share with another person. It may be that you share 1% of your rare variants with one person and 90% with another person, so you can actually use that as a measure of relatedness, and that's what they tried to do in this paper, and they were able to detect relationships just using the rare variants. There was another very interesting study, an ancient DNA study, where they were looking at some ancient DNA samples from the Iron Age and also from the Anglo-Saxon age. They developed a very interesting technique called rarecoal, where they utilized these rare variants and looked at the differences between the ancient genomes and the living genomes based on the number of shared rare variants, and they were able to come up with new insights about the percentage of DNA that the English population shares with the Anglo-Saxons. So tools like that, I think, we're going to start seeing used for our autosomal genetic genealogy tests in the future. All these rare variants are apparently the result of a massive explosion in our population, accelerated explosive population growth just within the last 100 generations, in the last 300 years, and that's resulted in an excess of these rare genetic variants. So that's some fascinating data that's just waiting to be exploited, and there's a load of work there for all our citizen scientists to do. Something else that no one so far has really properly exploited is autosomal STRs. If you've had your Y-DNA tested, the markers that are tested there are STRs, and these are repeating motifs, so you get little sequences of DNA letters that are repeated three or four or five or six times. Now there was an organization called the Sorenson
Molecular Genealogy Foundation, and way back in 2003, before all the autosomal SNP tests were known, they did actually start to look at autosomal STR markers. They typed 120 STRs on a population of about 37,000 people, and they were then able to use the STRs, because autosomal DNA is passed on in big chunks, in big segments, so if you have one STR it's likely to be passed on in tandem with other STRs. So they actually picked up little motifs, in sets of three or five, and they were able to use those to make a family tree. This is the tree they constructed using these little motifs, and you can see that the family tree was broken down into little groups: you've got one particular signature here, then you've got a little pink signature here, then a little green signature here, and then a lot of them had this same blue signature here. So what I can envisage in the future is that we can use these STRs, perhaps in combination with the data we have from the segments, and we can use the rare variants as well, so that, in the same way that we're doing with the Y-DNA, we'll actually be able to work out our relationships in a much more precise way than we can at the moment. That's particularly the case for the more distant relationships. The tests at the moment are very, very good for the closer relations, up to about the fourth cousin level, but beyond that it becomes very difficult to know what to make of the results. Now this is very complicated territory, and there is something called the ancestral recombination graph. I don't profess to understand precisely what this means, but I will use the words of the scientists themselves to explain it. They say it's an elegant, simple, yet superbly rich data structure that describes the complete evolutionary history of a collection of genetic sequences drawn from individuals in one or more populations. It's like a family tree, only richer, because it not only defines the relationships among individuals but
also traces the histories of specific segments of DNA sequences. So if you were to replace your family tree with this ancestral recombination graph, you could tell exactly which pieces of your genome came from your eccentric great-grandmother and which pieces you share with your charming, intelligent and handsome third cousins. This is what it looks like; they published a very complicated paper full of all sorts of complicated mathematical formulae, but essentially this is like using DNA to reconstruct an autosomal tree of the human population. So you can imagine that one day they will have it all cracked, they will work out the recombination points, and you will just be able to look at the DNA and say, right, this is where you fit on the tree. But I think that's a long way ahead; it's computationally very, very intensive, but it's an interesting prospect. So if you want to have your whole genome sequenced, you can, but it's going to cost you a fair bit of money; it's just under $1,000. Veritas Genetics is a [unclear] company; they're offering it to participants in the Personal Genome Project. There's a company called Sure Genomics. There's another company called Guardiome; they specialize in privacy, and they will send you your DNA on a secure hard drive if you're worried about people accessing your data. And we have got quite a few genetic genealogists who've been using Full Genomes for whole genome sequencing; they offer a range of products, starting with one that's $895, and you can actually have your genome sequenced at low coverage, low resolution, but it's not really worth doing that. In fact, just yesterday I noticed a posting: there's also a company in Australia called Genomics who have just launched whole genome sequencing for $200, except you have to sign up and they don't say when they're going to launch any results. I think the sequencing is actually part of a university in Australia. So the
cost, I think... it probably won't be long before we get to the point where it's $100 for a whole genome test, and then the question is what people actually do with the results after that. This is what the machines look like at the moment. There are also advances in the technology for doing the sequencing. At the moment Illumina dominate the market; they're the company that produce all the chips that we use for our DNA tests, and this is the new product that they launched, called the HiSeq X Ten, which is actually ten machines that all work together, and these machines are really about the size of fridges. We had a visit to the Sanger Institute a few years ago, and that's what the lab looks like; you can actually see the sequencers all churning away in the background, processing data, working 24 hours a day to produce thousands and thousands of sequences every year. Now, the way that this technology works at the moment is that they can't sequence the whole genome in one go, so it breaks it down into little segments, at the moment something like 150 to 250 base pairs, and then it has to read each one 30 times, 50 times. If you read it enough times you end up getting all these little jigsaw pieces, and you have to map them against the reference genome and reassemble them. So it seems like an absolutely crazy, complicated process, but somehow it seems to work. But now we have new advances, and there's another, competing technology called nanopore sequencing. This is from a British company, Oxford Nanopore, and they have this new sequencer called the MinION; you can see it's a tiny little thing that you can hold in your hand. This has the benefit that, rather than chopping up the DNA into fragments, it sequences each base in turn, so it gets very, very long reads, of up to 100,000 or more bases. So this will actually allow us to get into some of those deep, dark regions of the genome that no one can currently sequence;
although we have managed to sequence the whole genome, there are still areas that can't be reached with current technology, and that's particularly the case for the Y chromosome, where half of it can't even be sequenced at the moment. They also have a desktop version called the PromethION. And this is the one that I really like: it's called the SmidgION, and I just want to have one of these. They showed them for the first time at the American Society of Human Genetics meeting in America. It literally is this little bit here, and you use your phone; you can actually do the sequencing on this little device here. So in a few years' time, when you come back to Back to Our Past, we will have these little machines and we'll be able to sequence people on the spot and give you your matches immediately, as soon as you test. I'm sure that's the way that it's going to go. But as for the cost of these, at the moment they're almost giving away the machines; it's a bit like a printer, where they give away the machine but it's all the consumables that cost a lot of money. So it's several thousand pounds, several thousand dollars, just to get one sequence out of one of these. Also, the accuracy is not that brilliant at the moment: it's about 90 percent for the MinIONs, whereas it's about 99 percent with next generation sequencing. The other major challenge, of course, is that even if we can do the sequencing, we also have to interpret the data, and this is where I think the challenges are still to come, especially if we're looking at doing cousin matching with whole genome sequences. Imagine you've got a whole genome sequence and you've got to compare it with a database in the future of, say, 10 million or 20 million people; that's 20 million individual comparisons, so the processing power is just quite incredible. And for one whole genome sequence, and we've had people who've had their genome sequenced, you're talking about a file size of 50 to 100 gigabytes, and my
first... when I had my first desktop computer, the hard drive, which I thought was wonderful, had three gigabytes; it wouldn't even hold my whole genome sequence if I were to have it done now. So I think that's a major challenge in the years to come, and I think there's plenty of opportunity for citizen scientists to actually do something and help to interpret this data, especially as we've seen with the Y chromosome how people have been really, really good at interpreting the next generation sequencing data. So where does this leave us in the future? What is this vision of the future? There will be a point in the future when all the world's records have been digitized and transcribed and the world's autosomal DNA trees have been completed. We will have this ancestral recombination graph, and we will have a single integrated genealogical and genetic world family tree. At that point babies will be sequenced at birth and their details will be recorded on this global family tree. We would have to make sure that people also then record the details of marriages and deaths and so on. But all these developments are going to have a somewhat unforeseen consequence, because there will no longer be any need for genealogists. So do we want this brave new future of whole genome sequencing? But I think it's going to be a long way in the future. Okay, we have one from... let's go right here at the back. Let me bring down the microphone; you can just pass that one. Yeah. Earlier in your presentation you were talking about phasing, and about it not really being offered by the commercial companies. Yeah, they're just not doing phasing at the moment. I am on Family Tree DNA, and there's a breakdown of matches between paternal and maternal, as I have had both my mother and father tested. Right. Is that a kind of crude way of phasing, or is it just...? It's not actually phasing in terms of phasing individual letters; it's just assigning matches based on testing both your parents, so it is a sort of proto-phasing;
it's not true phasing, and they also don't go down to the very small segments; they have a cut-off, I think it's nine cM, for the Family Matching tool. Phasing is more of relevance to the very small segment matches, so all the larger matches that you're getting from that are easier to deal with anyway. So basically it's just "in common with"? It's a bit more complicated than that. With parents it is effectively "in common with", but they also sort matches based on matches with cousins, and in that case they match on segments as well. Thank you. One last question? Sure. All right. Okay, I need to... the very beginning... put it up there, and there's the URL. Great, yeah, fine. Thank you very much for that, Debbie, and please give Debbie a warm thank you for a wonderful presentation. [unclear] which will be on World War One and identifying the soldiers in World War One. [unclear] for Debbie Kennett, speaking at Back to Our Past. Thank you very much. That's great. Nope, it's still in here.