speaker on stage. The talk will be about protecting the wild, and I'll hand over to her. Please give her a warm round of applause.
Thank you very much for the introduction. My name is Jutta Buschbaum. I'm an evolutionary biologist; that is my background. I did my PhD at the University of Chicago, working on little fungi that live in symbiosis with algae and form colorful crusts on rocks. I did a postdoc in bioinformatics and after that moved back into organismal biology, working in forest genetics. In the ten years I worked in forest genetics, I encountered for the first time questions with regard to application, and I found out that moving from research to application is actually not trivial. So what I'm going to present is a high-tech way of using genomic data to protect biodiversity, in a way that you can actually reach application and use conservation genomic tools. This summer the draft of the report of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services came out, and its results were quite alarming. It stated that around a million animal and plant species are currently threatened, and about half of those species are already "dead species walking": due to the destruction or deterioration of their habitats, they are no longer able to reproduce in a sustainable way. A third of the total species extinction risk to date has arisen in the last 25 years. Just to give you an idea of the relation we are talking about: the current rate of extinction is already at least tens to hundreds of times higher than it has averaged over the past 10 million years, and within these 10 million years there were, for example, the ice ages. Most of the extinction risk is due to land and sea use change.
The report even says that we already seem to have transgressed a proposed precautionary planetary boundary. Within the boundary we have a stable biological system; having transgressed it, we might already be in transition to a new state, and we have no way of finding out what that state is going to look like. So all of these facts that the report states are pretty negative, and I was quite happy to read that it also presents people who do better than most of us. It points out that many practices of indigenous peoples and local communities conserve and sustain wild and domesticated biodiversity quite well. Today a high proportion of the remaining terrestrial biodiversity lies in areas managed and held by indigenous peoples, and these ecosystems are more intact and declining less rapidly. So we have examples of lifestyles that actually do better than most of us. I know the solutions won't be simple and it won't be easy to get there, but we can look to what these people do better than we do.
It's a global report, and all of this sounds kind of far away, probably somewhere in the tropics, but threats to biodiversity also happen directly in front of our own front doors. This summer a paper came out from two colleagues from the University of Greifswald who had analyzed a long-term data set on leaf beetles. They were asking whether we already have a decline of leaf beetles in Central Europe, so they compiled a long-term data set of leaf beetle observations for Central Europe from 1900 to 2017, spanning about 120 years. What they find is that systematic reports and observations of leaf beetles are increasing over this time span. But despite the fact that in the last two decades we had very high numbers of reports and observations for leaf beetles, the number of species, the orange line, is declining. It's declining only slightly, so the question is whether this is real or not. What was most worrisome to the authors is that in the data set the number of species, here in orange, that had more reports than before was declining, while the number of species that showed fewer reports than before was expanding. These kinds of long-term data sets are very hard to interpret, many factors can contribute to those patterns, and it's not clear whether this pattern is statistically significant. But take a step back and consider your background knowledge, your prior knowledge about the state of the world: how does the current state look, rather good or rather worrisome? And then, with that knowledge, tell me that these results are an artifact or a bias. I am worried that once we have a statistically significant signal in this data set, it will already be too late. So far I've been talking about leaf beetles. Beetles are the largest group within insects, with about 400,000 species. Leaf beetles are a large family of
about 50,000 species, which are distributed worldwide, and here in Germany we have over 470 leaf beetle species. So how do we actually know how many species there are, and who actually counted all these species? That is the task of taxonomists. Taxonomy is the science of naming, defining (including circumscribing) and classifying groups of biological organisms on the basis of shared characters. One could have the picture of some woman with a funny hat running over a meadow catching butterflies, or some mushroom hunter crawling through the forest trying to find mushrooms. And it's true, as biodiversity scientists we spend a lot of time outdoors. On the other hand, taxonomy is a high-tech science today. Taxonomists take up new technological tools and developments to help them identify, describe and understand species, so taxonomists are often experts in, for example, microscopy, morphometrics, biochemistry, even proteomics and genomics. Throughout the talk I'm going to compile a list of the people and experts we are going to need to protect biodiversity if we want to do this on the basis of genomic data. Right now the list is quite empty, the first entry is taxonomists, but that will change quickly. Taxonomists are mostly a subgroup of evolutionary biologists.
I told you that taxonomists and biodiversity scientists take up technology, and so as soon as computers came about and the internet started, people began to use them to compile information about species. Today we have several global resources available at the species level and above. Biodiversity scientists were among the first to define biodiversity information standards. We have a global Catalogue of Life, a list of all named species. The Global Biodiversity Information Facility aims to bring together information from different sources, and they are producing these wonderful maps.
This shows all the records of leaf beetles that we have in the world, and it looks as if leaf beetles are highly associated with world economics. However, that is clearly an artifact, and it just shows that we need many more taxonomists and biodiversity scientists all over the world to find and identify leaf beetles. So we also need biodiversity informaticians to help us compile global lists and distribute knowledge.
So far I've been talking about species, which is a simplification. The question is what species actually are, and so we need to talk about genetic diversity within and between species. I'm going to do so using gulls, which most of us might know. Here in Europe we have two large gulls of the genus Larus: in the front, the lighter grey one, is our Silbermöwe, and in the back, the darker one, is our Heringsmöwe. I'm going to use the German names because the English names go crosswise and it's completely confusing, so I will stick with the German names. Here in Europe these two species seem to be really fine species because they barely interbreed, so they don't hybridize. However, if you take a step back and look at the genus in general, you see that species of the genus are distributed kind of ring-wise around the Arctic. The idea is that during the ice age all of this area was glaciated and the gulls retreated to a refugium here near the Caspian Sea. After the ice retreated, the gulls moved back north. One branch moved into Europe, forming our Heringsmöwe, and another branch moved counterclockwise around the Arctic, producing different morphotypes, different species, crossed the Bering Strait and then moved into North America. There the dark blue one is, I'm simplifying, the equivalent of our European Silbermöwe, the American Silbermöwe. The idea is then that some individuals crossed back to Europe and formed our European Silbermöwe. And while all of these species here are interbreeding, so
they hybridize, where this ring closes those two species don't interbreed anymore. The big question is: are we actually dealing with one single species, or are we dealing with different species that just happen to hybridize more or less? The question is not trivial because it has consequences for protection. If we are dealing with one single species, all the gulls in Eurasia could go extinct and it wouldn't matter, because we would still have the gulls in North America. However, if we have different species in all these areas, we would need to protect them on a regional level and protect all of these different species. To investigate whether we have different species, and what evolutionary processes and histories brought them about, a group of scientists used DNA sequences. Here on the left you have the theoretical model of the ring species, and here on the right you have reality, and the scientists found that reality is always much more complex. For example, they proposed two refugia. But what they found was that genetic diversity was correlated with those species or morphotypes, which also means that genetic diversity is correlated with geographic origin. From these types of analyses we learn about evolutionary processes and history, about variability and differentiation, about gene flow and migration, and about speciation processes, all of which we need in order to understand our species, and that will allow us to protect them. So we need evolutionary biologists who do phylogenetics and population genetics.
Once it was found that genetic diversity can be used to infer geographic origin, because genetic diversity is correlated with geography, people immediately said: okay, we can use that for conservation applications. We also learned that it is often unclear what a species is, species boundaries are unclear, and some
species have huge distribution ranges with different clusters of variability within this huge range. So we know that we need to protect within-species genetic diversity, which means that we need to understand within-species population structure, and we need to build useful and reliable models of population structure. These models are required for all of our applications: for monitoring, for conservation strategies, for questions of functional adaptation and adaptability, the productivity of different provenances, the impact of management regimes, breeding strategies, and also for enforcement applications. From the studies on the gulls I showed you before, we also know that we need to approach the question of population structure on a distribution-range-wide scale. Here's a map produced by EUFORGEN, the European network for forest reproductive material, for one of our native oaks, sessile oak. The dots are the sites of genetic conservation units, and that is one strategy for representing and sampling within-species genetic diversity. This is a hypothetical example, but at this scale we will likely see a gradient from west to east. Once we have these kinds of global data sets, we can go to the fine scale and, for example, do national genetic monitoring, and we will find much finer-scale gradients. Especially for forest trees we will also find outliers, forest stands that don't fit the usual pattern. That is because forest reproductive material has been moved around a lot, so these lighter or darker dots are material that was moved to Germany from the outside. We will only identify these outliers if we have the whole reference data set. If we don't have the whole reference data set, we might not identify these outlier stands with a different history, or in the worst case these outliers might actually bias our gradients. And we are always talking about
very slight gradients, so it's easy to bias these gradients or dilute them, so that we won't get the results we need. Compiling these kinds of reference data sets is a huge collaborative effort, because people need to go out into the field and collect the reference samples. That might be scientists, people from local communities, citizen scientists, managers and owners. It might be government officials, who provide background information, maps and distribution information, and who in many parts of the world might also protect the people who are actually collecting the samples. And it might be conservation activists and NGOs. Once the samples have been collected, they need to be stored somewhere for the long term and the information needs to be databased. That is the work of scientific collections, which are mostly at natural history museums. There the samples are processed, they are organized in a way that you can find them again, and all the metadata is entered, which is done by curators, collection managers, preparators and technical staff at the scientific collections.
Once we have these kinds of large-scale data sets, what are we actually doing with them? The foundation for all of our applications is population structure, and specifically population assignment. The process is that we first decide on a question and design our project accordingly, so that we can answer the question. Then we need to infer the population structure model and optimize it. In the next step we need to check whether our model is actually good enough for application, because we might have found the best model and it might still not be good enough for application. So we need to test that, and that is the step of population assignment or predictive assignment. In the end we want to test our hypothesis: are stands different, or does an individual come from stand A or from stand B? Here we identify error rates and accuracy. So this whole process is very statistical.
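As an illustration of the assignment step just described, here is a minimal sketch. It is not the method of any study mentioned in the talk. It assumes biallelic markers coded as 0, 1 or 2 copies of one allele, Hardy-Weinberg proportions within each stand, and independent loci. All function names and the toy data are hypothetical.

```python
import math


def allele_frequencies(genotypes, n_loci, pseudocount=0.5):
    """Estimate the frequency of the counted allele at each locus.

    genotypes: list of individuals, each a list of allele counts (0, 1, 2).
    The pseudocount keeps every estimate strictly between 0 and 1.
    """
    freqs = []
    for locus in range(n_loci):
        copies = sum(g[locus] for g in genotypes)
        freqs.append((copies + pseudocount) / (2 * len(genotypes) + 2 * pseudocount))
    return freqs


def log_likelihood(genotype, freqs):
    """Log-probability of one genotype under Hardy-Weinberg proportions."""
    total = 0.0
    for count, p in zip(genotype, freqs):
        q = 1.0 - p
        prob = (q * q, 2 * p * q, p * p)[count]
        total += math.log(prob)
    return total


def assign(genotype, reference):
    """Return the reference population with the highest log-likelihood."""
    scores = {
        pop: log_likelihood(genotype, allele_frequencies(gs, len(genotype)))
        for pop, gs in reference.items()
    }
    return max(scores, key=scores.get), scores


def leave_one_out_accuracy(reference):
    """Hold out each reference individual in turn and try to re-assign it."""
    correct = 0
    total = 0
    for pop, gs in reference.items():
        for i, g in enumerate(gs):
            held_out = dict(reference)
            held_out[pop] = gs[:i] + gs[i + 1:]
            predicted, _ = assign(g, held_out)
            correct += predicted == pop
            total += 1
    return correct / total


# Toy reference data: two stands, three individuals each, four markers.
reference = {
    "stand_A": [[2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1]],
    "stand_B": [[0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 1, 2]],
}

best, scores = assign([2, 2, 0, 0], reference)
print("assigned to:", best)
print("leave-one-out accuracy:", leave_one_out_accuracy(reference))
```

The leave-one-out loop plays the role of the "is the model good enough for application" check: it estimates how often a sample of known origin is assigned back to its own stand. Real reference data sets would need many more samples and markers and a more careful error model.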
And so the analysis of these reference data needs to be accompanied by biostatisticians who can tell us how to analyze our data. So what is the state of the art right now? What kind of geographic resolution do we currently get for these non-model species? I'm going to present the example of an African timber tree species, which provides a very valuable timber. It's one example, but basically all results for species that have large distribution ranges, are continuously distributed and are also long-lived are very similar, so these kinds of results seem to be species-independent. The species are Milicia regia and Milicia excelsa, African teak, which cannot be grown in plantations for timber-quality wood, so it is harvested unsustainably from natural forests. It's distributed in West, Central and East Africa; here's a black rectangle. A group of a dozen scientists got together and sampled the reference data set for these two species. It's over 400 samples; they analyzed four marker systems, resulting in a total of something like a hundred genetic markers. Then they optimized their population model using different parameter settings, and we are going to concentrate here on the best solution that they found. Basically this rectangle here is the black one over there. They found population structure, clear clusters: the populations from West Africa can be distinguished from those in Central Africa, and the ones in East Africa can be differentiated as well. So that is really good: we have population structure, we know there is signal. The problem is that our resolution is still much lower than we would need. We basically need resolution at least on a country level, because most of the laws are national, so it might be legal to harvest a tree in one country but not in another. So we need to get our
resolution down to a country level, or even to a regional level if you want to distinguish whether the tree was harvested in a national park, a protected area, or outside in a managed forest. When we as biodiversity scientists don't know how to continue, one thing to do is to look at what people do with model organisms, and specifically at what people do in human population genomics. There are thousands of population geneticists working there, and there's a completely different funding background due to the interest of the medical and pharma industries, so they are always more advanced. What we can learn from human population genomics is that we need two features. One, which we already know, is distribution-wide sampling, which provides a spatial context. The second feature is genome-wide sequencing, preferably whole-genome sequencing, which provides us depth in time, because our genomes are archives of our evolutionary history; they are records of all the processes and events. This depth in time then also translates into resolution. Once we have these two features, these reference data sets open Pandora's box. Suddenly we can address all kinds of questions and objectives, even those that we don't know yet, and we can develop all kinds of applications, which is being done for humans. Currently there are at least four global data sets on human diversity, and these are very widely reused. They are big data, both with regard to the number of samples and with regard to the genomes or genome representations. This results in very information-rich data, which stimulates analytical development, so people are continuously developing new statistical methods, and right now a new wave of these methods is coming in. Once you have these global data sets, people in human population genomics start to do these intense
regional samplings, and this is the example of the United Kingdom Biobank. It's a project with 500,000 volunteers, all UK citizens from all over the islands, and each individual was genotyped in a wet lab for 820,000 markers. That's a completely different number than the hundreds or thousands in biodiversity science, where we normally analyze a maximum of a couple of tens of thousands of markers. But then statistical geneticists come, they do some weird and wonderful voodoo, and from these 820,000 markers that were produced in the lab they derive 96 million markers per genome, that is, per individual. So that's a hundredfold increase, and once you have this kind of data set per genome, you finally get country-level and within-country-level resolution. These panels are examples. The first panel shows individuals who were born in Edinburgh, and the question was where people with a similar ancestral, genetic background were born. What they found was that this was all over Scotland and Northern Ireland. North Yorkshire was even more local, so people from Yorkshire don't seem to get around a lot. For London the situation is completely different, which is what we would expect, because London is a people magnet. People move there all the time, they meet there, they have children, and the genetic ancestry of the kids born in London has nothing to do with London; it's from all over the British Isles and the world. That's why the colors are so strongly dissolved. This study also came out this summer, and it's the first time I have seen that we really can achieve regional resolution, and I find this possibility very exciting for biodiversity science. It was made possible by very sophisticated statistical approaches which are able to analyze genetic data from highly complex evolutionary and
ecological systems, and at the same time these analyses are able to handle big data; we are talking about gigabytes and terabytes of data and results. Statistical geneticists are developing new methods of data representation to handle this amount of data, and then we are able to extract sufficient signal for a very specific question from data which has a very low signal-to-noise ratio. To get there we need many experts and specialists. We need statistical geneticists and big data experts, who might also contribute machine learning expertise. We need molecular biologists who know how to sequence complex genomes. We need bioinformaticians with expertise in genomics for the assembly, annotation and alignment of genomic sequences. The result is actually this: this is the author list for the 1000 Genomes Project reference data set. I don't expect you to be able to read it, but the bold type is of interest, because it shows all the different tasks that are necessary to produce a standardized and highly cleaned reference data set. The whole author list is something like one and a half pages long, and even considering that some authors will have contributed to several tasks, the publications for reference data sets mostly have author lists of far over 50 people. So they are huge collaborative efforts.
Now we take the step into biodiversity science. Here are eight gastrotrichs. They are little worm-like organisms that live in the sediments of freshwater lakes and in marine sediments. They are in general a couple of hundred micrometers in size. I don't have any numbers, but my guess would be that worldwide maybe a hundred to a thousand people actually work on these species. There are 800 species of gastrotrichs, so let's say there are one to maybe three experts per species for these organisms. So how are these three people going to manage all these tasks to produce a reference data set? You might say, well,
it's gastrotrichs, I've never heard of them, maybe they are not so important, maybe we don't need the reference data sets. But actually some of those species are bioindicators for water quality. So what we observe right now is a gap for biodiversity conservation. In model organisms we have Pandora's box open; we have all the statistical analyses at hand to analyze our data sets. In non-model organisms, however, we are still stuck with summary statistics that don't provide the resolution we need. We know that closing this gap is a huge effort even for a single species, but at the same time we have over 35,000 species listed by CITES which already need effective protection now. So we need to find a way to close this gap and actually move in this direction. All of this is in the biodiversity sciences, in academia, and we need to make the transition over the conservation genomics gap into the big loop of real-world conservation tasks. The good thing is that we already know what we have to do. We need distribution-range-wide reference data sets, we need statistics, and it's going to be big data, so we need collection management, data management and an analysis environment. Looking at the different ingredients or steps: first, we need a general data infrastructure for global biodiversity reference data sets that can be used across species, preferably for as many species as possible, and that provides a working environment for biodiversity scientists and experts. It should be user-friendly, so that it can be used by scientists, but also so that people from local communities and citizen scientists can add their observation data to this infrastructure. I have listed quite a lot of features that these kinds of infrastructures should have, and I'm going to argue that these features are not some nice-to-have but actually some must
have, because our goal is always application. So we need developers, managers and curators for data infrastructures. Since our goal is application, our main features are quality control and error reduction. These are the basis for our conservation tools to be robustly and reliably applied under real-world operating conditions, and the way to achieve quality and error reduction is through chains of custody. That means that from project design, from the questions, through all the steps that are necessary to produce a reference data set, so from sample collection through genomics and statistical analysis down to application, these steps need to be documented and standardized, and each one of them needs to be validated and reproducible. They should be modular, so that they can be user-friendly, and the whole chain of custody needs to be scalable. If our chains of custody have these characteristics, we will actually have tools that work in everyday life. So we need professional developers and programmers who are able to produce this very collaborative software. We need free and open source experts, so that we can always ensure that our code and our infrastructures still have their integrity, and we can check them. I'm a biologist, I don't have any background in hardware, but I've heard a couple of talks here at the conference about green IT, and I have the feeling we should have people who know hardware and software and know how to develop these high-tech tools in a sustainable way, so that by developing these tools we don't use more resources than we are trying to protect. I've shown all these features and characteristics that the software should have, and I'm arguing that these features are necessary because of the reality we find ourselves in. It's one of rising overexploitation and destruction of nature. So the extent of
environmental crimes is up in the billions. All environmental crime together, the green bubbles, is second only to drug-associated crime, and it is up there with counterfeiting and human trafficking. So these are multi-billion enterprises, they are often transnational, and they are industries with huge profits. So if there's some mafia boss or criminal manager who just bribed a government official somewhere in some neck of the woods, it would only make sense that this person would not take the risk of being discovered just because some customs officer pulls out a container somewhere in a harbor, opens it and says: this looks kind of weird, let's take a sample and send it to a lab. And then a population geneticist comes back and says: oh yes, this sample is not from area A as documented, it's actually from area B, and it was illegally logged. So if we have information-rich reference data sets, they become highly valuable, and they need protection themselves against manipulation and destruction. So we will need to think about IT security from the beginning. These data sets are also often very politically sensitive, because if it is shown that there is repeated illegal logging in a certain country, that country might not be too excited about this information. So we need IT security experts.
My hope is that these kinds of very high-tech digital conservation tools can actually contribute to the UN Sustainable Development Goals by empowering indigenous peoples, local communities and also us to protect, enforce and sustainably use our lands and our biodiversity, by providing the management and law enforcement tools. So we need users from around the world who use these tools and help to develop them further and to maintain them. And finally, these high-tech tools will be just another technological fix if we don't manage to get our
way of life back down to sustainable levels. This year Earth Overshoot Day was at the end of July, so at the end of July we had used up all the resources that we had available for the whole year. We need to move this back towards the end of the year, so that our resources actually sustain us for the whole year. The graphic here for Germany suggests that we are on a good way: we are reducing our resource consumption, and maybe our biocapacity even moves up a little bit. So it seems that our personal lifestyles and choices make a difference, and we just need to close this gap here much quicker. Protecting biodiversity needs all of us to achieve that. And with that, thank you very much.
So thank you, Jutta, for this very interesting talk and the very valuable work you're doing. We have three mics here. Please line up at the microphones if you have any questions or suggestions or want to participate and work together with Jutta. We have one question from the internet, so please, Signal Angel, start.
Why are wild plant species within a genus further apart than wild animal species within a genus?
Could you repeat it, please?
Yeah: why are wild plant species within a genus further apart than wild animal species within a genus?
I'm not sure I understand the background of the question.
Because animals move and plants don't move.
Oh, okay. If that is the idea behind the question: plants actually move too. They don't move as individuals, but they move their genetic material through pollen or fragments, so diversity in plants and in animals can be quite similar. The idea that plants are just stuck and should have a completely different population structure does not hold, because plants move their genetic material around through seeds, through pollen, through vegetative propagules.
So thank you, microphone one, for helping out. Please ask your question.
So my question is a bit about the success
factor. If we think of this database being set up, I think it's going to be a huge database. I downloaded my own genome from the internet; it was about 150 megabytes. I think the genetic variation from one person to another is only about one percent, so we can compress that to four megabytes per person. If we sequenced all the humans in the world, that would be 32 petabytes, and that would cost approximately 15 billion dollars, and that's only for the storage. Then comes the entire management. Of course we don't want to digitize all human genomes but rather plant and animal species' genomes, so it's a huge data problem. What would be for you the success factors for this thing to really fly? Did you talk to organizations like Wikidata or others? Where would it ideally be hosted: at a university, or at an international nonprofit? Who would be running the thing?
Yeah, I mean, it is really big data. I think the first goal is not to think about having all the predicted five to ten million species sequenced on a population level. I think we need to think about the next step, and there it would make sense to start with species that are actually highly exploited, like many timber species and also many marine fishes. I think that's where we should start. And to host this kind of data, I think it should be in politically independent hands, so it should be with an NGO or with the UN, some organization that is independent.
And are you the first to think about this, or are there existing initiatives?
There are actually existing initiatives. I have been in contact with the Forest Stewardship Council, and they are actually starting to sample their concessions and have begun to build up the samples. They work together with the Kew botanical gardens and the US Forest Service. Right now they're analyzing the samples using isotopes, another method which is very powerful and can also produce geographic
information. So people are moving in this direction. I think the idea is out there; we just have to start and really do it, and provide one infrastructure so that we can combine, for example, morphological data, isotope data and genomic data into one data set, which will increase our resolution and our reliability.
Okay, microphone number two, please.
Thank you for your valuable talk. You started your talk with the possible decrease of leaf beetles, and in the data set you showed on slide number six there was an increase in the leaf beetle population until the 70s or something like that. Is there a possible explanation for that?
I believe it is because people started to observe leaf beetles much more systematically, so it's a sampling effort effect. Also, it's a multi-person collaboration that assembled this data set, and the people who are part of this collaboration added their own private data sets, and that's why you have an increase, I think. For the 1900s and 1910s, by contrast, you can only use the data that is available in publications and the samples in museums or scientific collections. I think that is the reason why you have the sharp increase.
Thank you.
So we have another question at microphone number two.
Thank you for your fine talk, and excuse me, maybe my question is a bit off-topic. Do you think the methods and the roles that you identified in your talk could be transferred to the assessment of raw materials? I'm thinking about metals, maybe.
The data infrastructure might. If you wanted to collect raw metals or materials from all over the world, compile a scientific sample collection and have a kind of reference data set, that might actually work. But the genomics obviously won't, so you would need to use different methods from physics. But certain parts of the infrastructure will be quite similar, I think, so yes.
So we have one more
question from the internet.
Who contracts a freelance evolutionary biologist? Can you give an example of the kind of work you proposed?
I see this gap between science and applications. We need these applications, and there's a huge potential for them. Illegal logging is my background, but it doesn't seem to be much different in, for example, marine fisheries. We know that there is this huge amount of illegal logging and timber trade going on, and we need methods that actually have the power to detect illegally traded timber. So I think there's a huge need for these kinds of methods, and the organizations who are interested in them are governments, companies, NGOs, customs, Interpol.
Do we have any other questions? So thank you again, Jutta, for your talk and the valuable work you're doing. Please give a warm round of applause to Jutta.