So now let's get into the meat of it. I will start with the overview of the Bgee database. First, as we mentioned already, we are a team of people with different expertise, so if you have any questions, don't hesitate to contact us. The most efficient way to contact us is with this email address, because it goes to a ticketing system. If it's a question about ontologies, one person will answer; if it's a question about statistics, another person; and if there's a bug in the web form, it might be a different person. And if someone is on holiday, someone else can answer. So we really try to be as reactive as possible. Don't hesitate to contact us there. We're also on Twitter, so you can follow us for updates and also ask us questions there, although the ticketing system is more efficient.
What's important about this group is that it's a mix of computer scientists and programmers, statisticians, and biocurators. We have these different kinds of expertise, which come together for Bgee, as you'll see over the morning.
And what is the aim of Bgee? The aim of Bgee is to understand gene expression and to help biologists, so you, use it. Gene expression is very complicated. There are many types of data, many parameters, many conditions. We want, as much as possible, to do the work behind the scenes, upstream from you, so that you have something which is simple to use. In the same way that when you take genome annotation from Ensembl or NCBI you don't need to know how genome annotations are done, we want you to be able to trust our results. We also want to be transparent, so you can check them if you want; of course, trust comes with verification. But we want, as much as possible, to make these things useful. And that also means making tools which make the gene expression useful to you.
So not just dumping all the gene expression on you, but making it relevant to your work. For this, I'm going to do a small demo of the gene page of Bgee. It's the easiest way to show this, rather than with slides.
Here is Bgee. This is the home page. If you type bgee.org, this is where you get. You see here three big buttons: expression comparison, expression enrichment analysis, and gene search. If we click on gene search, we can look for a gene. Here you can type anything. There's an autocomplete: if I type T, it will find the gene T, and if you type hox, it will give you Hox genes. Let's just look at insulin, because it's one of our default genes. What you get is a list of the genes which have the name insulin somewhere in them. It shows the first 10, but there can be more. And we prioritize: here there's a whole match in the description, and if I go towards the end of the list, I get things which are insulin related, but do not have insulin directly as the name, right? We order this so that the first hits are the most relevant.
If I click on the first gene here, I find a page which reports the expression of this gene. First I have some general information about the gene: which species it is in, here human, its Ensembl identifier, its name and so on. And here I see where it is expressed. What we present in priority is where it's expressed in terms of anatomy, which includes both organs and cell types. Here, for example, you see that the top expression of insulin is in type B pancreatic cell, which is, if you know any biology, kind of expected, and it is reassuring that we get this result. You also see that it's found in 93 entries, meaning here anatomical structures, which can be as specific as type B pancreatic cell or much more broad. If I go down here, I might find some very broad terms like, I don't know, lung, placenta and so on.
And what I see here is that I can also show more information. Here it's ranked by a score, which is how highly the gene is expressed in this place. But I also have an FDR, which is how confident we are that it is expressed here. These are not necessarily exactly correlated, because if I have more evidence, the FDR goes down, even if the gene is lowly expressed. Here, for type B pancreatic cell, I don't have that many data sets, so the FDR is not that low, but I am still very confident. As for the source of the data, this comes from Affymetrix, whereas the others here come from RNA-seq. We can also have ESTs, in situ and full-length single cell, as you will see later this morning. And you can see the data from only one source or from all of them.
Here we can see the anatomy together with the developmental stage, the sex or the strain, which in humans would be populations. If I select, for example, developmental stage and sex, it's going to update the table. What I'm going to see in a second is that each row, instead of being an anatomical structure, is a combination of an anatomical entity, a developmental stage, and a sex. So here, for example, type B pancreatic cell in post-juvenile, which basically means adult, and for male or female we don't have the information. There's also islet of Langerhans in the third decade, so people in their 30s, where it was detected in a female. And also in the 50s in a female, and in the 80s in a male. So you see all this detail. From this, you see that insulin is expressed in type B pancreatic cell and islet of Langerhans over all of adult life, in both males and females, and not really with a difference. You see this immediately, whereas if there was a big difference between males and females, it would also be very apparent, or if it was only in the embryo, or only in the adult, or only in aging, and so on. You can select this information every time. And we also have the information of absence.
So you see that, for example, insulin is not expressed in testis, corpus callosum and so on. Maybe you don't know these terms. So we show you an ontology. If you click here, you see the details, on an external site of the EBI, of what this is. So what is the corpus callosum? It's a white matter structure in the brain, basically. This way, if you don't know what a term is, you can find out.
You also see on top of this page, here, orthologs. So we can compare the expression between species. We can see here that, for example, among the primates, we have 11 species with genes. We can see these species like this. Here we have the whole list. We can click on one of the genes and see what the expression looks like. Here it's in macaque. In macaque, we have much less information, much less data. We cannot detect expression in the pancreas because no one sequenced it, but we have the liver. And you can see that the scores are much lower, because we don't have the pancreas, where it should be expressed.
If I click compare expression, I'm going to be able to compare the expression between these species. We will show you which homologous anatomical structures have expression of the orthologous genes. Just let this load. And this is the same thing there. Here you see that we get the same on top: the endocrine pancreas, where it is very highly expressed. But actually we only have the information of presence of expression in human, because no one has done detailed data analysis of the endocrine pancreas in the other primates. Then I get, for example, that expression in the hepatobiliary system is found in seven of the primates I have, and in the endocrine system in eight of the primates. So we get some information on the conservation. Of course, this depends on the data which is available. And you could have gotten to this page directly through expression comparison here. So I think that's it for my short demonstration.
Oh, you can also filter here, if I go back. If you look at the expression here, I can filter if I'm looking for something specific. Just a second, it's loading. Every time, we have to do a query because we've changed the parameters. And I say, okay, I'd like to know what expression there is in the brain. So I type brain, and I find there is some weak expression in the hindbrain.
So now I'm going to suggest to you... sorry, I shouldn't stop sharing. I forgot something, sorry, my bad. I want to show you the species we have. Here we have the species which are in Bgee. You see, first, that we only have animals, and that we have many animals. Second, you see on top the model organisms. For the classic model organisms we curate our own data, but we also integrate the data from the model organism databases. So if you're used to working with FlyBase or MGI or ZFIN, we have the information from these databases integrated. And then we have many other species. If you work on veterinary species, we have dog, cat, horse, and so on. If you're interested in the evolution of primates, we have these primates. If you're interested in the evolution of fishes, we have these fishes, and so on. And we are working on adding new species, according to the availability of the data: we need expression data across a diversity of organs, tissues and cell types to be able to make sense of it.
So now I will start the Wooclap. If you follow the link to Wooclap which is in the document, you should find a poll asking you which species have data in Bgee. Do you find the link? From the master document, you have a link to this section, the overview of Bgee, and at the top of that document, you have the link to Wooclap. I will maybe show you this again. I can maybe share my screen to show it, so that you can find the Wooclap. Good idea. Thank you.
So yeah, from the master document here, you have this link for the overview of Bgee, which leads to this page. And here you have this link, so you should follow it. And then Mark is going to launch the Wooclap. I don't know if I'll show the answer if I follow the link. Yeah. So here you have...
Oh, you stopped my voting by following the link as co-owner. Oh yeah, my bad. It stopped mine. So if you can leave the Wooclap, I can relaunch it, sorry. It's the first time we do this with two people activating it, sorry. I don't think so. So I restarted. There, now it's started. OK.
So the question is: which species have data in Bgee? Platypus, human cancer, Drosophila melanogaster, S. cerevisiae, healthy humans, or Arabidopsis. You can select several answers. But it's... it's still stopped, sorry. I don't know why. Have you left it? Yes, I have left. But I cannot vote anymore, sorry. Sorry about this. I don't know how to... Ah, now it's open again. OK, now you can vote. Sorry for this mess. I found the button to open it again. OK, now people are voting, cool. Since we want to be dynamic, I put a limit of one minute on every one of these polls. So don't spend too much time thinking, vote, tak, tak, and we will get an idea of what you understood.
I will share the results. OK, voting is over. You all agreed that there is healthy human data. I agree, there is healthy human data; it's the example I showed you. Most of you also agree that there is fly and platypus. Indeed, I told you there were different species, and there was a photo of the platypus on the front page. Congratulations on noticing it. There is no human cancer. Why? We will come to that later, but we are very careful in Bgee to curate only healthy data, so that when you look at gene expression in Bgee, it is a solid, reliable baseline for gene expression in healthy individuals, and also wild type for experimentally modified species.
So we don't have gene knockouts or knockdowns in zebrafish, fly, and so on. And we don't have tumors or other diseases in humans. It also means that the gene expression is comparable between species, because comparing a human cancer to a knockout mouse doesn't make a lot of sense biologically in the general case, whereas comparing healthy human to healthy mouse shows the conservation and evolution. And we don't have yeast and we don't have Arabidopsis, because we only have animals. So thank you for your votes.
I'll go back to my slides. Sorry, no, I had another Wooclap, my bad. So I will now launch a second Wooclap. Please stay on the Wooclap. Another question: what do we highlight when you go to a gene in Bgee? Do we highlight Gene Ontology, expression in anatomy, expression in disease, or expression in development? People have started to vote. Again, I limited the time, so you now have 36 seconds. It's the same link; the Wooclap just updated.
So it seems my explanations were clear. Indeed, you all noticed that what we highlight is expression in anatomy. We do have some Gene Ontology information, but we do not highlight it. We do not have expression in disease at all. You can see the expression in development, but it's not what we highlight first. So I think this shows that we were relatively clear so far.
I now continue with my slides. Okay, so I said that we are careful to curate healthy data. And when I showed the team, I said we have biocurators. I would like to expand on this a bit more, because not everyone knows what goes on behind the scenes of many databases, and it's very important for the quality and the reliability of the data we have in Bgee. In general, in biology, databases fall into two large categories: uncurated and curated. For uncurated databases, the typical example is GenBank, or as it's called nowadays, NCBI Nucleotide.
The main added value is that it can be very up to date, because there's nothing manual. It's entirely automatic: you can run the programs automatically, you can update the data automatically. So that's very good. On the other hand, there is a lot of redundancy. In NCBI Nucleotide you will find the same gene sequence in the same species, submitted by different groups, many times. And there is very little organization of knowledge. If someone submitted incomplete information, that's what you get. If someone submitted wrong information, that's what you get. If the information is in the wrong place or uses weird terms, that's what you get. It's not very organized.
On the other hand, you have curated databases, where the data is verified and redundancy is minimized. The typical example is the Swiss-Prot part of UniProt. Here, every protein from a given species has only one entry, and if many people studied it, they are all going to be adding to that one entry. So there is minimal redundancy, and the annotations are standardized, so that you will always find the gene name under gene name and always find the subcellular localization under subcellular localization. This is not as complete, but it has the added value of organized and reliable knowledge. Just to tell you: at Swiss-Prot, they have a team of curators who verify everything and read on average a paper a day per curator. And you are only considered a true Swiss-Prot curator after two years of work in the Swiss-Prot group. So this is the kind of thing we aim for.
And this is a citation from the Society for Biocuration: biocuration involves the translation and integration of information relevant to biology into a database or resource that enables integration of the scientific literature as well as large data sets. All these words are important. We take data which exists and we translate it, because sometimes the terms are not clear or not the right ones, or there are just synonyms, and we choose only one to be consistent.
Integration, so the information is together in one place and you can use it. And it allows us to take information both from the scientific literature, by reading papers, and from large data sets, such as RNA-seq data sets, genomes and so on. And this is what we do in Bgee.
With curation, you can do annotation. Annotation is associating a biological object to a feature based on evidence. That's a bit dry as a sentence, but it's clearer if you look at the Gene Ontology, which I think everyone knows. If I associate a Gene Ontology term to a gene, I've associated the biological object, the gene, to the feature, the Gene Ontology term, and I do it based on some evidence. If it's automatic, say I take a new genome, I BLAST every gene against the closest well-annotated species, say against mouse, and I transfer the Gene Ontology term, then it's based on the first BLAST hit: it's uncurated, it's not verified. Whereas if I read the papers and find, aha, people studied this gene and it has this function, and the corresponding Gene Ontology term is this one, and now I put it into Swiss-Prot, for example, that is a curated annotation.
In Bgee we are a curated database: all the expression data that we integrate is verified. We verify that it's the right species, the right organs, that it is wild type and healthy, and that the right protocols were followed. Everything is verified manually. Any data which does not fulfill our criteria is excluded. So there could be some part of a large data set that you don't find in Bgee because it did not fulfill our quality criteria. Every expression data set is annotated by manual curation, and our biocurators read not only the entry in SRA or ArrayExpress or other databases, but also, if things are unclear or ambiguous, the paper, the supplementary data of the paper, the webpage of the project and so on.
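As an illustration of "a biological object, a feature, and the evidence", here is a minimal Python sketch. The class, the field names and `geneA` are hypothetical and are not Bgee or Swiss-Prot code; the only real-world convention it relies on is that the Gene Ontology evidence code IEA marks electronic annotations that no curator has reviewed, while IMP is one of the experimental codes assigned by curators.

```python
from dataclasses import dataclass

# GO evidence code for annotations assigned automatically, without curator review.
ELECTRONIC_CODES = {"IEA"}


@dataclass(frozen=True)
class Annotation:
    biological_object: str  # what is annotated, e.g. a gene (hypothetical name below)
    feature: str            # what it is annotated with, e.g. an ontology term label
    evidence_code: str      # how the association was established
    support: str            # free-text note on where the evidence comes from

    def is_curated(self) -> bool:
        """True when a curator, not a pipeline, made the association."""
        return self.evidence_code not in ELECTRONIC_CODES


annotations = [
    # Transferred automatically from the first BLAST hit in a well-annotated species.
    Annotation("geneA", "cell migration in hindbrain", "IEA", "first BLAST hit against mouse"),
    # Established by a curator who read the paper describing a mutant phenotype.
    Annotation("geneA", "cell migration in hindbrain", "IMP", "paper read by a curator"),
]

curated = [a for a in annotations if a.is_curated()]
for a in curated:
    print(a.biological_object, "->", a.feature, f"({a.evidence_code}: {a.support})")
```

Both records link the same gene to the same term; only the evidence differs, and that is what separates the curated annotation from the uncurated one.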
And we follow standards which themselves are curated by us and by other biocurators, so that we have very clear standards. Again, if you think of the Gene Ontology, it is a standard of terms and of the way terms are associated to genes, and we work with the community on doing this carefully.
So now I would ask you to go to the Google Doc of the course. If you look at overview, and I will share my screen, my Firefox, okay, you see here you have a link. If you follow this link, you see that there is this table. Could you please provide your name and two examples each of curated and uncurated databases that you know of? That way we also get an idea of which databases you know and are used to.
Okay, so most of you have noted that Swiss-Prot is a curated database. A few of you have noted, as I mentioned, that Bgee is a curated database. For the others, I hope that by the end of this day it will be clear. For uncurated databases, obviously, I mentioned GenBank. NCBI is not a database, it's an institute which has many, many resources and activities. Ensembl is indeed an uncurated database. So most of the big places where you'll find all the primary information are uncurated; if you think of the PDB for structures, or things like this, this is where all the primary information is, right?
I'll go back to my slides. I told you already that one of our criteria for curation is wild type healthy. So why wild type healthy? It's informative on what is called in the literature the causal function of genes. The causal function is, if you think about it, what the gene has been selected to do. Think, for example, not of a gene but of the heart, because it's easy to think about: the heart has the function of pumping blood through the body, and that's its causal function. That's what it's there to do. It also makes a ba-dum, ba-dum, ba-dum that you can listen to and know if someone is alive or not.
But that's not its function, right? So the function of a gene is not that when it's mutated it causes cancer. The function of the gene is to regulate some other genes, make an enzyme and so on. The healthy wild type expression is informative on this. It allows us to compare between species, because healthy wild type is what was selected and what should be conserved in evolution. So if a pattern of healthy wild type expression is conserved between species, then it's probably important to the function of this gene.
It also provides a reference for biomedical studies. Obviously, in biomedical studies, at some point you need the expression in the disease or the treatment you are interested in, but you also need to know what happens in the absence of this disease and treatment. We've been collaborating with several large projects which need a reference for what happens in the absence, for example, of cancer. And this we can provide reliably, because we checked it.
To give you one example of what it means to do this curation, we did a full recuration of version 6 of GTEx, which is the largest transcriptomics effort in humans that I know of. It's a large project to build a comprehensive public resource on tissue-specific gene expression and regulation in humans, from 54 tissue sites across nearly 1,000 individuals. If you look at the description, they say non-diseased tissue sites. But if you look further at the description, they also say: we have not excluded specific donors from specific tissues based on their cause of death or medical history. So we re-analyzed this, which means that our biocurators read every single pathology report. And we found that there were many non-healthy cases. For example, we rejected 179 whole subjects, so all the data from these people, because they died of a drug overdose, because they had malignant cancer, because they were morbidly obese, and so on.
And we can say this is not healthy, this is not healthy gene expression. Then, for the people we accepted, we also discarded some specific samples. For example, if someone had dementia when they passed away, we accepted their samples of muscle, liver, pancreas, and so on, but not brain. If they had ascites or liver disease, we did not accept the liver tissues, but we did accept the brain. And sometimes the pathologist said: I was supposed to dissect this part, but it was a bit difficult, and I also took a piece of the neighboring tissue. In that case, we also discarded the sample. All this is in GTEx if you take it from the original source, but it is not in Bgee. So we curate this and verify it. Overall, we reviewed the annotations of 11,900 libraries and kept only about half.
And because GTEx is from human samples, it's a bit delicate. Internally we have all the exact information, but we cannot make public the exact age of the people or their ethnicity. So when you see that the ethnicity, which is called strain in the web interface, is not filled in for humans, it is usually because we are not allowed to share this information publicly, which is completely understandable, of course. In the end, we got 539 conditions where we are confident that this is indeed healthy gene expression data.
So I have another Google Doc. If you go back to the Google Doc, I have another question here. I'm going to ask you to do a little exercise. I gave you the description of a real experiment. We're not going to keep all of it, we're going to keep part of it. So which part would you call healthy wild type? Maybe you can share your screen. Yeah, I'm going to, but I cannot read and share at the same time, so I wanted to read it first. So you see here, I can zoom a bit, so it's clear for everyone.
"We used microarray analysis to identify differences in gene expression levels in liver and in quadriceps skeletal muscle between 18 hours fasted wild type control and Kruppel-like factor 15 null mice." So Kruppel-like factor 15 is a gene, and I'm letting you tell me which data we would keep from this. We don't necessarily keep all the data from one experiment. What is wild type healthy here? I see some people selected the link: you don't need to read the full annotation. I put the important part here, because there are a lot of details in an annotation. Basically, you can say what you would keep from this experiment from reading the description.
I see that most of those who have answered wrote fasted wild type control. So I think you've identified that we would not keep the Kruppel-like factor 15 null mice, which are knockout mice, right? They're not wild type.
An interesting question, and you might be hesitating: we have 18 hour overnight fasted wild type. This is the kind of question we have to address all the time when we are curating data. If you fast mice for 18 hours, is this normal expression? Is this healthy wild type, or is this unhealthy? If I fasted the mice for three days, so that they were almost dying of hunger, and then measured the gene expression, this would certainly not be healthy. If it's 20 minutes since they ate, obviously it's a completely normal situation. So what do we do? We have a whole table in our GitHub of all the choices we had to make, and of how to turn them into guidelines for future situations. For example, we consider that fasting for less than 24 hours is reasonably healthy, because it is something which can happen totally normally to wild animals out there in the wild, right? They didn't immediately find food, and it's still not completely unhealthy.
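The two decisions discussed for this experiment (reject the knockout, accept fasting under 24 hours) can be written down as a small rule. This is only a sketch under stated assumptions: the `Sample` class, its fields and `keep_sample` are hypothetical names, and the real guidelines live in the curators' table, not in a function like this one.

```python
from dataclasses import dataclass
from typing import Tuple

# Guideline quoted in the talk: fasting for less than 24 hours is still considered healthy.
MAX_HEALTHY_FASTING_HOURS = 24


@dataclass(frozen=True)
class Sample:
    tissue: str
    genotype: str               # "wild type" or a description of the modification
    fasting_hours: float = 0.0


def keep_sample(sample: Sample) -> Tuple[bool, str]:
    """Return (keep, reason) for one sample of an experiment."""
    if sample.genotype != "wild type":
        return False, "genetically modified, not wild type"
    if sample.fasting_hours >= MAX_HEALTHY_FASTING_HOURS:
        return False, "fasting too long to count as healthy"
    return True, "healthy wild type"


# The four groups of the experiment: two tissues, two genotypes, all fasted 18 hours.
experiment = [
    Sample(tissue, genotype, fasting_hours=18)
    for tissue in ("liver", "quadriceps skeletal muscle")
    for genotype in ("wild type", "Klf15 null")
]

kept = [s for s in experiment if keep_sample(s)[0]]
for s in kept:
    print(s.tissue, s.genotype, s.fasting_hours)
```

Only the two wild type groups survive, one per tissue. A rule that treated the liver differently from the muscle under fasting would be an extra condition on `tissue`, which is the kind of call the curators have to make case by case.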
And we have to make this kind of call all the time: if they sleep a bit more or a bit less, if males and females are kept together or separated, all these little manipulations you can do in the lab. Do we consider them wild type or not?
No one noted it, but we have here two tissues, two organs, liver and quadriceps skeletal muscle, and in this case we would keep both. Someone wrote only skeletal muscle, so it's an interesting question: when they're fasted, do we keep the skeletal muscle but not the liver, because maybe the fast has more impact on the liver? We kept both, but at some point we have to make a call, right? This is to show you the kind of questions our curators are addressing every day. Yeah, because in this case, we consider that the time of fasting is something that you can expect in the wild, so we would keep liver. But that's a good point. We could have selectively discarded samples, as for GTEx, actually. But in this case, we consider it still normal.
Okay, going back to my slides. Wrong slide. Too many things open there. Okay, continuing on curation, I told you that we annotate following standards which are themselves curated. For this we use ontologies. You have all heard of the Gene Ontology, but ontology is a more general concept in information science, not specific to bioinformatics. An ontology is, first, a list of terms. We agree on what words we use, so that different people don't use different words for the same thing. This can happen just because language is ambiguous, and it can also happen because there are different traditions in different fields. The medical doctor and the zoologist might use different words for the same thing, for example. If you just have a list of terms, you have a controlled vocabulary, which is already very useful, but it's not an ontology. Then we have definitions of each term.
I showed you an example: I clicked on a part of the brain and you saw a definition which tells you that this is a part of the brain, which part of the white matter, and so on. This allows you to know what it really is. If you just had the list of terms and definitions, you would have a dictionary. You've all used dictionaries, I think.
What makes it really an ontology is that you have not only a list of terms and definitions of the terms, but also relations between the terms. Relations between the terms mean that you know that the white matter is part of the brain. So if I'm asking for everything within the brain, I'm not going to take only the things annotated as brain, but also all the things annotated as white matter, which is part of the brain. And you can have different types of relations. I just said part of right now, but I will show you other types of relations.
This allows what's called automated reasoning. The simplest type of automated reasoning is to say: if I want all genes expressed in the brain, I have to fetch all genes expressed in any structure, any term, which is defined as being part of the brain. So not just those which are annotated to brain, but also those which are annotated to something which itself is known to be part of the brain. This seems rather simple, but you can have much more complex cases, and we can write code to do this so that it's done automatically. You don't have to worry about it; we worry about it upstream. And it scales: when you have more and more data, more and more species, because we have these ontologies we can scale up and continue doing this reasoning.
The most well-known ontology is the Gene Ontology. Here, for example, you have the annotation in UniProt for a Hox gene, and you see that you immediately have the Gene Ontology terms. By the way, I'm showing you the old UniProt.
They changed their web interface a week or two ago, and on the new one, if I do a screenshot, I don't see the gene name and the Gene Ontology on the same screen. So I kept the old one, but the principle is the same; they haven't lost their information.
You see here that you have these specific terms, such as cell migration in hindbrain, and you see where the annotation comes from, because again, you have to know how we associated the term to a gene. If you take this cell migration in hindbrain, you see that it's part of this graph of relations. Cell migration in hindbrain is a (that's the black arrow) cell migration, which is a cell motility. So if I want all the genes which are associated to cell motility, I will also take all the genes associated to cell migration, which means I will also take all the genes associated to cell migration in hindbrain, which means I will take hoxb1a. We also have here the blue arrows, part of: cell migration in hindbrain is part of hindbrain development. When the hindbrain develops, the cells have to migrate to the right places. So we have different relations, and depending on the question I ask, we might use both or one or the other relation. And we see all these other types of relations we can have in the Gene Ontology: regulates, positively regulates, occurs in and so on.
We don't use the Gene Ontology so much. We use an anatomy ontology called Uberon, which describes the anatomy of any animal and all animals, and which is developed jointly by several teams in the United States and ourselves. It includes the species-specific ontologies. So if you work with zebrafish or mouse or C. elegans or Drosophila melanogaster and you're used to your favorite ontology, the information is in there, but we also have much more general information for all species. So here, for example, you have a zoom on the liver.
You have the information that the liver, as it's defined here, is only found in vertebrates, so you will know this if you need to restrict. You have that it is part of the abdomen, part of the exocrine system, and that it contributes to the morphology of the hepatobiliary system. It's also part of the hepatobiliary system. And you have these types of relations here. We always have is a and part of, as in the Gene Ontology and as in all the ontologies we have, but we also have more anatomy-specific relations such as develops from, contributes to morphology of, and so on. Thanks to this, we can reason to know where to generalize gene expression. If we have gene expression in the liver and I want gene expression in the abdomen, then because the liver is part of the abdomen, I can find this expression, right?
So what makes an ontology useful? Why are some ontologies successful and others less so? Well, one important thing about an ontology is that it's a standard, a common standard. If each of us develops a separate ontology, then it's a mess and we don't progress much beyond what we'd have without ontologies. Actually, when we started Bgee, we were developing our own ontology. Then we heard that some people in the United States were developing their own ontology, we merged our projects, and that's modern Uberon. So Uberon is common to all animal species and is used in all the large projects which do gene expression across many organs, tissues or cell types. Here I put GTEx and FANTOM, but it's also used, for example, in the Human Cell Atlas. And a good, useful ontology covers a large domain of knowledge. Each ontology is specific to a domain of knowledge. No ontology covers all human knowledge.
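The simplest form of this reasoning, "expressed in the brain means expressed in anything that is part of the brain", can be sketched in a few lines of Python. The edges below are a toy graph that only uses relations mentioned in the talk, plus hindbrain part of brain; the gene names and expression calls are made up, and this is not how Uberon or Bgee are actually queried.

```python
from collections import defaultdict

# Toy ontology: (child term, relation, parent term). Not the real Uberon content.
EDGES = [
    ("liver", "part_of", "abdomen"),
    ("liver", "part_of", "hepatobiliary system"),
    ("white matter", "part_of", "brain"),
    ("hindbrain", "part_of", "brain"),
]

# Made-up expression calls, each annotated to the most specific term available.
EXPRESSION = {
    "liver": {"geneA"},
    "white matter": {"geneB"},
    "hindbrain": {"geneC"},
    "brain": {"geneD"},
}


def substructures(term, relations=("is_a", "part_of")):
    """Return the term plus every term reachable downwards through the given relations."""
    children = defaultdict(set)
    for child, relation, parent in EDGES:
        if relation in relations:
            children[parent].add(child)
    seen, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def genes_expressed_in(term, relations=("is_a", "part_of")):
    """Collect genes annotated to the term itself or to any of its substructures."""
    genes = set()
    for structure in substructures(term, relations):
        genes |= EXPRESSION.get(structure, set())
    return genes


print(sorted(genes_expressed_in("brain")))
print(sorted(genes_expressed_in("abdomen")))
```

The `relations` argument reflects the point that the relation you follow depends on the question: restricting it to `("is_a",)` on this toy graph returns only the genes annotated directly to brain.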
Here it is animal anatomy, which is a large domain of knowledge. And there should be tools leveraging the ontology, to make it useful to people who don't want to actually read these graphs but just want to get the result. You've all probably done Gene Ontology enrichments, and that's a very useful first step. Today you'll see some tools we have which leverage Uberon, and the annotation we do to Uberon, to make them useful to you.
So in Bgee, we annotate to anatomy and cell type using Uberon and the Cell Ontology. We also, as I showed you in the demo on the webpage, annotate to life stage, which covers both the fine-grained embryonic development and all the post-embryonic life stages: the age, the sexual maturity and so on. For this, we develop internally a developmental ontology for each species. These are all aligned and built in a structured and standard way, so that we can compare them between species. Our developmental ontologies are recognized as a standard, so that big projects like GTEx which work on, for example, human will use our human developmental ontology. And because it follows one standard, we can then compare, say, a sexually mature human to a sexually mature mouse.
We also have sex, which in most species is male and female, but is sometimes a bit more complicated. For this we don't have an ontology, just a small controlled vocabulary where we have male, female, hermaphrodite or undefined.
And we have strains. For this, we don't have the information for all species. For lab species, we typically have detailed annotations of strains, like the mice I show here. For some domestic species, we also have what can be called strains or varieties. And for humans, we have populations. We don't always have the information, and when we have it, we can't always share it. But all the information we can capture and can share, we will share.
So for example, let's look at an annotation I took from a specific gene. This comes from a library which is described in the Sequence Read Archive as "primordial germ cells, 19 week of gestation male". We will annotate this to specific ontology terms. Primordial germ cells is actually primordial germ cell, which is a Cell Ontology term. But we see from reading the supplementary information of the paper that it comes from the gonad, so it's in the anatomical structure gonad. And we have a specific term for the stage: 19 week of gestation is not a standard term, but the fifth month stage of the human embryo is, so we use this term. And male is male. And we annotate these one by one.
This is the end of this part. I hope what you understood from this is that in Bgee, to make the gene expression useful to you, we do all the work upstream: curation of the data sets, integration of the data, and, as we'll show you in more detail later, homology, that is, how we compare between species.
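The annotation of that one library can be pictured as a translation from free text to standard terms. The sketch below is hypothetical: the lookup tables stand in for a curator's judgment, the term labels are the ones quoted in the talk, no real ontology identifiers are used, and an unknown description deliberately raises an error instead of being guessed.

```python
from dataclasses import dataclass

# Small controlled vocabulary for sex, as described in the talk.
SEX_VOCABULARY = {"male", "female", "hermaphrodite", "undefined"}

# Hypothetical lookup tables from free-text descriptions to standard term labels.
CELL_TYPE_TERMS = {"primordial germ cells": "primordial germ cell"}
STAGE_TERMS = {"19 week of gestation": "fifth month stage of human embryo"}


@dataclass(frozen=True)
class Condition:
    anatomy: str
    cell_type: str
    stage: str
    sex: str


def annotate(cell_type_text, stage_text, sex_text, anatomy_from_paper):
    """Translate a free-text library description into standard terms.

    A description missing from the tables raises KeyError, so that a
    curator has to decide rather than the code guessing a term.
    """
    sex = sex_text.strip().lower()
    if sex not in SEX_VOCABULARY:
        sex = "undefined"
    return Condition(
        anatomy=anatomy_from_paper,  # found in the paper, not in the archive entry
        cell_type=CELL_TYPE_TERMS[cell_type_text.strip().lower()],
        stage=STAGE_TERMS[stage_text.strip().lower()],
        sex=sex,
    )


condition = annotate("Primordial germ cells", "19 week of gestation", "male", "gonad")
print(condition)
```

Keeping the anatomy as a separate argument mirrors the example: the archive entry alone did not say gonad, and that piece came from the supplementary information.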