All right, and here we are. I went through and added little labels, which you can do by clicking on any dataset and adding a tag, just so that I would know which one was male, wild type, et cetera: all the metadata, which is quite important. So we have all of our data, and all of the data is h5ad. Otherwise we would come over here and change the data type, but that all looks fine.
So we're gonna start with concatenating the objects. And this can go very wrong if you don't click it correctly. So make sure you start with dataset one there. And yeah, we're still using the downsampled data to make it a bit easier. And then you want two through seven here. Make sure you don't accidentally click one again, or you end up with essentially eight libraries where there are actually only seven. And we want intersection of variables, so it doesn't just keep adding the same metadata field twice and just add a dash one or whatever. And we want batch separators, and onward we go. This is gonna stitch all of our datasets together in a meaningful way.
And now we can use one of my all-time favorite tools. Ooh, we're gonna learn all sorts of stuff. I mean, to be fair, what's cool now is that you can look in this little window and get all sorts of information about your object, which is awesome and didn't used to exist. So gold star to the developers of that, and I believe one of them is Mehmet. So gold star to you, Mehmet. And we can get even more information by running this get-me-more-information tool, which will prove very useful to you when you're trying to manipulate your metadata, which we're going to be doing very soon. And metadata is like, where did the sample come from? Was this knockout or was this wild type, in this case? And now we can look: this will tell us our cells by genes. This will tell us lots of the different metadata we have.
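To make the concatenation step concrete, here is a minimal plain-Python sketch of what "intersection of variables" plus a batch label amounts to. It is not the Galaxy tool's implementation; the `concatenate` function, the gene names, and the barcodes are all made up for illustration, and it assumes "variables" means the genes shared by every dataset.

```python
# Plain-Python sketch of concatenation with "intersection of variables".
# All dataset contents below are invented for illustration.

def concatenate(datasets):
    """Stack datasets, keeping only genes present in every one of them.

    Each dataset is a dict with "genes" (list of gene names) and
    "cells" (dict mapping barcode -> {gene: count}).
    """
    # Intersection of variables: genes shared by all datasets,
    # in the order they appear in the first dataset.
    shared = [g for g in datasets[0]["genes"]
              if all(g in d["genes"] for d in datasets[1:])]

    combined = []
    for batch, data in enumerate(datasets):  # batch numbering starts at 0
        for barcode, counts in data["cells"].items():
            combined.append({
                # separator + batch number keeps cell names unique
                "cell": f"{barcode}-{batch}",
                "batch": batch,
                "counts": [counts.get(g, 0) for g in shared],
            })
    return shared, combined


first = {"genes": ["Actb", "Cd3e", "mt-Co1"],
         "cells": {"AAAC": {"Actb": 5, "mt-Co1": 1}}}
second = {"genes": ["Actb", "mt-Co1", "Xist"],
          "cells": {"AAAC": {"Actb": 2, "Xist": 4}}}

genes, cells = concatenate([first, second])
print(genes)
for row in cells:
    print(row)
```

Because the batch number is simply the position of each dataset in the input list, selecting the same dataset twice silently produces an extra batch, which is the mis-click warned about above.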
We can look at our columns of metadata and say, oh, this is cool. This gives me some sort of maths from emptyDrops. This is telling me my batch information. So these are all from the first one, and then later on they're from the others. And then variables: this is information about each gene. So it's the symbol, and whether it's mitochondrial or not here. And if we look at, yes, our experimental design, right? So when we add these, they're numbered from zero: zero, one, two, three, four, five, six. So this would be considered batch zero because it's the first one we added. And then we added the second one, so this would become batch one, and so on and so forth. So you end up with batches zero through six.
All right, let's add in some metadata, shall we? So we're gonna use this information to change it, because calling N701 batch zero is very confusing. So let's stop that. All right, so we're gonna go with Replace Text on our observations. So this is our cell data. And remember that these numbers come from that experimental design object. This is how you can figure out which batch is supposed to be what. So we've got male and female, and we're gonna rename that whole column sex. And we only want that column. If we look at this, what we're interested in is creating a lovely column that's useful here. We don't wanna repeat all of this extra information. So let's cut it. Again, we don't want Advanced Cut. All we want is c9, all right? And then this should give us, yes, our column of sex metadata.
And now we're gonna do everything again, but we're gonna label them by genotype instead. So let's do this again. We're gonna switch it, and it's gonna be zero, three, and those are gonna be wild type; one, six, knockout; and then we're gonna be calling this genotype. And then now we can call that our genotype metadata, okay. And now we're going to paste two files side by side, genotype data and sex data, delimited by tab, yes.
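The relabel-then-paste steps above can be sketched in plain Python as follows. The batch-to-label tables here are hypothetical placeholders (the real assignments come from the experimental design file), and `relabel` and `paste` are illustrative helper names, not Galaxy tools.

```python
# Hypothetical batch -> label tables; the real assignments come from
# the experimental design file, not from this sketch.
SEX_BY_BATCH = {"0": "male", "1": "female", "2": "male"}
GENOTYPE_BY_BATCH = {"0": "wildtype", "1": "knockout", "2": "knockout"}


def relabel(batches, mapping, header):
    """Return a one-column table: a header followed by one label per cell."""
    return [header] + [mapping[b] for b in batches]


def paste(left, right):
    """Join two equal-length columns side by side, tab-delimited."""
    if len(left) != len(right):
        raise ValueError("columns must have the same number of rows")
    return [f"{a}\t{b}" for a, b in zip(left, right)]


batch_column = ["0", "0", "1", "2"]  # one batch value per cell
sex = relabel(batch_column, SEX_BY_BATCH, "sex")
genotype = relabel(batch_column, GENOTYPE_BY_BATCH, "genotype")
table = paste(genotype, sex)
print("\n".join(table))
```

Keeping the two columns as separate intermediate results before pasting mirrors the cautious approach in the walkthrough: if one mapping is wrong, only that column needs redoing.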
I mean, you can probably skip that step and just use Manipulate AnnData, like we're gonna do in a second, all at the same time, but that's how I did it the first time, so that's how I'll forever do it. In case anything goes wrong, you've at least saved your work a little bit earlier on, at an easier step. And yeah, that looks right, okay. And in the next step, we're gonna be using Manipulate AnnData to add that information in, so add new observations, and it's gonna be that.
While that's working, we also know that some of the labeling is poor. So we're gonna rename these categories of annotation. So rather than having it be batch zero, one, two, three, four, five, six, we're gonna rename them to their actual indices from the experiment. And I've seen people fail here, because if you get to this step and it doesn't work, it's often because you didn't actually concatenate all of the datasets in that very first concatenate step. So if this one fails for you, check that and make sure you didn't accidentally have one dataset twice or miss off a dataset.
And we're so, so close. Got all of our lovely columns. Now genotype and sex are in our observations, our cells information. And now the final thing is, we did all that work flagging the mitochondrial reads. We don't wanna have a column that says true or false. We want a column that says what percentage of the reads in this cell are mitochondrial. So we're gonna use that information now. So the input is our output from Manipulate AnnData, sure; copy, no; insert a field and change it. Gene symbols, no: we want it to look within the column that is talking about our mitochondria. So we're gonna trick it into looking within the mitochondria one, because it's slightly more accurate to count the mitochondrial genes the way that we have done, using the GTF file, rather than just the names necessarily. And we are there, my friends. So we'll just rename this, okay.
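Two ideas from this part can be sketched in a few lines of plain Python: why renaming the batch categories fails when the number of batches is not what you expect, and how a per-gene true/false mitochondrial flag turns into a per-cell percentage. The function names are illustrative, and apart from N701 the index names below are placeholders; this is a sketch of the logic, not what the Galaxy tools run internally.

```python
def rename_categories(values, new_names):
    """Replace the k-th category (in sorted order) with new_names[k]."""
    categories = sorted(set(values))
    if len(categories) != len(new_names):
        # Typical cause: a dataset was added twice or left out at concatenation.
        raise ValueError(
            f"{len(categories)} categories found but {len(new_names)} names given")
    lookup = dict(zip(categories, new_names))
    return [lookup[v] for v in values]


def percent_mito(counts, is_mito):
    """Percentage of a cell's counts that fall on genes flagged mitochondrial."""
    total = sum(counts)
    if total == 0:
        return 0.0  # avoid dividing by zero for an empty cell
    mito = sum(c for c, flag in zip(counts, is_mito) if flag)
    return 100.0 * mito / total


batches = [0, 0, 1, 2]
renamed = rename_categories(batches, ["N701", "N702", "N703"])
print(renamed)

# One cell with three genes; only the last gene is flagged mitochondrial.
print(percent_mito([8, 1, 1], [False, False, True]))
```

Passing the wrong number of names raises an error straight away, which matches the failure described above when the first concatenation step went wrong.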
And then, for my own sanity, I see now that it has all of these tags on it, and they're all labeled and it's all lovely. I'm gonna just remove them. Yeah, and I was using some of these tags to distinguish between whether I let Alevin draw its own thresholds or I depended on emptyDrops only, all right. And so the only tag I'll leave this one with is the 400k one, because it's 400k reads. And that's important to realize: this is not the full object. This is only 400k reads per FASTQ. So it's a downsampled object, right. It's just because it goes a bit faster in a tutorial. We've done all that, awesome, fantastic.
There are other ways you can pull data if you aren't as interested in taking it from raw. If you wanna trust other people's pre-processing steps, we can download the exact same data, just with the EBI's pre-processing. That's for better or for worse: it's amazing, because they apply the same general pre-processing standards to everything, but if you're looking within a specific dataset or a specific cell type, there is also the other side, where you want to curate your analysis for a specific cell type or group. So there's definitely swings and roundabouts for having a sort of standard pipeline or for having a targeted pipeline, where these are the parameters that work for these cells based off of what we're finding in these samples. So it depends on how you wanna access the data, and you have it either way.
And then obviously it's four files right now, so it's not in the format that you want. So we have to read it in, and that's fine: we pick our matrix and our gene table and our barcodes, and keep in mind these will have already had some extensive filtration on them, and the experimental design goes there. Yeah, there shouldn't be, I don't think there's anything for that, yeah. So the data will look a little bit different because it's already had that pre-processing.
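As a rough picture of what "reading it in" from a matrix, a gene table, and a barcodes file involves, here is a small plain-Python sketch. It assumes a Matrix Market coordinate file with genes as rows and cells as columns; that layout, the `read_matrix` name, and the tiny in-memory files are assumptions for illustration, not a description of the Galaxy tool or of the EBI download itself.

```python
import io


def read_matrix(mtx_file, genes_file, barcodes_file):
    """Combine a sparse matrix, gene table, and barcode list into
    a dict of barcode -> {gene: value}. Assumes rows are genes."""
    # First column of the gene table is used as the gene identifier.
    genes = [ln.split("\t")[0] for ln in genes_file.read().splitlines() if ln]
    barcodes = [ln.strip() for ln in barcodes_file.read().splitlines() if ln.strip()]

    # Drop blank lines and '%' comment/header lines.
    rows = [ln for ln in mtx_file.read().splitlines()
            if ln and not ln.startswith("%")]
    n_genes, n_cells, _n_entries = (int(x) for x in rows[0].split())
    if (n_genes, n_cells) != (len(genes), len(barcodes)):
        raise ValueError("matrix dimensions do not match gene and barcode tables")

    table = {bc: {} for bc in barcodes}
    for line in rows[1:]:
        gene_idx, cell_idx, value = line.split()
        # Matrix Market indices are 1-based.
        table[barcodes[int(cell_idx) - 1]][genes[int(gene_idx) - 1]] = float(value)
    return table


# Tiny made-up files standing in for the downloaded ones.
mtx = io.StringIO(
    "%%MatrixMarket matrix coordinate real general\n3 2 3\n1 1 4\n3 1 1\n2 2 7\n")
gene_table = io.StringIO("geneA\tActb\ngeneB\tCd3e\ngeneC\tmt-Co1\n")
barcode_list = io.StringIO("AAAC\nAAAG\n")

table = read_matrix(mtx, gene_table, barcode_list)
print(table)
```

The experimental design file is not handled here; in the walkthrough it is supplied alongside the other three files and carries the per-cell metadata.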
And we're done, congrats on making it to the end. I hope you learned a lot, I hope you had fun. Let's be honest, the next tutorial is far more fun because that's when you get to make your plots. So I'll see you on the other side.