Okay, so there are many techniques that in the end lead to single-cell RNA-seq data. These are all things we discussed on the very first day, so you should already know a little about them, but maybe you do not know how to handle them in CellRanger, and this is what we will discuss now. You can have multiplexed single-cell RNA-seq. You can have CITE-seq, which is sometimes also called TotalSeq. You can have single-cell RNA-seq combined with the VDJ information, so the TCR information, and you can have the same thing in a multiplexed version, which is a little more complicated. Then there is spatial transcriptomics, which is not yet at the level of a single cell; they were already advertising last year that it would be, but it's not there yet, so this will come at some point, maybe. Multiomics: I know we already had some questions in the Slack, so there are probably people interested in that, so how to combine, for instance, single-cell RNA-seq and single-cell ATAC-seq. And the last point is single-nuclei RNA-seq, which is also a technique that leads to transcriptomics at the single-cell level. So these are the types of data for which I will quickly remind you what they are and then show you a little of how CellRanger deals with them.
So, multiplexed single-cell RNA-seq. This is what 10x Genomics advertises as the reasons to use it, and nowadays it's mainly used because it reduces the cost of sequencing. That is quite clear. For the analysis, the only difference compared to standard single-cell RNA-seq is in the CellRanger part. At the end you get output folders with the feature-barcode matrices, et cetera, so everything downstream is the same.
They say the key advantages are that it increases sample throughput, it increases the number of cells in a single experiment, as I said, and it increases the number of possible replicates in a single experiment. This all has to do with cost as well. And the last point, which might be the most interesting one: it lets you detect multiplets more easily and remove them even before the analysis. The idea is as follows, and this is also something Hertha explained, probably more comprehensively than I will. You have these cell multiplexing oligos, CMOs for short, sometimes also called hashtag oligos, that are added to the cells, so that you have one CMO per sample within a pool. This means the same CMO can be used in several different pools, and this is quite commonly done. The technique is similar to the way cell surface proteins are measured, which we will see just afterwards for the TotalSeq or CITE-seq method.
This is the picture I took from Hertha's presentation, where you can see how it works. For each sample you have a CMO that tags the cells of that sample. The samples are then mixed in a pool, but each cell keeps its tag, so we still know which sample it comes from. Afterwards, with the encapsulation, you should have one cell per droplet. Sometimes you have doublets, and that's where they say it performs well at removing multiplets: as you can see in these pictures, you would have two CMOs in one droplet, so you would know for sure that this is a multiplet. And chances are high that multiplets come from different samples, so you are able to remove them before the analysis. This is what is then performed in CellRanger. And in CellRanger, how you write it is also quite simple: you write cellranger multi. This is the command you use.
You need to specify the ID of the sample you want to look at and the path to a CSV file. The CSV file itself should look like this. It has several sections whose names are in square brackets: a section called gene-expression, a section called libraries and a section called samples. Let's start with the last one. In the samples section you have two columns, separated by a comma, with the sample ID and the CMO ID. Each sample, as I said, was tagged with a CMO, and the biologist knows which sample was tagged with which CMO, so this is something the biologist should provide to you. Then you can fill in this section. The sequencing facility should give you access to FASTQ files for the gene expression and FASTQ files for the multiplexing capture, so the information about these CMOs. In the libraries section you therefore provide the path to the gene expression FASTQ files and the path to the multiplexing capture FASTQ files, each with its own identifier first. And the last point, which is the easiest one to get: you of course need a reference onto which to align your reads, so in the gene-expression section you write reference and the path to this transcriptome. So this is what the CSV should look like. Feel free to interrupt me if you have questions.
So, TotalSeq or CITE-seq. This is a reminder of a slide you have seen before. It works in a very similar way, and in the CellRanger part you also handle it in a very similar way. What you have for each sample is the information about the RNA and also the information about a certain list of proteins that you have measured.
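Before going into the protein example, here is a quick sketch of the multiplexing CSV described a moment ago. It is only an illustration, written as a small Python helper that assembles the three sections. The paths, the FASTQ identifiers, the sample names and the CMO IDs are all placeholders, and the exact section and column names should be checked against the 10x documentation for your CellRanger version.

```python
def build_multi_config(reference, libraries, sample_cmos):
    """Return the text of a config CSV for cellranger multi.

    reference   -- path to the transcriptome reference (placeholder here)
    libraries   -- list of (fastq_id, fastq_dir, feature_type) tuples
    sample_cmos -- dict mapping each sample ID to the CMO ID that tagged it
    """
    lines = ["[gene-expression]", f"reference,{reference}", ""]

    # One line per FASTQ set: gene expression and multiplexing capture.
    lines += ["[libraries]", "fastq_id,fastqs,feature_types"]
    for fastq_id, fastq_dir, feature_type in libraries:
        lines.append(f"{fastq_id},{fastq_dir},{feature_type}")

    # The sample-to-CMO mapping that the biologist has to provide.
    lines += ["", "[samples]", "sample_id,cmo_ids"]
    for sample_id, cmo_id in sample_cmos.items():
        lines.append(f"{sample_id},{cmo_id}")

    return "\n".join(lines) + "\n"


# All values below are hypothetical and only show the shape of the file.
config_text = build_multi_config(
    reference="/path/to/transcriptome_reference",
    libraries=[
        ("pool1_gex", "/path/to/fastqs/gex", "Gene Expression"),
        ("pool1_cmo", "/path/to/fastqs/cmo", "Multiplexing Capture"),
    ],
    sample_cmos={"sample_A": "CMO301", "sample_B": "CMO302"},
)
print(config_text)

# The run itself would then be along the lines of:
#   cellranger multi --id=pool1 --csv=multi_config.csv
```

Generating the file from a small table like this keeps the sample-to-CMO mapping in one place, which is the part most likely to be mistyped when it is copied from the biologist's notes.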
So here is an example of a data set I had in my hands where they sequenced this. It was about 20 cell surface proteins that they wanted to add to their single-cell RNA-seq so that they would be able to characterize the cell types very well. As you can see, here is the example of CD4. Here is the RNA part, and here is the TotalSeq part. For CD4, it was not clear in the RNA-seq part, but it was very clear that all these cells here must be CD4 positive, at least if you look at the cell surface marker. For CD19 as well, the RNA is quite sparse, and this is what can happen with single-cell RNA-seq data, whereas if you look at the protein, it is very well expressed.
People have asked me: but then what should we trust more, and how do you actually annotate the cells in the end? In this data set at least, how we did it, and what they wanted, was that I first cluster the cells according to the RNA and then use the protein to guide the annotation. Here you can see a clear cluster, the one with very high expression of CD19, which is a marker of B cells. But even if I had only used the RNA part, and even though CD19 was not high there, other genes specific to B cells, such as CD79A, were quite high in that cluster. So I would have been able to identify it as a B cell cluster even without the protein. For CD4, however, it was trickier, because, as you can see, what was grouped together and recognized as a cluster by Seurat was this whole part down here, and not all of these cells were CD4 positive if you look at the protein. So the RNA part and the TotalSeq part, the protein part, did not correspond 100%.
And if I had only looked at the RNA, I would probably not have been 100% sure that this subgroup is where the CD4-positive cells lie. So we gained information by having both the protein and the RNA. That was just the biology, why it's useful. In the end, how you do it in CellRanger is similar. You also need to specify a CSV file, and in the same way as before it should say where to find the FASTQ files linked to gene expression and the FASTQ files linked to antibody capture. You also need to specify a second CSV file, the feature reference, which describes the tags of the antibodies for the proteins and the pattern that will be used to find them in the reads. What this pattern actually is, is something you have to look up for the particular technology that was used. And a reference to the transcriptome, as always.
For VDJ single-cell RNA-seq, CellRanger has a command that is just called cellranger vdj. Again you need to specify the ID and the path to the FASTQ files. You also have a CSV file that you would use here. You would also specify the reference; here it is the VDJ reference. And then you just specify where the FASTQ files are.
Then there is VDJ 5' multiplexed single-cell RNA-seq. I had one such data set in my hands, where the VDJ was actually useful, so I will show you the results just afterwards. I have to point out something important here: there was a big change for VDJ 5' multiplexed single-cell RNA-seq in the latest version of CellRanger. So everything I say now is for CellRanger from version 7 on. Before that, it was quite a bit more complicated.
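Coming back briefly to the antibody capture setup described above, here is a minimal sketch of what the feature reference CSV could look like, again built with a small Python helper. The antibody IDs and barcode strings are placeholders, not real sequences. The read and pattern values shown are the ones I would expect for one TotalSeq chemistry, but as said, this is something to look up for the technology actually used, so treat them as assumptions.

```python
import csv
import io


def build_feature_reference(antibodies, read, pattern):
    """Return the text of a feature reference CSV for antibody capture.

    antibodies -- list of (feature_id, name, barcode_sequence) tuples
    read       -- which read carries the antibody barcode (e.g. "R2")
    pattern    -- where the barcode sits in that read; chemistry-specific
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "name", "read", "pattern", "sequence", "feature_type"])
    for feature_id, name, barcode in antibodies:
        writer.writerow([feature_id, name, read, pattern, barcode, "Antibody Capture"])
    return buffer.getvalue()


# Hypothetical panel: the barcode entries are placeholders to be replaced
# by the sequences supplied with the antibodies.
feature_reference_text = build_feature_reference(
    antibodies=[
        ("CD4_protein", "CD4", "BARCODE_FOR_CD4"),
        ("CD19_protein", "CD19", "BARCODE_FOR_CD19"),
    ],
    read="R2",                   # assumption, check for your chemistry
    pattern="5PNNNNNNNNNN(BC)",  # assumption, check for your chemistry
)
print(feature_reference_text)

# In the cellranger multi config this file would be referenced from a
# [feature] section (reference,<path to this CSV>), next to a libraries
# line whose feature type is "Antibody Capture".
```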
Here you have a way to analyze this data, and you have to follow this protocol to get to your reads at the end. CellRanger, again, says that it does not really support 5' multiplexed data and does not recommend it. That's why previous versions of CellRanger showed a warning about it, but they have removed that warning from CellRanger version 7 on. So you can just use cellranger multi, specify the paths linked to the VDJ, and use it like that.
The first step is demultiplexing with cellranger multi on the single-cell RNA-seq part only, without touching the multiplexed VDJ FASTQ files, and this generates BAM files for each sample. So in this first step you demultiplex the samples as if you only had a multiplexed single-cell RNA-seq data set, exactly as I showed before for cellranger multi. This is what the multiplexing part looks like; it is exactly the same as I just mentioned. This gives you BAM files that you then have to convert back to FASTQ files, making sure that you create only one set of FASTQ files per sample, which is quite important. You then have, per sample, FASTQ files for the single-cell RNA-seq data, and you use them together with the VDJ FASTQ files to map to the genome again. This lets you obtain VDJ and RNA-seq results per sample, as desired. That is the idea.
So there is a second CSV file you need to provide, and it looks like this. You have a gene expression part and a VDJ part. The VDJ part is just a reference to the VDJ reference, the assembled genome or transcriptome. And in the gene expression part you have the reference for the single-cell RNA-seq, here the path to the mouse transcriptome.
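As a sketch, this second, per-sample CSV could look as follows, including the two library lines that are described right after. It is written as a small Python helper that renders named sections. All paths and identifiers are placeholders. The VDJ-T feature type is an assumption based on this being TCR data, and the bamtofastq identifier assumes the converted FASTQ files carry that name.

```python
def render_multi_config(sections):
    """Render a dict of section name -> list of rows as config CSV text.

    Each row is a list of fields; sections are separated by a blank line.
    """
    blocks = []
    for name, rows in sections.items():
        body = "\n".join(",".join(row) for row in rows)
        blocks.append(f"[{name}]\n{body}")
    return "\n\n".join(blocks) + "\n"


# One such config per demultiplexed sample; every value is hypothetical.
per_sample_config = render_multi_config({
    "gene-expression": [
        ["reference", "/path/to/mouse_transcriptome"],
    ],
    "vdj": [
        ["reference", "/path/to/mouse_vdj_reference"],
    ],
    "libraries": [
        ["fastq_id", "fastqs", "feature_types"],
        # FASTQ files converted back from this sample's BAM file
        ["bamtofastq", "/path/to/sample_A/bamtofastq_output", "Gene Expression"],
        # The original VDJ FASTQ files
        ["pool1_vdj", "/path/to/fastqs/vdj", "VDJ-T"],
    ],
})
print(per_sample_config)
```

Keeping one config per sample mirrors the point made above that there should be only one set of converted FASTQ files per sample.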
Then you have the FASTQ files here. You will have used the bamtofastq tool to go from BAM to FASTQ files, so they will probably have that in the name. Then you have the path to them, and you say that this is the gene expression. And then you give the path to the VDJ. So these are the two FASTQ paths you need to specify here.
In this data set, what I got was the TCR sequencing and the RNA-seq expression at the same time. So in the end, for each TCR clone, you can tell which cell it came from: was this a TCR from a memory cell? From a naive cell? Or from an active cell? In the data set I had in my hands, they were asking how different immunotherapies given to the mice would differ in the kinds of clones you find. Here is one therapy, PD-L1, then another immunotherapy, PD-1 IL2v, and then the combination of both. What you see here is the percentage of expanded clones, so the clones that are probably doing something. In the case of PD-L1, you have more naive cells, which are not found in the others; you have no active or early active cells, and you have a higher proportion of memory cells. In the case of PD-1 IL2v, you have many more early active cells, which might be the cells that are actively fighting the tumor. And you can see that the combination does not lose those active cells. In terms of TCRs, we were also able to show that one therapy, this one, creates much more diverse TCRs, the other therapy creates much more expanded TCRs, and the combination benefits from both.
So the combination has both a more diverse and a more expanded set of TCRs, and this is why the double therapy worked better. This is, I think, where it was nice to have the TCR and, at the same time, the understanding of the cell types gained from the RNA part.
Spatial transcriptomics is another subject. It is quite difficult to analyze, as I understand it; I have not had such data in my hands. What they have achieved up to now is, I think, 10-micrometer-thick tissue sections on slides with barcoded spots of 55 micrometers, and you get the expression within each of those 55-micrometer spots. So you have the spot information, meaning where it is on your slide, and you have the expression of the cells inside. What you do not really know, however, is how many cells are inside: you don't know whether it's a single cell, or 30 cells, or 50 cells. This is quite challenging in this type of analysis. Instead of CellRanger you use something very similar called SpaceRanger, and the outputs are also quite similar to CellRanger's. You not only have the RNA-seq information but also the XY coordinates on the section. So this is just another layer of information on top of your expression data. They said they would reach single-cell resolution, but they have not yet; they want to go down to 10 micrometers so that they get to the resolution of a single cell. You can use Seurat for the analysis, but there are also other, very specific data analysis tools that have been developed. You can use, for instance, a SpatialExperiment object, which is a special type of object, similar to the SummarizedExperiment and SingleCellExperiment objects, for dealing with that data type.
And in the end, the command you write in SpaceRanger is just spaceranger count, and then you give it the slide information. So it's quite easy to use.
So, multiomics. There is the possibility to combine single-cell RNA-seq with single-cell ATAC-seq on the same cells, and this is quite interesting. The single-cell RNA-seq then helps to annotate the single-cell ATAC-seq through label transfer. So it's with the label transfer function we discussed yesterday that you can link the RNA-seq part and the ATAC-seq part.
And finally, single-nuclei RNA-seq. This is again a reminder of a slide you have seen before. It's an alternative to single-cell RNA-seq, and it's quite useful when you have tissue that is difficult to dissociate. There are no ribosomes, so no translation of transcription factors during processing. You have a lower representation of immune cells and surface proteins. This was shown quite clearly in that article: when they did single-cell RNA-seq and single-nuclei RNA-seq on the same tissue, they ended up with roughly 15% immune cells in one case and only about 0.7% immune cells with single-nuclei RNA-seq. My understanding of the biology is not good enough to say why you would lose so many immune cells, but that is what was observed. In terms of analysis, there is no difference in the CellRanger part; you treat them exactly the same way. There is just a difference in the QC, because you don't have to remove cells with high mitochondrial content. Otherwise the analysis is exactly the same.
I went a little fast, so if you want me to go back to one of the methods because it's the one you will be using for your data set, let me know. And I think that's it for this presentation.