It's now 4:15, as my computer's telling me, so we're gonna get started. My name is Erica Fike. I'm the conference organizer and I'll be the chair for this package demo. Just a few things: this is the first package demo of this conference. If you're here in the room and you have a question, please raise your hand. Nils has asked that he be interrupted with questions rather than waiting until the end, so feel free to raise your hand and we will bring a mic over to you to ask your question. Online, same thing: you can raise your hand in Webex or you can put your question in the chat and we will try to get those questions over to the speaker as soon as possible. And then we will have a Q&A at the end as well. Yep, those are all the housekeeping things I needed to say. So let me introduce you to our speaker. This is Nils Eling. He's a postdoc at the University of Zurich and he's presenting a framework for multiplexed image processing and spatial analysis.

Perfect, yeah. Thanks for the introduction and thank you all for joining. I also want to thank the organizers and everyone giving technical support here. It's really great, it makes it really easy. Yeah, so I'll be looking at the chat once in a while. There will be some points where I take a little break and ask if anyone has a question, so we can discuss in between. So here you see the landing page, or the compiled HTML, of the workshop that I'll be giving today. Most of the demonstration will be in R, but you can follow along. There should also be a link in the program to this HTML.

So what I'll be presenting is a relatively newly developed framework for multiplexed image processing and spatial analysis. It is made up of three packages. One is called Steinbock. It's a Python-based, dockerized framework for image handling and segmentation. Most of this package demo will be on imcRtools, which is a package to read in spatially annotated single cell data and to perform analysis on it. I will also quickly demo cytomapper. This is a bit of an older package to visualize composite images and segmentation masks. So yeah, the code that I'll be presenting today is on top. When you click on this link, you will come directly to the GitHub repository where you see this R Markdown file, so you can test it out. I also provide the packages that you need to install and instructions on how to get this script.

So I want to start with a general overview of the tools that I'll be discussing. We start from multiplexed imaging raw data. Ideally, these are just multi-channel TIFF files. Here in Zurich, we're working with imaging mass cytometry, which is a commercialized system, and they provide raw data in the so-called MCD format. So there's a small pre-processing step that Steinbock can handle to generate these multi-channel TIFF files. Steinbock is a framework to read in TIFF files and to perform segmentation using a pre-trained neural network. It can then also quantify single cell features of these segmented cells and export spatially annotated single cell measurements, as well as segmentation masks and multi-channel images. imcRtools can directly read in the single cell data to then perform a bit of spatial analysis and spatial visualization. And cytomapper can read in multi-channel images and segmentation masks for visualization. So I'm not gonna demo Steinbock here; for this, you would need to install Docker. I provide instructions here where you can download raw data if you want to test it.
The raw data will be downloaded into the data folder. In Steinbock, there's a raw data folder that contains four zip files, which contain the raw data, and a panel file. This panel file is only there for annotating the individual channels that were acquired. With our technology, we can acquire up to 40 proteins at once. We use metal tags to label the antibodies, so this panel file just indicates which antibody, for example here, this is SMA, was labeled with Indium-115. There's also one column indicating if this channel can be used for single cell segmentation, for example nuclear channels for nuclear segmentation and then channels for cytoplasmic segmentation.

The Steinbock framework is initially a Python package; to make it user-friendly, it's dockerized. So what you need to do is set an alias. The commands that I'm showing here are just bash commands. You can set an alias to Steinbock, which is just a docker run command. Here you need to adjust the folder where the raw data is stored, and at the end you just need to specify which Steinbock version you're using. And then with these six commands, you can read in multi-channel images, you can filter them, in this line you segment them using DeepCell, which is a pre-trained neural network for image segmentation, and then you can measure different features of the segmented single cells. The first one here is the mean pixel intensity per channel and per cell. These ones are morphological features as well as the location of individual cells. And the last one is neighbor detection: here spatial object graphs are constructed, indicating if cells are in close physical proximity. All of this will be written out in a folder structure.

I can also show you this here. There's this image folder that contains the multi-channel TIFF files. So here, looking at the raw data, we have four patients, but patient one actually has three images acquired, and for patient two we acquired four images. Same for the masks: for each multi-channel image, we get one segmentation mask. After measurement, we also get the intensity measures per image as well as the morphological features and then the neighborhood graphs. I can also quickly show you what these multi-channel images look like by just dropping them into Fiji and adjusting contrast and brightness. So here we have 40 channels, so 40 different proteins measured. This one, for example, is E-cadherin, a tumor marker. And the last ones here are nuclear markers. And in the mask folder, you have the matching segmentation masks. This is the segmentation mask for the first image, so you can see that roughly the structure fits. Here, every round spot is a single cell with an integer ID. But I will come back to this later. So this is what the Steinbock framework writes out.

Due to this very standardized form, it's pretty easy to read this data into R. To make it easier for you, I also provide this data on Zenodo with instructions on how to download it. And the real package demo will basically start now. So we'll switch to R. For the first sections, I will demonstrate how to read in spatially annotated data using the imcRtools package. imcRtools provides the read_steinbock function, which reads the data generated by Steinbock into a SpatialExperiment object. So here we have a SpatialExperiment object now with 46,000 cells. The counts assay stores the mean pixel intensity per cell and per marker.
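As a minimal sketch of that first step, assuming the Steinbock output was downloaded into a hypothetical local folder data/steinbock/ (the exact path depends on where you put the Zenodo download):

```r
library(imcRtools)
library(SingleCellExperiment)

# Read the Steinbock output (intensities, regionprops, neighbor graphs
# and the panel file) into a SpatialExperiment object
spe <- read_steinbock("data/steinbock/")

spe                    # SpatialExperiment with ~46,000 cells
counts(spe)[1:5, 1:5]  # mean pixel intensity: markers in rows, cells in columns
```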
So here cells are in the columns, and markers, the measured proteins, are in rows. The colData of the SpatialExperiment object stores cell-specific metadata. Every cell has an integer ID. We also measured the area, the major axis length, the minor axis length and the eccentricity of the cell, and we also read in some metadata from the images from which these cells were derived. In the case of a SpatialExperiment object, the cell locations are stored in the spatialCoords slot, so here we have the X and the Y position of the cells. The spatial object graphs, so an edge list indicating if two cells are in close physical proximity, are automatically read into the colPair slot under the name neighborhood. So here we can see the first cell of the SpatialExperiment interacts with the 27th cell, and so on. This is important for visualization later. The rowData of the SpatialExperiment object stores information contained in this panel file that I showed you earlier, so this is just information on the antibodies that were used. And again, here the marker SMA was labeled with the metal Indium-115. The rest is relatively irrelevant here.

I also provide a fully processed SpatialExperiment on Zenodo. I'm only reading this in here now because this SpatialExperiment object already contains a cell phenotype. So in the last entry here, for each cell we have identified if it's a tumor cell, a T cell, a B cell. Of note, this is a cancer data set that I'm presenting here, containing different cancer types. The full processing was done in this IMC data analysis book; I've added the link here. This gives a more detailed overview of how this analysis can be done.

Right, so now we have read in the spatially annotated single cell information extracted by Steinbock. In the next section, we want to read in the images: these are the multi-channel images as well as the segmentation masks, for visualization. For this, we can use the cytomapper package. I wrote this package to handle multiple multi-channel images. It's heavily based on the package EBImage, which is a really great basis for image handling in R, and extends it a bit to handle multiple images. The loadImages function can read in multi-channel images as well as segmentation masks. When reading in 16-bit images, and this is how we save segmentation masks, you have to set the as.is parameter to TRUE to really read in integers. You can also look at these individual entries. So here, this is an EBImage object which stores integer IDs, and here, these are pixels that all come from the cell with the ID 57. We can now set the channel names of these images. Currently, the images have no channel names. In the case of the Steinbock framework, the multi-channel images are stored in the same order as the single cell data, so we can directly transfer the row names of the SpatialExperiment to the multi-channel images. And now the CytoImageList object, which is a class exported by cytomapper, contains 14 images, and each image contains 40 channels, similar to what I've shown earlier in Fiji. One important thing is also to add the element metadata of the CytoImageList object. This is important for matching the multi-channel images and the segmentation masks. We want to make sure that the names of both images and segmentation masks match, so they're in the same order. And here we can set the element metadata of the list to contain the sample ID. So the sample ID is an image identifier.
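A minimal sketch of these steps, assuming the multi-channel TIFFs and the 16-bit masks sit in hypothetical img/ and masks/ subfolders of the downloaded Steinbock output:

```r
library(cytomapper)

# Read multi-channel images and segmentation masks into CytoImageList objects;
# as.is = TRUE keeps the integer cell IDs of the 16-bit masks
images <- loadImages("data/steinbock/img/")
masks <- loadImages("data/steinbock/masks/", as.is = TRUE)

# Channels are stored in the same order as the rows of the SpatialExperiment
channelNames(images) <- rownames(spe)

# Element metadata: the sample_id links images, masks and single cells
mcols(images) <- mcols(masks) <- DataFrame(sample_id = names(images))
```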
This can now be found in the element metadata of the images, of the masks, as well as in the SpatialExperiment object, so there's also a sample ID here. So now you can link the images, the segmentation masks, and the single cell information contained in the SpatialExperiment object.

Right, so another option would also be to generate single cell data directly from the multi-channel images and the segmentation masks by using the measureObjects function from cytomapper. I'm directly gonna start this because it's gonna take a couple of seconds. For every cell contained in a segmentation mask, it calculates the mean pixel intensity for every channel and then also records morphological features and the locations of individual cells, and stores this, in this case, in a SingleCellExperiment object. But since a couple of days ago, cytomapper also supports SpatialExperiment objects. And at this point, I can ask if there are already questions, since this will still run for a couple of seconds.

We're not seeing any hands raised in the room. Anybody? I mean, that's good. Yeah, you're being very clear.

So what's happening here now is, well, the operations are quite heavy, so it needs to iterate through all images. For 14 images, it's really manageable. I think we've tested it on 700 images and that took about one hour. So using Steinbock for segmentation is definitely recommended.

There's a question about the backend of the SingleCellExperiment. So this is all done in memory at the moment. The SingleCellExperiment is usually actually quite small, so we never had issues with keeping it in memory. cytomapper also supports images kept on disk; they are stored as HDF5 files. You can either generate them in external software or you can use cytomapper to read in images and then write them out as HDF5 files. It does, however, make everything slightly slower. The memory usage is minimal, but the computations take a bit longer since individual images need to be read into memory first and then processed.

So here the created SingleCellExperiment object again contains 46,000 cells. It has row names based on the channel names that the images had, and a relatively simple colData annotation based on the morphological features: again the area, the radius, and then the X and the Y location. Since this is a SingleCellExperiment object, the locations are stored in the colData and not in the spatialCoords.

Will Steinbock also work on single-channel TIFFs, and would that be an issue with cytomapper? Not really. We haven't really checked Steinbock on single-channel TIFFs; in theory, it should work. There might be an issue with the segmentation because you would need at least one channel for the nuclear stain and one channel for the cytoplasmic stain. So ideally you want to have a two- or three-channel fluorescent image for segmentation. cytomapper can handle single-channel TIFFs for measurement and also for visualization.

So for visualization, we're gonna select these three images. What if you segmented outside Steinbock? Yeah, I guess, I mean, tools like Cellpose and StarDist, they support only nuclear segmentation, so I guess there a single channel will be fine. We tend to do whole cell segmentation. But yeah, if you're using Cellpose, then there shouldn't be any issue, and it will create similar segmentation masks to what Steinbock is doing.

Okay, so we have selected three images and we can now visualize them. Here I'm coloring them based on six markers.
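A minimal sketch of that call; the channel names and the bcg values are illustrative and need to be adapted to the actual panel:

```r
library(cytomapper)

# Take three images for visualization (the selection here is arbitrary)
cur_images <- images[1:3]

# Overlay six markers; each bcg entry is c(background, contrast, gamma)
plotPixels(cur_images,
           colour_by = c("Ecad", "CD3", "CD20", "CD8a", "CD38", "Ki67"),
           bcg = list(Ecad = c(0, 5, 1),
                      CD3 = c(0, 5, 1),
                      CD20 = c(0, 5, 1),
                      CD8a = c(0, 5, 1),
                      CD38 = c(0, 5, 1),
                      Ki67 = c(0, 5, 1)))
```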
And this bcg parameter stands for background, contrast, and gamma, so I'm just enhancing the contrast for each marker. We can also zoom into these images. So here in red, this is E-cadherin, a tumor marker. In green, you have CD3, a T cell marker. In blue, this is CD20, a B cell marker. Turquoise you can't really see since it overlaps with the green; it's the cytotoxic T cell marker. CD38 in magenta is a plasma cell marker. And Ki-67 is a proliferation marker. cytomapper only supports up to six colors; due to spectral overlap, at some point if you visualize more colors it will just be white.

Let me finish the visualization section and then I will come back to the question in the chat. So for pixel visualization, we use the plotPixels function. cytomapper also exports the plotCells function. This one takes segmentation mask objects and can color each cell based on metadata or based on expression. So in this case, we can visualize the matched segmentation masks and color them by cell type. Up here, you can see some tumor cells. And here in pink or light purple, you see cells that we call B-next-to-T cells: due to our low resolution, we can't really differentiate between B cells and T cells if they sit really close to each other, so we just call them B cells sitting next to T cells. You can also provide custom color vectors. In this case, the color vector is a named vector: the names are the cell types and the colors indicate how these cell types should be colored. And then you get the same segmentation masks out, just colored differently. One more thing you can do is subset the SpatialExperiment object and specifically color a certain cell type of interest. Here, by setting the missing color to white, you color all cells that are not part of the SpatialExperiment object white, and all remaining cells, in this case CD8 T cells, are colored in red.

Okay, so there was one question whether you can write your own custom feature extraction function for measureObjects. The measureObjects function supports a lot of features already: you can calculate the mean, median, and standard deviation of intensities per cell, you can calculate all sorts of morphological features, and it also supports Haralick texture features, which is just something that EBImage provides. Custom feature extraction is currently not supported, but it would be possible to add this via an issue or pull request. And then there was a question about EBImage's propagate cell body segmentation, whether I have experience with it. I've used it before, I don't remember that it does a Voronoi tessellation, but I've never really used EBImage for segmentation other than thresholding and object detection, so I can't say too much about this.

Good, so in the last part of this demo, I wanna switch to spatial analysis. Working with this SpatialExperiment object, we have spatially annotated single cell data and we can do some relatively simple things using imcRtools. First, I've mentioned before that Steinbock generates these spatial object graphs by expansion: it expands the mask of every cell, and cells are considered in close spatial proximity if the expanded masks overlap. The imcRtools package also provides the buildSpatialGraph function, so you can calculate these graphs yourself using different settings. For example, you can construct a KNN graph between cell centroids, or an expansion graph.
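A minimal sketch of these graph constructions, including the Delaunay variant that comes up next; the k and threshold values are the ones mentioned in the demo, while max_dist is just an illustrative number:

```r
library(imcRtools)

# 20-nearest-neighbour graph between cell centroids
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20)

# Expansion graph: cells within 20 micrometers are considered interacting
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20)

# Delaunay triangulation; max_dist avoids long edges at the image borders
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 50)

colPairNames(spe)  # each graph ends up as its own colPair entry
```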
With the expansion graph, you only detect interacting cells within a certain distance, or you can also construct the graph via Delaunay triangulation. We can run all of these commands. So here we detect a 20-nearest-neighbor graph, and the next graph uses an expansion of 20 micrometers. And when you're using the Delaunay triangulation, it's worth setting a max distance, otherwise you get funny border effects. All of these graphs are stored in the colPair slot under different names. So here the first entry is the graph exported from Steinbock, and then the KNN interaction graph, the expansion graph, and the Delaunay graph.

Now, using the plotSpatial function from imcRtools, you can also visualize these graphs. Here we are just selecting one image, we also want to draw edges, and we color the nodes by cell type. So when we do this, I hope you can see this, we can make it big. Every dot is the centroid of a single cell, it's colored now by cell type, and then you have these gray edges between interacting cells. Steinbock generates an undirected graph, so you have these bi-directional edges between cells. The KNN interaction graph now contains a lot more edges because we detect the 20 nearest neighbors, and it also takes a bit longer to plot since there are a lot of edges that need to be drawn. You can see these cells are highly connected, and I wouldn't really consider them direct neighbors anymore, but a 20-nearest-neighbor graph now accounts for longer-range interactions, and we will see a bit further on in this demo why this might be needed. And then the last one is the Delaunay triangulation graph. Actually, I skipped the expansion. So the Delaunay triangulation also detects these somewhat longer-range interactions between cells. One feature of this plotSpatial function: you can also give it the image ID and it will plot all images side by side. For our data, the images are relatively small, they're only 600 micrometers in width and height, and this doesn't render too nicely now. But here you can already see different structures. For example, these four images here contain the so-called B-next-to-T cells, indicating tertiary lymphoid structures.

All right, a question: for the per-cell protein expression level, do you use the mean or the total intensity of the pixels belonging to the cell? We usually use the mean intensity, which is just the total pixel intensity divided by the area of the cell. There are also other ideas for normalizing this better, but for us the mean has always worked. Some people also tend to use the median, but if you have markers that are only expressed on certain spots of the cell, or just on the membrane, the median will be zero, and this might be an issue.

Good, so the next part of this demo is how to calculate cellular neighborhoods. This is a term coined by Garry Nolan's lab, and they published two papers based on it. The approach is relatively simple: for every cell, you calculate the fraction of cell types in its direct neighborhood, and you then cluster based on these fractions. The imcRtools package provides the aggregateNeighbors function. This function generates a new entry in the SpatialExperiment object, which is a data frame. Every row here is again a single cell, and then you have the fraction of cell types in its neighborhood. Here I'm using the 20-nearest-neighbor graph, so I'm aggregating across the 20 nearest neighbors in 2D.
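A minimal sketch of this aggregation plus the k-means clustering described just afterwards, assuming the phenotypes live in a colData column called celltype, the 20-nearest-neighbour graph is stored under the name knn_interaction_graph, and the result lands in colData(spe)$aggregatedNeighbors; the number of clusters is illustrative:

```r
# For every cell, compute the fraction of each cell type among its 20 nearest neighbours
spe <- aggregateNeighbors(spe,
                          colPairName = "knn_interaction_graph",
                          aggregate_by = "metadata",
                          count_by = "celltype")

# Cluster cells into cellular neighbourhoods based on these fractions
set.seed(42)
cn <- kmeans(as.matrix(spe$aggregatedNeighbors), centers = 6)
spe$cn_celltypes <- factor(cn$cluster)
```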
For example, here for cell four, 5% of neighbors are CD4 T cells, 10% are Tregs, and the rest are tumor cells. You can then use this information to cluster the cells; here we are just doing a simple k-means. And then we can visualize these cellular neighborhoods. So here, these tertiary lymphoid structures are part of this pink neighborhood, the cyan cells are tumor cells, and these purple cells are the tumor-stroma border cells. We can also visualize the composition of each cellular neighborhood as a heat map. I'm not gonna go into details here since there are only five minutes left. And for brevity, I'm also gonna skip the patchDetection function, which allows you to detect patches of predefined cells, and directly move on to the interaction analysis.

This was proposed by Schapiro. It is based on a permutation approach to identify cell type pairs that interact more or less frequently than what you would expect by chance. I'm gonna start this since it also runs for a couple of seconds. What it does is, per image and per cell type pair, it computes the average interaction count. Then it shuffles the labels, in this case 200 times, and generates an empirical null distribution of the average cell type interaction count. The actual count is then compared against this empirical null distribution, and empirical P values can be calculated. So you get a sort of statistical readout of whether certain pairs of cell types attract each other or avoid each other. The default iteration parameter here is 1000; usually it also works if you permute 300 times or so.

Good. So the result is a data frame: for every image and every cell type pair, it lists the average interaction count and then provides empirical P values. And the sigval entry indicates whether two cell types are interacting, which is 1, whether there's no statistical significance, which is 0, or -1, meaning they avoid each other. These values can also be summed up across all images, and then you get these heat maps where the colors indicate if certain cell types are interacting. For example, here B cells interact with B-next-to-T cells. Tumor cells interact with each other but avoid all other cells, so they're quite compartmentalized, and CD4 T cells interact with Tregs, which is not super surprising. And with this, yeah, I'm already at the end. There are also links to all the other resources that I mentioned. I'm not sure if we have time for questions now.

I have one question in the room and I think that will be our last one before we move on.

Hi, thank you for the talk. So, okay, maybe it's a bit early for me to ask this question; I'm actually going to present about it tomorrow in the package demos. I wrote a package called SpatialFeatureExperiment which extends SpatialExperiment, but unlike your package, it stores the cell segmentation as vector polygons with the vertex coordinates instead of masks, so it's more memory efficient. So maybe, if you get a chance tomorrow to take a look at SpatialFeatureExperiment, would you consider integrating SpatialFeatureExperiment into your package, so reading your data as an SFE in addition to SPE and SCE?

Oh, sure. I mean, I know of SpatialFeatureExperiment. It would be nice. I'm not sure if you already have a reader for converting segmentation masks into polygons, but I can also check tomorrow.
We're using segmentation masks to overlay outlines of cells on composite images, so you still need to store the images somewhere.

Yeah, so I also wrote functions for applying operations to the polygons. And you can actually convert the masks into polygons, for example using the terra R package, which is designed for raster geospatial data.

Okay, cool. Yeah, that's nice. I will check this out.

Any other questions from our in-person audience? Okay, one quick question.

So in the interaction plot you were showing, what does Treg interacting with self mean? Like on the diagonal, Treg interacting with Treg, for example.

So yeah, it's a common thing for our data; we usually observe certain cell types interacting with themselves. This could be biological, so it could mean that Tregs are sitting next to Tregs. It could also be a segmentation or phenotyping issue where, for example, a single Treg is split into two. It's sometimes a bit hard to tell. For example, we know that B cells tend to cluster, so they form these large bulks of B cells, these tertiary lymphoid structures, and there we expect them to interact more often compared to a random distribution of B cells. But neutrophils, for example, don't really tend to cluster, except in necrotic areas, which we didn't acquire.

All right, well, in order to give everybody time to get to the next session, we're gonna close this one. Thank you very much. A round of applause for our speaker. Thank you so much. Thank you. There is a session here in just a few minutes, so you can either stay or head back to the cure building. Thank you. Bye.