We have the NEUBIAS Academy, if you haven't heard about us before today. We started in April this year and have already run 21 webinars, with huge interest: over 10,000 registrations, and the recorded videos on the NEUBIAS YouTube channel have collected over 27,000 views. For today I will pass on to Marion, who organized this webinar, to introduce the speakers and moderators. Today we have two speakers: Florian Levet, who is working at the IINS in Bordeaux, and Thibault Lagache, who is working at the Institut Pasteur in Paris. They are both specialists in the analysis of colocalization with advanced methods. Florian will speak for the first part of the webinar, approximately 40-45 minutes, and then Thibault will continue. We also have a range of panelists here to help answer the questions you put in the Q&A, to moderate the session, and to ask some questions to the speakers, so don't hesitate to ask questions in the Q&A. Florian, the floor is yours; I'll let you share your screen. Okay, this is good. Thank you, Marion. I'm Florian Levet, part of the team of Jean-Baptiste Sibarita at the Interdisciplinary Institute for Neuroscience, and in this first part of the webinar I will focus on colocalization methods for single molecule localization microscopy. I'm not in presenter mode... okay, now. You already know the speakers and the moderator team. My presentation will be a short intro on colocalization in fluorescence microscopy, and then on single molecule localization microscopy. I will present a few techniques that use unorganized neighborhood methods, then the technique that we developed, which is tessellation-based, and we will end with a fast demo of the software we developed, which is Coloc-Tesseler. You can find the software at this link on my GitHub, and you also have access to the slides I present today on my GitHub.
For those who attended last week's webinar by Fabrice Cordelières, you already know about colocalization in fluorescence microscopy, but I have one slide to get back to that. Usually, people doing colocalization in fluorescence microscopy use two very well-known coefficients: the Pearson correlation coefficient and the Manders coefficients. The Pearson coefficient is an indicator: it gives you a relative value for the colocalization, which is good for comparisons, for instance, whereas the Manders coefficient is more of a quantifier, where you get a real value of the colocalization. So if we have these two images, where you have some spots and you want to know if they are colocalized or not, it is pretty easy to do this kind of colocalization analysis, Pearson or Manders, because you have a pixel-to-pixel correspondence: this pixel in this image corresponds to this pixel in the other image. So it is quite easy to compute the colocalization pixel by pixel with respect to the two stainings that you have. In the case of the Pearson correlation coefficient, you build a scatterplot with the intensity in the first channel against the intensity in the second channel. For instance, in this case you have a high intensity in channel one, so you are near the end of the scatterplot, while it is quite moderate in channel two, so you get a point that is here. You can do that for every pixel in the images and you end up with this kind of scatterplot. Then you compute the Pearson coefficient directly on the coordinates of the points in this scatterplot.
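As a rough illustration of this pixel-wise computation, the Pearson coefficient can be sketched in a few lines of Python (a minimal sketch, not the code of any particular tool):

```python
import numpy as np

def pearson_coefficient(ch1, ch2):
    """Pearson correlation between two same-sized channel images.

    Each pixel contributes one point (I1, I2) to the scatterplot, and the
    coefficient is computed on the coordinates of those points.
    """
    x = np.asarray(ch1, dtype=float).ravel()
    y = np.asarray(ch2, dtype=float).ravel()
    x = x - x.mean()
    y = y - y.mean()
    return float((x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum()))
```

Two identical channels give 1.0, an inverted channel gives minus 1.0, and uncorrelated channels fall around zero, which matches the three cases discussed next.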
You know that if you have some exclusion, so anti-correlation, the coefficient is close to minus one; if the two channels are not correlated you get this kind of fit and you are around zero; and if they are correlated you get this kind of fit and the coefficient will be close to one. So this gives you a correlation, and usually you use it across different conditions: you know that one condition is more colocalized than another if the value you get with the Pearson correlation coefficient is higher. You can also compute the Manders coefficients, which are more of a quantifier. Now what you do is use a threshold in order to do some segmentation: for instance, here you can see the thresholds that were applied, giving you these objects. Then you just compute a quantifier of the overlap between the two channels. For the Manders coefficient of channel one, it is the yellow area divided by the yellow and magenta areas, and for the Manders coefficient of channel two, it is the yellow area divided by the yellow and green areas. That means that for these two images you get a Manders coefficient of 0.3 here and a coefficient of 0.77 here. But that is what you do when you have images, with a pixel-to-pixel correspondence. My talk is about single molecule localization microscopy, and there things are really different. It is contrary to conventional fluorescence microscopy, where all the fluorophores are in the image at the same time. In widefield, for instance, you have the problem of the diffraction limit of light, and you cannot resolve structures that are below around 250 nanometers.
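The area-based Manders coefficients described above can be sketched like this, assuming the two channels have already been thresholded into boolean masks (a minimal illustration):

```python
import numpy as np

def manders_overlap(mask1, mask2):
    """Area-based Manders coefficients from two thresholded (boolean) masks.

    M1 = overlap area / channel-1 area (yellow over yellow + magenta),
    M2 = overlap area / channel-2 area (yellow over yellow + green).
    """
    mask1 = np.asarray(mask1, bool)
    mask2 = np.asarray(mask2, bool)
    overlap = np.logical_and(mask1, mask2).sum()
    return float(overlap / mask1.sum()), float(overlap / mask2.sum())
```

With binary masks the two coefficients are simply ratios of pixel counts, which is why they read directly as fractions of overlap.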
In single molecule localization microscopy, we only switch on a sparse subset of the fluorophores, as you can see here. Because they are very sparse, we are able to localize them very precisely, for instance using Gaussian fitting, and then we have the real position, with some uncertainty on the localization. You can then reconstruct your data as, basically, a point cloud in 2D or 3D with the coordinates of the molecules. But then, how can we use these localizations to compute the colocalization? Because here we really have points. At first, people just went back to what they knew and did image reconstruction. Basically you have the pixel size of your camera, which is, for instance, 116 nanometers in this case. Here you can see a diffraction-limited image: basically, we don't see anything in this region. If you have the localizations, you have the sub-pixel positions that you can see here. So you can say, okay, I will define a new image with a pixel size of 20 nanometers, and you just project every localization into the pixels, and then you will have this kind of image. You can then apply a Gaussian that depends on the uncertainty of your localization precision, and you will have an image that is smoother. And because you have an image, you can go back to something like the Pearson coefficient or the Manders coefficient to compute the colocalization. But in a way it is a pity, because we spent so much time trying to find the correct positions of the localizations. We are in a continuous space, because we have our point cloud in 2D or 3D, and it is a pity to go back to an image, to discretize the information and lose some of the information that we have with the localizations.
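This reconstruction step, projecting localizations into a finer pixel grid and optionally blurring by the localization precision, can be sketched as follows (the 20 nm pixel size is just the example value from above, and the Gaussian step needs SciPy):

```python
import numpy as np

def render_localizations(xy, extent_nm, pixel_nm=20.0, precision_nm=None):
    """Project localizations (x, y in nm) into a super-resolved pixel grid.

    If precision_nm is given, the histogram is smoothed with a Gaussian
    whose sigma matches the (mean) localization precision.
    """
    n = int(np.ceil(extent_nm / pixel_nm))
    img, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=n,
                               range=[[0.0, extent_nm], [0.0, extent_nm]])
    if precision_nm is not None:
        from scipy.ndimage import gaussian_filter  # optional dependency
        img = gaussian_filter(img, sigma=precision_nm / pixel_nm)
    return img
```

Once rendered this way, the image can be fed to the Pearson or Manders computations above, at the cost of discretizing the point coordinates.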
So when we have two stainings, we have two point clouds, and we want to be able to quantify the level of colocalization between these two point clouds. There are different solutions. You can do object-based colocalization, computing statistics on segmented objects in your localizations, but that is something I won't talk about because it will be developed in part two by Thibault. And there were a few techniques specifically designed to work with single molecule localization microscopy that use unorganized neighborhood methods. The idea is that you compare the spatial distribution of the two channels in a defined vicinity around each localization. For instance, in 2014, Rossy et al. released a technique based on the Getis and Franklin method. Here you can see a basic simulation with circular clusters that are separated but have some overlap, and note that we have exactly the same localization density in the two stainings. Here you can see a magnification of one of the clusters. What you do is define a radius and count the number of localizations inside this radius in your own color and in the other color. With this number you can compute the L value, and you can see it here: this value is the number of localizations of a color within the radius; you divide by the total number of localizations of your data, multiply by the total area of the data, and divide by π. Basically, you are doing a kind of normalization with respect to the search radius and to the area of your image, of your data.
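A minimal sketch of such a local L value, using one common form of the normalization (count within the radius, scaled by the dataset area A and the total number of points N; the exact formula in the original paper may differ in details):

```python
import numpy as np

def local_L(points, query, radius, area):
    """Local Getis-Franklin-style L value for each query point.

    L_j(r) = sqrt(A * n_j(r) / (pi * N)), where n_j(r) is the number of
    localizations of `points` within `radius` of query point j.
    """
    points = np.asarray(points, float)
    query = np.asarray(query, float)
    N = len(points)
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    counts = (d2 <= radius ** 2).sum(axis=1)
    return np.sqrt(area * counts / (np.pi * N))
```

With this normalization, a completely random distribution gives L(r) close to r, so values well above r flag a local enrichment of the other (or the same) color.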
And it is very important to do some normalization, because when you are doing single molecule localization microscopy you can have stainings with very different localization densities: it will depend on the fluorophore, the labeling, the acquisition time. So you can really have different densities between your two colors. Once you have done that, you have an L value for each localization. You can put them in a scatterplot and then just define a threshold on the scatterplot in order to find the localizations that are colocalized. What we saw when we tested this is that, even with the normalization, if we change the ratio between the two colors, so now we have five times more localizations in the first channel, the frame of the scatterplot changes. Here we were between 150 and 150, and now we are between 150 in this color and 90 in the other color. That means that if you take the threshold that finds the colocalized localizations in this one-to-one dataset and put exactly the same threshold here, you won't get the same colocalization analysis, and that is something that can be problematic if you have a lot of different data to analyze, a lot of different cells. If you need to change your threshold each time, that is a problem. The normalization is really good, but not sufficient for what we need to do. Another technique, published in 2012, is the coordinate-based colocalization (CBC) technique, developed by Sebastian Malkusch in the team of Mike Heilemann. Again, the idea is to define a radius, together with a number of bins; the number of bins divides the space of your radius into concentric rings in which you count the localizations. So basically you say you want five bins and the maximum radius is 15 nanometers; that means each bin will be 3 nanometers, and you count the number of localizations of the two colors in each bin.
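The per-ring counting that CBC starts from can be sketched like this (the radius and bin count are whatever you pass in; the real CBC additionally normalizes the counts by the ring areas):

```python
import numpy as np

def ring_counts(center, points, r_max, n_bins):
    """Count localizations of one channel in concentric rings around a
    localization, as in coordinate-based colocalization (CBC).

    Returns one count per ring, from the innermost ring outward.
    """
    d = np.linalg.norm(np.asarray(points, float) - np.asarray(center, float),
                       axis=1)
    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts, _ = np.histogram(d, bins=edges)
    return counts
```

Calling this once per localization, for its own color and for the other color, produces the two count profiles that are then compared ring by ring.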
Here we have one red and two green, so we note that. Let's go to the second ring here: we can count that in this ring we have four red localizations and three green localizations. We do that for each of the bins, this one, this one, this one, we put that in a scatterplot, and then we can compute the Spearman rank correlation, which gives you the same kind of idea as the Pearson coefficient. But why don't we use the Pearson coefficient here? Again, it is because of the difference in localization density that you can have between stainings in single molecule localization microscopy. The Pearson coefficient expects a linear relation between your two values, the density in channel one and the density in channel two, and that is something that is very difficult in single molecule localization microscopy, because it means you would need the same density in the two channels. When you use the Spearman rank correlation, you don't need this linear relationship between the two densities, and that is very interesting for single molecule localization microscopy. Here I just explained a very basic version of how it works, because it is a little more complicated than that. First, it is not the raw number of localizations: there is a slightly more complicated computation where you again normalize with respect to the area of your search. And when you do the Spearman rank correlation, you do not do the fit on the direct values that you put here, but on the ranks; here I illustrated the ranks on the raw numbers, but it would be done on this computation here.
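The Spearman rank correlation itself is just a Pearson correlation computed on ranks, which is exactly why it does not assume a linear relation between the two densities. A minimal sketch, ignoring the tie handling that a full implementation such as scipy.stats.spearmanr would do:

```python
import numpy as np

def spearman_rank(x, y):
    """Spearman rank correlation: Pearson correlation on the ranks of the
    values rather than on the values themselves (no tie handling here)."""
    def rank(v):
        order = np.argsort(v)
        r = np.empty(len(v))
        r[order] = np.arange(1, len(v) + 1)
        return r
    rx, ry = rank(np.asarray(x, float)), rank(np.asarray(y, float))
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))
```

Any monotonically increasing relation between the two density profiles gives a value of one, even when the densities differ by a large factor.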
And we can see that here: to rank, basically, you take an array and you do a sorting, which means that this one is first, this one is second, here third, fourth and fifth, and you do exactly the same with the second color; then you do your correlation on these ranks, not directly on the values that you put here, but I wanted something pretty clear to understand. In the end you are doing your rank correlation, and you know that if you are close to minus one you have exclusion, if you are close to zero you don't have any correlation, and if you are close to one you have a very strong correlation between your two stainings. Here you can see two different conditions: some overlap of the clusters here, and a full colocalization here. I have defined a search radius of 15 nanometers and a number of bins of five, and we can really see, on the distribution of the Spearman rank correlation of each localization, that we are a lot more colocalized in this case. Still, the problem is that you have two parameters, and every time you have two parameters you can change either of them and get different results, so it can be quite difficult to find exactly the correct parameters for a proper computation of your colocalization. Another problem in the paper is that they computed the percentage of colocalization as the number of localizations above the threshold divided by the total number of localizations. In this case, that means that here, where we have exactly the same clusters but just more green points in the background, we get more colocalization, simply because the background is higher. And that is something that can again happen in single molecule localization microscopy, where one of your stainings may be
less specific, so you will have more localizations in the background, and that is problematic when you want to compute the colocalization. So in 2016, Pageon et al. tried to improve this technique and released Clus-DoC, which is a combination of DBSCAN and coordinate-based colocalization. DBSCAN, density-based spatial clustering of applications with noise, is a segmentation technique. Those who attended the webinar on single molecule localization analysis already know it, because I presented it there, but I will do a few slides to explain to the others what DBSCAN is. DBSCAN really is a segmentation technique: what you want to do is segment some part of the localizations and say they are objects of interest for you. You organize the localizations into three classes: the core points, the density-reachable points and the noise points. Again you have two parameters: the search radius used to compute the density, and the minimum number of points that defines whether a point is a core point or not. Basically, if you have just this small cluster, for instance, and you define a minimum of four points, that means that with this value of the radius you compute the number of localizations here: it is five, five is bigger than four, so you have a core point. Then if you go here, you have two points, so it is not a core point, but one of the neighbors is a core point, so it is a density-reachable point. And in this case here you have only one point, so it is below four, and there is no core point in the vicinity, so it is a noise point. You do that for every localization, you end up with this kind of classification, and you say that your objects are the core points plus the density-reachable points; then you have your objects and the background. The idea of Clus-DoC was to limit the colocalization analysis to the clusters, to not have this problem with the background, so
you run DBSCAN, you segment your clusters in the two colors, here you can see the first clusters in magenta and the second clusters in green, and then you compute the coordinate-based colocalization only on the points inside the clusters. But the problem here is that you have five parameters: two for DBSCAN, two for coordinate-based colocalization, and one for the threshold. Maybe you can use the same radius for DBSCAN and for CBC, but it is not even sure, because the two techniques do not compute exactly the same density value, so maybe you cannot choose exactly the same radius. And what we see is that if you have a ratio of one to one and a ratio of one to five, for instance, in this case with a lot more green localizations, you have a real problem with DBSCAN, because DBSCAN uses a hard threshold, so you cannot directly have a normalization, and you really need a threshold that depends on your density. So if you fix a threshold and apply it to every one of your cells, you end up with this kind of segmentation for one of them, and then your CBC will be rubbish, because now you have done the CBC on all these localizations here, and you can see that you get a higher colocalization in this case just because you have some points in the background. And if we go back to fluorescence microscopy, as I presented in my first slide, what we call the gold standard is the Pearson and Manders coefficients, basically because a lot of people use them, since they are very easy to use and understand, and you can use them because you have the same image dimensions. For long there was no equivalent possibility in single molecule localization microscopy, because there is no easy way to link two localizations, since they are not at the same position. The problem with the existing techniques for the analysis of multicolor single molecule localization microscopy data is that, while they are very nice and have very good
normalization and so on, they can be difficult to generalize to varied shapes and densities, and some of the parameters you need to use really depend on the shape or the density. So at some point we were interested in trying to adapt the Pearson and Manders coefficients to multicolor single molecule localization microscopy, and we really wanted something robust to the variability in shape and density. For a few years now we have been working with the Voronoi diagram, which is a space-subdividing technique that is anisotropic by nature. We really like this structure because for each localization you create one polygon: here you can see the localizations and the Voronoi diagram, and very basically, any point of space inside a polygon is closer to this localization than to any other. That is really the way the edges are built: each edge is the bisector between two points. There are very nice features of the Voronoi diagram. For instance, you have the connectivity, so you know the direct neighbors, the polygons that share an edge: for this one we know that the direct neighbors are these one, two, three, four, five, six localizations. And it is very scalable, which means that the denser the localizations are, the smaller the polygons will be. With this kind of information you can really add some very interesting features to the localizations. What we did is a generalization of that: we can use the polygons to compute a set of different statistics. For instance, if we are at this localization, we obviously know its polygon, so at rank zero the number is one, the localization here; the area is the area in yellow, the area of the polygon; and the density is one divided by this area. If we go to rank one, we know the direct neighbors, so now the number is five; the area of all the polygons is the yellow plus the green;
and then the density is five divided by the total area. You can do that for rank two and for rank K, but what we did is use rank one, because we find it very interesting since it has a smoothing effect on the computation of the density. Here you can see that the colors of the polygons are directly related to the density, so this one is the densest of all the localizations that we have in this dataset. And what we saw is that with the Voronoi diagram we can do some kind of Voronoi normalization. It is not a real normalization from a mathematical point of view, but we call it a normalization in a very loose way. If you have these clusters that you can see here, with a ratio of one to five, so five times more localizations here but exactly the same structures, and you compute the local density as I explained, you will have these two histograms. What we can see here, on the logarithmic distribution of the density, is that the two histograms are shifted, which is expected because this one is denser. If we apply a threshold, and here we obviously chose the threshold between the background and the clusters, we have a nice selection of the localizations in the clusters; but if we apply exactly the same threshold to this denser dataset, now we have a lot of localizations in the background that are selected, and that is quite bad. And what we saw is that a very simple Voronoi normalization, where you divide the density computed for each localization by the average density of your dataset, gives this result: now the two histograms are on the same frame, and if we apply a threshold at zero, which, because it is a logarithmic distribution, means one times the average density, you really manage to select the localizations that are in the clusters. That is very interesting, because now we can use this kind of normalization in order to apply the same
automatic technique to all the cells in a dataset. What we did first, in 2015, was to use that for segmentation: we compare the local density to the average density of the dataset, and we used that to segment synaptic proteins, so we managed to quantify the number of AMPA receptors in nanodomains. What we were interested in is to see if it is really robust, because we did that for synaptic proteins, but then we used it for clustering and automatic segmentation across 96-well plates, so a lot of different data: we used exactly the same threshold everywhere and we managed to do some really nice segmentation, where we saw that the clustering was different depending on the concentration. And since the release of our software, a lot of different groups have used it, for instance in plant biology or for viruses, and we have seen that this segmentation, this selection of the localizations that are part of objects of interest, is really robust to the localization density. But now we were at the point where we have these two stainings and we want to be able to go from them, so from one Voronoi diagram for each staining, to this kind of result: to classify each localization as background, high density in one color, or colocalized. In the end the technique is very simple and really just takes advantage of the features of the Voronoi diagram. If you have this very simple simulation with two clusters that have some overlap, you have your two Voronoi diagrams, and now you can do, as in the Getis and Franklin framework or for the Pearson coefficient, a scatterplot where you plot each point with its density in channel one against its density in channel two. Obviously, if you have a red point here, it is very easy to find its density in its own color, because we know its polygon; for instance, here we have its
density in the first color. But in the end it is just as easy to find the density in the second color, because this point is inside a polygon of the other diagram, and we know which localization this polygon depends on, so we can find the density of the other staining for this point, which is this one. So now we know that this point has high density in color one and low density in color two, so we put it at this position on the scatterplot. If we go to this point here, we see that it has low density in both channels, so now we have a point that is here. If we go to this point here, we have high density in the two colors, so we have a point that is here. We can do that for all points in the dataset, and obviously we can separate the two scatterplots to have one scatterplot per color. Then we can apply the threshold that I showed before, very robustly, in order to have what we call the Voronoi-Manders coefficients. With this threshold we also obtain our classification of the localizations: the localizations in the background are black; in yellow and cyan we have localizations with high density in only one color; and in red and blue we have localizations with high density in both colors. We can compute the Voronoi-Manders coefficient of the first color as the sum of the densities of the localizations in red divided by the sum of the densities of the localizations in yellow and red, and we can do the same for the second color with blue divided by cyan and blue. We also have a way to compute a very generic Spearman rank correlation on the whole dataset, by simply computing the Spearman rank correlation on this whole scatterplot here, and on this one obviously. So from this dataset we end up with this kind of result, where we can really see directly where the colocalized regions are and where the regions with high density in only one color are. So we wanted to see if this technique is really robust, and we used some experimental controls: for instance, we used microtubules where we had a DNA-PAINT
acquisition, and we separated the acquisition in two, so it is one dataset, one cell, that we divided in two in order to have two perfectly colocalized datasets. You can see here different cases: a lot of microtubules, very few, and in between. We also varied the localization density in one of the colors, so in magenta we always have exactly the same number of localizations, but in the second color we start from a one-to-one ratio and go down to a one-to-ten ratio, where we removed 90% of the localizations. When we plotted everything, we saw that we have a very high value even in this case where we have very few microtubules, and in all cases, even at a ratio of one to ten, it is very, very nice. We don't have 100% because we are still on single molecule localization microscopy data, so we always have some differences: we don't have pixels, so we have differences at the borders of the structures of interest. Our technique works in 2D and in 3D, so we did a very basic test to see that it also works really well in 3D. Here you can see some microtubules and the nucleus, with the nucleus below the microtubules, and when we did our analysis in 3D we had a Manders coefficient close to 0%, which was very nice. All these data were acquired and analyzed by Rémi and Corey from the team. Then we asked some collaborators, Philipp Hoess and Jonas Ries at EMBL, to give us some data on the nuclear pore complex, because they are very challenging. Here you can see two proteins, one at the center and one at the periphery of the nuclear pore complex; I think the separation is something around 16 nanometers, so every time you do a colocalization analysis at this scale, it depends on the resolution. What we want to find here is something that is not colocalized, because we have one protein at the center and one at the periphery, but
with this kind of resolution it is very difficult to get a good estimation of this non-colocalization. Here you can see a result in 3D on one of the nuclear pore complexes; I am showing it in 2D because in 3D the Voronoi diagram is quite difficult to understand, but you can see that we have dense regions here, here and here. In the end we really managed to see a very low colocalization for the nuclear pore complex, so we were really happy: we really managed to say that the two proteins are not colocalized, even on these very challenging datasets. We also used the method to quantify data that had already been quantified in 2014 by some of our collaborators, but using reconstructed super-resolution images. You have the PSD, which is a post-synaptic protein, and then proteins that are part of either the nucleation or the elongation machinery in the dendrite. At nucleation you are very close to the PSD, which is the case of Abi, so you have a very high colocalization; VASP, for instance, works at elongation, so it is far from the PSD and you would not expect a high colocalization. We were very happy because we had results very similar to theirs, but at the time they used reconstructed super-resolution images, with workflows that were pretty complex and not easy to generalize, whereas in our case we just applied a threshold of one times the average density to all the cells. Basically, I click on a button and I have the result, so I was pretty happy with that. To finish, the software I spoke about is available. SR-Tesseler, for the segmentation, is available on my GitHub: you have a one-click Windows installer, and the source code is available. For Coloc-Tesseler, which I just presented, you also have a one-click Windows installer on my GitHub; the source code is still pending because it is quite a mess, so I need to clean
it, and I decided to merge the two software packages into one, so it is taking a little more time than expected, but in the end everything will be on the same platform, which will make it easier to have everything in 2D and 3D directly in the same place. I want to thank, obviously, all the people of my team, particularly Jean-Baptiste, my boss, and Rémi and Corey, who helped with the acquisition and the localization analysis of the data; all the collaborators at the laboratory, Olivier Rossier, Grégory Giannone, Eric Hosy, Daniel Choquet; the collaborators at EMBL, Jonas Ries and Philipp Hoess; and all the funding agencies. Again, here is the information about SR-Tesseler and Coloc-Tesseler: because they are on GitHub, you have the possibility to raise issues, and we are also part of the image.sc forum, so you have the tesseler tag if you have some questions, and I can answer there. And now we do a very fast demo of Coloc-Tesseler. When you want to download it, as I said, you go to my GitHub account, where you will find Coloc-Tesseler, the NEUBIAS Academy slides, and SR-Tesseler. If you go to Coloc-Tesseler, you have a release here; for now the repository is empty because there is no source code, but you have the release, so you can go here, download it, and download the manual. Then you can install it, and when you have installed it, you have a zip with a few datasets, so you can unzip it and have some data to test. When you want to open data, you can either open a whole directory directly or one localization file. I will open a directory: I go to the dataset and take, for instance, the simulation in 3D. It opens everything, and here, because it is Coloc-Tesseler, you cannot do segmentation and that kind of thing; you just have the colocalization analysis. So you click here, and you have the possibility to create just one colocalization dataset or to run on all the open data; in this
case it will use the name of the data set in order to do the corresponding between the two color and if you do that it will create the Voronoi diagram and do the computation so here you can see that the Voronoi diagram is very beautiful and it's because it's in 3D and as of right now we for colloquial cellar we don't have a 3D viewer only 2D so that because of that that it seems really bizarre but if you go to the colloquialization here you will have the scatter plot that I show you have the possibility to have some display so the original localization the classification or the Voronoi diagram and you can change the threshold when you have LRE you can compute the value on the LRE and then you can compute the coefficient and when you click you can see that everything is updated with your new value and here you can close everything if you want on one click and then go back to for instance the 2D data set open everything very basic simulation do again here you can see that it's a 2D data set so now the Voronoi diagram seems to do something that we can understand and again you have everything that is updated depending on the data and that's all I will finish with that so if you have any question there is few questions but not so much in the Q&A so it seems that you are very clear did you listen to me Florian? yeah yeah which technique did you use for the 3D so which modality by plane or astigmatism? in our case for our data it was astigmatism but it works for any modality yeah yeah well basically collector seller just taking a localization file so coordinates you already have to do the localization before but if you are using bitplane in the end you will still have a coordinate in X, Y and Z so obviously you can open it in collector seller also the software about the software so the software is freely available on Gitter is 2 software available for Mac? 
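As an aside on the demo above: the Voronoi-based idea — one diagram per channel, a local density per localization from its cell area, then a density scatterplot — can be sketched roughly as below. This is an illustration only, not the Coloc-Tesseler implementation; the nearest-neighbour lookup for the cross-channel density and all names and numbers are my own simplifying assumptions.

```python
import numpy as np
from scipy.spatial import Voronoi, ConvexHull, cKDTree

def voronoi_densities(points):
    """First-rank local density = 1 / (area of each Voronoi cell)."""
    vor = Voronoi(points)
    dens = np.full(len(points), np.nan)
    for i, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if -1 in region or len(region) == 0:
            continue  # open cell at the field-of-view border: skip
        # Voronoi cells are convex; in 2D, ConvexHull.volume is the area
        dens[i] = 1.0 / ConvexHull(vor.vertices[region]).volume
    return dens

rng = np.random.default_rng(0)
ch1 = rng.uniform(0, 10, size=(500, 2))          # channel 1 localizations
ch2 = ch1 + rng.normal(0, 0.05, size=ch1.shape)  # channel 2, coupled by construction

d1 = voronoi_densities(ch1)
d2 = voronoi_densities(ch2)
# density "seen" by each ch1 point in the ch2 diagram: take the density of
# the nearest ch2 localization (a crude stand-in for a proper cell lookup)
nearest = cKDTree(ch2).query(ch1)[1]
d2_at_ch1 = d2[nearest]

ok = ~np.isnan(d1) & ~np.isnan(d2_at_ch1)
rho = np.corrcoef(np.log(d1[ok]), np.log(d2_at_ch1[ok]))[0, 1]
print(f"density correlation: {rho:.2f}")
```

With the two channels coupled as above, the log-density scatterplot is strongly correlated; for independent channels the correlation would drop towards zero, which is the intuition behind thresholding the density scatterplot in the demo.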
No, for Mac — I don't have a Mac, so I never tried to do the compilation. For now, for SR-Tesseler, the source code is available, so anyone who has a Mac can try to compile it; I think I put a CMakeLists file, so CMake can help with the compilation. I would expect not to have used Windows-specific functions — maybe still a couple of them, but I really tried not to have anything Windows-specific in the code. But no, I don't have a Mac, so I never did the compilation. For Coloc-Tesseler the source code is not available yet; when I release the complete platform, everything will be included, again with a CMakeLists file to help compilation, but I'm not able to do a Mac build myself.

There is another question: can the Voronoi analysis be applied to the segmentation of spots — like a local-maximum segmentation, this kind of thing for spots — or is it for localization data? Well, I would say: why not, but if you have only one spot, no. Voronoi performs better when you have some data; for sparse data it's not the best solution, for sure. If your spot is several localizations, then yes — it's basically going back to cluster analysis, so I would say yes. It depends: if it's one localization, I would say no.

Okay, a last question: how large can a dataset be — what is the maximum number of localizations that you can manage? For this version, I would say maybe a couple of million; even in 3D it will be long, but I think it should work. For the next release, when I put everything in, it will be several million in 3D.

Another question appears: are there plans for an ImageJ or Fiji plugin? At some point I wanted to try to do that, but I don't have the time. I tried to get some funding for that but never managed to. It's complicated because I'm using C++; it's not developed in Java, it's not the same language, because I wanted it to be very efficient and able to handle millions of localizations even in 3D, and porting it to Java is not so easy — it's not done in one day. If at some point I manage to get some funding for that, I would be very interested in doing it, but I don't have the time myself; it's too much work. Is it specifically difficult for the visualization? The visualization, I think, could be fine — there is OpenGL in Java, so it should be okay. It's more the quantification: SR-Tesseler is several tens of thousands of lines of code, so it's a lot of work to port it to another language.

Another practical question: a participant has two lists of coordinates — is it possible to upload the two channels separately in the software? What Coloc-Tesseler does now is open one localization file per staining, and then you click on a button to link the two and create the colocalization object. There is no possibility to have the two stainings in the same file; he would have to merge them in some way. But my understanding is that he has one file per staining, so that's perfect.

Okay, thanks Marion — I think that's it. There was one last question about the format the software reads as input and, more generally, whether we can find some documentation. There is a manual, but maybe in the manual of Coloc-Tesseler I don't talk about the format — that's possible. Basically, the format for Coloc-Tesseler is the same as the ThunderSTORM CSV, with the same names for the columns. Okay, thank you, that's it; we don't have other questions. Thank you Florian for your talk, and now we move on to Thibault. Thibault, the floor is yours; I will share your slides with the participants.

Okay, so hi, hi everybody. How do I share my screen? I think this is the first time I'm sharing my screen — where is the button to share the screen? The green button at the bottom, in the middle. Oh yeah, thank you. Okay, I hope you can see it — is everything okay, can you see it? Okay, welcome everybody, and thanks
to the organizers of the NEUBIAS Academy — it's a really nice initiative and I hope it will continue for a long time. In this second part I will give some details about the statistics used to analyse the spatial distribution of objects — objects in a quite general sense, including localizations in SMLM, so this is a more general framework. My name is Thibault Lagache; I'm a researcher in the BioImage Analysis unit at the Institut Pasteur, and the head of the unit is Jean-Christophe Olivo-Marin. Everything that I will show you today about SODA, the plugin we developed, is available in Icy, the bioimage analysis platform, which is free, open source and can be downloaded through this address. A general outline: I will first explain where we are going, from a localization problem to a point process and a spatial analysis problem; then there will be a second part specifically dedicated to this spatial analysis with marked point processes; and at the end I will give you an example of an analysis we did with Lydia Danglot, who is also here today, of the spatial organization in three-colour structured illumination microscopy, just to show that you can map the organization of synapses at the whole-neuron level.

Doing colocalization is actually not easy — it's a pipeline. Typically the first step in such a pipeline is at least to decrease the noise and sometimes to detect the objects, or the localizations of your objects, and then, based on this, there are a few standard approaches, like the correlation of signals, the overlap of the masks that you detect, or distance-based approaches. For those of you who watched the webinar of Fabrice Cordelières last week: he talked about first-order statistics, where you look at whether or not the centres of some objects are inside the masks of the others, and more generally there are a lot of distance-based approaches.
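A minimal sketch of such a distance-based test: compare the mean nearest-neighbour distance between two point sets against a Monte Carlo null where one channel is re-randomised inside the field of view to get an empirical p-value. All coordinates, sizes and thresholds below are invented for illustration; this is not the SODA statistic, which is described later in the talk.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
W = 10.0                                           # field-of-view width (arbitrary units)
red = rng.uniform(0, W, size=(300, 2))             # channel 1 localizations
green = red[:150] + rng.normal(0, 0.05, (150, 2))  # channel 2: coupled by construction

def mean_nn(a, b):
    """Mean distance from each point of a to its nearest neighbour in b."""
    return cKDTree(b).query(a)[0].mean()

observed = mean_nn(green, red)
# Monte Carlo null: re-draw the green channel uniformly at random
null = np.array([mean_nn(rng.uniform(0, W, green.shape), red)
                 for _ in range(200)])
# empirical p-value: how often the randomised data is at least as close
p = (np.sum(null <= observed) + 1) / (len(null) + 1)
print(f"observed = {observed:.3f}, null mean = {null.mean():.3f}, p = {p:.3f}")
```

As the speaker notes next, the hard part is exactly this randomisation step: the shuffle must respect the context (cell mask, local density), otherwise the p-value is meaningless.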
In this approach you look at the distances between neighbours, quite generally, and then the most difficult part is the interpretation — giving a statistical significance to your results — because you can more or less always observe an overlap or a correlation between signals, but does it mean something? I think this is the most difficult question to answer, and most of the time what you do is shuffle your signal — let's say randomise your localizations — to extract a p-value, some statistics, but that's not really easy. So this was just a general overview of a colocalization pipeline.

Many techniques have been developed, and as you saw in Florian's part, all these approaches are being questioned by the introduction of super-resolution microscopy, because now we have direct access to the localizations of the objects. The yellow that appeared before disappears: we have an increased resolution for the red and the green objects, and there is not much yellow left in your images. This sensitivity of many methods to the microscope PSF is really a problem, and it explains in a way the development of object-based techniques, where you localise your objects — or your localizations directly in SMLM — and then work directly on those objects, because otherwise, looking at the overlap or the correlation of signals, you can obtain completely different results depending on the microscope you are using. The second issue when developing a colocalization analysis — and it is becoming more and more important, again with the increased resolution — is the contextual interpretation of what you observe. Typically, two or more objects can be close, and sometimes this has no significance at all; it is really significant only if the surrounding space is empty. This is something Florian showed before: depending on the context and the background of your field of view, proximity can mean a lot or nothing. It's the same as someone standing really close to you: in an empty space it's much more frightening than in the subway, typically.

All this small introduction was to justify the use of a very general mathematical framework, marked point processes, to analyse the spatial relations between objects with some robustness. We introduced the term "coupling" to look at the distances between objects more generally, because objects can overlap, but with increased resolution many objects do not overlap completely, or at all — and still there is a spatial relation between them. This spatial relation can be a direct interaction, the co-presence of molecules in a larger complex or organelle, or, for example, the synaptic apposition. What is common to all these situations is a significant spatial relation at some distance between different objects, not necessarily an overlap. So this is something more general, and it makes a lot of sense for super-resolution microscopy.

Then you can go from pixels to objects. In many cases, when you image molecules, these molecules are sub-diffractive, meaning their size is below the resolution of the microscope, so what you observe is the PSF of the microscope — a small object plus a big halo from the PSF. Typically you have a lot of objects that you can segment with different techniques, like a simple intensity threshold or something more robust like a wavelet analysis or a convolution with a Laplacian of Gaussian, and at the end you have segmented objects with a given shape, a given colour and a given position. From there you can represent your field of view as a collection of objects with a position — a point — and a mark: typically, each image can be seen as a marked point process representation. A marked point process is just a stochastic process, meaning that what you observe is the realization of some stochastic process, and what you want to know is, for example, whether the green marked point process and the red marked point process are coupled at some distance. I forgot to say that the localizations in super-resolution microscopy, in SMLM, are per se marked point processes, because typically you have a colour and a position. This is an example of some synaptic molecules imaged in three colours by Lydia Danglot in Paris.

So now, the second part: we have a good representation of the objects — let's say all the positions in our field of view — but what can you extract from this? In this second part you analyse this marked point process statistically to extract reliable information. There will be some equations; if you are not interested in the statistics, you can just skip them. Just before that, a very short review, quite similar to what Florian presented, just to set the ideas. Typically, when you have object localizations, the first idea is to come back to something we know: looking at the correlation or the overlap between molecule clusters, or at least between areas with a high density of molecules. As Florian presented, these dense areas can be segmented with DBSCAN or some Ripley-based functions — here, this is also a Ripley-related function in the article — where you have some dense regions, and then you look at the correlation, the overlap, or the correspondence in a Voronoi diagram. There is a lot more to say about all these techniques, and I refer you to part one. And there is a second strategy, so
my classification is a little bit different from Florian's. In this second strategy there is the coordinate-based colocalization that he described, but more generally I call this type of strategy the single-object approach, and the object can be a localization or even a pixel — there are examples of such techniques from more than 20 years ago in confocal microscopy. The idea is to look at each pixel or each localization and look around it for any co-clustering or co-intensity of the two probes, the two colours. These techniques are interesting; let's say they look at local properties of the signals, and these local properties are more or less the co-clustering or the correlation of intensities between signals. The strategy that we developed was especially useful, at least for us, because we looked at synaptic appositions, and for synaptic appositions the co-clustering or the correlation of signals is in most cases going to fail, especially in super-resolution microscopy, because sometimes you don't have any spots that overlap or correlate. So we developed something a bit more general from the distance point of view, and we decided to statistically analyse the distances directly between objects. If the distance is zero, this is, let's say, the overlap; but more generally you can extract information from the distance. For this you first have to characterize the distances — these are the first equations, in the next three or four slides, where I will go into the details of how to extract statistical information from the positions of the different objects and the distances between them.

First you have to quantify the relations between objects at different distances, and a very common and widely used tool is Ripley's K function — what we use is related to Ripley's K function. The idea is to take, for example, a first object, the green one, and to count, at different concentric radii, the number of molecules that fall into the different annuli — or, in 3D, into the different spherical shells — around the point. So you will get numbers like: I have a mean number of three molecules at a distance between four and five pixels. What can you do with this? This is the main part: you characterize these counts with respect to a null hypothesis. Here the null hypothesis is that everything is random, especially the second population. Under this hypothesis, how many molecules do I expect in each annulus? What is important and interesting is: in which annulus am I observing a significant accumulation of objects, meaning that there is a spatial relation at this specific distance? This is where the statistical analysis of the Ripley function comes in. Quite rapidly: we found that, under the null hypothesis of randomness, the number of objects inside an annulus is asymptotically Gaussian, so we just need to compute its mean and variance. The mean and variance depend on the molecule density, the surface of the field of view, some boundary corrections — for objects close to the boundary we expect fewer neighbours — and the interactions between objects that share some random neighbours. Taking all this into account — this is the slide with some formulas — you can derive the expectation, the mean number of molecules expected in each annulus, which is proportional to the area of the annulus, and the variance. The variance depends on the mean, the boundary corrections, and a big term in blue which accounts for the annulus intersections between the different green points — an interaction term. It's quite complicated, but at the end of the day you have a zero-mean, unit-variance variable: in each annulus, if there is no coupling, the reduced number of molecules just follows a standard Gaussian law.

And then — this is what is interesting here — you have a multi-distance analysis, because you have a Gaussian vector over the different annuli around your green objects, and once you have a Gaussian vector with zero covariance like this, it is really easy to extract a lot of information. For example, you can compute the p-value of the observation: what is the probability that what you observe is due to chance? And the other question: what is the number of really coupled objects and single objects in my image? If you look at the maximum value of this vector, you can very easily compute the p-value — there is a closed formula involving the maximum value and the number of annuli — so you have a closed formula for the probability of observing something by chance. From this you can detect the annuli that are far above the statistical threshold; the threshold is a function of the number of annuli — a multiple-comparison threshold like the ones commonly used in image analysis — and then you can really detect the rings where there is a statistically significant accumulation. When you compare the number of molecules counted in these rings with the number of molecules, spots or objects that would be there just by chance, you obtain a robust estimation of the number of coupled objects at each distance. So let's say that at a given distance I have an object; I know that at this distance I have a given number of coupled objects and a given number of objects that should be there by chance; then I can compute the probability that this specific object is there by chance or not. Typically, if in an annulus you expect, let's say, one object and you observe 100, it means each object has a 99% probability of being there through the coupling. From this you get a lot of information — the mean coupling distance, the mean number of coupled objects at different distances — and it is a very versatile tool, because you can use it with localizations or with segmented spots, at different distances.

So I'm just going to show you what we did with this tool. We analysed the coupling between three-colour spots: Homer in blue and PSD-95 in red, two post-synaptic molecules, and we looked at the apposition with synapsin in green, a marker of presynaptic vesicles. Here is an example of what we can observe in wide-field versus structured illumination, just to remark that the big spots seen in wide-field are clearly refined in structured illumination, and more than that, we can see small spots in structured illumination that could not be seen in wide-field — which is also something interesting here. This is more or less the same information: an example of the PSDs and synapsin in the different modalities. Because we had about 10 images of one neuron, and in each image thousands of spots, we wanted to run the statistical analysis automatically and robustly, and this is where we went from the microscope and the images to the analysis through the Icy software. Icy is really well suited for this type of batch analysis, because you have a lot of tools to automate it: quickly open the files, run the analysis on the image, get the results, and do it again as long as needed. At the end of the night you have, for example, an Excel file with all the results for 100 images — and then you need to work. So this is what we did: we designed a protocol to segment the neurons in the different colours, analyse the coupling between colors one and two, two and three, one
and three, then integrate all this information and get a lot of statistics: how many objects are coupled, at which distance, what are the sizes of the spots that are coupled, etc. I'm just going to run a small video that I recorded on my computer. To do this with Icy, you download the SODA 3-color protocol online — the protocol is already available online, so you just open it and it loads, with a lot of links. A protocol is a series of blocks, each computing an image-analysis step, and the links connect them — for example, when you detect spots, a link passes the spots to the localization analysis; it's really like doing some plumbing. First of all you set the general parameters: which detector settings you apply to detect your spots or your mask. Then there are three SODA blocks to detect the coupling between colors one–two, two–three and one–three, and the remaining blocks extract which spots are single, which spots are coupled only with two and not with three, and export all the results as Excel files and images. At the end — for example, for a whole neuron with 10,000 spots, I think it takes a few minutes, maybe three minutes for the 10 images — you have a lot of Excel files. Here is an Excel file with all the coupling data at the different distances; each line of the Excel file is the result of one analysis, and then you can combine the information over the whole analysis. You also have images that you can just drag and drop into the Icy software to observe the masks, the regions of interest — for single molecules, meaning for example two without one, without three — or, here, I am going to highlight one-with-two-with-three: these are the Homer spots that are with PSD-95 and with synapsin, the synapses positive for all three markers, and the same for the synapsin spots that are in trios. I am just going to show you where the synapses with three molecules are; besides that we have, for example, synapses with only two spots, PSD-95 and Homer, some with only Homer, and synapsin alone. At the end we have a lot of information, and then you can make a lot of graphs.

This is just a summary: we observed that typically the spots that were alone were smaller; those that were just in a couple were a little bigger, but still smaller than the spots that were in a trio. We had the histogram of the distances — remember that we have the coupling information in each annulus, so we have the coupling distance and how it is distributed — and, for example, we observed that the distance between PSD-95 and Homer, in red and blue, is much lower than the distance of PSD-95 or Homer from synapsin, because synapsin is on the other side of the synapse. What is really important is that we can see the arrangement of the synapse, with synapsin facing the PSD-95: PSD-95 is really at the postsynaptic density and Homer is slightly behind, slightly to the side of the synapse. This is something that had been observed only in very few examples with electron microscopy. And the next slide shows that what I presented with structured illumination and spots is also completely applicable to single localizations. You have to segment a region of interest — we segment the presynaptic bouton by looking at the clusters, actually using DBSCAN — and inside the presynaptic clusters we look at the coupling between single-molecule localizations at different distances, and we extract the molecules that are single or coupled. Here, between two presynaptic markers, what we observed is that they were indeed coupled, with a mean distance of something like 16 nanometres — more or less the size of a synaptic molecule, about 14 nanometres — and a
stoichiometry that was in line with what has been observed with mass spectrometry. So I'm already at the conclusion. As I showed you, there are a lot of standard tools, but most of them are not really efficient with increased resolution, nor for localization-based microscopy, and more than that, it's really hard to interpret your results: what is the probability that what I am observing is due to chance, and can I map which of my localizations or objects are single or coupled? This is why we developed this statistical analysis at different distances and implemented it in Icy, to really get statistical information at different distances between different objects; it works in wide-field, confocal, SIM or localization microscopy, in 2D and 3D. I acknowledge the Institut Pasteur and the other agencies financing the lab and this research, in particular the Centre de Psychiatrie et Neurosciences — Lydia's centre, which really helped, especially through their great imaging facility. Everything is available in Icy; Icy has strong support from France BioImaging; and I really thank NEUBIAS for the invitation and for what they do, because they really build a community around bioimage analysis, and that's really nice. From here I would like to thank you, and I'm going to stop talking and listen to questions if there are some.

Thank you. We have some questions; some of them have been answered online. The first question was: do you require a specific file format, and can you use any CSV file to load your localizations? There is no fixed format: typically you upload an Excel file, and in Icy there are two types of protocols — a protocol for images, where you detect spots, etc., and a protocol really dedicated to localization microscopy. In the latter there is a first series of blocks — everything is explained online in the documentation — where you choose the number of the column that contains the localizations in X, the one in Y, and the one in Z if you are in 3D. You even have a column where you specify the number of photons, and you can specify that you only want to use the localizations with more than a given number of photons. So this series of blocks really specifies the columns of your Excel file for the localizations.

Thibault, can you just give the name of the protocol you're talking about? It is the STORM 3D SODA, and there is a STORM 2D SODA. In the STORM protocols you have to specify the columns, and — I went over this rapidly — you also specify how you build your clusters: in Icy what we have is DBSCAN, so you have to specify some DBSCAN parameters to segment the clusters, because SODA will look inside the clusters, at the single localizations inside the segmented clusters. So the approach is completely different from the other approaches in localization microscopy, where more or less you look at whether the clusters interact, overlap or correlate: here you segment the clusters and directly look inside, at the molecular level — the single-molecule level.

There is one more question: for the formation of the annuli around non-circular objects, does it match the contours of the objects, or does it always assume circular shells? What I presented here today was computed from the centre of mass, so everything was circular, but we are now generalizing the approach to consider the shape of the object and its contour; the annulus is then no longer an annulus, but an isoline that follows the contour of the object and expands from it. This can be found in a recent publication — in IEEE Signal Processing Letters, in 2020 — and the
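As both speakers noted, the input in practice is a table of localizations — a ThunderSTORM-style CSV for Coloc-Tesseler, or a CSV/Excel file with user-specified X/Y/Z (and photon) columns for the Icy SODA protocols. A minimal sketch of reading such a file and applying a photon-count filter is shown below; the column names follow the ThunderSTORM convention but should be checked against your own export, and the cutoff value is purely illustrative.

```python
import csv, io

# minimal in-memory example standing in for a real ThunderSTORM export
sample = """id,x [nm],y [nm],z [nm],intensity [photon]
1,100.0,200.0,-50.0,1200
2,105.5,198.2,-48.0,300
3,410.0,90.0,10.0,2500
"""

def load_localizations(text, min_photons=500):
    """Return (x, y, z) tuples for localizations above a photon cutoff."""
    locs = []
    for row in csv.DictReader(io.StringIO(text)):
        if float(row["intensity [photon]"]) < min_photons:
            continue  # discard dim, poorly localised emitters
        locs.append((float(row["x [nm]"]),
                     float(row["y [nm]"]),
                     float(row["z [nm]"])))
    return locs

locs = load_localizations(sample)
print(len(locs), "localizations kept")
```

The photon filter mirrors the block Thibault describes in the STORM SODA protocols: localization precision scales with photon count, so dim emitters are usually discarded before any spatial statistics.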