My name is Ignacio Arganda-Carreras, and right now I'm a postdoc at the Modeling and Digital Imaging lab at the Institut Jean-Pierre Bourgin in Versailles, France. But today I will be talking mostly about the work in my previous postdoc at the Massachusetts Institute of Technology, where I was learning how to look at the brain from the inside, in particular given my background in computer science and electrical engineering. I will focus on the methods and algorithms that I developed, especially to study brain connectomics; they are all microscopy and image processing methods.

So I will give a short introduction on what connectomics is, and especially on the approach that we used at MIT with our collaborators at Harvard University using an electron microscope; in particular I will describe the image acquisition and image processing methods that we use. And then, not only because it is my field of expertise, but also because I think it is important for you whenever you use any kind of image processing tool or face any kind of medical or biological image problem, I will describe some methods for image registration and image segmentation. This should take about 55 minutes to an hour, and then, if you read your emails, I asked you to install a small piece of software so we can play around with some of the tools I will have been talking about, and we don't need to feel sleepy after one hour. Okay? So let's start.

First of all, what is a connectome? Maybe you have heard the term, maybe you haven't. If you haven't, as was my case when I started my postdoc, you run to Wikipedia and try to find a comprehensible definition of it. And there you actually find one, a commonly accepted definition that says that a connectome is a comprehensive map of neural connections in the brain. It's quite generic, right? But if you go into the detail, you see that it may range in scale from a very detailed map of all the connections of an entire brain, or just a part of a nervous system, to a more macro-scale description of the functional or structural connectivity between areas of the brain. In general, just keep in mind that connectomics, or the connectome, refers to all these scientific efforts to capture, to map, how the nervous system is organized, how the neural interactions in the brain are organized. But don't be surprised if, reading some papers, you get confused between terms.

Actually, it is useful to define two types of connectomics based on scale. One was first described by Olaf Sporns and collaborators, who say that connectomics studies connections between brain regions through the white matter. They estimate these using diffusion tensor imaging, which provides those beautiful, colorful images that unveil the structure of the big highways of nerves in the brain, right? And then they associate this structure with function using functional MRI; as Maria said, we can see which areas of the brain get activated at which time, and then associate that with these structures. Of course, this is a macro-scale view of it; you will hear more about it tomorrow. We could maybe call this macro-connectomics, given the size and the scale, or maybe projectomics.
But today I'm going to refer to the other type of connectomics studies, the one defined by Jeff Lichtman at Harvard University and my former PI at MIT, Sebastian Seung, who say that connectomics studies actual synaptic connections between individual neurons. For that, we use high-resolution stacks of electron microscopy images. By high resolution I mean nanoscopic resolution: we actually see organelles, we see synapses between individual neurons. For that reason, we can maybe call this micro- or even nano-connectomics, or even synaptomics, because, as I said, we're going to be able to visualize the synapses between all the neurons.

Yes? Is there any agreed meaning of this suffix, -omics? I mean, you also have economics. I think this came out as a fancy name to be able to sell it better, same as with the genome: they said, okay, let's make the connectome. There are actually some papers making fun of it, yeah.

Here, when you mention these two different kinds of connectomics, are they nowadays already high throughput, or is this micro-connectomics still very low throughput compared to the other? Let me show you what we get, and then maybe that answers your question. There are just a few labs working on it, but we have already built quite a robust pipeline. I don't know if I would call it high throughput, but quite close.

So, of course, you could ask me: why are we interested in synaptic connections in the first place? Well, because we know that the function of a neuron is largely determined by the inputs of that neuron. And the function of a network of neurons is mostly determined by the inputs to that network, but also by the intrinsic connections between the neurons in the network. So we expect that knowing the rules of connectivity in the real, physical neural network will let us constrain our model, our functional model, our model in the computer, in the circuit. This is actually how many people started working on this type of connectomics: they had been trying for years to model some specific neural networks, but they always got stuck at the point where they didn't know the real connections, so they couldn't model them properly.

Okay, so you could also ask me: if synaptic connectivity is so important, why hasn't it been studied before, or in more detail? Well, first, because methodologically it's very difficult. Light microscopy was, until very recently, too coarse for dense reconstructions of the neuropil. By dense reconstruction I mean you have a whole block of tissue full of neurons and you want to have labels for each of them, not just single labeled neurons; you want to have everything, let's say, in a different color. Now we're approaching that possibility with some techniques, Brainbow for example, but we're still far from the resolution that we can get using, for example, electron microscopy.

And then, yes? When you said that light microscopy was too coarse for dense reconstruction of neuropil, would you say that, for example, nowadays a lab that already does individual patching could actually use their light microscope to do this kind of thing? Yeah, there are some very new microscopes, especially two-photon microscopes, that can actually work with Brainbow; they can give you very high resolution multicolor images. And I've even seen that in vivo with some fancy confocal microscopes.
And, okay, so we could also do some staining and electrode recordings, but those could only sample very small sets of neurons, so we would never get to all the connections in that block of tissue that I'm talking about, right? And we had serial-section electron microscopy with enough resolution to do this, but the problem is that it produces an incredibly large amount of data.

And what is an incredibly large amount of data? Well, let me show you some numbers. Let's look at the brain volumes of the typical model species that we study in neuroscience, and keep in mind that we have an electron microscope that gives us 10 nanometer isotropic resolution. That means 10 nanometers per pixel in X and Y, but also in the Z direction. That would mean that storing just one cubic micron of that volume on our hard drive would need one megabyte, okay?

So let's see the numbers. If, for example, we look at C. elegans, this millimetric worm, which was actually the first animal whose connectome was completely reconstructed, it has 302 neurons. Storing the whole worm at that resolution would mean about one terabyte of hard drive, which right now doesn't sound that bad: we can go to the store and buy an external hard drive of one or two terabytes, and you find them in personal computers and laptops. But keep in mind that this is just to store the data; I'm not talking about any processing yet. You just image the data at this resolution, as serial sections for example (I'll show you the type of microscopy later), so every voxel represents 10 nanometers of the tissue; I'll show you some images later so you actually see it right away.

Now imagine that we work with Drosophila, for example, with a brain of around 0.5 cubic millimeters and an estimated 80,000 neurons: storing that whole volume as images at this resolution needs on the order of 125 terabytes of hard drive. Similar numbers for the zebrafish larva. And if we go for larger brains, such as the mouse, and of course we're interested in the mouse, with roughly 450 cubic millimeters of volume and an estimated 75 million neurons, this becomes completely prohibitive: about 450,000 terabytes of hard drive. Even if we just want to store one cortical column, that would be about 1,000 terabytes. And of course we can get dreamy and look at the numbers for the human brain: with its 1.3 liters and an estimated 100 billion neurons, that becomes the crazy number of 1,300 million terabytes, 1.3 zettabytes.

So of course, when most neuroscientists see these numbers, they say: okay, you should just give up. If you cannot even store your data on a proper, affordable hard drive, why would you even care about processing it, about extracting information from it? And somehow they're right, because these numbers look very prohibitive. But somehow they're wrong, because, as Wikipedia told us, this is not only about analyzing the entire brain: we can also focus on parts of the brain, on specific neural networks. And also because we know that technology is on our side. Look, for example, at the cost of data storage, in dollars per gigabyte, over the past 30 years: you see how, very consistently, the price has been reduced by a factor of 10 roughly every four years.
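Before moving on, the storage arithmetic above is easy to reproduce. A back-of-the-envelope sketch in Python; the volumes are the approximate values implied by the storage figures quoted in the talk, so treat them as illustrative rather than authoritative:

```python
# Back-of-the-envelope storage estimates: 10 nm isotropic voxels, 1 byte per voxel,
# so 1 cubic micron = (1000/10)^3 = 1e6 voxels = 1 MB, as stated above.
VOXEL_NM = 10.0

def storage_tb(volume_mm3):
    voxels = volume_mm3 * 1e18 / VOXEL_NM**3   # 1 mm^3 = 1e18 nm^3
    return voxels / 1e12                        # 1 byte per voxel, reported in TB

# Approximate volumes consistent with the figures in the talk (illustrative only):
for name, vol in [("C. elegans, whole worm", 1e-3),
                  ("Drosophila brain", 0.125),   # volume implied by the 125 TB figure
                  ("mouse cortical column", 1.0),
                  ("mouse brain", 450.0),
                  ("human brain", 1.3e6)]:       # 1.3 liters
    print(f"{name}: {storage_tb(vol):,.0f} TB")
```

Running it gives 1 TB for the worm, 125 TB for the fly, 1,000 TB for a cortical column, 450,000 TB for the mouse, and 1.3 billion TB (1.3 ZB) for the human brain, matching the numbers above.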
So if we extend this line into the future, we can see how in just five or six years, if the trend holds, and I hope so, we would get an affordable hard drive that makes storing the whole mouse brain feasible. In about 20 years that would be the case for the whole human brain; I'm talking about a hard drive around 50K. So we're reaching the point where data storage is becoming not a problem. This is just to show you how technology moves in our direction.

But for you to get a better idea of the resolution I'm talking about... I don't know if we can change the contrast of the screen or something; this is not as dark as it looks here. Let me show you this video. This is Bobby Kasthuri, a postdoc in Jeff Lichtman's lab, our collaborators' lab at Harvard University. He's posing in front of our electron microscope, and in his hand he's holding a wafer with, you don't see much, but about 90 to 100 serial sections of mouse cortex, which is our main dataset here.

So if we zoom in, hopefully you'll start to see what I'm talking about. We have some stripes here with all the sections. You should be able to see some squares now; those are the sections. And you start to see the tissue. You see some circular structure here: this is the hippocampus. Those are some layers of the cortex. Now you see some blood vessels; you start to differentiate cell bodies and some dendrites running from left to right, some axons; the myelinated axons you see with these thick membranes. And then, when we get to the highest resolution, we start to see all these organelles that we studied in our science classes. We see mitochondria; you probably have to believe me on this, but these are mitochondria. You see membranes, double membranes, you see vesicles. And actually, if you believe me and you look at this central area here, we see a synapse. This is the end of an axon, full of vesicles; this is the synaptic space; and this is the other part of the synapse, the dendrite that receives it.

Maybe a stupid question, but maybe not. Whenever I look at these images (I'm not a biologist) I think, well, this looks like stone-age drawings. Now, from a scientific point of view, you tell us this is a synapse, but can you actually validate that it's a synapse? How do you actually know? This is a very good question. You start by asking your professor... okay, that doesn't count. So is there a scientific method to find out that this is really a synapse? Well, yes, actually I asked my professor how I know that this is a synapse. Even before we had this kind of staining (I'll show you later, there is a stain that only shows membranes and not the inside of the cells, so no organelles, and it's not as clear as here because you don't see all the vesicles with the neurotransmitter on one side and the dendrite on the other side), people started to kind of infer the synapses based on the shape of the neurons, from some morphological features. That wasn't very good, if you ask my opinion. But here we have even more features: we get the shape plus all of these vesicles on one side, and this area, you don't see it very well here, but it gets darker. And sometimes you can even put a label on the synapses, so you can make sure that this is an actual synapse. There are some people who actually only label synapses in this kind of dataset.
Isn't there also this separate line of work where they have gold particles attached to particular antibodies, which attach to certain proteins? So you have a separate way of doing anatomy, which overall could add to this problem. You mean to label this kind of thing? Yeah. Yes, this is actually a line of research where they're trying to play with different labels to, for example, get just synapses and membranes. That would be awesome for my application.

It's always important to remember that this is the morphological correlate of what we say is a synapse. It has a very defined signature, and there's a set of criteria by which you would identify it. The synapse itself, and synaptic transmission, is a functional concept. So whether these two things are exactly co-located is, again, an open question in all cases, because I suspect there are all kinds of transmission captured in a functional signal going from one place to the next that may not occur exactly at that location. But, as always in biology, there's a weight of evidence that accumulates over the years that this is the place.

Yeah, so I understand this one way: you have a set of criteria, you go into your image and then you see it. But now the question is, can you go back? Can I take this very image and somehow test my hypothesis, maybe in cases where it's not as clear as this one? Can you do the validation, the testing of your hypothesis that this might be a synapse? It fulfills all your criteria, but maybe there's criterion n+1, which you haven't discovered yet. Okay, so the question was: when you have a set of criteria, and you go into this image and tick them all off and say, yes, this is a synapse, but maybe there is something new that you haven't seen before, is there a way to go back and actually cross-check? Yes, there are some people working on this kind of thing. For example, they first image the same dataset with light microscopy, labeling just the synapses or some specific neurons, and then, after imaging at this resolution with EM, they try to match the LM with the EM and see if they actually see what they think they're seeing.

I just want to extend his question about neurotransmitters. Can we take any image or any quantification of the vesicles, of the vesicle release patterns, or anything? Can we get any data about this? How to quantify the vesicles, or how to take pictures of the vesicle release patterns? I'm not sure I understood; are you asking if we can label any of the organelles? Yeah, in terms of scaling, say the number of neurotransmitters and the vesicle release profile. Okay, so at this resolution, and I didn't mention this: this is our highest resolution, per pixel we have six nanometers in X and Y, so it depends on the size of what you're trying to see, of course. And in the Z direction, because this is an actual cut of the tissue, we have 30 nanometers. So this is not isotropic data: you could maybe see the vesicles in X and Y, but you could lose some of them in the Z direction using this technique; there are also some others.

So, as I said, we're not only interested in the 2D part; we want to look at the whole block of tissue, and for that we need to reconstruct it in 3D.
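Because the voxels just described are 6 nm in X/Y but 30 nm in Z, a first step toward 3D reconstruction is resampling the stack to (roughly) isotropic voxels. A minimal sketch with scipy; the stack here is a random stand-in, and linear interpolation is just one reasonable choice:

```python
import numpy as np
from scipy.ndimage import zoom

# Toy stand-in for an EM stack: axes (Z, Y, X), 30 nm sections, 6 nm pixels in X/Y.
stack = np.random.rand(50, 256, 256).astype(np.float32)

z_nm, xy_nm = 30.0, 6.0
# Linearly interpolate along Z so the voxels become (roughly) isotropic at 6 nm.
isotropic = zoom(stack, (z_nm / xy_nm, 1.0, 1.0), order=1)
print(stack.shape, "->", isotropic.shape)   # (50, 256, 256) -> (250, 256, 256)
```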
And as I was mentioning, the data is actually anisotropic, so we have to interpolate, but you see how, hopefully, the whole structures get recovered. Then we can go back to 2D and start to identify single neurites. We can, for example, color this red neurite here and iteratively unveil its 3D structure inside the block of tissue, and then, for example, reconstruct any other neurite that contacts it. In this case, we have this green neurite that touches the red one in two places. And the good thing about this technique is that we can then go back to 2D and check, as I did, whether the touch was just a random one or whether, as is the case here, we see the same features that tell us: okay, this is a synapse. We have the axon that drives the synapse, and the synaptic space.

And as I said, we want to do dense reconstructions. We are not just interested in labeling beautiful single neurons; we need to fill all the gaps in the block of tissue. We basically need to color everything that is passing through, everything that connects to anything. Once we have that, we go to all the touches, we check which ones are synapses, and then we build our connectomics information, our connectivity matrix if you want. If you see some gaps, it's because we didn't label the glial cells, since they don't connect to anything.

Okay, so how do we do this? Let me show you the whole pipeline that we developed. When you reconstruct one of these cells, when you do one of these reconstructions of the whole block, do you actually do several blocks and then try to also match those? Yes. Yes, because there were some limitations in the software we were using at the time, and then we had to match all of those blocks together, side by side.

So let me show you the pipeline we built in our project at MIT and Harvard. First there is the sample preparation. As I said, we worked with mouse. We extract the brain and first cut it into thick sections with a vibratome, around 100 to 200 micron sections. Then we stain them with heavy metals, which gives them this dark color, even darker on this screen. Finally they get dehydrated and embedded into an epoxy resin. This whole process takes about one week of work, and after that we get this: a thick section embedded into this yellowish resin that looks like the Jurassic Park insects, right?

Here is where you have different options for the imaging. In general there are three big options, and they all have their advantages and disadvantages; it mostly depends on the field of view that you need for your experiment, the resolution, and how deep you want to go into the tissue. For example, you can use serial block-face EM, which was first developed at the Max Planck Institute in Heidelberg. In this case the tissue block sits here and gets cut and imaged at the same time, which gives you very well-aligned data, because it's just cut-and-image. But it's not isotropic, because the resolution in the Z direction depends on how thin you can cut, of course, and there are also some limitations in the field of view that you can use here. Then you can use focused ion beam SEM (FIB-SEM), which literally mills the tissue away with an ion beam while you image, but gives you the highest resolution that I've seen: you can get isotropic resolution, down to about one nanometer per voxel in X, Y and Z. This is where you can see vesicles even in the Z direction.
But there are some limitations, of course: the data is aligned and at very high resolution, but you have strong limitations in the field of view and also in how deep you can go into the tissue. And finally there's the third option, called ATUM, which stands for automated tape-collecting ultramicrotome. This was developed in Jeff Lichtman's lab, and it's an ultramicrotome coupled with a tape-collecting machine. Here what we do is cut the tissue, set it on this tape, and then we can image as large a field of view as we want. There's still the limitation of the thickness, how thin we can cut, and the tissue gets misaligned when you cut it and put it on the tape; but because we were interested in large fields of view, we used this technique.

Let me show you how. The first thing you do is trim your resin block and fit it onto this metallic piece that goes into the ultramicrotome, which is, by the way, made by Leica. Then you just place the tape-collecting machine in front of it: there's a diamond knife here, and it smoothly cuts the tissue, which goes onto the tape. This is our technician setting up the tape in the tape-collecting machine; it takes a few minutes, and then you can leave it running overnight. You see how the block of tissue here gets iteratively cut and set on this tape, so at the end of the day you have a very long tape, like one of the old film reels, where each frame is one section of your tissue block. After this, of course, we just cut the tape into pieces, put them on a wafer, the same kind of wafer that Bobby was holding in his hand in the first video I showed you, and then image it on the electron microscope you have at hand. In our case we have specific software that we made in-house, which automatically recognizes all the sections and images roughly the same area in each one.

But as I told you at the beginning, the alignment is lost in the process. You see that even at this low resolution it's almost impossible, not just not to get dizzy, but to follow any of these processes in 3D. So the first thing you have to do, from a computer science point of view, is the image alignment and stitching. Stitching means that you took different snapshots of your section and you have to recompose them, stitch them, to have a single image per section; and then you align between the sections. And this is not an easy task, believe me, because, well, this is not super common, but you can get tissue burnt or torn, you can get noise, dirt, some artifacts introduced by the cutting knife, images with overexposure, etc. So the methods you develop afterwards need to be able to deal with this kind of images.

Just a quick question: do you use techniques from the geographical area, like remote sensing imaging? They probably do the same type of registration. Is there any connection at all? Do people use the same methods for image registration, or is it totally different in your field? I'm not connected to that field, but image registration in general is applied to many different types of images. I actually developed some algorithms myself, and I get people sending me emails to register all kinds of images, even frames from cinema movies and such. The images can be different, but the techniques remain the same: you just need to identify common points on consecutive sections and then use them to do the registration.
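The point-based idea can be made concrete in a few lines: given matched points on two consecutive sections, fit the rotation and translation that best align them. This is a generic least-squares (Kabsch/Procrustes) sketch of the technique, not the lab's actual code:

```python
import numpy as np

def fit_rigid(src, dst):
    """Least-squares rigid fit (rotation R, translation t) with R @ src_i + t ~ dst_i.
    src, dst: (N, 2) arrays of matched point coordinates on consecutive sections."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)   # Kabsch algorithm
    R = (U @ Vt).T
    if np.linalg.det(R) < 0:                    # forbid reflections
        Vt[-1] *= -1
        R = (U @ Vt).T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy usage: points rotated by 10 degrees and shifted should be recovered.
theta = np.deg2rad(10)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
src = np.random.rand(20, 2) * 100
dst = src @ R_true.T + np.array([5.0, -3.0])
R, t = fit_rigid(src, dst)   # R ~ R_true, t ~ (5, -3)
```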
In this case, just so you know what we used: we had a hierarchical approach, from low resolution to high resolution, where we find correspondences by cross-correlation, then compute the relative positions of all the tiles, and finally the absolute positions of all the sections in the whole sequence. And after that, once we have everything aligned, we need to do the labeling: as I told you before, we color each of the neurites in the block of tissue with a different color, giving each one a different ID.

The first thing you can think of is doing it manually, and this is what we did as a proof of concept. This is another postdoc in our lab, Daniel Berger, who spent quite a bit of time doing what we all did as kids: painting between the lines. He loaded a small portion of the block of tissue into a software package called ITK-SNAP, and he went section by section, selecting a different color for each neurite and painting everything in the whole block of tissue. As you can imagine, this is very tedious work; it can give you serious back problems, especially if you don't have the right chair or the right touch screen, and it's really not worth it if you look at the size of the whole dataset. He manually painted these two small datasets; they are images of 1024 by 1024 pixels (that was a limitation of the software we used at the time), one with 100 slices and the other with 256 slices. It took him months to finish the work and be completely sure that he didn't have any errors. Actually, another person did it as well, so we could compare and check whether they were right. He found about 400 objects, 400 neurites or pieces of neurites, in one dataset, and 600 in the other.

This really tells you how much we need to scale this up. We need to go for automatic solutions, for software that can handle data this large, and at the time that wasn't easy. So we had to design our own large-scale, automatic segmentation tools, and have a way of validating our results; for that we went for the so-called citizen-science solution that I'll show you later. And just so you know, we used a two-step approach for the segmentation: we trained a machine learning algorithm called a convolutional neural network, which gives us the probabilities of objects belonging together or not in our block of tissue, and then we used another method called watershed to get a hierarchical labeling of those objects based on a threshold applied to those probabilities.

Now the thing is how to validate that, because we need to choose the threshold that gives us the minimum number of errors. For that, again, we made our own software, called Omni, where a user could load the dataset, see the neurite that he or she selects overlaid on the images, and a 3D reconstruction of the same. So you could go and check every neurite, see if there is a split or a merger in the 3D reconstruction, then change the threshold manually and set it for that specific neurite. We had a few students and a few technicians doing this, especially over the summer, and then we realized that to scale this up to the huge dataset that we had, we needed to explore other possibilities and put this somewhere else.
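As an aside, the second step of that two-step idea, thresholding a boundary-probability map and flooding it with a watershed, can be sketched with off-the-shelf tools. This is not our CNN pipeline; the probability map below is a smoothed random stand-in, and in older scikit-image versions watershed lives in skimage.morphology rather than skimage.segmentation:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

# Stand-in for a classifier output: probability of each pixel being a boundary.
rng = np.random.default_rng(0)
membrane_prob = ndi.gaussian_filter(rng.random((256, 256)), 4)

threshold = 0.5                                        # the parameter the proofreaders tuned
seeds, n_seeds = ndi.label(membrane_prob < threshold)  # low-probability interiors as seeds
labels = watershed(membrane_prob, seeds)               # flood the probability "landscape"

# Sweeping the threshold merges or splits basins, which is what gives
# the hierarchical labeling mentioned above.
```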
And because we knew that this task is actually one that doesn't require specific expertise or much knowledge about the tissue, we decided to go for the citizen-science, crowdsourced solution. We, and this is the real term, gamified the problem: we put it on the web and made a game out of it, so people can correct the mistakes of our computer methods, earning points and playing against other users; at the same time they have some fun, and they can actually correct our mistakes. If we have time in the demo, I'll show you how this works.

So, as I said, this is the whole pipeline that we developed, and we're at the point where image acquisition is not the problem (we're producing much more data than we can handle), and data storage is becoming a non-problem; the problem is now on my side, on the computing part. So I always like to quote Isaac Asimov here, who said that we are reaching the stage where the problems we must solve are going to become unsolvable without computers. I would say they are already unsolvable: we shouldn't fear the computer, we should fear the lack of them.

All the computer methods that I've been developing over the past years, I developed on an open-source platform that maybe some of you know, called Fiji; or maybe you know ImageJ. ImageJ is a very popular open-source toolkit for biological and medical image processing, and Fiji is a distribution of it which comes with a lot of solutions for this type of pipeline in connectomics, and not only connectomics, but biomedical image processing in general. I'm going to spend some time talking about it because I think it will be useful for you. You have here a set of plugins, small parts of the program are called plugins, that allow you to import all kinds of biological and multi-dimensional data; you have plugins to do image registration and image stitching of 2D and 3D images; you have ways of visualizing that data in 2D, 3D, 4D; and also some manual, semi-automatic and automatic tools to do the annotation of those datasets. More importantly, based on those annotations you can then use tools to quantify, to extract the numbers, the statistics, of your image data.

So, first of all, because it's my area of expertise, but also because in neuroscience in general, whenever you want to work with images, you're going to have to deal with these types of problems, I will describe in a little more detail what is out there for image registration and image segmentation; hopefully you will find it useful.

Well, first, a short introduction to image registration, because we all know it has to do with aligning 2D or 3D images, either a pair or a sequence of images. Formally, it has to do with the type of transformation that we expect between those images. For example, if you are trying to align images that you took from a microscope and you know the stage just shifts a bit, then you know you are dealing with a translation; if it rotates, then it's a rotation; translation plus rotation is a rigid-body transformation, and so on. Those changes are, let's call them, simple: they have a linear solution, so they are more or less easy to solve. And then there are the not-so-simple problems, when the tissue, for example, gets torn or folded or stretched, and then you need to use some more complicated mathematical tricks to solve the deformation, basically treating the image as if it were elastic, or using some local corrections, etc.
In any case, this search for a common coordinate system is going to be very, very important whenever you have to integrate or compare image data that you obtain from different samples, different image modalities, or different measurements: you always have to bring everything to the same coordinate system, and only then start measuring.

So in Fiji there's a bunch of options for that. I will go over only some of them, the ones that are kind of state of the art right now, because there are some old ones as well. For example, there is one called Register Virtual Stack Slices that I developed with some collaborators; I put a link here in my slides so you can actually go to the website and see the software. This program registers, virtually (that is, without loading everything into memory), arbitrarily large sequences of 2D image data. For that it uses something called SIFT, which stands for scale-invariant feature transform: these are specific points in images that are invariant to changes in scale, perspective, zoom, et cetera. This is exactly what most cameras do, for example your phone when you take a panorama picture: they take the overlapping area between the different snapshots of your scene and try to find those points, because based on those points, which are present in both images, you can fit a model and then reconstruct your panorama. Well, you can do exactly the same thing when you have consecutive sections, especially in this type of study, because the sections really look like each other if you don't use very thick sections, right? So those points are still going to be there. You can use this for stitching, or for registration between sequences of images; you choose a transformation model and apply it to those points.
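That panorama-style recipe, detect SIFT keypoints, match them, and fit a model robustly, can be sketched with scikit-image (version 0.19 or later ships a SIFT implementation). This mirrors the idea behind the Fiji plugin rather than reproducing it; the choice of a similarity model and the thresholds are illustrative:

```python
import numpy as np
from skimage.feature import SIFT, match_descriptors
from skimage.measure import ransac
from skimage.transform import SimilarityTransform

def register_pair(img0, img1):
    """Find SIFT correspondences between two sections and fit a model robustly."""
    sift0, sift1 = SIFT(), SIFT()
    sift0.detect_and_extract(img0)
    sift1.detect_and_extract(img1)
    matches = match_descriptors(sift0.descriptors, sift1.descriptors,
                                cross_check=True)
    src = sift0.keypoints[matches[:, 0]][:, ::-1]  # (row, col) -> (x, y)
    dst = sift1.keypoints[matches[:, 1]][:, ::-1]
    # RANSAC discards bad matches; a similarity model allows translation,
    # rotation and scale (swap in a rigid or affine model as appropriate).
    model, inliers = ransac((src, dst), SimilarityTransform,
                            min_samples=3, residual_threshold=2)
    return model, inliers
```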
Then there are some more complicated solutions, for example something I developed during my PhD thesis that treats the images as if they were elastic. I was working with histological sections of mouse mammary gland, and of course the tissue has some elastic properties that sometimes you need to recover from. We came up with a solution based on B-splines that treats the images as if they were elastic and actually calculates the elastic transformation between the so-called warped image and the unwarped one. I developed it further when people started to use it in neuroscience; this is actually how I got into neuroscience, because it became popular on EM sections, so I had to improve it: make it multithreaded, able to handle large images, compatible with the SIFT correspondences for proper initialization, etc.

How do you choose the unwarped, reference image? Are you doing this manually? So, at the beginning, well, this is a weird example, right? Yes, you can select which one is your reference. In a sequence, for example, people select the one that looks best, or the one in the middle. But this is a good question, because if you select a bad one, then you're going to warp everything into something that doesn't look very realistic. And that's why we went further, and now the state of the art is another approach called elastic alignment, or elastic montage. It has kind of the same idea behind it, but instead of using B-splines it uses a system of springs, which look like triangles here. The good property is that all these springs together make the deformation as rigid as possible, to avoid spreading errors: even if you have, for example, a deformation in the middle of the image, you only get strong deformations in the middle, and the borders remain more or less in place. This was developed by Stephan Saalfeld, who was a PhD student at the Max Planck Institute in Dresden and is right now a junior PI at Janelia Farm. This is what we use right now for our sequences of EM images, because it's also able to deal with all types of images of any size, it doesn't block your RAM memory, you can leave it running overnight or even for weeks if you have terabytes of data, and you get very nice results, as I will show you later.

And of course, after you have the alignment, you want to have your image labels. Image segmentation is actually something that some people call an ill-defined problem, because people understand different things by it. Almost everybody understands it as the process of partitioning a digital image into multiple segments, that's why we call it segmentation, but it's true that some people understand it as the process of extracting the objects of your image, or at least their boundaries, as is the case here. And I like this other definition: it says that, more precisely, image segmentation is the process of assigning a label to each pixel such that all the pixels with the same label share certain characteristics. For example, in this image we can say: okay, all my vesicle pixels get a blue label, then green for my cell bodies, yellow for my membranes, and black for my background. Or you could say, and this is another correct way of segmenting: I want to do what you told me before, Ignacio, the dense reconstruction, and I want a different label per object, so each cell gets a different color. These are all correct, but they are different ways of interpreting image segmentation. The easiest, or simplest, way is to start by extracting just the borders and then decide later what to do.

In this platform we have many solutions, from the simplest ones, for example segmentation via thresholding: you take your image, plot the histogram, set a threshold value and say, okay, from my threshold to the left everything is background, from my threshold to the right everything is foreground. There are multiple ways of doing this automatically, and many of them are already implemented there. You can do segmentation via clustering, for example applying k-means on the color components; you can also directly do some edge detection by filtering. Yes? Sorry, please: using the threshold, did you classify based on the intensity of the image? The first one, is it intensity? Yes, this is on the intensities. Just like MRI segmentation? Just like MRI; well, it depends on the type of segmentation, but this is the simplest way: you just plot the histogram of the intensities and say, okay, these are for example 8-bit images, so I have values from 0 to 255, and my threshold is going to be 128; everything up to 128 is background, the rest foreground. But this is the simplest way; there are also ways of automatically selecting the threshold that look at the regions of the image and decide based on that, not as arbitrarily as I just did.
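One of those automatic choices is Otsu's method, which picks the threshold that best separates the two modes of the histogram. A minimal sketch with scikit-image; the image here is a random stand-in:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu

img = ndi.gaussian_filter(np.random.rand(256, 256), 2)  # stand-in for a section
t = threshold_otsu(img)          # threshold chosen from the histogram, not by hand
foreground = img > t
print(f"Otsu threshold: {t:.3f}, foreground fraction: {foreground.mean():.2f}")
```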
And then there's also, for example, filtering: you can, for example, smooth the image and then apply a gradient operator such as Sobel, and then you extract borders. Or you could go for not-so-deterministic methods, something a little bit more complicated, for example region-growing methods. In this case what you do is, for example, manually set a few seeds on your image, and then regions grow until they reach some minimum area and satisfy some similarity values; when they reach the borders in the image, you already get those regions. This approach is called level sets, and it's implemented, also in 3D, in Fiji.

Or there is a classic approach as well, called watershed, a kind of morphological approach, where you treat your image as if it were a topographic surface, and then you simulate rising water levels from the minimum heights, by steps; when two water levels reach each other, you create a new basin with a different color, and you directly get the segmentation. Yes? How do you define the values for the topographic surface here, what is the height: would it be the colors? So usually, yes, you take the intensities, but what is most commonly used is to first apply a gradient over the image, so on the borders you get peaks and on the flat areas you get almost black values, right? So you start having minima in the black areas, and those are going to be your starting water levels, and then you go up. I should get a better slide for this. Anyway, there are also some solutions for this in Fiji; I created an interactive interface you can play around with.

And finally, what I spent most of my research time on when I was in Boston: I was developing machine learning methods to do the segmentation, hopefully, of my EM sections. By machine learning we mean that we want the computer to learn how to do the task for us, based on a few samples that we provide. And how do we do this, for example, with our images? Okay, look at this: we have an original image, in this case a TEM section. What we do is create some features, which are nothing but filtered versions of the same image that enhance some specific characteristics of the image. For example, we can use edge detectors again, or some texture filters, etc., anything that we think is going to help us discriminate what we want to segment; let's say we want, for example, to separate membrane pixels from non-membrane pixels. There's an interface for this, I'll show you later in the demo: I just paint a few pixels, for example, you don't see much here, in green over a membrane, and then I paint a few red pixels over areas that have no membrane. And then, and this is the trick, you represent each of those pixels not only by its pixel value, but by a vector that has the pixel values at the same position in all the filtered versions of the image. So I have a feature vector, and those feature vectors are already classified as red or green, because I did it manually, right? But this is the perfect setup for any machine learning method to learn how to classify the other vectors that haven't been manually labeled. So I represent the rest of the pixels as vectors as well, I train my classifier first, and the classifier provides me with a predicted classification for each of the pixels. I hope you'll understand this better when we get to the demo.
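What the demo will do through the Fiji interface can also be scripted. A minimal sketch of the same recipe with scipy and scikit-learn, mirroring (not reproducing) the Weka plugin: build filtered versions of the image, turn the sparse scribbles into labeled feature vectors, train a classifier, and predict every remaining pixel. The filter choices and scribble positions are made up:

```python
import numpy as np
from scipy import ndimage as ndi
from sklearn.ensemble import RandomForestClassifier

def feature_stack(img):
    """Filtered versions of the image; each pixel becomes one feature vector."""
    feats = [img]
    for sigma in (1, 2, 4, 8, 16):                       # several scales, as in the talk
        feats.append(ndi.gaussian_filter(img, sigma))
        feats.append(ndi.gaussian_gradient_magnitude(img, sigma))
    return np.stack(feats, axis=-1)                      # shape (H, W, n_features)

img = ndi.gaussian_filter(np.random.rand(256, 256), 1)   # stand-in for a TEM section
F = feature_stack(img)

# Sparse manual scribbles: 0 = unlabeled, 1 = membrane, 2 = non-membrane.
labels = np.zeros(img.shape, dtype=int)
labels[100, 50:90] = 1
labels[30:60, 200] = 2

mask = labels > 0
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(F[mask], labels[mask])                           # train on the scribbles only

flat = F.reshape(-1, F.shape[-1])
predicted = clf.predict(flat).reshape(img.shape)         # a label for every pixel
membrane_prob = clf.predict_proba(flat)[:, 0].reshape(img.shape)  # P(class 1)
```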
And then, finally, what I consider the most important part of the platform, especially if you work in connectomics, is the plugin called TrakEM2. It was first developed by Albert Cardona, who used to work at the Institute of Neuroinformatics in Zurich and is now a PI at Janelia Farm; I have worked with him on this software over the years. It has all the tools that I've been talking about to produce this connectomics pipeline: it integrates stitching, registration, editing and annotation tools, it has morphological data mining and three-dimensional modeling tools, but most importantly it has a very robust and straightforward workflow for dealing with large datasets. This means that even if you have terabytes of data, you can use your laptop or your regular computer, open the datasets, work with them, and never run out of memory. Why? Because it uses a system of mipmaps: we only load into memory a version of the image that is adapted to the zoom level and the size of the patch that you are looking at, so you never load everything into memory and collapse.
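The mipmap trick in miniature: precompute downscaled copies of each section and, for a given zoom level, read only the level you need. A sketch with scikit-image's Gaussian pyramid; in TrakEM2 the levels live on disk rather than in RAM, and the sizes here are illustrative:

```python
import numpy as np
from skimage.transform import pyramid_gaussian

section = np.random.rand(4096, 4096).astype(np.float32)  # stand-in for one huge section

# Mipmap-style pyramid: level 0 is full size, each level halves the resolution.
pyramid = list(pyramid_gaussian(section, downscale=2, max_layer=6))

def level_for_zoom(zoom):
    """Coarsest level that still offers at least one stored pixel per screen pixel."""
    level = 0 if zoom >= 1 else int(np.floor(np.log2(1.0 / zoom)))
    return min(level, len(pyramid) - 1)

print(pyramid[level_for_zoom(0.1)].shape)   # (512, 512): enough for a 10% zoom view
```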
Let me show you what you can do with it. For example, you can do the type of image alignment that I was mentioning before, with the correspondences: you can find all the correspondences between tiles and then make the mosaic, in this case again of a Drosophila larva brain. You can do this iteratively for all the sections that you have in your EM sequence, but you can also find those correspondences between sections and use them to get a proper 2D linear alignment of your dataset, which in this case was enough, but that's not always the case.

Let me show you here: this is again a whole section of a Drosophila larva brain, at different zoom levels, and here we used only rigid alignment; rigid means only translation and rotation. This particular dataset was very noisy, so you see how, when I go through the sections, they get very jittery, and it's very hard to follow any of these processes, especially at the highest resolution, for more than two or three sections; even manually we get lost, if not dizzy. But luckily for us, we also implemented the elastic solution here, with the system of springs, and you see on the same dataset how everything gets much more stable, even in the presence of noise, and then you can follow all of these processes through the dataset very easily, at least manually. It's when we get to this quality of data that we kind of start painting.

Yes, I have a question. If you do this elastic alignment, it means basically that you have to deform any geometrical shape that you have in each of the frames, right, based on these triangles? So would you be able to draw something like error boundaries around membranes, for example, that capture the possible error that you made due to this deformation? So, it's a good question. It depends on how strong the deformation is in the end, but the good thing about this approach is that it tends to stay as rigid as possible; it actually applies only local deformations, so unless you have completely torn tissue or something like that, it actually renders the data much more usable, you see. Of course, it depends: you can select the size of the triangles, and if you set it too small, maybe smaller than some of the objects that you have there, then you can completely destroy or even fold them. You have to play with those parameters; it's not straightforward, you don't just click the button and it works, unfortunately.

On that alignment method: does it always work at the same scale, or could you actually first go, for example, local with smaller triangles and then go broader? Yes, absolutely. This is something we usually do: you observe that there is one big deformation over the whole image, and then you want to refine specific areas that have something particular, just that area of the brain and not the whole thing; so you pass once with some parameters and then go back to those areas. So you start with a bigger mesh and then it keeps getting finer, yes. I have to say that tuning that algorithm is not trivial; it's a bit of trial and error, but it gives you very nice results on EM.

And of course, afterwards, there are plenty of tools to do what Daniel was doing in the video: you can manually paint everything that is in your images, then select some of these objects and render them in 3D. You could render everything, although then you're not going to see much. Some other people, instead of doing dense reconstructions, also go for the manual solution, but they prefer to do a skeleton reconstruction: instead of clicking and painting every single pixel inside a neurite, inside the membranes, they just want to click once and go very fast over the dataset, iteratively constructing the skeletons of the neurons that you have in the block of tissue. And you have to be careful, of course; you say, okay, here there is a branch, here there is a synapse, and you go marking everything; you can even select the direction of the synapses, etc. In this way you produce the connectivity matrix at the same time; you can export it afterwards as CSV, or you can even reconstruct the skeletons in 3D and get fancy reconstructions over the whole dataset, where you see what connects to what.
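To make the connectivity-matrix idea concrete, here is a toy version: each annotated synapse is a directed pair of neuron IDs, and the matrix simply counts them. The IDs and counts below are made up:

```python
import numpy as np

# Hypothetical annotations: (presynaptic neuron id, postsynaptic neuron id).
synapses = [(0, 1), (0, 2), (1, 2), (2, 0), (0, 1)]

n_neurons = 3
C = np.zeros((n_neurons, n_neurons), dtype=int)
for pre, post in synapses:
    C[pre, post] += 1   # entry (i, j): number of synapses from neuron i onto neuron j

print(C)
# [[0 2 1]
#  [0 0 1]
#  [1 0 0]]
```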
Okay, just to finish the theory part, some take-home messages for you. I hope I convinced you that connectomics is at an early but very promising stage, and I hope that now you're sure that it needs a multi-disciplinary approach: state-of-the-art image processing methods, but also hardware technology. The nice part of it is that it brings a lot of challenges for the biologists but also for the engineers; it is pushing a lot of fields forward, and now there's also a lot of money involved in it, from the US administration, from the European Union, etc. And I hope you see that these techniques are very generic and flexible; as we mentioned before, they can be reused for other biological problems or image modalities. So if you have to work with images at some point during your research, please go ahead and install Fiji, and you can probably find some nice solutions for you.

Okay, so now let's go into the demo part. I don't know if some of you installed the software as I mentioned in the email; otherwise you can follow what I'm going to do on my own machine. You can download, for example, these two files, or one of them; it's going to be this image here, and then we can play around with it at the same time. How much time do I have? It's 10:49, so 15 or 20 minutes; I don't want to eat into the coffee time. Okay, let me get out of this. You don't see anything there? What if I do this? No? Okay, this is the image. Do you have it? You can get it from this link. Hasn't it loaded yet? Someone has it? Yes? Okay, well, then I will start.

So once we have the image open, we can go to Plugins, then Segmentation. You see that there's a bunch of segmentation plugins here; there's also a lot for registration, for skeletonization, a lot of tools that you may find useful. In this case we're going to Trainable Weka Segmentation. Okay, just click on it; it takes a few seconds to open. Okay, it opened in the wrong window; I need to move it over here.

So this is, again, a TEM image of a Drosophila larva brain, very similar to the ones I was using in the presentation; these are from Albert Cardona, in fact. This is the interface that I was telling you about before: you just load one image, and then, well, again I'm telling you what you are seeing: these are membranes, in this case single membranes; these are mitochondria, vesicles, some vessels here; this is a synapse, you have to believe me on that. And the idea is that we need to tell the computer which pixels are which. At the beginning there are just two classes, two types of pixels; you can actually rename them afterwards. So let's say, okay, I'm going to paint; it automatically gets this line selection tool, so you can say: some of these pixels are going to be my class one, and just click 'add to class 1'. Just make sure that you don't mislead the computer, so don't go through a membrane when you want the non-membrane class, for example. Okay, and then we need to provide samples of at least both classes, one and the other, otherwise it wouldn't make sense. So we just paint again, for example here, very carefully on a membrane, and then add it to the other class, and it turns green.

Okay, you can actually choose any of these selection tools here; for example, if I select a rectangle, I could take all of this and add it to this class, but those are way too many samples for this small experiment. And if you want to remove one of those traces? Yes: one click selects it, double click gets rid of it. Okay, so with just two samples we can start to play a bit; we can then train the classifier. As you see, for the sake of simplicity, I haven't told you what the classifier is or how many filters we're using; we just paint a little bit and train. Yes, you can select a whole area as well, but then it's going to use all the pixels inside the area, so maybe it's too much; the good thing about this sparse labeling is that you can very quickly select just some pixels. Okay, and then we click on 'train classifier'. It takes some time; it tells you: okay, you selected 216 pixels as belonging to class one, 108 to class two, and then it trains the classifier. In this case I use a random forest, which is one of the most famous machine learning methods, and it's nice because it provides an estimation of the test error: it says you're probably going to have around 2.5 percent of pixels misclassified. But this is based on the samples that you provided, so if you don't select representative pixels, that estimate will probably be off.
So we come back here, and you see it made a first estimation of the probabilities of each class. Actually, what it shows here is the result of applying a 50 percent probability threshold to each class: one side is red, one side is green. Are we there? Did it run? You can tell me if it failed, it's probably my mistake. And then you can see the result: 'create result' should create an image; of course, it's in the wrong window again; and then it gives you something like this, and you can also create a visualization of the probabilities.

The idea behind this kind of method is what we call interactive learning. Now we have trained once, but we are not satisfied with it, because we see there are some errors. You can, for example, zoom into some areas, let's say here, either with the plus key or using the magnifying glass here, and you see, for example, that this membrane hasn't been assigned to the membrane class. So I just correct it myself: again with this line tool, paint a little bit, add it to the green class, and then retrain. And now the training gets faster, because the features, the filters, were already calculated, and then it gives you a new result; in this case we have a lot of green. This is an interactive process where you can go back and forth between your traces and the result until you are satisfied.

In particular, a good practice is to go to the settings. Okay, I see, it will show you this big window, and then you see all the filters that you can use. Right now we're using just five of them, but with different radii, different patch sizes. That means, for example, you see Gaussian blur here: it's going to apply a Gaussian of radius 1, but the minimum and maximum are 1 to 16, so it's going to do 1, 2, 4, 8, 16, trying to gather information from many neighbors. Here you can actually choose the classifier; there's a whole list of them. Weka is actually a famous Java library for machine learning, so you get many other methods over there. And the most important thing, for me, is to click here in the advanced options and check 'homogenize classes'. What it does is balance the number of samples that you have per class. For example, here it's very obvious: you get a lot of non-membrane pixels and just a few that are membranes, so you are tempted to paint just a little bit on the membranes and then scribble a lot of non-membrane, and maybe you end up with double the number of samples for soma, for cell bodies, than for membranes. Many methods can be affected by that; they would just go and classify everything as cell body. For example, imagine that you have one pixel selected as membrane and nine as cell bodies: if the classifier says everything is a cell body, the error is only 10 percent, which is not that bad from the classifier's point of view. The way to prevent this is to click here, so that before training the classifier we resample to balance the proportions of each class. It's always a good practice, because most methods are affected by this.

Okay. And do we only have two classes, or is it possible to add more? It's possible to add more. Well, let's first train this to see if there's any difference in the result; in this case not much, because the numbers of samples were already quite close.
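The imbalance trap in numbers, and the kind of fix that balancing the samples achieves, sketched with scikit-learn's class weighting rather than Weka's resampling (the counts are made up):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# The trap: with 1 membrane sample per 9 cell-body samples, a classifier that
# answers "cell body" for everything is already 90% accurate and may settle there.
y = np.array([1] * 10 + [2] * 90)            # 1 = membrane, 2 = cell body
baseline_accuracy = (y == 2).mean()          # 0.9 without learning anything

# Reweighting (or resampling) makes membrane mistakes cost as much as
# cell-body mistakes, which is the point of homogenizing the classes.
clf = RandomForestClassifier(n_estimators=100, class_weight="balanced")
```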
But we can say, okay, let's create a new class. Again, this opened on the wrong desktop. You just need to give it a name; let's call it mitochondria, for example. Okay, and then it gets automatically added here, and now you can paint for it.

Yeah, so again, just a quick overview question: there are programs like IDL that have been around for 40 years for image processing. I mean, are you adding something new, are you creating something that's much easier to use for neuroscience images? How does your software differ from these others? Good question. On one side, I try to make something attractive not only for neuroscientists, and not only for biologists, because this platform is used by tons of people, especially in biomedical image processing, but also for the machine learning people. By connecting Fiji and Weka, I'm also providing new problems for the machine learners: now they can test their algorithms against something that is actually useful for us. You can go here and, over the whole list of classifiers that I showed you, some of them work really well and some of them are rubbish, so those people now need to improve their methods in order to get good results. And it's also interesting for people who don't have much knowledge about machine learning: they just come here and interactively get a solution that is better than simple thresholding or a deterministic method.

I would actually add one point to your answer. I think there are two big differences with tools like MATLAB or IDL. The first one is that this is free; IDL and MATLAB cost a fortune, in particular if you're not at a university or a grant-giving organization. Second, MATLAB and IDL are general-purpose tools, so before you get to this level you have to do a lot of work; you might as well ask the question: what's the difference between this and Java, or this and C++? The way I think about it is that IDL is basically a programming language, while this is much closer to being an application, with the ability to extend it. When you get to programming languages, you're just talking about the level at which you're programming, but this is much more like an application, so someone who knows basically nothing about programming could think of using this tool to do something useful. If you put someone who doesn't know how to program into an IDL environment, you've got a month before they do anything. That's right; although you could presumably write this in IDL if you wanted to. My idea was more to work as a bridge between the machine learning world and the biomedical imaging world, and in the meantime produce a nice application for users without programming experience.

But okay, just to finish with this demo: let's say I added the new class, mitochondria, and I want this to be a mitochondrion, so I just add it here. Okay, it made a lot of mistakes, but it added a new class, and now you have green, red, and blue; and then of course you need to tell it: these things are not mitochondria, you should correct that, etc. It requires a few rounds of training back and forth until you get the desired result. But the good thing is that once you get to the nice result that you expect, you can save the classifier and then apply it to all your images, even on a cluster. You can maybe run it and say: okay, do the classification for me, or give me just the probabilities, and then I'll work from those and create another pipeline.
Okay, so just to finish before the coffee, let me show you the game that I was mentioning before. Okay, it doesn't quite fit here... more or less. So I told you that in our pipeline we actually need to proofread the results that we get, and for that we brought our software tool to the crowd, to the web. We created a website called EyeWire, and once you log in, you can just play with different pieces of a block of retina. As I mentioned before, you don't see organelles here, which is unfortunate, but this was one of the first datasets produced; in fact it comes from Winfried Denk's lab in Heidelberg. And then the idea is that, if you go to the website on your own computer you'll probably see it better, you can go through the sections, and in dark blue you see the portion that our machine learning, our artificial intelligence, method decided belongs together. At some point it gets lost, and you say: okay, I think this goes together, so you just have to click. In fact, instead of solving both splits and mergers, what we do, oh, there's a chat as well, so whenever you do something, someone else may also be talking to you, is just clicking, because it's easier to solve splits than mergers: imagine that in 3D you want to cut a volume into two, you have to select where and how, which is much more complicated than just knowing that you have a lot of pieces that you have to put back together.

And then, once you click, you see it gets reconstructed in 3D, and you see whether it makes sense or not, and you can go on and follow the whole process: you see there are some pieces missing, you just click again, click again, it gets automatically reconstructed. Okay, that one is a little crazy; you can go farther, farther, farther; okay, maybe this should go together too, and then it goes out of the screen. At this point I just say: okay, I'm finished. Are you sure you're done with this cube? Yes, I'm sure. And then it gives you some points based on how similar your corrections of the segmentation are to those of the other people who corrected the same volume, comparing not the clicks but the volumes. The idea is that we test many people on the same volumes and then reach an agreement between the crowd, and this is what we're going to use as the accepted solution. And every day there's a different ranking: there are some people who don't have much else to do and spend hours here, and some people who just want to help. And you see that you don't need much knowledge; you just have to have an idea of what a neuron should look like. There is also some training: you can take some tutorials, and they tell you, okay, this type of neuron should look like this, et cetera. And then, once you are done, it shows you where you've been: you've seen just a piece of the big cell that we're reconstructing now, and then it moves you to another part of it.

And inside this there are games. For example, I think it was last summer, we made a competition between the people who started playing through Facebook and the people who started from Twitter, from Reddit, et cetera; there were different teams, and the winners got some prizes, or my boss made a funny video for them. These are ways of attracting regular people into science, which is always nice, but also of helping us solve our problems.
And I think with this I'm done. Thanks for your attention.