So welcome to this new NEUBIAS Academy webinar. Today we'll continue our series about KNIME. Last week our speakers Jan and Stefan introduced you to the concept behind KNIME and how to use it, and they gave you links to some exercises. Today that will be the topic of this webinar, and again we have Jan from the FMI and Stefan from KNIME, and we're pleased to also welcome Franke from the FMI as a moderator for today's session. With that, I leave the floor to our guests. — Yes, thank you very much for the introduction, and welcome back everybody for the second session. Let me just share my screen now. All right, today is part two of this introduction to KNIME for image processing. Just a brief recap and some follow-ups from the last session: last time we walked through an image processing workflow for image segmentation, and we saw that KNIME Image Processing allows building more or less complex image analysis pipelines. It also makes debugging very easy, because you can really see the intermediate result of each node in the workflow. We can also easily adapt workflows, because it's a very modular system; batch processing is built into the workflows; and the possibility to add annotations to nodes, as well as workflow annotations, really helps with the embedded documentation of whatever you do in your data science and image analysis workflows. So it's a very useful tool for reproducible science, and with that we'll dive into a few more aspects of it today. But before going into more details, I would like to point you to some online learning resources. You have already heard about the KNIME Hub that we mentioned last time. On hub.knime.com you can search for existing workflows and examples, and you can also look for the components and nodes that are available.
And maybe just to mention that in the KNIME Analytics Platform you have direct access to the KNIME Hub if you have an account there. But you can also go to the EXAMPLES server here, in the left part of your KNIME Explorer. If you double-click here, it opens the list of example workflows that you can browse. In particular, the image processing workflows are under the Community part, and under Image Processing you find a number of workflows that are tutorials for a few nodes, some application-specific workflows, and workflows for the integrations that are available. The ImageJ and CellProfiler integrations and the KNIME Python extensions are noteworthy, and there are a few highlighted applications such as TrackMate, for example, and the others here in the list. So I would encourage you to use these examples as a learning resource, and of course use the search, and also ask questions both on the image.sc forum, which is known to most bioimage analysts, and on the KNIME Forum when it's more a technical question about the core functionality of KNIME. But both forums, image.sc and the KNIME Forum, are fine for any questions about image processing KNIME workflows. All right, I would now jump to the assignment that we gave at the end of the last session. Here are the links again to the assignment workflow online; you can download it as an archive, or just go to the Hub and download it from there. Given the webinar format we of course cannot make it fully interactive so that you can click through it with us, but I will walk through a proposal of a solution now, directly and interactively in KNIME. So let's switch to the assignment workflow. Do you see that? — Yes. — For some reason my Zoom window is blocking the screen here. Okay, I can't reach the Zoom window here.
Okay, all right. In this assignment workflow the task was to load some images and then do a segmentation, similar to the workflow we walked through last week. So I will be rather quick in the beginning and then focus on the task of measuring the image intensities in our segmented objects. First of all, I start with the node repository and the workflow coach here, and use a Path to String node to convert the listed paths, which we see are pointing to six files in our data folder, into strings in order to be able to load them with the Image Reader node. So far this is the same as what we did last week. If I double-click the Image Reader (Table) node, there's not much I need to change in the configuration; as it is, it will just read the six images. If I execute it and open the view, it opens an image viewer showing me the six images that are being read. These are two-channel images, as you will know if you followed the assignment. If I click on one of them, we can see this is channel one and this is channel two, and the goal is to segment the single nuclei and then measure their intensity in the second channel. First of all, I would like to split the image into the two channels. I'm using this Splitter from the image processing extension, which, when executed, gives me a new column for every channel, so I now have two different columns for each image: channel one and channel two. Now for the actual segmentation, as highlighted in the annotation here, we will go for a simple thresholding workflow first: a small smoothing, then global thresholding, then we would like to fill holes in the thresholded images, and then do a connected component analysis to generate the actual labeling image.
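For readers who want to see the same steps outside of KNIME: the pipeline just outlined (smooth, threshold, fill holes, connected components) can be sketched in a few lines of Python with NumPy and SciPy. This is an illustration, not the KNIME implementation; the synthetic image and the simple mean-based threshold (standing in for the Yen threshold used below) are assumptions made for the sake of a self-contained example.

```python
import numpy as np
from scipy import ndimage as ndi

# Synthetic single-channel image: two bright "nuclei" on a dark background.
img = np.zeros((64, 64))
img[10:25, 10:25] = 1.0
img[40:55, 35:50] = 1.0
img[15, 15] = 0.0                       # a small "hole" inside the first object
rng = np.random.default_rng(0)
img += rng.normal(0, 0.05, img.shape)   # some acquisition noise

# 1) Gaussian smoothing (sigma = 2 in x and y, as in the webinar).
smoothed = ndi.gaussian_filter(img, sigma=2)

# 2) Global threshold (a simple mean-based threshold stands in for Yen here).
binary = smoothed > smoothed.mean()

# 3) Fill holes in the binary mask.
filled = ndi.binary_fill_holes(binary)

# 4) Connected component analysis -> labeling image, one integer per object.
labels, n_objects = ndi.label(filled)
```

In KNIME each of these steps is one node, so you can inspect the intermediate image after every stage; here the intermediates are just the arrays `smoothed`, `binary`, and `filled`.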
So if you look at the workflow coach, in case you have activated it in your installation as well, we can quickly find these operations among its top hits. We want to do a smoothing, so we need the Gaussian Convolution node here; after this one, it suggests the Global Thresholder, which is exactly the one we want; and the next suggestion is either a Connected Component Analysis or, in second place, Fill Holes, which is exactly what we want to do on the binary image before the connected component analysis. So I take this one, and after filling holes I jump directly to the Connected Component Analysis. Now let's see what these nodes actually do, step by step. The Gaussian Convolution: I double-click to configure it and leave the default settings, with a sigma of two in both the x and y dimensions. In the column selection tab I have to choose which channel we want to process; in this case I select only our first channel and click OK. If we want to see the intermediate result, this is how it should look: in general the nuclei are still distinguishable but a bit smoothed out, so we can more easily apply a threshold in the next step. For the Global Thresholder I switch to a Yen threshold and just press OK, and this gives us a binary image for the whole set of six images: we now have a black-and-white mask image of these cells. Because some of those might still contain holes, we use the Fill Holes node, which basically just gives the option to select in which dimensions we want to fill the holes. Keep it at x and y; for 2D images that's all we can do, while for a 3D image you could also select different dimensionalities here. So this gives us the binary image
without holes in the single components. That looks good, so now we just do the connected component analysis. We can run this right away, and it gives us a new image type called a segmentation, or labeling, indicated by this "Seg" column header here, which contains all the labels. If you look at these, you will see random colors for each separated object, and if we mouse over these colors, you will see up here a "value =" and then the identifier of the label, which in our case is just a different number for each object. All right, of course if we look here you will see that, because we did a plain connected component analysis, the cells that are touching are not segmented optimally. For now we leave it like that; I'll talk in a minute about other approaches we can use here. So that's basically the segmentation. — There is actually a question from the audience about whether there's a way to use watershed segmentation. — I will address that in a second, thank you. We can actually use ImageJ macros for that, for example, and ImageJ has a nice watershed implementation that we can use; I will show that in a minute. And of course we can also replace this segmentation with more complex segmentation approaches, which we will see later in this seminar. For now, if I look at the image viewer, I basically have that single column with my segmentation. Because the column name, built from the image name and the channel identifiers, is still a bit cryptic, I just rename the column to something more readable: I use a Column Rename node, which is added to the workflow by double-clicking, and I choose this column and name it "segmentation", for example. All right, that's the first part: the segmentation. Now the more tricky part is to get to the actual measurements of intensities. We first need to join back this segmentation with our split channels
from the Splitter here, and then as a second step we need a node that measures intensities. For now I just take a Joiner to merge back all the columns that I want to have, so I take the original images from the Splitter — let me just make that a bit larger — and then take the Joiner. When we configure the Joiner we always have to choose a matching criterion, and in this case it's just row ID against row ID for both of the tables we join. Since the row IDs didn't change anywhere in our workflow, the first row of one table still corresponds to the first row of the other, so the Joiner simply merges the columns together. We now have the two separated channels from the first table together with the respective segmentation of the first channel, for each of the table rows. Now there is a new node that I need to introduce for the measurement of the features: it's called Image Segment Features. It allows us to choose an input image with intensities, like the first channel, and a segmentation image, which defines the segments or objects. If I configure it, I can choose a segmentation here, and I can choose an image column from our two channels. I choose the second channel here, because we are interested in the intensity measurements from that second channel — so keep in mind the segmentation was derived from the first channel and now we measure on a different channel. For the segment settings there's an option to append segment information, which I leave checked (we will see that later), and we can choose information about overlapping labels, which is a feature these segmentation images have as well. For the actual features that we want to measure we have several categories. Under First Order Statistics we have a choice of different intensity measurements: minimum, maximum, mean,
and other statistics. We take the mean intensity for each of the segments — that's what we want to measure on our channel two. In addition, let's measure the size of the object, which is under Segment Geometry; because it can be an area for 2D images or a volume for 3D images, the feature is just called NumPix, the number of pixels (or voxels, if you like) that we measure. That is the raw size of each segment, of each object. If we execute this node, you will see in the output table, once it's done, a table like this one that actually contains a black-and-white image of each object. These ones were the objects at the border, which is why they're incomplete, but if you look at some of the others, these bitmasks are just the bounding boxes of each cell, containing a mask of the cell's shape. We also have a column called Source Labeling, which refers to the labeling this segment came from, and then we have the two features that we selected: in the column NumPix we have the number of pixels, in the hundreds and thousands, and for the mean we have the mean intensities of these objects. If you right-click on the column header, you can also choose different renderers; for example, if you want to see the value in full precision, that's just a different number format, but you can also choose bars, for example, which allow for an easy, quick visualization of the features. If you had different classes in the table you could easily see the differences between these objects. For the mean we can also do that; next to bars you can, for example, make it gray scale, which gives you a kind of heat map of the features, and you can browse through — you see here that towards the bottom of the table it gets a little bit darker, whatever that means. Of course you can mouse over these features and then you see the actual numbers.
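The per-object measurements described above — mean intensity in a second channel plus object size in pixels — can be sketched outside KNIME with SciPy's labelled statistics. This is a minimal sketch with synthetic data; the array names are illustrative, not KNIME identifiers.

```python
import numpy as np
from scipy import ndimage as ndi

# A labeling image (e.g. from connected components on channel 1)
# and the intensity image of channel 2 to measure on.
labels = np.zeros((40, 40), dtype=int)
labels[5:15, 5:15] = 1           # object 1: 10 x 10 = 100 pixels
labels[25:35, 20:36] = 2         # object 2: 10 x 16 = 160 pixels
channel2 = np.where(labels == 1, 50.0, 0.0) + np.where(labels == 2, 200.0, 0.0)

index = np.arange(1, labels.max() + 1)

# Mean intensity of channel 2 within each labelled object.
mean_intensity = ndi.mean(channel2, labels=labels, index=index)

# Object size in pixels ("NumPix"): count occurrences of each label.
num_pix = np.bincount(labels.ravel())[1:]
```

The key point mirrors the KNIME setup: the labels come from one channel, while the intensities are measured on another.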
So that's just a very quick way of visualizing it, and Stefan will later take you through a few more ways of visualizing this graphically. So far for the actual assignment that we gave you: we extracted the area and measured the intensities of our images, with the segmentation from the first channel and the measurement on the second channel. Now let me switch back to the slides shortly. We've also put the solution here on the KNIME Hub, downloadable as a KNIME archive file, so in case you want to review it, feel free to download it anytime. As a follow-up, I would now quickly highlight some alternative segmentation methods. As I mentioned, we saw in the segmentation that a few clustered cells were not separated by our very simple threshold-based segmentation, and we can improve on that in various ways using extensions in KNIME. We have an ImageJ extension that provides macro language access, so you can use any ImageJ macro in the ImageJ Macro node, and that's the first way, which I will show in a minute. In order to get these extensions you have to go to File > Preferences and, in the available software sites, check that you have selected the experimental community extensions update site. In KNIME this is under File > Preferences, and there you have these available update sites in a list, where you can also add more, or just check and uncheck sites. So, to quickly wrap this up: in the connected component analysis we saw that we have these clustered cells from time to time, because they were touching, so let's just do a simple watershed here. Because the ImageJ watershed works on binary images, I use this ImageJ Macro node, drag it into my workflow, and, since we are working on binary images, I have to connect the last node that produces a binary image here — this Fill
Holes node, which, if you remember, still has the black-and-white masks. We drag this to the ImageJ Macro node, and if I double-click here I can either add my own code with the Pure Code option or choose from some pre-configured options, and Watershed is one of them. If I go here and add this to my configuration dialog, you see there's already a pre-filled text in the dialog, which is just the three-line macro code, and I again have a column selection and some other options, but for now we can just click OK and see what it does. You might have noticed that this macro node has two outputs: if you mouse over it, you will see there's "processed images" and there's "results table". The principle of the macro node is that it takes whatever you define as an input column, makes it the current image in ImageJ, then runs your macro on it; the output image is simply the last image that is open after your macro has run, and it is fed into a new column, while the current ImageJ results table is fed to the second output of the node. If I run this and we have a look at the result with the image viewer — let's look at these clustered cells down here again, or this one as well — you see the watershed did a good job on the 2D image of separating some of these cells. So we can now take advantage of the modularity and just reconnect this output to our Connected Component Analysis; it asks me to replace the connection and to reset the nodes, and right away our workflow is functional again. We can run the connected component analysis, and we get separated cells in our downstream workflow, as you see here. So that's one way to enhance the segmentation. I don't know if there are any pressing questions on that one?
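For illustration, the effect of such a watershed step — splitting touching objects in a binary mask by flooding an inverted distance map from per-object seeds — can be sketched in Python with SciPy alone. This is not the ImageJ implementation that the macro calls; the synthetic two-disk mask and the marker-finding heuristic are assumptions made for a self-contained example.

```python
import numpy as np
from scipy import ndimage as ndi

# Build a binary mask of two touching "nuclei" (overlapping disks).
yy, xx = np.mgrid[0:80, 0:120]
mask = ((yy - 40) ** 2 + (xx - 40) ** 2 < 20 ** 2) | \
       ((yy - 40) ** 2 + (xx - 78) ** 2 < 20 ** 2)

# Plain connected components merges the touching objects into one label.
merged, n_merged = ndi.label(mask)

# Distance transform: peaks sit near the centre of each object.
dist = ndi.distance_transform_edt(mask)

# Seeds: local maxima of the distance map (the dist > 5 term discards
# the flat background and tiny spurious plateaus).
peaks = (ndi.maximum_filter(dist, size=25) == dist) & (dist > 5)
markers, n_objects = ndi.label(peaks)

# Watershed on the inverted distance map splits the clump at the "neck";
# watershed_ift needs a uint8/uint16 elevation image.
elevation = (dist.max() - dist).astype(np.uint16)
labels = ndi.watershed_ift(elevation, markers)
labels[~mask] = 0   # keep labels only inside the original mask
```

The same idea underlies the ImageJ binary watershed: one seed per object, then a flood that places the cut line along the narrowest part of the clump.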
— Actually, the audience commented on the possibility of applying such segmentations to stacks of images, in a three-dimensional framework. — Yeah, that's a very good point. With the ImageJ Macro node, if you feed in a 3D image, it basically feeds a stack to ImageJ and you can process the stack in your macro; you can do slice-wise processing or volume processing, so the options are manifold. But of course in the macro you are limited to what the ImageJ macro does: if your plugin works slice by slice, then that's what it will do. I don't know, Stefan, do you want to comment? — Yes, please. I think the question was raised a little earlier, even before the ImageJ macro, and actually I think that's an interesting point: changing from 2D to 3D. If you open up the configuration dialog of the Gaussian Convolution for a second, maybe that's worth pointing out here. If you go to the options — exactly — you see the dimension selection here at the bottom of the screen. If you open a stack of images, a three-dimensional image basically, and you have the dimension selection like this, on x and y, it will do the Gaussian convolution slice-wise. However, if you also tick the z dimension here, it automatically switches over to processing that stack as a volume instead of slice-wise, and really does a three-dimensional smoothing in that case. So the point is that you can build your pipelines in 2D and easily switch over to 3D without changing your entire workflow. That's the one point. Very likely the person asking was interested in hundreds of sections, and in that case very likely also referring to the data size. The honest answer: if you have a machine with enough memory, that will not be a problem, but since a three-dimensional processing will naively load all the slices into memory, you might actually run into trouble there and
might have to use some advanced techniques, like loading one image after another; we might come back to that later. — Yes, thanks. There are options to get either loops over the rows, or even tile loops that automatically cut your image into tiles and then put it back together afterwards, but these are more advanced use cases, I would say. All right, so maybe I will now switch to one other use case. We've seen the ImageJ1 macro, and of course we would also like to highlight more advanced segmentation techniques involving deep learning. Recently you've probably heard of approaches like StarDist and Cellpose, which are very promising, generalist deep learning models for cell or nucleus segmentation, and thanks to the nice integrations in KNIME you can use these frameworks relatively seamlessly. I would like to highlight one example for Cellpose. That is a deep learning based segmentation method recently published in Nature Methods, and basically it's called from Python, just like almost any other deep learning approach, with either PyTorch or TensorFlow — in this case PyTorch. You can choose from two different pre-trained model types, which are supposed to be very generalist and work well on almost any type of cells, but you can also use your own trained models, of course. I've put together an example that calls Cellpose via the Python integration in KNIME. In order to use that, you need to install the KNIME Python Integration; you can go to the KNIME Hub with the link included in these slides, and you will have to install it and restart KNIME, of course. In the preferences you then have the opportunity to configure the Python environments. This requires a conda installation, but on top of that conda installation not much else is needed — you don't even need to set up a conda environment yourself, because KNIME provides the
possibility for workflow editors to ship the environment definition with the workflow, or with the component, that they share with you. In our case the node doing this work is called Conda Environment Propagation: it looks at a local environment on your machine while you're developing the workflow, makes a list of the packages, and keeps that environment definition in the workflow configuration, so it can be used when running the workflow on any other computer. In order to test this, I would like to replace the essential part. This part is basically my first draft of the segmentation, which I now wrap into a new component: I reset it and rename it "Simple Segmentation". By Ctrl-clicking on Windows, or Cmd-clicking on Mac, you can always go inside this component and see its contents, which are just the six nodes that I wrapped into it, flagged with a component input and output here. Back in the parent workflow I have this Simple Segmentation, and now I would like to replace, or compare, this simple segmentation with the segmentation I've put as a shared solution on the KNIME Hub. Let me quickly go to that node — I will just move this window. On the KNIME Hub I've uploaded a component called Cellpose Segmentation, which is basically referring to that paper, and you can just drag its icon onto your workflow. So we take this one, drop it here, and connect it with the output of our Splitter. When you develop component nodes there are options to define user inputs, which I've done for this component: we can select the input column, we can select the name of the output column, and we can define the parameters or settings that are required for running the Cellpose model. These are defined and explained on the Cellpose web page; in this case you basically have to provide an
expected diameter for your objects in pixels and a minimum size, so 50 and 10 are okay here. I click OK and let it run to do the segmentation. If you run this for the first time on a computer, the Conda Environment Propagation node that is included in the component takes care of setting up a new conda environment for you, if it doesn't find an environment with the name defined there that contains the correct packages. If you run it any subsequent time and it finds that environment in your conda installation, then of course it just goes ahead and uses it. Since this defined conda environment is not using a GPU — it just does a simple CPU inference of the deep network — I can also run it on laptops like MacBooks that don't have an NVIDIA GPU, though of course it takes a little longer than it would on a powerful workstation with a GPU. — Jan, can I interrupt you for a second? There's a question from the audience whether the Cellpose model type shouldn't be "nuclei" here. — Good point, yes, I missed that one actually. We can try both; I left it on "cyto". If you just have a DAPI staining, or another well-defined staining, both of them actually work well, and the recommendation from the Cellpose authors is that if the "nuclei" model doesn't work well, you can choose the "cyto" model. It is mainly useful when you have two different stainings that define nuclei and cytoplasm, and you want to use the nuclei as seeds for the cytoplasm — then you can choose between these two models. But in our case, where we only use one channel, both of them work equally well. — And Jan, can you quickly explain how to obtain a trained model, or whether it's possible to train your own model? — Yes. In this case, because we are using the Python integration, as a workflow author you
can do whatever you can do in Python and just integrate it here, and of course for many cases it's usually best to train a model on the data that you want to apply it to. In the case of Cellpose or StarDist, they also provide these pre-trained models that are supposed to be generalist, well-performing models on many kinds of data, but of course we can also select our own trained models. Let me just click into that one: with Cmd-double-click I can open the Cellpose node and we can have a look at how it works. These green nodes define the user inputs that you've seen in the dialog, this one is the Conda Environment Propagation, and the Python Script node is the one that is doing all the work. I don't know if I can open it without unlinking it — yes. So that's a script that I basically put together from the example notebook on the Cellpose page: it creates the model and then runs it on each of the inputs in our input table. Of course that's already a bit advanced to set up, but if you know a little Python scripting you can do it, and basically you can transfer the code from a Python notebook into that Script node. Once you have it, you can easily share it with others, who don't need to go into the Python code — they can just apply that node to their data. — Your node is now called CPU because it's designed for CPU usage; there's a user question about GPU training and inference. — Yes, I named it CPU because of the conda environment: in the environment that I'm using I only installed the CPU version of PyTorch, that's the only difference. Also, when calling the Cellpose script you have an option to set GPU to false or true, and I explicitly set that to false for now, for the sake of being compatible across different systems. If I targeted my workflow at a specific workstation where I
know there's a GPU, I could also define the conda environment in a way that directly installs a PyTorch GPU version, for example. So it takes a little longer than expected here; I wonder why that is — it took a few minutes when I tried it, but maybe because of the connection it's now a bit slower on my small laptop. — So we'll use this time and ask another question: the conda environment, is it in the KNIME workspace, or can it be elsewhere on the computer? — It's actually in your main conda installation. If you go to Preferences > KNIME Python Integration, you see that on my MacBook I pointed it to my own Miniconda installation, and the environments are in the default path of that Miniconda, so if I now went to the terminal and ran conda, I would see the new environment created by KNIME in the main list of environments. — Okay, maybe for the sake of time I would hand over to Stefan so that he can introduce the other approach that we wanted to show, and we will get back to the results in a minute, if that's fine with you, Stefan. — That is fine with me. While I start sharing my screen, one comment on the conda environment question: the idea behind that is that you can have a dedicated conda environment for every workflow, basically. It really is a setting on a per-workflow basis; you can determine "yes, my workflow requires Python" and also say "this and this package needs to be available in order to run this component", for instance. Just as a small remark here. We were talking about alternative approaches to segmentation already; I'll jump ahead a couple of slides here, and Jan, I think you already mentioned StarDist. That's the part that I took a look at. I will not spend a lot of words on it here, first because the authors have done an amazing job at documenting things — you can really go to their
GitHub page — the link is on the slide — and get information and links to publications, preprints, and so on, if you are interested. In short, it is a segmentation method that puts a prior on the segmentation: namely, that the objects to be segmented don't have to be strictly convex, but star-convex, which helps a lot with, for instance, the examples we have over here, and with segmenting overlapping and touching cells. Also — I don't know the exact date, but if you search for "NEUBIAS StarDist" — there was a NEUBIAS Academy presentation in April last year by Martin Weigert and colleagues, who've done an amazing job at describing how StarDist actually works. So I'll focus a bit on the implementation and how you can use it in KNIME. One point that is also interesting: there's the main GitHub repository here, and there is also a StarDist ImageJ and Fiji plugin in a dedicated repository here. When I saw that, I actually got quite excited about being able to use StarDist in KNIME as well. Why is that? I mean, it only says ImageJ and Fiji, but the beautiful thing is that the ImageJ integration is not only an integration for ImageJ1 macros: we actually have a layer, we have functionality, to use ImageJ2 plugins as nodes that really show up automatically as nodes in a KNIME installation. You briefly saw that already in Jan's node repository: if you looked at the details, you saw a category called ImageJ2 here, and if you take a look, it says Edit, Image, Plugins, Process, and so on. That should sound a bit familiar if you've used ImageJ and Fiji before, because it is a categorical representation of the menu structure of ImageJ and Fiji — and I should be specific here: of ImageJ2 inside Fiji. For instance, we do have a Gaussian Blur
entry here that, for example, shows up as a node in the filters category. You can really take it, drag and drop it as a node into a KNIME workflow, and it uses exactly the same underlying implementation as it would in ImageJ2. It automatically derives the configuration dialog for those plugins — or commands, as they're also called — from the parameter definitions of the plugins, so we can set the sigma, the radius, for Gaussian Blur, for instance, right here in the configuration dialog. So there is also a mapping of usage concepts onto KNIME's: interactively selected parameters are translated into a configuration dialog on a node. And why that got me so excited is that it's not only the regular ImageJ2 plugins that are shipped — you can also integrate custom ImageJ2 plugins, which then show up as nodes in your KNIME installation. Really: if you have implemented some ImageJ2 plugin, like this one that I think was implemented by Jan at the FMI, you define your parameters and do the implementation once, and you can use it in ImageJ and Fiji, but you can also use it inside KNIME. How do you do that practically? Jan already showed that in the preferences there is a KNIME Image Processing Plugin page, and there's an ImageJ2 plugin installation section: you can click Add here and manually add custom JAR files with ImageJ2 plugins for use in your KNIME installation. Beware that you have to restart KNIME once before they show up in your node repository. With that I actually thought, well, that's amazing: we do have a StarDist ImageJ plugin, I can just use it inside KNIME. Unfortunately that didn't work out 100 percent yet. However, the interesting part is that StarDist is a deep learning based method, and we actually also have native integrations of deep learning in KNIME. There are a couple of those
integrations with um popular frameworks like keras tensorflow um onyx which is also a cross platform um standard basically um also deep learning um for j so the nice thing about that is or the two things that i want to point out that are really particularly nice about that is that you can actually build the topology of a deep learning um of network with nine nodes with those keras layer nodes for instance however you don't actually have to use that you can see that as an example down here you can use python and the keras python api to define um to define your network in such a node here and then even use this topology and manually add nodes in here take a keras network converse converted to a tensorflow network and then use the native tensorflow execution to actually um take the network and use it for prediction wise that's so interesting and the tensorflow integration is a native integration so it tends to be a bit faster than the keras integration that uses tensorflow in the back end so the results should be the same um but the the processing speed is a little faster with the native tensorflow and integration and that's actually where i thought well i have all the pieces um all the pieces together um why don't you try to put this together and build a um stardust um component and i actually um did that i just have to find my browser here and move that move that over um from the nime hub you can actually search for stardust and i uploaded a component here and also an example workflow um let's take the component here and drop it into the workflow that um jan has also built beforehand so he used to call that a simple segmentation for me it's just called a segmentation and the idea behind the component is again very similar we hook it up to um to our input data and in this case i'm only using um a two-dimensional um stardust model here to open up the configuration dialogue um have to select the model that we want to use for segmentation so you can easily change this 
out so this is a trained model that performs quite well in fluorescence images and nuclear segmentation but you can easily switch that out and point it to a new zip file here which is then used for segmentation and obviously i'll have to select the image column on which to actually do the um segmentation since this takes a couple of seconds to run i will actually insert a row sampler here to not work on the um entire dataset but only process it on one image so now we only see down here in the node monitor there's only one image in our input table and now we can give the stardust a try and we'll see what's actually going on uh-huh the zip file could not be found so that is obvious i'll have to point it at the correct network that should hopefully work out let's give it another try yeah and that looks better we can take a look at the inside of the component that looks very similar to what Jan showed before a column selection node a node where you can select the model that is applied we do have the tensorflow network reader in here interesting bit and pieces is the tensor network executor that really applies the model here and what is a bit particular about stardust is that we don't actually get a pixel by pixel segmentation out of the neural network but we do get um the probability map and radial distance maps out here so we'll actually have to do a bit of post-processing of the output of the neural net and luckily that's um where the imagej2 integration comes in the authors of the imagej2 of sort of the stardust imagej integration actually have provided a plugin that only does the post-processing the non-maxima suppression basically takes what comes out of the neural net and creates a segmentation from it or a sorry I actually have to take a look here and actually creates a labeling image and automatically returns that so with that that's actually really quite easy to just drop in the stardust segmentation and would now be able to also use this in our pipeline and 
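What that post-processing does can be illustrated with a deliberately simplified sketch. The real StarDist non-maxima suppression also uses the radial distance maps to resolve overlapping star-convex polygons; this toy version only keeps thresholded local maxima of the probability map, and all names and values here are illustrative, not the actual plugin's API.

```python
import numpy as np
from scipy import ndimage

def candidate_centers(prob, prob_thresh=0.5, window=5):
    """Crude stand-in for StarDist post-processing: keep pixels that are
    above a probability threshold AND are the maximum of their
    window x window neighbourhood (a simple non-maxima suppression).
    The radial-distance part of the real method is omitted."""
    local_max = ndimage.maximum_filter(prob, size=window)
    mask = (prob == local_max) & (prob >= prob_thresh)
    return np.argwhere(mask)  # (row, col) coordinates of detections

# Toy probability map with two well-separated peaks.
prob = np.zeros((20, 20))
prob[5, 5] = 0.9
prob[14, 12] = 0.8
prob = ndimage.gaussian_filter(prob, sigma=1)
prob = prob / prob.max()  # strongest peak becomes 1.0

centers = candidate_centers(prob, prob_thresh=0.5)
```

Running this on the toy map yields exactly the two peak coordinates; in the workflow, the equivalent step turns the network output into one label per detected nucleus.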
We could then exchange it for Cellpose, and so on. That modularity really is powerful if you're still exploring and don't yet know which segmentation method is best. All right, that was it about alternative segmentation methods. Let me move my browser over and quickly jump back to my slides, because we promised, or I specifically promised on Twitter, that we would also talk a bit about data analysis, not only image processing. So let's get to that and first take a step back. Jan already showed the input data from the assignment: two-channel images, where we use one channel to do the segmentation and are then interested in extracting the intensities from the second channel. However, if you've done the assignment and looked a bit at the dataset and the file names, you'll notice that there are two different classes, two groups of data in here. That already hints at the explanation Jan pointed out before, that the means change as you scroll through the results of the Image Segment Features node: we really do have different classes, different types of data in here. This is what we're now interested in. We might have something like untreated cells versus treated cells, in class A and class B, and we want to compare the cell features that we extract for each group, to see whether there really is a difference between the two groups. Before we do that, let me briefly step back and talk about data aggregation, again using an example that is not image related. The idea behind data aggregation is that you have some categories or groups in your data and want to apply an aggregation method to the individual groups. Sticking to the example, we have an input table with a product ID, a category, and the number of ordered items. An example of data aggregation would be: how many items have been ordered in total for each category? You look at the entries for the clothing category, with ordered items two, one, and five; you sum up two plus one plus five and end up with eight. For the home group we only have the single entry three, so that is directly the output, and for the electronics group we again sum up its entries. That's the general idea behind data aggregation: we have groups, and we aggregate the numbers inside each group somehow. The sum really is just an example; it could also be computing an average of the individual values in each group, computing standard deviations, and things like that. How is that done in KNIME? There is a dedicated node for it, the GroupBy node, which has a configuration dialog; let me jump over and show you how that looks live. First I add an Image Segment Features node so that we have a couple of features to work with; it takes the segmentation and the second channel, and we said we were interested in the average and the number of pixels. Click OK, execute, and take a look at the output: we now see our individual bitmasks, the sizes, and also the averages. We went in with six files, so one possible grouping would be to look at all the cells within one file; in a second step we will then aggregate all the cells that belong to class A and all the cells that belong to class B. So let's first take a look at the file-based aggregation.
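The aggregation idea above can be sketched in a few lines of plain Python. The clothing entries (2, 1, 5) and the home entry (3) come from the example in the text; the electronics values are made up for illustration, since the webinar doesn't spell them out:

```python
from collections import defaultdict

# (category, ordered items) rows mirroring the example table;
# the electronics values are hypothetical.
rows = [
    ("clothing", 2),
    ("clothing", 1),
    ("home", 3),
    ("electronics", 4),
    ("clothing", 5),
    ("electronics", 2),
]

totals = defaultdict(int)
for category, n_items in rows:
    totals[category] += n_items  # the "sum" aggregation per group

# totals["clothing"] == 8, exactly as worked out in the text
```

Swapping the `+=` line for an average or a standard deviation gives the other aggregation methods the GroupBy node offers.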
How could we do that? We add a GroupBy node, let me make that a bit bigger, and take a look at the configuration. First we have to define the column that contains the group by which we want to group. Jan already mentioned that there is a Source Labeling column, which contains the labeling image from which a cell was extracted, and we can use that as a group. Then we switch over to the Manual Aggregation tab and, for each individual image, take the number of pixels, the cell sizes, and add that. Now I have to manually select the aggregation method, what we want to do with the individual numbers; that's a drop-down here, and you see there is a lot you can do: we can compute the average, the median, the maximum, and so on. Let's say we're interested in the average cell size per image; that's what we're computing now. Click OK, execute, and take a look at the grouped table. In this Source Labeling representation we now have class A image one, class A image two, class A image three, and the individual averages over the number of pixels, which is the average cell size per image. However, as I mentioned before, that's not necessarily what we're interested in: we really want to compare the two classes, not individual images. There is a helper node I have to introduce here, the Labeling Properties node, which I can use to extract the file name from a labeling. In its configuration dialog I select append, use the Source Labeling column, and here again we have the properties. I'm interested in the name in this particular case, but I could also get information like the dimensionality of an image, the number of pixels of the entire image, the calibration, and so on; I mention it just so you've seen it, but we're only interested in the name for now. Execute this and take a look at the results table. Now we have the file name extracted, but only one file name per row, because we did the aggregation beforehand, so what we know now is the average cell size per file. Let's just continue and fix our little mistake later, because we don't actually want this first aggregation. We'll use a second node called String Manipulation, a generic KNIME node that you can use to manipulate and transform string columns, that is, text columns. In our case we'll use it to extract class A and class B from the file name. This is what the configuration dialog looks like: there are a couple of functions here, very similar to the ImageJ macro functions for instance, and we're looking for one called substring. As the name suggests, it extracts, from a string that we provide, a given number of characters starting at a given position. We double-click on it and it is inserted into the expression field. The first parameter to the function is the entry in the column we want to work on, the Name column, so we double-click that as well. Where do I want to extract from? I start from the very beginning of the string, extract the first six characters, and append a new column called class. Execute it and take a look at the output table, where it becomes easier to understand what we're doing: for each row, for each cell, we extract the first six characters to figure out whether an image belongs to class A or class B. As I mentioned before, though, we're not interested in this at the file-aggregation level.
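What the substring expression does can be mimicked with Python string slicing. The file names below are made up; the webinar only states that the first six characters of each name encode the class:

```python
# Hypothetical file names whose first six characters encode the class.
names = ["classA_img1.lsm", "classA_img2.lsm", "classB_img1.lsm"]

# Equivalent of substring(name, 0, 6) in the String Manipulation node:
classes = [name[:6] for name in names]
# classes == ["classA", "classA", "classB"]
```

The result column plays the same role as the appended "class" column in the workflow: one class label per row.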
We are actually interested in doing this for each individual cell, so let's do that. We take the GroupBy node out of the chain, connect the Image Segment Features node directly to the Labeling Properties node and the String Manipulation node, execute, and look at the output table: now, for each individual cell, we know that this cell with this average intensity came from class A, for example. We can use that to get some descriptive statistics. We take our aggregation node again and look at the configuration; now we're not interested in grouping by file, so we remove that, but in grouping by class, so we move the class column over and use it as the group column. We change the manual aggregation to compute not only the average cell size but also the average intensity per cell, click OK, execute the node, and look at the output. Now we have two classes, class A and class B, and we see that the average cell size is already slightly different between the two classes, while for the average intensity there is a huge difference. But this is only one value extracted per group. Why don't we do it a bit more exploratively and look at a box plot, so that we also see the distributions of the two groups? We feed exactly the same data to a Conditional Box Plot; conditional here means that we want one box plot per group. We hook it up to our data and look at the configuration dialog, which is very similar to the other configuration dialogs: we first need a category column, which is our class, and we'll focus on the average intensity; the other default values we can leave as they are. Click OK and execute the node. Maybe before you move on: is there a way to change the visualization, to do a violin plot instead of a box plot, for example? Yes, there is, let me check that I have it installed. Yes: there is a dedicated extension, the Plotly extension, which adds a couple more visualizations; for instance we could add a violin plot here, and thank you, Franca, for bringing that up, I do have that on a slide. If we take a look at the views, the JavaScript views, you'll see that there are quite a few visualizations you can use; it's not only the conditional box plot, there are line charts and scatter plots and so on. So there is a lot more you can do visualization-wise; if you're interested you'll find a lot of examples on the Hub, and for specific questions also on the forum, because we ran an entire course on visualizations with KNIME, so there is a lot more to explore. Let me try to connect that. Maybe just one short question: could you also select defined subpopulations for plotting, in the violin plot, just for completeness' sake? Yes, there are two ways. The "naive" way, in quotes, is to filter or split your data before feeding it into a visualization, into group A and group B for example; there is a node called Row Splitter where you can define the rules for splitting your input data, then feed the top port to one visualization and the bottom port to another. That would be one way. But you can also, and this goes deeper, create composite visualizations by putting views into components, and you'll see that this component should now have a view that contains the conditional box plot as well as the violin plot down here. Those visualizations can talk to each other: it is possible to select some data in the plot up here and then show only the selected data in the plot down here. So yes, that is possible, but it's a somewhat more advanced concept; I can put a link to the documentation about it in the Q&A section. Good, more questions, or are we good for now? Maybe just to add a comment: the interactive visualization Stefan showed is great to have in KNIME, but if you already have a script, for example in R, for your own scientific visualization, you can also use the R integration, feed the table into that R script, and use ggplot2 to generate a static plot, or do the same with the Python View node when you have Plotly scripts, for example. Yes, exactly, thank you very much, that's indeed a very good point. I don't have the R extension installed, so I can't show that; you only saw the Python one. Now, where was I? Is the difference in the means actually statistically significant? For that we would use a Student's t-test, or an independent groups t-test as it's called in this case.
There's a node for that, as there are nodes for a couple of other hypothesis tests you can use in KNIME; it is a generic data analysis tool, after all. We open the configuration dialog and first define our grouping column again, the class. This node can only compare two groups, so I select class A and class B, and we can leave both the average sizes and the average intensities in here. Click OK, execute the node, and take a look at the interactive statistics view. We get some descriptive statistics for the different properties in the different groups: how many cells we have per group, what the averages are, and so on. But down here we really see the test statistics. For the number of pixels, assuming equal variances for the distributions of the two groups, we get a p-value, and for the average intensity we also get a p-value, which is much lower; both, however, are way below the 0.05 limit for statistical significance. So there is a statistically significant difference in the averages, both when looking at cell size and when looking at average intensity. And that is the end of this short idea of what you can also do with KNIME analysis-wise. The great thing here, and I should really stress this, let me expand the component again: right after the Labeling Properties node we have only used generic KNIME functionality. It doesn't matter where the data comes from that I want to aggregate, generate descriptive statistics about, run a hypothesis test on, or plot. This is generic functionality that I can use for images or, say, for molecule structures, which are supported in KNIME as well. That opens up a really huge box of functionality that you can use on your image features too, for example training a model that predicts whether a cell belongs to class A or B depending on how big it is. We have a lot of nodes in the machine learning area that you can use out of the box. And those, I think, are the famous last words: I would now be happy to hand over to Jan, who will briefly open up another can of worms, namely working with multiple labelings. Maybe quickly, before we go to the next topic, there are two open questions from the audience. One question is whether you can extend the nuclear mask to measure something in the cytoplasm. Yes, you can: the easiest way is a morphological operations node for labelings that you can use to dilate or erode the masks, basically to extend them. There are such nodes working both on binary images and on labelings, with opening, closing, dilation, and erosion operations. Cool, thank you. And, just out of interest, another question was whether it's possible to convert the little bitmasks that you get from the Image Segment Features node into some other kind of output that makes a table like this easier to export. Maybe I can start with that: the bitmasks you get from the segment features are not necessarily connected components, they can be disconnected labelings, so they are not necessarily polygons. For connected components, and they usually are connected, otherwise it wouldn't make much sense, I don't know whether there is a dedicated node to create polygons. I had a use case once where I used an ImageJ2 plugin to create the list of coordinates of the outline; I'm happy to share that and will put a link in the Q&A section on the forum after the seminar. Is there anything else? They are regular images, after all, so there is an Image Writer node, which uses SCIFIO, to write the images, and you can just dump them into files for further processing, PNG files for instance, or whatever you like; I mention it because I had that use case recently. All right, so for the last five to ten minutes I would like to tackle another topic: working with multiple labelings. As Stefan mentioned before, the segmentation or labeling image type, based on the ImgLib2 library, has the unique property of allowing multiple labels to be stored for each pixel, and we can take advantage of that feature. Let me switch back to KNIME and to the segmentation workflow that we touched on last week, if you remember. This was basically three images with a very simple nucleus segmentation, and the task I introduced was not only to segment the nuclei but also to look for the foci in this second channel and count the foci of this marker for each nucleus. For this case we can do two things: we can segment the nuclei and segment the foci, getting a segmentation for both, and then there are nodes that allow merging these two labelings together. I need to introduce a few things beforehand. This is the link to the nucleus segmentation workflow from last week, and we will now add two more nodes to wrap up the session. One is the Label Transformer: in a labeling image you have numeric labels from one to the number of components, but you can also rename them and make it a string-based labeling, so instead of just a number you can
give them any name. In our case, when we merge labelings, we need to make sure that we have different labels for the nuclei and for the spots, so we will need this node. And the node that actually generates overlapping labelings is the Labeling Arithmetic node; I'll quickly demonstrate that. To this workflow from last week I just added another simple segmentation on the marker channel, the second channel. If you look closely, let me go to the image viewer here, you see that small spots have been segmented, the small foci within the nuclei. Of course, we don't see yet whether they are within a nucleus, and within which nucleus. So I will join these together: take this one here and this one there and join row by row, so that we have the two labelings next to each other, the nucleus labelings in this column and the spot labelings here. Now we need the Label Transformer node, and I will put it into the connection for the nuclei, because we only want to transform the nuclei. I configure the node so that the name always contains "Nucleus_", and from the list of variables here I use the current label, which means it appends the current number for each nucleus. If we look at the result, you see in the value column of the output that there are Nucleus 1, Nucleus 4, Nucleus 5, and so on. That allows us to merge these labels with the spot labels using the Labeling Arithmetic node. Here in the options I choose the merge method; it's explained in the description panel as well, if you want to read what that actually means. We choose our first labeling and our second labeling, the two different labelings that we created, and the output of this node is a joint labeling with the unique feature that every single pixel can carry multiple labels. If we look, for example, at the value column here, you see "1" and "Nucleus 1": this is the first spot in the image, and it also belongs to the first nucleus. Down here we have only the label "Nucleus 3": that is a nucleus without spots. This now allows us, with the Image Segment Features node again, to extract all the separate objects and to filter on the overlapping segments. I use the merged segmentation and the channel-two image, and in the settings we can say "append labels of overlapping segments", where the overlapping segments do not need to overlap completely. The bottom sections allow us to filter either on the labels of the segments we want to list or on the labels of the overlapping segments, and that's what we use here. I gave all the nuclei a label starting with "Nucleus", so I put an asterisk after it, which is a wildcard for anything that follows, and check "contains wild card". Just to show you the power of this approach: in this node we now see a list of all the spot segments with their labels, and the Label Dependencies column contains the name of the nucleus with which each spot overlaps. For this Nucleus 5 you see that three spots in the original image were inside that nucleus; below there is another Nucleus 5, which however stems from a different file, the two.lsm file. So if we now group by Source Labeling and by Label Dependencies, that is, by all the nuclei, we can use this powerful GroupBy node again; I use that node a lot in all kinds of analyses within KNIME. We group by Source Labeling and Label Dependencies, and now we will just count the number of spots.
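This group-and-count step can again be mimicked in plain Python. The rows below are made-up stand-ins for the (source file, overlapping nucleus) pairs that end up in the Label Dependencies output; only the "three spots in Nucleus 5 of one file" situation is taken from the walkthrough:

```python
from collections import Counter

# One entry per detected spot: (source file, nucleus it overlaps with).
# These values are hypothetical, for illustration only.
spots = [
    ("one.lsm", "Nucleus 5"),
    ("one.lsm", "Nucleus 5"),
    ("one.lsm", "Nucleus 5"),
    ("two.lsm", "Nucleus 5"),
    ("two.lsm", "Nucleus 2"),
]

# Grouping by (file, nucleus) and counting occurrences is what the
# GroupBy node's Count aggregation does in the workflow.
spots_per_nucleus = Counter(spots)
# spots_per_nucleus[("one.lsm", "Nucleus 5")] == 3
```

Grouping by the file as well as the nucleus name is what keeps the two different "Nucleus 5" labels from different images apart.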
So I take the Label column in my manual aggregation, add it, and as the aggregation method I choose Count, so that I count the occurrences of labels in each group. This gives us a table with the number of spots for each nucleus: for example, this Nucleus 9 in the last image had six spots, and we can look at the Labeling Arithmetic output to verify that in our third image there was indeed one nucleus with six spots; it's this one, you see it has the Nucleus 9 label, and here are the six spots. So this was the way to count our spots per nucleus. I have to cut it short here because of time limitations. Let me just quickly switch back to what I left open previously: the Cellpose segmentation node also ran successfully, giving us a new column with a segmentation that worked well on these nuclei, segmenting the touching cells fairly well too. And with this I think we'll have to close for now, so that we can still answer a few questions. The workflow I just demonstrated is also available for download from the KNIME Hub directly. Let me wrap up with the conclusions: I hope we were able to show you the nice modular nature of KNIME and how it helps to make reproducible science and batch processing easy and well documented, with annotations, grouping, and component workflows. Are there any open questions? I can buy us some time with a final remark, because we had a question on it as well: Cellpose and StarDist showing up now, and the ImageJ integration, that is really something we try to keep in KNIME's DNA, being open to integrations and not trying to reinvent the wheel every time. For example, preparing the slides today, I also used ImageJ for some interactive visualization and exploration of parameters,
and then put the parameters into a KNIME workflow, simply because different tools are good at different things, so choose the tool that you like. We try to integrate basically as many as we can, in a really usable way, so that you can put them into one end-to-end workflow: images in, and data, statistics, and visualizations out. There is a question from the audience now, Jan: "I tried to connect OMERO using the KNIME OMERO Reader node version 5.2, but the login failed. Is this node compatible with the latest version of the OMERO server, which is 5.6.3?" It is not a clear no; honestly, I would have to look it up. There is a released and trusted version of the OMERO integration that is compatible with 5.2. We've done some work on a nightly, experimental update site that you have to activate, but the last OMERO version I tested it on was 5.5, so I'm not entirely sure whether it works for 5.6. In order to try it you have to add an additional update site; we can put the documentation on how to add that update site into the follow-up answers, and I'll also take a look at it myself. Thank you. And maybe as a general remark again: please feel free to contact us on the image.sc forum, or on the KNIME forum for more technical questions about KNIME; we're always happy to answer any questions regarding image analysis. Great. Thank you, Franca, for the moderation, and thank you, Jan and Stefan, for the two amazing sessions you gave us. I think it's already the second or third time I've followed you presenting KNIME, the first time online, and each time I learn new things and see the software evolving and bringing new features and tools; it's really great. So thank you again for these three hours of presentation. With that, I think it's time to say goodbye. Thank you all! Thanks for having us, thanks everyone for attending, have a great day, and stay safe.