So, I'm going to talk about image processing and about two tools, scikit-image and Dash, which I hope can help you build very modular pipelines and applications for your own image-processing needs. I'm a member of the scikit-image core dev team, and the work on Dash that I'm going to present has been sponsored by Plotly, a Canadian company which I'm going to join in the fall.

As you may know, in the PyData community images are a very common form of data, and there are needs for image processing in various domains of science and industry: for example in biology, where you have microscopy images, or for satellite imaging and quality inspection, but also for autonomous cars, where you want automatic segmentation, and so on. As you see, the needs are very different; they span a large number of fields, and in all these fields various tools exist. Some are libraries, some are user interfaces, and they are all quite different. Some of them are written in Python, and the tools I'm going to present today are not meant to cater to one very specific need, but rather to help you build your own image-processing application when you have a specific need.

So, let me start with scikit-image. Just out of curiosity, how many people here in the audience are using scikit-image? Can you please raise your hand? I am. Okay, so some people; please come and talk to me after the talk to tell me about all your bugs with scikit-image. I will love that.

scikit-image is a generic image-processing library. When I say generic, I mean it's not for one application in particular; its mission is to process scientific images rather than, say, to apply Instagram filters. It's really more for scientific needs. It's open source; all the content I'm going to cover today is open source, BSD- or MIT-licensed. And it's for Python, obviously, here at EuroPython, using NumPy arrays as images. Compared to other image-processing tools, one specificity is that scikit-image works well with 2D but also with 3D images, and sometimes with nD images; in science you have MRI, CT, a lot of modalities that produce 3D images. Last but not least, scikit-image tries to have a consistent and simple API, good documentation, and a gentle learning curve, so that when you're getting started with image processing, you can get going quite smoothly and learn by yourself.

Here is a short overview of what you can do with scikit-image. It's image processing for science: basically, manipulating images in order to transform them for other purposes, like when you want to filter them (here you have a denoising example), or when you want to extract information, such as feature extraction for further classification. When you want to extract objects, this is called segmentation. Or, after some processing, when you want to measure the size and shape of objects, that is, to turn your images into numbers out of which you can do science. So this is what scikit-image does, and here is what scikit-image is not. It's not a deep learning library, I'm afraid. There are really great deep learning libraries with image-processing capabilities; Keras, for example, has some nice image-processing examples. The reason there is no deep learning in scikit-image is mostly a matter of architecture and maintenance choices.
We chose to be a very maintainable library, well integrated into the NumPy/SciPy ecosystem; that is, all the code is in Python or Cython, so there is no GPU-specific code, for example. However, scikit-image interacts well with machine learning and deep learning, both for the pre-processing and the post-processing parts: before deep learning you can do normalization or data augmentation, and after deep learning you can improve your segmentation, clean up instances, and so on and so forth.

Also, one thing that we do not want to do in scikit-image is to include a lot of very bleeding-edge algorithms, like the one you just published during your PhD six months ago. It might be a really cool algorithm, but if we did this, we would end up with a hundred denoising filters, and then how would our users find their way through the library? We want to keep the API small so that it's easier to find the functions, and therefore we let time do the Darwinian selection of the algorithms we include.

scikit-image is a full-fledged component of the scientific Python ecosystem, and as such it works with NumPy arrays, which are the images we process. So it interacts really well with scikit-learn (this pointer does not work; yeah, it's very weak), because you can pass NumPy arrays from scikit-image to scikit-learn and vice versa, and it also interacts really well with the visualization libraries of this ecosystem, because once again it's the NumPy array, which is kind of the lingua franca of the SciPy ecosystem, that we exchange and pass between all these modules.

Here is a very short glimpse of the kind of code you would write with scikit-image. I'm not going to make a big demo; you can find a lot of tutorials on YouTube, for example. What you can see is that you first import submodules, because the functions typically live inside submodules: for example, io for input/output, reading an image from a file. This image will be a NumPy array; you see here I'm asking for its shape. Then the API is that you have functions, like this thresholding function, which take a NumPy array as input and return either numbers or filtered images, which are once again NumPy arrays, like this function, for example, which labels the connected components of this binary image. The input is a NumPy array, and the output here is also a NumPy array, as you can see.

So the NumPy array actually has all we need for image processing, because pixels are just array elements, and our API is really only functions working on images and returning images. The first argument is always a NumPy array, and then we have optional parameters, which are keyword arguments, if you want to tune the behavior of a function; we try to have sensible default values. Here I have an example with a 2D image, but this block of code would work with exactly the same syntax if you had a 3D array. Since pixels are array elements, we can use all the machinery of NumPy: here it's just pixel indexing, changing the values of pixels, accessing a channel of an RGB image (a three-dimensional array with three channels), but you can also do masking, fancy indexing, and so on. So the API is simple: we have submodules, and these submodules have functions taking NumPy arrays as input, 2D, 3D, sometimes nD, and returning a number or an array.
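For reference, here is a minimal sketch of that pattern. It is not the exact code from the slide; it uses a sample image bundled with the library instead of a file, but all the functions shown are real scikit-image API:

    # A minimal sketch of the API style described above.
    from skimage import data, filters, measure

    image = data.coins()    # a sample image shipped with scikit-image; a NumPy array
    print(image.shape)      # a tuple (rows, columns)

    threshold = filters.threshold_otsu(image)   # takes an array, returns a number
    binary = image > threshold                  # plain NumPy comparison, returns an array
    labels = measure.label(binary)              # connected components, again a NumPy array

    # Pixels are just array elements, so all the NumPy machinery applies:
    binary[0, 0] = True                         # pixel indexing and assignment
    # for an RGB image of shape (rows, cols, 3), img[..., 0] would be one channel

The same calls work unchanged on a 3D array.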
For the API, over time we have converged on something quite consistent. If you started using scikit-image five years ago, maybe it was a bit more chaotic, but now, for example, all the denoising filters start with "denoise", so you can discover new functions, new filters, just by browsing the API and reading the docstrings of these functions. I will also show the gallery, which is another way of exploring scikit-image. We try to be consistent for variable names too, and also inside the code, for example in how we name indices; something as mundane as: are you using x, y, z, or are you using plane, row, column? We have had heated discussions on GitHub to try to find some consistency here.

Here is a short example to show you that scikit-image and scikit-learn interact really well. It's an image that I acquired for my research with my team: a grain of gypsum, the mineral that plasterboard is made of. Part of it has been dehydrated (the part which is textured) and part of it is still intact, and we wanted to segment this automatically. For this, we extracted features in these two regions using the feature submodule of scikit-image, then fed these features to a random forest classifier from scikit-learn. It gave us a first segmentation, but it was not really good; it had a lot of mistakes. So we cleaned this segmentation using traditional image processing: Gaussian filtering, thresholding, and mathematical morphology. This is just to show you quickly the interplay between machine learning and image processing.
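The exact features we used are not on the slide, so here is only a hedged sketch of such a pipeline: a simple bank of Gaussian-smoothed images stands in for the real features, and labeled_mask is a hypothetical array marking the two hand-annotated regions:

    import numpy as np
    from skimage import filters, morphology
    from sklearn.ensemble import RandomForestClassifier

    def pixel_features(image, sigmas=(1, 2, 4, 8)):
        # One feature per smoothing scale, one row per pixel.
        stack = np.stack([filters.gaussian(image, sigma=s) for s in sigmas], axis=-1)
        return stack.reshape(-1, len(sigmas))

    # `image` is the grayscale micrograph; `labeled_mask` is a hypothetical
    # array with 0 = unlabeled and 1 / 2 = the two annotated regions.
    X = pixel_features(image)
    train = labeled_mask.ravel() > 0
    clf = RandomForestClassifier(n_estimators=100)
    clf.fit(X[train], labeled_mask.ravel()[train])
    prediction = clf.predict(X).reshape(image.shape)

    # Clean the raw prediction with classic image processing:
    # Gaussian filtering, thresholding, mathematical morphology.
    smooth = filters.gaussian(prediction == 2, sigma=3)
    cleaned = morphology.binary_opening(smooth > 0.5, morphology.disk(5))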
A few facts about scikit-image. The current release is 0.15. We have more than 200 contributors but between 5 and 10 maintainers, so we really try to welcome new contributors, and I would be happy to talk with you if you might be interested in contributing to scikit-image or reviewing pull requests; we always need enthusiastic people. Our community is quite large: we have 20,000 unique visitors per month on the scikit-image website, scikit-image.org. That's how we estimate the number of active users.

If you go to scikit-image.org, you will find one of our most beloved features, the gallery of examples, which lets you browse thumbnails showcasing image-processing applications; you can select one and open the example. I would like to give a brief shout-out to the package underlying this gallery. It's called Sphinx-Gallery: if you're building your documentation with Sphinx, you can just enable it as a Sphinx extension and get such a gallery from plain Python scripts. Here is one rendered example, with the code, the image generated by the code, and some explanations. The gallery of examples is really the most visited part of the scikit-image website, because our users come to the gallery thinking "I want to measure the size of objects in an image", do Ctrl+F on the page or something like that, and open an example. We also sometimes have "see also" cross-references between examples. Sphinx-Gallery gives you other nice features too: at the end of a docstring in the API documentation, it will create mini-galleries, like this one, with all the examples using a specific function. This comes for free when you enable Sphinx-Gallery, and within the examples you also get links back to the API documentation. There is a lot of redundancy and cross-linking between the different parts of the documentation, and it helps your users not to get lost in some dead end. I really recommend giving Sphinx-Gallery a try.

Let's say, for example, that you want to denoise an image. This is an image that I acquired during one of my experiments, and it was very noisy. How can I denoise it? When I go to the gallery, there is an example showing how to denoise with several different filters. One shortcoming of the gallery at the moment is that it mostly shows pictures of cats, cars, and people; we lack examples with real datasets, but we are working on this, and if you have good open data to contribute, we might be interested. The example had explanations about all these different filters, and here you can see, on my image this time, that with just one line of code I can try a filter, tuning its parameters with keyword arguments. You can see that, starting from this noisy image, when you use a quite specific filter like this one, the total variation filter, the histogram gets really peaked. So you can get decent results with very generic filters, like the median filter here in green, but it gets much better when you try the more advanced ones, sometimes of course at the cost of longer execution times.

Something we want to improve in the future is execution speed and parallelization, because some other packages use GPUs, for example, while we use only NumPy code, once again for maintainability. In the future we want to experiment with Numba and Pythran, for example, but at the moment what we do is chunk images into blocks.

This brings me to interacting with images. Why do we want fast execution? Because sometimes when you do image processing, you don't really know in advance what the workflow, the pipeline, will be, and you need to tinker a bit with your images and try different parameters. For this you can use widgets: here, for example, I have used the ipywidgets package and its interact decorator, and if I want to choose the best Gaussian filter width, I can just move the slider and select the best parameter. You gain a lot of time from this kind of interactivity, and this you get with widgets.
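The slider takes only a couple of lines. Here is a hedged sketch of that pattern for a Jupyter notebook; noisy stands for a hypothetical noisy grayscale image loaded earlier, and the total variation filter mentioned above lives in skimage.restoration:

    # Sketch of interactive parameter tuning with ipywidgets; `noisy` is a
    # hypothetical noisy grayscale image (a NumPy array) loaded beforehand.
    import matplotlib.pyplot as plt
    from ipywidgets import interact
    from skimage import filters, restoration

    # The total-variation filter discussed above:
    denoised = restoration.denoise_tv_chambolle(noisy, weight=0.1)

    @interact(sigma=(0.1, 10.0, 0.1))   # a slider for the Gaussian width
    def show_smoothed(sigma):
        # Re-filter and re-display the image every time the slider moves.
        plt.imshow(filters.gaussian(noisy, sigma=sigma), cmap="gray")
        plt.axis("off")
        plt.show()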
But sometimes you need another kind of interaction with your images: you don't want to change a parameter that generates your image, you want to draw directly on images, for example to place markers for a segmentation, to identify an object to be removed from the background, to delineate roads on a satellite image, or to draw bounding boxes for a training set for further classification. For this we have developed the dash-canvas package. I will give you one example, which is this tool, integrated into a web application thanks to the Dash web application framework. You have the different components of the web application: here I can increase the size of the brush, I can change the color, and so on, and then I can perform a segmentation based on my annotations. The annotation tool has other features as well, like rectangles, lines, and so on and so forth.

So what is this tool? First of all, the web application framework here is called Dash. It's developed by Plotly, and the tagline of Dash is "no JavaScript": it's a web application framework in which you write only Python, so all the components I showed you before are Python code. I will give a few examples. It can be heavily customized, so that you can really tune the layout. Dash uses a Flask server to run the applications, and all the components are based on the React JavaScript framework, so there is JavaScript behind the scenes, but the principle is that you write only Python.

I have a few examples of Dash code. Here I'm using the JupyterLab extension for Dash: you can see that I write some Python code, and when I execute it, I get my reactive graph and these radio-item buttons. Each of these elements is defined in the layout here, and when I want to add some interaction between these elements, I can do so with the callback decorator of the Dash app. When I change, for example, the value inside this text box, this text paragraph is changed as well; that is defined here, in this callback. And if I go back to my app, in the Dash dev tools you can see the graph of callbacks, which is a bit more complicated because I have more elements, but it's exactly the same principle as in my little examples.
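In code, the text-box example looks roughly like this. This is a sketch in the Dash 1.x style of the time, not the exact app from the demo, and the component ids are made up:

    import dash
    import dash_core_components as dcc
    import dash_html_components as html
    from dash.dependencies import Input, Output

    app = dash.Dash(__name__)

    # The layout declares the components of the page.
    app.layout = html.Div([
        dcc.Input(id="my-input", value="hello", type="text"),
        html.P(id="my-output"),
    ])

    # The callback decorator wires them together: whenever the value of the
    # text box changes, the paragraph's content is updated.
    @app.callback(Output("my-output", "children"), [Input("my-input", "value")])
    def update_output(value):
        return "You typed: {}".format(value)

    if __name__ == "__main__":
        app.run_server(debug=True)

Running this serves the app from a local Flask server, which is what I was doing in the demo.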
So which components can you use in Dash apps? You have the normal HTML elements, provided by dash-html-components, and the reactive components, found in dash-core-components: for example, the sliders, the dropdowns, and the radio items I just showed you. You also have reactive charts, Plotly charts but not only. If I go back here, for example, I have one graph, and just clicking on it populates the hover data; I can also make a selection, which changes this other part. It's quite classical to have figures that change upon user interaction, but here it's the user interaction with a figure that modifies other components, which is a bit more tricky to do. You also have interactive data tables, and you can include specialized libraries with components for engineering or biology, for example. Basically, every time you have a React JavaScript library, you can wrap it with Dash, and this is what I did for the dash-canvas package: there was a very neat JavaScript package called react-sketch, and I just wrote a wrapper around it, adding these little buttons, and that is how it was quite easy to create the dash-canvas package.

So dash-canvas provides two things. One is the DashCanvas object, a modular tool for annotations and selections; you see here some samples of such annotations. The other is a set of functions to transform these annotations, that is, to turn them into NumPy arrays, masks, which can then be processed by scikit-image (dash-canvas depends on scikit-image). This is how I could use these annotations to segment the objects in the demo, for example. If you're interested, there is a gallery of examples on dash-canvas.plotly.host; I can show a few of them. Here is the gallery: there is one example with bounding boxes that populate a numerical table, like when you want to build a training set for machine learning, and one example in which you want to remove the background around a person, and since the result is not perfect, you can keep drawing to improve it. That's really the benefit of interactivity, and so on and so forth.

I think I'm running out of time, so I will wrap up. This was a quick introduction to dash-canvas, which is quite a new project; it started at the beginning of this year. The roadmap we have is to improve the interaction with images: for example, annotations that can be loaded from a given geometry, from a file, and not only drawn by the user; annotations triggering callbacks directly, without having to press a button. I would also be very interested in handling 3D images and time series, for example for segmenting objects in 3D, as in the medical sciences, and in adding more examples to the gallery. And since this interactive component is based on JavaScript, it could also be useful for other packages, like some libraries using widgets, so we can talk about it if you're interested. So thank you very much; feedback is very welcome on these tools, scikit-image, Dash, and dash-canvas, and please be in touch. Thank you.

Any questions? Feel free to go to the microphone, please.

Hi, thanks, it's very interesting. When you added the little interaction with the input field, showing the text when you change the field, does it actually go through the server and back to JavaScript, or does it all happen on the client?

It goes through the server, I think; let me check... yes. The computations are done on the server, not in your browser. I didn't speak about deployment: the app I was running with the segmentation of the cells was actually served by a local server on my machine. You can do that, you can also add the apps to an existing Flask application, and you can deploy using Gunicorn, for example on Heroku. And this is the only commercial part: Plotly also sells deployment solutions for Dash applications; that's the business model around Dash, which is otherwise completely open source.

Hey, thanks. Can you tell us more about medical images? What are the challenges that scikit-image faces with this type of image?

The question is about medical images. For medical images we have identified several challenges. One of them is to add more examples using real life-science datasets, because sometimes I go to conferences and a biologist will tell me, "Oh, I never thought scikit-image was for me, because I never saw a cell image in the gallery." So this is one thing we want to do. Also, a lot of 3D images are quite large datasets, acquired automatically with light-sheet microscopy, CT, and so on, and for these, improving the speed of execution through automatic parallelization is really something we want to work on. And there is also the dash-canvas part. It's not scikit-image, although there are people in common between the two teams, but I see it as something on top of scikit-image, really adding user interaction to play with images and annotate them. I also see this as something that can be useful to the life-science community, because a lot of people use ImageJ to do measurements or just manual segmentation, and you could do this with scikit-image and dash-canvas as well.

Okay, thank you. Time is over, so now a five-minute break and we start again at 35.