Hola, hola, everyone. Nice to see you here so early in the morning, so let's start the party. Welcome, people in the room and everyone watching us on the internet; I hope next time I will see your smiling faces, that would be nice. As was already said, I'm Wojciech Małota-Wójcik. I work at Reach, where I'm a proud member of a great development team working on distributed cloud services. After hours I work on my PhD project, analyzing cancer images and trying to find patterns in them that would lead us to inventing better treatment methods. Today, during my presentation, I will try to connect those two disciplines and take you on a fascinating journey inside the human body to meet the greatest enemy of our time: cancer.

If you are old enough, you may remember the animated series Once Upon a Time... Life. When I was a child I watched all the episodes all the time, and you see what I do today, so be careful what your children watch on TV. And this is exactly where we are going today: to meet the world of ourselves. So please open your minds, fasten your seatbelts, and let's become 800 times smaller to visit the fascinating world of our body.

This slide was prepared by taking a slice of a resected prostate, and you can see all those darker violet regions, which are the fragments attacked by cancer. This is a very advanced stage; this patient is probably no longer alive. But if you are now scared that you are sick, don't worry, the feeling will go away by tomorrow. And this is what we are going to analyze with AI models.

So let's go further. Now we are 400 times smaller. Here on the left you see a fragment of cancer which is developing: it started from a single cell reproducing like crazy. On the right you see a much more developed fragment of the image where everything went wild; you see that cancer is everywhere there. Before going further with analyzing those images, let's talk a bit about our goals.
Usually when we work with cancer images trying to find treatment methods, we have two groups of patients treated the same way who present different outcomes: some of them are healthy after the therapy, some are not. We would like to predict this treatment outcome by looking at the slides prepared by the histopathologists in advance, before the treatment is applied. This is our goal, and it is very difficult.

So yeah, the histopathologist is responsible for choosing the right therapy for the patient. The therapy takes time, is harmful to our bodies, and costs money, so it is essential to apply the right treatment as soon as possible. If it fails, the patient will probably die, because there will be no more time, or there may not even be another therapy to apply at that stage. So histopathologists are like Neo trying to pick the right pill for the patient, and we researchers are like Morpheus trying to get some insight so a better decision may be taken. Which pill do you choose, red or blue? If blue, you may go home; if red, we continue. So, anyone? Nobody. Okay, so red. Let's follow the white rabbit then.

Now we have landed and we are 800 times smaller. This is what the histopathologist sees in the microscope. This is not how our tissue actually looks in reality; this image is the result of applying two dyes, hematoxylin and eosin, to the tissue. Hematoxylin, visible as the darker regions here, attaches to the phosphate groups building the DNA inside cell nuclei. Eosin, visible here as the pinkish regions between those nuclei, attaches to other objects in the cytoplasm of the cell. So by applying this staining protocol we are able to visualize the tissue.

Let's start applying some mathematical magic. Don't worry, no calculus involved, only linear algebra. In the center is the original image, and on the left and on the right you see transformations of this image.
Each pixel, represented by RGB values in the center, may be transformed by solving some equations into the other two images: one representing the amount of eosin attached to the tissue at a particular point, and the second one, on the right, representing the hematoxylin. As I said, hematoxylin attaches to the DNA inside our nuclei, so you clearly see the nuclei here, but not here; these are just dark points because there is no hematoxylin. In the project I want to present today we will focus on nuclei only, so we may ignore the eosin image and work with the hematoxylin one alone.

There is another piece of information we need before we may process those images using AI, and actually this already is AI: we need to find the exact places where the nuclei exist in the image. This is done using an open-source neural network called StarDist, which was trained to find nuclei in tissue images. You can see that it is very accurate; it is a nice open-source neural network, and anyone may use it.

Having those two pieces of information, we may transform the hematoxylin image and simply ignore the pixels outside the nuclei, because we will analyze the texture inside the nuclei and everything else may be ignored. This image still must be pre-processed before we may apply an AI model on top of it. You see that some nuclei are brighter and some are darker. If we proceed with this image as-is, our model may be trained to recognize whether a nucleus is darker or brighter; it may focus on this feature, and this is not what we want. So before feeding it into the neural network, we equalize the contrast of each nucleus individually. You see that they are now much more similar to each other.

Okay, so let's start building our AI model using this image. The approach I'm proposing here is to classify each pixel inside the nuclei by taking the 11 by 11 neighborhood around it. On the right you see some examples.
You see that we selected, for example, this pixel, and we have this field around it; based on it we will classify the pixel in the center. And we will do it pixel by pixel, everywhere. We will do it using an autoencoder. This is a very simple but still very powerful neural network architecture, and it is an unsupervised method, which is what we want here: we want to find hidden knowledge, not available to the histopathologists at the moment, so we look for unknown patterns.

We will train the autoencoder. It is a very simple structure: you feed an image to its input and you train it to produce exactly the same output. The magic of this model comes from introducing the bottleneck here. You see that the number of neurons here is much smaller than the number of inputs and outputs. By doing this we force the neural network to generalize the information and find patterns which might be useful. In our case, as I said, we feed in 11 by 11 images, which gives 121 unsigned 16-bit integers on the input, and here we encode it to ten 64-bit float numbers.

Let's focus on one more thing. This neural network may be divided into two parts: one is the encoder, the second is the decoder. Let's start by discussing the encoder in more detail. As I said, we feed a 121-pixel image to the input and we get ten 64-bit float numbers, which gives us in total 242 bytes on the input and 80 bytes on the output. You immediately see that the information provided by the image must be compressed in some way, because the neural network must return 67% less information, counting the bytes. These are the advantages of the autoencoder neural network: it compresses the data, it generalizes the knowledge, and it reduces the noise. Of course, it also introduces error. In our case, as I measured after training, the error is 4.7% per pixel on average, which may be understood as 12 gray levels in 8-bit space. Let's go next.
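The byte counting can be verified directly; this small sketch only restates the numbers from the talk (121 pixels of 16 bits in, ten 64-bit floats out, 4.7% average reconstruction error):

```python
# Input: an 11x11 patch of unsigned 16-bit pixels.
patch_pixels = 11 * 11           # 121 values
input_bytes = patch_pixels * 2   # 16 bits = 2 bytes each -> 242 bytes

# Bottleneck: ten latent values stored as 64-bit floats.
latent_bytes = 10 * 8            # 80 bytes

reduction = 1 - latent_bytes / input_bytes
print(input_bytes, latent_bytes, round(reduction * 100))  # 242 80 67

# The measured reconstruction error of 4.7% per pixel corresponds to
# roughly 12 gray levels on an 8-bit (0-255) scale.
print(round(0.047 * 255))  # 12
```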
Here are some examples of the images. In each pair, on the left you see the original image and on the right an image processed by the full autoencoder, encoder and decoder. You see that some details are lost and the images are smoother. We will see how this behaves during the classification of the images.

And now let's focus a bit on the decoder. The decoder is not the most useful part of the autoencoder architecture: it is used only during training, and later we simply remove it. But it provides one particularly interesting feature. Normally the decoder receives its input directly from the encoder and returns the original image, but we may feed random numbers to its inputs, and then in response we get a synthetic image taken out of thin air. From time to time, in some projects, this might be useful; here, not really. But I prepared two videos for you presenting the outputs returned by the decoder when I applied constant values on the input and progressively changed a single one of them from 0 to 1. This is how the image changes. It may be a sign that the neural network found some patterns. But you also see that the images we get here are very different from the ones I presented earlier. So who knows? This is a neural network, right? We don't know how it thinks.

After taking all the pixels from all the nuclei in the image and passing them through the encoder, I get points represented by ten float numbers. What you see here is the result of the UMAP algorithm. Its goal is to transform high-dimensional data into a smaller space, 2D in this case. The important feature of this transformation is that it preserves clusters, so similar points stay close to each other. You see that we have some groups here, here, here, and here, right? It means that some groupings were found by the neural network, which is quite interesting. Now, let's classify it. I took the simplest approach.
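The latent sweep behind those videos can be illustrated with a toy stand-in. The decoder weights below are random and purely hypothetical, not the trained network from the talk; the sketch only shows the mechanics of fixing the latent vector and sweeping a single value from 0 to 1, one synthetic frame per step:

```python
import math
import random

random.seed(0)
LATENT, OUT = 10, 121  # ten latent values -> an 11x11 output patch

# Hypothetical decoder: a single random linear layer with a sigmoid,
# standing in for the trained decoder (these are NOT real weights).
weights = [[random.uniform(-1, 1) for _ in range(LATENT)] for _ in range(OUT)]

def decode(z):
    """Map a 10-value latent vector to 121 pixel intensities in (0, 1)."""
    return [1 / (1 + math.exp(-sum(w * x for w, x in zip(row, z))))
            for row in weights]

# Fix all latent values at 0.5 and sweep only the first one from 0 to 1.
frames = []
for step in range(5):
    z = [0.5] * LATENT
    z[0] = step / 4
    frames.append(decode(z))

print(len(frames), len(frames[0]))  # 5 121
```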
I just applied a simple K-Means algorithm to divide those points into eight classes, and this is the result. You basically see that K-Means eventually found the clusters presented by UMAP. Of course, there are some differences, but this is the real world; this is how it works.

And now we apply this classification back to the earlier image. You see that each pixel of the nuclei is now colored with its corresponding class color. You might be a bit disappointed, especially if you are into AI, because you can see that all those classes, the blue, the orange, the reds, and of course the red again, are basically all the same, right? So in fact our autoencoder learned how to recognize the borders of the nuclei, which means more work must be done in this procedure. But we are not completely lost, because we have two very interesting colors here, pink and green, which look like they classify the interiors of the nuclei.

Eventually we could apply these two groups to patients with different therapy outcomes and compare the histograms of the pink and green regions using some statistical tests, but I won't show that to you here. And in the end we may find, if we are very lucky, that these categories represent some differences between the two outcomes, and maybe, maybe, we would discover something here. I haven't tried it yet. And you can see that more work must be done here to eliminate the problems with the border regions.

So I presented some techniques and ideas behind analyzing cancer images. This is a very simple approach; much more complex things are going on, but of course we have no time to describe them here. I hope I inspired you, because we need people who like doing such things. And now let's connect this approach to Kubernetes, because this is why we are here today.
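The K-Means step itself is simple enough to sketch from scratch. This is a minimal illustration on toy 2D data standing in for the ten-dimensional encoder outputs, not the actual clustering code from the project:

```python
import random

random.seed(42)

def kmeans(points, k, iters=20):
    """Minimal K-Means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            groups[nearest].append(p)
        centroids = [
            tuple(sum(vals) / len(g) for vals in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids, groups

# Two well-separated toy clusters; the talk uses eight classes,
# but two are enough to show the mechanics.
cluster_a = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)]
cluster_b = [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(50)]
centroids, groups = kmeans(cluster_a + cluster_b, k=2)
print(len(centroids), sum(len(g) for g in groups))  # 2 100
```

In practice you would use a library implementation, but the two alternating steps, assignment and centroid update, are all there is to the algorithm.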
So, this is how a hospital archive usually looks: rows of drawers somewhere in the basement. Nobody wants to go there; if you want to find something there, it's better to ask a prosecutor, because basically that is the most powerful person to get data out of there when something goes wrong in a hospital. But this is what we researchers must deal with. If we need data... sorry, the cancer samples are stored there in an analog way; they are just fragments of tissue on pieces of glass. So if we need those data for training our models, we must go to the histopathologist and ask for the samples. The histopathologist must go to the archivist, find those samples, and scan them, and only then may we use the data for training our models.

To make progress, the situation must change, because histopathologists are extremely busy people; it takes months until we are able to get our data for research, and it is a never-ending story. It is also important to note that in a hospital the R&D department is not the first player, because hospitals are focused on treating patients, so we just manage the situation somehow. Data must be accessible at any time, by anyone, without bothering other people. That means we must digitalize them. Some medical images, like X-ray or magnetic resonance, are already stored in digital form, because this is how modern hardware delivers them nowadays. As I said, the histopathological slides are still stored in analog form, because our organs still exist in real life and not in a metaverse. So before we may analyze them using computers, they must be scanned and digitalized. Some hospitals have already started investing in storing those slides and scanning them on a massive scale, and sooner or later all the hospitals will follow this path.
So, the current situation... sorry. Most of those hospitals will migrate to the cloud, and this is the big opportunity for cloud providers, for IT specialists, for DevOps, whoever. Both doctors and researchers will be able to access the data at any time, but it will require petabytes of storage and low-latency, fast networking between hospitals and data centers, because we need something better than providers offering data centers scattered across the country, far away from the hospital. The world is evolving, and providers offering geographically distributed cloud services are entering the market. It means that data centers are now being located much closer to the hospitals, offering shorter access times to the end user.

Health-related data is the most restricted type of information under GDPR, so the legal stuff must also be taken into account. Cooperating with a cloud service provider operating in the same country makes it easier to comply with applicable law. Some hospitals taking R&D seriously will go even further and build their own private, in-house data centers, so that the R&D department, storage, and computing units are located close to each other, offering faster access times, faster model building, and faster development.

This is how the hospital of the future, together with its R&D department, looks. There is dedicated infrastructure for storage and computing, so both histopathologists and software developers may access the data at any time. This is a perfect use case for Kubernetes, because this is how clusters may be created dynamically, adding all the required resources on demand. If you need a reliable partner delivering geographically distributed cloud services with a list of proven, successful deployments, please visit reach.co, contact us and ask; we will help. Here are some reference links to the open-source libraries, open-source models, and even cancer images shared by the government of the USA.
You may experiment with it, you may play with it, whatever you like. Thank you very much, and if you have any questions, please ask. Anyone? If there are no questions... or are there any? Anything in the Slack? We had somebody asking, but not actually asking questions.

Thank you very much for attending my presentation; I hope it was interesting. This is actually the future, because there will be no more histopathologists. Training a histopathologist takes something like 20 years, and we need more and more smart people, and they won't be delivered. So actually the best investment we can make at the moment is to use their knowledge and experience to train computers to do their job, because it will be cheaper, more effective, and available to more people. So thank you very much, enjoy KubeCon today. Thank you.