Good morning, everybody. My name is Hamid Tizhoosh, from Kimia Lab at the University of Waterloo. Today I would like to talk to you about KimiaNet: how to train a histopathology deep network from scratch.

One of the main goals of the computer vision and AI community, in collaboration with the pathology community, is to design and develop techniques to learn to represent tissue, which is probably the most difficult task in computational pathology in the modern sense. We know from the textbooks that we have different tissue types: epithelial, connective, nervous, muscle. But we also know that we have many different diseases: carcinomas, inflammation, infections, and so on. The challenge is the diversity and variety of all this: many of them are polymorphic, we have mitosis and all that, and the manifestations in different body parts and sites will appear differently when we look through the microscope or at a digital scan. That is what we call in computer science NP-hardness, which is a fancy word for saying it is basically impossible. We cannot represent all types of tissue in a computer, because there is just an unbelievably large number of possible combinations and varieties of tissue.

Well, that goes back to the discussion of shallow versus deep networks. An artificial neuron, shown here in the circles you see on the right, is basically a simple processing unit with some inputs and outputs. Shallow networks usually have a few layers of those neurons; each green rectangle you see is one layer. The first layer here has three, the second layer has four, and the last layer has two processing units, or artificial neurons. Deep networks have a lot of those layers and neurons, while shallow networks have only a few, generally three or four layers. Up to 2006 or 2007, we did not think it was possible to really go deep; since 2011 or 2012, we know that we can, that we can train networks consisting of many, many layers. And the solution to that fancy word from computer science, NP-hardness, the apparent impossibility of learning tissue variations, lies exactly there: the magic is in the depth of this type of network. We have to go deep to be able to learn something sophisticated, something complicated, that combinatorially, theoretically, may not even be possible.

To do that, you need a lot of data, because we do not have a model. We do not have a set of equations; there is no mathematics for this, so it is fundamentally empirical. The software we design, which we call deep networks, basically learns from a set of data, from the data of the past. The data of the past is the experience, the knowledge embodied in images, reports, notes, labels, annotations, delineations, and so on. One of the largest, probably the largest, public datasets is the TCGA dataset, available through the GDC portal. It contains more than 30,000 whole slide images, both frozen sections and FFPE specimens, and about 11,000 of them are available for training and testing: H&E sections that are diagnostic slides, of good quality, so they can be used for training purposes. That covers 25 different primary sites and 22 cancer subtypes. It is a decent dataset, in spite of all the shortcomings and problems the literature discusses about TCGA.
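To make the shallow-versus-deep picture concrete before we move on, here is a minimal sketch in PyTorch. The shallow network mirrors the toy diagram on the slide (layers of three, four, and two neurons); the deep variant, with its width and depth, is purely illustrative and has nothing to do with KimiaNet's actual configuration.

```python
import torch.nn as nn

# A shallow network: only a few layers, each layer a handful of simple
# processing units (artificial neurons) with inputs and outputs.
shallow = nn.Sequential(
    nn.Linear(3, 4), nn.ReLU(),  # first layer: 3 inputs -> 4 neurons
    nn.Linear(4, 2),             # last layer: 4 -> 2 neurons
)

# A deep network stacks many such layers; the depth is what lets it learn
# representations too varied to enumerate by hand. Width/depth are arbitrary.
def make_deep(n_inputs=3, width=64, depth=20, n_outputs=2):
    layers = [nn.Linear(n_inputs, width), nn.ReLU()]
    for _ in range(depth - 2):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, n_outputs))
    return nn.Sequential(*layers)

deep = make_deep()
```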
But back to the data: at the moment, we as a community do not have anything better, nothing better that is publicly available. There are institutions that have larger, cleaner, better datasets, but those are not public.

Then we also need other datasets that are easy and manageable, because TCGA is not easily manageable. For example, there is the colorectal cancer dataset, introduced four years ago: 5,000 images in eight different classes, such as tumor epithelium, stroma, complex stroma, background pixels, debris, adipose tissue, and so on. There are 625 small patches, or images, per class, and the images are 150 by 150 pixels, which is a trend in the literature and in public datasets: people keep the image size small to make it manageable, but the downside is that it may not be representative of the practice of histopathology. Another dataset used in our studies, the endometrium dataset, was introduced about a year ago to compare deep learning techniques against experienced pathologists: more than 3,000 patches of size 600 by 400 pixels, which is an okay size, extracted from whole slide images at 20x or 10x magnification. There are four classes of endometrial tissue: normal, hyperplasia, adenocarcinoma, and so on. That was also used for testing.

But the training, validation, and testing of the actual network, KimiaNet, was done on TCGA, because TCGA is, again, the largest, it has whole slide images, and you can play with it. On the right here, you see the range of quality: on the top we have high-quality images, and on the bottom rather low quality. That is the downside of TCGA, but I would say it is also the reality of practice: when you go into a hospital, not all images may be of high quality. So the question is, do we want to use all of them or get rid of some? After eliminating, for example, slides that had no complete pyramid in terms of magnification, we were left with more than 8,000 whole slide images, and we used roughly 7,000 whole slide images for training, 741 for validation, and 744 for testing. The cases where the patient had only one or two whole slide images we used intentionally for testing, because we wanted to use as much data as possible for training. We extracted patches of size 500 by 500 microns which, at 20x, gives images of roughly 1,000 by 1,000 pixels. That gave us roughly 1,200,000 patches for training, 121,000 patches for validation, and 116,000-something patches for testing. That is a decent size compared to the overwhelming part of the literature, which works with really tiny images.

The problem was, and is, that if you look at a publicly available dataset like TCGA, the data is unlabeled at the tissue level. When I look at a lobular carcinoma or an adenocarcinoma, I have the site and I have the primary diagnosis, but as a computer I do not know where exactly I should look. I do not know where the carcinoma is; the data is not delineated. In that sense, the data cannot be used by any type of supervised learning.
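As an aside, the patching step itself is simple to sketch. The following is a hypothetical illustration using the openslide-python library, not the group's actual code; the function name and defaults are assumptions, and a real pipeline would also discard slides with incomplete pyramids, as described above.

```python
import openslide  # pip install openslide-python

def extract_patches(wsi_path, patch_px=1000, target_mag=20.0):
    """Tile a whole slide image into non-overlapping patches at roughly 20x.

    Hypothetical helper: assumes the scan's objective power is recorded in
    the slide metadata (typically 20x or 40x for TCGA slides).
    """
    slide = openslide.OpenSlide(wsi_path)
    base_mag = float(slide.properties.get(
        openslide.PROPERTY_NAME_OBJECTIVE_POWER, target_mag))
    scale = base_mag / target_mag        # e.g. 2.0 for a 40x scan
    step = int(patch_px * scale)         # stride in level-0 pixels
    width, height = slide.dimensions
    for y in range(0, height - step + 1, step):
        for x in range(0, width - step + 1, step):
            patch = slide.read_region((x, y), 0, (step, step)).convert("RGB")
            yield patch.resize((patch_px, patch_px))  # ~1000x1000 px at 20x
```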
So what we did: we used the Yottixel search, which takes the WSI, patches it at 20x magnification, uses the staining information, the color information, to calculate color histograms, and then clusters them with unsupervised techniques, simple, good old-fashioned techniques like k-means. Then you know which patches are similar to which patches, you group them in clusters, and you find a mosaic from which to extract features. There are a lot of details to this that we do not have time to go through; the paper, "Yottixel: An Image Search Engine for Large Archives of Histopathology Whole Slide Images," is publicly available, it is open access, you can download it, and the details are in there. The main thing is that when you get the WSI, the Yottixel search engine finds a representative number of patches and assembles the mosaic on the right, which is a selection of patches at high magnification that represents the whole slide image, because we cannot process the entire whole slide image. One could ask, why not take everything? Because we are thinking about the pathology office, where there may be just a workstation, no GPU power, and so on. We have to make it feasible and practical, so we have to make it smart, such that it is not necessary to process the entire WSI.

Then we had to extend Yottixel, because that alone was not good enough to use the unlabeled, non-delineated TCGA dataset. So we went back to cellularity, which is a popular thing to measure for different purposes. If you look at these three cases where we segment the cells, it does not need to be exact here; we are not really looking at any morphological features or doing exact counting. We just need an idea of how cellular the patch I am looking at is, and in the examples we see 13 percent, 41 percent for the glioblastoma (GBM) in the middle, 61 percent, and so on. So we get a quantification. Again, this quantification is not for the expert; it does not need to be super accurate. It is just for the computer to figure out whether it is dealing with a hypercellular or hypocellular region. That is a very important piece of information to have. Now we can extend Yottixel and say: okay, now I want a cellular mosaic, a selected representation of the whole slide image, and I get that through technologies like Yottixel. That mosaic is what we take as the representation of the specimen, and we further compress it, reduce the dimensionality, by looking at the cellularity. Of course, this will have ramifications down the road. Does it mean you are specialized on cellularity? How much do I capture when I look only at cellularity? I may be missing some conditions and some abnormalities; we can talk about that. But the question is: with this assumption, can I train a network that understands tissue?

So we had to change the structure of the Yottixel search engine. Again, we were indexing at magnification 20x, and we applied unsupervised clustering at 5x magnification to look at just the staining information, to group the tissue based on staining alone. The patches were 1,000 by 1,000 pixels, and the algorithm assumed that there are nine tissue types in each image. Those tissue types may or may not have pathological meaning; we do not name them, and they are not meant for the pathologist. Many clusters may be redundant; if you look at them, you may want to merge some of them. It does not matter: as long as we do not miss something, having redundant information is not the end of the world.
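A rough sketch of this cellularity-guided mosaic idea, in Python. The joint color histogram, the dark-pixel cellularity proxy, and the per-cluster selection are deliberate simplifications, not what Yottixel and its extension actually implement, and all thresholds here are illustrative; only the nine-cluster assumption comes from the talk.

```python
import numpy as np
from sklearn.cluster import KMeans

def colour_histogram(patch, bins=8):
    """Joint RGB histogram of an HxWx3 uint8 patch: a cheap staining descriptor."""
    hist, _ = np.histogramdd(patch.reshape(-1, 3), bins=(bins,) * 3,
                             range=((0, 256),) * 3)
    return (hist / hist.sum()).ravel()

def cellularity(patch, threshold=120):
    """Very rough cellularity proxy: fraction of dark (nucleus-like) pixels.
    It does not need to be accurate; it only has to tell the computer
    'hypercellular' from 'hypocellular'."""
    gray = patch.mean(axis=2)
    return float((gray < threshold).mean())

def cellular_mosaic(patches, n_clusters=9, keep=0.20):
    """Cluster patches by staining alone (nine assumed tissue groups, as in
    the talk), then keep the most cellular fifth of each cluster."""
    feats = np.array([colour_histogram(p) for p in patches])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    mosaic = []
    for c in range(n_clusters):
        members = [p for p, l in zip(patches, labels) if l == c]
        members.sort(key=cellularity, reverse=True)
        mosaic.extend(members[:max(1, int(keep * len(members)))])
    return mosaic
```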
Then we grab 15 percent of the tissue. That is a major thing to do: we grab 15 percent of the entire specimen as a sample, and again, it is a major assumption that with 15 percent I am grabbing whatever is needed to understand the tissue. Then, when I calculate the cellularity and sort the patches, I look at the top 20 percent hypercellular tissue samples to be selected from among that 15 percent of the specimen. These are all numbers one can play with, but as we will show, they may not directly and immensely impact performance. So you go ahead and patch the whole slide image, you apply an unsupervised technique like k-means clustering, you group the patches together, you categorize them, and you get the mosaic. Then you look at the cellularity and reduce the size of that mosaic: you keep just the ones in the top 20 percent of cellularity, which may be very different for different WSIs. And then you push that into a network to get features, and that is the final representation. You have deep features, which are just numbers that an artificial neural network generates for the patches you have selected as the representative of the whole slide image.

So why DenseNet? We used DenseNet, which is an established topology in the literature. It has been quite reliable for representing histopathology images and is highly popular; many people use the DenseNet topology, and we have some experience with it. You start with what you understand, what you know, with how it behaves. And compared to the top ten networks in the literature trained on the so-called ImageNet, which is a non-medical set of natural images, DenseNet is very compact. For example, EfficientNet has 66 million parameters and FixResNeXt has more than 800 million parameters; DenseNet has barely seven million, which makes it a very small network. Again, having in mind that we want to deploy solutions in the office of the pathologist, we want to keep things small, and not just from the efficiency perspective: also from the perspective of being able to generalize.

So we trained and validated the solution. It took, of course, several weeks, almost two months to be accurate, to run all the experiments to train and validate. There was a huge gap between training and validation accuracy, which theoretically is a sign of overfitting, but the tests showed that this is not the case. That is one of the challenges we have at the moment, because we do not have really large datasets of histopathology images. We show the different settings in which we trained the DenseNet topology at different levels, fully or just part of it, and the results: the best was, of course, when it was trained from scratch, when everything was touched and everything was relearned; then we had the best result.

Then we applied some search to say: okay, if I have a query image and I look at its primary diagnosis, can you show me the top five or top six cases? Can you show that the search can be successful if I am using that representation? And another example, for ovarian: again, I look at the top six cases, and I see that the majority of them have the correct primary diagnosis, the same as the query WSI.
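Such a top-k search over deep features boils down to nearest-neighbor retrieval plus a tally of the retrieved diagnoses. A minimal sketch, where the per-WSI feature vectors and labels are placeholders for whatever the mosaic-based representation produces:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def top_k_search(query_feature, archive_features, archive_labels, k=6):
    """Retrieve the k nearest cases by feature distance and tally their labels.

    Illustrative only: `archive_features` would hold one pooled feature
    vector per WSI and `archive_labels` its primary diagnosis.
    """
    index = NearestNeighbors(n_neighbors=k).fit(archive_features)
    _, idx = index.kneighbors(query_feature.reshape(1, -1))
    retrieved = [archive_labels[i] for i in idx[0]]
    values, counts = np.unique(retrieved, return_counts=True)
    return retrieved, values[np.argmax(counts)]  # top-k labels, majority vote
```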
Then we went and did horizontal search: I assume I do not know the primary site of the WSI, and I search for it. The question is, in how many cases can you find the right primary site? This has no practical implication, but it is a sanity check for the network: if the network has learned the tissue, it should be able to distinguish brain from pulmonary, prostate from liver. That may not be easily possible at high magnification, because if you get really close, you have a small field of view, but it is a sanity check for us. And you see that DenseNet is generally very low compared to KimiaNet; KimiaNet I through IV are different levels of training and adjustment. The difference is, on average, a 44 percent increase in accuracy when we use KimiaNet, which is no wonder, of course; the literature tells us that if you fine-tune, if you retrain, you get better and better.

Then we did vertical search, where we are after the primary diagnosis: I know it is brain, but tell me, is it low-grade glioma or is it glioblastoma, and so on. And there we again saw that KimiaNet was much better than DenseNet. Again, no wonder, but when you do it, you see it, and you have the empirical evidence. Interesting were cases like the melanocytic and pulmonary cases here, where previously we had zero because we had a very small number of test cases, four and five respectively. When we use KimiaNet, even for those few, we get decent results. That is the power of generalization: when you learn to represent tissue, you are capable of looking at something and recognizing what it is, even if you do not have enough samples.

When we visualize the DenseNet features, we see a mishmash of the primary diagnoses; we just put them out there to see how accurately you can separate them. And when we go from DenseNet to KimiaNet, suddenly things get really well separated and you can distinguish them. That is one of the ways to visually verify that you have been able to generalize, that the training has had some impact.

We also tested on the endometrium dataset, which is just patches, no WSIs: we took the representation and used a classifier like an SVM to classify it, and we compared against the literature. We also compared to a fine-tuned VGG that had been trained with more than 800,000 human-annotated histopathology patches of 74 different lesional and non-lesional tissue types, published about a year ago; so not just general networks, but also another network trained with histopathology images and also publicly available. And KimiaNet, no matter the configuration, was again much better than DenseNet and any other solution. We also looked at the colorectal cancer dataset, and we saw there, for example, that the benchmark is 97 percent, while the KimiaNet representation combined with a classifier could get to 96.8 percent, which again shows that if you customize, you can get better and better; but the claim of KimiaNet is to be a good representation for histopathology images, not a classifier.

So why KimiaNet as a representation solution? We wanted to exploit diverse, multi-organ public repositories like TCGA, and we wanted to work at high magnification, 20x or, if possible, higher. And we extracted large patches, in contrast to the majority of the literature, which uses really tiny images.
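That representation-versus-classifier framing is exactly how the patch experiments above are set up: freeze the network, extract features, and let a simple classifier do the rest. A minimal sketch, with DenseNet-121's ImageNet weights standing in for the fine-tuned KimiaNet weights, and a cross-validated linear SVM playing the role of the benchmark classifiers:

```python
import torch
import torch.nn.functional as F
from torchvision import models
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# DenseNet-121 as a frozen feature extractor. The ImageNet weights here are
# a stand-in; the talk says the fine-tuned KimiaNet weights will be released.
backbone = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
backbone.eval()

@torch.no_grad()
def deep_features(batch):
    """(N, 3, H, W) normalized patches -> (N, 1024) deep-feature vectors."""
    fmap = F.relu(backbone.features(batch))
    return F.adaptive_avg_pool2d(fmap, 1).flatten(1)

def score_representation(patches, labels):
    """Cross-validate a linear SVM on frozen features, as in the patch benchmarks."""
    X = deep_features(patches).numpy()
    return cross_val_score(SVC(kernel="linear"), X, labels, cv=5).mean()
```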
So we work at 1,000 by 1,000 pixels; we trained and tested with 1,000 by 1,000 pixel patches at 20x. And we trained a densely connected topology with weak labels, because we did not really know for which part of the image the pathologist had written the primary diagnosis. KimiaNet is supposed to be a feature extractor; it is not supposed to be a classifier. The high-cellularity mosaic that we created crucially facilitated the training of such a network, but future work will have to show what types of representation the KimiaNet features are suitable for. A paper describing the details of KimiaNet will come out soon, and the KimiaNet network will be publicly available on the Kimia Lab website for research and educational purposes. Thank you very much. I appreciate your attention.