Yeah, thanks a lot for the kind words. So, as already mentioned, I'm working in medical computer vision, but I won't talk about medicine at all tonight. Instead I'll talk about the underlying concept I'm using in my research, which is unsupervised machine learning, one very important and fascinating sub-discipline of AI. When you recall how children actually learn, you might say that most of what they know about the world comes from their parents. But when you think about it a bit more, most of the knowledge children gain about the world is actually acquired entirely by observation and by playing around with their environment. Many things children learn during childhood work this way, such as the fact that objects are persistent: when I occlude an object, children before a certain age don't really know that the object is still there, but after a certain age they have this knowledge of object persistence. The same is true for intuitive physics. Nobody teaches children physics at age three or four, but just by observing the world, even little children can predict how objects fly, without any knowledge of the mathematics behind it. What is quite crucial about these observations is that when we as humans want to understand intelligence and want to build intelligent systems, methods that learn just from observation are really, really important. And when you think about the environments in which machine learning algorithms actually operate, which are ordinary environments for robots, or the internet for other sorts of algorithms, most of the data we find there is not labeled in any way. It's just there, and the machine learning agent that tries to make sense of the data has to do so completely on its own.
So it doesn't really have a teacher, and this is what unsupervised learning at its very core is about. When we look at the broad topic of artificial intelligence, we have one subtopic, which is machine learning: how do we enable computer programs to actually learn from data? There we have three sub-disciplines, which are supervised, unsupervised, and reinforcement learning, and I will mainly talk about unsupervised learning today. In supervised learning we're concerned with tasks such as object classification or the detection of objects in images. Supervised learning is really prominent these days after the successes of recent years and is widely used, for example in the development of self-driving cars, and it currently works really well. In reinforcement learning we also see amazing progress; reinforcement learning is basically about teaching robots or agents to act properly based only on a reward signal, that is, whether we give them a thumbs up for their actions or not. In unsupervised learning it's a bit different. We don't actually teach the algorithms; we just provide them with data, and the task is to find meaningful representations of this data. So how do we do this? In supervised learning we have some sort of teaching signal: we have a dataset with labeled data, for example images of dogs and muffins, and we know exactly which of the images are dogs and which are muffins, and with this teaching signal we can train the algorithm.
In unsupervised learning we only provide the images without any further information, so the algorithm has to study the data by itself and find meaningful representations within the dataset. Okay, so it boils down to these two different kinds of approaches. In supervised learning we try to infer something from the data by using the teaching signal that is provided to us, and by trial and error we tune the internal parameters of our model. In unsupervised learning we're just given the data and we try to infer the underlying representation, such as counting the black dots in the images, which is already quite a good proxy for actually separating the classes without ever knowing that these two classes exist. As I said before, supervised learning is really successful these days, and in tasks such as object classification on certain datasets it is even beyond human level. In supervised learning we are given an input image and a certain set of possible output labels, and we provide a teaching signal to the algorithm so that it can tune its internal representations and parameters to arrive at the right prediction in the next round. In unsupervised learning, and these are the notions I will keep using throughout my talk, we quite often have some sort of encoder system and a decoder system, or generator. The role of the encoder is to find a meaningful representation of the data, for example by compressing an image into some other sort of representation, some collection of numbers which is more meaningful.
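As an aside, the dot-counting idea from a moment ago can be made concrete: treat the dot count of each image as a one-dimensional feature and cluster it, and the two hidden classes separate without any labels. This is a toy sketch with invented counts, using plain k-means as one of many possible clustering methods:

```python
import random

def kmeans_1d(values, k=2, iters=20, seed=0):
    """Plain k-means on a single feature, e.g. the number of
    black dots counted in each unlabeled image."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        # assignment step: each value joins its nearest center
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# invented dot counts from two classes the algorithm never sees as labels
counts = [1, 2, 2, 3, 2, 1, 9, 8, 10, 9, 8, 9]
print(kmeans_1d(counts))  # two centers, one per hidden class
```

The algorithm recovers one center per class even though it was never told that two classes exist.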
So instead of storing one megapixel per image, we only store, say, 100 components and try to preserve the most important information in the image. And then we have decoders and generators, which, by looking at the feature representation, or by being given a point in the feature space, try to reconstruct or generate images, videos, or other types of data. This is really useful in many applications, for example in image processing. You can see here a noisy version of an image, which is quite a common situation: when you acquire images in low light, you might have observed many small random pixels in your camera images. When you train an unsupervised learning algorithm to denoise images, it works exactly with the two building blocks I introduced before: the image is compressed into a feature representation which is so small that the noise just doesn't fit in, but the really crucial part, the content of the image, is preserved and can be reconstructed by the decoder network. So in image processing this is really nice, but what we really care about, and what I mentioned in the beginning, is the prediction of future events. When I throw a ball, predicting where the ball is likely to be in the next second. Or, in this example, when you type into Google, or use the keyboard on your phone, what is the next word that is likely to appear? This is also an unsupervised learning task, because we have a lot of unlabeled data, such as Wikipedia, on which to train models for such tasks. So how does this work?
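The denoising idea, compressing into a representation too small for the noise to fit in, can be sketched with the simplest possible encoder/decoder: a linear projection onto a few principal components standing in for the neural networks of the talk. All data here is synthetic and invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic "images": 200 samples of 50 pixels whose content really
# lives on a 2-dimensional subspace
basis = rng.normal(size=(2, 50))
codes = rng.normal(size=(200, 2))
clean = codes @ basis
noisy = clean + 0.1 * rng.normal(size=clean.shape)  # low-light noise

# linear "encoder/decoder": project onto the top-2 principal
# components of the noisy data and reconstruct from that tiny code
centered = noisy - noisy.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
encode = vt[:2]                                # 50 pixels -> 2 numbers
decoded = centered @ encode.T @ encode + noisy.mean(axis=0)

err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((decoded - clean) ** 2)
print(err_noisy, err_denoised)  # the 2-number code squeezes out most noise
```

Because the code has only two dimensions, almost all of the 50-dimensional noise is discarded while the low-dimensional content survives, which is exactly the mechanism described above.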
Again, we construct an algorithm that contains encoders and decoders, and we have an internal feature representation. In this example we provide the algorithm with three context words, plus the next word that appears after these words in the sequence, and the task of the algorithm is to find a meaningful feature space that enables it to actually predict the next word. We train this model on many sequences, or even the whole of Wikipedia, and afterwards it can really dream up random sentences. You can provide the model with the beginning of a sentence and let it predict the next word. You can let these models write plays in the style of Shakespeare, or even feed them math scripts from university and let them dream up random proofs, which are probably not true, but anyway, it's possible. What is really exciting about these models, which are also called word embeddings, is that after training on this toy task, we can use the encoder to transform the representations of words into a new feature space where we can do arithmetic with words. In this feature space, depicted here as three dimensions, the relationship between a king and a queen is, for example, the same as between a man and a woman. And what we can do, for instance, is compute man minus king plus queen, and we arrive at the feature representation of woman. So in order to comprehend this vast amount of text in Wikipedia, the algorithm, without any provided knowledge, somehow comes up with a nice explanation of how the relationships between different words actually work. Ideally, we want to have the same thing with images, because we have these sorts of predictive systems in biological brains too. Look at this orangutan who is shown a magic trick.
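The word arithmetic can be illustrated with hand-made toy vectors; real embeddings are learned from a large corpus such as Wikipedia, and these numbers are invented purely for illustration:

```python
import numpy as np

# toy embeddings; dimensions loosely mean (royalty, gender, concreteness)
vecs = {
    "king":   np.array([0.9,  0.8, 0.1]),
    "queen":  np.array([0.9, -0.8, 0.1]),
    "man":    np.array([0.1,  0.8, 0.3]),
    "woman":  np.array([0.1, -0.8, 0.3]),
    "muffin": np.array([-0.5, 0.0, 0.9]),
}

def nearest(v, exclude=()):
    """Word whose embedding has the highest cosine similarity to v."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vecs if w not in exclude),
               key=lambda w: cos(vecs[w], v))

# man - king + queen lands at the representation of "woman"
target = vecs["man"] - vecs["king"] + vecs["queen"]
print(nearest(target, exclude={"man", "king", "queen"}))  # woman
```

The arithmetic works because the "royalty" and "gender" offsets are consistent across word pairs, which is exactly the structure the talk describes the embedding discovering on its own.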
What you observe here really nicely is that the orangutan has something like an internal model that predicts the next state of the world around it, and with this magic trick the internal model is just broken: reality doesn't align with what it expects, which is why it rolls over the floor laughing. Alright, so we also want to have something like this in artificial systems: building predictive models beyond predicting the next word in a sequence, so really doing this on images and other sorts of data. The problem we have with images is that when we have these two examples of a pen and we imagine what might happen in the next couple of seconds, there are multiple possible outcomes, and it's really hard to let a model like the one I showed before predict something like this. As you can imagine, your dataset probably contains just these cases, and not all the possible ways this pen can fall down. When you try to train the model on this and you use these videos as the training signal, and the model predicts a plausible outcome that isn't in the data, it will be classified as completely wrong, and this makes it really difficult to train a model on raw data frames. A really cool framework to solve this problem was developed in 2014, which is generative adversarial networks. The idea here is to extend the usual notions we have in unsupervised learning: on top of decoders and encoders, we also have something we call a discriminator. And this is how it works. We have our true dataset, and we have a generator network, for example a painter who wants to fake the images in our dataset, and we have an art critic who closely
examines the true data images and the images from the generator. He doesn't know where the images actually come from; he judges the images and tries to predict whether they are real or fake. Initially he will be quite good at this, because he knows the true dataset, but eventually, because he gives feedback to the generator, the generator will become better and better at faking the images in the dataset, and in the end will actually convince the discriminator that the images he generates are real. So this is how it looks as a machine learning algorithm. We have our dataset here, we have our generator network, which is really nothing else than a decoder as I showed before, and we have this discriminator network, which is actually a supervised learning algorithm. We train it with the clever trick of introducing a fake dataset and a true dataset, and this is how we get the labels for this problem. Okay, and this is how it works. Initially everything is random, and the generator just tries to dream up a video, which at the beginning is just gray frames. The discriminator is now trained on discriminating these kinds of samples from these kinds of samples, and this is really easy, right?
He just has to look at the color distributions and say: okay, these gray images are fake, and this is the real video. But this training signal, this discrimination boundary, is passed on to the generator, and in the next round it maybe predicts something like this, a pen that just doesn't fall down. That's already quite a nice image, but still very easy to discriminate, because it's not moving at all. Eventually, at some point, the generator comes up with something like this: a video that is not actually present in the dataset but is somehow plausible, and because the discriminator has actually learned the range of plausible outcomes, he will have a really hard time discriminating what is real and what is fake in these two instances. This is basically the point where the network has converged: the generator generates samples that might as well have come from the true dataset, and the discriminator is completely unsure which samples are real and which are fake. So these were two small glimpses of what was and is common in unsupervised learning research, and I personally think that over the next years unsupervised learning will become even more important, for the reasons I stated before: most of the data we have in the real world is unlabeled, and we need algorithms that can teach themselves and afterwards use just a little training signal from a human, such as we get in school, to make this last step and make sense of the data. With these algorithms we can do pretty amazing things nowadays. We can generate images; these images are actually not real.
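The label trick behind the adversarial setup can be sketched numerically with the smallest possible "networks": a one-parameter generator that shifts noise, and a logistic-regression discriminator. This is a toy one-dimensional illustration with made-up constants, not the video models from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# unlabeled "real" data and samples from an untrained generator
real = 4.0 + rng.normal(size=500)   # true dataset, mean 4
fake = rng.normal(size=500)         # generator output G(z) = z + b, with b = 0

# the trick: labels come for free, purely by construction
x = np.concatenate([real, fake])
y = np.concatenate([np.ones(500), np.zeros(500)])  # real -> 1, fake -> 0

# train a logistic-regression "discriminator" D(x) = sigmoid(w*x + c)
w, c = 0.0, 0.0
for _ in range(500):
    d = sigmoid(w * x + c)
    w -= 0.1 * np.mean((d - y) * x)   # cross-entropy gradient w.r.t. w
    c -= 0.1 * np.mean(d - y)         # cross-entropy gradient w.r.t. c

acc = np.mean((sigmoid(w * x + c) > 0.5) == y)
print("discriminator accuracy:", acc)

# generator step: gradient of -log D(fake) w.r.t. the shift b
d_fake = sigmoid(w * fake + c)
grad_b = np.mean((d_fake - 1) * w)
print("generator gradient:", grad_b)  # negative: b should grow toward 4
```

The negative gradient on the generator's shift says exactly what the talk describes: the discriminator's decision boundary tells the generator in which direction to move its samples to look more like the real data.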
They are all generated by these generative adversarial networks, and currently there is much work on generating really high-resolution images, which are not quite perfect yet, as you can see, but given the resolution it's already pretty amazing what these networks can do. Back to the example I showed you with words before: doing arithmetic is now also possible in image space. Here you have an example of taking the feature representations and basically transferring the attribute of a smile from a woman to a man, and this really works with the feature representations that generative adversarial networks produce during training. Another example you all have on your phones, at least if you've been using Android phones since last month. This one does not use adversarial training; it is Google DeepMind's WaveNet, a model where some label information is combined with the ability to generate speech, and it powers the text-to-speech interface on your phones. This is also mostly learned in a completely unsupervised way. Alright, so I come to a conclusion now. What is really important, and what you should remember about unsupervised learning, is that it is mainly concerned with finding good representations of data, converting data into good representations, and generating data samples from these representations. Unsupervised learning is also currently used in many applications, as I just described, and I expect that it will become even more prominent as the algorithms evolve, because things like generative adversarial networks were only invented three years ago.
So there's much exciting work still to do. This is the end of my talk, and I want to say a great thanks to the organizers for the many iterations in the rehearsals to improve this talk and hopefully make it understandable to you all. If not, I'm happy to answer your questions now. Thanks.

Thanks a lot, Stefan. So, your questions, please.

Could you explain the concept of the discriminator again?

Yeah, and the generator. Sorry. So the role of the discriminator is this: the discriminator is provided with an image and gives feedback on whether this image looks real or not.

And how would you realize that in a real-life example?

Yeah, so the discriminator is given an image, and he doesn't know where it comes from, and he states whether it's real or fake. He is trained on samples from the true dataset and from the generator, so you basically provide him with the information of what is true and what is fake, and then, in the training phase afterwards, he judges the samples from the generator.

So both networks basically compete against each other?

Exactly, that's part of the trick behind this. You're attacking an unsupervised learning problem by introducing this discriminator, which casts it into a supervised learning problem, but you don't need any labels for this, because you just need to know which data comes from your dataset and which data was generated. That's basically the trick of how you get the labels. Yeah, thanks.

Okay, any other questions? Here's one.

Okay, so the question was what kind of feedback signal the generator actually gets from the discriminator. So in a real-world example, both are implemented with neural networks, and what you do is compute the gradient: you basically compute how the generator has to change its output in order to make the discriminator fail at the task of classifying it correctly. Does that help?
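The chain-rule computation just described can be made concrete with a fixed scalar discriminator and a linear generator; all numbers here are invented, and a finite-difference check confirms the gradient:

```python
import numpy as np

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

w, c = 1.5, -2.0   # toy fixed discriminator D(x) = sigmoid(w*x + c)
a, b = 0.7, 0.3    # toy generator G(z) = a*z + b, parameters to update
z = 0.5            # one latent sample

def gen_loss(a, b):
    """Generator loss -log D(G(z)): small when D is fooled."""
    x = a * z + b
    return -np.log(sigmoid(w * x + c))

# analytic gradient via the chain rule THROUGH the discriminator:
# dL/da = (D(x) - 1) * w * z,   dL/db = (D(x) - 1) * w
x = a * z + b
d = sigmoid(w * x + c)
grad_a = (d - 1) * w * z
grad_b = (d - 1) * w

# finite-difference check that the chain rule is right
eps = 1e-6
num_a = (gen_loss(a + eps, b) - gen_loss(a - eps, b)) / (2 * eps)
num_b = (gen_loss(a, b + eps) - gen_loss(a, b - eps)) / (2 * eps)
print(grad_a, num_a)
print(grad_b, num_b)
```

The factor w in the gradient is the discriminator's contribution: the generator literally receives the slope of the discriminator's decision function as its feedback signal.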
so So you You're basically looking at the the features the discriminator uses to actually discriminate between true and fake images And you have access to to this Decision boundary in the implementation later on Yes, here's one one. So in that case then do the networks have to mirror each other closely enough for one to pass the gradients to the other? How does that what do you mean by by mirroring? So does the structure of the networks have to match in order for the gradients to be passable? As in you're designing the generator and discriminator to be mirrors of one another so to speak and then passing the gradients between them No, that's not necessary. So they can I mean in practice both are implemented with convolutional networks So there's I mean the the kind of structure is is really similar But apart from actually being differentiable. There are no real restrictions you have Yeah, so this is most of the day a usual neural network used for supervised learning actually and The last question is actually from me much seven about your research You told us that you work in the domain of where a little training data is available and You work with Semantic segmented segmentation. Yeah, right of microscope images. Yeah, so what exactly is it and how? How does it help? So how does how does unsupervised exactly? When you when you have medical images you usually have a great variance between images acquired from different laboratories So if you have your data set, which is painfully annotated by the medical guys from your department and you train your Your models supervised on them and then you get data from another lab that uses a different acquiring method for for the images or another microscope There's some thought of difference in the images and your which is called a main shift and Your algorithms are likely to fail. 
So we are basically using this sort of methodology to render the images that come from a different lab as if they were acquired by our lab, so that it's possible to reuse our algorithms. Okay. Thank you very much. Thank you very much, Stefan.