Okay, good morning again, folks. Welcome back. I guess I didn't scare everyone away yesterday. One comment I got was that I could talk at a quarter of the speed, so I'm going to slow down even more. But first, to make up for those of you I may have lost yesterday, I'll give you a quick review of what I tried to tell you in an hour and a half yesterday, with lots of good discussion, in more like 10 or 15 minutes. Then we'll get on to today's section.

Remember, this is all about vision, specifically object vision. Yesterday I tried to show you where in the brain the key computations are taking place and how we think about those in population spaces. But I left open the question I want to talk about today: how do we go from an image to those population patterns of activity that I kept referring to yesterday, in the place called inferotemporal (IT) cortex?

This is a slide we talked a lot about yesterday, so I'm putting it up again to make sure everybody's on the same page. The big picture was that any image we present to the animal, in the central 10 degrees, at, say, a hundred-millisecond duration, produces a measured pattern of activity across a sampled set of neurons. I'm showing n neurons here, where n is typically a hundred to a thousand, out of roughly 10 million neurons within IT cortex. So you have a sample, and we observe the spike times, which are shown as tick marks here. We then collapse these data in various ways, which we refer to as possible neural codes. The simplest one I presented yesterday was just to count spikes over a time window of a hundred milliseconds. I'm going to relax
that assumption later, probably not today, maybe tomorrow. We count spikes for a random set of neurons, randomly sampled from IT, just as in the early experiments, whether by my lab or whoever did the experiment. We think of downstream readers of IT as equivalently drawing a set of samples, wires if you will, off of it, and having to learn on those samples to perform a task like recognition.

So this is an example of what might actually happen in the brain in response to that image. It's a condensed version of the code, where we have one mean response for each neuron. There are n neurons, so there are n values here. This shows one real trial; in practice, we might repeat the stimulus 50 times and report each neuron's average response to this image over those 50 repetitions. Here this looks like about 20 sampled neurons; if we had, say, a hundred neurons.
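To make the bookkeeping concrete, here is a minimal sketch (with made-up spike times and assumed data shapes, not the lab's actual pipeline) of turning spike times into the mean-response vector just described: count spikes in a 100 ms window for each repetition, then average over the 50 repetitions to get one real value per neuron.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_reps = 20, 50

# Hypothetical spike-time data: one list per neuron, one array per repetition,
# spike times in seconds relative to image onset.
spike_times = [[rng.uniform(0.0, 0.3, size=rng.poisson(8)) for _ in range(n_reps)]
               for _ in range(n_neurons)]

def population_vector(spike_times, t_start=0.07, t_stop=0.17):
    """Mean spike count per neuron in a 100 ms window, averaged over repeats."""
    vec = np.empty(len(spike_times))
    for i, reps in enumerate(spike_times):
        counts = [np.sum((t >= t_start) & (t < t_stop)) for t in reps]
        vec[i] = np.mean(counts)
    return vec  # one real value per neuron: a point in n-dimensional space

v = population_vector(spike_times)
print(v.shape)  # (20,)
```

The window start of 70 ms is an illustrative choice, matching the response latencies mentioned later in the talk.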
We'd have a hundred values here, each just a real value: the observed average spike rate of that site to this image over the time window. We have many such vectors, which I showed you as big green vectors yesterday, and then I talked about thinking of those vectors as points within this n-dimensional space of neurons: the mean, plus some uncertainty around the mean. I showed you this drawing, where each point, now in the IT neural space, is just a dot. It's a re-representation of the data, and the reason we like it is that it gets across the idea that you have to deal with not just one image but many images together, and you have to find rules that carve this space, rules you can build from some of the images and then generalize to new images.

We do that with linear classifiers, shown here as a line in a two-dimensional space, but a plane in three dimensions or a hyperplane in a higher-dimensional space. The classifier just cuts the space into two parts: in this case, these images belong to the category "face" and these belong to "not face". And the idea I tried to convey is that if you train linear classifiers in a space like IT's, it turns out you generalize quite well, whereas if you do this on the raw pixels or on lower-level visual representations, you don't do as well. And not only do you do well in terms of performance, you do well in terms of predicting the animal's patterns of behavior, both its strengths and its weaknesses, across many different types of recognition tasks. This example is face versus not-face, but you can think of one classifier for each possible subtask, car versus not-car, and so on. When you build a lot of those, then you
get those colorful behavioral patterns we spent a lot of time looking at yesterday, and I'll show them again in a moment. This is one way I want you to think about how we hypothesize that a place like IT can support the general space of core recognition across its many subtasks.

Let me pause and take questions here. That's the review from yesterday; is everybody okay with where we are? Okay. Again, it's a hypothesis; you may have other views, as came up in questions, about other ways of thinking of the spike codes and so forth. I'm just saying this does pretty well.

Now, that spike vector from before is a green vector here, and there are many images stacked up: 2,000 images are shown here. What I tried to tell you yesterday is that it's not just face versus not-face but many tasks, drawn here as subtasks of varying behavioral difficulty. This axis is behavioral data; these are neural data. If you build linear decoders on the neural data and predict performance on held-out images, you get the performance the animal shows on held-out images, which is what I'm showing here. For the aficionados: the data shown here aren't strictly on held-out images, because it's hard to hold everything out, but we can show that the animal's performance on held-out images is very well approximated by these data.

So the big picture is that you can take linear decoders over these IT population spaces, on randomly selected groups of neurons, train them with some reasonable number of training samples, and predict the animal's performance. That is, you can build what appears to be a behaving monkey from linear decoders on a sample of IT neurons.
That's the upshot. The summary I wanted you to have from day one is that the feature set out of IT, which we think of as somewhere around a hundred to a thousand underlying encoding dimensions carried across ten million neurons, so with a lot of redundancy, is the penultimate product of this recognition processing pipeline, and it can support behavior with just these linear classifiers.

I skipped this slide yesterday, so I'm sticking it in here because it reinforces the message. When you think of us learning many new objects and somehow handling them all together, still recognizing a car or a dog or a plane, and when I learn about cars I don't apparently interfere with my recognition of planes, our working hypothesis is that there's a largely stable set of feature encodings of the image in IT that's sufficient for all future objects you may want to learn. A newborn, or a monkey, doesn't know what a car is, but we can teach the monkey what a car is. This is relevant because a lot of this course is about learning; I think that's in the title of the course. I'm really talking about representation, and the learning part I've described so far lives on top of IT, in these classifiers.

So we model this as an almost stable set of features, one that might undergo postnatal development but is already quite powerful, so that you just need a thin layer of learning on top of it. In machine-learning language, that's back-end training, just training up a softmax layer on top of a feature set. Again, this is the powerful part: there's a feature set in IT that we think of as roughly a hundred dimensional, and you
can learn faces or dogs or apples or trees or bananas or cars on top of it, if I try to teach you those things, or as we do indeed teach the monkey.

One slide I didn't show you yesterday, which I think is important for setting context about that idea and learning: all that behavioral data I showed you from the monkey is monkey performance relative to humans. Here's a bunch of different objects the animals learned; here's one. This line is human-level performance, and you see the monkey is, as I said yesterday, at about human level after it learned. The data I showed you come from after the animals had learned, in fact extending well beyond this point as we keep collecting data. But here I'm highlighting the period where the animal first has to figure out, "I don't know what a dog is; he's showing me pictures of dogs; I'm getting juice reward," and somehow get up to human-level performance across new images of dogs. There's a learning window here. So I was showing you the steady-state performance.

When I show these pictures to humans, they are close to this performance as soon as they come into the test, we think because we're showing objects most of us already know. Monkeys have to be trained, but they train up in a few thousand trials, which is about one to four days the way we run this. So it takes a monkey about one to four days to get to human-level performance.

I'm showing you all this because when we measure IT itself, it's reasonably stable: if you record IT before and after this training, or if you look at untrained animals, you can still get good decodes out of IT for these tasks. So I'm saying all this to emphasize that we don't think the power of this system is
resulting from the actual training of the monkey. We think of the monkey's training as building up what we're approximating with those downstream classifiers, which stand in for whatever the monkey does downstream. You could ask a lot of good questions we're working on now, like exactly what type of linear classifier is the right one, and we don't know all the answers. The point is that IT in the adult animal, even an untrained animal, is already set up to be ready to learn with just one linear classifier layer. It's not because we trained these animals on objects; the powerful feature set is somehow already in place in the adult animal. Okay, does that make sense so far? Yes, go ahead.

[Question about capacity.] Yeah, I knew someone was going to ask that, and I don't have a prepared answer. It should put some limits; it depends on the objects and the layout of the space, and that's a great question. Your question is essentially: train a monkey on a thousand objects, or keep pushing, and when do they start breaking down? We've only gotten up to about a hundred or so, and they handle that pretty well. Humans have been estimated to know several thousand objects. So is the space sufficient for that? We don't really know; we believe it would be, but we don't know yet. There are good theoretical questions about the layout of the space and its capacity to support classifications. That kind of analysis can be done, and has been done, for arbitrary groupings of things, but we don't think humans are only learning arbitrary things.
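The theory just mentioned for arbitrary groupings is the classic function-counting result (Cover, 1965): for n points in general position in d dimensions, the number of the 2^n possible binary labelings that a linear classifier through the origin can realize is C(n, d) = 2 * sum over k < d of binom(n-1, k). A quick sketch (numbers purely illustrative; the tie to IT's dimensionality is the talk's open question, not something this computes):

```python
from math import comb

def separable_fraction(n, d):
    """Fraction of all 2**n dichotomies of n points (general position, d dims)
    that a linear classifier through the origin can realize (Cover's count)."""
    c = 2 * sum(comb(n - 1, k) for k in range(d))
    return c / 2**n

print(separable_fraction(50, 100))   # n <= d: every dichotomy is separable
print(separable_fraction(200, 100))  # n = 2d: exactly half are separable
```

The well-known capacity point falls at n = 2d: below it, almost any arbitrary grouping is linearly separable; above it, most are not. Naturalistic categories, as the talk notes, need not behave like these arbitrary cuts.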
So that's the more interesting case. For arbitrary cuts of a space you can do the theory, but for the more naturalistic cases it's harder to know the answer.

The way we did it here, the nuts and bolts: you build up three-dimensional models, models of cars and dogs and trees, get a bunch of those 3D models, and then put them on Mechanical Turk and ask a bunch of humans out there to give a free label for each. If at least 90% of humans return the same label, we call it a basic-level object. So if at least 90% say "elephant" and don't say "animal", then 90% of humans tend to call that thing the same name; that's roughly the psychologist's definition of a basic-level category. We then use only those objects in our test. We start with a model set generated by the world of gamers and video designers and whoever created such things, push it through that human testing to make sure each model meets the bar, and then it makes it into our test set. We have a battery of several hundred of those, which we keep trying to expand, and I'm showing you a subset. These are arbitrary samples of that set: we didn't choose them to make sure we had elephants or dogs; we think of them as a sampling from the very large space of possible objects that meet the generative criteria I'm loosely describing. Does that make sense?

[Question about categories versus exemplars.] We've done a little of that. When you say category, you mean cross-category exemplar variability, that there are many chairs: I can have one specific 3D model of a chair, versus eight 3D models of chairs that are all still called chairs. That's the chair category, as opposed to the specific chair in this room, right?
I'm mixing those up together, and I'm being a little fast and loose; it'll come up again later. To be precise for these data: these were specific models, like one fixed chair. The data I showed you yesterday, the blue dots on the scatter plot, were category data, where there were many chairs within the group. So I'm mixing together studies to give you the general idea; we've done both. Your question may be whether you can see differences in the decoder's ability to predict in those two cases, and we don't see that yet, but that doesn't mean it's not present; it could be looked at more deeply. The first-pass answer is that either way, linear decodes out of IT roughly predict what you're going to get. The details are still to be pushed further.

[Question about stability.] Yeah, as a first approximation, consider IT in the adult as pretty stable. David is referring to work, which we did and that I still really like, that connects to what I'll show next: you can play statistical tricks where you expose the animal to repeated, strong statistical changes in the environment, and you get particular shifts in the IT responses over a period of hours. We don't know how long those last, over days or weeks; the recovery periods are still unclear. Here's how I reconcile those results, which I believe, with the decoder results, which I also believe. You may also know from synaptic physiology and two-photon imaging that there is synaptic turnover in the brain, even in the visual system, all the time, right?
So the brain is changing at one level but stable at another. The way I think about it is that most of us live in a reasonably stable, stationary statistical environment, and if I don't change that, I get something that looks stable; but if I push on it in particular ways, I can get it to bend. The system is bendable, but when we observe it, it's mostly not bending, because the stationary statistics of the environment are holding it in a reasonably stable place. That's how I bring those results together in my head. You might find that unsatisfying: "Jim, you're saying it's plastic, it's just not plastic under the traditional learning protocol." This learning protocol, I would say, is not changing IT much, but a particular statistical manipulation that pushes in a certain way can change it. That's how I think about it too: to a first approximation, the ventral stream is subject to certain types of unsupervised learning, whereas this task learning, which is reinforcement learning, is best approximated as happening downstream. Of course that's an oversimplification; probably some reinforcement learning drifts into IT, and there have been studies showing such effects, but I want to emphasize to this group that they are small relative to the stimulus-driven effects. They're second order, but they're there. In terms of conveying the main point it's convenient to sweep them under the rug, as I've done here, but they certainly exist in both directions.

Okay, so let's get on to today. We just got through saying that you can approximate that one-to-two-day learning of the animal with linear classifiers that we think mechanistically live somewhere past IT, in downstream circuits. But I want to come back to what I said at the end
yesterday: I gave you one part of the reverse-engineering loop, an okay model from IT to the behavior, basically these linear classifiers on the population spaces. But it's not satisfying as a model of the domain of core recognition, which I set up for you as central 10-degree images, about 200 milliseconds, somehow giving rise to behavior, because I haven't yet given you a model of how you go from the images to those neural responses. I've only given you models from the neural responses to the behavior, approximated with these classifiers. So now we want to work on what happens in between: how are these features computed from the image, and what is happening along the way, which are related questions.

As I mentioned, there was a bit of a breakthrough for our field around 2013, which has improved since then, and that's what I'm going to spend most of today on: that story and where it has evolved over the last five years or so. The topic is how we can model, or at least approximate in silicon, the computations that exist between the images and the IT responses themselves.

Okay, is everybody with me? Questions? You see where I'm going. Should I keep going at this pace? Somebody said quarter speed; I think I'm at about half speed right now. Is this okay? Usually someone wants me to go faster too, so I'm trying to hit the median.

Okay, so this is to get you started. Davide showed some of this in his nice introduction yesterday, where he showed Tanaka's work, and this is our version of Tanaka-style work. You're a physiologist; you go record a neuron, say in IT; you measure its response to a whole bunch of images. As I showed you, we measure thousands of images.
I'm showing 1,600 images here. This axis is not time; this is the mean response, averaged over 50 repetitions, to a bunch of images, and I've grouped the images by the category that was used to generate them. I was asked a minute ago whether we use one chair: there are actually eight chairs in this category, and you can kind of see that here; this chair is not the same chair as that one, but they're all within the chair category. So there are eight chairs, eight faces, eight of each object in this particular set, and then many images of each, rendered with different parameters and placed on random backgrounds. You end up with these large image sets, which I showed you yesterday with the face and the plane and the car in those funny scenes; here are four more examples. In some, it might be hard for you to even notice there's a chair there; that variability in difficulty is something we take advantage of at the behavioral level, as you already saw.

But here I want you to focus on the neural response: look, this is this neuron's response. Again, this axis is not time; it's just a way of plotting the data, and these could equally be bar plots or dots. This format just makes it easier to see, because there are a lot of numbers here. You can squint at this, and this is what people like Tanaka and others would do: stare at it, say "I think this neuron is doing something," then change the stimulus to test, say, "I think it's a chair neuron" or "a face neuron," and reduce the stimulus in some way to try to confirm that hypothesis.

So you could look at this and say, at first glance, this is kind of a chair neuron, because it looks like I could fit it with a function that's
flat, then a step up, then flat, then a step down, then flat. That's the formal version of saying "it's a chair neuron": the model says it likes all chairs equally and doesn't like anything else. But you can see that model is clearly falsified by these data: the neuron doesn't like that image of a chair, it really likes these ones, and it sometimes likes planes better than some chairs, and boats better than chairs. So it's not fair to the poor IT neuron to call it a chair neuron. It might make humans happy to call it that, but it's clearly not a chair neuron. Okay, I hope you can see that.

Here's a more famous kind of IT neuron, a so-called face neuron. Again, you could squint at this and fit a flat line, a step up, and a step back down, and if you had to put it in one of these categories, then yes, "face neuron" is your best model of this neuron, as "chair neuron" was your best model of the other. But again, it's not fair to the poor IT neuron: it doesn't like that face, it really likes that face, and sometimes it likes some animals, or some planes, better than faces. So there's structure in here that isn't captured by those simple models. So, you've probably heard of face neurons, but there are no face neurons in IT.
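The falsification just described can be made quantitative. Here's a sketch with made-up firing rates: the "chair neuron" (flat-step-flat) model predicts one constant rate for every chair image and another constant for everything else, so the variance it fails to explain is exactly the image-by-image structure the category label misses.

```python
import numpy as np

rng = np.random.default_rng(2)
n_images = 200
is_chair = np.zeros(n_images, dtype=bool)
is_chair[:40] = True

# Hypothetical neuron: higher mean rate for chairs, but with large
# image-by-image variation that the label does not capture.
rates = np.where(is_chair, 30.0, 12.0) + rng.normal(0.0, 8.0, n_images)

def step_model_r2(rates, labels):
    """Explained variance of the flat-step-flat (category) model."""
    pred = np.where(labels, rates[labels].mean(), rates[~labels].mean())
    resid = np.sum((rates - pred) ** 2)
    total = np.sum((rates - rates.mean()) ** 2)
    return 1.0 - resid / total

print("step-model R^2:", step_model_r2(rates, is_chair))
```

For a literal chair neuron the R^2 would be near 1; a value well below 1 quantifies how much of the neuron's response pattern the word "chair" leaves on the table.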
Those labels are human crutches, used because we don't have better words to describe these neurons. So what I'm going to give you today is not words but algorithms that can produce better approximations of what these responses look like.

Here's a V4 neuron, also to set the stage. If you record from V4 and try to put neurons into the categories I showed you, it's even harder. That's why people talk about IT face and chair and hand neurons but not about V4 hand and face and chair neurons: in V4 it's really clear that the neuron lives in some complicated feature space that doesn't map onto human semantic categories as well. That's probably a clue that V4 is getting close to what's needed to support chairs and faces and so forth, but it's not quite all the way there; it's deeper in the middle of the network. If you plot a V4 neuron against the same categories, chairs and faces and so forth, you see it's really, really complicated; you can't even fit one of those step functions to it. So you might start looking for edges or curves, and a number of studies have tried to find other words, like curves and edges, that might satisfy such a neuron. But again, it's a complicated response.

This is what the field looked like for a long time, for many decades: people staring at these responses and trying to come up with language-driven models, some of them a little more than language,
I should say, to be fair. What happened more recently is that people came up with algorithms, neural networks, that can actually explain these responses quite well. Stepping back, the story is: there are measurements, but those alone are hard to interpret, as I just showed you, so you need to start modeling them in some way to explain the measurements better.

The models I'm going to show you are not models of a full behaving organism, like Chris showed or like Peter was thinking about in his work, where the model actually has to take an action. Technically there are some actions, but these models aren't pressing buttons or making those choices; we add that machinery on top of the models for them. The models we're playing with just take an image and try to process it into a better form of representation, strictly to support putting linear classifiers on the top end, to say it's a bird and not an elephant. Their whole world as input is this part of the visual scene, and they try to process it into a space that can support linear classification.

These models aren't just pulled out of the ether. In fact, this is one of the most interesting intersections between current machine learning and AI and brain and cognitive sciences.
Over many decades, neuroscience, and to some extent cognitive science, influenced the shaping of current deep networks through a lot of measurements, beginning with the work of Hubel and Wiesel on edge-selective cells in V1. I've highlighted here a number of ideas that are all relevant to current deep networks and that essentially came out of neuroscience first; when people say these networks are brain-inspired, these are the most important things they mean. I usually go through these quickly, but let me take a little more time today.

First, local processing. At any step of the visual hierarchy, as the neuroscience textbooks have said for decades, going back to Hubel and Wiesel, a V1 neuron doesn't integrate over all of the LGN; it has a local region of the LGN that it draws input from, its receptive field, and the anatomy supports that, as I've schematized here. Similarly, a V2 neuron doesn't integrate over all of V1; it draws input from a local region of V1. So there's spatially local processing, which can be approximated with a linear filter followed by a simple nonlinearity, like a threshold. Those are the standard so-called LN models of the visual system, and I'll show you drawings of those in a minute.

The next concept, where people get hung up, is convolution. You've heard of convolutional neural networks, and neuroscientists often say the brain doesn't do convolution, so convolutional neural networks must be wrong. But they're forgetting an implicit assumption you'll find in any neuroscience textbook: whatever I'm doing here in V1, say extracting an oriented edge, I assume there's some other V1 neuron
processing another part of the visual field that has a similar oriented-edge preference. Edge extraction is repeated across the image. Now, the brain does not do that by taking one filter and convolving it over the LGN output, but that's how you can best implement it in a machine, which is why these models are run as convolutions. The neuroscientist's version is a parallel set of filters, implemented in parallel rather than as an actual convolution, but algorithmically they do exactly the same thing. There are important details, like: is this edge filter exactly the same as that edge filter? In a convolutional model, yes; this is called weight sharing, because running a convolution is as if you magically copied the filter weights from here all the way over there. The real brain probably doesn't match them exactly. But if you think about the brain learning or evolving, if it does the same thing here as there, and the statistics of the world are reasonably stationary over space, then you get something approximating weight sharing. So these features of the networks also exist in some form in neuroscience textbooks; they're just not implemented as, or called, convolutions there.

Threshold nonlinearities I already mentioned. Nonlinear pooling, where you take the outputs and normalize within a local, or slightly larger, region, is another key idea in current neural networks that was also a key idea in brain and cognitive sciences, especially in visual processing: the idea of normalization.

Then there's the notion of so-called deep neural networks. The original deep neural network, I like to say, is the ventral visual stream; I've already been showing it to you as a
deep network, as the textbooks have said for decades. I showed you the history of the anatomy, which tells us why we stack those areas in a particular way, and it also tells us there are about four cortical stages, as I've shown here, maybe six depending on how you break it down, but roughly four: not a hundred, something like four to fewer than ten.

I've also been talking about largely feed-forward processing. I showed you the latency data: response latency gradually increases from 60 to 70 to 80 to 90 to about 100 milliseconds. That gradual progression implies the processing is mostly feed-forward, and a hundred milliseconds is a very short time, which means there isn't much time for other brain areas and recurrent circuits to even engage in the first place. Those are the kinds of evidence that say a feed-forward model is a reasonable starting approximation for what's going on in the brain's deep network for vision.

Another key idea is population rate codes. In the field of vision this idea has been somewhat ignored, because of the way Hubel and Wiesel started the field, but in the other sensory fields, especially somatosensory, population codes arise more naturally: you don't think of neurons as living in isolation, each doing a job on its own, but as part of a group of neurons that together can support a behavior. You saw me present that idea in the first part of the talk; it has been around for a while, and it's also important in current deep neural networks. And I already mentioned that the solution is computed in about a hundred milliseconds. These are just some of the things present in current
These are just some of the things that are there in current networks and that were in neuroscience textbooks, in one form or another, for several decades. So that's a bit of background. But these networks didn't just start five years ago; people had been building networks to try to approximate how the visual system handles this, to do tasks like recognition, for many decades as well. I think the most famous network that really starts the thread leading to today's work is the network by Fukushima, the Neocognitron. Without taking you through all the details: the key idea, again, is that it implements these ideas of local processing, with this gradual buildup of simple cells, complex cells, simple cells, complex cells. That's what the S's and C's mean here. What it's doing, intuitively, is trying to build up selectivity, which is the simple-cell part, like edge features, and Davide showed you this trade-off between selectivity and invariance. So you build up selectivity, say to different orientations, and then you build up some invariance to position and scale, like a complex cell, with a bit of pooling operations. That's the complex-cell part. But you don't have invariance over the whole visual field; you just get a little bit of local invariance and a little bit of selectivity, and then you go to the next stage and do it again: repeat, repeat, repeat. The difficulty is exactly how you choose to build those selectivities and those tolerances; it's not specified by the brain data or by these models. So the core ideas were there, but how to do it was not yet clear. Tommaso Poggio's group extended that class in a class of models called the HMAX class. These are basically the Fukushima models except in much more detail, with more mapping to the brain. It's called the HMAX class because that complex-cell step was a pooling, a maximum operator, and that's where the "max" in the name comes from. So when you
build a complex cell, it's like taking the maximum of the simple-cell responses within a region of visual space. Maybe it's worth spending a little time on this, just because I don't have slides for it. If I have a bunch of simple cells that all have a similar orientation preference, and then I have a complex cell, that complex cell is just going to take the outputs of those cells and report, as its output, whatever the maximum response is within that pool of local cells. So this is the complex-cell layer and this is the simple-cell layer, and you would imagine you'd have another complex cell for this orientation, and another type of complex cell for another orientation, the same idea. That is how you build up invariances in these kinds of simple-complex networks, from Fukushima and then the HMAX class. This is hopefully, by way of background, giving you a feel for how these models had been evolving over the last couple of decades. I don't want to dwell on it for too long; it's just a bit of historical perspective. Then, when my lab started working in this area, with Dave Cox and Nicolas Pinto, who were graduate students in the lab at the time, we decided to take the tack of letting the machine start doing our work for us. One of the challenges of those networks is that there are a lot of free parameters, as you might imagine: how big is this pooling region, exactly how do I implement the operator, what features should I use here?
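The simple-to-complex pooling just described can be sketched in a few lines. This is my own toy construction, with crude bar detectors standing in for oriented filters: each simple cell is a rectified dot product at a slightly shifted position, and the complex cell takes the max over that local pool, gaining a little position invariance.

```python
# Toy sketch (my construction, not from the lecture): simple cells share
# an orientation preference but sit at slightly different positions; a
# complex cell reports the max over that local pool, gaining a little
# position invariance.
import numpy as np

def bar_filter(theta, shift):
    """Crude oriented filter at a shifted position (illustrative only)."""
    f = np.zeros((5, 5))
    if theta == "vertical":
        f[:, 2 + shift] = 1.0      # a vertical bar detector, shifted
    else:
        f[2 + shift, :] = 1.0      # a horizontal bar detector
    return f

def simple_cell(patch, f):
    return max(0.0, float(np.sum(patch * f)))   # rectified dot product

def complex_cell(patch, theta):
    # Pool with a max over simple cells at shifts -1, 0, +1.
    return max(simple_cell(patch, bar_filter(theta, s)) for s in (-1, 0, 1))

# A vertical bar at column 1 (off-center):
patch = np.zeros((5, 5))
patch[:, 1] = 1.0
centered = simple_cell(patch, bar_filter("vertical", 0))   # misses the bar
pooled = complex_cell(patch, "vertical")                   # still responds
assert pooled > centered
```

The centered simple cell misses the off-center bar entirely, while the complex cell still fires: that is the local invariance-by-pooling idea, repeated stage after stage.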
There's a ton of free parameters that are consistent with these neuroscience constraints when you build these networks, and we said: let's start searching that space of parameters for architectural strategies that turn out to be better or worse. That's what we did here, basically using GPUs to search large classes of architectures to find ones that tended to be high-performing. I won't dwell on the output of that work, just to tell you the main idea: you start to let machines do the searching for you, rather than one graduate student tuning up a model over three or four years, which is how it had worked before that. Now I want to spend the bulk of my remaining time today talking about a model, and a follow-up set of models, that is really a deep neural network. This was the version we built in our lab, called HMO. You probably haven't heard of HMO, but it doesn't matter; it stands for hierarchical modular optimization, which was a way of putting architectures together to perform well on these tasks, and I'll show you that in a minute. It follows in this long thread: the Fukushima class of models, the HMAX class of models, and then this class of models. So I'm going to tell you about that today. It was done by a graduate student, Ha Hong, and my postdoc at the time, Dan Yamins, who, as I mentioned yesterday, is now an assistant professor at Stanford. So this is HMO. It is a feed-forward deep neural network. It has four layers that we roughly call V1, V2, V4, and IT. We used four because every talk we'd give would say, look, there are roughly four stages, just like I showed you, so let's build it to approximate that.
So when I say this network is neuroscience-constrained, I mean we're constraining the macro-architecture, that is, those four layers, and things like the convolution across space, where you take the same filter type and copy it across space. I also mentioned some of the meso-architecture: things like dot products followed by rectifying nonlinearities, with some normalization. This is Matteo Carandini and David Heeger; these are again older ideas about what V1, for instance, does, and presumably what V2 and V4 as cortical modules do: linear operators followed by nonlinearities, with some normalization. We didn't implement this exactly, but this family of models is not far from implementing things like this. I'm putting this up to make a connection to some existing neuroscience. So let me try to go through, very slowly, what I'm actually showing here. But let me pause first to make sure everybody's on the same page. Any questions at this point? Okay. Yes. Yes, so that's what I was about to go through. So look, again, I wish I had better slides of this so I could draw it on the board, but I'm going to try to tell you here. There are four planes, as Davide said. You guys see those four planes, right? Think of each of those planes as having lots of neurons, thousands of simulated neurons, "neurons" in quotes, in those planes. These neurons, first of all, I should say, are analog units, right? There are no spikes. They just compute a number, a real number; it can range from negative to positive. There are no constraints on them.
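The Carandini-Heeger-style stage just mentioned, a linear operator, a rectifying nonlinearity, then divisive normalization, can be sketched like this. A minimal sketch of my own: the constants and the pooling choice are mine, and, as said above, HMO did not implement exactly this.

```python
# Minimal sketch of a linear-nonlinear-normalization stage (my own
# simplified version of the Carandini/Heeger idea): a dot product, a
# ReLU, then divisive normalization by a local pool of unit activity.
import numpy as np

def ln_norm_stage(inputs, weights, sigma=1.0):
    """inputs: (n_units, n_inputs); weights: (n_units, n_inputs)."""
    drive = np.maximum(0.0, (weights * inputs).sum(axis=1))  # linear + ReLU
    pool = np.sqrt(np.mean(drive ** 2))                      # pooled energy
    return drive / (sigma + pool)                            # divisive norm

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 9))      # 4 units, each seeing a 9-pixel patch
w = rng.standard_normal((4, 9))
r = ln_norm_stage(x, w)
assert (r >= 0).all()                # rectified outputs
# Doubling input contrast less than doubles the response (saturation):
r2 = ln_norm_stage(2 * x, w)
assert (r2 <= 2 * r + 1e-9).all()
```

The final check shows the characteristic effect of divisive normalization: responses saturate with contrast rather than scaling linearly, which is one reason the idea has been attractive as a canonical cortical computation.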
They're just real-valued output units. That's why I'm putting "neurons" in quotes: they're not spiking neurons. We think of them as approximating the firing rates of real neurons. So, as Davide pointed out, there are four planes there that I'm showing. If I take that V1 layer, it sort of looks like this. Think of it as: all of these are filter type one, all of these are filter type two, all of these are filter type three, and all of these are filter type four. I don't know if I've drawn them all, but it's like you have four different types of filters, and you have spatial copies of them across the visual field. That's what's being schematized by this drawing here. So you might start asking questions: why four? Why not eight? Why not twenty? How big are these receptive fields? Those are some of the parameter details that we're going to optimize, as I'll show you next. In this case I'm showing four, but the model actually had more than four; I think the original model had something like twenty in the first layer, the V1 layer. I don't remember exactly, probably because this detail doesn't actually matter that strongly. But one thing you should see is that in the next layer there are more of these planes, and then even more, and then even more. The number of so-called filter types is increasing as you go deeper, and that should be kind of obvious to you, right? Because I can start to take mixtures of these things, at different layers: V2 takes mixtures of V1. Whatever mixture I choose to take here,
I take the same mixture here. So again, I'm going to copy that across space. But you see, as I do this, I can take more and more complicated mixtures, which should fit your intuition: from complicated sets of pixels I could end up with very complicated combinations of things. But I'm also doing it over a local spatial region, which gradually collapses my spatial axes, which is why you see these things getting smaller in this direction. So the neurons are devoting themselves more and more to the feature dimensions, which at the first level are things like oriented edges, and at the later levels are hard to describe, because they're linear-nonlinear, linear-nonlinear, linear-nonlinear combinations of a deep stack of things. They're a very complicated nonlinear function of the image. But the key thing to have in mind is that there are more features at the end than at the beginning, and less space at the end than at the beginning. There's still a little bit of a map there, though, even when you get toward the deepest stage, as shown here, because you don't have fully connected layers at that point; you still have some spatial copying across the field. I don't know if that makes sense so far. I should probably go slower. I hope I answered your question, Davide. Does this make sense? Should I do more on this? Does anybody need me to go on? What I'll do is show you some results, and then we can come back to specific questions about what's going on in particular parts of the model, but hopefully you get the gist of it: linear-nonlinear, copied across space. And those are the main things I wanted to get across. Okay, but I've also been saying there are a lot of parameters here, even these oriented edges. Why did you choose oriented edges in the first place? In HMO,
we didn't choose oriented edges. We let the model, the parameters, choose them. If you think about this linear operator right here, it's basically a pixel combination on the input; it has some pluses and some minuses over some pixel field. That's what's being shown there. Why did I choose those weights that way? Those are called the weights of the linear operators, and they are free parameters, among many others: how did you set the ReLU and the normalization, these output parameters here? There's a ton of free parameters in these models, and again, neuroscience doesn't tell you how to set them. The natural thing for people to do is to say: let's take some neural data and fit it. Use the neural data to fit the parameters so they fit the neural responses, and then that'll be a model of every neuron. Do that for many, many neurons, and then I'll get my answer of what's going on. That's the classic bottom-up approach, but it gets very hard, because pretty quickly you run out of neural data. So we took a different approach, and this is really, I think, the main contribution of this work. It's an approach that also exists in other parts of the field; we were just applying it to this particular problem, at a much bigger scale than had been done. The approach is essentially to assume something more like ethology: this network wasn't designed by evolution to build neurons that physiologists record; it was designed to do something greater, to perform some kind of task, to survive in the world. So what task would we want to make it do well?
We're not going to try to build a full robot here, to walk around and pick things up and so forth. So we said: let's pick the task that we already thought the ventral stream was doing, which was this core recognition task, to identify categories invariant to things like position, scale, pose, background, and so forth, all the things I talked about yesterday. So we picked a task that was originally inspired by cognitive science and the other things I briefly mentioned yesterday, about why you even need a visual system: to perform these kinds of tasks. But let's take this seriously as engineers: not just imagine the task, but actually optimize the parameters of the network to actually perform it. Now, when I phrase it this way, you say: well, this is a neuroscientist telling me he knows some stuff about the brain, but there's a lot of stuff he doesn't know. So he has this kind of model, which is basically a big model family, and he thinks this part of the brain does this particular thing. You put those two together, and you say: let's optimize within this family to find parameters, that is, choose a particular model within the model family, that will do the thing you say you want it to do. And so this is an assumption. This is a general family of models, and then you use a bunch of applied math and computer science tricks to actually tune the parameters. I'm saying it this way deliberately. You could call this learning, because we're finding parameters inside the network, some of which look like synapses or could be mapped to synapses, and some of which look more like anatomy parameters. But I don't think what we're doing here has much, probably almost nothing, to do with the actual learning going on in the brain. It's really just an engineering
trick to get the thing near the place in the model family that does this task well. Okay, so that's how you should think about that optimization. So I'm telling you all this, and maybe I should give you the punch line of what happens when you put these three ingredients together: architecture, which is here; task, which is here; and an optimizer, which is here. These are actually the core elements of any machine learning system. If you listen to Yoshua Bengio or anybody else talk about machine learning, they'll talk about something like this: some sort of task, usually called the cost function, and some sort of optimization, usually called the learning rule. You put these three elements together, and that's a general framing for all of machine learning; here I'm giving you a specific version of it for the ventral stream. The task is core recognition; the architecture is something about the ventral stream anatomy and physiology, as I described to you; and then there's an optimizer. Machine learning people really like that part; again, I don't think it has anything to do with what's going on in the brain, and that will be a fun discussion. But what's cool is that when you put these three ingredients together, the neurons you get in the middle of this network look an awful darn lot like the neurons I showed you that we couldn't predict or explain before, and that's what I'm going to show you next. So you put these ingredients together, stir it up, poof, a neural network pops out. You then ask: how does this network compare to the brain's network? I'm going to show you how we make those comparisons, and they look remarkably similar; that is the main take-home message here. So one way to think about this is that there's a set of parameters here, and every time we fix the parameters, it's like we've birthed a new organism.
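The three ingredients, architecture, cost function, and learning rule, form the generic loop of any machine learning system. Here is a toy sketch in which every detail is my own choice, not the lab's actual training code: the "architecture" is a single linear layer, the "task" is a toy binary categorization, and the "optimizer" is plain gradient descent.

```python
# Toy sketch of the three ingredients (all details mine): an architecture
# (one linear layer), a cost function (cross-entropy classification
# loss), and an optimizer (gradient descent on that loss).
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 10))          # stand-ins for "images"
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # a toy category label

w = np.zeros(10)                            # architecture: p = sigmoid(X @ w)
lr = 0.5
for _ in range(300):                        # optimizer: gradient descent
    p = 1.0 / (1.0 + np.exp(-(X @ w)))      # forward pass
    grad = X.T @ (p - y) / len(y)           # gradient of cross-entropy cost
    w -= lr * grad                          # learning-rule update

acc = np.mean((p > 0.5) == (y == 1))
print(f"training accuracy: {acc:.2f}")
```

Swap in the ventral-stream-like architecture, the core recognition task, and a modern optimizer, and this same three-part loop is, at a schematic level, the recipe being described, which is exactly why the speaker stresses that the optimizer is an engineering trick rather than a claim about learning in the brain.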
It's like a new artificial visual system. It's like you're running some sort of evolution in the computer. I'm using the word evolution, rather than development or learning, because we are not trying to say that this is what's going on in the brain yet. We're just saying we get to an endpoint, evolved in a computer through this process, and we compare that adult endpoint to the observed adult endpoint we measure in the non-human primate. So you run evolution; what does it evolve to? I already should have told you: it evolves to something that is much more like the brain than any network we had before, HMAX, Fukushima, our own models, much more like the brain than anything that had been built before, and that's what I'm going to show you next. So how do we know it looks like the brain? Because this is a neural network, I can make a mapping between neurons in this network and neurons that I measured in the brain, and ask how similar they look. This is where things get really interesting and forward-looking, as to how we as a field, and I mean the royal we, should be making these kinds of comparisons. So I'll show you how we've made these comparisons so far. But I want to pause here to remind you of the very first slide of the talk: I said we should build with neural networks, remember, when I was talking about reverse engineering. This is why you want to build with neural networks: your model can be mapped to the brain. If this were a bunch of probability equations and so forth, like you'd see in a lot of computer vision models up to that time, I would have to make a huge number of assumptions about how those mapped to the brain. Here I have many fewer assumptions, because I'm building with hardware that is not that far from a model of this hardware. So from the very outset, the architectural space is reasonably in line with the architectural space
that the physiologists are measuring. That was by construction, and that's what's exciting to me about what's going on in our field right now. People used to do computer vision with probabilities and hand-tuned features and their own ways of thinking about things, and I thought, well, that's great, they're doing something. Now they're doing computer vision with neural networks, and now they're actually generating hypotheses that may be relevant to the brain. The fields are now working together, because they're building with tools that are not far from the hardware we are measuring. I don't mean they're exact copies of neurons, but they're much closer than they used to be. That's one of the most exciting things going on right now: not that we've solved the problem, but that we're building with hypotheses that are relevant to neuroscience. So let's do this mapping. Again, this is a neural network with analog rate neurons; that is a neural network with spiking neurons, which I convert to analog rate neurons by averaging over time, as I showed you yesterday. Now we can ask: what does this IT look like compared to that IT? So let me, and this is the fun part, open a discussion here. I've kind of given you the answer a little bit already, but how do you guys think we should compare? If I told you this network has an IT that looks like the monkey's IT, what ways can you think of to compare them? Anybody want to offer up some ideas? Representational power. Okay, so what do you mean by that?
So we have two ITs, and we're going to do something with each of them, right? To translate so everybody can hear: he's saying, for the feature sets that come out of the two ITs, the simulated IT and the actual IT, how good are they at doing a bunch of categorization tasks? To assess that, you'd probably need to put linear classifiers on them, and you can do all those kinds of things, right? And I should say, remember, this thing was actually optimized to do that; essentially this is the softmax layer, the linear classifier step in the model. So we already know how good this is, and we can make those comparisons in absolute terms. I didn't have the slide up here, but on average, that was why we knew we were in good shape: we'd gotten these networks to be near human-level performance on these kinds of tasks. We already knew that going in, at a mean level. Now, there are other things that can be done, like behavioral details: what are the patterns of errors that come out of those classifiers? That would be in line with the kind of thing you're saying. These are behavioral-level measurements of both systems, in effect: for one, you take its IT, put a linear classifier on it, and get its behavior; for the other, you take the actual IT and classify to get its behavior. Those are somewhat indirect measures of the similarity between the systems. So that's one way to do it. It's not the most direct way, but it is one way you could do it. That's a fine answer. I saw some other hands. Yeah. Yeah, good question.
So I just said cognitive science sets the task. I meant it in a general sense: invariant recognition is something humans do very well, so let's get machines to do that well. It was at that level of setting the task. Another version would be: I see humans make this error on this image and that error on that image, and so you should optimize something to reproduce all those exact same errors. That would be another version of what I said, but that's not what I meant. So again, it's more of an aspirational goal here: try to be like a human. Please do your best on these invariant recognition tasks and optimize your parameters as best you can. And then, to answer this question: once we optimized the parameters and compared with humans on performance, it was in the game; it was about human-level on these tasks in overall percent accuracy. So I mean it in that very aspirational sense of: please try to be good at this. And if you are good at it, then let's look at your internals to see if your insides look like my insides. So what I'm talking about now is how we judge that. We have two insides, a model and a brain. How do we compare their insides in a way that's fair, so we can say one is like the other? That's what we're going to talk about next. I see two hands. Yes, go ahead. Yes, but how do I, well, I showed you some examples of neurons that I pulled out. I said: look, here's a face neuron, here's a chair neuron. But there are a lot of neurons, right?
Let's say there are a thousand, so I need a way to compare a thousand things. I still need to deal with that problem, because all I have is two bags of neurons: a simulated IT and a real IT. Yeah, there's a mapping problem. Right, and that makes you start thinking, and this is the reason I'm having this discussion, about interesting questions: how do I know your IT is like my IT? How do I know one monkey's IT is like another monkey's IT? What are fair ways to even assess whether there is such a thing as IT? I have my particular view on that. The simplest view would be that we all have, say, neuron 1269 in each of our heads, that my neurons are exact copies of yours. That sounds possible, but it also sounds a little extreme; at some point it doesn't sound quite right that we're exact copies in that way. What we really want is something more like an algorithmic copy, which is: hey, we share an encoding space of the world. We might span the same encoding dimensions, but our axes spanning them could be rotated, so our neurons are linear combinations of each other. That's another version of what it means to be the same. These are different types of similarity, and they would be assessed with different metrics. The metric we used originally was essentially that kind of spanning metric, which says: look, if there's a neuron, let's say that face neuron, I don't expect the exact face neuron that we happened to pull out of monkey 12 to exist in network 16. But I should be able to take network 16 and build linear combinations of its units that can predict the neuron in monkey 12, as if monkey 12 lives within the linear span of network 16. I'm throwing out numbers like 16 and 12 just to say there are many monkeys and many possible networks. So we're going to
try to build linear maps from here to here. To build a map, I have free parameters again, right? There are weights in that map that need to be determined, to establish the correspondence. So I need to start using some neural data. I didn't use any neural data up to here, other than the historical perspective data I described for you, but now I'm going to take some of the neural data, the actual responses of, let's call it the face neuron, just to fix ideas, on some randomly selected images, to build this linear mapping from these features, these model neurons, to that neuron. It's a many-features-to-one prediction. And then I'm going to ask how well that mapping does on held-out images: how well can it predict the responses of the real neuron to images that I didn't use to build the mapping? Are you guys with me? This is a key step. You have this artificial network, you want to say it's the same, so you build a linear map and then test it on held-out stuff. And then you can play games with what you hold out. Maybe I only show it faces and test on cars, or only show it cars and test on faces, or do it randomly, which is what I'm going to show you next: randomly sample from that space and randomly test on held-out images. I'll show you versions of different cuts on those. This is called the train/test split: not training in the sense of training the full network, but training in the sense of building the map from the network to the brain. Yes. Yeah, well, actually, I can tell you: let's say there are thousands of units in here, and I already told you there are about 10 million neurons there. So whether it's lower-dimensional is itself a very interesting question that I wasn't ready to get into yet. In terms of ambient, total dimensionality,
it's actually lower-dimensional, because I just said thousands versus millions; in reality I have hundreds and thousands, because I have hundreds of samples. But the deeper question is: what are the true dimensionalities of these two spaces? That's another way to make the comparison, a more interesting one, in more current directions; I think Bob is going to talk about some of that from his lab this afternoon. That's actually not captured by the kind of metric I described for you, which basically just asks whether this neuron lives within the span of those dimensions. And you've actually already uncovered one of the ways this comparison metric is still not fully satisfying: if I had a big, huge neural network, I could have a very overcomplete space here that could still satisfy a prediction here, and in that sense you wouldn't want to declare this a copy of that. It's more like a superset of it, and this linear mapping metric does not guard against that superset possibility. It just says the neuron is within the span, to the extent the mapping works.
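The mapping procedure described above can be sketched end to end. The details here are my own choices, not the lab's exact recipe (I use ridge regression and made-up feature and image counts): fit a linear map from model-layer features to one synthetic "recorded neuron" on training images, then score it only on held-out images.

```python
# Sketch of the model-to-neuron mapping step (assumed details: ridge
# regression, synthetic data, and the counts are my choices): fit a
# linear map on some images, then evaluate on held-out images only.
import numpy as np

rng = np.random.default_rng(2)
n_images, n_features = 200, 50
F = rng.standard_normal((n_images, n_features))        # model "IT" features
true_w = rng.standard_normal(n_features)
neuron = F @ true_w + 0.5 * rng.standard_normal(n_images)  # noisy "real" neuron

tr, te = slice(0, 150), slice(150, 200)                # train/test split
lam = 1.0                                              # ridge penalty
A = F[tr].T @ F[tr] + lam * np.eye(n_features)
w = np.linalg.solve(A, F[tr].T @ neuron[tr])           # fitted linear map

pred = F[te] @ w                                       # held-out prediction
r = np.corrcoef(pred, neuron[te])[0, 1]
print(f"held-out correlation: {r:.2f}")
```

Because this synthetic neuron lies (up to noise) in the span of the model features, the held-out correlation is high; the superset worry raised above is exactly that a big enough feature space can achieve this even when the model is richer than the brain.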
I don't know if I took your question and distorted it into my own little tirade. So, well, I haven't shown you any results yet, but that is the idea. I just wanted to set the stage for what I'm going to do to make these similarity comparisons. I'll show you two ways of measuring similarity; I've described one of them and I'll show you one more, but this is the one we used for much of what's published, so I'm showing it first. It's nice because it connects naturally to what I showed you earlier. Here's that chair neuron, and now I'm stringing out these images, and I used a bunch of other images that I didn't show here to build the linear mapping. Then I show you the prediction: the prediction of the ANN model's simulated IT neuron, as a weighted combination of its IT features, fit to this observed IT neuron. These are predictions for these images, and again, it didn't see any of these images; in fact, the network itself never even saw these objects when we trained the full network on the classification task. So there are two kinds of training data: first, what did you build the darn network on, and second, what did you use to build the mapping of the network to the brain? And here I'm holding out images that weren't seen when building the mapping. I want you to see, if you look at this, that it's not perfect, but it's actually getting a lot of the structure here. So, yeah, it's not a chair neuron either.
It's sort of not anything; it's hard to describe in words, but it's fitting the data quite well. In fact, you can see it's also not perfect: this black here is higher than that red, and that looks lower; you can see this thing is not perfect. Here's the face neuron, and here's the simulated face neuron, or again the "face" neuron in quotes; it's not really a face neuron. You can see it captures the structure quite well again, and I'll show you that there. We quantify this in terms of explained variance, really predictable variance. This is all held-out variance that's reproducible, that is, not due to neural noise, which is irreducible variance in our data. The fraction of reproducible variance that is explained, on average, in these two examples is about 50 percent. This is what 50 percent explained variance looks like: about half of the explainable variance is explained, which is actually quite good. Old models like HMAX and others were around 20 percent, and I'll show you that in a minute. So when I said we had a breakthrough: the models were pathetic, and then there was this huge jump once we started building things this way. That's what I meant by a breakthrough. We were immediately up to over 50 percent of the variance explained by this particular strategy of optimizing networks, then taking out neurons and comparing them with these high-level visual areas. And it's not just the highest level, IT; again, I'll show you all this quantified. But first I want to show you another example. This is V4, the neuron I showed you earlier, and here is the model. Now it gets more interesting. Here's its best fit from the IT layer; you can see it's kind of following, but it's actually not that good. The best fit comes out of the so-called V4 layer.
That's layer 3, which I call its "V4," shown here in red. So this is a combination of layer-3 features fitting the V4 neuron. And I should have said: the V4 data are just copied in the background here, so you see them four times; the red is what differs across these plots. If you eyeball this red, it is actually the best of the four, and that's what I'm trying to show here. That turned out to be true on average, as I'll show in the next slide. Again, even neurons that are hard to explain with human words are reasonably well predicted. I think this is about half the explainable variance; that was the 2013 number. For V4 we were already doing a little better with old models, so the jump was not as big: 30 to 50 instead of 20 to 50. Here's what I just said, now quantified over all sites. This is about 100 V4 sites and 100 IT sites, and this is the median explained-variance fraction. Remember, I just said about 50 percent; that's what you're seeing here, and the top layer of the HMO model was the best at explaining IT, which is maybe not surprising to you, but it didn't have to be that way. If you look at earlier layers, they don't explain IT as well. And here's the V4 layer, and this was especially interesting to us, because for V4 the top layer was not the best; the middle layers were the best, both of them pretty good, and then it falls off again at the first layer. So this is one way we can come up with a match score over all our samples of neurons: an average or median explainable-variance fraction. Here are those control models I mentioned, by the way: HMAX, plus the '09 model search I mentioned from my lab, where we were doing architectural searches.
Those were pretty good, but nowhere near what these networks could produce. And since we have time in this group, I want to highlight another model on here, the category model. This is essentially an attempt at an ideal face-neuron model, or a car-neuron or dog-neuron or chair-neuron model, whatever you want within those categories. I'm showing it because sometimes people look at this and say, well, of course: the network does recognition, and if you do recognition these units should look like IT. But if you try to model IT neurons as face recognizers, basically as outputs of linear classifiers, you don't do nearly as well as when you model them as these hidden units that are one step away from the linear classifiers but aren't quite the classifiers yet. That's what you get if you model them as the categories themselves, in purple here. So those are additional controls for us. The gap between these models and all the others is what I meant when I said this was a breakthrough; in our minds, that gap is the breakthrough, and things have continued to improve since then, which is what I'll show in the rest of the talk. Yes, question?

[A question about why the model-to-brain mapping should be linear.]

Well, the matching between one organism and another should be linear; otherwise it's hard to say what it even means to match. That's the core argument. If I allow nonlinear matching, then everything equals everything else, right? So I need to draw a line somewhere. And linear with respect to what? IT responses are very nonlinear with respect to the pixels, so it depends which stage you ask about. When you say which things am I comparing, these two or these two? Those are nonlinear transforms within the model, so I'm not sure I follow your logic exactly. Relative to what? Well, again, the lower levels don't fit V4 well at all.
Here are the pixels trying to fit V4: a pixel model can explain less than 10 percent of the V4 variance, so V4 is not close to linear on the pixel space. Now, there are interesting questions about what kinds of nonlinearities this metric can be sensitive to, and which ones will hide inside the similarity score; I think those are important questions. But you're focusing on those exact comparisons, and what I obsess about when I look at this plot is that it's 50 percent and not 100 percent. People like to look at this and go, oh, success, breakthrough, and I'm like, well, the glass is half full here. It's only halfway up; it should be 100 percent, because I've already removed the noise variance. We'll come back to this later. Your questions, about whether we're sure this is V4 and so on, are good ones, but to me they're secondary, though interesting; there are different ways to view these data as half full or half empty.

I've got about half an hour left, so I want to keep moving, but maybe one more question.

[A question about recurrence.]

Recurrence, right. Remember, this is a schematic of the real brain; that's all I'm showing you here. And this is just to say: if I take samples out of the brain and compare their explained variance fraction to a model, this is what I get. The model has no recurrence and no feedback; it is feedforward only. I should have said that more clearly. If we back up to the model: see these arrows, feedforward, feedforward, feedforward.
This is a feedforward-only model. It has no time; you can think of it as image in, responses out, for each image independently. Technically, if you implement it, there's some time unrolling, but it's basically a static thing, and it produces one set of numbers per image at every level. So there's no recurrence and no time in this comparison; it's a very basic feedforward model at this point. In the biological network there are recurrences and time dynamics that I'm not modeling. Remember how I got around that: I averaged over time to get one number out of IT, so any time dynamics are collapsed in the way I'm comparing to the data. When I show the neural data, that black dot is the average of the IT neuron's responses over 100 milliseconds; I showed you that yesterday, where I said, here's a bunch of spikes, I take this big chunky time window, and it gives me one number. Tomorrow we'll break down the timescale within IT a bit more and look at the dynamics, which connects naturally to the questions of recurrence. For now, we were just trying to get the models into the space of the neurons' responses as a first-order approximation, ignoring recurrence. Yes?

These are held-out images, so they're cross-validated across images, as I said, and we can cross-validate in different ways: it depends what kind of train/test split you want.
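Under the hood, the mapping step is just a cross-validated linear regression from model features to each recorded neuron. Here's a minimal sketch with synthetic data; the ridge regularizer and the particular split below are assumptions of this toy example, not the exact published recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 200 images, 64 model features, one recorded neuron whose
# response is (by construction) a linear readout of the features plus noise.
n_images, n_feat = 200, 64
X = rng.normal(size=(n_images, n_feat))                # model "neurons" per image
w_true = rng.normal(size=n_feat)
y = X @ w_true + rng.normal(scale=2.0, size=n_images)  # the biological neuron

# Held-out-image cross-validation: fit the map on some images, test on others.
train, test = np.arange(150), np.arange(150, 200)
lam = 10.0                                             # ridge penalty (assumed)
A = X[train].T @ X[train] + lam * np.eye(n_feat)
w_hat = np.linalg.solve(A, X[train].T @ y[train])

pred = X[test] @ w_hat
r = np.corrcoef(pred, y[test])[0, 1]
print(f"held-out prediction r: {r:.2f}")
```

Held-out objects or held-out categories are stricter versions of the same idea: you only change which rows go into the train set versus the test set.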
I have a slide on that next. By the way, everything I show you, any time we make a prediction or report an r, is always cross-validated. The interesting question is cross-validated in exactly what way: how far does the generalization go from what you showed me? In this case, imagine I show you a bunch of images; I might have trained on these and predicted on those, because they're just samples of images. You could say, well, they all have chairs in them, but at the pixel level they're very different. So what's the distance measure we should use? That's what we're cross-validating over here, and I'll show you another slide on that, so the point will come up again.

Let me jump ahead; I think I've given you one of the main punch lines I wanted to give you today. Here's another way of comparing neurons, brains, and models, and I'm bringing it up because you see it a lot in the human fMRI literature, so you should be familiar with it. We can do it with the neural data too. It wasn't our preferred metric to start with, and I think there are pros and cons to these metrics. It's called representational dissimilarity analysis, and Niko Kriegeskorte is the person whose name is most associated with it. He's done a lot of really nice work on it, so if you're interested, I encourage you to read his papers; here's one review paper from his group.
The key idea goes like this. Remember the neural state space I showed you at the beginning of the talk, with the faces and the non-faces. Here it is again: response of neuron one, response of neuron two, response of neuron three; three neurons, because we can visualize a three-dimensional space. Here's an image, schematized as a point showing where it lives in that neural response space, and here are two more, so three images. What's nice about this is you can then think about the distances between them. Somebody yesterday asked about Euclidean distance; maybe that was you? This is where that really matters: how you want to compute the distances between these points. Now imagine a model, with features one, two, and three; I say features, but they could be "neurons," in quotes, from a model. I've drawn these as if the two systems are the same in terms of preserving the distances between the images inside the representation. That's essentially what representational dissimilarity analysis does: it computes the pairwise distances, between images or between categories, and you can do that in various ways. Then you ask how the distance matrix computed here compares with the distance matrix computed there. So you end up with a matrix of distances for each system, and then a distance of distances, which is the similarity judgment. Here are example matrices; this one is monkey IT.
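The RDM recipe just described, pairwise distances within each representation and then a comparison of the two distance matrices, can be sketched like this. Correlation distance and a Pearson comparison are assumptions of this toy version; RSA practitioners use various distance and rank-correlation choices, and all data here are simulated:

```python
import numpy as np

rng = np.random.default_rng(2)

def rdm(responses):
    """Pairwise (1 - Pearson r) distances between image response patterns.
    responses has shape (n_images, n_units)."""
    z = responses - responses.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return 1.0 - z @ z.T

# Toy "brain" and "model": both inherit the same latent image structure,
# expressed through different numbers of noisy units.
n_images = 40
latent = rng.normal(size=(n_images, 10))
brain = latent @ rng.normal(size=(10, 100)) + 0.5 * rng.normal(size=(n_images, 100))
model = latent @ rng.normal(size=(10, 300)) + 0.5 * rng.normal(size=(n_images, 300))

# Compare only the upper triangles (the matrices are symmetric, diagonal is 0).
iu = np.triu_indices(n_images, k=1)
similarity = np.corrcoef(rdm(brain)[iu], rdm(model)[iu])[0, 1]
print(f"RDM similarity: {similarity:.2f}")
```

Note that the comparison never requires matching individual units between the two systems, which is the appeal of the method across fMRI, ECoG, and spiking data.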
This is our IT data run through Kriegeskorte's representational dissimilarity metric; these are distances. A couple of things to look at on these plots. These are a bunch of images of each category, and you see this blocky structure: blue means nearby, red means far in distance (sorry, there's no scale bar). The blocky structure means that animals tend to be near other animals, boats near other boats, cars near cars, faces near faces. That block-diagonal structure means the representation is already set up pretty well to do categorization, which is why you can look at a neuron and call it face-like; there are a lot of neurons roughly like that, and that's what gives you these block-diagonal structures. So the categorization information is reflected there. But there's also all this off-diagonal structure in the distances, and it is reasonably well captured by the model IT units from the HMO model. You can see the two matrices look visually very similar, and we can compute a correlation between them, which is what's plotted over here. Here's HMO, and here in gray are all those other models I showed you earlier; again, there was a big jump. And this line is the maximum possible similarity given this comparison metric, again estimated by simulation to remove the noise. So this is another way of comparing one representation with another, called representational dissimilarity.

Kriegeskorte, Aude Oliva, Jitendra Malik, Justin Gardner, and others have been using fMRI in humans to make these comparisons, and I've also seen groups doing this with measurements like ECoG, using RDMs to compare deep neural networks with these other types of brain measurements.

To the question about generalization that you asked: this is a
version of that. This is what these RDMs look like when you build the mapping. I should have mentioned how we build it: these are the simulated HMO IT neurons once they've been mapped to the brain, and we can do that mapping using held-out categories (at the bottom), held-out objects (in the middle), or held-out images (which is what I was showing you at the beginning). These are different types of train/test splits; I think this was your question. You can see that category generalization is the hardest one, and it performs worst, the lowest on the plot, but it's still way better than all the other models. This is maybe for the aficionados of how to do the train/test split, but I can point you to the paper, or we can talk offline.

Okay, so I've shown you the result: you train a network on a task, you get neurons in it, and they look like the brain to some degree. Now, for the last half hour or so, here's the fun discussion to have: what does this all mean? Where does it take us? What should we do next? I don't want this to be about HMO; it's about the bigger picture of how this gives us understanding of what's going on in the
individual visual system, or any other system where you want to apply this approach. I see this approach now being applied in motor systems and elsewhere: people optimize neural networks to perform a task and then compare the internal units with the units they measure, and that approach is having success in lots of domains within neuroscience. We showed it in vision because vision had been on that path for many decades and was well set up to succeed, but now it's being applied more broadly. So there are lots of good conceptual questions in there, whatever system you work on.

Let me show you the take-home message from the HMO work that we'd really like people to remember. It's not so much about HMO as about this broader point: performance optimization versus internal model matching. This is how we got to HMO, in fact. Backing up to the history: remember I said that every time you have a model like that, it's really a family of models, and you have to pick the parameters within that CNN family. A random setting of parameters may or may not give a high-performing network, and each dot here is a sample with randomly chosen parameters within the family we were exploring at the time. The black points are examples of existing models; I mentioned HMAX, plus the model we built with that GPU-based optimization I referred to, and these other models existed as control references for us. On this axis is performance on that invariant object recognition task that I said we thought was important for this part of the brain. So this axis is a behavior-level measurement: just how good are you, in an absolute sense, not whether you match the behavioral patterns.
Just how good are you. And this axis is a measure of functional fidelity to the primate brain of the last hidden layer: essentially the median explained variance once you do a mapping to IT, the measure I just showed you at 50 percent.

Maybe I'll back up, because this is a good discussion: when you see this plot, what does it tell you? Are the dots correlated or not? Okay, there's a correlation between this axis and this one. So if I'm a neuroscientist and I care about this one, I see what I could do: don't optimize for this, optimize for that, and this comes for free. And that was actually the trick, because we don't have a lot of IT data to fit parameters on, but we can get images all day long, run tasks, and search computer vision databases to just do well on tasks, and then for free we get some neuroscientific explanatory power. So what we did was optimize the network, which I called evolution in the computer. You can think of it as development if you like; something like evolution or development, probably not exactly biology's version, but it gives you the sense that we're searching some family of architectures to evolve a system. We got to a system we called HMO, which we first had in 2012 and first published in 2013, and which explained about 50 percent of the explainable variance. That was the strategy. So there's an AI goal, what computer vision and machine learning want to do, and there's what neuroscience wants to do, and look: our goals are now connected. One drives the other; we do this, and that comes out. That's the big-picture view you might want to take from this.

But if you're a neuroscientist, it also leads to thinking, well, maybe we should just
stop building any models; we should just wait until this keeps going, and then we'll get an even better model. Someone else will do it for us. Because this is not what our field of neuroscience classically does: we're supposed to be asking whether things look like the brain, or measuring the brain. What do you think, do we just sit back? Does that sound right? That's what this plot suggests, right? But you also have to keep in mind that we had already put some neuroscience into the game; we put the models in the right space to make this work. If you think about it that way, people are now optimizing off into wherever, and there's no guarantee that will keep helping us; they're excited about working with networks because it worked for them once. There's a happy medium, where we watch what's happening in those fields and ask how brain-like the models are becoming as they continue to evolve. That's what I'm going to show you next, but this is the big-picture slide. And right around this time is when AlexNet came out.
Who here has heard of AlexNet? Only a few people, okay. AlexNet was the breakthrough; it was the model that put neural network models back on the map in computer vision. For a long time, computer vision folks, when I talked to them at conferences, would be doing their probability thing, their Bayesian version of X, Y, or Z, and there were just a few groups on the network side. I'd go to the meetings and Yann LeCun would be in the corner with his gradient-descent network, and I'd say, Yann, this seems like the right direction, because it looks like the brain. But there'd be no one at the poster, and he'd say, yeah, but it doesn't work as well as the current thing. Computer vision had these competitions of how good you are at doing things, and ImageNet was one of the benchmark challenges: a categorization task with a thousand categories, is it a dog, is it a cat, some of them very weird and not human-derived. It's on the web; you can go look at it. It's cool, it has over a million images, and they ran competitions so everyone could see which models were doing well. In 2010, here's a bunch of traditional computer vision models and their error rates; lower is better here. Then in 2012 you suddenly see this dot that blew away all the other models, a big jump in performance, moving down this far relative to all those models. Suddenly everybody took notice.
Oh, there's a model doing so well; what are those guys doing? Those guys were Alex Krizhevsky and colleagues in Geoff Hinton's lab. They were basically using GPUs to train up deep neural networks that were roughly brain-like, roughly in the same constraint space; doing the same thing we were trying to do with HMO, but for an actual computer vision competition rather than to fit the brain. They won that competition in 2012, and afterwards there were a couple of blue dots left, but the whole field of computer vision quickly decided this was the way to go: we're all going to build with neural networks. And the error rate has basically just kept going down; the old approaches have died out and been replaced by CNN approaches of all stripes and colors, and I'll show you some of those in a minute. So that was the big breakthrough, around 2012. There were articles in the New York Times, about deep learning broadly, not just deep networks for vision, "inspired by theories about how the brain recognizes patterns." That's essentially the background I've been giving you.

The way I like to think of it is this: some of us in brain and cognitive sciences were trying to follow this track of models, Fukushima, Grossberg, Allman, and the HMAX models we were building, just trying to build models of how the brain works; that was our goal. Computer vision had a bunch of threads going on, and it had this one thread that was languishing for a while; smart people were working on it, but it wasn't performing as well. Then, around 2012, it suddenly started to perform well, and this is when these fields,
I think, really strongly converged. What's cool is that this has continued: those older threads have died out, computer vision is now mostly pulling on these neural-network-style threads, and there are all these newer models, VGG, ResNet, Xception, the list goes on and on. That means these are hypotheses for us. We can ask how these network models compare to the brain, because they're built with neural-like hardware, so I can compare them with what we actually measure in the brain. These models are higher performing; they were higher performing than the HMO model we had built. If you remember my setup, I said: build a model that is a neural network and performs well, and if you can do that, then look at its internals and they look like the brain. These groups were doing exactly that, and it's nice because they didn't care about neurons; they were just building systems to perform well on the task. So there's no overfitting to our data or anything like that; you might have wondered about that before, since our lab was both building the model and comparing it to the data. Now other labs build the models, which are neural networks, and we can compare them with the brain.

So I'm going to show you how those models do when I ask how their neurons compare to the brain in IT. Here's the AI goal again, and now I'm changing this axis to ImageNet validation performance, because that's a computer vision benchmark, but it's essentially the same thing I had on that correlation plot before. Here's the neuroscience goal, in this case fitting IT. Here are some models we later developed in our lab as a simplified class.
It's not HMO anymore. Here's the HMO level; ignore the absolute numbers, because they're not normalized the same way, just look at the relative levels. Here's HMO, here are a bunch of other models in our class, and here's AlexNet. First thing: AlexNet was better than the HMO model at fitting our data, at predicting our IT responses, which we published back in these papers if you're interested. So, AlexNet: a computer vision model was also fitting IT, and that continues the correlation I've been describing; the AI field was doing our work for us a bit. But look more recently: this may be plateauing, maybe even going down. These more recent models keep gaining in task performance and winning the ImageNet competitions, but the correlation seems to be breaking with these newer models, which you'd expect to happen at some point. At best it's flat with respect to fitting IT; it may be stuck right now with the style of models people are building. Okay, does that make sense? Are you all with me? All right. I think this is really relevant to the intersection this course is about.

Way back at the beginning of yesterday, I said, hey, there's this behavioral data, and I gave you the punchline early: these deep neural networks met that behavioral barrier. It doesn't look exactly the same; you can see there's a little more blue here than over here, but statistically these are very hard to tell apart. At these behavioral scores, the models started to look behaviorally a lot like humans and monkeys. And this relates to your question about putting linear classifiers on them.
This is sort of a version of that. So the models are meeting this behavioral benchmark: they're not quite explaining IT, but they're in the game behaviorally.

One of the things we've done since then, and now we're getting into more modern work after that bit of history, is to look more precisely at the behavior; this came up yesterday too. We collect a lot more behavior, from humans on Mechanical Turk and from monkeys in their home cages doing the task I showed you yesterday. These are actually older numbers; we're at millions of trials now, so we can look at much higher behavioral resolution. These are 24 object categories, and now I'm showing you individual images, 2400 images. I can look at the probabilities: if everybody were perfect, all the yellow would be on these diagonal blocks. So you can look inside a diagonal block and see that some images of a wrench, for example, are more blue; those are images where mistakes are made more often, and we can detect that reliably. This is the probability score, and similarly, sometimes images of a hammer are incorrectly called a wrench; those are these yellow bars on the plot. I'm showing you this so you get the intuition that we can now make image-by-image measurements of what's going on. With me? Okay. Why are we doing this? Because, as an experimentalist, I want to find out where these models are broken, since at this coarse behavioral level they don't look broken. And we can measure this on the ANNs. Here are humans and monkeys, which we discussed yesterday, now at the image grain. This is a different color scheme, so don't worry about the exact colors.
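The image-by-image behavioral matrix works roughly like this sketch: count each report per image, convert to probabilities, and read per-image accuracy off the "diagonal." All numbers here are simulated, and the "hard image" rule below is an arbitrary stand-in for real behavioral variation:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated trials: 4 categories, 10 images each, 50 reports per image.
n_cats, imgs_per_cat, trials_per_img = 4, 10, 50
n_images = n_cats * imgs_per_cat
true_cat = np.repeat(np.arange(n_cats), imgs_per_cat)

counts = np.zeros((n_images, n_cats), dtype=int)
for i in range(n_images):
    p = np.full(n_cats, 0.05 / (n_cats - 1))       # confusion probabilities
    p[true_cat[i]] = 0.95                           # mostly correct...
    if i % 7 == 0:                                  # ...except a few "hard" images
        p = np.full(n_cats, 0.5 / (n_cats - 1))
        p[true_cat[i]] = 0.5
    counts[i] = rng.multinomial(trials_per_img, p)

# Row i of p_report is P(reported category | image i); the entry at the
# image's true category is its per-image accuracy.
p_report = counts / trials_per_img
per_image_acc = p_report[np.arange(n_images), true_cat]
print(f"accuracy range across images: {per_image_acc.min():.2f} to {per_image_acc.max():.2f}")
```

Comparing two systems at this grain means correlating their per-image accuracy (or confusion) patterns, which is a much stricter test than matching overall percent correct.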
All I want you to see, if you eyeball these, is that the fine structure here is still very, very similar between humans and monkeys, even at the image grain. We can't tell yet whether it's perfectly the same, but it's really close. In fact, you can easily see, even by eye, that for the current deep neural networks, take HMO, take any of those networks, something funny is going on: they get some images right that the primates don't, and they get a lot of images wrong that the primates get right. So something is still broken when you look at the image-level comparison; the models aren't done yet, and we can quantify that. There's a gap here. The models have gotten better, this axis is time, there's AlexNet, here are some of the newer models, but they're still not quite at the human and primate level.

So here's a big-picture summary of what I've been trying to tell you today. You have these deep neural networks with simulated ITs and V4s and so forth; you have the real brain; and you have a match at IT of about 50 percent. For V4 we've gotten a little higher than 50 percent, and other groups, mostly Andreas Tolias's and Matthias Bethge's, have made comparisons in V1 and found quite good matches there as well, with the lower levels of these networks, even networks optimized for recognition tasks. And then behavior:
I showed you that the match is about 80 percent; take that as an approximate number, but you can still see there are problems in there, as shown in this paper if you'd like to see it. So that's where we are at the moment.

Here's a summary of what I want to leave you with today. One part of general visual recognition intelligence is core recognition, our setup task; it's not all of vision, but it is one important task. What I've tried to tell you through most of today is that several deep ANNs, starting with HMO, but also AlexNet and some of the more recent ANNs, behave both internally, in their neural units, and externally, in their behavior, far more like the primate brain than previous artificial systems did. I showed you some quantification of that, the internal comparisons today and some of the external ones starting yesterday and briefly here. This is really the intersection that I think this course is about, between ideas from brain science and ideas from AI, and this is a version of the slide I showed you very early yesterday. I won't read it to you, but it says that the combination of these two things coming together is what enabled this progress. So, to summarize where our subfield is right now: we've achieved, let's call it, approximately 50 percent understanding of the initial ventral visual stream processing, just the first couple hundred milliseconds, not the learning, not the spatial layout of how the neurons sit within the brain, which I haven't talked about at all, and of one of its core intelligent abilities, core recognition. This is a half-full, half-empty story in my mind, because this glass used to be down there and now we've filled it up to here. So again, we should not lose
sight of the fact that this has been a dramatic increase in how well we can explain the brain. But no deep ANN passes all our tests; even our current experimental data can rule these models out, we can falsify all existing models, in the sense that none is a copy of what's actually going on in the brain, at least at these levels of functional fidelity.

So here's a fun discussion to have in the last ten minutes: there are two views of how we go forward. One is that ANNs are fundamentally flawed as hypotheses of brain or visual function. The other is: no, the correct ANN is out there somewhere, we just haven't found the right one yet; and I've told you one way we've gone about looking. Let's take hands. Who thinks they're fundamentally flawed? Okay, a few. Who thinks we just need to find the right one? Maybe a bit more. Everybody else doesn't care, I guess. This is kind of a trick question; I set this up yesterday too. Notice I didn't say AlexNet, specifically, is fundamentally flawed, even though in a sense that's true. What I'm basically saying is that there is some way to simulate the brain in a neural network, I would presume, and I would call anything like that an ANN. So to me it has to be the second view: the question is just what details we have to put in and how we have to set the parameters. But I realize some people would say, well, Jim, I thought when you said ANN you meant convolutional with a certain type of learning, and if you put in those details, then sure, maybe you could say those are fundamentally flawed.
But my view is there's got to be some way, some artificial neural network, that approximates what's going on inside the biological neural network. Because if that's not true, then the central dogma of neuroscience, that we somehow compute with neural networks to do all our fancy behavior, is wrong. So there should be some artificial way of simulating that, at least to the levels of accuracy that we can measure experimentally. That's the way I would prefer to think about it, and I hope you will too: we just need to find the right parameter sets, and how we do that is still an open question.

So let me leave you today with one idea we've been pursuing. It's not really an idea so much as an attempt to move us, as a field, into the modern age, which is really what I've been describing all along. I want one unified ANN that predicts many things. I don't want one ANN that predicts IT and a different one that predicts V4 and different ones for other things; I want essentially a copy of what's going on across this whole system. That's what success looks like to me: it should be 100, 100, 100, 100 all across the board, plus other things that maybe aren't even on the board yet. That would be a fully functionally faithful model of at least this part of the brain and the behavior it supports. And we can quantify this right now, as I did for you: how good is the current model at V4, at IT, at behavior, the things I've shown you today, plus a whole list of other things we could measure. How can we use those scores to guide discovery of better ANNs? The short answer is I don't know, but I think my responsibility is at least to provide them as gradients for the engineers searching the family.
So I should at least be able to give you a score: if you had a model, how good is it relative to other models, by some integrated score of all these things? And we've taken that idea literally. Jonas Kubilius, a postdoc in the lab, started something that we call Brain-Score. We have a little arXiv paper, and there's a website, which basically exposes all these metrics. If you have a neural network model and you want it scored, we're trying to automate all of this; right now you can send it in or go to the site and they'll score it for you, there's a leaderboard and so forth, and we're getting data from other groups on there. We have Andreas Tolias's data, Tony Movshon's data from V2, and we're trying to expand the kinds of data and metrics we have. So we're taking the idea literally, to at least provide the community with a score of where it is. And I'll show you, big picture again, where current neural networks land on this Brain-Score. Again, this is all in that paper. Here's the same plot I showed you before, but now I'm not just showing IT, I'm showing a combined average of V4, IT, and behavior as one big score; we're equally weighting them in this average. And you can see AlexNet was up there, and even though I said models had flattened out, if you look at them overall they're continuing to increase. There was a nice run from about 2012 to today, but if you zoom in, it might again be kind of plateauing, as I said earlier in the talk. So it might need something other than a performance drive to keep this thing moving up on the overall Brain-Score fit. And if you look at the guts of it, here's the behavior component: matching the behaviors is mostly what's been driving this, while IT and V4 have kind of flattened. This is the part I showed you, the part that's kind of drifting down in IT.
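As a rough illustration of the integrated score just described, and this is only a sketch, not the actual Brain-Score implementation, with made-up benchmark names and numbers, an equally weighted average over benchmarks and a leaderboard might look like:

```python
# Sketch of an equally weighted, integrated "brain score" across benchmarks.
# All benchmark names and values are hypothetical, for illustration only.

def integrated_score(benchmark_scores):
    """Average per-benchmark scores (each assumed normalized to [0, 1]) with equal weight."""
    values = list(benchmark_scores.values())
    return sum(values) / len(values)

# Hypothetical per-model scores on three benchmarks (V4 fit, IT fit, behavior match).
models = {
    "model_a": {"V4": 0.60, "IT": 0.55, "behavior": 0.70},
    "model_b": {"V4": 0.58, "IT": 0.60, "behavior": 0.62},
}

# Build a leaderboard sorted by integrated score, best first.
leaderboard = sorted(
    ((name, integrated_score(scores)) for name, scores in models.items()),
    key=lambda item: item[1],
    reverse=True,
)
```

The point of equal weighting is that no single benchmark, e.g. behavior alone, can carry a model to the top; a model that is strong on behavior but weak on the neural fits gets pulled down accordingly.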
So the models are continuing to do better at predicting monkey and human behavior, but their internals have stalled: we got a run out of the image-performance trick that we used in the Yamins work I described, but it's not clear that run has more juice in it. It needs more constraints from neuroscience to bring these up to the levels that we think are probably possible in these models. So that's about where I wanted to end, because I think it's a good discussion point. We have five minutes, but if people have questions about the earlier part of the talk, I'm happy to rewind, or we can discuss this and its forward-looking directions. Yes, so one there and then one there. Yes, go ahead. This one or the Brain-Score? Okay, "matching" in quotes, because you've got to make an approximation; you're not going to get a perfectly biomimetic copy. Do you think that was built into the models I showed? Would you call them modular, like the networks I was drawing on the board, where they have these areas? Well, I would say we're trying to simulate the whole ventral stream, and then we compare individual areas as marker points along the way. Yeah, so that's a good question. You're asking, well, one version of this goes: somebody has a neural network model, there are some neurons in it, and I say, hey, they don't look like IT neurons, so your model is wrong. And I won't say the model is wrong; I'll just say it doesn't look like IT. It could be that it looks like LIP, where I didn't record. I'm not trying to reject that model; my job is to find a model of the ventral stream. So I take your point, although I hope you don't think I'm talking about something modular. In the neuroscience community, I'm like the opposite of modular. If this reads as a modular talk, I could point you to other people who would give you a really modular talk.
So these networks are just connected sets of things to me, and the modules are just handles: there's a place called IT that we can reliably go to in every monkey and say, I'm going to record over there. But that's just a reference point; it's not meant to say it's a modular function in some strong sense. But I think you're making the point about a ventral-stream module as opposed to all of vision, and I am a ventral-stream guy in this whole setup. Is that your point? Oh yeah, right. So what you're asking, I think, is: maybe there's some other part of IT that's not explained, because if you keep optimizing for core recognition tasks, you're never going to get all of IT, since IT supports more than core recognition. And I totally agree with that point. So if we map this back to those three knobs from machine learning, architecture, task, and optimizer, in my mind you're pushing on task, right? We said categorization. You're saying, well, why not categorization plus something else, or maybe just something else? Those are the terms of the hypothesis space we're now engaged with in this discussion, and I think they're the useful terms: change the task, see if you can align the system better with the brain, and then it would show up on our Brain-Score as, oh, suddenly it looks better. Maybe you need to mix your tasks, but if you were able to execute on that, we're trying to be ready to show you that you're actually on the right track by these kinds of measures. That's what I view my job as. So I'm encouraging you in that direction, and Dan Yamins and others are moving in those kinds of directions, and that is a great direction.
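The "mix your tasks" move on the task axis can be sketched as a weighted combination of per-task training objectives. Everything here is hypothetical, the task names and the weights alike; it's just to make the architecture/task/optimizer terms concrete:

```python
# Sketch: exploring the "task" axis of the architecture / task / optimizer
# hypothesis space. Mixing categorization with another task amounts to a
# weighted sum of per-task losses; tasks and weights below are hypothetical.

def mixed_task_loss(losses, weights):
    """Combine per-task scalar losses into one training objective."""
    assert set(losses) == set(weights), "each task needs a weight"
    return sum(weights[task] * losses[task] for task in losses)

# Pure categorization, the objective used in the work described above:
only_categorization = mixed_task_loss(
    {"categorization": 2.0}, {"categorization": 1.0}
)

# Categorization plus a second, hypothetical task, weighted 70/30:
mixed = mixed_task_loss(
    {"categorization": 2.0, "other_task": 1.5},
    {"categorization": 0.7, "other_task": 0.3},
)
```

Each choice of task mixture is a different point in the hypothesis space; the proposal in the discussion is to optimize under such a mixture and then check whether the resulting network scores better on the brain measures.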
I mean, again, this is where it feels exciting to me, because you're talking in machine-learning terms now but actually intersecting with the neuroscience. You're talking about task, and that's top-down, and that is not the way neuroscientists typically approach this problem. I think it's the way they need to approach it, and I'm the preacher to that field that this is how it should be done. And trust me, maybe you're receptive to this, but there's a lot of pushback in our field as to whether this even counts as a way you're supposed to do science in neuroscience. Maybe we could talk about that. Yeah. [Audience question, partly inaudible, about what lasting knowledge these models give beyond the fits themselves.] Right. So I think one translation of your question, which is a good one, is: what's the form of generalizable knowledge we hope to take from these approaches? And frankly, I'm not looking for generalizable knowledge. I'm looking for a copy of the ventral stream that I can do things with, like brain-machine interfaces, or some of the things I'll show you tomorrow. This is where I am more of a biologist than a machine-learning person. I'm not really a machine learner by training; I'm an engineer, and biology is full of hardware-specific examples that we need to sort out if we want to fix them. Maybe it's an old processor, but we'd better know how that 8086 works if we want to fix it. It's kind of like that. I'm not necessarily looking for knowledge out of this that generalizes to, say, memory. If there's something generalizable, it's the performance-driven approach. So you could step back and say: let's change the architecture to make it more like the memory systems of the brain, performance-optimize again, check the neurons using all the mapping machinery those folks have already worked out, and replay the whole approach on a new problem.
But for me, in my lab, it's more about how I get something that is most like a ventral stream, even if it doesn't generalize. Maybe it'll generalize some ideas to the other sensory systems, but for memory that may be a stretch, and I'm not presuming we would get such a thing out of this approach. I hope I didn't convey that idea, other than the performance-driven idea itself. So I admit this isn't going there, but I still think it's important: if you want to understand how vision works, you need an engineered version of what's going on in the visual system, even if it only applies to the visual system. I just want to make sure you understand where I stand. That doesn't mean it's right or wrong; I'm just giving you my perspective. Exactly, and that's the thing: when you look at the brain and ask Christian about memory, you see a different architecture. The hippocampal architecture doesn't look like the architectures I've been using here, for instance. And I didn't put in basal ganglia and the rest; those were forward-looking questions about other brain systems and other problems. Vision is in a more mature state, so I can do this here, but translating it will require restarting with the architectures of those other systems, exactly as you say. Yeah, we're kind of over time, but maybe, Davide, you want one more, and then we'll wrap. [Audience question, partly inaudible. The gist: the model may explain and predict the variance, but in what sense is that understanding? You can explain V1 to your grandmother, or to a nine-year-old; can you explain this?]
[The question continues, partly inaudible: what do we actually learn from such a model, and what does it even mean to understand it?] Right, so this is a great question. This is the discussion we want to have, and we'll have it tomorrow. But the brief version is: what is understanding for? One thing it's for is communicating with other human beings. That was your V1 example: tell my grandmother. But that is only one of the things understanding is for, and this approach is not giving you that, other than to say "performance optimized", which is not a satisfying answer for most grandmothers, I would say. It is, though, actually the most compressed version of the understanding: it says performance-optimized within your architectural space, and you should be happy with that. Vision scientists aren't happy with that. But the other things models are for are acting as foundations for other experiments, and this certainly gives you that. It also gives you things like predictions for neural control, which I'll show you tomorrow. It gives you setups for BMI. It gives you the things engineers actually need, and it's reproducible, downloadable code, and falsifiable, as models should be. It's not communicable to your grandmother, and that is exactly the tension I mentioned at the beginning: people are unhappy because they thought that's what they were supposed to produce.
And I think there are lessons from physics and other fields: when you get to the edge of a science and turn it into a real science, this always happens to some degree. We're moving from a heuristic science, where we communicate in words, to one where we communicate in models and downloadable code, and that's what we're going through right now. But in the long run, you know, the same colleagues who say this isn't understanding show up six months or a year later saying, look, I've discovered that V4 neurons are fit by a deep neural network. It's just cast in a different language. So even if they don't like it, they're all using it, right? And if you talk to older scientists, they complain; younger scientists are all on board. So it's a matter of, you know, science proceeding one professor's retirement at a time. The field will naturally move this way; I already see it happening. But I will return to this question tomorrow, because I think it is a good discussion about science in general and what models are for. But let me let you go, and thank you, and we'll talk tomorrow. Thanks.