So, Daniel, it's up to you.

Thank you, Laurent. So I want to thank you and Olivia for being at the origin of all these projects to do topos theory with artificial neural networks. I want to thank my collaborator, Jean-Claude Belfiore. And another source of my interest in toposes, coming after this work about information, of which I will speak a bit, is the work with Alain Berthoz in neuroscience, trying to find a kind of internal geometry in the brain; and we propose that these geometries are organized like a stack. So we hope there are also applications to natural neural networks, not only artificial ones. The other acknowledgements are written in the text. This is not the good reference; it was yesterday that Jean-Claude gave the good reference for this paper, which contains everything I will explain, with the text.

So the first goal was to find the semantics of a deep neural network, the DNN, and Jean-Claude has explained where. I will skip some slides, because they were already presented by Jean-Claude, to focus on a different topic, which goes in the same direction: we now include, in this framework of toposes and AI, the possibility to combine modules in a network, and to make networks speak with one another.

So we saw how a graph gives a category, a site with a topology, by adding these arrows; I will not repeat the construction. But in fact, I want to add something: I will work mostly in another setting, because what matters is the topos, and as Olivia explained to us, there are many sites giving the same topos; in fact we could choose different sites for that. And in this finite case, of a site without non-trivial automorphisms, we can in fact throw out all the topology; this was remarked by Joel Friedman some years ago. And it depends on a theorem of Caramello, which gives a necessary and sufficient condition for a topos to be a topos of presheaves over a category. And here we have such a specific presheaf topos: a category of presheaves over a poset, in fact.

I don't want to explain all this structure in detail, but perhaps we look at this kind of example, the better one, perhaps this one. It is a representation of the graph of the cells which are in an LSTM, and I will come back to it at the end of the talk, because in some sense these cells play a very interesting role in carrying semantic content, and we will try to understand why. They matter not only for the problem of vanishing gradients: they continue to be used now, decades after their invention, in networks, each time, for example, you have to translate a text. It is known that networks mainly learn the statistical properties of the text, so in some sense this says that this kind of cell has at least a syntactic capability, which is not purely statistical. And so you see the ordinary graph on the left, and now I draw the poset here on the right. What I added, in some sense, is this red half-cross, which makes this point, which looked like an ordinary vertex, correspond to a second kind of input. So you have two kinds of input: you have the input which comes from outside, for example X_t here is data coming at time t, X_{t+1} is new data coming at time t+1. You see the arrows are in the reverse direction of the information flow, because we look at contravariant functors, for many good reasons; but here, in the finite setting, this is not so important. And you see that, in some sense, these internal parts are also sending information; that is the lesson. Dynamically, they would be neutral in an ordinary neural network.
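As a minimal sketch of this setting (with notation close to the paper's, not read off the slides): a state presheaf on the poset $\mathcal{C}$ built from the network is a contravariant functor

$$X \colon \mathcal{C}^{\mathrm{op}} \longrightarrow \mathbf{Set},$$

assigning to each point of the poset — a layer, an external input $X_t$, or an internal cell — a set of possible activities, and to each arrow, which points against the feed-forward flow, a restriction map. In this picture the memory cells sit on the same footing as the external inputs $X_t$, $X_{t+1}$, which is exactly why spontaneous activity at these points is not excluded.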
But in the future, we can expect that some kind of spontaneous activity can come from these points, because they are similar to the inputs. And after that, you have an output for the processing network too. And the output here, the H_t, is the output of all the steps: it is a memory for the future. C_t also is part of the memory for the future: it starts here, and it goes along.

And so we work, in some sense, with presheaves. The activity is represented by a presheaf, the dynamics, and the weights are also represented by a presheaf. And in some sense the first is fibered over the second, because of the specific transfer of information: the presheaf W forgets the weights when passing to the next layer, which corresponds well to what happens in backpropagation learning. But then we just have to remember that we have transformed the graph into a poset, and we have a dynamical object on this poset, in the topos of presheaves on this poset. And it is a kind of fibered object over the weights: for each system of weights, we have one dynamic. But this dynamic is not really a usual dynamic, because, at these points, we have this possibility of spontaneous activity that we do not choose at this moment. In some sense, the dynamic is given by the sections. And the sections, with the hypotheses we make, are totally characterized by the data given in the input layers: the data determine the section. So the dynamic, in this sense, is a section of a presheaf. And here, in equation one, you have the characterization of this presheaf, but perhaps it is much too long to describe, and it was already described by Jean-Claude. So look only at equation two here. It describes the full dynamic: you have the state, indexed by r, and you have the weights, all the system of weights coming into the network; you apply only part of these weights to the state, and at the same time you forget this part of the weights, because you are only interested in the weights which will come after.

And here is a detail which is not very, very interesting — you can look at the paper. Theorem one is that backpropagation can be represented here as a flow of natural transformations of the object of points. And of course, what is interesting in practice is more lemma one, which explains the success of this backpropagation algorithm, because you have an explicit sigma-pi formula to write the gradient. So it gives access to the gradient. After that, it is another matter, of course, to integrate the gradient, because it is known that in most applications you have many, many local attractors: it is a very non-convex function.

And here is a schematic view of the functioning that we will keep in mind. With this analogy, in some sense, we want to pursue the idea that this is a kind of dynamical system with parameters, which are given by the weights. And it is not only the learning which is important: in fact, the learning really depends on the data and on the question asked at the output. This influence of the input data and the output means that the weights will change depending on the kind of input and output you want. And you certainly have a kind of catastrophe-theory study to perform, to understand how the dynamics depend on the weights. For example, in the experiment that Jean-Claude presented, we saw something that we really did not expect.
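To fix ideas on this sigma-pi formula: it is the chain-rule expansion of the gradient over the directed paths of the network, which is what I take the lemma to make explicit in our presheaf setting (the notation below is generic, not the paper's):

$$\frac{\partial \mathcal{L}}{\partial w_e} \;=\; \frac{\partial x_v}{\partial w_e}\;\sum_{\gamma\,:\,v \,\rightsquigarrow\, \mathrm{out}}\ \ \prod_{(u \to u') \in \gamma} \frac{\partial x_{u'}}{\partial x_u},$$

a sum (sigma) over all paths $\gamma$ from the vertex $v$ carrying the weight $w_e$ to the output, of products (pi) of the local derivatives along each path. This is what gives concrete access to the gradient; integrating it is the hard part, as I said, because the loss is so non-convex.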
With two layers, for example, on a simple problem, the history of the learning is very spectacular: you have an approximation by harmonics of the shape of the formula — because we were looking for formulas — and at some moment you have a bifurcation, and you go to a new attractor, which makes logic. So in some sense, this says that all this structure of bifurcations has to be understood. And here is the representation of what I said about the dynamics: a dynamic which is given by sections, which is the zeroth level of cohomology. And here, sigma denotes this bifurcation set, which has to be understood. Of course, at this moment, there are very few analytical results; for example, why do you shift from formulas to logic? Certainly very hard from the analytical point of view.

So the second goal was to try to explain why DNNs do better than an approximation of functions by interpolation. It was a subject of Jean-Claude's talk too: in many interesting cases, you have restrictions of structure on the functioning and the learning. And these come from geometry or semantics, which is extracted or expected from the data, or from the problem that the network has to solve at the output. And this internal structure that Jean-Claude discussed is given now not by a presheaf but, via the Grothendieck construction, by a contravariant functor to some category: for example, the category of groups, or the category of groupoids. And this gives a stack. And the hypothesis is that the dynamical object, and all the objects that we presented, lift to this stack and become objects of the topos over the stack. And we go, in this sense, to the classifying topos. And we join here the notion, now studied by Olivia, of the relative topos. In some sense it is another name for this situation, but she is looking very carefully now at its logical interpretation, in terms of, for example, first-order or second-order logic. And we started with that.

So before that, perhaps, I show you this, because we mentioned several times this kind of Fourier analysis made by the CNN. Here you have an idea of the architectures which are used today. You see there really is no longer the shape of a fully connected network: you have several parallel views which come together at some point. And in the first layers — after that it is very difficult to understand what happens — you see this very nice wavelet-like kind of analysis, or opponent analysis of colors, which probably constructs a three-dimensional space of basic colors. And here I present the system used for translation, because it will be another reason to look at this invariant structure; you see the LSTM chain at the bottom right — that is what is original. And this is the main theme I will talk about at the end: the recent use of networks — starting around 2015, and now continuing and exploding — combining language and image analysis, with CNNs, to answer questions about a scene. It is not only how to detect the cat, which could be done by a simple CNN, but to combine many things and to try to have a description. And now people are able to have a dynamical description, in time, of what happens in movies, not only in images. So this is a part of the expression of these internal structures, which are very far from the fully connected network which is used to prove the theorem of approximation. And so this is what I said.
The DNNs that analyze images today, for instance in object detection, have several channels of convolution maps, max pooling, fully connected maps, which are joined together to take a decision. It looks like this structure is there to implement translation invariance locally, as happens in the primary visual area in the brains of animals: you have, in some sense, many copies of the translations of the image. This was discovered by the neuroscientists Hubel and Wiesel a long time ago, in the fifties. And the experiments showed that, to a first approximation, the receptive fields are translated copies of one another. But after that, you certainly develop more and more invariances, by using not only the architecture but also the experience; other learning rules were used.

And now I come to this presentation of the stack. It is given by a contravariant functor F from the poset C of the DNN to the category of categories, for example. And the fibration is reconstructed by this Grothendieck construction, which is described by the formula: you construct the morphisms in the category ∫F, which are given through the contravariant functor F. And now — this is also the interest of Olivia and Jean-Claude — the presheaves over ∫F, which give the classifying topos, are also described by local presheaves in each fiber, related by natural transformations. And among these natural transformations you have very important ones, which are the two adjoints of F(α)*. This F(α) is the map from the category of one layer to the category associated with the layer just before, from the dynamical point of view. And composition with F(α) gives you a functor between the toposes of the fibers; and this functor has two adjoints, one left adjoint and one right adjoint. And from the left adjoint, the Kan extension, you have a map which goes from the classifying object of the fiber before to that of the next fiber, closer to the output, coming back by pullback. And this defines an object of the classifying topos, with some kind of integration, written as an integral by Grothendieck; this is the way you construct the logic from the local logics, the logic in each fiber.

But there is a very important point to explain. This is not sufficient, in general, to transport the theories. Because, in some sense, the moral of the semantic functioning that we expect is this: we are describing the structure which we expect the dynamics can learn, through the kind of constraints we give — for example, in a CNN we impose that the maps from states to states are given by convolution matrices composed with non-linearities, okay. But this is only the hope that the real network realizes it, okay. And this hope cannot be satisfied if, at the theoretical level, you cannot transfer the questions from the output to the inside of the network, and the theories, constructed by the network working in the layers, toward the output — that is, toward the questions it has to answer. So in this case, two things are important — probably not necessary, but sufficient: that the two morphisms here, the pullback and its right adjoint, which is more difficult to describe because it is already some kind of cohomology, are both geometric and open. And for that we could follow Olivia, or also Mac Lane and Moerdijk, to extract this — not necessary, but sufficient — condition for a good theoretical functioning.
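Let me recall the construction in one formula, in its standard form (this is the usual Grothendieck construction; only the notation is mine): given the stack $F \colon \mathcal{C}^{\mathrm{op}} \to \mathbf{Cat}$, the category $\int F$ has as objects the pairs $(U, \xi)$ with $\xi \in F(U)$, and as morphisms

$$(U, \xi) \to (U', \xi') \;=\; \big(\alpha \colon U \to U' \text{ in } \mathcal{C},\ \ f \colon \xi \to F(\alpha)(\xi') \text{ in } F(U)\big),$$

with the projection to $\mathcal{C}$ as the fibration. For each $\alpha$, composition with $F(\alpha)$ gives the pullback $\alpha^*$ between the presheaf categories of the fibers, and the two natural transformations I mentioned are its adjoints,

$$\alpha_! \dashv \alpha^* \dashv \alpha_*,$$

the left one being the left Kan extension; it is through $\alpha_!$ and $\alpha_*$ that the classifying objects and the logic are transported from fiber to fiber.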
So, following Giraud in this paper, we see that this is the case when, in some sense, the F(α) are themselves fibrations. So the good stacks are not only fibrations over the site — we forget the topology, because we look at presheaves — but internally, also, you must have some fibration. And what this says is something that people state in statistical terms, what is named the information bottleneck hypothesis: in some sense, you progressively lose information, from the input data, to get closer and closer to the simpler questions which are asked at the output. So in this sense, this map is a kind of operation of forgetting the data, to construct something more significant. And the lemma which expresses that says that when you have this internalization, you have, in some sense, a conjunction at the level of the logic, which is given by the left adjoint here that we looked at before — which is really a left adjoint of what comes from the pullback along F(α). You could really think of that geometrically, as the hypothesis that this fibration is itself, layer by layer, a fibration.

Now we come to language. So you have several languages. It is true, as Jean-Claude said at some moment, that we are now using languages which are closer to linear logic than to traditional logic. This is developed in our text, but here we can restrict to the simplest case, where the language is given by objects in the topos. And we are not looking exactly at the subobject classifier of the topos, but at the classifier of sub-objects of a given object. This is sufficient in all the applications we know: for example, we have bars at the entry, and they have a color and they have a length, and it is easy to construct an object of the topos, coming from the stack, which expresses all of that very well. And what is now the semantic functioning — and here I connect to the dynamics — is the fact that the activity of the layers expresses axioms of theories, which are interesting for concluding at the end. So T_out(C_in) is the name of the theory which is expected when you have the input C_in. So this map, in some sense, is what people external to the network are able to use; it would take a lot of time, so let us look at it on the example of the slide. Theoretically, the humans outside are today able to understand the functioning, so this gives a kind of theoretical correspondence between input and theory. But the network itself has to construct the theories, and the theories are just given by the activity of some neurons of the layer — the discretized ones. Not all the neurons are discretized, but here we look at the discretized ones, and these discretized neurons give sub-objects of some objects, which are the axioms of the theories, and these axioms must be transported to the good theory. This is a good semantic functioning, in this sense; you could have a bad one. But this is how we relate the dynamics to the theories.

And now we can vary the stack over the site C, because, for a given architecture, depending on the kind of problem, we could use different categories to construct the stack. And this was looked at by Giraud, in some sense, in the two-category of stacks over C, when M is the category of categories. But here, as you will see, it will be very interesting to generalize, because the category of categories is what is named a closed model category — in a somewhat generalized sense — and there are many other categories which are interesting.
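The logical meaning of this left adjoint can be stated in the classical Lawvere form — a standard fact, which I take to be what the lemma internalizes: for a map $f$, the pullback of propositions along $f$ sits between the two quantifiers,

$$\exists_f \dashv f^* \dashv \forall_f,$$

existential quantification as the left adjoint, universal quantification as the right adjoint. The bottleneck condition, that the $F(\alpha)$ are themselves fibrations, is what guarantees that this extra left adjoint exists at the level of the internal languages of the fibers, so that the theories can be transported from layer to layer.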
In fact, for toposes, the most interesting at this moment is the category of groupoids. But you can also take — perhaps one word about it — the category made of the fibrations of groupoids over a poset, which is also a closed model category, for several closed model structures too, with good properties. But what are these closed model categories? They will now play the role of the base for the toposes. Not everybody, I suppose, knows what one is. They were invented by Quillen, and they were a large source of inspiration for Grothendieck, in Pursuing Stacks and the theory of derivators, which turned this notion of closed model category into other terms. Thomason extended this, following Grothendieck; and Cisinski and Maltsiniotis answered the questions of Grothendieck about these objects, and we got the constructions we needed — we do need them. A closed model category is a category with special classes of morphisms: some look like fibrations, others like cofibrations, and others like homotopy equivalences. And the goal is to do homotopy theory in the framework of categories — in particular, to define the homotopy category, where the morphisms are taken up to homotopy, with some special objects, not all the objects. The motivation of Quillen was to extend constructions coming from topology to algebra, like it happened at the beginning, when Hopf, for example, understood that the homology introduced for topology was very useful also in group theory. After that it was done for algebras, for Lie algebras; all of that concerned homology. The next step was homotopy, and it is arriving now — it was already arriving before.

So what the first paragraph here says is that when I have such a category M with homotopy, if I take the category of contravariant functors from C to M, then I get a new closed model category. And, very interestingly, this generates a very wide type theory. For example, if you use for M the category of simplicial sets, you generate what is named homotopy type theory, of Voevodsky, or the univalent foundations. So this kind of type theory constitutes an extension of ordinary type theory, including many kinds of set theory, and in some sense a new vision of the basis of mathematics. And Arndt and Kapulkin have shown that precisely these categories of functors from C to M — for example, with M the category of groupoids — define such a general type theory, extending the one introduced by Martin-Löf; it is a wide extension of the kind of languages we used before. And this is what you get when you vary the stack.

And the main reason — technically, for what we are doing with the modules of networks — is the fact that in this special case of DNNs (it is not true for every poset) you can easily determine what the fibrations are, and in particular what the fibrant objects are: the objects such that the map to the terminal object of the category — which always exists in this finite setting — is a fibration. These fibrant objects are describable, and we were very struck, because they correspond exactly to the condition making the theories propagate in the logic, in the languages of the stack. Perhaps it is too technical to describe here, but we recover exactly this condition: that, in fact, we must have not only a fibration, but a fibration made of fibrations. And it includes the case of the cross, where you see the projection from a product; and this projection from a product is exactly what you need here to have a fibrant object. So why is this important?
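For those who want the precise definition (Quillen's, standard): a closed model category is a category with three distinguished classes of morphisms — weak equivalences, fibrations, cofibrations — such that every morphism factors both ways,

$$f = p \circ i, \qquad i \in \mathrm{Cof} \cap W,\ p \in \mathrm{Fib} \quad \text{or} \quad i \in \mathrm{Cof},\ p \in \mathrm{Fib} \cap W,$$

with the corresponding lifting properties. An object is fibrant when its map to the terminal object is a fibration, cofibrant when the map from the initial object is a cofibration, and the homotopy category is obtained by inverting the weak equivalences. The statement used here is that for good M — groupoids, simplicial sets — the contravariant functors from C to M again form such a category.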
Why is it important to look at these fibrant objects? Because, in this case, every object is also cofibrant — cofibrant meaning that the map from the initial object of the category is a cofibration. And this is important because the homotopy category has exactly these objects as its objects, the ones which are fibrant and cofibrant; this was stated by Quillen. And so here, in some sense, we connect homotopy with something which was interesting from the semantic point of view.

So this will be the next step now, to use information: we will try to define semantic information. So perhaps I skip this Kan-type complex. We look now at this language of theories and how they propagate. We consider the types of theories, and in some sense we can localize further to get this Kan-type complex. And we can look at the sets of theories which exclude some proposition — theories excluding a proposition. And what we observe is that they play a special role in the logical activity of the cells. And this is a kind of localization. And now, from this language, we construct a fibration of the fibration, and we can extract a more precise category here, so that the theories behave well over this language.

Now, this is the important formula here, in the definition. We suppose that we look at the propositions which are implied, in the naive sense, by the propositions of this elementary logic. Everything is local: you are in a given layer, in a certain context, and you look at the theories which are expressed in this language. And you have this interesting operation, which is given by internal implication. And the idea is that this internal implication will play the role that conditioning plays in the Bayesian family.

And so, perhaps, I come to semantic information. This is the work on toposes that Laurent recalled: with Pierre Baudot, we have shown that the Shannon entropy is a universal cohomology class, for a special module in a ringed topos. It is a function of the probabilities. And this ringed topos is made from the set of random variables with the forgetting maps: the maps of the category, for this poset, are just forgettings. In some sense, there are two analogies with the network. First, from layer to layer there is a kind of forgetting. But more than that, when you look at the propositions, in the internal semantics over this site, you can say that these propositions themselves are variables, which could have values — for example, if you introduce an additional function measuring the value of a proposition. And here you see that you have this concept of random variable. So here, the ring structure is the joining of variables; in our case, it is an 'and', or some kind of product, modeling the conjunction in that category. And the action is given by mean conditioning: conditioning, averaged, gives the action. And when you look at that, in some sense, as a topological object, you extract the information quantities from the topological invariants of this object. So it was natural in this context to decide — it is very top-down now — that information must be the invariants of what we have just described: we have described the categories, there is this propagation of languages, and over them you have the theories; and the analogy is that theories are the analog of probabilities, and the conditioning, given by the internal implication, is the analog of probabilistic conditioning. In fact, we have looked at this in this paper, and in another paper, which will be put on the arXiv, we make this analogy precise.
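The result with Pierre Baudot can be stated in one line (in standard notation; I am paraphrasing the published statement): the Shannon entropy is, up to a scalar, the unique non-trivial 1-cocycle of this module of functions of probability laws, that is, it satisfies the cocycle equation

$$H(X \wedge Y; P) \;=\; H(X; P) \;+\; X.H(Y; P), \qquad X.H(Y; P) \;:=\; \sum_x P(X{=}x)\, H(Y; P \mid X{=}x),$$

where $X \wedge Y$ is the joint variable and the action is the mean conditioning just described. The analogy proposed here replaces probability laws by theories, and conditioning by internal implication.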
And now we compute the invariants. In degree zero, all is okay: in some sense, we get the kind of theoretical statement we wanted for the development of semantics. But what is deceptive is that there is no higher cohomology in this case. And this is mainly due to the fact that you always have a kind of idempotence in this theory; by using this idempotence — given, for example, by a proposition P — when you localize there, it is very easy to kill all the possible cocycles. In fact, something similar happens in quantum theory: you can apply the same principles to quantum information and compute the cohomology as before. And you see that the entropy is also present there, because a quantum state always gives ordinary probability laws as well; and the fundamental entropy, in this sense, comes with this function c, which makes the entropy depend on the problem which is treated. But here it is worse, because in fact we do not have the justification for H: you have a very large ambiguity about what this fundamental quantity could be. So this function c is largely arbitrary at this time. We will try to make it more precise by using something of which I will speak at the end: a kind of Galois theory, given by the structure of the fibers — for example, groupoids — to make this function, which has some functorial property, better characterized. In some sense, when you take numerical functions in the Boolean case, we just recover the ambiguity that was already described by Carnap and Bar-Hillel in 1952: you have an essentially arbitrary measure for defining the information, up to some elements of independence or symmetry.

So perhaps things will be better when we go to homotopy. And here we can describe an object which replaces the bar complex, homotopically, built from the sets of theories and two very interesting kinds of gluings of theories — these gluings are, in some sense, adjoint gluings. You see, here I look at what is named the homogeneous cochains, in the case of the bar complex, which can be described without anything abelian. The only requirement is that if you apply the action given by conditioning on the theories, it must be the same as multiplying by the propositions that you want to take into account. So you have two kinds of elements: the propositional calculus, essentially coming from the output, which acts through this gamma; and the theories, which propagate in the network to answer the question of the output. But you have another one, which in some sense translates the dynamics: the kind of propagation of the theories that you get along the network. And if you can define not only equivalences of that, but homotopy equivalences — you take cylinders, and you require the equations only up to homotopy, which is not an ordinary equation, okay? — then you get something like a natural space — in fact, a simplicial space — which looks like a good candidate to represent the theories homotopically. And for the semantic functioning, you can do the same, not for the propositions, but for the propagation in the network itself, with the dynamical object. And the semantic functioning becomes a simplicial map — conjecturally simplicial — and it can be very poor, in fact; the objective is to make this map rich. So it defines, in some sense, a kind of homotopy type of the description, by the network, of the problem it has to solve. And I look at that now. So I must stop very soon.
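If I write the homogeneous version, as one does for groups — this is my reconstruction, in the standard homogeneous bar notation, not a transcription of the slide: an n-cochain is a function $\phi(T_0, \ldots, T_n)$ of (n+1)-tuples of theories, required to be equivariant for the action of the propositions by conditioning,

$$\phi(P.T_0, \ldots, P.T_n) \;=\; P.\phi(T_0, \ldots, T_n), \qquad P.T := (P \Rightarrow T),$$

with the usual alternating-sum differential omitting each $T_i$ in turn. Nothing abelian is needed to write this; and relaxing the strict equivariance to an equivalence up to homotopy, with cylinders, is what produces the simplicial space just mentioned.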
And here, at this level, we could look at maps from theories to spaces of theories, for example in model theory. And it is easy to reproduce, in a totally non-abelian framework, the main relations of information theory. For example, here you have this Shannon relation for the entropy H: it now means that when you subtract the entropies of some theories, you get the conditioning. And one example we worked out was the theory that Jean-Claude treated. So perhaps I skip that, because he already spoke about this kind of thing; it was just to explain that at the end.

Now we will vary the site. So far, we varied the stack and the language; but now we want also to vary the network, the architecture. And when we do that, we come exactly to derivators, especially when we enlarge the groupoid category to any closed model category. So we come to these 2-functors from the 2-category of small categories to larger categories. And the principal example is this: D_M of a small category Y is made of all the presheaves with values in M over Y — exactly the ones I discussed following Giraud, or for the relative toposes of Olivia. And this generalizes what was done before with derived categories in the abelian case. And the hope — importantly, and it is made concrete by precise theorems — is that this constitutes not only a generalization of toposes (you see, it is exactly the formula for the presheaf topos) but a generalization of all homological algebra. And what is important is really to consider the functoriality in the small category here. And, as we have shown, the information spaces belong to such functors. And we consider only one kind of object for doing that: in this instance, we are working in the homotopy category of this derivator.

And so, to finish, I come back to concrete structures, because there are some kinds of modules appearing in networks, where there is structure. This one, for example, with the name attention module, or more generally transformer. You see, it is given by this very interesting formula: you take a nonlinear map of some quadratic expression — X and Y are the input vectors, and the W_Q, W_K are matrices of concrete weights. And you take something which is fundamentally of degree three, but which also has this form, which is very important: you first apply a nonlinearity to the quadratic part. And all these cells are now essential for all semantic functioning in networks. And what we explained in our paper, in some sense, is how this degree three is recovered at the level of the neurons. And you can understand this functioning as a sub-network, where the internal structure of the fibers is the action of a groupoid, which is behind, underlying, the deformation of the gating in the map. And it is the same for the LSTM cells, or the approximations of them, the GRUs: these have exactly the same form. And this is for H_t; and this form, you see, has a plus. So C_t is only quadratic, but H_t is of degree three. And all the transformers work on this form. And so this is the category that Jean-Claude showed us. It is, in some sense, a way to understand the structure of these modules and the way they contribute to the global functioning. And this is the theorem, which says that this map is not stable in general, but is stable on each coordinate.
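For concreteness, here are the two formulas mentioned, in their standard published form (Vaswani et al. for attention, Hochreiter and Schmidhuber for the LSTM; I quote the textbook equations, which I take to be the ones on the slide). Scaled dot-product attention is

$$\mathrm{Att}(X, Y) \;=\; \mathrm{softmax}\!\left(\frac{(X W_Q)(Y W_K)^{\top}}{\sqrt{d}}\right) Y W_V,$$

a nonlinearity applied to a quadratic form in the inputs, then multiplied again by the input: degree three, as said. Likewise for the LSTM cell,

$$c_t \;=\; f_t \odot c_{t-1} \;+\; i_t \odot \tilde{c}_t \quad (\text{quadratic}), \qquad h_t \;=\; o_t \odot \tanh(c_t) \quad (\text{degree three}),$$

where the gates $f_t, i_t, o_t, \tilde{c}_t$ are nonlinear affine functions of $(x_t, h_{t-1})$; the GRU has the same shape.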
That is, in some sense, the way to put sufficiently many parameters so that, individually, the neurons become stable. And this connects with two things. This stability connects with the unfolding theory of Thom — and I am happy to encounter it in this context — because it is itself connected with the theory of structural stability, and with Thom's incursion into linguistics. But what is more impressive is that, independently, people who studied language developed precisely such kinds of structures to understand the manipulation of notions in a language. And here you see the two references, to the books of Tannikov. And of course, we are not working with true language. As Gromov said recently to Alain, there is no theory, no mathematical theory, of language. In fact, even more: it was said, by Wittgenstein and by Austin, that we cannot speak coherently of language within language — language cannot be embedded in language, and it is endless. So the possibilities of language are endless. But we encounter them, and we can nonetheless do coherent mathematics for some kinds of artificial languages, which are not ordinary language.

Okay, so thank you so much, Daniel.