So, there are these issues about randomness and probability, which are used both inside mathematics and outside. And my point is that this kind of concept has to be redone, revisited, every ten or twenty years. And that was long ago: the current use of probability goes back to Kolmogorov, in 1933, and the idea behind it is 200 years older, going back to Buffon. It is a little bit old-fashioned. So I will show that very briefly, and what we can do about it. One of the problems is that people very often use probability, especially in applications, metaphorically, and then proceed as if it were precise, and then people start arguing on that basis. Especially in history and in evolution, people speak about randomness without any specific mathematical model behind it. That is fine, it is poetic; but it sometimes pretends to be scientific. On the other hand, the most successful uses of probability theory appear in physics, in genetics, in astronomy. One example: we know the probability of us being hit here by a large meteorite tomorrow; it is something like one over a hundred billion. The expectation of the loss is pretty high, because we would all be wiped out, but the probability is still small. And it is a realistic number, really not a metaphor, because we know roughly the number of big asteroids going around and how they are distributed. So this number may be incorrect, but it is meaningful. But if you say that the probability of a certain historical event, or economic event, or biological event, is such and such, it is just English: there is no number attached to it. There is randomness, but no numbers. And that is one of the points: you have to revisit the concepts of numbers and of probability. This contradiction is historical: where numbers and probabilities are not applicable, people try to apply them anyway, and it doesn't work.
And there is a long discussion, sometimes getting really quite political, about the statistical plausibility of, say, evolution, of what is called Darwinian evolution. I use the word, though it does not correspond to the true historical origin of the concept. But even if you go to a deeper level, where it is more realistic to say something, like molecular evolution, even there the concept of probability is not easily applicable, because we don't know what the probability space is, and the whole concept of a probability space becomes questionable. And then there are other questions concerning entropy, because in physics you don't deal so much with probabilities as with entropies. It is not true that what you observe are probable events. Everything we observe is highly improbable, but the entropy is just right. That is another point: when you start looking even at physical models, they immediately go outside the classical concept of probability, with its null sets and so on. We observe null sets all along. But anyway, there is good mathematics there. However, there is also a big, big question about information in cells and in brains. People say, and it is not meaningless, that there is a flow of information in the cell; the cell needs something like information flows, and in your brain something like that happens too. But we have no clue how to formulate this mathematically. There have been attempts, and they all fail so far, for all I know; it never goes beyond metaphors. And my point about probability is that there were two sources. The major historical source of probability was not scientific: it came from gambling, and it is usually ascribed to Pascal. However, if you look at the discussions of other people, like Galileo, he already knew the problem.
It was obvious to him. It had been obvious for 5,000 years, and the theory was actually created, in fact, not by Pascal but by Huygens. This is what was said about 5,000 years ago, and there are archaeological findings corresponding to that; so it was common knowledge among educated people. And then it was Huygens and Bernoulli who gave it more or less its modern shape. With Pascal it was just correspondence, and we don't have direct evidence; it is unclear to me, actually, how much was Pascal's. Cardano wrote some books about it, actually extremely interesting books, but then one can argue about where the origin of the theory lies. The history is complicated, and there are many sources which I don't touch. And then the other source was this one, and this is what I am using. It also was stated quite a long time ago: it was formulated by Lucretius slightly more than 2,000 years ago, then misunderstood for 2,000 years, and essentially clarified by these people, Einstein and Smoluchowski; mathematicians usually refer to Wiener. What is amazing: there is the Einstein-Smoluchowski model of Brownian motion, as we understand it now, and the experiments on it, high-level experiments, gave a fantastic precision in the evaluation of Avogadro's number; not as good as we do it today, but pretty good. However, this model is incorrect: finer experiments showed that the particles moving in the liquid do not quite behave this way. And certainly Lucretius never observed them; he came to this idea from a philosophical standpoint. And the people who came after him, Brown himself included, understood much less of it than Lucretius. And Lucretius was not saying something original either: it was common knowledge of his time, I guess. He was a smart guy, but he was not a scientist. He was a poet.
And he said many other things; for instance, he described what is called Darwinian evolution in a few lines. It was common knowledge in ancient Greece, of course. It was not proven; but then they knew almost as little as Darwin did. The mathematics was developed later. And again, experimentally, these particles do not obey the Brownian motion law: the finer experiments show this is not the case. This happens very often: you have a mathematical theory, you have a beautiful conclusion, it works very well, it is confirmed by experiments; but when you make finer experiments, something else is happening, even though the outcome is the same. The typical example is the elementary mass-action equation in chemistry. Particles interact, and the rate of the reaction is proportional to the product of the densities of the reacting species. This very reasonable formula, if you look at how hydrogen burns in oxygen, gives you very good precision. However, if you look finer at what happens there, there are hundreds of intermediate reactions, and they absolutely do not obey these laws; the outcome comes out right, but not in the way the law suggests. So this primitive approximation sometimes works amazingly well, and sometimes it works extremely badly. Another example, of very poor performance of a model, due I think to Born and then to quite a few other physicists, I have forgotten exactly who was involved: you have a model of matter where you postulate the positions of the nuclei more or less classically, and then you take the electron field, the density of electrons around the nuclei, and you run an equation. It was used systematically for describing quantum chemistry. But when fine experiments came, it was orders of magnitude away from reality; which is what you would expect, since it is certainly a very naive idea, and the equation is rather ugly, unlike the mass-action equation or the Brownian motion equation.
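The mass-action rate law mentioned above can be sketched in a few lines. This is my own minimal illustration, not anything from the talk: a single elementary reaction A + B → C whose rate is proportional to the product of the concentrations, integrated by forward Euler.

```python
# Sketch of the elementary mass-action law for A + B -> C:
# the rate is k * [A] * [B], proportional to the density product.
def simulate_mass_action(a0, b0, k=1.0, dt=1e-3, steps=10_000):
    """Integrate A + B -> C with forward Euler; return final [A], [B], [C]."""
    a, b, c = a0, b0, 0.0
    for _ in range(steps):
        rate = k * a * b   # mass-action: rate proportional to [A][B]
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
    return a, b, c

a, b, c = simulate_mass_action(1.0, 0.5)
# Mass is conserved step by step: the A consumed and the B consumed
# both equal the C produced; B is the limiting species here.
```

Of course, as the talk says, a real flame involves hundreds of intermediate reactions that this one-line law ignores; the sketch only shows the naive model itself.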
The point is that when the mathematics of an equation is nice, it tends to work. But when you make a primitive, naive approximation, there is no reason for it to work: if the justification is not mathematical beauty but so-called physical intuition, it usually does not work. At least in this example; it was actually discovered only about five years ago that this approximation fails. With modern experiments you can visualize it: you can look at single molecules and see that this naive model does not work. Okay. But now I come, briefly, to what we actually want to identify. There is this conception of Maxwell, who expressed the ideas developed up to his moment in the 19th century, and it was inherited very much by the 20th century: that numbers determine everything. And I think in many cases it is not quite so. Nowadays it is not just real numbers but something else; maybe, sometimes; I am not saying always. And one point is that in the classical situations where people speak about randomness, probability works; and it works because there is a high level of symmetry. Moreover, probability is attached to this symmetry: you start with, say, the category of sets, and you enhance it to a category, say, of linear spaces with some extra structure. And there are two kinds of directions. In one, you add an order relation, and this is the classical, real-valued probability; or you go quantum, and then you have a complex-number structure, where the order disappears but something else enters in its place. But in both cases you have a high level of symmetry. And this was understood, in fact, already by Cardano, who realized that if you cannot a priori say that certain events are equiprobable, you cannot make any theory. And amazingly, this is the default conjecture: given two events, given two quantities, conjecture that they are equal. It looks absurd; however, what else can you assume about them?
If you don't know anything, you say they are equal, because saying they are not equal is empty, right? So you assume they are equal, you develop the theory from that, and you see whether it works. And in probability, that is the whole point: you don't know the meaning of these probabilities, you don't know what the events are, whatever; but you say the numbers you obtain are supposed to be equal. And then you can make computations. And sometimes it agrees with experiment, sometimes it doesn't. Okay? And then this was formalized by Kolmogorov, following Buffon. And I think that was a new idea of Buffon's; maybe it was expressed by somebody else, but prior to him it was always a discrete setting: finitely many events, some random variables, etc. He first considered geometric probability. And apparently he actually made experiments with his needle; well, it is probably an apocryphal story, but nevertheless he understood the mathematics. And this is another point, by the way. Buffon was a mathematician, but his major work was in biology: he wrote this huge Natural History, thirty-six volumes, I think; he wanted fifty, but he could not complete it during his lifetime. And the whole intellectual culture of the following 200 years, till now actually, was shaped by this writing; it was the major kind of textbook. And it is still, I think, unacknowledged: people speaking about Darwinian evolution and so on were just elaborating, in different directions, some lines of what he wrote. And unlike the people after him, he understood mathematics. That is the whole point: the others don't, and that is the major point. And somehow he is kind of forgotten, though his influence is everywhere.
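The geometric-probability experiment attributed to Buffon can be replayed in code. A hedged sketch, mine and not from the lecture: dropping a needle of length L on lines spaced d apart crosses a line with probability 2L/(πd), so counting crossings estimates π.

```python
import math
import random

def buffon_needle(trials=200_000, needle=1.0, spacing=1.0, seed=0):
    """Buffon's needle: for L <= d, P(cross) = 2L / (pi * d)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y = rng.uniform(0.0, spacing / 2)        # center's distance to nearest line
        theta = rng.uniform(0.0, math.pi / 2)    # needle's angle with the lines
        if y <= (needle / 2) * math.sin(theta):  # the needle crosses a line
            hits += 1
    return 2 * needle * trials / (spacing * hits)  # invert P to estimate pi

pi_estimate = buffon_needle()
# pi_estimate is close to 3.14..., fluctuating with the sample size
```

Whether Buffon himself ever threw needles is, as said above, probably apocryphal; the point of the sketch is only that the model is a measure on a continuous space, not a finite list of events.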
Even recently, people were getting fame for just slightly elaborating what he was saying in one line of those volumes. And the problem, of course, is that he could not say everything openly: some of his books were condemned, even burned, at that time by the ecclesiastical authorities. So he was very careful in what he was saying, and therefore he can easily be misinterpreted, including by people like Darwin, who said: I don't refer to Buffon because he contradicted himself. Of course he contradicted himself, because he could not say openly what he wanted to say. And then there is this concept of probability as a measure on a standard space; the simplest model would be this square. I prefer the square to the interval: the interval is one-dimensional, you cannot see anything in it, while in the square you see more or less everything. And that came exactly 200 years after Buffon's needle. At about the same time there was a very similar idea in the foundations of algebraic geometry, by André Weil. And the way that idea was dismantled almost immediately by Grothendieck suggests that Kolmogorov's setting should also be revisited and changed. Not that you destroy it, but you look at it differently. I don't know, of course, exactly how to do that; it is easy to criticize, much harder to do. So this concerns more or less pure mathematics: you want to find different ways of developing the idea of probability. And one of the applications you want to look at is in linguistics. What is the probability of a sentence? What is the frequency of a word? It is not clear: if you naively apply the usual probability, you get nonsense, and that is what Chomsky said; I believe it is because, in his mind, only naive probability exists that he is so categorical. The point is that you have to be smarter mathematically than he is. He is a smart guy, but of course not a mathematician, and these days not even much of a scientist.
And so let me give a little inventory of the points I will make, and which I will explain today in some detail. Number one. Secondly, describing a probability space as a covariant functor on these simple spaces; this immediately gives you the old ergodic theory, the classical theory, like Kolmogorov's theorems, which were difficult theorems at the time because they were formulated in the traditional language. The measures disappear, the sets disappear, and categorically everything becomes very clear. Because, and this is very often the point of category theory, you throw away what you don't use: you refer only to the objects and ideas you actually use, and you don't carry all the history along. If you carry all this historical luggage, it tremendously hampers your thinking. Even Kolmogorov was making mistakes in these papers, on points which are completely trivial, as we shall see, from the categorical point of view. And I will explain that. Another point is that you can use large deviations and nonstandard analysis for entropy, and this is very much in the spirit of Boltzmann; he was, in effect, using both the categorical approach and nonstandard analysis. If you keep this idea in mind, you can read Boltzmann and see that this is what he does. And the criticism of Boltzmann at the time was exactly that the mathematicians of his time had another idea: they only had numbers as Maxwell understood them, and that was not adequate to what Boltzmann was doing. And then there is another point: you can linearize the concept of measure, and there are interesting examples where you can do it, and potentially interesting theories of which I only have rough indications.
And then there are these issues in language, or in learning, where some kind of bias enters, and where the approach via probability is a metaphor: there are just words, and there are no techniques, no systematic way to describe, say, languages, or learning mechanisms, like learning languages or learning mathematics, statistically. These are statistical processes in the sense that they depend on repetition of events; in that sense they are statistical, they have some symmetry involved. From a certain point of view, probability, and very much of science in general, depends on the symmetry of repetition: things repeat themselves. From this point of view, history is not a science: historical events do not repeat themselves. So what can you say? Each of them is individual, right? Fine, but then you cannot advance as a scientist, as a mathematician. But something does repeat itself; only this repetition is different. The symmetries are not groups; they are weaker symmetries. And the question of how to exploit them and how to work with them is nontrivial. If you just use the language of usual probability, you are dead, because that language tacitly assumes a particular kind of symmetry. I give some references here; when I thought a little about these matters I wrote some of it down; part of it is precise, part is speculation and wishful thinking. And now let me speak about entropy. The point is, again, a lot of citations of great people. This is, of course, very much the Galilean conception of mechanics, though one might read him as thinking otherwise; and then, of course, what is reality? What is more relevant is what Grothendieck said: you have to introduce childish concepts, and entropy should be such a concept. If you set it up from the right point of view, you can explain it to a child, and the child is supposed to understand it.
However, whatever you say, you can also say the opposite. There is a much-misquoted remark of Rutherford, that if you have a scientific theory you cannot explain to, let us say, an Australian penguin, you had better come up with a different theory. He didn't quite say that, I must admit; he said it in a not politically correct way, so I say it this way. So a theory is supposed to be simple for an unprepared mind, but it need not be simple for the so-called educated mind. An educated mind is like what happens with animals: from some age on, people become educated, and their minds become as inflexible as those of penguins. Though sometimes, in a certain mood, you can be more flexible. Now, again, about physicists, and how we speak about physical systems. A system is something vague, and the question is how to make something precise out of this; precise meaning that you develop a parallel mathematical language, and it must be nice and simple. Simple in the sense of Grothendieck, right? Not in the sense of Maxwell, because numbers are not simple. That is another point. You think: oh, real numbers; but we don't understand them. Nobody understands real numbers, because nobody ever wrote a rigorous specification of the properties of the real numbers; it does not exist. The real numbers are not a first-order theory, and therefore nobody understands them. The construction is, of course, extremely elaborate; it is incredible that such a thing as the real numbers can exist at all. If you formulate it to an unprepared mind, it says such a thing cannot exist: there are too many properties in one object. But there are much easier things. And here is how you can think about systems. You don't know what they are. Imagine making experiments: you have protocols for making experiments, you make experiments, and you see boom, boom, boom; that is all you see.
Boom, boom, boom here, and tick, tick, tick there. And all you can say is: this boom-boom-boom and that boom-boom-boom are the same, and this tick-tick-tick is different from the boom-boom-boom. All the information we have about the world, especially in physics, comes from that. There is nothing else: no probabilities, no numbers, nothing. And the events repeat themselves; if you repeat many times, you can say this is equal, this is not equal. The pattern is made of equalities, only equalities. Out of this you want to make something. And then you say: aha, here is what to do. We take this category of finite measure spaces. This is, of course, an ad hoc move; however, there is a simple rationale behind it, and you can explain it to anybody. You have drops of water, many drops. Some drops come together and become bigger drops, and you have fewer drops; the total volume, or mass, does not change. And this is the category: how you go from one configuration to another. So you forget that these were drops; you only remember the arrows, and you speak in the language of these arrows. The advantage is that this language is well developed and it tells you what to do, right? That is already good. And it is amazing how much you can say in this language; actually, almost everything. If you cannot say something in categorical language, possibly you should go to a higher level, but most likely you are just not clever enough to know how to say it. And people hate categories exactly for this: they don't know how to say things this way, so they dismiss it as abstract nonsense. That usually just means "we don't understand". Of course, there may be a parallel language, even more primitive and therefore more powerful; because category theory is the most primitive language available to us, and therefore the most powerful. And it is amazing how many things it covers. So, measure spaces are very simple things, but this is the notation I want to use.
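The "drops of water" picture above can be made concrete in a few lines. A minimal sketch, my own names and not the lecture's notation: a finite measure space is atoms with masses, and an arrow of the category merges atoms while the masses add up, so total mass is preserved.

```python
# A finite measure space is a dict {atom: mass}; an arrow of the
# category is a map on atoms, and the pushforward merges the drops.
def pushforward(space, f):
    """Push a finite measure space forward along the map f on atoms."""
    out = {}
    for atom, mass in space.items():
        image = f(atom)
        out[image] = out.get(image, 0.0) + mass  # drops merge, masses add
    return out

drops = {"a": 0.25, "b": 0.25, "c": 0.5}
merged = pushforward(drops, lambda x: "ab" if x in ("a", "b") else x)
# merged is {"ab": 0.5, "c": 0.5}: two drops became one, total mass unchanged
```

The design point is the one made in the talk: after this step you can forget that the atoms were drops and work only with the arrows.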
So there are drops of water, or a pile of sand, and these can come together like this; and then there is a little explanation why. This, by the way, is another point. It looks trivial, but it is a highly non-trivial issue in quantum mechanics: what about states with probability zero? You cannot throw them away. Technically this is one of the major difficulties in describing quantum probability: you cannot throw away the part of the Hilbert space which you do not observe, so to speak. It is all one universe, one world; you cannot isolate things and say the rest is zero. So you have functions which are zero somewhere, but you cannot throw away this zero; it is still with you; you change your coordinate system, and it comes back. I will come back to this phenomenon later. But anyway, this is what you have. Of course, you do not have to impose this normalization, and as usual the normalization is an artifact. You say: the measure is finite, so you can normalize it. However, you should not say this so easily, because when you normalize, you multiply by a real number; and all this multiplying means the multiplicative group of real numbers, and because you do it many times, you have a product of these. So you have a real torus, a product of multiplicative groups of the reals; and in the quantum case you have, of course, a complex torus, and these act on complex varieties, so all the toric varieties are involved. When you normalize, you throw all this away. Remember, it is all still there: you may normalize, but you are throwing away a huge amount of mathematics, and it would be better to bring it back at some moment, though I don't quite know how; I am just pointing at it. It is partly done by some people nowadays; there is some work which I have not studied well yet. Now, there is a very convenient piece of physical terminology: reduction. Instead of saying "map" between sets, you say "reduction".
It has the property that, yes, atoms come together, and their masses must add up. And again there is a physical picture: you have one kind of apparatus, and another attached to it. Through the finer one you have, say, a billion windows — maybe a billion is too much — and then there is another, coarser one, and you see what you see through that. And this is exactly the crucial operation, in physics, in mathematics, and also, say, in linguistics and in learning: all the time we can change our windows, and one system of windows is a reduction of another. It is one of the patterns of learning, not the only one, but a very, very nice concept. Then there is an explanation of why we write it this way instead of that. One of the justifications is that an arrow, by itself, has no meaning; as I explained before, the notion of a partially ordered set is impossible to define within mathematics. People think they define it, but it is an illusion of a definition: they implicitly refer to physics. You cannot do it without physics, because when you write an arrow from P to Q, you automatically see the symbols P and Q at particular positions on the blackboard, and if you don't watch them, they switch. And we all make this mistake in papers: we switch signs. It is not accidental; it is built into the order. We know there are big objects and small objects, but the order of writing is an artifact: whether you learn to read from left to right or from right to left, you have to break this fundamental symmetry, and that is a non-trivial issue. Categorically, it essentially disappears: you have F, you have the notation, and then you can speak about things attached to F, and that is what we shall be doing. You can write the entropy of F — this we shall be defining — but not the entropy of a symbol; I don't know what the entropy of an abstract symbol is, it is nonsense. And you save notation: in order to write the arrow, you have to carry P and Q along with it; this way, instead of three symbols, you have one.
So your formulas are reduced by a factor of two. When you use categorical language to describe the same things, you just throw away half of the notation, because half of the notation is doing nothing there. And it may get even better. Now, there is this principle which we want to elaborate: entropy is the logarithm of the number of states of something. You can find this in some undergraduate physics textbooks. Starting from that point, let us try to decipher what it means, without being ad hoc, without writing formulas like minus the sum of p_i log p_i. Boltzmann, by the way, never wrote this formula; he didn't need it. It was Planck, I believe, who wrote all these formulas, for some reason; you don't need them. So, entropy is something like a log; it should satisfy this identity. We don't yet know where it lives, but it should have certain nice properties. And that is another point: you don't give a definition from which everything follows. It looks rather strange: it is something which must carry a certain network of properties, and then something exists satisfying this network. Sometimes there are too many, almost contradictory, properties — like for the real numbers — but then somehow it comes together. So far it is a number, but maybe it is not even a number, but something for which all this must make sense. Symbolically, I want to write this property as follows, and you immediately see why: you have two systems which are far apart, and being a product means they don't interact. Naively, whatever the number of states is, each state here combined with each state there gives you a state of the whole, because they don't interact, nothing happens; so the number of states multiplies, and the logarithm adds. But if they interact, of course they may restrict each other, and then instead you have this inequality. So let me make the corresponding picture. Here it is: now they are close enough, they may talk to each other, and some joint states may be incompatible.
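The additivity just described — for non-interacting systems the state counts multiply, so the logs add — can be checked directly with the standard formula. A small sketch of mine, using Shannon's expression only as a numerical stand-in for "log of the number of states":

```python
import math

def entropy(p):
    """Shannon entropy -sum p_i log p_i, in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def product(p, q):
    """Joint distribution of two non-interacting (independent) systems."""
    return [x * y for x in p for y in q]

p = [0.5, 0.25, 0.25]
q = [0.9, 0.1]
# For independent systems the entropies add:
#   entropy(product(p, q)) == entropy(p) + entropy(q)
# and for a uniform distribution on n states, entropy == log(n),
# the "logarithm of the number of states" of the textbook principle.
```

When the systems interact, some joint states are forbidden and the joint entropy drops below the sum, which is the inequality drawn in the picture.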
Of course we cannot have more states than in the non-interacting case, and this shows the subadditivity inequality. It is obvious, physically. Mathematically, people say: ah, it is very easy to derive from the convexity of log. But log — you need calculus, you need whole volumes of formalism, in order to define it. Whereas this is obvious, and we want to keep it obvious. So indeed, for interacting systems the number of states is smaller than for non-interacting ones; obvious, except that you have to be careful. There is a stronger property, called strong subadditivity, which is essential especially in quantum mechanics, but actually also classically. Entropy is something you can see piece by piece: you have a bunch of atoms, you look at a small subcollection of them, and you measure what you see from there. How many states do you see from position one? From position two, from position three? And the pairs (1,2) and (2,3) overlap in position 2, which is why I had to draw these overlapping parentheses. And this is the inequality. It looks rather innocuous, but it immediately has interesting corollaries. Here is one of them: if you just add these inequalities together, you get this corollary. Another property you have to keep in mind is that the entropy of a space is at most the log of the cardinality of its underlying set; the vertical lines here mean cardinality. This actually needs some explanation: it does not follow from naive reasoning, but it will come up later. Then there is another property: under reduction, entropy goes down. If you look through some window, what you see is less than what was there before. This is not true in quantum mechanics, which is incredible: you can look at a single atom, but it is as if you were looking through a kaleidoscope, and you see many, many different atoms. Quantum reductions do not decrease entropy.
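The strong subadditivity inequality for three overlapping pieces, H(1,2) + H(2,3) ≥ H(1,2,3) + H(2), can be verified numerically on random joint distributions. A sketch of my own, in the classical (not quantum) setting:

```python
import itertools
import math
import random

def H(dist):
    """Entropy of a distribution given as {outcome: probability}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginalize a joint distribution over tuples onto the given coordinates."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

rng = random.Random(1)
for _ in range(100):
    # random joint distribution of three binary variables
    w = [rng.random() for _ in range(8)]
    total = sum(w)
    joint = {xyz: v / total
             for xyz, v in zip(itertools.product((0, 1), repeat=3), w)}
    lhs = H(marginal(joint, (0, 1))) + H(marginal(joint, (1, 2)))
    rhs = H(joint) + H(marginal(joint, (1,)))
    assert lhs >= rhs - 1e-12   # strong subadditivity: H(12)+H(23) >= H(123)+H(2)
```

The marginal onto a subset of coordinates is exactly the "window" reduction of the talk; note also that each marginal's entropy never exceeds the joint entropy, the classical monotonicity that fails quantum mechanically.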
It may go up; and again, mathematically it is quite easy to see why this happens, but I don't quite understand the physical rationale behind it. Then there is the following theorem, which follows from the above. It is by no means obvious, even though the ingredients look very formal: it implies the usual isoperimetric inequality, and it implies all the Sobolev, and even log-Sobolev, inequalities as trivial corollaries. Yet it follows from the fact that non-interacting systems have fewer states than interacting ones, and I will explain that; you need the strong subadditivity plus this inequality. You can write it in dimension 3, to see the relation with what we had before. If you take the inequality we wrote before, together with this one, what you get is this: you have a subset of Euclidean space, you project it to the coordinate planes, and then the volume squared of the set is at most the product of the areas of the three projections. So then you understand why these things are related. And this is stronger than the isoperimetric inequality — I mean, not with the sharp constant — because in the usual isoperimetric inequality you would have here the sum of the squares rather than the product; so, modulo the arithmetic-geometric mean inequality, it is stronger. And we don't know the full geometric form of this; it is an open question whether there is a similar inequality invariant under the full group of motions: this one is invariant under a very special group, but not under the whole group. And then, the amazing thing, in my view, is that this inequality — the Shannon inequality we wrote before, and this Loomis-Whitney inequality about projections — when stated in algebraic language, takes the following shape, which is more general. If you apply it to a very special form, namely you take this measure space, you take functions, multiply them, integrate them, and give the result a name,
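The projection inequality just stated has a discrete version that is easy to check by machine: for a finite set A of voxels in Z³, |A|² ≤ |P_xy| · |P_yz| · |P_xz|, where the P's are the three coordinate-plane projections. A small sketch of my own:

```python
import random

def loomis_whitney_check(points):
    """Discrete Loomis-Whitney: |A|^2 <= |P_xy| * |P_yz| * |P_xz|."""
    A = set(points)
    pxy = {(x, y) for x, y, z in A}   # projection to the xy-plane
    pyz = {(y, z) for x, y, z in A}   # projection to the yz-plane
    pxz = {(x, z) for x, y, z in A}   # projection to the xz-plane
    return len(A) ** 2 <= len(pxy) * len(pyz) * len(pxz)

rng = random.Random(0)
sets = [{(rng.randrange(4), rng.randrange(4), rng.randrange(4))
         for _ in range(20)} for _ in range(50)]
assert all(loomis_whitney_check(s) for s in sets)
# A box gives equality: a k x k x k cube has |A|^2 = k^6 = (k^2)^3.
```

The inequality is invariant under permuting and separately rescaling the coordinate axes, the "very special group" of the talk, but not under general rotations.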
then it becomes this inequality. As for the notation, let me see if it is clear: these letters stand for ranks. You have a multilinear form; when you separate the variables, dividing them into groups, you get bilinear forms; bilinear forms have ranks; and then there are inequalities between these ranks. Actually, in this case you don't have strong subadditivity — you have subadditivity, but not strong subadditivity — but this corollary you do have. And I am just curious what the full extent of this is. Of course it is true in all dimensions, and there are many inequalities of this type; again, stronger than all the Sobolev-type inequalities taken together: as trivial special cases you get all the Sobolev inequalities, the log-Sobolev inequalities, everything follows from that. This is in dimension 3; of course, you need it for many variables. I will explain later how one proves it; it is very simple once you have the categorical setup I described: everything comes for nothing, you just do what the categorical theory tells you to do, without thinking. I hope I put the exponents right; there is a scaling issue, and you have to be careful not to get confused. And then another application, where entropy enters, and which has an interesting history, is related to Mendelian genetics. I love this story very much, because it tells you something; I don't want to over-comment on it. When Mendel's papers came out — and we actually know how people found Mendel's papers; they were in circulation and everybody could read them — for thirty years nobody paid any attention to them. Then why, at some moment, did they start referring to Mendel; why is Mendel remembered at all? Because at the turn of the century, three groups of people rediscovered Mendel's laws, and they were arguing about who was first; and of course the historians said:
look at the history: it was not you, it was Mendel, obviously. And I think there are many other cases like that, where there are no competing groups and the real authors never come to the surface. When Mendel's work did come up, most biologists couldn't accept it, because there was a corollary of Mendel's laws which looked to them as if it contradicted evolution. A very remarkable thing, which Mendel understood perfectly, because Mendel, unlike Darwin, had a mathematical education; he understood mathematics. It is this: you have two populations of flowers separated by a mountain range, one blue and the other red. You destroy the mountain range and they become mixed, so the next year you have, say, 20% blue and 80% red. What happens the year afterwards? Everybody says: OK, there will be maybe only 2% blue. No: nothing changes, the next year will be the same. Is this mixing important? It is not; it's just elementary mathematics, which biologists simply failed to understand, because it looked like a complicated formula, proven by Hardy. So I write down the formula; here is the formula, and it was certainly beyond the biologists of that time. Hardy wrote it in terms of p and q; we just make abbreviations, because if you actually substitute for p and q it doesn't fit on the sheet, it becomes really formidable if you write it out. So Hardy wrote this formula and said: OK, this certainly happens. But in my view he didn't understand what was happening, because there is a simple mathematical phenomenon behind it, related, by the way, to entropy. He was quite arrogant about it, but interestingly enough he wrote one page, while the other person who discovered this, Weinberg, a German physician, wrote something like 80 pages of proof. And the point, of course, was not writing the formula, but understanding why.
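As an aside: the stability just described (nothing changes after the first mixed generation) can be sketched numerically. This is a minimal illustration in my own notation, not something from the lecture; I write the genotype distribution as (u, v, w) for (AA, Aa, aa), so that random mating sends it to (p^2, 2pq, q^2) with allele frequencies p = u + v/2 and q = w + v/2.

```python
def mendel_map(u, v, w):
    # one generation of random mating: genotype frequencies (AA, Aa, aa)
    # are replaced by (p^2, 2pq, q^2), where p, q are the allele frequencies
    p = u + v / 2.0  # frequency of allele A
    q = w + v / 2.0  # frequency of allele a
    return (p * p, 2 * p * q, q * q)

# an arbitrary mixed population (frequencies sum to 1)
g1 = mendel_map(0.2, 0.0, 0.8)  # first generation after mixing
g2 = mendel_map(*g1)            # second generation

# Hardy-Weinberg: equilibrium is reached after ONE generation
assert all(abs(a - b) < 1e-12 for a, b in zip(g1, g2))
```

The map is quadratic, yet iterating it changes nothing after the first step: the allele frequencies p and q are preserved, which is the cancellation referred to below.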
This mixture of the two populations is expressed by this formula; the point is understanding how to apply probability, not how to make the computation. Of course everybody can make the computation; the question is to have it in your mind in the right mathematical form, not to perform a mechanical calculation. But here is what was overlooked by Hardy. The true statement is this. There is a map, which appears in Mendel's paper, where you look at the distribution not of genes but of so-called alleles, the alternative forms of a gene. So you have some distribution of features represented by genes, a probability distribution, and you look at what happens in the next generation when there is random mixing. And this is the point: the map I am using is a rational map, given by a very simple rational function, written by this formula. It is rather amazing that there is such a rational map: it is quadratic, and yet the degree does not go up when you iterate it; it remains the same, because everything cancels. So indeed it is kind of miraculous. This is the map described here, and it has this property; and there really was a reason not to believe such a map is possible. Inside Mendelian genetics everything depends on this map, all its variations and so on, and there were further developments in mathematics, quite simple ones. Amazingly enough, in the literature it is never said what I just said: people still write these formulas after Hardy, still obsessed by numbers, because they don't know mathematics beyond the 19th century. That's the problem. For a modern mathematician, you see the numbers when you come there and you don't have to think; but for those people it was not so. We don't appreciate how much mathematics has changed. And now, after all this preamble, I come to the subject. So, we want to understand entropy in these simple terms, as the log of the
number of states. First, if all states are equiprobable, so there are no weights, then entropy is the log of the number of states by definition; we take this as the definition, and from it we want to understand what happens in general. So we have the concept of homogeneous probability spaces, where all atoms have the same weight. This, importantly, can be described purely categorically. Except there is one little point, which will be essential in the quantum case: if you allow atoms of zero weight, homogeneity becomes tricky, because there are atoms all of equal weight, say one over a billion, and then there are zero-weight atoms, and that causes problems in quantum probability; here we don't have it. So by definition we have this concept of entropy. From this moment on, we make the following observation: when you measure the entropy of physical states, you always have repetition of the same thing many, many times. Let us formalize that. First I need a definition. We want to compare two probability spaces. Of course I use numbers, and there are two aspects of numbers, additive and multiplicative, and in probability there are additive and multiplicative parts as well. So I want to say when two probability spaces are close to each other, and there are two kinds of steps by which you can modify spaces, and then you say the modification is small. One: I throw away atoms carrying a small fraction, some epsilon, of the total weight; and if the remaining spaces are equal, I say: ha, my probability spaces are epsilon-close. That is the additive part. Again, in the classical case this is so simple that you maybe don't have to say it, but in the quantum case it becomes subtler. Secondly, the multiplicative part: imagine I have a matching between two spaces, such that the
ratios between matched atom weights are close to one. But with 'close' you must be careful how you normalize: if you have huge spaces and the numbers you start with are minuscule, the ratios may become huge. So you normalize by the log of the number of elements in the set; that is how it goes. I take as the multiplicative distance the log of this ratio, normalized in that way; but this depends, of course, on the correspondence I chose. So I have one category of reductions, and then another category of correspondences between spaces: obviously, to compare two different spaces you need a correspondence between subsets of them. And it must be such that, on one hand, what you throw away is small, and on the other hand, for what remains, the log of the ratio must be small compared to the log of the total number; multiplicative, because I put logs on both sides. Then I add the two parts together and say: this is the distance between my spaces, for this correspondence; and then, as usual, you take the infimum over all correspondences. Once you have that, I can say what it means for sequences of finite measure spaces to be asymptotically equivalent: it means this distance goes to zero. That is the definition, and it is quite clear. In a more flexible language, which I don't want to emphasize because people think it is complicated although it is simple, it is better to speak the language of nonstandard analysis: for an infinitely large number n these numbers are the same, and then you don't have to go to the limit. Of course, things may not converge, so you have to pass to sublimits. Then you define what I call the Bernoulli semigroup of asymptotic equivalence classes of spaces, for all these p. And then there is this theorem of Bernoulli.
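One plausible reading of the multiplicative part of this distance, for two spaces of the same size matched by the identity correspondence. The function name and the exact normalization are my guesses at what the definition means, not a canonical implementation:

```python
import math

def mult_distance(p, q):
    # multiplicative closeness of two probability spaces of the same size,
    # under the identity matching: the largest |log ratio| of matched atom
    # weights, normalized by the log of the number of atoms
    n = len(p)
    assert n == len(q) and n > 1
    return max(abs(math.log(pi / qi)) for pi, qi in zip(p, q)) / math.log(n)

# two spaces whose atom weights differ by factors close to 1 are close
p = [0.25, 0.25, 0.25, 0.25]
q = [0.26, 0.24, 0.25, 0.25]
d = mult_distance(p, q)
assert 0.0 < d < 0.05
```

The additive part (the mass thrown away before matching) would be added to this; here the matching is total, so that term is zero.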
This is applied in the fundamental theorem, called the law of large numbers, or the equipartition theorem; it has many names. Nowadays you would call it a simple theorem, but it took Bernoulli 20 years to prove it, and I don't know exactly what the difficulty was. I am curious what the intermediate steps were, because nowadays the proof is very short; it still requires some thinking, but whenever we start doing it we quickly converge to a proof. Actually it was already understood, but not proven, by Cardano. The point is that in the limit we take powers, powers of powers of a space, meaning you repeat independent experiments, the quantities are independent, and what you observe becomes homogeneous, modulo throwing away some states. If you think in terms of sets, with probability a function on the set: it does not become constant everywhere; it becomes constant on a subset, while the measure outside that subset goes to zero. This should be kept in mind, otherwise you can make mistakes. Now, why is it the law of large numbers? Let me say it. I have these atoms with weights p_i; well, I probably have to introduce indices. I have trouble with indices, indices are horrible, but we do need them. So we have this collection, you multiply the space by itself many times, and what you get are products p_{i_1} p_{i_2} p_{i_3} and so on, many such terms. If you take their logarithms you have sums, and you apply the law of large numbers to those. The point of the law of large numbers is that when you add independent random variables, the sum becomes eventually constant if you properly normalize it, dividing by n: repeat n times, divide by n, and it becomes eventually constant. And this corresponds to the concept of equivalence introduced above.
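The mechanism just described, products of the p_i turning into sums of logs that concentrate, can be sketched as follows (a toy illustration with numbers of my own choosing):

```python
import math, random

# for a long i.i.d. string x_1 ... x_n, the normalized log-weight
# -(1/n) log2( p(x_1) * ... * p(x_n) ) concentrates near the entropy H(p)
random.seed(0)
p = [0.5, 0.25, 0.25]
H = -sum(pi * math.log2(pi) for pi in p)  # = 1.5 bits

n = 20000
xs = random.choices(range(3), weights=p, k=n)
empirical = -sum(math.log2(p[x]) for x in xs) / n

assert abs(empirical - H) < 0.05  # the law of large numbers at work
```

Equivalently, almost all of the weight of the product space sits on roughly 2^(nH) strings of roughly equal weight 2^(-nH), which is the homogeneity statement.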
Therefore, when you multiply the probability space by itself, it becomes, in this equivalence sense, asymptotically constant: all probabilities become equal. This very strong property, reducing a function to a constant, happens in other instances, in geometric inequalities, and it is a very powerful means of proving theorems when it works: you reduce a proof for functions to a proof for constants. With this argument, for example, all the Hölder inequalities follow. What the Hölder inequality tells you is that if you take the integral of a product of functions raised to different exponents, then the log of this is a convex, or concave, function of the exponents; I keep forgetting which, and I don't remember exactly how to write it, it is impossible to remember, but that is the whole thing: you take a product of functions with different exponents, integrate, and the log is convex or concave in the exponents. And the proof is: go to the limit. You have functions on some measure space X; replace X by X to the power n, the functions by their n-th powers, and let n go to infinity. Of course nothing changes, all the quantities stay the same, so it is enough to prove the statement in the limit, where the functions become constant on their respective sets; and for constant functions it is immediate. That is the proof, and it proves Shearer's inequality at the same time, which is more profound than the Hölder inequality, though they are related, of course. So everything becomes eventually constant, and therefore, surprisingly, many properties of entropy so defined reduce to properties of plain numbers. For example, what is a reduction for homogeneous spaces, where all elements have equal weights? The projection just means a decomposition of numbers: if you have the number n here and project onto a space with m atoms, it means m divides n. These are all integers, so the category of homogeneous spaces, reduced to homogeneous objects, becomes essentially the multiplicative semigroup of integers, slightly enhanced, and the properties follow from that.
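In formulas, the convexity being gestured at, and the tensor-power reduction, can be written like this (my reconstruction of the standard argument, not the lecture's notation):

```latex
% Hölder as convexity: for measurable f, g \ge 0 on (X, \mu), the function
%   \varphi(\theta) = \log \int_X f^{\theta} g^{1-\theta} \, d\mu
% is convex in \theta \in [0,1]; its endpoint values are \log\int f\,d\mu
% and \log\int g\,d\mu.
%
% Tensor-power trick: replace (X, \mu) by (X^n, \mu^{\otimes n}) and f by
%   f^{\otimes n}(x_1, \dots, x_n) = f(x_1) \cdots f(x_n),
% which multiplies \varphi by n, so nothing changes after normalizing by 1/n.
% By the law of large numbers, as n \to \infty the functions become nearly
% constant on a subset of nearly full measure; and for constants
%   \varphi(\theta) = \theta \log c_1 + (1-\theta) \log c_2 + \log \mu(X),
% which is affine, hence trivially convex.
```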
So this is the law of large numbers; it is very simple. Then you only have to see that once you know some property for a dense set of objects, it is true for the other objects. After that, and maybe I will give one example, we come to something more sophisticated, where in a way you don't even have to do that, because you can think about entropy in different terms. The density theorem tells you what the asymptotic equivalence classes of powers of spaces are: you consider growing powers of spaces and identify them when they are asymptotically equivalent, and this is called the Bernoulli-Grothendieck semigroup; in ten minutes I will explain why Grothendieck. Then entropy, before you know it is a number, is something in this semigroup; only a posteriori do you apply Bernoulli's theorem and say it happens to be a number, because the semigroup is isomorphic to the semigroup of real numbers. But in general, in a more sophisticated context, it will not be a number. That is the point of this definition: you don't expect it to be always a number in a more interesting, non-physical framework; whatever object you come up with, you don't know a priori what its entropy will be. When people try to define information or complexity naively, they just want a number; but maybe it is not a number. It must come from the structure involved, and it must come by itself, or it will not come at all. Here it happens to be a number, but, in a sense, accidentally. In fact there are some difficulties here. One, before the difficulties, which I will explain in a second: there is the Boltzmann formula, which is mistakenly taken for a definition. This is a misconception; not a logical mistake, but a mistake of understanding. Boltzmann actually never wrote it, he never needed it; it was written down by Planck later, and of course
it is a formula, and it has physical meaning; mathematically it also has a meaning, usually ignored by the people who write it. There is some mathematical meaning as well, though not in this form. But the point I want to return to is this problem: how functorial is the Bernoulli theorem? You can approximate these limits by homogeneous spaces, but how well does that agree with all the arrows? You have a rather complicated diagram of arrows, and the question is whether, when I go to the limit and approximate, I can simultaneously respect all of them. There is an extra property needed: epimorphisms, properly understood, must go to epimorphisms, and monomorphisms to monomorphisms under this approximation. A priori this makes little sense, because all morphisms in the category of probability spaces are epimorphisms; but there are obviously categories with monomorphisms, and this must be taken into account. It is unclear to me, and probably it is not true in general, up to what extent the Bernoulli theorem is functorial. This is a very simple kind of question, and I just haven't thought about it enough. I am pretty sure it cannot be easy; on one hand, finding a counterexample is no problem, but finding exactly to what extent it is functorial is another matter, because to some extent it is, and that is crucial for the applications; without it many things become simply untrue. OK, but here I am getting ahead of myself; I want to get back on track, and there may be interruptions. Now I want to explain how, indeed, one derives the essential properties of entropy from this categorical setting; this will come, so here we stop, and the rest is literally ahead of us. But again it must be understood: taken as a definition, this is a stupid formula written by mathematicians. Of course it was useful in
Shannon's treatment, when he was choosing the units. But as a physical formula it is true and highly nontrivial, because it tells you how, out of the behavior of atoms, to derive something happening in the real world. It is not just a stupid definition, which, by the way, I never could swallow: you look at a textbook and it says entropy is this expression. No. In the physical formula there is a constant, and then it has deep meaning, because entropy is something like the number of states, and then it can be computed in this way; you cannot compute it from the bare definition. Of course that is a question of exposition in physics; people mix things up, and mathematicians, as usual, take a rather simplistic view and do not explain it properly. Now, before going to one of the essential features (and again, it is incredible how things repeat in history; some of these trivial points were recently brought to everybody's attention as if new): there is multiplicativity, the fact that the entropy of two independent systems just adds up. It is of course one of the most essential features of entropy; it looks very much like probability, right? So the entropy of A times B is the entropy of A plus the entropy of B. But there are lots of other numerical invariants like that. If you only insist on this property and try to build this Bernoulli-Grothendieck semigroup without topology, using only this multiplicativity, you cannot pin anything down: unfortunately, you have lots and lots of such invariants, and for some reason they are all wrong. For example, take a bunch of p_i and form the sum of the p_i squared, perhaps to the power one half, like an L^2 norm; the L^2 norm is also multiplicative. So let me write it down. The key point which distinguishes entropy from these very similar invariants is one
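For instance, the 'collision' quantity, the sum of the p_i squared, is multiplicative under products of spaces, so additivity of the log cannot by itself single out entropy. A quick check (numbers mine):

```python
from itertools import product

def collision(p):
    # sum of squared weights: an L^2-type invariant of a probability space
    return sum(x * x for x in p)

p = [0.5, 0.3, 0.2]
q = [0.7, 0.3]
pq = [a * b for a, b in product(p, q)]  # the product space p x q

# multiplicativity: the invariant of the product is the product of the
# invariants, i.e. its log is additive, exactly like entropy
assert abs(collision(pq) - collision(p) * collision(q)) < 1e-12
```

The same holds for the sum of the p_i to any fixed power, which is why infinitely many 'entropies' pass this test and an extra condition (continuity, behavior under normalization) is needed to pick out the real one.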
identity of entropy. There are lots of these invariants. At some point I learned from people in applied mathematics, who were rather surprised themselves, that papers keep being written by people who discover for themselves that almost any such expression has this multiplicativity. If you think about it in mathematical terms, you go to the Laplace transform, and the value of a function at every point gives you a multiplicative morphism from functions to numbers; there are very many of them. So I thought: oh, there are other 'entropies'. But of course none of them has any physical significance, despite the flow of papers; even in Nature some scientists keep rediscovering this elementary stuff. The reason is that entropy is not like the value of a function at a point. If you look at the corresponding transform, entropy is not the value of a function at an ordinary point but the value at a singular point, which is what makes it subtle: it is a value at a singular point, where 'value' does not naively make sense, because things are not normalized. Look: you have numbers p_i, and you associate to them something like the sum of the p_i squared; maybe you have to take a square root, I am not certain, it probably makes no difference. If you multiply the spaces, these quantities get multiplied; that is true. But a priori it is unclear why this is no good, and yet it is no good. The essential thing is that this invariant has the multiplicative property for all p_i, normalized or not; whereas entropy needs an extra condition: entropy is not multiplicative if you consider all measure spaces, without normalization. So there is a normalization factor, and entropy is not polynomial but a rational function; and just as with the Mendelian map, it is crucial that it is rational and not just polynomial. It
cannot be polynomial and have this property. So there is a singularity coming from this renormalization; renormalization acts there, and because of that something slightly more subtle happens, and you have to keep track of it. Of course a probability space can be thought of not as normalized, but as an object where you either divide by, or remember, the action of this multiplicative group. I do not know exactly how to set this up correctly, but then entropy is not a number: even here, this entropy and these probabilities are not numbers, they are quotients by the action of this group. And in the correct setting you do not divide by the group; you keep the object together with its symmetry. I have not thought carefully about how to do this. So that is one mathematical distinction between entropy and the other invariants; the other is continuity: the others are not continuous in this weak sense, and entropy is the only one which is. OK, so let us make a ten-minute break; keep this in mind. So, the essential properties. One is additivity: when you have two independent systems and make observations, the number of states of the combined system is the product of the numbers of states, each pair of states giving a state, so entropy is additive. If the systems interact, then it is subadditive, of course, because some degrees of freedom are closed off: the systems must agree with one another. But what is slightly more subtle is strong subadditivity, for overlapping subsystems; the overlap, subsystem two, is not well shown in my picture, it is like this, I could not draw it better. There is this inequality, and it has no immediate intuitive justification. In the classical case one formally follows from the other in a certain sense, but not always; and in dynamical systems this relative property is sometimes not true. And then this is one of the corollaries which is more robust than the inequality itself, because there is no minus sign.
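In symbols, strong subadditivity for a triple of random variables reads H(X1,X2) + H(X2,X3) >= H(X1,X2,X3) + H(X2). A minimal numerical check (the joint distribution is my own example):

```python
import math

def H(dist):
    # Shannon entropy (bits) of a dictionary {outcome: probability}
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    # marginal of a joint distribution {(x1,x2,x3): p} on given coordinates
    out = {}
    for xs, p in joint.items():
        key = tuple(xs[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# a small, deliberately correlated joint distribution of (X1, X2, X3)
joint = {(0,0,0): 0.3, (0,1,1): 0.2, (1,1,0): 0.25, (1,0,1): 0.25}
lhs = H(marginal(joint, (0, 1))) + H(marginal(joint, (1, 2)))
rhs = H(joint) + H(marginal(joint, (1,)))
assert lhs >= rhs - 1e-12  # strong subadditivity
```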
You see, if you look carefully and put the entropy of subsystem two on the right, the total entropy enters with plus, plus, minus: the entropy of two is subtracted, which makes it a little touchy. And there is an extra point, a property of entropy: equality holds if and only if all the atoms are equal, the homogeneous system. Then there is the reduction property, which is naively what you would most expect: a smaller system has a smaller number of states. In quantum mechanics, incredibly, this fails; and, getting a little ahead of myself, it is not even fully true in dynamics. In dynamics there is dynamical entropy, and up to a certain point, in the theory started by Kolmogorov and the people who followed him, entropy was not only an invariant of dynamical systems: it was monotone decreasing under morphisms of dynamical systems, and the theory was restricted to a special class of groups, entropy being undefined for the others. Then, about four years ago, the next step was made by Lewis Bowen, who discovered that there is an entropy also for other groups, such as the free group, which is an invariant of the dynamical system but is not monotone under reduction. It is quite remarkable work, very much in the spirit of what we discuss here, though different. And then again, there is one of the inequalities which follows and which is purely positive, without minus signs, and it has a geometric manifestation, a geometric corollary: you project a subset of three-dimensional space onto the three coordinate planes, the notation is self-explanatory, and you have this inequality. Indeed it is an isoperimetric-type inequality, because on one side there is volume and on the other the areas of the projections; not the area of the boundary, but up to a constant that does not matter. And it is kind of obvious that each projection only decreases area.
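The projection inequality in question is the discrete Loomis-Whitney inequality: |S|^2 <= |P_xy(S)| * |P_yz(S)| * |P_xz(S)| for a finite S in Z^3, where the P are the coordinate-plane projections. A toy check on a set of my choosing:

```python
# Loomis-Whitney / Shearer in the discrete setting: the square of the size of
# a finite set in Z^3 is bounded by the product of the sizes of its three
# coordinate-plane projections
S = {(0,0,0), (0,0,1), (0,1,0), (1,0,0), (1,1,1), (2,1,0)}
Pxy = {(x, y) for x, y, z in S}
Pyz = {(y, z) for x, y, z in S}
Pxz = {(x, z) for x, y, z in S}
assert len(S) ** 2 <= len(Pxy) * len(Pyz) * len(Pxz)
```

Equality cases are boxes (products of sets in each coordinate), mirroring the equality case of subadditivity at independent, homogeneous distributions.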
A very crude thing; but because it is a product, for certain shapes this inequality is qualitatively sharper than the more precise, usual isoperimetric inequality. And again, it is unknown what the corresponding rotation-invariant inequality should be, though there are some results. Then there is a rather amusing, purely algebraic statement over any field: pure algebra, where the numbers enter via ranks of bilinear forms. So you might think everything here was about real numbers, quantities, measures; however, this is stronger, or rather more general: it implies the three-dimensional isoperimetric statement. The point is that if you look at the extremal case, where equality may occur, you arrive at very special diagonal forms, which come from measurable sets; so it is not that deep a generalization. However, strong subadditivity is not true in this linearized context, at least in its naive form; or I do not know, maybe it is. And then there is this Mendelian business, which really belongs to entropy: there is the Mendelian map, and entropy goes up when you mix populations; they become mixed, and equality, constancy, happens only at what are called equilibrium states. An equilibrium state is given by a matrix (I hate this word, of course, because I do not know what a matrix is mathematically; it is just a table, which is not quite a mathematical term) whose entries are each the product of an entry here and an entry here; that is the equilibrium state. These distributions are exactly the ones invariant under the Mendelian map, and exactly the ones where, taking the two reductions (observing from here one reduction, from there the other), the entropy of the whole, which in general is smaller than the sum of the two, is actually equal to it; equilibrium means the two factors are independent. Algebraically this corresponds to matrices
of rank one. So the key point is rank one: this operation is what in algebraic geometry is called the Segre map; the Segre map sends pairs of forms to forms of rank one. The most significant instance of this is for quadratic forms: in the space of quadratic forms, the positive ones make a cone, and in the corresponding projective space sit the forms of rank one; their convex hull is exactly the positive semidefinite forms. And from there, in a way, quantum probability starts: classical probability is the positive quadrant, which defines probability, and here you have instead the space of positive definite quadratic forms and its extremals; you replace the permutation group by the orthogonal group, so here is the orthogonal group and here a kind of minimal orbit it acts on. There is a lot of nice and simple mathematics attached to that. And now it is quite clear what the homogeneous spaces are, and the key definition, geometric and analytic, is how you measure the distance between two probability spaces. There are two ways for things to be close: either you throw away subsets of small measure, or you multiply all entries by a function close to one. But 'close to one' must be properly normalized, because when you take powers you have exponentially many terms; so what should be small is the exponent, and you normalize by the log of the number of entries. That is the key point. By the way, when you do large deviation theory, there is this very annoying number n: you have sequences of probability objects parametrized by n, you take logs of probabilities, these become your quantities, and you develop large deviations; but this n is annoying, because everything depends on it.
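The equilibrium condition above (the table being a product p_i q_j, that is, rank one) is exactly the case of equality in subadditivity. A small check, with numbers of my own:

```python
import math

def H(ps):
    # Shannon entropy (bits) of a list of probabilities
    return -sum(p * math.log2(p) for p in ps if p > 0)

# an equilibrium ("rank one") table: entries m[i][j] = p[i] * q[j]
p = [0.6, 0.4]
q = [0.5, 0.3, 0.2]
m = [[pi * qj for qj in q] for pi in p]
joint = [x for row in m for x in row]
rows = [sum(row) for row in m]                              # first reduction
cols = [sum(m[i][j] for i in range(2)) for j in range(3)]   # second reduction
# for a rank-one table, entropy is exactly additive
assert abs(H(joint) - (H(rows) + H(cols))) < 1e-9

# perturb away from rank one (same marginals): strict subadditivity appears
m2 = [[0.35, 0.15, 0.10], [0.15, 0.15, 0.10]]
j2 = [x for r in m2 for x in r]
r2 = [sum(r) for r in m2]
c2 = [sum(m2[i][j] for i in range(2)) for j in range(3)]
assert H(j2) < H(r2) + H(c2) - 1e-9
```

The gap in the second case is the mutual information of the two reductions, which vanishes exactly on the image of the Segre map.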
In this particular instance n is the cardinality of something, but of course it is an artifact, just a number we bring in artificially, which is somewhat unpleasant; I do not know how to truly avoid it. And the limit which happens here is what is nowadays called the tropical limit, which existed under other names for a long time. So there are two distances, additive and multiplicative, and of course they are exactly motivated by the law of large numbers: they are exactly the distances for which things work. So there is asymptotic equivalence, and then you define the Bernoulli semigroup: you consider all finite probability spaces, take the sequences of their Cartesian powers with n going to infinity, and take the asymptotic equivalence classes; this is what you call the Bernoulli semigroup. You start with spaces and go to the powers, and the meaning is that you make experiments but repeat them many, many times, and only when what you observe is similar do you declare the spaces the same. And the spaces get multiplied: if you multiply the space p by q, the powers get multiplied, because pq to the power n is the Cartesian product of p to the n and q to the n. So far nothing really different. What is essential is the topology: if you make the definition without topology, you get something huge, uncountable, something horrible. Even if you do it, say, for homogeneous spaces, with rational numbers, you get something with all the rational numbers, maybe even more; as usual, if you forget topology you get all these pathological things. So it is essential to use the right topology; it is geometry and analysis and category theory with a kind of topological twist. OK. So then, how can you define entropy as a number? You say: ha, by Bernoulli's theorem, any sequence of powers is equivalent, in the Bernoulli sense, to a sequence of homogeneous spaces. So you take
these homogeneous spaces, and in the limit you take the entropy of the limiting homogeneous space. It is a kind of fictitious space, it is not really there; to really have it in your hands you have to go to nonstandard analysis. The point is, if you go into nonstandard analysis, you take p to the power n for an infinitely large n, and it is homogeneous. But you do not have to do that: entropy is defined with normalization all the time. Naively, what you take is the entropy of p to the n approximated by a homogeneous space, and you take the log and divide by n. But another way, and I think conceptually the most satisfactory one: you just have this semigroup, you do not know a priori that it reduces to numbers, and you take entropy to be an element of the semigroup. This is how you can proceed in different categories: you have categories with some kind of product, and you do not know a priori what entropy will be. What is subtle, and not clear even in simple examples, is what the right topology is: in many, many naive examples you run out of topologies, there is no natural topology. And even when you have the computation and prove something, even for classical dynamical systems, where there is usually an entropy with its properties, it is unclear whether the topology is unique, whether it is the only one which a priori makes sense; that is not quite clear, but it underlies things all the time. And then Boltzmann equals Bernoulli: it is Bernoulli's theorem, the law of large numbers, that says this semigroup is essentially the real numbers. I say density theorem rather than law of large numbers because there is slightly more to it: technically, the law of large numbers, by any proof, will do; but in the proof, and this is essential, we have for example a reduction, and the statement must agree with reductions, with certain diagrams. It is unclear whether it agrees with all diagrams, as I said, but with many diagrams which are
relevant for us it agrees, because in whatever proof you take there is some functoriality. You do not notice it if you do not think about it, but once you have it, it is extremely useful, in particular when it comes to the inequalities. So, the key inequalities: some of them are rather immediate. [Question from the audience.] Sorry, you say that this is onto, right? An arrow here is surjective; it is onto, not an injection. [The question asks for an example of a homogeneous space whose entropy is not the log of a natural number: which spaces give natural numbers, and which give real numbers?] No, it is just any probability space: take just two numbers, any two, because eventually you have this formula, minus the sum of p_i log p_i, so anything is possible; when the numbers are real you get everything. In each individual H there is, of course, a log of an integer, but the rationals dance over all the real numbers because we go to the limit, so you get everything. Before the limit, again, you get everything only in the limit, because you go to the power n: it is not about individual spaces, it is about the asymptotic equivalence classes, where you divide by n. You have a log of an integer, normalized by n, so at every moment you have a rational multiple, and when n goes to infinity you get everything. And you see, it is a homomorphism, and it extends: for homogeneous spaces the homomorphism goes to logs of integers, but you extend it, and it goes first to the rationals, because you may have, so to speak, two spaces such that p to the n equals q to the m, which of course never happens for individual spaces, but happens in the limit. So there are, secretly, rational numbers; but they are secret, not implemented by actual spaces. When you pass to these equivalence classes,
what I am saying is that at some moment what you have are not actual measure spaces but kind of fictitious measure spaces, and they live in the group — that's usually how it works. But again, you don't have to bother much about that. The point, naively, is that this is just a fancy reformulation of a simple fact: you have P^n, n goes to infinity, and then a sequence of homogeneous spaces asymptotic to it, but they are all different — n varies. When you take entropy you divide by n. In the naive image they run further and further away; of course as n grows the number becomes bigger and bigger, it runs away, but then you divide by n and everything is rescaled. So secretly you do the following: you take P^n, then normalize — take the n-th root, so to speak — and replace it by H_n; and so you have something like a fixed limit of homogeneous spaces. Which of course doesn't make literal sense, but in this semigroup we force it to make sense: when I say "in the semigroup," it means you allow these operations, and then you have it in a group — but these are no longer always spaces; you don't have it in the spaces themselves. And that's kind of the point: you introduce fictitious objects. But it may be something else — in general it may have more parameters, be more elaborate, whatever — and of course it may not be the only right way to do it. So these are the points about the formulas we discussed. Then there is relative entropy, which can be defined in this way — or, even better, we now have a really Grothendieck-type definition: again, for asymptotic equivalence classes of objects, instead of taking powers you just add this relation, and they become a true Grothendieck group.
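A minimal numerical sketch of the fact just described — entropy is additive under products, so the normalized entropy H(P^n)/n of the power space is constant in n, which is what makes the rescaled limit well defined. The function names (`entropy`, `power`) and the example distribution are my own, not from the lecture:

```python
import math
from itertools import product

def entropy(p):
    """Shannon entropy -sum p_i log p_i (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def power(p, n):
    """Atom weights of the product space P^n: products over n-tuples of atoms."""
    return [math.prod(t) for t in product(p, repeat=n)]

p = [0.5, 0.25, 0.25]
for n in (1, 2, 3):
    # the normalized entropy H(P^n)/n does not depend on n
    print(n, entropy(power(p, n)) / n)
```

For a homogeneous (uniform) space on k atoms this value is log k, the "log of an integer" of the lecture; general spaces are reached only in the asymptotic limit.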
And the point — I'll write the formula, and then explain it in a certain language; this was the problem we discussed. Okay, now there is the following — maybe you have seen it many times — this description of the Shannon inequality. This is just subadditivity of entropy: there is an extra operation over measure spaces — it's not completely well defined, but in a second I'll say what it is. And then there is strong subadditivity, which looks a little erratic: it is the same thing, but for morphisms. The difference between subadditivity and strong subadditivity is that one is for objects and the other is for morphisms, and once you have the right language, one follows from the other. This very simple, uncontestable formula for arrows implies a slightly more elaborate formula for objects. So: a formula for spaces; the same formula for arrows; and if you write it down, you get this somewhat more complicated formula for objects, which is the one usually used. Now let me explain how things work, because it looks a little mysterious why one formula can be derived from the other. Let's look at the simplest example, the usual Shannon inequality, and how you can think about it. Whether the measure spaces are finite or infinite is immaterial for the computation, but I'll draw pictures here, so let it be finite. I have a subset, and I project it here and project it here: I have, say, a probability measure distributed in this square, so I get some probability measure here and some probability measure here. I imagine everything being discrete — I divide everything into little pieces, so I don't have to care about the continuum. So this is one finite measure space.
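The two inequalities just contrasted can be written out explicitly; this is my transcription in standard information-theoretic notation of the formulas the lecture puts on the board (subadditivity for objects, strong subadditivity as the conditional version):

```latex
% Subadditivity of entropy ("for objects"):
H(X, Y) \le H(X) + H(Y)

% Strong subadditivity ("for morphisms"; conditioning on Z):
H(X, Y, Z) + H(Z) \le H(X, Z) + H(Y, Z)
```

Setting Z to a one-point space in the second formula recovers the first, which is one way to see that the morphism-level statement is the stronger one.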
Here is another finite measure space. Here is P, here is Q, and in the center, R; and I want to say that the entropy of R is less than or equal to the entropy of P plus the entropy of Q — this basic inequality. So I have this matrix, I project onto the coordinates, and I have a probability space here and a probability space here — total sum 1 — and in the center I want this formula for the joint distribution. How do you prove this from the Bernoulli theorem, the law of large numbers? You observe that whether it is true or not, nothing changes if I take a power, because all the quantities — and entropy may be understood, if you wish, as the sum of −p_i log p_i — behave multiplicatively under products. So I can send n to infinity: all quantities get multiplied by n, and then I normalize back. So I imagine I just put n equal to infinity, and then I say: aha, I apply the law of large numbers, but functorially. Which means: when I go to this imaginary limit, what was a measure becomes, in the limit, just a subset on which the measure is constant — just a subset. Moreover, when I project it here and project it here, the fibers of these projections all become constant. So what kind of set is it? Typically it looks like a little square: when I project it, the slices are all lines of the same length, unless they are empty, and here all the same. And for such a set it is obvious, right, that the cardinality of the set is at most the product of the cardinalities of the two projections — here are the two projections, and the whole square is of course bigger. And because of the density — because in the limit it is like that — it is true for the original logs. So again, what you use is that cardinality is monotone increasing under injective maps.
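The inequality being proved here, H(R) ≤ H(P) + H(Q) for a joint distribution R on the square and its two marginal projections P and Q, can be checked directly; this is an illustrative sketch with a random joint distribution of my own choosing, not the lecture's example:

```python
import math
import random

def entropy(p):
    """Shannon entropy -sum p_i log p_i (natural log)."""
    return -sum(x * math.log(x) for x in p if x > 0)

random.seed(0)
# a random joint distribution R on a 4 x 5 grid (the "matrix" of the lecture)
w = [[random.random() for _ in range(5)] for _ in range(4)]
total = sum(map(sum, w))
r = [[x / total for x in row] for row in w]

p = [sum(row) for row in r]                       # marginal P (project onto rows)
q = [sum(row[j] for row in r) for j in range(5)]  # marginal Q (project onto columns)

h_r = entropy([x for row in r for x in row])
# subadditivity: entropy of the joint distribution is at most the sum
# of the entropies of its two projections
assert h_r <= entropy(p) + entropy(q) + 1e-12
```

Equality holds exactly when R is the product measure P × Q, i.e. when the two coordinates are independent.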
And from that this follows. On the other hand, if you spell it out algebraically, it is a statement about the function x log x — convex or concave, I keep forgetting; convex — its second derivative has constant sign. It is opposite to log: log looks like this, and x log x looks like that, if I'm not mistaken; they bend in different directions. So the convexity property of x log x follows from the fact that when one set embeds into another, the bigger set has the bigger cardinality — you don't have to do calculus. And indeed, my objection to calculus: of course you can make a computation, but then there is no real difference between a computer-aided proof and a computation-aided proof on paper or in your mind. The computation is sometimes not in your head, it's on the paper; you need calculus to do it; it's not automatic. Up to this point I use no calculus at all — everything stays in your head. This is exactly how Grothendieck insisted mathematics should be done: things should become absolutely obvious without any computation. Of course you accept the statement that in the limit things become asymptotically constant; and there is a proof — it follows from the Pythagorean theorem, if you understand what a Hilbert space is; it is just the Pythagorean theorem, and it gives you, in fact, more than the law of large numbers. So this is one example; on the level of sets, of course, it was obvious to start with: any set has area less than the product of the areas of its projections. But the relative inequality is stronger — for example, it implies the Loomis–Whitney theorem. So let me explain what happens there. Now we have something in dimension 3, in the cube — again I prefer to think about everything combinatorially, because continuity is immaterial here, but geometrically: you have a subset in three-space.
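The convexity of x log x mentioned above (second derivative 1/x > 0 on the positive axis) can be verified without calculus by checking the midpoint inequality numerically; a small sketch, with the sample grid my own choice:

```python
import math

def f(x):
    """The function x log x, defined for x > 0."""
    return x * math.log(x)

# convexity: f at the midpoint never exceeds the average of f at the endpoints
xs = [0.1 * k for k in range(1, 50)]
for a in xs:
    for b in xs:
        assert f((a + b) / 2) <= (f(a) + f(b)) / 2 + 1e-12
```

Note the orientation: x log x is convex while log x is concave, which is the "different direction" of the lecture.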
Then you project it onto the three coordinate planes — this plane, this plane, this plane — and the squared volume of the set is less than the product of the three areas. Again, what is immaterial is that it's Euclidean space; what's essential is that three-space is a product of three measure spaces. So we have a subset Omega in the product X1 × X2 × X3, and the squared volume of Omega is at most the product of the areas of the projection here, the projection here, and the projection there. Projection is the best thing you can do in measure theory, and that's what we do now. One point to keep in mind — obvious from our definition, but there is a subtlety here — is that the entropy of P is less than or equal to the log of the cardinality of its set of atoms. Certainly, when you apply the law of large numbers, you go to a very high power, and all the atoms become equal, except that you throw away a set of zero measure. But that zero-measure set may have huge cardinality — it may dominate everything cardinality-wise; you may throw away almost everything, yet measure-theoretically you keep what matters. Then you count the number of atoms that remain, and you get this inequality. But it is not at all obvious why equality holds if and only if all atoms are equal. One direction is clear: if all atoms are equal, you have equality; but the other direction this argument does not give you. If you know the formula −Σ p_i log p_i, then because this function is analytic it follows; but you need some analysis there — there is no simple direct proof, for all I know, without appeal to analysis. Of course you can make a computation; I prefer to say the function is analytic, therefore the extremum must be unique: it is a one-dimensional problem, an isolated maximum, and for a strictly convex analytic function there is only one extremal point.
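The discrete form of the Loomis–Whitney inequality just stated — the squared cardinality of a finite set in a triple product is at most the product of the cardinalities of its three coordinate-plane projections — can be tested directly; the random subset is my own illustration:

```python
import random

random.seed(1)
# a random subset Omega of a 6 x 6 x 6 grid, playing the role of
# the subset of X1 x X2 x X3 in the lecture
omega = {(x, y, z)
         for x in range(6) for y in range(6) for z in range(6)
         if random.random() < 0.3}

# the three coordinate-plane projections
pxy = {(x, y) for (x, y, z) in omega}
pxz = {(x, z) for (x, y, z) in omega}
pyz = {(y, z) for (x, y, z) in omega}

# discrete Loomis-Whitney: |Omega|^2 <= |pi_12(Omega)| * |pi_13(Omega)| * |pi_23(Omega)|
assert len(omega) ** 2 <= len(pxy) * len(pxz) * len(pyz)
```

Taking logs of both sides gives exactly the entropy statement 2 H(Omega) ≤ H(π12) + H(π13) + H(π23) for the uniform measure, which is the form the lecture proves.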
But there you go outside the language of categories. This appears in other instances too: we will be using this argument and always having trouble with the extremal cases. Here you can handle it, but, say, in the case of the Shannon inequalities for von Neumann entropy it will not be so clear. Now, coming back: what we actually want to prove is the entropy statement. We have some density, some measure here; you project it onto the three coordinate planes, and you want to say that the entropy of this is less than the sum of those entropies — a sum, because you took logs. So again we go to a very high power and pass to the limit. Here it was a subset anyway — from the very beginning this is immaterial, it becomes a subset — but what is essential is that you can assume that all the projections, all the slices in all coordinates, have constant values: all the functions involved. Again, when you have a subset and you compute its volume from the areas, what you do, of course, is intersect with lines and integrate; what was a function now becomes a constant. And the picture you get after this transformation is exactly the picture of rectangles we had before, or maybe two of them: all the projections become constant. Of course they are not literally like that — everything is up to measure-preserving transformations that respect the coordinates — and for such pictures what I said becomes a tautology; the property is of course true, because essentially what you use is that the product of the projections is bigger than the set itself. So it is a little mysterious — I haven't fully explained it; I will say more in a second, but we will not truly prove it. It may look a little tricky: the mathematics behind it is genuinely non-trivial — the law of large numbers mixes things and makes things constant.
But if you think formalistically about how it happens — how to say explicitly that monotonicity of cardinality under injective maps implies something about measure theory, where all maps are surjective, right? — we have to enlarge our category. A general principle for enlarging any category is to consider a category of diagrams: given any category, consider diagrams of whatever shape — any graph, where you may add conditions that certain cycles commute. The diagrams in our case will be like this, and these diagrams form again a category; the relevant one for Shannon theory is the category of fans with two arrows. We have one object — one space — and two maps; so it is a pair of maps from one space. Now in this category you do have monomorphisms, and they are very simple things. Set-theoretically, if you translate back, it means: you have one measure space P, it goes to Q1 and it goes to Q2, and the resulting map to Q1 × Q2 is injective as a map of sets. Of course you don't have to say it this way — the whole point of the categorical language is that secretly it helps to know what happens, but if you want to generalize, you don't have to use the sets. Now you have this notion of an injective fan, and then, because of the immateriality in the Bernoulli law of large numbers, everything is approximated by homogeneous spaces; and for homogeneous spaces, injective means sets going to subsets — the bigger set has the bigger cardinality — and when you translate back to entropy, you get that the entropy of P is less than or equal to the sum of the entropies of the two Q's. It means: you have a product, you have a subset, and the entropy of the subset is always smaller than the entropy of the product. And the same language works for morphisms as well.
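The injective-fan statement has a completely elementary set-theoretic core, which can be sketched in a few lines; the particular maps f and g here are my own toy example of a fan, not from the lecture:

```python
import math

# an "injective fan": two maps f, g on P such that x -> (f(x), g(x)) is injective
P = list(range(12))
f = lambda x: x % 4      # P -> Q1
g = lambda x: x // 4     # P -> Q2

# injectivity of the fan: the combined map into the product loses no points
assert len({(f(x), g(x)) for x in P}) == len(P)

Q1 = {f(x) for x in P}
Q2 = {g(x) for x in P}
# cardinality is monotone under injections into the product
assert len(P) <= len(Q1) * len(Q2)
# for homogeneous (uniform) spaces entropy is log of cardinality,
# so the entropy inequality H(P) <= H(Q1) + H(Q2) follows by taking logs
assert math.log(len(P)) <= math.log(len(Q1)) + math.log(len(Q2))
```

The lecture's point is that after the law-of-large-numbers approximation, the general entropy inequality reduces to exactly this counting statement for homogeneous spaces.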
You look at all this injectivity, and essentially, if you look carefully at the limit picture, it is the one we described for the Loomis–Whitney theorem, which contains everything. Now to the next point I was talking about: this way of thinking about entropy, where you have this operation between spaces, gives yet another way to think about entropy. You see, the Shannon inequalities have many shapes, and it is relevant to have this picture — many ways of saying the same thing — and here is the way it usually enters textbooks: not measure spaces anymore, but partitions of a fixed measure space. There is one partition, and another partition, like that, and what you see here is obtained by intersecting all the elements. And what is the entropy? Because you have a partition of a probability space, you have these weights p_i, and you write these formulas. But now we want to say it without formulas, and without knowing what the measure space is; in fact, you can define measure spaces starting from the category of finite probability spaces, and then this operation will enter. So let me give you the proper formalism. It is as follows: if you have any category — in our case the category of probability spaces, but the same works for a general category — we can adjoin something to it, namely covariant functors from our category to the category of sets. This sounds formidable, but it is an extremely general construction, used everywhere in mathematics — in topology, in analysis, everywhere. It is so simple that sometimes you don't notice it. But the point is that when you have an abstract measure space X — and the usual presentation of measure spaces, with all their data, in my view was never rigorous; it is just impossible, a huge mess in a way — it is like
explaining what a matrix is as a square table on the blackboard — and then, what is a blackboard? It is a really long discussion. The same here: there are sets, but there are null sets; you take them, you don't take them — it is really absurd, a shame to do it that way. Because in fact what you have is this category of finite measure spaces, and you take covariant functors from there to sets, with certain properties. Now, what are these properties, and how do you think about such a covariant functor? A covariant functor means: to each object P here you assign a set, and this set will be, of course, the set of morphisms from here to here. It means you add one object to your category and allow morphisms from this object to all the other objects of yours; that is, you have an extension of your category. Then there is one particular feature which you need in order to have a measure space. Of course, in our category it is very simple to say naively what a measure space is: what value do you assign to a given set — you consider all the measure-preserving maps, which means all partitions with weights equal to the given numbers. But again, the point is that you don't have to think about this; when you start writing things down you don't have to spell it out. It sits in the back of your mind, but notationally it disappears: you don't need the underlying set, you don't need the points; all you need is that it is some object in this category, and this makes sense. The essential feature you need in order to have it is as follows: you have this abstract X — it is not a space, it is just a symbol — and it goes to P1, P2, ..., Pk. It is not one morphism but many; there is always an intermediate Q in our old category and one single arrow, and the maps factor through it. Which means exactly that, given the many arrows, you can intersect them all, and there is one maximal object in this category of fans; there is
one maximal — or minimal, whatever — object, and this minimal Q is the one we use. It depends secretly on X, and it is not a canonical operation — it depends on the functor you use — but its essential property is that this fan is an injective fan: the arrows resolving X become injective. And therefore our entropic inequality follows from the corresponding inequality for injective maps between sets. The point — and this I will elaborate in my next lecture — is that when you use measure theory, defining the entropy of a dynamical system or whatever, you don't need sets as measure spaces; you actually never use these sets. All computations involve finite partitions with numbers, and when you start interpreting them in terms of sets you run into problems — what has measure 0, what does not — and you don't have to think about that. All you have to know is that you have these measure spaces, you have this operation with this Q, this secret X; you don't care about it, you prove everything, and then you see what you proved in the traditional language. In fact you prove slightly more than in usual measure theory, because these measure spaces are slightly more general than the usual ones. On the other hand, this generality is fictitious: they are more general, but everything reduces to the usual measure space, because there is a kind of universal measure space which dominates them all. But the point is you don't have to know any of that; all this measure language is absolutely immaterial, and you prove everything. And actually — I will do this next time; it is written in my papers — all of measure theory, like the Lebesgue integral, becomes a tautology there; it comes automatically. You don't have to think about the Lebesgue integral — there is only one kind of integral, right, coming from this functoriality: if you know
something about numbers and these finite measure spaces — if you know what the sum of numbers is — then, applying this functor, you get the Lebesgue integral with all its properties. And these properties — again, you have to check them; they are properties of numbers, but category theory tells you what you have to prove. An instance of this, in ergodic theory, is Kolmogorov's theorem: if you pass to this category — you know what he proved, that entropy is an invariant of dynamical systems — his contribution, essentially, is the convergence statement; everything else is, you know, hand-waving categorical language. And of course this hand-waving is fine once you have the language; without it, it is not so easy. So next time I will explain how to go from finite measure spaces to infinite measure spaces. But this, of course, is only one path — I think it just cleans up classical things. The question is what the new thing should be, with the same phenomena — the probability phenomenon, the entropy phenomenon — but there things become more and more problematic. So, returning to my schedule, let me show you what awaits us in the future — I don't know how to make it fast, or maybe it is too long. We will go through all kinds of entropy, and then I want to describe measure theory cohomologically, and look at physical systems, like systems of particles; it is too boring to go through all of it, but there topology will be replacing — not playing with, replacing — numbers. And what is more interesting... okay.