Okay, I think we can start. We will have one lecture from Nima, from the Institute. Please.

All right, well, it's absolutely wonderful to be at the school again. It's great to see so many of you here. Originally I was going to give two lectures on various basic aspects of quantum entanglement. Unfortunately I only have time for one lecture, so I decided to spend it on a topic that's really possible to cover in one lecture. The topic of quantum entanglement has been a very important and interesting one for many decades, and of course it's something that people thinking about quantum gravity have also thought about for many decades, but as you all know, people have been thinking about it a lot more again recently. You're going to get, I'm sure, a beautiful set of lectures by Hiroshi later in the school, but there are a few nontrivial facts about the very basic objects. The very basic objects are von Neumann entropies for density matrices — if you trace out parts of a Hilbert space, von Neumann entropies for the density matrices of the various subsystems — and there are a few quite deep results about these entropies that were really established in the 1970s, mostly by the work of Elliott Lieb together with a number of collaborators. For example, there's the famous statement about the strong subadditivity of entropy that plays such a big role. Essentially every nontrivial fact that you're familiar with involving these entropies boils down to the strong subadditivity statement, so let me quickly remind you what the strong subadditivity statement is.
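Before the statement itself, the basic objects just mentioned — von Neumann entropies of reduced density matrices — are easy to make concrete numerically. A minimal numpy sketch (the helper names and the two-qubit example are mine, not part of the lecture):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log rho, computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop (numerically) zero eigenvalues: 0 log 0 = 0
    return float(-np.sum(evals * np.log(evals)))

def partial_trace(rho, dA, dB, keep='A'):
    """Trace out one tensor factor of a density matrix on a dA*dB-dimensional space."""
    r = rho.reshape(dA, dB, dA, dB)       # indices (a, b | a', b')
    if keep == 'A':
        return np.einsum('ijkj->ik', r)   # trace over B
    return np.einsum('ijil->jl', r)       # trace over A

# maximally entangled two-qubit state: pure globally, maximally mixed reduced
psi = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
s_full = von_neumann_entropy(rho)                       # ~0 (pure state)
s_red = von_neumann_entropy(partial_trace(rho, 2, 2))   # ~log 2
```

Tracing out half of an entangled pure state produces a mixed state with nonzero entropy, which is the phenomenon all of the following inequalities constrain.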
If you have some density matrix ρ, we define S(ρ) = −Tr ρ log ρ, as usual. The statement of strong subadditivity is that if you have a density matrix ρ_ABC for three systems A, B, C, then

S(ρ_ABC) − S(ρ_BC) ≤ S(ρ_AB) − S(ρ_B)

— and I'll tell you how to remember this; it's always hard to remember the statement. I hope the notation is obvious: we start with a giant density matrix ρ_ABC, and ρ_BC is the density matrix you get by tracing over the subsystem A, and so on and so forth. This is the famous statement of the strong subadditivity of entropy. In fact this statement and all the related statements follow from one master fact, and it's this one master fact that I want to spend this one lecture explaining. It has to do with another object, which I think is a little more fundamental even than the ordinary von Neumann entropy: the notion of relative entropy, and the fact that relative entropy is monotonic under coarse graining — it always goes down when you trace out parts of the system. So let me just tell you what the definition is, and we'll motivate it. The lecture has four parts: I'll motivate relative entropy classically; then we'll talk about some very simple statements about convexity that lift from ordinary functions to operator-valued functions; then, to make the proof a little more understandable and motivated, I'll introduce some nice quiver notation — as model builders we don't call them quiver diagrams, we call them moose diagrams, but anyway, there's some quiver or moose notation for talking about quantum states; and then I'll go through the proof. So that's the outline.
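The strong subadditivity inequality can be checked numerically on random density matrices. A small numpy sketch (the three-qubit setup and variable names are mine): build a random ρ_ABC, form the reduced density matrices by contracting the appropriate index pairs, and compare the two sides.

```python
import numpy as np

def entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]                # 0 log 0 = 0
    return float(-np.sum(ev * np.log(ev)))

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
rho = X @ X.conj().T
rho /= np.trace(rho).real              # random density matrix on A (x) B (x) C (3 qubits)

r = rho.reshape(2, 2, 2, 2, 2, 2)      # indices (a, b, c | a', b', c')
rho_AB = np.einsum('abcdec->abde', r).reshape(4, 4)  # trace out C
rho_BC = np.einsum('abcaef->bcef', r).reshape(4, 4)  # trace out A
rho_B  = np.einsum('abcaec->be',  r)                 # trace out A and C

# strong subadditivity: S(AB) + S(BC) >= S(ABC) + S(B)
lhs = entropy(rho_AB) + entropy(rho_BC)
rhs = entropy(rho) + entropy(rho_B)
assert lhs >= rhs - 1e-10
```

Any random seed works, of course — the point of the lecture is that this inequality is a theorem, not an accident of the sample.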
I should say that the reason I'm giving this lecture is that when I talk to people — grad students, even some postdocs working in this field — I find that most of them don't know the proof. The reason is that the proof is buried in the back of textbooks, and the original proofs of Lieb and Ruskai are somewhat complicated; they involve some relatively heavy machinery about operator convexity. Now, these proofs have been simplified, so if you Google — which is what I did when I was trying to learn the subject — "simple proof, strong subadditivity," you get a list of papers, most of which are superficial rewritings of the deep results of Lieb and Ruskai that make them look superficially simpler. But I ran across a very beautiful paper by Nielsen and Petz, on quant-ph from 2004. You know, when you're looking for an understanding of something, you want an understanding such that when people come to you in the middle of the night, put a gun to your head, wake you up, and say "why is strong subadditivity true?", you can remember why it's true — there's a moral reason why it's true, a reason it makes sense. I didn't see that in any of the other works, and here it's possible. After this lecture, if someone puts a gun to your head in the middle of the night and asks you to prove the monotonicity of relative entropy, you'll know why it's true. It involves some very simple but also somewhat deep ideas about ways of thinking about operators and states in interesting ways. Anyway, before getting into it, let me just tell you what the master statement is, and at least how strong subadditivity follows from it.
The master statement involves this notion of relative entropy, an object built from a density matrix ρ and another density matrix σ. The relative entropy between ρ and σ — I'll use the notation S(ρ‖σ) — is, and we'll motivate this in a bit, this interesting object:

S(ρ‖σ) = Tr ρ log ρ − Tr ρ log σ.

It doesn't just depend on one thing; it's a measure of how different two density matrices are, and it's asymmetrical between ρ and σ. Now, the claim is that if you have two density matrices ρ and σ on a big Hilbert space, and you trace out any part of the Hilbert space to go to a smaller Hilbert space with new density matrices ρ′ and σ′, then this object always goes down — it monotonically decreases under any tracing out. That's sort of built into the structure of Hilbert space in quantum mechanics. Quantum mechanics is nothing other than the linear algebra of operators with positive eigenvalues, and built into that structure is something that's monotonic: it always goes down under coarse graining. And of course, as many of you know, in the past number of years there have been beautiful proofs of things like the c-theorem that follow not from complicated playing around with correlation functions but more simply from looking at things like strong subadditivity, and that ultimately boils down to something which is monotonic in any quantum mechanical system. So that's the really deep fact: relative entropy always goes down under coarse graining, under tracing out any part of the Hilbert space. [Question from the audience.] Yes — let's just assume finite dimensions; it doesn't really make a difference. The only thing you have to worry about is whether some of the eigenvalues of ρ or σ are zero, but that's not going to make a difference.

Let me quickly relate this statement to strong subadditivity. Apply the statement to the relative entropy between ρ_ABC, that's one guy, and 1_A ⊗ ρ_BC, where 1_A is the identity on the Hilbert space of A. On the one hand, the first term, Tr ρ log ρ, is −S(ρ_ABC), minus the usual von Neumann entropy. In the −Tr ρ log σ term, because of the identity on A, I just end up tracing over A, so I get −Tr ρ_BC log ρ_BC, which is +S(ρ_BC), the entropy of system BC. So

S(ρ_ABC ‖ 1_A ⊗ ρ_BC) = −S(ρ_ABC) + S(ρ_BC),

and monotonicity tells me this is always greater than or equal to what I get after tracing out the system C: the relative entropy between ρ_AB and 1_A ⊗ ρ_B, which by exactly the same logic is −S(ρ_AB) + S(ρ_B). So monotonicity tells you that

−S(ρ_ABC) + S(ρ_BC) ≥ −S(ρ_AB) + S(ρ_B),

and that's exactly the statement of strong subadditivity. All right, so what we're going to do in the rest of the lecture is motivate this notion of relative entropy and prove its monotonicity — classically, where it's trivial, and quantum mechanically, where it's more interesting. Any questions right now? Very good.

So first — there are lots of ways of motivating relative entropy; Shannon had some very nice reasons, which I won't have time to talk about, so we'll just get to the heart of it quickly. Imagine you're given two probability distributions, {p_i} and {q_i} — the p's add up to one, and the q's also add up to one — and we'd like a measure of how different they are. Just classically, we can define

S(P‖Q) = Σ_i (p_i log p_i − p_i log q_i) = Σ_i p_i log(p_i/q_i).

This turns out to be a great measure, and we'll see why in a second. Again, it's a measure of how different the two distributions are from each other. We'll see right away that (a) it's always greater than or equal to zero, with equality only when P = Q — it's not symmetric, but it's nonnegative, with equality only when the distributions are identical — and (b) we'll also see, in a deeply trivial sense, that this object is monotonic under coarse graining. So it's a real measure of the information content of the difference between P and Q: when you coarse grain, you have fuzzier goggles and it's harder to tell the distributions apart, so the difference between them is smaller. This quantity is built to do exactly these two things.

So let's see why that's true. In fact, the way we'll show that it's positive is by showing that it's monotonic under coarse graining, so we'll do that statement first. What do I mean by coarse graining? Imagine the outcomes are bins, with some probability in each bin. Take some bin i and some bin j: in one distribution these have probabilities p_i and p_j; in the other, q_i and q_j. Coarse graining means I combine them into one bin, i+j, whose probability is p_i + p_j for one distribution and q_i + q_j for the other. Is it clear what I'm doing? I have, say, a hundred possibilities for one distribution and a hundred for the other, each with some probabilities, and I take two of the possibilities and merge them into one bin, so the probabilities add: I get p_i + p_j and q_i + q_j.

Now, the reason this thing is monotonic is that under this elementary operation of binning it's trivially monotonic. The statement of monotonicity in this case becomes simply

p_i log(p_i/q_i) + p_j log(p_j/q_j) ≥ (p_i + p_j) log[(p_i + p_j)/(q_i + q_j)].

Why is that statement true? It's almost by definition a direct consequence of the fact that the function f(x) = x log x is convex. If we have any convex function f at all, then it's true that

q_i f(x_i) + q_j f(x_j) ≥ (q_i + q_j) f[(q_i x_i + q_j x_j)/(q_i + q_j)],

where the denominator normalizes the weights if q_i + q_j isn't one. This is just the definition of a function being convex. Since we'll need it again in a bit, let me remind you of trivial facts about convexity. When a function's second derivative is positive we say it's convex; where the second derivative is negative we say it's concave. So suppose you have some f(x) with f″(x) > 0. Then, clearly from pictures, if you have some a and some b, any weighted average of f(a) and f(b) lies above the curve: for a point (1−t)a + tb somewhere in the middle,

f((1−t)a + tb) ≤ (1−t) f(a) + t f(b).

This is obvious from the picture, but another way of saying it: look at the quantity g(t) = f((1−t)a + tb) − [(1−t)f(a) + tf(b)], the difference between the left- and right-hand sides. Clearly g″(t) is positive by assumption, because g″ is just (b−a)² times f″ evaluated along the segment, so g″ is everywhere positive. And g is equal to zero at t = 0 and equal to zero at t = 1, and therefore it can't be positive anywhere in between: if it were positive anywhere in there, it would have to have a maximum somewhere, and at a maximum g″ would have to be negative. So it's obvious from the picture, but we're going to come back to this quantum mechanically in a tiny bit.

Okay, so the function x log x is convex — you can trivially take the second derivative, 1/x > 0 — and that fact is everywhere in stat mech. And we see that the fact that relative entropy decreases under rebinning is literally the statement of the convexity of x log x. Just to finish it: all I'm doing is writing each term p log(p/q) as q · f(p/q) with f(x) = x log x, and then the binning inequality is literally the statement of convexity with x_i = p_i/q_i and x_j = p_j/q_j. So this notion of relative entropy has the property that if you bin, it always goes down. In particular, if I bin everything into one giant bin, what is it? It's zero — the relative entropy is zero in that case — and that's why it's always greater than or equal to zero. Is this clear? This tells you why this object deserves to be thought of as a measure of how much information there is in the difference between P and Q: as you make your goggles fuzzier and foggier, you resolve the difference less and less; as you bin, the amount of difference you can see always goes down. Any questions about this?

If I had more time — actually, it's a very nice exercise I'll leave for you, since I probably won't have time to do it here. We talked about this in the language of rebinning, but you can also say it in a language where the number of bins stays the same: you take a given bin, chop it into a big piece and a small piece of order ε, and then add that order-ε piece to the next bin over. First you increase the number of bins by chopping one into two pieces — which obviously doesn't change the relative entropy — and then you merge the little ε piece into some other bin j. If you do that, it gives you a simple derivation of what are essentially Boltzmann rate equations for how probability distributions change smoothly while keeping the number of bins fixed, and it proves that the relative entropy always goes down. So, you know, these convexity statements and the second law of thermodynamics are the same fact.

Okay, so let's now move on to the quantum statements. Now what we want to do is exactly the same thing, except in quantum mechanics, where of course it's nontrivial. So we define — I'll switch notation a little here, back to the one I used before —

S(ρ‖σ) = Tr ρ log ρ − Tr ρ log σ,

and now it's nontrivial: in general ρ and σ don't commute, and we would like to show that this is monotonic. Before showing that it's monotonic, let me at least quickly show you that it's positive. There's a very simple interpretation of what this guy is that we can give right away — we'll see a slightly more sophisticated version of this interpretation in the actual proof. Let's just compute it. Imagine we diagonalize ρ with eigenvalues p_a, and σ with eigenvalues q_α, and let U be the unitary matrix that takes us back and forth between the two bases, so U_{αa} is just the overlap between the eigenvectors of σ and those of ρ. Then

S(ρ‖σ) = Σ_a p_a log p_a − Σ_{a,α} |U_{αa}|² p_a log q_α.

Now notice that because U is unitary, we know Σ_α |U_{αa}|² = 1, and also Σ_a |U_{αa}|² = 1. So there's a slightly cool thing we can do: we can imagine classical probabilities — a big P_{αa}, in other words not something depending on N variables but something depending on N² variables, α and a:

P_{αa} = p_a |U_{αa}|²,  Q_{αa} = q_α |U_{αa}|².

These are classical probabilities: if we had N eigenvalues for the p's, these are N² numbers, and we can call them probabilities because they add up to one — for the top guy, if I sum over α first I get p_a by unitarity, then summing the p_a's gives one, and similarly for Q. So these individually add up to one, and then it's extremely easy to see that the classical relative entropy S(P‖Q) for this big P and big Q is equal to the quantum relative entropy S(ρ‖σ). Therefore it's greater than or equal to zero, with equality — I didn't say this before, but it's obvious — if and only if ρ = σ. So we can think of the quantum relative entropy itself as a classical relative entropy, but in this doubled space: not in the N-dimensional Hilbert space but in an N²-dimensional space. We're going to see that N²-dimensional space come up in the proof in a somewhat more interesting, sophisticated way, but this is already a hint that we should be thinking not in the Hilbert space itself but in an N²-dimensional space. Any questions about this?

All right, so now let's move on to part two, which is some quantum mechanical statements about convexity. Obviously convexity is going to be important — it was important classically already, and it's important quantum mechanically too. We saw that classical, ordinary convexity is the statement that f(ta + (1−t)b) ≤ t f(a) + (1−t) f(b). You could now say a function f is quantum mechanically convex — there are many notions of convexity, but this is a simple one — if the same thing is true as a quantum mechanical statement where A and B are operators, and of course then we have to take expectation values in any state: for any state ψ,

⟨ψ| f(tA + (1−t)B) |ψ⟩ ≤ ⟨ψ| t f(A) + (1−t) f(B) |ψ⟩.

Now, just because a function is classically convex most certainly does not guarantee that it's quantum mechanically convex. But there is a special function where it's easy to check that it's true, and that's f(x) = −log x: the function −log x is operator convex. This follows if we can show that the second derivative in t of ⟨ψ| f(tA + (1−t)B) |ψ⟩ is always greater than or equal to zero, and that is indeed a property of the logarithm, with such a simple proof that let's just demonstrate it quickly. Since I want to manipulate −log A, I'll give an integral representation for it — just a definition of the log:

−log A = ∫₀^∞ du [ 1/(A + u) − 1/(1 + u) ].

This is a convenient representation. If I look at −log(A + εΔ) and expand, I get −log A, then the first-order term,

−ε ∫₀^∞ du 1/(A+u) Δ 1/(A+u),

and then the quadratic term,

+ε² ∫₀^∞ du 1/(A+u) Δ 1/(A+u) Δ 1/(A+u) + ⋯.

We care about this quadratic piece, because we care about the second derivative. And we see that the ε² piece of −log(A + εΔ) is just this last integral, which I can write as ∫ du D_u† D_u, where the matrix D_u is just

D_u = 1/√(A+u) Δ 1/(A+u).

A is assumed to have positive eigenvalues, of course; u is positive; everything is positive, so the square root makes sense. The quadratic term is manifestly a positive operator, and so −log x is operator convex.

Now we're going to see another elementary consequence of convexity, again very similar to the rebinning idea we talked about, but said in a slightly different way. Since we're going to be interested in having big Hilbert spaces and small Hilbert spaces and tracing things out, let's explore the consequences of operator convexity when you have big spaces and small spaces. Simple consequence: suppose f is operator convex, and suppose we have an operator on a Hilbert space that's a tensor product of two factors, of the block form

O = ( X  A ; A†  Y ).

Then clearly, as an operator statement — I'm not going to sandwich it for the moment —

f( X 0 ; 0 Y ) ≤ ½ f( X A ; A† Y ) + ½ f( X −A ; −A† Y ),

because the block-diagonal matrix is the average of the two matrices on the right. This is heading toward tracing something out: something where there's no communication between the two sectors. That's just a trivial consequence of convexity. Now let's take this statement and sandwich it between vectors of the form (v, 0). So this says

(v, 0)† f( X 0 ; 0 Y ) (v, 0) ≤ (v, 0)† f( X A ; A† Y ) (v, 0)

— strictly it's a half of one term plus a half of the other, but it's clear that I can get the second matrix from the first by conjugating with a unitary that's minus one on the lower part of the Hilbert space, so the two terms sandwiched in this state give exactly the same answer. Is that clear?

And this has a very nice interpretation. Let's say we have an operator u that maps from the small Hilbert space to the big one in the obvious way: it takes a vector v to uv = (v, 0), with v on top and zero underneath. Then I can clearly also have something that goes backwards, u†, from big to small, which takes (v, w) and just throws out the bottom component, leaving me with v. So clearly u†u = 1 — u is not a square matrix, it maps from one Hilbert space to another, but u†u = 1. And so we learn, already just from these simple facts about functions, something that looks a lot like a coarse-graining statement: for any ψ,

⟨ψ| f(u† O u) |ψ⟩ ≤ ⟨uψ| f(O) |uψ⟩.

That's what we've just learned. Is that clear? O is the operator that acts on the big space, and the statement is: if I compute f of the restriction u†Ou in the small space, that's less than or equal to what I get by first mapping the state into the big space and taking the expectation value of f(O) there. That's all we're going to need about operator convexity — just these very, very elementary facts.

All right, so now let's move on to the third part. Before launching into the proof, let me introduce some nice quantum notation. This is not strictly necessary, but I think you'll find this kind of notation useful. Of course it's very standard — people use it all over the place in tensor networks, in quantum information; we use it all over the place everywhere. I find it a more grown-up way, a better way, of dealing with quantum states than just writing bras and kets all the time; it makes it easier to see the relationships between objects, especially when you have Hilbert spaces that are products of many factors. So, part three: the quiver notation. Anyone who's played with quiver gauge theories or anything like that is very familiar with this notation. Instead of having a state |ψ⟩, think of it as a vector with a downstairs index on which U(N) acts — if I'm in an N-dimensional Hilbert space, these are just representations of U(N), and there's no reason to privilege them; we should talk about them the same way we talk about representations of anything else, with a nice tensor notation. So that's equivalently what states are, and I'm going to remind myself that a state carries a downstairs index by drawing it with an arrow coming in. From that point of view, a bra carries an upstairs index, and I'll represent that with an arrow going out. What are operators from this point of view? An operator O is something with one index coming out and one going in, so it can take a state and return another state: if I take it and multiply it by ψ, I get a new object with an arrow coming in, which is the state Oψ. This is all trivial. The density matrix is again just something like this — an object with one arrow in and one arrow out — and an expectation value of O would be Tr(ρO): we just contract the indices everywhere we can. When you have a Hilbert space which is a tensor product of a bunch of little Hilbert spaces, you just have separate arrows for the different pieces: a state in the tensor product of two Hilbert spaces has two arrows, and I can think of it as one giant composite arrow — one big arrow really made out of two little arrows.

So this is all completely trivial at first. Where the notation starts getting interesting is when you start swinging arrows around — which we can of course trivially do with the pictures — and interpreting the same object in slightly different ways. It's still pretty trivial what's going on, but already here you see: I can think of an operator as an operator, or I can swing one of its arrows around, and then I can think of the operator as a state — but a state in a Hilbert space that's N²-dimensional. This is exactly like double-line notation for gluons: an in arrow and an out arrow, thought of as one giant composite arrow. So I can think about operators as operators on an N-dimensional Hilbert space, or I can think of them as states in an N²-dimensional Hilbert space. The second point of view, where you think of them as states, is the one that's really important for our proof — and as I said, we saw it already very early on, when even the positivity argument was telling us to think about things in an N²-dimensional space and not the N-dimensional one. But let's just play with this notation a little bit more, and then I will tell you the precise moment where something interesting happens.
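This operator-as-state move is literally a reshape. A small numpy sketch of "swinging the arrow around" (the names and conventions are mine): an n×n operator becomes a vector in an n²-dimensional space, conjugation O → U O U† becomes one big operator U ⊗ U* on that vector, and tracing out a factor is just contracting a pair of arrows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
O = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# "swing the arrow around": an n x n operator becomes a state in an n^2 space
vec_O = O.reshape(n * n)

# a random unitary, from the QR decomposition of a complex Gaussian matrix
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# conjugation O -> U O U^dagger, viewed as ONE operator on the n^2 space:
# with numpy's row-major vec convention, that operator is U (x) U*
big_U = np.kron(U, U.conj())
assert np.allclose(big_U @ vec_O, (U @ O @ U.conj().T).reshape(n * n))

# tracing out one factor of rho_AB is just contracting the two B arrows
dA, dB = 2, 3
psi = rng.standard_normal(dA * dB) + 1j * rng.standard_normal(dA * dB)
rho_AB = np.outer(psi, psi.conj())
rho_AB /= np.trace(rho_AB).real
rho_A = np.einsum('ijkj->ik', rho_AB.reshape(dA, dB, dA, dB))
```

The kron identity is the "double-line" picture in index form: one factor of the big operator acts on the out arrow, the other on the in arrow.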
for now we're just screwing around and everything is trivial and just transcription nothing so for example what if I want to think about operators as states what is unitary evolution unitary evolution is some operate so some some operator big you but I want to think about this unitary evolution as an operator on this n-squared dimensional space but what is this operator this operator is just you and a you dagger in terms of the ordinary evolution operators guys UT UT dagger so if I take this and I and I acted on oh I will get you oh and so so this so this thing is this big operator UT what about other things let's say I have a row AB okay so it's a density matrix and a two-dimensional in a Hilbert space with two factors then how do I trace out B well I just do this then that's equal to row a okay and so on now actually before getting to the proof it'll be useful for us to play a little bit more and this is the one place where something a little interesting happens where the interesting things happen is is when with these pictures you move the arrows around so that you end up thinking about as an object sometimes as an operator and sometimes as a state if you always think of it as an operator always think of it as a state then it's totally stupid you can go back and forth between the two pictures that will what's interesting is that sometimes it's useful to think of it as an operator sometimes as a state and that's where these these pictures make that something very easy to do so let me give you an example of this so let's say that we have an operator B that lives in a big Hilbert space with two factors so the dotted line this so this is an operator and now let me write down something trivial of course this is equal to B let me now have an operator in the small Hilbert space S and I multiply by S inverse and then S okay so this is total triviality right but now we're gonna do something interesting now we're gonna take this arrow and we're gonna swing it around so 
that it points in the opposite direction okay so I'm doing absolutely nothing but let me now rewrite this identity right as saying that B now now I'm gonna think about B as a state so that's why I'm gonna put the arrows like this okay but B is equal to again and the line coming the other way so is it clear what I've done I've just swung that line right I also swung these things to make it look like a state and I swung this line from this side to make it look like it's going in the opposite direction okay but now this is kind of cool because this is telling us that we found a way to make a state in the big space in a natural way given a state in the small one okay so this is a state in the small Hilbert space and this sucker is an operator U that maps us from the small state into the big one so this is this we can define to be again let me just so yeah so this thing I will call you from S into B so I'm gonna need these I'm gonna come back to them so let me let me draw this one here so so there's this sucker you from S into B so it takes something from S it converts to something in B and this is just what I drew there it's also we can also take naturally a dagger of this U right how do we take a dagger of the U well just the definition of dagger right so what what we want we want to define so if I have in general if I have Y and U of X this is equal to this is defining what U dagger is is U dagger of Y with X so so if I put okay so if I just go back I just do this I imagine I put a big Y there I replace S by X then I can interpret exactly the same picture as X U dagger of Y right so the dagger is exactly the same picture but viewed in this direction so I have to flip the arrows all right is that clear okay so what is so U dagger then U dagger which is now doing the opposite so it takes something in the big space and converts it to something in the small S inverse the I'm just I'm just reading that picture backwards and flipping the orientation of the arrows and 
And finally, of course, it's natural to look at U†U. What is U†U? Computing it just means gluing those pictures together, and what comes out is S⁻¹, times the trace of B B† over the factor being traced out, times S⁻¹ again. This is not equal to the identity in general, but it is equal to the identity if we choose S appropriately: the object in the middle is the trace of B B† over the traced-out factor, so if I choose S to be that trace raised to the power plus one half, then U†U = 1. So in this way the picture gives us a nice, canonical way to map between operators in two different Hilbert spaces. Ultimately we interpreted them as states, and we can go back to thinking of them as operators, but as you see, in the intermediate steps it was sometimes useful to think of these objects as operators and sometimes as states; that's why there was content to these drawings and to this idea of thinking about states as operators. Any questions about this? Okay. Now that we've played this game (I have five minutes left; I might take a little more, but not much more) let's go through the proof. The key idea, the reason I went through all this, and really the deep aspect of the proof, is that the statement of monotonicity looks like a statement about operators in a Hilbert space, and the trick is to interpret it as a statement about expectation values in the n²-dimensional doubled Hilbert space: not to think of the density matrices as operators, but as states, in exactly the way we discussed. So let's finally go to the proof. We are going to think of the density matrices as states, and we are going to define an operator Δ that looks a lot like the unitary evolution operator we talked about before, which had U on one side and U⁻¹ on the other.
Except now we build it out of the matrices ρ and σ instead: it is ρ⁻¹ up here and σ down there. The idea is that ρ and σ are fixed; they're given, and we're interested in their relative entropy. So I construct this operator Δ on the n²-dimensional doubled space. A beautiful thing about it is that log Δ is just −log ρ plus log σ: the two sides don't talk to each other, so the operator corresponding to log Δ is log σ on one side (with the identity on the other) and −log ρ on the other side (with the identity). And here is the gorgeous point: the relative entropy between ρ and σ can be thought of as the expectation value, in the state √ρ, of the operator −log Δ, that is, S(ρ‖σ) = ⟨√ρ| (−log Δ) |√ρ⟩. That's exactly how it breaks up; that's what the relative entropy is. So the relative entropy is an expectation value ⟨ψ| f |ψ⟩ of a convex function f, and we've already seen, from our discussion of convexity, that for states there's a natural notion of coarse-graining (that statement I erased over there). Let me now say what the logic is; then there's one tiny computation to do with the diagrams, and we'll be done. I should say, this is where all the non-triviality and complexity in the original proofs came from: they had to deal not with convex functions of one operator but with joint convexity in two operators, which is a lot more complicated. The whole idea is that by going to the doubled Hilbert space you reduce everything to simple statements about convexity of single operators and expectation values. So what we want to do is use this fact.
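This rewriting of relative entropy as a doubled-space expectation value can be verified numerically. A sketch (helper names mine; with row-major flattening, X ↦ σ X ρ⁻¹ becomes the matrix σ ⊗ (ρ⁻¹)ᵀ, so −log Δ = (−log σ) ⊗ 1 + 1 ⊗ (log ρ)ᵀ):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def density(dim, rng):
    """A random full-rank density matrix."""
    M = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    M = M @ M.conj().T + np.eye(dim)       # regularize so log(rho) is safe
    return M / np.trace(M).real

def fun(M, f):
    """Apply f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.conj().T

rho, sigma = density(n, rng), density(n, rng)

# textbook definition: S(rho||sigma) = tr rho (log rho - log sigma)
S_rel = np.trace(rho @ (fun(rho, np.log) - fun(sigma, np.log))).real

# doubled-space version: expectation of -log Delta in the state |rho^{1/2}>
minus_log_Delta = (np.kron(-fun(sigma, np.log), np.eye(n))
                   + np.kron(np.eye(n), fun(rho, np.log).T))
psi = fun(rho, np.sqrt).reshape(-1)

S_doubled = (psi.conj() @ minus_log_Delta @ psi).real
assert np.isclose(S_rel, S_doubled)
assert S_rel >= 0                          # positivity comes along for free
```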
So, if we have a map U from a small Hilbert space into a big one, the statement we had before (a general fact, because the function −log is operator convex) is that ⟨ψ| (−log X_U) |ψ⟩ ≤ ⟨ψ_U| (−log X) |ψ_U⟩, where X_U ≡ U†XU and ψ_U ≡ Uψ. In the notation I've introduced, that's the old statement that we proved. So what do we have to do to prove monotonicity? I want the big Hilbert space to be AB, and I'm going to trace over B, so the small Hilbert space is A. Because I'm trying to make a statement about monotonicity of relative entropy, I want X to be the operator Δ_AB in the big Hilbert space, and I want the state ψ to correspond to ρ_A^{1/2}. If, with those choices, we can find a map U from the small Hilbert space to the big one such that ψ_U is ρ_AB^{1/2}, and such that X_U equals Δ_A, then we're done. Nothing nontrivial is going on here: having transcribed relative entropy into this expectation-value statement, I'm just saying what we need to show. Now, in fact, everything is fixed for us, because we know what this U has to be: we just figured out, in general, how to canonically map states from small Hilbert spaces into big ones. So we know exactly the U for which ψ_U = ρ_AB^{1/2}, and we've checked that U†U = 1. All that remains is to show that X_U = Δ_A.
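The convexity input here, the operator Jensen inequality for the operator convex function −log, can be spot-checked numerically: for an isometry W and a positive operator X, the Hermitian matrix W†(−log X)W − (−log(W†XW)) should be positive semidefinite. A sketch with random choices (names mine):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 6, 3

def fun(M, f):
    """Apply f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.conj().T

# a random isometry W: C^k -> C^n (first k columns of a random unitary)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
W = Q[:, :k]
assert np.allclose(W.conj().T @ W, np.eye(k))

# a random positive operator X on the big space
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = X @ X.conj().T + np.eye(n)

# operator Jensen:  -log(W† X W)  <=  W† (-log X) W  as Hermitian operators
lhs = -fun(W.conj().T @ X @ W, np.log)
rhs = W.conj().T @ (-fun(X, np.log)) @ W
assert np.linalg.eigvalsh(rhs - lhs).min() > -1e-9   # difference is PSD
```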
That is, we have to look at U†Δ_AB U and show that it equals Δ_A; then we are done. So let's just do it. Let me remind you what U is (U maps the small Hilbert space into the big one), and let me make one small comment, since I'm drawing the notation very slightly differently: remember that before we had an S there, and we said we have to choose S to be the square root of the trace, over the B Hilbert space, of the thing squared. Here the thing is ρ_AB^{1/2}: when we square it we get ρ_AB, and when we trace over B we get ρ_A, so we take the inverse of its square root; that's how the ρ_A^{-1/2} appears there. For U†, I'm choosing the ordering of the dotted and dashed lines just to make the resulting diagram slightly cleaner in a moment; of course I could move them around. And finally, what is Δ? Δ now lives on the big space, so I'll write it with both sets of indices: it is ρ_AB⁻¹ with the arrows one way and σ_AB with the arrows the other way. Again, it is not symmetric between ρ and σ, because of the direction of the arrows and because of the inverse. All right, so now: U† times Δ times U. Let's draw the final big picture. Here is U† one more time, the whole line down to the bottom, and now I put the Δ in the middle. The arrows coming out of U† hook up with the ρ_AB⁻¹, and these other lines go into the σ_AB. On the U side I have ρ_AB^{1/2} going in, then ρ_A^{-1/2}, then the identity. That's the computation of U†ΔU, and now everything collapses, because the chain ρ_AB^{1/2} · ρ_AB⁻¹ · ρ_AB^{1/2} is just the identity. So from the solid line on top I get a ρ_A^{-1/2} and another ρ_A^{-1/2}, which is ρ_A⁻¹ on that side. And on the other side, the lines going in and out of σ_AB are now contracted with that identity, so they just go around in a loop, and that, of course, is nothing other than the definition of the reduced density matrix of σ in the space A. So U†Δ_AB U has σ_A with the arrows one way and ρ_A⁻¹ with the arrows the other: it is Δ_A, exactly as needed. The left-hand side of the convexity inequality is then S(ρ_A‖σ_A) and the right-hand side is S(ρ_AB‖σ_AB), which is precisely monotonicity. So that's it; that's the proof.
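The whole argument can be checked end to end in a few lines. A sketch (all conventions mine: row-major flattening, Δ is the matrix of X ↦ σ X ρ⁻¹, and U is built column by column as X ↦ (X ρ_A^{-1/2} ⊗ 1_B) ρ_AB^{1/2}):

```python
import numpy as np

rng = np.random.default_rng(4)
dA, dB = 2, 2
d = dA * dB

def density(dim, rng):
    """A random full-rank density matrix."""
    M = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    M = M @ M.conj().T + np.eye(dim)
    return M / np.trace(M).real

def fun(M, f):
    """Apply f to a Hermitian matrix through its eigenvalues."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.conj().T

def ptrace_B(M):
    return np.trace(M.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

def Delta(rho, sigma):
    """Matrix of X -> sigma X rho^{-1} on row-major-flattened operators."""
    return np.kron(sigma, fun(rho, lambda w: 1.0 / w).T)

rho_AB, sigma_AB = density(d, rng), density(d, rng)
rho_A, sigma_A = ptrace_B(rho_AB), ptrace_B(sigma_AB)

# the canonical isometry, with U |rho_A^{1/2}> = |rho_AB^{1/2}>
inv_sqrt_rho_A = fun(rho_A, lambda w: w**-0.5)
sqrt_rho_AB = fun(rho_AB, np.sqrt)
cols = []
for i in range(dA):
    for j in range(dA):
        E = np.zeros((dA, dA)); E[i, j] = 1.0
        cols.append((np.kron(E @ inv_sqrt_rho_A, np.eye(dB)) @ sqrt_rho_AB).reshape(-1))
U = np.array(cols).T

# the facts that make the proof work
assert np.allclose(U @ fun(rho_A, np.sqrt).reshape(-1), sqrt_rho_AB.reshape(-1))
assert np.allclose(U.conj().T @ U, np.eye(dA * dA))              # U†U = 1
assert np.allclose(U.conj().T @ Delta(rho_AB, sigma_AB) @ U,
                   Delta(rho_A, sigma_A))                        # U†Δ_AB U = Δ_A

# ... and the conclusion they imply: monotonicity under tracing out B
def S_rel(rho, sigma):
    return np.trace(rho @ (fun(rho, np.log) - fun(sigma, np.log))).real

assert S_rel(rho_A, sigma_A) <= S_rel(rho_AB, sigma_AB) + 1e-12
```

The two middle asserts are exactly the two facts the proof needs; the last line is the monotonicity they imply.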
Something I think would be interesting to think about, perhaps in holographic and other contexts, is the following. As you'll hear from Hiroshi, I'm sure, there are many, many more inequalities satisfied by these density matrices than simply monotonicity, than simply strong subadditivity. We saw that monotonicity is a sort of deep fact that lies underneath all the statements that get used all the time. But even classically, people discovered in the late 1990s (I'm sure Hiroshi will talk about this) the following: if someone hands you the entropies of all possible subsystems, how can you check whether they could, in principle, come from some underlying probability distribution? Of course they have to satisfy things like strong subadditivity, but is that it? How can you intrinsically characterize the space of all possible entropies of subsystems? People discovered in the late 1990s that there is much, much more than strong subadditivity: in fact, as you go to larger and larger systems, there is an infinite number of inequalities. In a sense, the space of allowed entropies of subsystems is probably not polyhedral; it cannot be described by giving a finite number of linear inequalities, and is instead an interesting curved space. A lot of progress was made on this subject, classically, in the late 1990s, surprisingly as late as that. The quantum version of these statements is nearly complete virgin territory; very little is known, and people are starting to learn more and more about it. So, given that when geometries emerge (Hiroshi will talk about it) much more is true than just subadditivity of entropy, with many, many more relations satisfied, I think better and more intrinsic understandings of where these facts come from might tell us more.
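As a small playground for such inequalities, here is a sketch (helper names mine) that draws random three-party density matrices and checks strong subadditivity, S_AB + S_BC ≥ S_ABC + S_B, on every draw:

```python
import numpy as np

rng = np.random.default_rng(5)
dims = [2, 2, 2]          # qubits A, B, C
d = 8

def entropy(rho):
    """von Neumann entropy -tr rho log rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

def ptrace(rho, dims, keep):
    """Reduced density matrix on the tensor factors listed in `keep`."""
    dims = list(dims)
    rho = rho.reshape(dims + dims)
    for i in sorted((j for j in range(len(dims)) if j not in keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + len(dims))
        del dims[i]
    dk = int(np.prod(dims))
    return rho.reshape(dk, dk)

for _ in range(200):
    M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = M @ M.conj().T
    rho /= np.trace(rho).real
    ssa = (entropy(ptrace(rho, dims, {0, 1})) + entropy(ptrace(rho, dims, {1, 2}))
           - entropy(rho) - entropy(ptrace(rho, dims, {1})))
    assert ssa >= -1e-10          # strong subadditivity holds on every draw
```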
So far I haven't done anything particularly new here; this is just pedagogically interesting, and I hope you got something out of it, but as far as research goes it doesn't yet lead, as far as I can tell, to new things that are true. And you should always be suspicious, in physics, of just proving old things. You can do a lot of work pushing things around; it's very elegant and fun, but there's nothing better than a new formula that has not been written down before. That's vastly more important than pushing things around and understanding them better. So what would be fantastic is if this kind of thinking could lead to new formulas and a new understanding of what the actual space of allowed entropies looks like. Nonetheless, it's such a beautiful subject (a lot of my second lecture, if I had been giving one, was going to be about the convex and polyhedral geometry associated with Hilbert spaces in this way), and I encourage you to read some of this literature and enjoy playing with it. All right, thanks a lot.

Moderator: Okay, thanks, Nima, for your talk. Now there are two issues: on the one hand it would be nice to let people ask questions, and on the other hand we cannot make Nima miss his flight. So the compromise is one question. Okay, two questions; three questions. Questions or comments?

Q: You used −log Δ as one particular case of a convex function. Could you have chosen other functions and proved other results? And how can you extract information from them, say about modular Hamiltonians and things like that?

Q: Do you think there would be holographic analogs for each of the diagrams, or each of the steps, that you drew?

A: Yeah, I don't know of any. Part of the problem is that, as far as I know, there still aren't very good holographic understandings of what relative entropy is. Relative entropy gets used; the simple fact of the positivity of relative entropy is used, for example, by Casini in the proof of the first law. But as far as I know there isn't a simple holographic understanding of what relative entropy is, which is too bad. Having played with it, and thought about it not terribly deeply, I love relative entropy; it seems like a more fundamental notion to me than entropy itself, so it would be nice if it had a holographic interpretation. A bit more generally: we know there's a famous picture in which geodesics give a nice geometric understanding of strong subadditivity. Part of the dream, which I'm sure Hiroshi will talk about, is that the emergence of geometry would somehow be correlated with something geometric already sitting there in the structure of the patterns of entanglement.

Q: Relative entropy is related to expectation values of the modular Hamiltonian. Does your proof teach us something about the modular Hamiltonian?

A: That's what I was saying: the positivity of relative entropy certainly makes an appearance in Casini's proof, and I can't remember off the top of my head whether monotonicity does, but it must have made an appearance somewhere. Indeed, we can think of relative entropy as giving us the entropy relative to the vacuum state, and then it's very much related to expectation values of the modular Hamiltonian.

A (to the earlier question): Ah, good question. I think there are other functions, powers of operators, that are also operator convex and could probably be used, perhaps for Rényi-type statements. I haven't seen anything along those lines, but I think it's true; I'd have to double-check, and it's not a completely trivial result, so let me not say something wrong. But I do think there is an operator-convexity statement involving powers (I don't remember exactly what it is), and that should be useful for the Rényis.

Moderator: I think we have to stop here now, so let's thank him again.
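Following up that last answer: the operator convexity of powers is easy to probe numerically. It is a standard fact that x^p is operator convex on the positive operators for 1 ≤ p ≤ 2 (and for −1 ≤ p ≤ 0) but not for larger powers; the pair of matrices below (my own toy example, not from the lecture) already shows midpoint operator convexity failing for x³ while holding for x²:

```python
import numpy as np

def midpoint_gap(p, A, B):
    """(f(A)+f(B))/2 - f((A+B)/2) for f(x) = x**p, via eigen-calculus."""
    def f(M):
        w, V = np.linalg.eigh(M)
        return (V * w**p) @ V.conj().T
    return (f(A) + f(B)) / 2 - f((A + B) / 2)

A = np.diag([13.0, 0.0])                   # both A and B are positive semidefinite
B = np.array([[1.0, 1.0], [1.0, 1.0]])

# x^2 is operator convex: the gap is ((A-B)/2)^2, always PSD
assert np.linalg.eigvalsh(midpoint_gap(2, A, B)).min() >= -1e-12

# x^3 is NOT operator convex: for this pair the gap has a negative eigenvalue
assert np.linalg.eigvalsh(midpoint_gap(3, A, B)).min() < 0
```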