So: given a smooth, closed initial hypersurface $F_0: M^n \to N^{n+1}$ in a smooth Riemannian manifold, there is a maximal time interval $[0, T_{\max})$, where $T_{\max}$ is certainly positive and may or may not be infinite, on which a smooth solution exists. And if $T_{\max}$ happens to be finite, then we can detect this by seeing that the curvature blows up: the sup of the second fundamental form, $\sup_{M_t} |A|$, is unbounded as $t \to T_{\max}$. So this is the smoothing property that Toti Daskalopoulos already mentioned, and it is a parabolic property of this equation: it cannot happen that the first 27 derivatives of the curvature remain bounded and the 28th derivative blows up. This cannot happen. If the curvature is bounded, then all the higher derivatives are bounded, the surface remains smooth, and then we can extend the flow further.

[Question from the audience about embeddedness.] Yes, both are possible; no, that is a different proof. You can prove that something that is embedded remains embedded, okay? But for this existence proof the surfaces can be immersed, and as part of the proof I show that an immersion stays an immersion, okay?

Sketch of the proof. The key thing here is to assume that the curvature is bounded and then show that you can extend the solution beyond $T_{\max}$. So suppose

$$\sup_{M_t} |A|^2 \le d_0^2 < \infty \quad \text{for all } t < T_{\max},$$

and then we want to bound the higher derivatives. So we need the evolution equations of the higher derivatives, which we compute from the evolution equation of the second fundamental form; remember this was

$$\partial_t h_{ij} = \Delta h_{ij} - 2H\, h_{il} h^l_j + |A|^2\, h_{ij}.$$

And I will only do this in $\mathbb{R}^{n+1}$, because in a fixed Riemannian ambient manifold the extra terms are of lower order anyway, right? So it is enough to understand how this works in the Euclidean case, and it would be too complicated to do every step for all the indices. So in order to get a reasonable notation, we write this as

$$\partial_t A = \Delta A + A * A * A,$$

where the terms on the right are just combined: the star notation means linear combinations of cubic terms in the second fundamental form, possibly with contractions with the metric.

If we use this notation, then you see that $\partial_t \nabla A$ can be written as a term involving $\nabla(\partial_t A)$ plus, from the covariant derivative, some terms with $\partial_t$ of the Christoffel symbols. So we need to know what $\partial_t \Gamma$ is. But the Christoffel symbols are, as you know, of the form $g^{kl}\, \partial g$; and since $\partial_t g_{ij} = -2H h_{ij}$, which can be written as $A * A$, differentiating gives one derivative of $A$ times $A$, so $\partial_t \Gamma$ can be written as $\nabla A * A$. This gives us

$$\partial_t \nabla A = \nabla \Delta A + A * A * \nabla A = \Delta \nabla A + A * A * \nabla A,$$

where switching the derivatives in the first term gives me an extra term which again looks like $A * A * \nabla A$, but all the other terms are the same, right? Whether I take the gradient here or there, I always get one gradient and two factors which are not gradients.
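Where the extra term from switching the derivatives comes from, spelled out once (a sketch; the interchange formula for tensors and the Gauss equation used here are the standard ones):

$$\Delta \nabla A - \nabla \Delta A = \mathrm{Rm} * \nabla A + \nabla \mathrm{Rm} * A, \qquad \mathrm{Rm} = A * A \quad \text{(Gauss equation in } \mathbb{R}^{n+1}\text{)},$$

so the commutator consists of terms of the form $A * A * \nabla A$ and $\nabla A * A * A$, exactly the order we already have on the right-hand side.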
And this is all I need; I only need the order of these terms. Because if I compute what I am really interested in, $\partial_t |\nabla A|^2$, I just multiply by $\nabla A$ and get

$$\partial_t |\nabla A|^2 = 2\, \langle \nabla A,\ \Delta \nabla A \rangle + \nabla A * \nabla A * A * A = \Delta |\nabla A|^2 - 2\, |\nabla^2 A|^2 + \nabla A * \nabla A * A * A,$$

where the first term can be written as the Laplacian of $|\nabla A|^2$ minus the so-called Bochner term, the square of the second derivatives. I think it is very useful to know this notation, because it allows you to get the structure of the equations straight without worrying about individual indices; it was invented by Richard Hamilton.

So this means, if I apply the Schwarz inequality, that I get

$$\partial_t |\nabla A|^2 \le \Delta |\nabla A|^2 - 2\, |\nabla^2 A|^2 + c_1\, |A|^2\, |\nabla A|^2,$$

with a constant $c_1$ depending only on the dimension. Now that is the structure of the evolution of the first derivative, and then you just do induction and compute that any higher derivative satisfies

$$\partial_t |\nabla^m A|^2 \le \Delta |\nabla^m A|^2 - 2\, |\nabla^{m+1} A|^2 + c_m(n) \sum_{i+j+k=m} |\nabla^i A|\, |\nabla^j A|\, |\nabla^k A|\, |\nabla^m A| .$$

You always get the Bochner term with the next highest derivative, and then a whole bunch of terms with four factors each, just like in the first step: you always get one factor $|\nabla^m A|$, because you have a square, and then lower derivatives $\nabla^i A$, $\nabla^j A$, $\nabla^k A$ whose orders sum to $m$. In this way you have the structure of the evolution of all the higher derivatives under control.

And now we are going to bound these higher derivatives under the assumption that our curvature is bounded. You see, if the curvature is not bounded, then the term $|A|^2 |\nabla A|^2$ is a disaster; but if it is bounded, this term is just linear in $|\nabla A|^2$, so our chance is to estimate this term. Now, if you do it brutally, this would already be enough: a bounded coefficient would sort of give you exponential growth of $|\nabla A|^2$ in time. But I want to show you a trick, and it is important, that lets you avoid this dependence on time altogether. Recall that $\partial_t A = \Delta A + A * A * A$; this means in particular that

$$\partial_t |A|^2 = \Delta |A|^2 - 2\, |\nabla A|^2 + 2\, |A|^4 .$$

Of course, if you put $m = 0$ it conforms with the formula above. But you see: suppose we want to estimate $|\nabla A|^2$, and the bad term is a bounded factor times $|\nabla A|^2$; then we can swallow this term if we add enough of the good term $-2|\nabla A|^2$ from the evolution of $|A|^2$. Okay, that is an important trick. So consider the function

$$f = |\nabla A|^2 + \alpha_1\, |A|^2,$$

which we want to bound, adding enough $\alpha_1 |A|^2$ in order to overcome the bad term, and let us see how this works out. Computing $\partial_t f$, we get the Laplacian everywhere, so $\partial_t f \le \Delta f$ plus the remaining terms; and the Bochner term $-2|\nabla^2 A|^2$ I throw away, I do not need it right now. But okay, let us write it down and just not use it: I only need it in the next step, when I go for the second derivatives, because then it is going to be my good term.
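The absorption mechanism in isolation, as a schematic model computation (the constants here are placeholders, not the sharp ones): if $\varphi = |\nabla A|^2$ satisfies $\partial_t \varphi \le \Delta \varphi + c_1 d_0^2\, \varphi$ and $\psi = |A|^2$ satisfies $\partial_t \psi \le \Delta \psi - 2\varphi + 2 d_0^4$, then $f = \varphi + \alpha\, \psi$ satisfies

$$\partial_t f \le \Delta f + \big( c_1 d_0^2 - 2\alpha \big)\, \varphi + 2\alpha\, d_0^4,$$

so any choice $\alpha \ge c_1 d_0^2$ makes the coefficient of $\varphi$ negative and leaves only a bounded inhomogeneity.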
For the bound of $f$ I do not need it. Then estimate the bad term by $c_1(n)\, d_0^2\, |\nabla A|^2$, by our assumption; from the $\alpha_1 |A|^2$ part I get $-2\alpha_1 |\nabla A|^2$, and I pay for this with the term $+2\alpha_1 |A|^4$, which I can estimate by $2\alpha_1 d_0^4$ because the curvature is bounded. So

$$\partial_t f \le \Delta f + c_1 d_0^2\, |\nabla A|^2 - 2\alpha_1\, |\nabla A|^2 + 2\alpha_1\, d_0^4 .$$

And now you can see: choose $\alpha_1 = 2 c_1 d_0^2$, say, maybe twice that coefficient. Consider the first instance, some point $p_0$ and some time $t_0$, where $f$ reaches a new maximum, say $K$, some big number. Then at this point and at that instant we must have $\partial_t f(p_0, t_0) \ge 0$, because it is a new maximum in time; on the other hand, because in space it is a maximum, we must have $\Delta f(p_0, t_0) \le 0$. So if this is true, then at $(p_0, t_0)$ we get

$$0 \le -3 c_1 d_0^2\, |\nabla A|^2 + 2 \alpha_1\, d_0^4 .$$

I put the bad term on the other side, so I get $3 c_1 d_0^2 |\nabla A|^2 \le 2 \alpha_1 d_0^4$. But $|\nabla A|^2 = f - \alpha_1 |A|^2$, and at this point $f = K$ while $\alpha_1 |A|^2$ can be bounded from above by $\alpha_1 d_0^2 = 2 c_1 d_0^4$ (yeah, I am too fast, but that is the point), so

$$3 c_1 d_0^2 \left( K - 2 c_1 d_0^4 \right) \le 2 \alpha_1 d_0^4 = 4 c_1 d_0^6,$$

and you see this is a contradiction if $K$ is bigger than something like a multiple of $d_0^4$, with a constant depending on $n$ and $c_1$. Right, if $K$ is too big this is a contradiction, so $f$ can never achieve a maximum bigger than a constant times $d_0^4$. And since $|\nabla A|^2 \le f$, this concludes that

$$\sup_{M_t} |\nabla A|^2 \le C\, d_0^4 \quad \text{for all } t \in [0, T_{\max}),$$

and this is also the correct scale. So in this way you get, out of the curvature estimate, a gradient estimate which is independent of time: you do not have the exponential growth, because you have the good term on the right-hand side that you get from the term of lower order. And I will iterate this procedure: in the same sense, with induction, you get

$$\sup_{M_t} |\nabla^m A|^2 \le C_m\, d_0^{2 + 2m},$$

so we get all the higher derivatives bounded.

Now, on the other hand, we need to know that it remains an immersion; this question came up already. So we need one more step, in order to show that things converge to a nice limit as $t \to T_{\max}$ so that we can use our short-time existence result: we need to show that the thing remains an immersion. And there is a nice lemma, I guess it was known before, but Richard Hamilton wrote it down, and it is purely about matrices.

Lemma. If a positive definite, time-dependent matrix $g_{ij}(t)$ (you think of the metric here) satisfies

$$\int_0^T \Big| \frac{d}{dt} g_{ij}(\tau) \Big|\, d\tau \le C < \infty,$$

then the matrices $g_{ij}(t)$ are uniformly equivalent on $[0, T)$, and $g_{ij}(t)$ converges to a positive definite matrix as $t \to T$; and $T$ could even be infinite. (There is an absolute value in the integrand which was missing on the board.)
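Before the proof, here is the mechanism in the scalar case, as a toy example; the proof below runs the same computation for $|x|^2_{g(t)}$. If $g(t) > 0$ and $\int_0^T |g'(\tau)| / g(\tau)\, d\tau \le C$, then

$$\Big| \log \frac{g(t_2)}{g(t_1)} \Big| = \Big| \int_{t_1}^{t_2} \frac{g'(\tau)}{g(\tau)}\, d\tau \Big| \le C,$$

so $e^{-C} g(0) \le g(t) \le e^{C} g(0)$ for all $t$, and $\log g(t)$ is Cauchy as $t \to T$; hence $g(t)$ converges to a positive limit.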
And the proof is simple. You just take any vector $x$, fixed; in our case it would be in the tangent space of the surface, independent of time. And you compute

$$\frac{d}{dt} \log |x|^2_{g(t)} = \frac{1}{|x|^2_{g(t)}}\, \frac{d}{dt} g_{ij}\; x^i x^j,$$

because the $x$ is independent of $t$. And this can be estimated in absolute value by $|\frac{d}{dt} g_{ij}|$, measured with respect to the metric at time $t$, by the Schwarz inequality; the norm is evaluated at time $t$ because that is what I need when I apply the Schwarz inequality. Then you just integrate, and you get that the logarithm of the ratio is bounded:

$$\Big| \log \frac{|x|^2_{g(t_2)}}{|x|^2_{g(t_1)}} \Big| \le \int_{t_1}^{t_2} \Big| \frac{d}{dt} g_{ij}(\tau) \Big|\, d\tau \le C .$$

(And I can erase the positive definiteness of the limit from the hypotheses: it is not an assumption, it is a conclusion. I start with something positive definite, and I want to prove that as long as this integral estimate is true, the metrics remain equivalent.) This tells you that the norm at any time $t$ is bounded above by $e^{C}$ times the norm, say, at time zero, and similarly bounded below by $e^{-C}$ times the norm at time zero; and this is what I mean by uniformly equivalent norms. So this proves the first statement.

And then notice that from this it also follows that $|x|^2_{g(t)}$ is Cauchy as $t \to T$, because you have this integral bound on the derivative, and therefore these guys converge: $|x|^2_{g(t)}$ converges for all $x$. Then by polarization, $\langle x, y \rangle_g = \frac{1}{4}\big( |x + y|^2_g - |x - y|^2_g \big)$, if the squares converge you also get the inner products: $\langle x, y \rangle_{g(t)}$ converges. And this limiting metric is of course also uniformly equivalent, with the same bound that we had for all the $|x|^2_{g(t)}$. Therefore we get a limit metric as $t \to T$, and the surfaces remain immersed.

Notice that this is of course the key difference to, say, the harmonic map heat flow. The harmonic map heat flow is, on the surface, a nicer equation, it is more linear; but even if your initial map is an immersion, under the harmonic map heat flow you do not have such a theorem. Yeah, it does not work; in fact it is wrong. There are examples where you start with a hypersurface immersion as a map, and then the harmonic map heat flow just mixes up the map completely: it does not remain a hypersurface, a diffeomorphism does not remain a diffeomorphism. And that is why people moved from the harmonic map heat flow, which was pioneered by Eells and Sampson, towards things like mean curvature flow and Ricci flow, which do have these properties; and that they have this property depends on this lemma.

Okay, so now we can wrap up our proof. In our case,

$$\Big| \frac{d}{dt} g_{ij} \Big|^2 = \big| {-2H h_{ij}} \big|^2 = 4 H^2 |A|^2 \le 4n\, |A|^4 \le 4n\, d_0^4 .$$

Oh, and so: if $T_{\max} < \infty$, then

$$\int_0^{T_{\max}} \Big| \frac{d}{dt} g_{ij} \Big|\, dt \le 2 \sqrt{n}\, d_0^2\, T_{\max} < \infty .$$
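A toy computation shows why the finiteness of the time interval, or else a decay rate better than the natural scaling, is essential for the hypothesis of the lemma:

$$\int_1^{T} \frac{c}{\tau}\, d\tau = c \log T \longrightarrow \infty \quad (T \to \infty), \qquad \int_1^{\infty} \frac{c}{\tau^{1+\varepsilon}}\, d\tau = \frac{c}{\varepsilon} < \infty .$$

A bound $|\frac{d}{dt} g| \le c/t$, although often exactly right in terms of scaling, is not integrable up to $T = \infty$, while anything slightly better is.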
So it is important here that we use our hypothesis that $T_{\max} < \infty$. And you already see where there are possible pitfalls: I have seen papers that were wrong because of this, where people claimed that they have convergence as $t \to \infty$, but in reality their estimate for $\frac{d}{dt} g_{ij}$ was just decaying like $1/t$. So, quite often, in terms of the scaling, $1/t$ would be just right, okay, but $1/t$ is not good enough to give you the assumptions of the lemma. So to get convergence, say even in the case of $t$ tending to infinity, you have to be extremely careful that you get an estimate which is a little bit better than what you expect from the scaling.

Okay, so our lemma applies; therefore the surfaces remain immersions, and the speed is now bounded: the mean curvature is bounded because the curvature is bounded. So as $t \to T_{\max}$ we get that $M_t$ converges to $M_{T_{\max}}$, a smooth limiting hypersurface immersion. And then you just use short-time existence on this one: this gives you a solution which lives a little bit longer, and because you have so much smoothness, this new solution fits together with the old solution, so you get a solution of mean curvature flow on $[0, T_{\max} + \varepsilon)$. But this is a contradiction, because $T_{\max}$ was the maximal time where we had a smooth solution, and then we are done. So it cannot be that the curvature is uniformly bounded up to the maximal time, because then we could get a smooth limit surface and use short-time existence to go a little bit beyond. This is the standard procedure for many nonlinear geometric evolution equations.

Okay, so now we know how to detect singularities. Carlo showed you yesterday that there are many different singularities that we can get. Now, to understand the singularities better, one needs a rescaling procedure, which Carlo mentioned already yesterday, and you will need a more precise control on the smoothing behavior of the hypersurface. So before I come to the monotonicity formula, I want to show you, as long as this is on the board, how you can improve the estimates that I just did on the gradient a little bit. You see, the estimate on the gradient was somehow assuming something about the initial gradient, because I only said that a new maximum cannot occur. It would be nice to have an estimate which only depends on the curvature bound and does not assume anything about the initial smoothness of the hypersurface.

So notice: the gradient estimate can be improved, and this is a very useful trick as well. Namely, we choose a slightly different $f$: we smuggle in a factor of $t$ in front of the gradient term,

$$f = t\, |\nabla A|^2 + \alpha_1\, |A|^2 .$$

So it did not change much, I just put in the $t$; but this actually makes more sense, because the $\alpha_1$ should really be a constant which does not depend on scaling itself. Remember, we chose $\alpha_1$ comparable to $d_0^2$ in order to get the scaling right; now the $t$ takes care of two factors of scaling, so I can choose $\alpha_1$ to be independent of scaling. In fact, we could choose $\alpha_1$ to be one or two this time.
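A quick dimensional check of this, schematically: under parabolic rescaling $x \mapsto \lambda x$, $t \mapsto \lambda^2 t$, the curvature quantities scale as

$$|A|^2 \mapsto \lambda^{-2}\, |A|^2, \qquad |\nabla A|^2 \mapsto \lambda^{-4}\, |\nabla A|^2, \qquad t\, |\nabla A|^2 \mapsto (\lambda^2 t)\, \lambda^{-4}\, |\nabla A|^2 = \lambda^{-2}\; t\, |\nabla A|^2,$$

so $t\, |\nabla A|^2$ scales exactly like $|A|^2$, and $\alpha_1$ can indeed be taken to be a pure number.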
And then, when you do the computation, you get (hang on, what did I do; ah okay, that was my mistake, I forgot the factor $t$; here is the $t$)

$$\partial_t f \le \Delta f + c_1 d_0^2\; t\, |\nabla A|^2 - 2 \alpha_1\, |\nabla A|^2 + |\nabla A|^2 + 2 \alpha_1\, d_0^4 .$$

I throw away the second derivatives as before; the bad term, which before was $c_1 d_0^2 |\nabla A|^2$, now carries the factor $t$; the term $-2\alpha_1 |\nabla A|^2$ has no $t$, and with $\alpha_1 = 2$ that is $-4 |\nabla A|^2$; I get one extra term $+|\nabla A|^2$ from differentiating the $t$; and the other term is the $2 \alpha_1 |A|^4 \le 4\, d_0^4$. So the only damage I have done is this last term. But now you see: if the time interval is not too long, say

$$t \in \big( 0,\; c_1^{-1}\, d_0^{-2} \big]$$

(one is enough; and notice that the scaling of this interval is the inverse of curvature squared, as it should be), then I get a contradiction at a new maximum if $|\nabla A|^2$ is bigger than a certain constant depending on $n$ times $d_0^2 / t$. In other words, we now get an estimate which is completely independent of the initial data for the gradient of $A$; it just depends on the curvature bound. This is called an interior-in-time gradient estimate.

Theorem. Suppose $M_t$ is a solution of mean curvature flow as above (closed manifold, smooth ambient space, no boundary), and suppose $\sup_{M_t} |A|^2 \le d_0^2$ uniformly. Then there is a $\delta_1 > 0$ such that on the time interval $(0, \delta_1\, d_0^{-2}]$ we have

$$\sup_{M_t} |\nabla A|^2 \le \frac{c_1(n)}{t}\, d_0^2,$$

and then by induction you get

$$\sup_{M_t} |\nabla^m A|^2 \le \frac{c_m(n)}{t^m}\, d_0^2 \quad \text{on some time interval } (0, \delta_m\, d_0^{-2}] .$$

This is what I call an interior-in-time derivative estimate, and these are particularly nice because they scale right in terms of time, and it is exactly the correct power in terms of the curvature and the time involved.

Now, this was rather easy; you see, it is just a sort of maximum-principle type argument. You can do this also in space, instead of doing an interior estimate only in time. By the way, this could start at any time: you do not have to start at time zero, you could also start at some time $t_0$. But if you want to do it interior in space, then you have to use a cutoff function, and that is just a little bit too messy to present in this class, because it would take too much time to write it all on the board. So let me just state the result as a remark.

Remark. One can also prove interior-in-space-and-time estimates. For example, if

$$\sup_{M_t \cap B_{2R}(x_0)} |A|^2 \le d_0^2 \quad \text{on some interval } [0, T],$$

for some large ball $B_{2R}(x_0)$ around a point $x_0$, then you can get the gradient, or even the higher derivatives, bounded over the surface on the smaller ball of just radius $R$ around $x_0$:

$$\sup_{M_t \cap B_{R}(x_0)} |\nabla^m A|^2 \le C(n, m) \left( \frac{1}{t} + \frac{1}{R^2} \right)^{m} d_0^2 .$$

So very similar; but the penalty you have to pay, compared to the previous estimate, is of course that instead of just having a bad term like $1/t$, you also get a bad term $1/R^2$ if the region that you are considering is small. This would be done with a cutoff function; people use, for example, something like $\eta(x) = \big( (2R)^2 - |x - x_0|^2 \big)_+^2$, so you sort of use a cutoff function that picks out the point $x_0$ at scale $R$ or $2R$.
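To illustrate why a cutoff of this shape is convenient (a sketch; the exact form of the cutoff used in the literature differs in details, and this particular $\eta$ is one natural reading of what is sketched on the board): with $\eta(x) = \big( (2R)^2 - |x - x_0|^2 \big)_+^2$ one has $\eta > 0$ exactly on $B_{2R}(x_0)$, $\eta$ vanishes to second order at the boundary, and

$$|\bar\nabla \eta| = 4\, \eta^{1/2}\, |x - x_0| \le 8R\, \eta^{1/2} \quad \text{on } B_{2R}(x_0),$$

so derivatives falling on the cutoff can be absorbed by terms carrying $\eta$ itself, at the price of inverse powers of $R$; this is where the $1/R^2$ in the estimate comes from.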
This is, for example, in the paper with Klaus Ecker from a long time ago in Inventiones, or in the book of Klaus Ecker. Yes: you have to make the hypothesis that the curvature is bounded on the bigger region in order to deal with the cutoff function, all right, and then you get the result only on the smaller region. Yeah, the two that I use is arbitrary: instead of two I could have taken $\frac{11}{10}$, but then this constant gets worse. You need a little bit of room to squeeze in the cutoff function, and you only get the good estimate where the cutoff function is reasonably big, right? But you need the curvature bound on the whole region where the cutoff function lives, and that is why you have to give up a little bit of space.

So now, essentially, you know a big part of the regularity theory for mean curvature flow, because it is very general and it is completely scaling invariant. These estimates here survive if you rescale the thing; they survive because the left-hand side and the right-hand side scale exactly the same way. So it is sort of the best that you can hope for. For example, in the theory of nonlinear hyperbolic equations, like Einstein's equations, you do not have such estimates; that makes them so hard. [Question about a Riemannian ambient space.] Yes, yes: then of course all these constants here will depend on the curvature of the Riemannian ambient manifold, but those are lower-order terms, because it is a fixed smooth thing that does not move, yeah.

Okay, so the next tool I want to show you is the monotonicity formula. And the idea of the monotonicity formula is to say: okay, mean curvature flow is $\partial_t F = \Delta_{M_t} F$. Well, it is a funny Laplace-Beltrami operator, but if you linearize it around the tangent space of the manifold, it is the usual Laplacian on the tangent space. Therefore, couldn't we somehow compare the behavior of a mean curvature flow to the behavior of the ordinary heat equation in the ambient space?

So consider a positive solution $u$ on the ambient space. Well, let us even do this in the Riemannian setting, just to show you how general this is: $u$ solves

$$\partial_t u = -\bar\Delta u,$$

where $\bar\Delta$ refers to the ambient metric $\bar g$; and if you like, you can think of $\mathbb{R}^{n+1}$, yeah, so this could be $\mathbb{R}^{n+1}$ and you just have the heat equation. This is sometimes called the backward heat equation, or, as I prefer to call it, the adjoint heat equation. And of course in $\mathbb{R}^{n+1}$ you can just take the backward heat kernel

$$K(x, t) = \big( 4\pi (t_0 - t) \big)^{-\frac{n+1}{2}} \exp\left( - \frac{|x - x_0|^2}{4 (t_0 - t)} \right),$$

which is centered at the point $x_0$ and at the time $t_0$. So first it looks like the usual Gaussian, and then, as time approaches $t_0$, it looks like the delta function: picture it at some time $t_1$ and then at a later time $t_2 > t_1$, more and more concentrated. And this backward kernel is of course extremely important: for example, it gives you the representation formula for solutions of the ordinary heat equation, right? If you convolve this kernel with some initial data $f$, it gives you the solution of the heat equation at the time $t_0$ at the point $x_0$. So if this kernel has all the information about solutions of the ordinary heat equation, maybe we can use this kernel to get information about our mean curvature flow.
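As a check, a standard computation with the Euclidean kernel above: writing $\log K = -\frac{n+1}{2} \log\big( 4\pi (t_0 - t) \big) - \frac{|x - x_0|^2}{4 (t_0 - t)}$, one finds

$$\partial_t \log K = \frac{n+1}{2 (t_0 - t)} - \frac{|x - x_0|^2}{4 (t_0 - t)^2}, \qquad \frac{\bar\Delta K}{K} = \bar\Delta \log K + |\bar\nabla \log K|^2 = -\frac{n+1}{2 (t_0 - t)} + \frac{|x - x_0|^2}{4 (t_0 - t)^2},$$

so indeed $\partial_t K = -\bar\Delta K$: the kernel solves the adjoint heat equation.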
Now it turns out the kernel here satisfies some interesting relations. Just notice that for the Hessian of the logarithm of this kernel $K$ (if we take the logarithm, we can forget the factor in front depending only on $t$) you get

$$\bar\nabla_i \bar\nabla_j \log K = - \frac{\delta_{ij}}{2 (t_0 - t)} .$$

So there is a very easy expression for this logarithm; it is a concave function. And there is a not-hard-to-prove theorem. I do not know who saw this first; it is certainly related to Richard Hamilton, but people like Elliott Lieb told me about it, and it was known before.

Theorem. If $u: \mathbb{R}^{n+1} \times [0, t_0) \to \mathbb{R}$ is an arbitrary positive solution of the adjoint heat equation $\partial_t u = -\bar\Delta u$, then

$$\bar\nabla_i \bar\nabla_j \log \frac{u}{K} \ge 0$$

as a quadratic form, where $K$ is the kernel and $u$ is arbitrary. In other words, the logarithm of the quotient of an arbitrary positive solution with the standard kernel is convex (I said concave at first; it is convex, sorry, so I rephrase it). So that is just a property of solutions to the linear heat equation.

When you spell this out, just do the computation, what you get is that

$$\bar\nabla_i \bar\nabla_j u - \frac{\bar\nabla_i u\; \bar\nabla_j u}{u} + \frac{u\, \delta_{ij}}{2 (t_0 - t)} \ge 0,$$

again in the sense of quadratic forms (careful: there is a plus sign in front of the last term; I got it wrong on the board first). This is what Richard Hamilton calls a matrix Harnack inequality. Why Harnack, what does this have to do with Harnack? Because you get this corollary: just taking the trace, you get

$$\bar\Delta u - \frac{|\bar\nabla u|^2}{u} + \frac{(n+1)\, u}{2 (t_0 - t)} \ge 0,$$

where it should be $n + 1$, not $n$, because I am in $\mathbb{R}^{n+1}$. And this gives you a lower bound on $\partial_t u$; it implies a Harnack inequality after integrating $\frac{d}{dt} \log u$ along spacetime. I am not going to do this, but you know, this gives a lower bound on $\log u$, and a lower bound on $\log u$ tells you how quickly the solution can cool off at most: this is the term that controls how much it can cool off, and it gives you a very precise estimate. And of course notice that you have equality on the heat kernel, so you get a very precise, sharp Harnack estimate.

And I should also point out that this inequality was generalized by Peter Li and Shing-Tung Yau: Li and Yau extended it from $\mathbb{R}^{n+1}$ to any Riemannian manifold $(N^{n+1}, \bar g)$ with $\mathrm{Ric}(\bar g) \ge 0$. So Li and Yau showed that this inequality is true not just on $\mathbb{R}^{n+1}$ but on any Riemannian manifold of non-negative Ricci curvature; this is the famous Li-Yau Harnack inequality. So this is the background from the linear heat equation; I have just summarized some important properties of the heat kernel and of positive solutions of the adjoint heat equation.
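And equality on the kernel itself is immediate from the Hessian formula above, as a one-line check: with $u = K$,

$$\bar\Delta K - \frac{|\bar\nabla K|^2}{K} = K\, \bar\Delta \log K = - \frac{(n+1)\, K}{2 (t_0 - t)},$$

so the traced inequality holds with equality; and for the matrix version, $\log(K / K) = 0$, so it is trivially an equality as well.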
Now let us see how we can apply this to mean curvature flow. (Another fifteen minutes, right? You want some discussion; okay, I only take five.)

So suppose $M^n \times [0, T) \to (N^{n+1}, \bar g)$ solves mean curvature flow, and suppose we have such a $u$: suppose $u: N^{n+1} \times [0, t_0) \to \mathbb{R}$, where $t_0$ has to be at most $T$, is positive and solves $\partial_t u = -\bar\Delta u$. You see, this is a solution on an $(n+1)$-dimensional manifold, while my manifold is just $n$-dimensional, so I cannot expect that the $(n+1)$-dimensional thing just perfectly works on the $n$-dimensional thing. So I have to adjust. How do I adjust? Well, the heat kernel has this factor in front of it, and this is responsible for the scaling: it has an exponent $\frac{n+1}{2}$. If I want to be on an $n$-dimensional surface, I should just have $\frac{n}{2}$. So I rescale; important point here. Now consider $\rho: M^n \times [0, t_0) \to \mathbb{R}$ defined by

$$\rho(p, t) = \big( 2 (t_0 - t) \big)^{1/2}\; u\big( F(p, t),\, t \big) .$$

This is the rescaling that I just explained: I take $u$ restricted to my surface, but I rescale it in order to adjust the function to the fact that I am on an $n$-dimensional surface and not in an $(n+1)$-dimensional space. Okay, so this is the rescaled heat kernel on the ambient space, and now I want to compute its evolution equation.
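A quick consistency check of this exponent, with the Euclidean kernel as the model:

$$\big( 2 (t_0 - t) \big)^{1/2}\, K(x, t) = (2\pi)^{-1/2}\, \big( 4\pi (t_0 - t) \big)^{-\frac{n}{2}} \exp\left( - \frac{|x - x_0|^2}{4 (t_0 - t)} \right),$$

which, up to the harmless constant $(2\pi)^{-1/2}$, is exactly the $n$-dimensional backward kernel: the right object on an $n$-dimensional hypersurface.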
So now compute $\partial_t \rho$. From the factor in front I get straight away $-(2(t_0 - t))^{-1/2}\, u$; then I get, from the full derivative, $(2(t_0 - t))^{1/2}$ times the ambient gradient of $u$ multiplied with the speed of $F$, which is the mean curvature vector (this is the derivative with respect to the first entry); and then I still get the derivative of $u$ with respect to $t$, which by the adjoint heat equation is $-(2(t_0 - t))^{1/2}\, \bar\Delta u$:

$$\partial_t \rho = -\big( 2 (t_0 - t) \big)^{-1/2}\, u + \big( 2 (t_0 - t) \big)^{1/2} \big\langle \bar\nabla u,\, \vec H \big\rangle - \big( 2 (t_0 - t) \big)^{1/2}\, \bar\Delta u .$$

Now, how do I convert the Laplacian in the ambient space into the Laplacian on the hypersurface? Well, the ambient Laplacian obviously takes all the derivatives in the hypersurface, but I am missing one, the second derivative in the normal direction; and since the tangential derivatives have to remain on the surface, I also get a term from the curving of the hypersurface, given by the mean curvature times the normal derivative of $u$:

$$\bar\Delta u = \Delta_{M_t} u + \bar\nabla^2 u(\nu, \nu) + H\, \bar\nabla_\nu u$$

(is this right? I think I got the sign right: the sign is such that this last term is not cancelling the speed term but adding to it). And then the rest is a trivial calculation, just combining terms and completing the square, and you can write it like this. I add in $H^2 \rho$; this makes the adjoint heat equation on the hypersurface, because the area element on the hypersurface moves like $-H^2$, so the adjoint equation on the hypersurface to the heat equation is $\partial_t \rho = -\Delta \rho + H^2 \rho$. So I add this in, then I have to subtract it off, and I subtract it off by completing the square with the mixed term, written as $|H + \bar\nabla_\nu \log \rho|^2\, \rho$. And what is left over, miraculously, combines exactly to give the Harnack expression in the normal direction. That is the key point, so let us summarize this as a theorem, to conclude the lecture.

Theorem. The function $\rho$ satisfies

$$\partial_t \rho = -\Delta \rho + H^2 \rho - \big| H + \bar\nabla_\nu \log \rho \big|^2\, \rho \; - \; \big( 2 (t_0 - t) \big)^{1/2} \left[ \bar\nabla_\nu \bar\nabla_\nu u - \frac{(\bar\nabla_\nu u)^2}{u} + \frac{u}{2 (t_0 - t)} \right] .$$

The last bracket is exactly the term which we know is greater or equal to zero if we are in $\mathbb{R}^{n+1}$: it is the matrix Harnack expression evaluated in the normal direction, and it vanishes completely on the standard heat kernel. So in particular, in $\mathbb{R}^{n+1}$, $\rho$ is a subsolution of the adjoint heat equation on $M_t$:

$$\partial_t \rho \le -\Delta \rho + H^2 \rho .$$

And of course we do not throw away this good term, right? So we get in fact the better inequality

$$\partial_t \rho \le -\Delta \rho + H^2 \rho - \big| H + \bar\nabla_\nu \log \rho \big|^2\, \rho .$$

Corollary (monotonicity formula). If you take $\frac{d}{dt}$ of the integral of $\rho$, the Laplacian integrates out to zero, the $H^2 \rho$ term kills the evolution of $d\mu$, and we are only left with the good term:

$$\frac{d}{dt} \int_{M_t} \rho\; d\mu \le - \int_{M_t} \big| H + \bar\nabla_\nu \log \rho \big|^2\, \rho\; d\mu \le 0 .$$

That is the monotonicity formula, and that is a good point to stop; we have three minutes.
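For reference, the one-line computation behind the corollary, using only the first variation of the area element under mean curvature flow, $\partial_t\, d\mu = -H^2\, d\mu$:

$$\frac{d}{dt} \int_{M_t} \rho\; d\mu = \int_{M_t} \big( \partial_t \rho - H^2 \rho \big)\, d\mu \le \int_{M_t} \Big( -\Delta \rho - \big| H + \bar\nabla_\nu \log \rho \big|^2\, \rho \Big)\, d\mu = - \int_{M_t} \big| H + \bar\nabla_\nu \log \rho \big|^2\, \rho\; d\mu,$$

since $\int_{M_t} \Delta \rho\; d\mu = 0$ on a closed surface.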