to a more mathematical introduction to his work and to the topic of today's colloquium. So, Luigi belongs to this great Italian school of analysis, the school that, I'm told, Ennio De Giorgi led for many years. He has been a full professor of mathematical analysis at the Scuola Normale Superiore of Pisa, and he has been appointed director of the Scuola Normale Superiore. ICTP has a long-standing relationship with the Scuola Normale Superiore, which culminated in the signing of an agreement for a number of joint research collaborations in mathematics and physics. Professor Ambrosio has also given various lectures on optimal transport and Monge-Ampère equations. We are sure that this fruitful collaboration will continue and will be enhanced in the future while he is the director.

He has received many accolades; I will just list a few. He has been a visiting scientist at many major institutions, such as MIT, the Max Planck Institute, ETH and the École Normale. He received the Riemann Prize of the Riemann International School of Mathematics in 2022, the Balzan Prize for achievements in the theory of partial differential equations in 2019, the Blaise Pascal Medal of the European Academy of Sciences in 2019, and the Gold Medal for achievements in mathematics awarded by the National Academy of Sciences of Italy. And, of course, he has been an invited plenary speaker at the International Congress of Mathematicians, which, as all of you know, is an honor. He will be talking about a very interesting topic, but maybe I will leave it to Claudio to elaborate on that. Before that, let me invite Professor Andrea Romani to say a few words.

Thank you, Atish. It is just to say that it is a great pleasure to welcome Professor Ambrosio, who is a well-known scientist and director. I have had the privilege of meeting the director most often, so I'm really looking forward to listening to the scientist today. I'm glad to see a significant part of the Trieste mathematics community here, and not only the mathematics community, which has several strong ties with the Pisa one. So thanks to ICTP for gathering us together and hosting this event.

In joining the welcome to Professor Ambrosio, I always make the point that important recognitions come after good theorems. So in this case my role is to explain, at least with a few slogans, why Professor Ambrosio has been the recipient of hugely important recognitions in the mathematical community. Since I was almost a student I have come across his work, and, with his collaborators and his students, he has brought mathematical analysis to an even higher level than it was before, especially by connecting it, in a much, much deeper way, with what I would now say are almost all branches of mathematics. Among the keywords of his work: he wrote what is probably THE book on gradient flows in metric spaces, together with Nicola Gigli and Giuseppe Savaré. If there is a unifying theme, and maybe I don't know if he would agree, it is to take very classical problems and bring them to a level of extreme refinement of regularity, and also, in passing from the classical picture to the modern one, to drop regularity assumptions, which is not just a game per se. So I certainly have to mention gradient flows, for sure, a topic that actually became huge.
So he has brought, as I said, together with many collaborators, the study of gradient flows in metric spaces, of probability measures, and of geometric measure theory to an even deeper level. For me, there is one piece of work which certainly still contains treasures that geometers will have to dig out properly: the study of currents in Banach spaces. Currents are a very classical object in mathematics, but typically you study them on a smooth manifold and so on; Ambrosio's work brought this to a completely new level, completely unimaginable before these works. And, thinking of differential geometry, which is my subject, I'm sure there is still a lot of understanding to be done on that side. The same goes for other classical problems, like the famous isoperimetric inequality, probably one of the most famous theorems in the history of mathematics: thanks to his work and the work of his collaborators, completely new lines of research have opened up, again by asking these questions in Banach spaces, in spaces of probability measures, in general metric spaces, and so on. For all these reasons, and I could go on (another general theme is the connection between optimal transport and various PDEs which are famous by themselves, and which now have a kind of unifying theme thanks to his work), these are, in thirty seconds, some of the reasons why he is recognized as one of the leading mathematicians in the world.

I was also asked to say a few words about today's talk. Let me confess it openly: I don't know; I'm eager to learn. I gave a quick look at the slides before, so I suspect it's going to be another opening towards an extremely exciting side of the story: connecting analysis with themes coming from machine learning, and also with classical problems about placing sparse and random points, where essentially it's all about minimizing an energy, a theme of course very dear also to our physics colleagues, but which we will now see appearing in a very different and probably less usual approach. So thank you very much for accepting our invitation, and the floor is yours.

All right, okay. First of all, thank you very much for the invitation. Of course, I am honored to be here for this joint SISSA-ICTP colloquium, also on the occasion of the Ramanujan Prize awarded to Professor Fall. In my talk I am going to discuss some problems in the calculus of variations which have a geometric character and which are motivated, as we will see, also by some, let's say, more practical problems related to machine learning, a theory which is of course still in its infancy from many points of view. In fact, I came to this problem thanks to a conversation with Michael Unser, who is a professor at EPFL in Lausanne and a specialist in this topic.

So let me start by giving you the plan of my talk. We start with some general motivation, and then we enter a little bit into the mathematical side of the theory. Of course, I have to introduce some function spaces, which are maybe familiar to some of you, in which we try to study this problem in an infinite-dimensional setting.
Then I will mention the kind of variational problem which is analyzed in the machine learning community, and then I will come to the main, more geometric part of my lecture: I will tell you about extreme and exposed points, and about the analysis we have been doing of the so-called class of CPWL functions, an acronym for continuous piecewise linear functions.

Okay, so this comes from two papers I wrote with different collaborators, last year and, more recently, this year. One of the co-authors is Camillo Brena, who, by the way, is a former student also at SISSA, in Trieste, and another is Shayan Aziznejad, a PhD student at EPFL; these papers are more on the mathematical side, while the paper by Michael Unser is more on the applied side of the theory. If you are interested in the purely mathematical side, there are also the papers I will mention by Bredies and Carioni, which appeared in Calculus of Variations and PDE, and the paper by Boyer, Chambolle, De Castro, Duval and de Gournay, which appeared in the SIAM Journal on Optimization. Again, if you are interested in this topic of using this kind of energy, the Hessian-Schatten total variation, as a regularization term, you may also have a look at those papers.

Okay, so let me start with a very informal introduction. What do we mean by sparsity? In the classical theory, which is based on Fourier series, sparsity means that you may have problems where, let's say, only very few coefficients in a broad sense are relevant: when you do the Fourier expansion, only very few coefficients matter. Of course, whenever you have sparsity, this is very important for compression, because it means that you can use only those coefficients to reconstruct the signal.

More recently, a kind of analogy with this problem, which typically belongs to the theory of signals, appeared also in the calculus of variations. In particular, in the two papers I mentioned, the authors consider this kind of energy, where F, as I will explain later on, is the usual discrepancy term, for instance in the applications to machine learning, and Φ is a regularization term. What is important is that they were able to prove, basically by convex analysis tools, that you can find a minimizer admitting a special representation. Say, after normalization, that Φ(ū) = 1, where ū is the minimizer; this is not restrictive, up to normalization. Then you may think of ū as a convex combination, as in the formula I am repeating here. And what are the u_i, the coefficients of this convex combination? The u_i are extreme points of the unit ball of the energy Φ. So here comes the geometric problem of trying to study the extreme points of the energy Φ.
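To fix ideas, here is a schematic form of the representer-type result just mentioned; the notation is simplified with respect to the quoted papers of Bredies-Carioni and Boyer et al., where the precise hypotheses can be found. One can find a minimizer ū of the functional F(u) + λΦ(u) of the form

\[
\bar u \;=\; \sum_{i=1}^{N} \theta_i\, u_i, \qquad \theta_i > 0, \quad \sum_{i=1}^{N} \theta_i = 1,
\]

where, after the normalization Φ(ū) = 1, each u_i is an extreme point of the unit ball {Φ ≤ 1}; in the quoted papers, N is moreover bounded in terms of the number of data, which is the precise sense of sparsity here.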
What is the connection with machine learning? It can be explained in these terms. The typical form of a machine learning problem, when you try to train a network, is this: first of all you design the network, that is, you specify the class of functions with which you will try to reproduce the data, and within this class of functions you will optimize; the function depends on a free parameter θ. Then you minimize, with respect to this free parameter, a discrepancy term with respect to the data you have, plus a regularization term Φ; λ is a kind of tuning parameter between the discrepancy term and the regularization term.

Of course, these problems have been well studied in the classical theory, but in the classical theory the dependence of u on θ is typically linear; this is the theory based on kernel methods, where for instance the kernel H could be a kind of Gaussian kernel. In terms of performance, however, at least empirically, we now know that this approach has been outperformed by nonlinear versions, for instance deep neural networks. Essentially, the simplest version of a deep neural network consists of a number L of layers, and the parameters of the problem in which you want to minimize correspond basically to L affine functions: W_1, ..., W_L are matrices and b_1, ..., b_L are bias vectors. So you are basically describing L affine functions, and you compose them. Of course, if you only composed affine functions you would simply get an affine function; what you really do is compose them, in addition, with the so-called activation function, so that there is a nonlinearity. The typical form of the nonlinearity used in the applications is the positive part: you apply the positive part to all components of the vector, and then you iterate all these operations.

So what can we say? In this parametric class of functions (sorry, let me go back a little), you have to single out the most natural class of functions to deal with. It is clear that a natural class is that of continuous piecewise linear functions, because this class is invariant under all these operations: truncation, that is, composition with the activation function, translation, multiplication by a matrix, and so on. And actually this is true in a strong sense, because there is a relatively recent paper, from 2016, which shows that any continuous piecewise linear function can be generated in this way, provided you have a sufficiently large number of layers. Of course, the more complex your piecewise affine function is, the more layers you need; that is intuitive.

Here comes another analogy. In the classical Fourier theory it is known that sparsity is produced by adding an L1 regularization. So Unser and his collaborators proposed a regularization which is compatible with the class of piecewise affine functions: the one they suggested, which I will explain in the next slide, is based on the space of functions with bounded Hessian, with the total variation of the second derivative as the regularization term. In this measure-theoretic framework, the analogue of sparsity is that the second derivative is a concentrated measure. And indeed, if you think of a piecewise affine function, its second derivative in the sense of distributions is concentrated precisely on the edges which appear; in the simplest scenario, the one-dimensional case, where you have the so-called splines, the second derivative is concentrated at single points. The second derivative, of course, in the sense of distributions.
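Since the talk keeps returning to the fact that these compositions produce CPWL functions whose second distributional derivative is a concentrated measure, here is a minimal numerical sketch of that statement in dimension one; it is not from the talk: the network is the simplest one-hidden-layer version of the architecture described above, and all weights are hypothetical, chosen only for illustration.

```python
import numpy as np

# A minimal sketch (hypothetical weights, for illustration only) of the point
# made above: a ReLU network is a composition of affine maps with the positive
# part, hence a continuous piecewise linear (CPWL) function, and its
# distributional second derivative is concentrated (Dirac masses at the knots).

w = np.array([1.0, 1.0, -2.0])   # hypothetical first-layer weights
b = np.array([0.0, -1.0, 1.0])   # hypothetical first-layer biases
a = np.array([1.0, -2.0, 0.5])   # hypothetical output-layer weights

def u(x):
    # one hidden layer in dimension one: u(x) = sum_i a_i * max(w_i * x + b_i, 0)
    return np.maximum(np.outer(x, w) + b, 0.0) @ a

x = np.linspace(-2.0, 3.0, 5001)
h = x[1] - x[0]

# The second difference divided by h approximates the mass of D^2 u on each
# grid cell: it vanishes where u is affine and records the slope change at a knot.
d2 = np.diff(u(x), n=2) / h

knots = np.sort(-b / w)                     # kinks sit at x = -b_i / w_i
print("knots:          ", knots)
print("support of D^2u:", x[1:-1][np.abs(d2) > 1e-6])
```

Running this prints the knots together with the grid points where the discrete second derivative is nonzero, and the two lists coincide: away from finitely many points the function is affine and the second derivative vanishes.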
Okay, so now let me enter a little bit into the mathematical part of my talk. First of all, we need the definition of the Sobolev space W^{1,p}, which is the space of all functions in L^p, so whose p-th power is integrable, with the property that the derivative in the sense of distributions is still representable by a function in L^p. These are by now classical spaces, called Sobolev spaces. In formulas, you ask for this integration by parts, where of course ∂u/∂x_i is in general not the classical derivative, but the function for which the integration by parts formula works; if there is such a function, it is unique, and of course the definition is consistent with the classical, smooth case.

You can iterate this procedure because, as in the theory of distributions, you can differentiate infinitely many times, and so define the spaces of functions whose derivatives in the sense of distributions, up to order k, are in L^p. In our case, it is sufficient to stop at second derivatives. But our second derivatives, just look at this example, will not be functions but measures, okay? And so we come to the next definition. The functions whose distributional derivative is a measure form the space BV, a space very much studied by the Italian school since the work of Caccioppoli, De Giorgi and so on. Again, you ask for an integration by parts formula, where now the right-hand side contains not a function but a measure. Here I gave the definition for a vector-valued function, so this holds for each of the components u^j, with j between 1 and m: we have m components, and for each partial derivative of each component we have a measure. Okay, so this is the frame.

Let me just say a few words, for those who are not experts, about what we know about the distributional derivative. In general, we can split this derivative into three parts, which are called the absolutely continuous, jump and Cantor parts. The absolutely continuous part is the part which is absolutely continuous with respect to the Lebesgue measure, so the one for which you have a density; the density is typically denoted by ∇u, because indeed one can show that it is the differential of u in a suitable sense. Then there is the jump part, which is concentrated on the discontinuities of u. Think also of the higher-dimensional example: let's say you have the top of a roof; then of course the derivative is concentrated on this line, and it is proportional to the difference between the values of the jump from one side to the other, right? Sorry? No, no, for the moment, sorry, yes, that example is already second order; so let me just do the first-order version. Thank you. So let's say you have a function which jumps along a set: you have a discontinuity, with limits on both sides, denoted by u^+ and u^-, and you have the normal ν; these are the objects which appear over there, and they give you the so-called jump part of the derivative, which, as you see, is a tensor product: the first factor is a vector in R^m and the second factor, the normal, is a vector in R^n. And then there is a more mysterious part, which we call the Cantor part of the derivative.
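For the record, the integration-by-parts definition and the three-part decomposition just described can be written as follows, in standard notation, with J_u the jump set and ν its normal:

\[
\int_\Omega u^j\,\frac{\partial \psi}{\partial x_i}\,dx
\;=\; -\int_\Omega \psi\; dD_i u^j
\qquad \forall\,\psi\in C_c^\infty(\Omega),\ 1\le i\le n,\ 1\le j\le m,
\]

\[
Du \;=\; \nabla u\,\mathcal{L}^n \;+\; (u^+-u^-)\otimes\nu\;\mathcal{H}^{n-1}\llcorner J_u \;+\; D^c u.
\]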
I say it is mysterious because, in some sense, it cannot be related to the pointwise behavior of the function; it has far less to do with a differential in any suitable sense, while the jump part can be recognized because it has to do with the jumps of the function. The Cantor part is more mysterious. And why is it called the Cantor part? Because of the famous example of the Cantor-Vitali function, which is also called the devil's staircase: the example of a function whose derivative is zero almost everywhere, so that its graph is flat almost everywhere, but which is continuous. So you have neither the absolutely continuous part of the derivative, because the derivative in the pointwise sense is zero, nor the jump part, because the function is continuous, no jumps; but still you have a derivative, because you go from zero to one. All of the derivative, in this case, is Cantor part.

Of course, when we later consider the class of continuous piecewise linear functions, only the jump part will be relevant, because in this case only the gradient jumps, along hypersurfaces. Nevertheless, for geometric reasons, as we will see, it is important to place our problem in the full setting and to consider potentially all the other parts of the derivative; this is necessary, in some sense, to single out the piecewise linear functions as special ones.

Okay, now comes my space BH, the space which was proposed as a regularizer; by the way, this space was already well known in the theory of elasticity and in the calculus of variations, and in some sense it has been rediscovered by the machine learning community. It is the space of functions with bounded Hessian: you consider the functions which have a derivative in L^1 and such that all the components ∂u/∂x_i of the gradient are BV functions. Then there are a few results about integrability; in particular, this space has been studied in the context of elasticity by French researchers. Demengel, for instance, proved that in dimension 2 this space embeds into the space of continuous functions; but these are more technical points, which I leave to those more interested in the mathematical part.

So now let me move to the choice of the energy, the regularizing term in the variational problem. Here I denote by D²u this kind of symmetric matrix of measures, whose entries are the derivatives, along the i-th direction, of ∂u/∂x_j; by the symmetry of distributional derivatives, this matrix is symmetric. So this is a vector-valued measure, and what you can typically do is write it as H_u times a scalar measure, which is called the total variation, where H_u is a symmetric matrix, let's say with norm equal to 1. So H_u gives you the orientation, in a suitable vector-valued sense, of this second derivative. And there is a deep theorem, proved by Giovanni Alberti, showing that H_u is a rank-one matrix on the singular part; it is deep because it applies also to the Cantor part of the derivative. So, in some sense, the Cantor part is more alike to the jump part than to the absolutely continuous part. And this, of course, will be an important piece of information for us.
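In formulas, the polar decomposition and Alberti's theorem just mentioned read, schematically,

\[
D^2 u \;=\; H_u\,|D^2u|, \qquad |H_u| = 1 \quad |D^2u|\text{-a.e.},
\]

with H_u of rank one |D²u|-almost everywhere on the part which is singular with respect to the Lebesgue measure.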
And so, combining rank one with symmetry, eventually one can prove that, on the singular part of the second derivative, H is the tensor product of a vector with itself. Now you can choose your favorite seminorm on matrices, or, if you like, only on symmetric matrices, and define your energy in this way: you weight H_u according to your favorite seminorm, and then you integrate. We will see, and it was already observed in the applied community, that the most relevant choice of norm is the so-called Schatten norm, where you just sum the moduli of the eigenvalues; of course, there is also the more traditional Euclidean norm, which corresponds to taking the square root of the sum of the squares. So let me denote, in the rest of my talk, by Φ_S and Φ_E the two energies that you can build by putting here either the Schatten norm or the Euclidean norm.

A small remark that will play a role later on: these energies behave well with respect to many operations, in particular, as we will see, under convolution, but also under symmetrization. If you take a function u and you symmetrize it, meaning that you take the mean value of u on the sphere of radius |x| (maybe this is the most legible formula), getting essentially a function of one variable, a radial function, then any of these energies decreases.

A typical variational problem studied in the applied community is this: you minimize the discrepancy term plus my energy Φ, with the two tuned by the parameter λ. Then come analytic questions, related to the embedding of these spaces into continuous functions: one can prove that for n ≥ 3 this problem is, in general, not well posed. On the other hand, in the case n = 1, that is, for functions of just one variable, the problem is well posed, and indeed in this case what Unser and collaborators did was to compute minimizers explicitly and prove that the minimizers are exactly splines, one-dimensional of course, like those I wrote over there. Then there is the critical case n = 2, which we studied in the second paper, with Conti and Brena; this is really the critical dimension, because there are concentration effects of the Hessian, and one has to analyze them very carefully. Maybe this part is more relevant for those of you who already work in the calculus of variations, so I will go over it more quickly; also because the assumption n = 2, I would say, is not particularly realistic for applications to machine learning, it is more of a mathematical question. Anyhow, in dimension 2 one can prove that this problem has a solution, but only for λ sufficiently small; of course everything also depends on the power q that you put here, and if you put q = 1, then this is true, you have a minimizer, for any value of λ. As I said, in this problem everything is related to the analysis of the concentration effects of the Hessian along minimizing sequences, and so one has to study a capacitary problem: you try to minimize the energy among all functions which are equal to 1 at the origin and which have compact support. The constant 4π really emerges from the analysis of this capacitary problem; I think this is not so surprising.
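Written out, schematically and in my notation, with H_u as above and q the power just mentioned, the two energies and the variational problem under discussion are

\[
\Phi_S(u) = \int_\Omega \|H_u\|_{S_1}\,d|D^2u|, \qquad
\Phi_E(u) = \int_\Omega \|H_u\|_{E}\,d|D^2u|,
\]

\[
\min_u\; \sum_{i=1}^N |u(x_i)-y_i|^q \;+\; \lambda\,\Phi(u),
\]

where ‖·‖_{S_1} is the sum of the moduli of the eigenvalues and ‖·‖_E is the square root of the sum of the squares of the entries.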
Here F represents the data of your problem: you are given... sorry, there is a typo here, obviously it should be u(x_i) − y_i; let me check whether the same typo was also there before... okay, thank you.

So let me now move to the more geometric part of my talk, which is, I think, the most interesting one. Let me remind you of the definitions of extreme point and exposed point. The definition of extreme point, in convex analysis, is a purely geometric definition: let's say that you are in a vector space and you have a subset S of your vector space; you say that a vector v is an extreme point if you cannot write it as a convex combination of distinct points of S. For instance, for a circle all points are extreme points; for a square, the corners are the extreme points; and so on. There is also another interesting notion, that of exposed point, which uses a little bit of topology: if you have a topology, you may say that a point is exposed if it is the unique minimizer of some continuous linear function.

What are the basic facts about these two notions, well known since, I would say, the 1950s? First of all, in a Hausdorff topological vector space, a compact convex set is generated by its extreme points; generated means that if you take the convex combinations and then take the closure, you recover the whole set. Of course, if you are in finite dimensions, you don't need to take the closure; this is Carathéodory's theorem. Also, there is a clear implication between exposed and extreme: exposed points, being unique minimizers, are clearly extreme. But in general the two concepts do not coincide; think for instance of this picture, a kind of stadium. This point is clearly extreme, but it is not exposed: there is no way to make it the unique minimizer of a linear function. However, there are still plenty of exposed points: even if in principle they are fewer than the extreme points, they are dense in the class of extreme points, and so you can still write the Krein-Milman theorem using exposed points.

Okay, so, as I said at the beginning of my lecture, the main point is to try to understand the extreme points of the unit ball of my energy. This is a second-order problem, so, for the sake of illustration, let me first tell you what we know about the first-order problem, which is basically well known, let's say part of the folklore of this subject. We consider all BV functions, so just one derivative, modulo constants; in this way the total variation becomes not only a seminorm but a norm, and you try to understand the extreme points of this unit ball, computed with the total variation. In this case the answer is pretty well known. One can prove that u is extremal if and only if, first of all, the total variation is equal to 1 (this is pretty clear: you must not be in the interior of the ball); not only that, but u also has to be a constant multiple of a characteristic function, c times the characteristic function of a set E; and moreover the set E is connected, not really in the topological sense, but in a measure-theoretic sense that I'm going to describe.
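In formulas, the first-order characterization just stated, for BV(Ω) modulo constants with Ω connected, reads

\[
u \ \text{is an extreme point of}\ \{v:\ |Dv|(\Omega)\le 1\}
\quad\Longleftrightarrow\quad
|Du|(\Omega)=1, \quad u = c\,\chi_E, \quad E\ \text{indecomposable}.
\]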
So how does the proof work? The proof goes by trying the most natural decompositions of u. First of all, it is clear, as I said, that the total variation of u has to be equal to 1; this is a necessary condition. Then you try the most natural splitting of your function: you truncate at some level l, writing u as the sum of the truncation at level l and of (u − l)^+, the positive part of u − l. In this way we have a natural decomposition of u. Then one can compute two constants, α and β, which are the integrals of the perimeters of the level sets from 0 to l and from l to infinity; and the coarea formula tells you that α + β is the total variation, so it is equal to 1. So we are basically building in this way a convex combination u = α v₁ + β v₂. If u is extremal, this must be a trivial convex combination, so v₁ must be equal to v₂. But v₁ and v₂ are proportional to u ∧ l and to (u − l)^+, so these two functions, whatever level l you choose, are proportional; and since the level is arbitrary, it is not difficult to deduce from this information that u has basically only two levels, let's say the level 0 and the level on the set E, so it is a multiple of a characteristic function.

And then comes a deep result of Herbert Federer in geometric measure theory, which is analogous to the theorems you have in topology, where any open set can be written as a disjoint union of connected components: in this measure-theoretic setting, one can prove that any set of finite perimeter can be written as a countable union of indecomposable components. Decomposable means that you can write E as a disjoint union of sets in such a way that the perimeters add; of course, if E were decomposable, if you were able to write E in this way, then again you would be able to produce a nontrivial decomposition of u. And so E has to be indecomposable. So this problem is pretty well understood for one derivative.
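Schematically, for a nonnegative function u, the decomposition used in the proof is

\[
u = u\wedge l + (u-l)^+, \qquad
\alpha = \int_0^l P(\{u>t\})\,dt, \qquad \beta = \int_l^\infty P(\{u>t\})\,dt,
\]

and the coarea formula gives α + β = |Du|(Ω) = 1, so that

\[
u = \alpha\,v_1 + \beta\,v_2, \qquad v_1 = \frac{u\wedge l}{\alpha}, \quad v_2 = \frac{(u-l)^+}{\beta}
\]

is a convex combination of two functions v₁, v₂ of unit total variation.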
What I'm going to tell you now is what we have been able to prove for the second-order problem. Again we take a connected open set, and we consider all functions in BH, this time modulo affine functions, because affine functions are not seen by my second-order energy Φ; in this way, modulo affine functions, Φ again becomes a norm, and we would like to understand the extreme points of the corresponding unit ball S. As I said, the one-dimensional case has been studied; it is not particularly hard, and one can prove that all the extreme points are piecewise affine, the so-called splines. So the natural candidates in higher dimensions are, of course, the piecewise linear functions. What do we expect about this problem? We expect it to be more rigid than the first-order one, as usual, because the second derivative is more rigid than the first: for instance, it is symmetric, just to give you one extra piece of information. And, as we will see, the choice of the norm will be particularly relevant.

So what are the questions we try to address? First: can we characterize the CPWL functions which are extreme? This can be done: basically, if you take a piecewise affine function, one can find conditions, in some sense on the support of the second derivative, a kind of connectedness condition, which tell you whether this piecewise affine function is extreme. (Actually, this is not yet clear for the more restrictive notion of exposed point.) But the more delicate question, which was really raised by Unser and collaborators, is: can we say that all the extreme functions, as in dimension one, are piecewise linear, or not? Of course, a positive answer would be very good, because it would tell you, again in terms of the convex representation, that any function can be represented as a convex combination of piecewise linear functions. Actually, we found that the answer is no, it is not true: for instance, if you take this radial function, a kind of truncated cone (you may rotate this picture around the vertical axis), all these functions are extreme. The proof is not obvious: it has to use all the structure of the distributional derivative I told you about, because in principle you have to rule out decompositions through candidates which are not piecewise linear, so you need all the information on the second derivative of an arbitrary BH function. So the answer is negative; actually, it would be interesting to know, and we still don't know, whether the truncated cone is not only extreme but also exposed. That's an interesting question. By the way, these truncated cones are extreme for all the Schatten p-norms: besides p = 1, you can take the sum of the p-th powers of the eigenvalues to the power 1/p, and they are extreme for all p. But we are going to see that, for the question we are about to discuss, it is really essential to have p = 1, and this motivates even more why, in the applied literature, they really are convinced about the use of the Schatten norm, the one with p = 1.

Anyhow, we have a negative result, but we may try to recover a little bit from it, trying to prove at least that the CPWL functions are dense in the class of extreme functions. So my next question is: we know that there are functions which are extreme and not piecewise linear, but can we approximate them in an efficient way? Are the CPWL functions dense in energy, with respect to Φ_S, in this class? I still have some time, so let me explain. Remember that here we are really considering my geometric problem, so I don't mean just density in the L^p topology, which would be trivial: I need the so-called density in energy, meaning that I want to approximate, let's say in L^1, but in such a way that the energies converge, Φ_S(u_h) → Φ_S(u). Then, up to a normalization, we can say: if Φ_S(u) = 1, so that u is on the sphere, we want the approximating CPWL functions to be on the sphere as well; indeed, by scaling, you can assume this. Well, this is more for the experts: all these energies are lower semicontinuous, so what is crucial here is to get the limsup inequality.

And it is important to notice that this is really a question about smooth functions, not about BH functions, because you can immediately reduce the question to smooth functions: if you are able to do this for a smooth target function u, you are able to do it for any function u, because any of these Φ energies decreases under convolution. Why? Because these energies are convex, and a convolution is a mean value of translates, so the energy decreases under convolution. So basically we can rephrase the question as: given u, not merely in BH but C^∞, can we find an approximation such that the energy converges? And, maybe again more for the experts: if we knew that for the Euclidean norm the answer is positive, then it would be true for any of these Φ energies. But actually, as I said before, the choice of the energy is relevant for this question. The reason (maybe I will go over this quickly, but the experts can ask me afterwards, with the slides) is the following: first of all, the two energies, the Schatten and the Euclidean one, coincide on CPWL functions, because on these functions you have only the jump part of the energy; on the other hand, they do not coincide when, for instance, your target function is just a quadratic function, in which case you have a strict inequality. Combining the strict inequality with an argument by contradiction, one sees that a positive answer for Φ_E would give a positive answer for all the other energies, and this is not possible. So for Φ_E the answer is no: it is not possible to do the approximation.
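A quick linear-algebra computation behind the coincidence and the strict inequality just mentioned: for the jump part of a CPWL function, H = ±η ⊗ η with |η| = 1, whose eigenvalues are ±1 and 0, so

\[
\|H\|_{S_1} = 1 = \|H\|_{E},
\]

while for the quadratic u(x) = |x|²/2 one has ∇²u = Id and, for n ≥ 2,

\[
\|\mathrm{Id}\|_{S_1} = n \;>\; \sqrt{n} = \|\mathrm{Id}\|_{E}.
\]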
Okay, so we obtained our result in different steps; let me first tell you the most elementary and classical step. It is not particularly difficult to prove that you can get the limsup inequality, by a more or less standard approximation, but only up to a multiplicative factor, so this result is of course not optimal. This non-optimal result can be obtained by any reasonable interpolation: you do a Lagrange interpolation, or whatever reasonable interpolation, and essentially the Poincaré inequality on the single simplices of the decomposition gives you this inequality. But, and maybe this is clear to the experts in Gamma-convergence in the audience, if you want to get the sharp approximation, you need in some sense to adapt your frame to the local behavior of the function: even if, as I said, your target function is completely smooth, its Hessian matrix will be diagonal in frames which change from place to place, and so you have to adapt your frame to this local behavior. And this is not so obvious, because if you just change your frame in a naive way, typically the geometry of the simplices degenerates. This is very well known in numerical analysis: when you have triangulations with very small angles, the constants that you get in the embeddings degenerate, they get out of control. So you have to adapt your triangulation while keeping, in some sense, control on the geometry of the simplices. Eventually we were able to do this, first in our paper in dimension 2.

Maybe, just to close my lecture, let me give you some idea of the construction in dimension 2, and then I will tell you about the higher-dimensional case. What is the idea of the construction? These are the very first steps: your domain has been divided into many squares (we are in dimension 2), and you know that in each of these squares, here I have singled out one of them, the Hessian of u is almost constant, and it is diagonal in this tilted direction; the tilted direction is the one in which your Hessian is diagonal, basically almost diagonal at every point inside this square. Of course, in a different square you will have a completely different frame: you will have to tilt in a quite different direction. Then, once you know that this is the good direction, so to speak, you can do the most natural thing: you divide the square into triangles, which gives you, let's say, optimal constants. The point is how to refine these triangulations near the boundary of the squares, because when you consider the triangulation of a nearby square, you don't want to create jumps in the derivative.
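Schematically, the two levels of the approximation result are

\[
u_h \in \mathrm{CPWL}, \quad u_h \to u \ \text{in } L^1(\Omega), \qquad
\limsup_{h\to\infty} \Phi_S(u_h) \le C\,\Phi_S(u)
\]

for the standard interpolations, with C the multiplicative factor mentioned above, versus the sharp statement

\[
\lim_{h\to\infty} \Phi_S(u_h) = \Phi_S(u),
\]

which is what the adapted triangulations achieve.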
And in dimension 2 we have been lucky, because one can use a kind of self-similar construction: as soon as you have done the trivial decomposition here, you continue in this region, and again in this rectangle you do the trivial decomposition, and then you continue; there is some kind of self-similarity which eventually tells you that everything you are doing is under control, and that there is a degeneration only near the boundary, which can eventually be controlled. So this proof uses a kind of self-similar construction: you see, for instance, that the triangle which appears at this level is similar to the one which appears here, and so things can be kept under control. More recently, with a much more complex construction, with the help of Sergio Conti in Bonn, we obtained the extension without using self-similarity, but with a kind of genuine optimization technique for the frames, and we were able to get the same result in any dimension.

As I said, why is this result relevant, at least from the theoretical point of view? Because it tells you that all the extreme points of the unit ball are at least limits of extreme CPWL functions. So, even though we know that not all extreme points are CPWL, it is still true that we can recover, in some sense, the unit ball of my energy by taking not only the convex hull, but, and this is typical of infinite dimensions, the closure of the convex hull of the extreme CPWL functions. So, in some weaker sense, we recover the heuristics of Unser and collaborators. I think I can stop here; thank you for your attention, and of course I will make the slides available if you are interested.

You mentioned before that if this density in energy were true with respect to the Euclidean norm, then by this argument, by Reshetnyak, the same would be true for any other norm. I wonder whether, now that you have this result about density in energy just for the Schatten norm, it implies anything for some other special types of norms; I don't know, maybe norms such that the Schatten norm is a convex hull of a finite number of Euclidean ones, something like that.

Yes, I would say, in principle, keeping in mind also the proof of Reshetnyak's theorem: a norm with respect to which the Schatten norm is, so to speak, uniformly convex.

Exactly, so a kind of relative uniform convexity, a relative strict convexity.

Yes, I think this is reasonable.

Thanks. You stated the density theorem for a cube, Ω a cube; can it be easily extended to polyhedral sets, or is there some other difficulty?

Well, in fact in the second paper we considered a slightly more general question: let's say that as soon as you are able to extend your BH function from Ω to a slightly larger domain, without creating second derivative on the boundary, then you can make it work. But it is not so clear to me whether there is a general result in this direction: for the cube you do a reflection and you make it, but for a general domain, I would say, it is not so clear; anyhow, it is a kind of extension problem. Otherwise, if you don't want to do the extension, I think one should refine our argument a little bit when going to the boundary; but since our proof was already quite complicated, we didn't want to introduce even this extra level of refinement.
But I would guess that for a C² domain, with more complications in the proof, even if you are not able to extend, it should be true.

I have a comment, or question; I confess it is a curiosity, and I hope it is not too embarrassing, I am a geometer. In passing from first order to second order: in the first-order problem you were studying, there were also data, the x_i and y_i, and in fact you had a non-degeneracy condition to be imposed. So my first question is: is there a connection between the fact that this non-degeneracy condition was about an affine space and the fact that you are studying piecewise linear versus non-piecewise-linear functions? I mean, is the linearity of the functions connected to the linearity of the non-degeneracy condition?

The non-degeneracy condition was only meant to exclude the trivial solution, because the variational problem I mentioned was already second order: if you know that your data lie on an affine plane, you have the trivial solution, for which the energy equals zero, and you fit u completely by an affine function. In that statement I was saying that, assuming this is not the case, the problem is ill posed; this is what I was saying about dimension 3 and higher.

But then, when you pass to second order, what about data like the x_i and y_i? The problem seemed to become different; at least, I didn't see them in the part about extreme points. Is there an analogous problem, like the first-order one?

No: the variational problem was already second order; Φ was already meant to be second order, and the variational problem was already the one with two derivatives, not with one. I presented the first-order version of the geometric problem, the characterization of the extreme points of the unit ball, only for the sake of illustration. If you consider the variational problem with first order, the non-degeneracy condition is that the data are not constant, because there is only one derivative.

Any other question? If not, I have to invite everybody with a PhD to leave this room, and also those who don't have a PhD, only for age reasons, because it is our tradition to leave the speaker for a few minutes with the students, so that they can discuss freely about life, mathematics and science.