and in particular I want to talk about dual norms. This is another way to generate norms from other norms. The definition is as follows. Suppose f denotes a pre-norm. What is a pre-norm? It satisfies all the properties of a norm except possibly the triangle inequality; it need not satisfy the triangle inequality. Then the function f_d(y), defined as the maximum of y^T x over all x such that f(x) = 1, is called the dual norm. So notice that the dual norm is defined via an optimization problem: in order to find the dual norm at a point y, you need to solve a constrained optimization problem in which you maximize y^T x over all x satisfying f(x) = 1. The cost function y^T x is linear in y and also linear in x, but the space of optimization could be a somewhat complicated space, because it is the set of x such that f(x) = 1. Also notice that, because f is a pre-norm, it satisfies the homogeneity property: f(-x) = |-1| f(x) = f(x). Because of this, the maximum of y^T x over f(x) = 1 is actually the same as the maximum of |y^T x| over f(x) = 1, so we sometimes also write f_d(y) = max over f(x) = 1 of |y^T x|. Both are equivalent optimization problems; you can write it either way. Now, this is called a dual norm not without reason: the dual norm of f is itself a norm. In order to show that, you need to show that f_d(y) satisfies the four properties we need, that is, non-negativity, positivity, homogeneity, and the triangle inequality.
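[Editor's note: the lecture defines f_d abstractly; as an illustration, here is a small numerical sketch, not from the lecture, for the specific choice f = the l-infinity norm. In that case the feasible set {x : f(x) = 1} is the surface of the hypercube [-1, 1]^n, a linear objective attains its maximum at a vertex, so brute-force enumeration of sign patterns evaluates the dual norm exactly, and the result matches the known fact that the dual of l-infinity is l1.]

```python
import itertools
import numpy as np

def dual_norm_linf(y):
    """Evaluate f_d(y) = max { y^T x : ||x||_inf = 1 } by brute force.

    A linear objective over the hypercube surface is maximized at a
    vertex, so enumerating all sign patterns suffices (fine for small n).
    """
    n = y.size
    return max(float(np.dot(y, np.array(s)))
               for s in itertools.product([-1.0, 1.0], repeat=n))

y = np.array([3.0, -4.0, 1.0])
print(dual_norm_linf(y))   # 8.0
print(np.abs(y).sum())     # 8.0 as well: the dual of l-inf is l1
```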
And that is where writing it in the alternative form, as the maximum of |y^T x|, helps: when you take an absolute value and maximize it over a set of points, the result is always going to be non-negative, so clearly f_d(y) is non-negative. The next point is that it is positive unless y equals zero. In order to see that, if y ≠ 0, then f_d(y) = max over f(x) = 1 of |y^T x| can be lower-bounded by evaluating the objective at a specific feasible value of x, and I will particularly choose x = y / f(y), which gives the value |y^T y| / f(y). This choice satisfies the constraint f(x) = 1: since 1 / f(y) is just a positive scaling, by homogeneity f(y / f(y)) = (1 / f(y)) f(y) = 1. So the maximum over all possible x such that f(x) = 1 is at least the value |y^T x| takes at this particular x satisfying the constraint, and that value is ||y||_2^2 / f(y). This is strictly positive: because y ≠ 0, the numerator ||y||_2^2 is strictly positive, and f(y) is strictly positive as well, so the ratio is strictly greater than zero. We also have f_d(0) = 0: if I take the zero vector for y, then no matter which x I choose, y^T x is always equal to zero, so the maximum that |y^T x| can achieve over all x such that f(x) = 1 is just zero. So f_d is equal to zero when y = 0 and strictly greater than zero for y ≠ 0, and it therefore satisfies the positivity property also. Sir, could you explain how f_d(y) is equal to the maximum of |y^T x|? We are not multiplying x by anything like minus one; nothing is multiplied to x. It is trivial actually.
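[Editor's note: the lower bound f_d(y) ≥ ||y||_2^2 / f(y) used in the positivity argument can be checked numerically. This sketch, not from the lecture, uses the concrete pair f = l1, whose dual is known in closed form to be the l-infinity norm.]

```python
import numpy as np

# Check the positivity lower bound f_d(y) >= ||y||_2^2 / f(y)
# for f = the l1 norm, whose dual has closed form f_d(w) = max_i |w_i|.
y = np.array([2.0, -1.0, 3.0])
f_y = np.abs(y).sum()        # f(y) = ||y||_1 = 6
lower = np.dot(y, y) / f_y   # value of |y^T x| at the feasible x = y / f(y)
dual = np.abs(y).max()       # f_d(y) = ||y||_inf = 3
print(lower, dual)           # the bound holds: 0 < lower <= dual
assert 0 < lower <= dual
```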
In fact, suppose you solve this problem and obtain an x at which y^T x is maximized. If you just take -x, then obviously y^T x is minimized when you substitute -x, and -x also satisfies the constraint, since f(-x) = f(x) = 1. So if there were a feasible x at which y^T x is negative and very large in magnitude, then by substituting -x you would achieve the same large value but in the positive direction. That is why the two problems, one where you write f_d(y) = max over f(x) = 1 of y^T x and one where you write f_d(y) = max over f(x) = 1 of |y^T x|, are equivalent. So the mod has been put in so that we always get the maximum value and never the minimum value? No, no, no. Even if you did not take the mod, you would get the same maximum value. Okay, let me put it this way; let us argue by contradiction. Suppose the two problems had different solutions, so that you would get a different f_d(y) from one problem than from the other. Say, for example, the all-ones vector is the solution to the second problem, the one with the modulus, and suppose that happened because y^T x at the all-ones vector was minus 100, so that when you took the modulus you got plus 100, and that was the biggest value the second optimization problem could attain. Then in the first problem, without the modulus, I can equivalently use minus the all-ones vector: where the all-ones vector gave a value of minus 100, minus the all-ones vector gives plus 100, and that will be the maximum value that the first optimization can attain. If you want me to tell it to you in yet another way, I can also rewrite the problem I have written here as follows.
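[Editor's note: the sign-flip argument above can be demonstrated numerically. This sketch, not from the lecture, samples a sign-symmetric set of feasible points on the l2 unit sphere (if x is in the set, so is -x) and checks that the maximum of y^T x coincides with the maximum of |y^T x|.]

```python
import numpy as np

rng = np.random.default_rng(1)
y = np.array([1.0, -2.0, 0.5])

X = rng.normal(size=(1000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # points with ||x||_2 = 1
X = np.vstack([X, -X])                         # include -x with every x

vals = X @ y
# Over a sign-symmetric feasible set, the plain max equals the max modulus:
assert np.isclose(vals.max(), np.abs(vals).max())
print(vals.max())
```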
So I can also write it as f_d(y) = max over f(x) = 1 of -y^T x, and I will write it down here. Because for any x at which y^T x attains its maximum value, if I take -x, then -y^T x attains its maximum value there, and -x also satisfies f(x) = 1. So I could even write it like this, and that is why it is okay to write f_d(y) as the maximum of |y^T x|. Does that help clarify? Yes, sir. Thank you, sir. Now, the cost function itself is linear in y, so f_d is obviously homogeneous; the modulus takes care of the sign, giving f_d(t y) = |t| f_d(y). The only non-obvious property to show here is the triangle inequality. Remember that f(x) itself need not satisfy the triangle inequality, but f_d(y) does satisfy it. What we need to show is that f_d(y + z) ≤ f_d(y) + f_d(z); that is what the triangle inequality says, that the norm of a sum is less than or equal to the sum of the norms. By definition, f_d(y + z) is the maximum over all x such that f(x) = 1 of |(y + z)^T x|, and (y + z)^T x = y^T x + z^T x. Since the modulus of the sum of two numbers is at most the sum of the moduli of the two numbers, this is at most the maximum over f(x) = 1 of |y^T x| + |z^T x|. All I have done is split the modulus across the two terms, and that can only increase the value or leave it unchanged; it cannot decrease it. And this in turn is less than or equal to the sum of the separate maxima: if I take the maximum of each of the two terms individually and add them up, the value can only increase, because it gives me more flexibility in optimizing each objective.
So this is less than or equal to the max over f(x) = 1 of |y^T x| plus the max over f(x) = 1 of |z^T x|, and by definition the first term is f_d(y) and the second is f_d(z). Sir. So, we see that, yeah. Sir, initially you said that all the properties will be satisfied for a pre-norm except the triangle inequality, which may not be satisfied. May or may not be satisfied, yes. But you are proving that it will be satisfied? No, I am showing that f_d(y) satisfies the triangle inequality, not f(y). Okay, okay, sorry. Yeah. So, basically, f_d(y) satisfies all four properties needed to be called a norm, and therefore the dual norm of a pre-norm is always a norm.
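[Editor's note: the triangle inequality just proved can also be spot-checked numerically. This sketch, not from the lecture, uses the dual of the l1 norm, which has the closed form f_d(w) = max_i |w_i|, and verifies f_d(y + z) ≤ f_d(y) + f_d(z) on random vectors.]

```python
import numpy as np

def fd(w):
    # Closed form for the dual of the l1 norm: the l-infinity norm.
    return np.abs(w).max()

rng = np.random.default_rng(0)
for _ in range(1000):
    y, z = rng.normal(size=4), rng.normal(size=4)
    # Triangle inequality for the dual norm (small tolerance for rounding):
    assert fd(y + z) <= fd(y) + fd(z) + 1e-12
print("triangle inequality held on all samples")
```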