So, we are going to do this proof in multiple steps; we will split it into three or four steps so that it is easier to follow. The proof goes along similar lines to what we did for the weighted majority algorithm, that is, prediction with expert advice, but here we have to account for the fact that we are dealing with estimators rather than the actual loss values. Our goal is to show that $\bar R_n \le \frac{k}{2}\sum_{t=1}^{n}\eta_t + \frac{\log k}{\eta_n}$. Let me fix the notation: $\ell_{t,i}$ is the loss of arm $i$ in round $t$, and $\hat\ell_{t,i} = \frac{\ell_{t,i}\,\mathbb{1}\{i_t = i\}}{P_{t,i}}$ is its importance-weighted estimator. First, look at the expectation of $\hat\ell_{t,i}$. Notice that $\hat\ell_{t,i}$ is itself a random quantity, right, because of the way it is defined: $i_t$ is random, and its randomness induces randomness on $\hat\ell_{t,i}$. So take the expectation with respect to that randomness; $i_t$ is distributed according to $P_t$, so $\mathbb{E}_{i_t\sim P_t}[\hat\ell_{t,i}] = P_{t,i}\cdot\frac{\ell_{t,i}}{P_{t,i}} = \ell_{t,i}$. We said this already: it is an unbiased estimator. But now look at a different quantity: treat the index $i$ itself as a random variable with respect to which we take the expectation, with the realized $i_t$ held fixed. What is the meaning of that? The index $i$ is drawn according to the distribution $P_t$, so it is a summation over $j = 1$ to $k$: $\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] = \sum_{j=1}^{k} \hat\ell_{t,j}\,P_{t,j}$. If we now plug back the definition of the estimator, this is $\sum_{j=1}^{k} \frac{\ell_{t,j}\,\mathbb{1}\{i_t = j\}}{P_{t,j}}\,P_{t,j}$. The indicator keeps only the term with $j = i_t$ and all the other terms vanish; but remember, $i_t$ is itself a random quantity.
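As a quick sanity check of the unbiasedness claim (not part of the proof), here is a small Monte-Carlo simulation; the function name, loss vector, and probabilities are all made up for illustration:

```python
import random

def mc_estimator_mean(losses, probs, n_samples=200_000, seed=0):
    """Monte-Carlo mean of the importance-weighted estimator
    hat_ell_{t,i} = ell_{t,i} * 1{i_t == i} / P_{t,i}, with i_t ~ P_t.
    By unbiasedness, each component should approach the true loss."""
    rng = random.Random(seed)
    k = len(losses)
    arms = list(range(k))
    totals = [0.0] * k
    for _ in range(n_samples):
        i_t = rng.choices(arms, weights=probs)[0]   # draw i_t ~ P_t
        totals[i_t] += losses[i_t] / probs[i_t]     # all unplayed arms get 0
    return [s / n_samples for s in totals]

losses = [0.2, 0.9, 0.5]   # hypothetical losses in [0, 1]
probs = [0.5, 0.3, 0.2]    # hypothetical sampling distribution P_t
print(mc_estimator_mean(losses, probs))  # each entry close to `losses`
```

Note that the estimator has high variance for arms with small $P_{t,i}$, which is exactly why the second moment shows up later in the proof.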
So, this quantity is simply $\ell_{t,i_t}$: wherever $j = i_t$, that term remains, and everywhere else it is going to be 0, ok. With this, let us proceed; this we are going to call the first step. A word on notation, since I am being slightly sloppy: when I write $\ell_t$, it is a vector, and $\ell_{t,i}$ is the $i$th component of that vector; let us try to follow that convention. The point here is just which index you treat as the random variable. When we wrote the estimator, we fixed an $i$; for every fixed $i$ it is a random variable, and the first computation took the expectation with respect to $i_t$. Here we are further taking an expectation with respect to a random index $i \sim P_t$, so whatever we got, $\ell_{t,i_t}$, is itself still a random variable, right. You will see why this is useful in the further steps, ok. Now, let us fix one particular action $k$. Then $\sum_{t=1}^{n}\ell_{t,k}$ is the total loss you incur if you always play the $k$th action, and $\sum_{t=1}^{n}\ell_{t,i_t}$ is the total loss you incur if you play as per your algorithm, which says to play $i_t$ in round $t$. Using this notation and the two identities above, the loss difference can be written as $\mathbb{E}\big[\sum_{t=1}^{n}\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] - \sum_{t=1}^{n}\hat\ell_{t,k}\big]$: I have written the difference of the two cumulative losses in terms of expected values induced by the randomness of the algorithm, ok. Now we are going to rewrite this quantity in the following fashion; let me write it first and then we will discuss it. It will look like a strange manipulation.
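Written out in one line, the identity that the two step-1 computations give us is:

```latex
\mathbb{E}\Big[\sum_{t=1}^{n} \ell_{t,i_t}\Big] \;-\; \sum_{t=1}^{n} \ell_{t,k}
  \;=\; \mathbb{E}\Big[\sum_{t=1}^{n} \mathbb{E}_{i\sim P_t}\big[\hat\ell_{t,i}\big]
  \;-\; \sum_{t=1}^{n} \hat\ell_{t,k}\Big],
```

where the first sum uses $\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] = \ell_{t,i_t}$ and the second uses unbiasedness, $\mathbb{E}_{i_t\sim P_t}[\hat\ell_{t,k}] = \ell_{t,k}$.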
First, notice that if, for the time being, you just forget the added and subtracted term, the two new pieces negate each other and you recover exactly what we had. Now let us take the first piece. Notice that $\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]$ is already an expectation over $i$, so with respect to $i$ it is a constant. If you take the exponential of it, then the outer expectation over $i\sim P_t$ does nothing, because you are taking the expectation of a constant; and then the log cancels the exponential. So $-\frac{1}{\eta_t}\log\mathbb{E}_{i\sim P_t}\big[e^{-\eta_t\,\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]}\big] = \mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]$: the $\eta_t$'s cancel and you just end up with the quantity on the left-hand side, ok. So why did we do this circus? The circus is to express things in terms of their moment generating functions. You know moment generating functions, ok, or characteristic functions: $\log\mathbb{E}[e^{\lambda X}]$ is the log of the MGF of a random variable $X$. Here the first part is, up to the factor $\frac{1}{\eta_t}$, the log-MGF of the random variable $\hat\ell_{t,i} - \mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]$ evaluated at $-\eta_t$. And what is the quantity being subtracted inside the exponent? Even though $k$ also appears as the number of arms, that is just an index; what we subtract is nothing but the mean of this quantity.
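Putting the manipulation together in symbols, the per-round term splits as:

```latex
\mathbb{E}_{i\sim P_t}\!\big[\hat\ell_{t,i}\big] \;-\; \hat\ell_{t,k}
  \;=\;
  \underbrace{\frac{1}{\eta_t}\,
    \log \mathbb{E}_{i\sim P_t}\!\Big[
      e^{-\eta_t\left(\hat\ell_{t,i}\,-\,\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]\right)}
    \Big]}_{(1)}
  \;\underbrace{-\;\frac{1}{\eta_t}\,
    \log \mathbb{E}_{i\sim P_t}\!\Big[e^{-\eta_t \hat\ell_{t,i}}\Big]
    \;-\;\hat\ell_{t,k}}_{(2)}
```

To check the cancellation: factoring the constant $e^{\eta_t\,\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}]}$ out of the first log gives $(1) = \mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] + \frac{1}{\eta_t}\log\mathbb{E}_{i\sim P_t}[e^{-\eta_t\hat\ell_{t,i}}]$, so $(1)+(2)$ collapses back to the left-hand side.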
So, I am basically subtracting the mean and then looking at the moment generating function of the centered variable, ok. We have expressed the quantity in terms of its MGF, and we will see that this MGF is easier to handle, easier to bound, ok. Now what will we do? We are going to handle each of these parts separately. Let me call the first part (1) and the second part (2). So the second step is to bound (1). We always love to write things in exponential form because we get very tight bounds that way. If you remember, when we analyzed the weighted majority algorithm, we had tight upper and lower bounds on $e^{-x}$: a lower bound $1 - x$ for all $x$, and an upper bound $1 - x + \frac{x^2}{2}$ for $x \ge 0$. We will use similar things here also, ok. First let me expand (1). The mean inside the exponent is a constant with respect to $i$, so by the same argument as earlier, its exponential factors out of the expectation, and $(1) = \mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] + \frac{1}{\eta_t}\log\mathbb{E}_{i\sim P_t}\big[e^{-\eta_t\hat\ell_{t,i}}\big]$. Now, there is one more bound I am going to use for the log: $\log x \le x - 1$, which always holds for $x$ positive, right. Applying that here, $(1) \le \mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}] + \frac{1}{\eta_t}\big(\mathbb{E}_{i\sim P_t}\big[e^{-\eta_t\hat\ell_{t,i}}\big] - 1\big)$. Is this correct? Yes.
So now I am going to pull the expectation over both terms and combine: $(1) \le \frac{1}{\eta_t}\,\mathbb{E}_{i\sim P_t}\big[e^{-\eta_t\hat\ell_{t,i}} - 1 + \eta_t\hat\ell_{t,i}\big]$; I am just rearranging things here. Now I am going to appeal to the exponential bound: treat $\eta_t\hat\ell_{t,i}$ as the $x$, and note that $e^{-x} \le 1 - x + \frac{x^2}{2}$ rearranges to $e^{-x} - 1 + x \le \frac{x^2}{2}$, valid here since $x \ge 0$. So the quantity inside the expectation is upper bounded by $\frac{\eta_t^2\hat\ell_{t,i}^2}{2}$; the constant $\frac{\eta_t^2}{2}$ comes out, one $\eta_t$ cancels with the $\frac{1}{\eta_t}$ in front, and we get $(1) \le \frac{\eta_t}{2}\,\mathbb{E}_{i\sim P_t}\big[\hat\ell_{t,i}^2\big]$. So what is this expectation? Let us compute it. It is nothing but $\sum_{j=1}^{k}\hat\ell_{t,j}^2\,P_{t,j}$; that is the definition. Now replace the estimator by its definition: $\sum_{j=1}^{k}\Big(\frac{\ell_{t,j}\,\mathbb{1}\{i_t=j\}}{P_{t,j}}\Big)^2 P_{t,j}$. Doing some manipulation, which we can go through fast: only the $j = i_t$ term remains, everywhere else it cancels, and it gives me $\frac{\ell_{t,i_t}^2}{P_{t,i_t}}$. This is fine, ok. I do not know if I mentioned that we assume the losses always lie in the interval $[0,1]$, ok. With that assumption, $\ell_{t,i_t}^2 \le 1$, so this quantity is at most $\frac{1}{P_{t,i_t}}$. So what we actually end up with, substituting back, is $(1) \le \frac{\eta_t}{2}\cdot\frac{1}{P_{t,i_t}}$, ok. That is the second step. What is my third step? The third step is to deal with the log term in part (2); so let us take the $\frac{1}{\eta_t}$.
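The collapse of the second moment to a single term can also be checked mechanically; the numbers below are hypothetical, and the function is just an illustration of the identity $\mathbb{E}_{i\sim P_t}[\hat\ell_{t,i}^2] = \ell_{t,i_t}^2 / P_{t,i_t} \le 1/P_{t,i_t}$:

```python
def second_moment_given_play(losses, probs, i_t):
    """Given the played arm i_t, compute E_{i~P_t}[hat_ell_{t,i}^2]
    = sum_j P_{t,j} * hat_ell_{t,j}^2, where
    hat_ell_{t,j} = losses[j] * 1{j == i_t} / probs[j]."""
    k = len(losses)
    hat = [losses[j] * (1.0 if j == i_t else 0.0) / probs[j] for j in range(k)]
    return sum(p * h * h for p, h in zip(probs, hat))

losses = [0.2, 0.9, 0.5]   # losses in [0, 1]
probs = [0.5, 0.3, 0.2]
for i_t in range(3):
    m = second_moment_given_play(losses, probs, i_t)
    assert abs(m - losses[i_t] ** 2 / probs[i_t]) < 1e-9   # one term survives
    assert m <= 1.0 / probs[i_t]                           # since losses <= 1
print("second-moment identity holds")
```

This is also where the variance issue shows up concretely: a rarely played arm (small `probs[i_t]`) makes the bound $1/P_{t,i_t}$ large.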
So, by the way, about the way this proof goes: it can look like, what is happening, how are these steps all coming one after another, right, this manipulation? At least in the adversarial case it looks like somebody just came up with these steps, but by now they are standard: manipulating things in this fashion is the way that ends up giving you the bounds you are looking for. See, we ended up with a regret bound that was of order $\sqrt{n}$, right. That means this algorithm is making the problem learnable: if I let $n$ go to infinity, my per-round regret vanishes, meaning I do as well as my benchmark per round if I am allowed a large number of rounds. But see what you are learning: you are trying to learn something about which you are clueless. These losses are generated in an arbitrary fashion; you do not know anything about them, you are not making any assumptions. The only assumption made is that they lie in the interval $[0,1]$, and even that can be relaxed by normalizing. So you are dealing with a very general scenario, and coming up with the right intuition to prove bounds for this setting from scratch is hard. But based on how these exponential-weights algorithms have developed, there is now a fairly standard way to go about proving these steps. Even though I am doing a lot of manipulation here, by and large you can see that it is similar to what we did earlier for the weighted majority algorithm. So that is good: we know the steps, and if we are at least conversant with how this proof goes through, then in some other setup where we want to prove a bound, we can play with these steps and perhaps come up with one, ok. In that regard it is important that we know the proof steps, even though it is all math and it is not clear a priori why it has to go in this fashion, ok.
But do follow all the steps I am writing here, so that later, if you have to prove something for a different algorithm, and you understand this proof already, you can see where to tweak it to get the bounds for your algorithm, ok. Now, the log term in part (2): what is it? It is $-\frac{1}{\eta_t}\log\mathbb{E}_{i\sim P_t}\big[e^{-\eta_t\hat\ell_{t,i}}\big]$, where the expectation is $\sum_{i=1}^{k} e^{-\eta_t\hat\ell_{t,i}}\,P_{t,i}$, right; this is the expected value. But now I am going to replace $P_{t,i}$: I know what it is, so I am going to bring in how $P_{t,i}$ is chosen by the algorithm. All the steps I have written so far are generic; there is nothing special about the Exp3 algorithm in them. So what is $P_{t,i}$? It was defined as $P_{t,i} = \frac{e^{-\eta_t \hat L_{t-1,i}}}{\sum_{j=1}^{k} e^{-\eta_t \hat L_{t-1,j}}}$, where $\hat L_{t-1,i} = \sum_{s=1}^{t-1}\hat\ell_{s,i}$ is the cumulative estimated loss up to the previous round, not including round $t$. Now, if I plug this in and combine the exponentials in the numerator, adding $\hat\ell_{t,i}$ to $\hat L_{t-1,i}$ gives $\hat L_{t,i}$, so the expectation becomes $\frac{\sum_{i=1}^{k} e^{-\eta_t\hat L_{t,i}}}{\sum_{i=1}^{k} e^{-\eta_t\hat L_{t-1,i}}}$; is this correct? I have just manipulated the definitions. Now I am going to define $\Phi_t(\eta) = \frac{1}{\eta}\log\Big(\frac{1}{k}\sum_{i=1}^{k} e^{-\eta\hat L_{t,i}}\Big)$. If I define $\Phi_t$ this way, then the log term in part (2) can be written as $\Phi_{t-1}(\eta_t) - \Phi_t(\eta_t)$; the $\frac{1}{k}$ factors cancel in the ratio. So, is this clear? Ok, fine. This completes my third step; there are just two more steps, and we will do them in the next class, fine.
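The algebra of this third step can be checked numerically; the cumulative losses and round-$t$ estimates below are made-up numbers, and the function names are mine:

```python
import math

def phi(eta, cum_hat_losses):
    """Potential Phi_t(eta) = (1/eta) * log((1/k) * sum_i exp(-eta * hatL_{t,i}))."""
    k = len(cum_hat_losses)
    return math.log(sum(math.exp(-eta * L) for L in cum_hat_losses) / k) / eta

def exp_weights(eta, cum_hat_losses):
    """Exponential-weights distribution: P_{t,i} proportional to
    exp(-eta * hatL_{t-1,i})."""
    w = [math.exp(-eta * L) for L in cum_hat_losses]
    z = sum(w)
    return [x / z for x in w]

# Check the identity
#   -(1/eta_t) log E_{i~P_t}[exp(-eta_t hat_ell_{t,i})] = Phi_{t-1}(eta_t) - Phi_t(eta_t)
eta_t = 0.3
hatL_prev = [1.0, 2.5, 0.7]          # hypothetical hat L_{t-1,i}
hat_ell = [0.0, 4.0, 0.0]            # hypothetical round-t estimates (one arm played)
hatL_cur = [a + b for a, b in zip(hatL_prev, hat_ell)]
P_t = exp_weights(eta_t, hatL_prev)
lhs = -math.log(sum(p * math.exp(-eta_t * e) for p, e in zip(P_t, hat_ell))) / eta_t
rhs = phi(eta_t, hatL_prev) - phi(eta_t, hatL_cur)
assert abs(lhs - rhs) < 1e-9
print("the log term in part (2) telescopes")
```

The point of the $\frac{1}{k}$ inside $\Phi_t$ is that it cancels in every difference $\Phi_{t-1} - \Phi_t$ but gives $\Phi_0(\eta) = 0$ when all cumulative losses start at zero, which is convenient when the telescoping sum is evaluated in the remaining steps.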