Morning! Hi. 9:30 a.m. — maybe a little bit later. I haven't slept. Monday I thought, "oh, what am I teaching today? Well, Thursday, right?" And so I was thinking to teach one lesson, and then I'm like, oh no, why not teach something new, right? And when do I prepare the lesson? Well, overnight, of course. So here you go, a new lesson for you. I hope it's gonna work, because I haven't even tried it. This lecture actually came together yesterday: I went to the PyTorch developer conference, and I started explaining today's lesson to people there.

Small thingy here. This one is the summary for today's class, which was written on the iPad thingy, and then you can see the whole thing below. And then, hold on, something else: if you're watching this video not live, I recommend you actually go through these tests over here. There are quizzes — four quizzes on Twitter — you can try to solve before watching the video. If you're live, you're gonna get the answers right now. Okay, that's it.

So let's start talking about today's lesson. Hopefully this stuff is gonna work. You see my screen? Yes? Okay, very good. So today we're gonna be talking about planning and control: a three-part story. Why three parts? Because I guess at this point we're gonna have three lessons on this topic. Usually I've been teaching only parts two and three, and then you've been complaining every year: "we don't understand anything." So today we teach part one, which didn't exist yesterday and exists today. So hopefully it makes sense. Okay, just bear with me.
Be kind, okay? I hope you like it; I really spent a lot of energy on this. So today the main topic is gonna be model predictive control. We're gonna be learning what this thing means, in a particular, specific case, and we're gonna be talking about backpropagation through the kinematic equations — well, through the state transition equations. And then we're gonna be minimizing some energy with respect to the latent.

So you've already seen this stuff in the previous labs on energy-based models: whenever we have a latent variable, we do minimization in the latent space to find the minimum energy, right? So this is the same stuff we've been seeing over and over across the whole semester, but for a new topic. We use the same concepts we've been learning along the semester, but in a different environment. And this is, I think, the beauty of the framework we've been giving you: you can think about different aspects of the machine learning field — or even today's topic, control — in terms of these block diagrams that we have been teaching and developing.

The second part, I guess next week, is gonna be the truck backer-upper. In the first case we go forward; in the second case we go backward. In that case, we'll be learning an emulator of the kinematics from observations. What are these terms? We don't know yet — too many words — but I'm gonna give you the plan nevertheless, such that when you watch this video for the second time, you actually have a plan of action. The point is that in the first case we assume that we have these equations, someone gives them to you — you know, physics. In the other case, we are actually doing machine learning: in the second case, we are learning what this function, these equations, are, from observations. So yeah: observations, you do regression, you're gonna find the function.
So, machine learning, yay. And then, given that we have these, we train another function, which is gonna give us the control — or, we don't even know yet what this control is. So again, bear with me; this doesn't have to make sense yet. This is just the plan, for you to watch later once you actually know what these things are — you're not supposed to know them yet. By the end of today's lesson, you will know the first three points over here.

Finally, in the last lesson, we're gonna be putting together variational autoencoders — actually a variational conditional predictive network, there you go. We put that together with some stochasticity, with some measure of uncertainty, and we also add stochasticity to the environment. So this is like a mess, okay? Really, really many things. Nevertheless, everything I'm gonna be talking about is this PPUU — prediction and planning under uncertainty. All the things are gonna be thrown in there, and you will know them. So although this PPUU is gonna be a mess of things, we're gonna be using basically every tool that we taught you during the semester, which is gonna be super complex — not complicated... or complicated? I don't know which one it is — like, many things together. It's a whole system that you will be able to understand. Okay.

All right, so this PPUU has a stochastic environment, which means I'm gonna be throwing inside this environment other agents, which are gonna be behaving — not randomly, but in ways we don't have control over.
Okay, so this stuff is gonna be just messing with us. And finally, we're gonna be learning a policy which tries to minimize the uncertainty with which it takes actions. And this is borrowing from Bayesian neural networks, which is so interesting. So this is my plan for you for the end of the semester: three classes, three major topics, the first one developed overnight. I hope it works. Let's get started. Are you ready? You should be excited — yeah, okay, just thumbs up. Someone is clapping — you don't have the party popper? What's it called? Ah, yeah, there you go. Okay, thank you. All right, cool.

Oh, and latent decoupling — there was a third part in the PPUU. Okay, that's awesome, crazy. So, we remember from the energy-based models' last slide: we said that the choice of the latent size was a big deal. Do you remember why the choice of the dimension of the latent was so tricky? Do you remember? Energy-based models, second lab. Yes, exactly: Raul is saying "the degenerate solution." What does he mean by the degenerate solution? It's correct — tell me more. Oh, "zero, flat energy" — fantastic, that's correct, Camilla. So we had to find a way of restricting the information content in the latent, right? And in this case here, we are also doing something similar: this latent is gonna be in the way, the model is gonna cheat, and so we are managing to decouple the interaction of this latent. Again, super cool, super crazy — in a future lesson. Okay, stay tuned.

All right, let's start with this topic, and I hope it's gonna be clear — and correct: the state transition equation, the evolution of the state. Many words — what is this thing? So we start with an arrow that shouldn't be there, but okay — you see, this broke the slides. State transition equations.
We start with this formula at the top left-hand side, which is gonna be this bold x. We know bold x is our input observation — whatever — our input variable. There's a dot on top of this x; we'll figure out what that means very soon. It's equal to a function f of this same x without the dot, comma, u. You can already tell the color of the u is orange, so you may already know what u is. What is u? Tell me in the chat — by the color, right? I've been always using orange for... what is orange? "Latent variable." Yes! You're following, cool. So u is just a latent variable. Nevertheless, we're gonna call it u, not z, because we are moving now to the control topic, the control field, and they use u for the latent. Okay, cool.

So x is gonna be called the state of the system, and u is gonna be our control. Again, no big deal: it's like an input, right? Like the latent z is an input. Do we observe z? Answer me: do we observe z? No. Do we observe u? No. Okay, cool, same stuff. Do we observe x? Yes — okay, usually; not today.

All right, so, notation definition: what is this stuff? The dot on top of a variable — first of all, it means it's a function of time. So these x's here are x(t); the parentheses t mean it's a continuous function of the continuous variable t. The dot on top means it's a temporal derivative. So if I expand this notation, which is compact and convenient, it becomes dx(t)/dt: the temporal derivative of the function x in the continuous temporal domain t is going to be equal to the function f of this x(t) — a temporal function, a function of the continuous time t — and this control u(t). Okay. All right, what next? Oh, okay: we're gonna be drawing some diagrams on the right-hand side.
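A quick aside for readers following along: the dot notation can be checked numerically with a toy system. This is my own sketch, not from the lecture — a scalar system with made-up dynamics f(x, u) = u, just to show that the dot really means a temporal derivative.

```python
# Toy scalar system x_dot = f(x, u) with f(x, u) = u: the control directly
# sets the velocity.  The exact solution is x(t) = x0 + u * t, so the
# temporal derivative of x(t) should equal f(x(t), u) = u at any time t.
def f(x, u):
    return u

x0, u = 0.0, 2.0
x = lambda t: x0 + u * t            # closed-form trajectory x(t)

t, h = 0.3, 1e-6
x_dot = (x(t + h) - x(t)) / h       # finite-difference temporal derivative
print(round(x_dot, 3))              # 2.0, matching f(x(t), u)
```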
So first we're gonna be coming up with this car. Okay, what does a car need? Well, a car needs wheels, right? Otherwise it doesn't go anywhere. So we have two wheels there, and then — oh, okay, it's not a car, it's my tricycle. Why is that? Because it's simpler to model. Then I will tell you how to use this model for an actual car, but for the moment, let's think about your tricycle. It's like a bicycle, but with three wheels — is it TRI-cycle or tri-CYCLE? I don't know. Okay, anyway.

What do we need here? We need a reference system, such that I can start drawing things. So the axle distance — the distance between my front and rear wheels — is gonna be called capital L. Moreover, we have that the front wheel is orange. The fact that it's orange — what does it mean? Why is it orange? Tell me why it's orange. "Because it's the latent." Okay, sure — it's the control. So the actuator, the thing that actually moves this thing forward, is the orange wheel. How do you move forward? How do you move forward on a tricycle? Have you ever ridden a tricycle? How do you move it? Yeah, yeah, the front wheel — but you have to pedal, right? Usually if you don't pedal, you don't move. So you pedal, and then you can steer the thing. So your control is in the front wheel in this case; that's why it's orange.

All right. Moreover, it's tilted, at some angle φ, positive to the left. You can see now there is an axis of that wheel; if I extend that axis, I get this line over there. And then we get an angle which is the same angle, because it's orthogonal — perpendicular lines intersecting, equal angles, you know, these things from primary school. And then we're gonna get the intersection over there.
It's gonna be called (x_c, y_c), and this is called the instantaneous center of rotation. What does it mean? Now you should see that curved arrow appearing there: it means that this tricycle will move with a circular motion, centered at this (x_c, y_c).

Pay attention now — this is very important: this non-bold x and non-bold y are completely different things from the pink bold x. So there are two different things. One thing is the state, the pink bold x; the lowercase, non-bold x or y are just numbers, scalars, representing coordinates in our case. I know we use x and y for other things, but they are different colors and they are not bold, so you shouldn't get confused.

Moving on. What is now this radius? Can you tell me what the radius of this circle is? Did we do trigonometry in high school? Tell me in the chat, some trigonometry: what is R_c? "Pythagoras"? You can do Pythagoras, but I don't know the length of the hypotenuse — I don't know what you call it in English. And you have to put spaces, because otherwise I cannot read one whole thing. So what do we have here? We have L, right, and we have φ. So if you have φ and you have L: what is L divided by R_c? Yes — that's tan φ, very good. So R_c — as someone said before, was it Jeffrey? — is going to be L divided by the tangent of φ. That's correct. And so we have this radius of rotation, cool. We're gonna be using this in a second. Well, maybe I forgot to actually write it down on the slide, but okay, it will appear later eventually.

So we said this is a tricycle. Maybe we like cars more. So what happens if you have two wheels in the front? If you have two wheels, and the orientation — the angle — of these two wheels is the same, it means you have a wheel over here and a wheel over here.
So you have a front axle which is tilting, right? Like maybe a tractor — on a tractor, I think, you have the whole axle tilting. And if you have that, the wheels are going to be moving basically on a circle, and then it's exactly the same: both wheels will always share the same axis. So if you have a wheel over here and a wheel over here, and these wheels are moving like that, then this is exactly the same model.

Usually our cars don't have such a thing. Our cars usually have two wheels that are just steering, like that. Are they steering by the same amount? No. Why not? Because if they steered by the same amount, one wheel would actually... what's it called in English? Skidding — s-k-i-d-d-i-n-g, skid. Okay, yes, "skid," Camilla, that's correct. I didn't know this word; I learned it yesterday.

All right, cool. So what do we do then? We talk about Ackermann steering. What is Ackermann steering? You have your wheel there, and then you're gonna have the axis of the wheel — I don't know if you can see it, it's very light gray — which will go through that point (x_c, y_c). And so you're gonna have a φ₁ angle. And then you have the other wheel at the bottom, with another axis — I don't know if you see it, it's very faint — and then we're gonna have a φ₂ angle. And of course they have to be different, because both axes need to meet at that (x_c, y_c) if you actually want to avoid skidding.

Okay. Anyway, we'll just think about the tricycle; it's just easier, without these additional φ's. So we now tell you what this bold x state variable and the u control variable are.
So x — the pink bold x — is going to be the collection of four items: lowercase italic non-bold x, lowercase italic non-bold y, θ, and s. And they mean, respectively: position x coordinate, position y coordinate, angle θ, and then s for speed. The speed is just a scalar that can be positive or negative, but doesn't have a direction; the direction is given to you by the θ. Otherwise, you could have a velocity-x and a velocity-y. So you have two different possible parameterizations of this x: you can have (x, y, v_x, v_y), with the velocity in the x direction and the velocity in the y direction; or you have this other representation, which is like polar: you have x and y, Cartesian, then you have θ, and then s for the speed in magnitude, but with a sign.

u is going to be, instead, the vector of the parameter φ — a single φ (for the Ackermann version you can think about some middle φ, whatever, but otherwise just the tricycle's single φ) — and then a, which is going to be the acceleration we're applying to the system, to the wheels.

Okay, so we need to write down some equations. If θ is zero, what is going to be the velocity on the x axis? If θ is zero, we are horizontal — what is the speed in the x direction? "This direction, x"? No, no — we already have the letters here, right? x and y are coordinates, θ is the angle, the orientation, and s is the speed. I'm telling you right now θ is zero, okay? So my speed is s. I'm asking: what is the x component of the speed? Mitesh and Raul are answering s — that's correct. So if my car is horizontally oriented, my x velocity, ẋ, is going to be exactly s. Okay. What is my y velocity, the one that goes upwards? Zero, that's correct, very good. Now, let's say I have an arbitrary angle — let's say I'm at 30 degrees, something like that, some angle θ. What is going to be my x velocity in this case?
Given that I have an angle θ: s cos θ, that's very good. What is my y velocity? s sin θ, that's perfect.

Finally — we said that... hmm, this is a little bit... maybe I should write it down, maybe not, I don't know. Maybe you're right, let me write it down. Okay, improvising, I didn't plan this. Let's see if it works. One moment, please. Okay, wrong screen — there we go. So: what is the displacement that I travel, given a specific angle? If I move by some angle — whatever angle, α — and I multiply α times R_c, I get basically the displacement that my vehicle moves, the arc length. So similarly, I'm asking you now: if I have a specific angular velocity — this is ω, the angular velocity — and I multiply it by R_c, I'm going to be getting what? The actual speed, right — s. Right. So dividing by dt on both sides: in the first case, I had a single angle that I moved, times R_c, the radius, and I got my displacement along the arc; in this other case, I have ω, the angular velocity, times R_c, and I'm going to get the speed. Until here, everything should be fine.

Now, what is ω? Let me clean that bar there, this is annoying... eraser... okay, all right. We said that ω is going to be my angular velocity, so it's going to be dθ/dt — which is the same as writing θ with a dot. Okay, so my question now is: how much is θ̇? Well, from this equation, tell me how much it is. That's simple, right? We just divide s by R_c. And how much was R_c? You told me before: R_c was L over tan φ, so dividing by it gives us tan φ over L — that's what you told me before, from trigonometry. And so, moving forward, we have this one — oh, the last one I didn't ask you.
Okay, too bad. What is the last one? The last one is going to be the derivative of the velocity, which is the acceleration. Okay, that was the easy one. So, do these equations make sense? The first one tells you — this is v_x, the x component of the velocity; this is the y component of the velocity; this is the angular velocity; and this other one is simply the derivative of the speed.

Okay, very good. And so now we have these equations that describe the motion of this tricycle — or the Ackermann variant here — when we have a given φ and a, which are the control signals over here. And then we have said that this x here represents the state, or the configuration, of the system, right? In this specific case, we have the x location, the y location, the angular orientation θ, and then the speed. So far we've been doing physics, right? Well, mechanics — this stuff is mechanics, I guess.

Questions so far? Yeah, I know — okay, one day people should figure out a way to differentiate the state x and the direction x. As you can see from my slide, the state is pink and bold, whereas the non-bold and non-pink one, that's the other x. So they're really different symbols. Okay, just use colors in your brain.

All right, questions so far — are we okay? Is there anything that you feel is not clear so far? Are we okay? Yeah, give me thumbs up. I seriously don't know — I haven't slept. Yes, okay, very good. All right, moving on. What do we do? We clean up the screen, and we're going to be introducing a name here. So, the left-hand side: it's a differential equation. What is a differential equation? We have that our function x is a function — remember, also in the convolutional network lab we talked about x: usually we talk about it as a data point, but we learned that those are actually functions on a domain Ω, which was a discrete space, right?
So we kind of like discrete things, because we use computers, right? This stuff is continuous — like real physics. Actually, to be honest, the real world is actually discrete: if you study quantum physics, you know there is a quantum, a discrete amount, of time; there is no continuous time. But again, whatever — that's a very tiny amount of time, so we can imagine that the normal world is continuous. What was the question here? "It is easier to write that" — yeah, yeah, sure.

So on the left-hand side, we have this differential equation. But then I'm going to be introducing this difference equation. Okay, so this is similar, but different. In this case, I discretize our system — and pay attention now: I use square brackets to specify that the x function is a function of the discrete temporal index t. So t can only be 0, 1, 2, 3...; it's a natural number, an integer. It's not continuous, like on the right-hand side. So maybe I should have used a different letter, but okay, whatever. Usually in digital signal processing we use n, or in control systems we use k; I use t because we've been using t during the class.

So here we have that x[t] minus x[t−1] is going to be equal to this f times dt — I just put the dt on the right-hand side (which should maybe be a Δt) — and then I move the x[t−1] to the other side of the equals sign. But basically it's the same stuff. And so I'm going to be writing the same equations now in this discretized version. I have x at temporal index t, which is going to be x[t−1] plus something, which is: s times the cosine of my previous θ; s times the sine of the previous θ; then this s divided by L, times the tangent of the current angle of the wheel; and then the acceleration — all times Δt. All right, so let's see how this stuff works.
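For readers following along at home, the four discretized equations above can be sketched in plain Python. This is my own minimal sketch (names are mine), using L = 1 m and Δt = 1 s, the values the lecture uses later:

```python
import math

L, dt = 1.0, 1.0   # axle distance (m) and time step (s)

def f(x, u):
    """Kinematic model: returns x_dot for state x = (x, y, theta, s)
    and control u = (phi, a)."""
    _, _, theta, s = x
    phi, a = u
    return (s * math.cos(theta),      # x component of the velocity
            s * math.sin(theta),      # y component of the velocity
            s / L * math.tan(phi),    # theta_dot = omega = s / Rc
            a)                        # derivative of the speed

def step(x, u):
    # Discretized update: x[t] = x[t-1] + f(x[t-1], u[t]) * dt
    return tuple(xi + fi * dt for xi, fi in zip(x, f(x, u)))

# One step from (0, 0, 0, 1 m/s) with zero control: pure horizontal motion.
print(step((0.0, 0.0, 0.0, 1.0), (0.0, 0.0)))  # (1.0, 0.0, 0.0, 1.0)
```

With zero steering and zero acceleration, only the x coordinate advances, by s·Δt = 1 m per step.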
So let's imagine we have u — my bold orange vector u — as this collection of points. These φ and a are functions of time — discrete functions of the discrete temporal index t — and both of them are going to be zero. So all those dots on the diagram at the top left tell you that, for all the values from t = 1 to t = 10, the values of φ and a are zero, both of them, right?

So let's now figure out what these dimensions are. So [u] — the square brackets mean the units. φ is the angle, so it's in radians, and a is the acceleration, so meters per second squared. Moreover, we said that the bold vector x, which is my state, in this case is going to be x and y in blue — I'll tell you later why — and then the angle θ and the speed s. What are the units of x? So again, writing the square brackets around x: I'm going to have meters, meters, radians for the angle, and then meters per second for the speed. Okay, so this is how we write things in physics. Why do we do this? Such that we know we are not making mistakes: you always want to have these dimensionality checks, all the time.

Let's start now with my initial condition: x₀ = (0, 0, 0, 1). So everything is zero — we are at the origin, at zero orientation — and our speed is one meter per second. Okay. So what happens if your vehicle is moving at one meter per second, in horizontal position, and no one does any action — no acceleration, no braking, no steering? What is going to be the motion of this item? Tell me. "Stationary"? "Inertia"? Yeah — there is a specific word I'm looking for. That's correct, but I'm not sure it's the name of the type of motion; there is a specific name, at least in Italian — I don't know. "Straight line in x direction" — yeah. Okay, "uniform," thank you.
So this is a uniform linear motion — uniform rectilinear motion, that's what we call it; I don't know how you call it in English. This stuff moves uniformly in the x direction, and there you go — boom. So we start at this location over here, and then we move to the right-hand side, one meter per second, one meter every second. And here you can also see a box which is kind of representing the car — but the car is also one meter long, so the size of the tricycle overlaps with the next one, and it looks like one big rectangle. We'll figure out how this works very soon.

Cool. What happens now if I change my control? Here, the acceleration is some negative number, and we still start with this one meter per second in the horizontal position. What happens? Still starting at zero, oriented at zero degrees towards the right-hand side, one meter per second initial speed — and I press the brake. What happens to this thing? "That's a deceleration" — that's correct. And that's what you see over here: before, we were ending up at 10 — we had 10 as the final destination — and now instead we end up at more or less 5.5, over here. And you can see now these boxes start overlapping, because we are braking: the next shape starts a bit earlier, and they are no longer exactly one after the other. So you can see how the car just brakes and basically stops over here. If I kept braking, this stuff would go backwards, because I'm constantly applying a negative acceleration. Cool. Finally, what happens if my control is the following?
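Before moving on: the braking run just described can be reproduced with the discrete update x[t] = x[t−1] + f(x[t−1], u[t])·Δt. A sketch with my own step function, using a = −0.1 m/s² (the value the lecture's notebook uses for braking later on):

```python
import math

def step(x, u, dt=1.0, L=1.0):
    # x[t] = x[t-1] + f(x[t-1], u[t]) * dt, for x = (x, y, theta, s), u = (phi, a)
    px, py, theta, s = x
    phi, a = u
    return (px + s * math.cos(theta) * dt,
            py + s * math.sin(theta) * dt,
            theta + s / L * math.tan(phi) * dt,
            s + a * dt)

x = (0.0, 0.0, 0.0, 1.0)        # at the origin, horizontal, 1 m/s
for _ in range(10):
    x = step(x, (0.0, -0.1))    # no steering, constant braking
print(round(x[0], 1))           # 5.5: the car covers ~5.5 m instead of 10
```

The speed decays 1.0, 0.9, 0.8, ... down to 0, so the covered distance is 1 + 0.9 + ... + 0.1 = 5.5 m — the "more or less 5.5" on the slide.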
So, for the first five temporal indices, 1 to 5, my steering is a positive 0.2 radians, and then I switch, from 6 to 10, to a negative 0.2 radians. And I still start with this one meter per second speed in the x direction. What happens here? "Sine-wave-ish"? Yeah, that's correct — cosine, minus-cosine-wave-ish, right? Anyway: we turn left until halfway through, and then I turn right. And you can see now that the final car here is actually horizontal, because I undid all the previous steps — but we moved up there.

So, guess what: we haven't done any machine learning, nor control. This is just running things forward. And we have some time, so how about I show you the notebook? Why did this end at 8 and not at 10? Because the length of the path is the same, right? The length of the path is 10, and now you curl it: if you have a string of length 10 and you curl it up, it will not reach as far as the straight string. Okay, very good.

Do we have time for this? I don't know — I'm gonna try, and I hope I'm not gonna screw up. Okay. So here I import some libraries, some plotting stuff — we don't care. We have this state transition equation we just described, and we figure out what these things are. And so here's the implementation of this f function. Okay, here we go, we zoom in a little bit. So f is going to be a function of the state variable x and the control u — and, uh, I don't know what this t is. Time? There is no time... I mean, usually these functions can be time-varying; this one is not time-varying. Okay, it doesn't change. So the full expression is: ẋ is a function of x, u, and time; in this case, time doesn't appear in the actual equation. It doesn't matter if you do it today, tomorrow, yesterday, or one week from now.
Okay, so it's stationary. x is going to be our state: x, y, θ, and s. I told you already how to write these θ's: you do backslash-theta and then you press Tab — even in the code. Then u is the control, t is the time (which we don't have), f is the kinematic model, and ẋ — again, this is x, backslash, dot, Tab; I use this also in the code. And then here we have this x[t] equals x[t−1] — so the previous x — plus f applied to the previous x, the u, and t, right.

All right. So my L, the length of this vehicle, is going to be one meter; dt is one second; x, y, θ, and s are going to be my x variables; and φ and a are going to be my control. So we have that my f is initially all zeros, and then I fill it up: f[0] is going to be s multiplied by the cosine of θ, times dt; f[1] is s times the sine of θ, dt; f[2], we said, is s divided by capital L — and this should be capital L — times the tangent of φ, times dt; and then the last one is going to be a times dt. All the dimensions also check out. And so I return f, which is this populated version of the vector. I have drawing routines — you don't care.

And so here we drive the car manually. So this is what I showed you right now: for example, I can have this one, where all the a's are minus 0.1, and then I set all these φ's to zero. So this is the braking case, right? So if I brake... yeah, if I brake, you have this one here — the braking line. And let's also turn on the s, such that we can see everything, right? So we have the s here. Right, so the s: we said we have all zeros, and then for the first five steps we're gonna be turning to the left by 0.2 radians; for the last five points,
we turn to the right with 0.2 radians. And then let me turn on the s here. Okay, and so you get this. All right, so you can play along with this stuff over here — and the lines disappeared now, never mind.

How do we do that? I have a loop here: for t in range(10), so 0 to 9. I have my initial x — remember, we said we have one meter per second — so zero x, zero y, zero θ, one meter per second. And so I simply have a for loop where I say: my x is going to be my f of the previous x and u[t] — where u[t] is this control signal I created over here — times dt. Uh... oh, okay, my bad: there's no dt inside here; this is actually just the f function. My bad. All right, so this doesn't have the dt, because this is just the f — no, this one, there we go — and this is the dt, which is equal to one second. And then I create the trajectory, which is simply appending these x's; I stack all of this in torch, and then I plot the trajectory — that's how I plot these things over here. So this is how we can generate this one, starting from this f equation.

Okay, are we good? Are there questions? There's no magic so far. Are we okay? Give me thumbs up, thumbs down. Are we fine? Thanks. All right, moving on. Let's see if I can go back to the presentation — yes, it works, amazing; okay, I didn't know. "How does this relate to deep learning?" — thank you, Camilla, exactly, exactly. How will this relate to deep learning? That's the second part of the class, in the next 15 minutes. Are you ready? Are you excited? Are you curious? Yes? Okay.

How about we aim to go here — how about we decide we want to go to a specific destination? What are going to be the actions, the control, the latent, which will take me where I want to go, huh? Can you figure it out? Do you have an idea?
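(An aside for readers following along: the steering loop just walked through can be reproduced without the notebook, in plain Python instead of torch and without the plotting. The names below are mine; the control schedule is the one from the slide.)

```python
import math

L, dt = 1.0, 1.0   # vehicle length (m) and time step (s), as in the notebook

def f(x, u):
    """Kinematic model: x_dot for state x = (x, y, theta, s), control u = (phi, a)."""
    _, _, theta, s = x
    phi, a = u
    return (s * math.cos(theta), s * math.sin(theta), s / L * math.tan(phi), a)

# Steering schedule: +0.2 rad for t = 1..5, -0.2 rad for t = 6..10, no acceleration
u = [(0.2, 0.0)] * 5 + [(-0.2, 0.0)] * 5
x = (0.0, 0.0, 0.0, 1.0)                  # at the origin, horizontal, 1 m/s
trajectory = [x]
for ut in u:
    x = tuple(xi + fi * dt for xi, fi in zip(x, f(x, ut)))  # x[t] = x[t-1] + f*dt
    trajectory.append(x)

print(round(x[0], 1), round(abs(x[2]), 6))  # 8.3 0.0
```

The final x is around 8, not 10 (the curled-up string), and the final heading is back to zero: the right turns undo the left turns.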
Yeah, yeah — "use MPC." But what if you don't know what MPC is, right? So, I mean: people that already know the answer, don't say it; people that don't know the answer — how do you get to the destination? The answer is correct, but people shouldn't know it yet. It's the people that don't know the actual answer... how do you get to the destination? Okay, it's a bit weird, right? If I'm asking those that don't know to answer, well, they don't know, so they won't answer. Okay, this won't work. All right, I'll tell you how this works.

So, enter the Kelley–Bryson algorithm — the one Yann talks about; he also talks about the movie Hidden Figures, which I really recommend, I loved it. So what is this Kelley–Bryson algorithm? It's basically just backprop: they came up with backpropagation independently, in the sixties — 1962, I think. Backpropagation through time plus gradient descent — not stochastic gradient descent, gradient descent.

But first of all, let's recap RNNs, recurrent neural networks, with the energy diagrams. So we start with this recap of the energy-based-model perspective of recurrent neural networks. Again, this stuff was a quiz on Twitter on Monday. Anyway, the decoder from ỹ... the decoder is always the same stuff, right? We've always seen that. Then different things can go inside the decoder. Here we have an x, so this is conditional: x is my pink bold thing, and it's also shaded, so it means it's observed. So if you have x on one side and y on the other side, what do you need in between? How do you go from one space to the other? How do you go from x to y? What do you need? You need a... how do we move between spaces? How do we go from x to y? No, not an encoder — the encoder stays on the same side, right? The encoder goes from your space up to the hidden space, and then from the hidden space you can use the decoder to go back down. So how do we call it when we move from one side to the other?
We use a predictor. Yes, D— oh, this is the first time you've talked, I think; I've never seen you. Very good, yes: we need a predictor. So x goes inside a predictor, very good. Then this predictor gives you a hidden representation. Okay, cool — this looks like a plain predictive network. But then we said it's a recurrent neural network. What do we have to do now? Since it's recurrent, you need to... tell me, come on... feed in the previous state. Thank you. So now we introduce — I think I really showed you this last time — this ζ, zeta. This is a 'z' in Greek, right? So writing ζ⁻¹ is the same as writing z⁻¹: this is the unit delay, okay? It comes from the z-transform, which is used to deal with discrete-time signals. I cannot use z, because we used z for the latent, right? So I had to use the Greek z, zeta. Still, it's just one temporal step into the past.

Finally, what's required — how do we start our recurrent network? The first thing you do when you start a recurrent network is... what do you do? Can you just use a recurrent network directly? No. What do you need to do first? Thank you: you have to put a zero at the beginning, and so I draw it there. So what is this thing over here — two dots with a wiggle in between? That's the symbol for a switch. We've already seen this in the GAN lecture, right? So this is a switch: you can decide to connect it upwards, to the delayed version of the hidden state, or downwards, to the initial value for the hidden state — you just decide either side whenever. Actually, that switch could be driven by another circuit, which would be a delta centered at zero: the delta means tick — connect to the zero input only at time zero — and then go back to the other one. Or we can use a Heaviside step, right?
So it's always... let me think: an inverted Heaviside step is going to be one, one, one, one, one — so it's always connected to the zero input, and then when we actually start, it flips down and you can start using the thing. But anyway, you understood, right? Enough blabbering.

So what are the equations? Oh yeah — and then there is the connection between the hidden state and the decoder; again, we've already seen this stuff, but never in this flavor. I think it's helpful. So we have here the RNN equations. What are the RNN equations? As you've just told me, we have the initial value for the hidden layer equal to zero. I use the square brackets again to indicate that h is a function of the discrete temporal index — in this case zero, so it only exists at a specific location — and I set it to zero, in green: the zero vector. Then what is the equation of the RNN? What is h[t]? Well, I just write down what is drawn on the diagram: h at discrete time index t is going to be my predictor output, where the predictor is fed with h[t−1] — this first dot here, this dot over here, is h[t], which goes through this module, the unit delay, and becomes h[t−1] — and with x, which is my observation. Finally, again, we have the decoder, which gives me ỹ, and I'm going to have basically a spring between ỹ and y. Cool.

How do we train our recurrent neural network? Well, there are two steps. There's backprop through time, right? So we unroll the thing into the future and compute all the gradients.
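Written out, the recursion just described is (using pred and dec as generic names for the predictor and decoder modules):

```latex
h[0] = \vec{0}, \qquad
h[t] = \mathrm{pred}\!\big(h[t-1],\, x\big), \qquad
\tilde{y}[t] = \mathrm{dec}\!\big(h[t]\big)
```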
We already went through this stuff. And then you perform stochastic gradient descent with respect to the predictor's and decoder's parameters, to match x and y. We know this stuff.

Now enters the new diagram: it's going to be the control diagram. So I clean up the screen, and let's see what happens. I still have an x, but now I have an x₀ — an initial x. Hmm, interesting, right? Then this x goes inside the predictor and gives me an x which is no longer bold — sorry, no longer shaded: this circle here is black, so it's not observed. What is observed is this x₀, the original — or initial — condition; that one we observed before: the one meter per second, horizontal, moving to the right. And this x here is going to be the future x, which is also going to be used for the next iterations, right? So it needs the connection over there. And then we have this additional — not u, the z — latent variable, right? So this is the diagram of this control setup. Huh, very, very similar to the RNN, but with things swapped, right? If I show you the other one: that one here had the initial zero, right? The initial condition was zero, and my observed variable was what controlled this trajectory of the hidden state. In the other case, here, I have just an initial condition, which is my input, my observation, and then I have this latent, which controls this trajectory of the state of the system. Very similar, right? But it's swapped. The input was this one, right? And this was the reset — well, now there is no reset; it's a set, right?
We specify the initial condition, and we have this control. Cool, I think.

So we have the initial value for x: x at temporal index zero is defined to be this x₀, my initial condition. And then my x[t] is going to be my predictor output when I feed it with x[t−1] — which we get through this unit temporal delay — and this z. What is z? Orange: it's latent, right? It's the same as the control u. Why do I call it u? u is control; z is the energy-based-model thing.

So, what was the objective? The objective was to get to that destination, right? And so how do we perform optimal control — what is this optimal control? This is inference, right? So here we do backprop through time, exactly as in an RNN, but then we do gradient descent — not stochastic gradient descent — with respect to z, to go from x₀ to y, to the target, right? So we start at temporal index zero; here I have my initial x — okay, that's x₀. Then my control z, together with the previous x, gives me the current x, and this happens at temporal index one. Then I have another one for temporal index two, another one for temporal index three, another one for temporal index four... what happened here? Let's say I ran out of space. So now we want the final location x to match the target y, right? So I want this one to give me a cost.
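To summarize the recursion and the objective just described — writing the predictor explicitly as the state-transition update from the first part of the lecture, and the cost as the distance between the final state and the target (the "spring" described next):

```latex
x[0] = x_0, \qquad
x[t] = x[t-1] + f\!\big(x[t-1],\, z[t]\big)\,\Delta t, \qquad
E(z) = \big\lVert x[T] - y \big\rVert^2
```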
So I put a spring — I put an MSE — between my target and my final location on the trajectory, and that was at temporal index five.

So this is how it looks. Whenever I have the blue one, it means no control: I go in a straight line. Same initial condition, and then I have my x target over here. And so here I just did backpropagation through time and then gradient descent in the u-space, the control space, to minimize the distance between the final location and this point over here. See, this is with five — sorry — five temporal indices. Then, if I have six temporal indices, you have to brake now, because this destination happens... wait... earlier, right? If you went on a straight line, you would end up here, so now you have to kind of brake to get here, right? What happens with seven? You have to brake a lot, right — and actually we overshoot, we missed the thing. What happens with t equal to eight? Well, we overshoot too much, and so the tricycle will just try to go back to the original location, steer like crazy, and then everything goes bonkers. Right, so if you overshoot the thing, the more you go away, the more this MSE will increase, and to minimize the MSE the control is going to try to steer like crazy and everything just breaks, as you can see here. Yes — we could reduce the step size, the inference step size.

Anyway, what happens in this case? In this case I also say that I want the final speed to be equal to zero: so not only go through this point, but basically park there. What else can we do?
We can do many other things. For example, we can have a cost for every item — perhaps we have intermediate points, or we try to have the distance on average be close to y; we can have multiple costs over the previous temporal indices as well, okay? And so, for example, one thing I tried is to minimize the average distance, not the quadratic one — so this is basically like an L1 — and this one actually manages to brake, and then it actually backs up, right? You can see here... it just brakes a lot, right? This one really brakes, brakes, brakes, a lot. In the last case, instead, I performed the soft mean over those multiple energies — these quadratic distances — and here also we manage not to diverge.

Now class is officially over, and you can leave if you choose; that's totally fine. But I'm going to be showing how this is done in the notebook, okay? If you're curious — and then you can ask questions if you want. But again, we are late, and I didn't even plan to teach this stuff, so I'm glad it actually worked.

Anyway, how does it work? So here I defined different types of costs. The first one says: I have my target, which is going to be my target x and my target y. The state is going to be my collection of all those x's, right? So I take the last state's x location, and I want the squared Euclidean distance between it and the target x, plus the squared Euclidean distance for the y coordinate. So you can see: the last state, state[−1][0] — the x location — against the target's x, and then the y location against the target's y. Then, for the cost with the speed, I also add the final speed at the end. So, the last state's last item — if you remember, it was x, y, θ, speed — so the last item is the speed, and I also have the squared speed at the end, right?
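A rough sketch of the cost variants just described, in plain Python under my own naming (the notebook works on stacked torch tensors, and "soft mean" here is my reading of the log-sum-exp he mentions — treat these as illustrative):

```python
import math

def cost_final_position(states, target):
    """Squared Euclidean distance between the last state's (x, y) and the target."""
    x, y = states[-1][0], states[-1][1]
    return (x - target[0]) ** 2 + (y - target[1]) ** 2

def cost_with_speed(states, target):
    """Same, plus the squared final speed (state layout: x, y, theta, s)."""
    return cost_final_position(states, target) + states[-1][3] ** 2

def cost_mean_distance(states, target):
    """Mean of the plain (non-squared) distances over the trajectory - L1-like."""
    dists = [math.hypot(s[0] - target[0], s[1] - target[1]) for s in states]
    return sum(dists) / len(dists)

def cost_soft_mean(states, target):
    """Soft mean of the per-step squared distances via a stabilized log-sum-exp."""
    es = [(s[0] - target[0]) ** 2 + (s[1] - target[1]) ** 2 for s in states]
    m = max(es)
    return m + math.log(sum(math.exp(e - m) for e in es) / len(es))
```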
So this one counts the final speed as well. Then I have other things here: this is the mean of the distances, this is the mean of the energies — the squared distances — and this is the soft mean, the log-sum-exp, blah blah.

So how does this planning work — and is this the machine learning? Well, it's not learning. Ah, this is not learning — this is inference, right? It's inference: gradient descent in a latent space. This is latent-variable inference, the thing we've already seen. So, blah blah blah: u is going to be my nn.Parameter, such that I can have gradients with respect to u, right? It's going to be a tensor of T items: T rows — temporal indices one, two, three, four, up to ten — and two columns, φ and the acceleration, the angular orientation of the wheels and the acceleration. I have an optimizer, which is GD — no, SGD, my bad; just forget about the S. dt is equal to one; this collection of the costs starts empty. And... it shouldn't be called "epoch" — epochs are used for training, not for inference — so think "steps", right?

So here I have that x is going to be the first value, the original x. And then here — okay, for t in the range from one to ten, basically — I append the previous value, the previous x, plus the function applied to the previous x and the current u — this should be u[t]; maybe I should change the range here — times dt. So here I just build the whole trajectory. Then we stack, and we have this thing, which is my trajectory. Then I compute the cost, as I showed you before — for example, just the distance between the last item and my target, right? Then: optimizer.zero_grad(), I do backward — and this backward, basically, what does it do?
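For reference, the whole inference loop condensed into a runnable sketch — variable names and hyperparameters (horizon, learning rate, number of steps) are mine, and the cost is just the final-position distance, so this is an illustration of the procedure rather than the notebook's exact code:

```python
import torch

L, dt, T = 1.0, 1.0, 5  # vehicle length, time step, horizon

def f(x, u):
    """State derivative: x = (x, y, theta, s), u = (phi, a)."""
    return torch.stack((
        x[3] * torch.cos(x[2]),
        x[3] * torch.sin(x[2]),
        x[3] / L * torch.tan(u[0]),
        u[1],
    ))

target = torch.tensor([5.0, 1.0])
u = torch.nn.Parameter(torch.zeros(T, 2))  # controls (phi, a), one row per step
optimizer = torch.optim.SGD([u], lr=0.01)  # plain GD: one "sample", no stochasticity

costs = []
for step in range(100):                          # inference steps, not epochs
    x = torch.tensor([0.0, 0.0, 0.0, 1.0])       # x0: origin, heading right, 1 m/s
    trajectory = [x]
    for t in range(T):
        x = x + f(x, u[t]) * dt                  # Euler step through the kinematics
        trajectory.append(x)
    cost = ((trajectory[-1][:2] - target) ** 2).sum()
    costs.append(cost.item())
    optimizer.zero_grad()
    cost.backward()                              # backprop through time, down to u
    optimizer.step()                             # gradient descent in control space
```

With zero controls the tricycle ends at (5, 0), so the initial cost is 1; the loop then bends the trajectory toward (5, 1) and the cost shrinks.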
Well, it computes the partial derivative of the cost with respect to the u signal, right? Then I do optimizer.step(), which steps in the opposite direction of the gradient in this latent space. That's it: we do gradient descent in control space, or latent space. That's it — it's five lines, right? We already know everything in here; there's no magic, we already know everything.

So in this first case I just use the vanilla cost to reach this x location, and you can see: boom, it goes immediately. And then below I have a few examples with different numbers of steps — this is the case I just showed you before, in the slides. So here I just have my x target, and I have five steps; in this case I have six steps, and I have to brake; in this case I have seven steps, and I have to brake a lot; and then with eight — here we actually reduced the step size, so it never diverges. Okay, here you can see how we manage to get to the destination.

This one instead uses the cost with the target speed. So this one tries to reach zero final speed, and it also just works fine — I think because we played a little bit with the hyperparameters. Yeah, and you can see here — or you can see it clearly here — how it comes to a halt, right? You see this? All the last cars here share the same position: it reached a standstill — is it called a stall? Whereas before, the other one... okay, this one also stopped, because it has all these terms, but I think it's clearer here: in both cases it stopped. And then the last two cases use the sum of the distances, and then the other one, the soft mean — this is the one with the soft mean.

Okay. So that was the lesson for today. The lesson was about not learning.
We're going to learn about learning next time. Today we learned how we can control an agent by using gradient descent in the latent space, or control space, to reach a final destination, given that you have the initial condition x₀.

What if we don't have access to the f function, right? What if we don't know how to compute that kinematics — the f, the thing I showed you here? Today we came up with it from physics, because we know physics — and of course these are an approximation. But say you have a real-case scenario: you observe some vehicle moving somehow. How can we come up with the kinematic equations? How can we come up with this f, these state-update equations? So this is going to be next lesson: next week we're going to learn how to learn a kinematic emulator from data.

And also, here we just did gradient descent in the action space, right? But if you remember, in the autoencoder lesson, or in target prop, you learned about amortized inference. Remember? What is amortized inference? Well, instead of performing the optimization, which takes time, we just learn an encoder which gives you the target z — the optimized z. Guess what: we're going to be doing exactly the same thing. It's exactly the same steps we've been doing. So we started with a latent-variable energy-based model — in this specific case, a conditional latent-variable energy-based model. Next time we're going to do an autoencoder, kind of: we're going to learn this amortized inference — how to find the perfect u — and this is going to be called a policy, or controller. And we'll also learn f. So next week we learn all these functions; the week afterwards we put in stochasticity, uncertainty, and latent decoupling. And that was it.
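Just to make the amortized-inference idea concrete, here is a toy sketch of what such a policy could look like: a small network that maps a state and a target directly to a control, so that inference becomes one forward pass instead of an inner gradient-descent loop. Everything here — the architecture, sizes, and names — is my own illustration of the concept, not next lecture's actual code:

```python
import torch

# Hypothetical policy: (x, y, theta, s) plus a target (x, y) -> a control (phi, a).
# After training (e.g., to imitate controls found by the optimization above),
# one forward pass replaces the whole gradient-descent-in-control-space loop.
policy = torch.nn.Sequential(
    torch.nn.Linear(6, 32),
    torch.nn.Tanh(),
    torch.nn.Linear(32, 2),
)

state = torch.tensor([0.0, 0.0, 0.0, 1.0])
target = torch.tensor([5.0, 1.0])
u = policy(torch.cat((state, target)))  # amortized guess of the control
```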
I think I succeeded, although I ran out of time. Well, for those who actually stuck with me: congratulations, you endured a crazy me for one more day. Next time I will sleep, so I will be less crazy. Thank you for being with me. If you don't have any other questions, I wish you a happy, nice Thursday. Enjoy the weather — although it's like two degrees Celsius; it's like winter. It was twenty degrees at noon and two degrees in the evening. Okay, it's crazy weather. Anyway, thanks again. Bye!