REINFORCEMENT LEARNING: Lecture 3: ITERATIVE ALGORITHMS & SINGLE-AGENT PATH PLANNING IN FOMDPs (Fully Observable MDPs). Sanjeev Sharma, Founder & Co-Owner, searching-eye.com; Undergraduate, Indian Institute of Technology Roorkee.
CONTENTS:
Optimal Value Functions, Bellman Optimality Equation, Relation between the Optimal Action-Value Function and the Optimal State-Value Function, Policy Evaluation, Policy Iteration, Value Iteration, Policy Improvement, Agent Path Planning in a Static Environment in FOMDPs.
DESCRIPTION:
In this lecture I first recapped a few points from the previous lecture. I then introduced optimal policies and explained the relationship between the Optimal State-Value Function and the Optimal Action-Value Function. Next I stated the Bellman Optimality Equation for both the State-Value Function and the Action-Value Function, and gave a brief overview of my example on an agent's path planning in a static environment in Fully Observable MDPs. Finally, I went through the details of the 4 most important algorithms: Policy Evaluation, Policy Improvement, Policy Iteration and Value Iteration.
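For reference, the relationships named in the description can be written as follows. This is a sketch in one commonly used notation, which is an assumption here: $P(s' \mid s, a)$ for the transition probability, $R(s, a, s')$ for the reward, and $\gamma$ for the discount factor. The symbols used in the lecture itself may differ.

```latex
\begin{align*}
% Relation between the optimal state-value and action-value functions
V^{*}(s)    &= \max_{a} Q^{*}(s, a) \\
Q^{*}(s, a) &= \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma\, V^{*}(s') \bigr] \\
% Bellman optimality equation for the state-value function
V^{*}(s)    &= \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma\, V^{*}(s') \bigr] \\
% Bellman optimality equation for the action-value function
Q^{*}(s, a) &= \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma \max_{a'} Q^{*}(s', a') \bigr]
\end{align*}
```

The third line follows by substituting the second into the first, and the fourth by substituting the first into the second.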
AGENT PATH PLANNING: For the agent's path planning in this lecture, I used the Policy Evaluation algorithm for FOMDPs. See the path planning channel
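As a rough illustration of how policy evaluation can drive path planning in a static, fully observable grid, here is a minimal Python sketch. The grid layout, goal cell, step reward of -1, undiscounted setting, uniform random policy, and the helper names (`step`, `policy_evaluation`, `greedy_path`) are all assumptions made for illustration, not details taken from the lecture. The sketch evaluates the random policy iteratively and then reads a path off the resulting values by always moving to the best-valued neighbouring cell.

```python
import numpy as np

# Hypothetical 4x4 static environment: 0 = free cell, 1 = obstacle.
GRID = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
])
GOAL = (3, 3)                                  # assumed terminal cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
GAMMA = 1.0                                    # undiscounted episodic task (assumption)
STEP_REWARD = -1.0                             # assumed cost of every move


def step(state, action):
    """Deterministic move; hitting a wall or obstacle leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < GRID.shape[0] and 0 <= c < GRID.shape[1] and GRID[r, c] == 0:
        return (r, c)
    return state


def policy_evaluation(theta=1e-8, max_sweeps=100_000):
    """Iterative policy evaluation of the uniform random policy (in-place sweeps)."""
    V = np.zeros(GRID.shape)
    for _ in range(max_sweeps):
        delta = 0.0
        for r in range(GRID.shape[0]):
            for c in range(GRID.shape[1]):
                s = (r, c)
                if GRID[r, c] == 1 or s == GOAL:
                    continue  # obstacles and the terminal goal keep value 0
                # Bellman expectation backup: each action has probability 1/4.
                v_new = sum(0.25 * (STEP_REWARD + GAMMA * V[step(s, a)])
                            for a in ACTIONS)
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
        if delta < theta:
            break
    return V


def greedy_path(V, start, max_steps=50):
    """Follow the best-valued reachable neighbour until the goal is reached."""
    path = [start]
    s = start
    while s != GOAL and len(path) <= max_steps:
        s = max((step(s, a) for a in ACTIONS), key=lambda n: V[n])
        path.append(s)
    return path


V = policy_evaluation()
path = greedy_path(V, (0, 0))
print(np.round(V, 1))
print(path)
```

Moving greedily on the evaluated values amounts to a single step of policy improvement; alternating evaluation and improvement until the policy stops changing would give policy iteration.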