2. Maze Escape
Due date: November 10, 2023
In this assignment, you will solve an MDP using semi-gradient Sarsa.
2.1. Context
A robot lands at a random position in a maze and needs to find the exit as quickly as possible.
The maze has a size of at least \(10\times 10\).
The maze has only one exit.
2.2. Assignment
Model this problem as an MDP:
Define the state space,
Define the action space,
Define the transition function,
Define the reward function.
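As an illustration of the level of detail expected, one possible formalization is sketched below. Everything in it is an assumption rather than a requirement of the assignment: the grid size \(n \times m\), the wall set \(W\), the exit cell \(e\), the four-move action set, the deterministic moves, and the \(-1\) reward per step.

```latex
\[
\begin{aligned}
\mathcal{S} &= \{(i,j) : 0 \le i < n,\ 0 \le j < m,\ (i,j) \notin W\}, \\
\mathcal{A} &= \{\text{up}, \text{down}, \text{left}, \text{right}\}, \\
p(s' \mid s, a) &=
  \begin{cases}
    1 & \text{if } s' = \operatorname{move}(s, a), \\
    0 & \text{otherwise},
  \end{cases} \\
r(s, a, s') &= -1 .
\end{aligned}
\]
```

Here \(\operatorname{move}(s, a)\) is assumed to return the neighboring cell in direction \(a\) when that cell is inside the grid and not in \(W\), and \(s\) itself otherwise; the exit \(e\) is a terminal state. With a reward of \(-1\) per step and no discounting, maximizing the return amounts to minimizing the number of steps to the exit.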
Implement this problem as a Gym environment as seen in labs.
Inherit from the Gym environment class.
Implement all the necessary functions of the class (step, reset, etc.).
Implement Semi-gradient Sarsa.
Calculate the optimal policy and plot the evolution of the expected value function.
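For reference, the core update of episodic semi-gradient Sarsa with a linear action-value function \(\hat{q}(s, a, \mathbf{w}) = \mathbf{w}^\top \mathbf{x}(s, a)\) can be sketched as follows. This is not a solution to the maze: it runs on a hypothetical one-dimensional corridor (`corridor_step`) so that it stays self-contained, and the one-hot features, the step size, and the exploration rate are illustrative assumptions.

```python
import numpy as np


def features(state, action, n_states, n_actions):
    """One-hot feature vector x(s, a) (an illustrative choice of features)."""
    x = np.zeros(n_states * n_actions)
    x[state * n_actions + action] = 1.0
    return x


def epsilon_greedy(w, state, n_states, n_actions, epsilon, rng):
    """Random action with probability epsilon, otherwise greedy w.r.t. q_hat."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    q = [w @ features(state, a, n_states, n_actions) for a in range(n_actions)]
    return int(np.argmax(q))


def corridor_step(state, action, n_states):
    """Hypothetical 1-D 'maze': action 0 = left, 1 = right; exit is the last cell."""
    nxt = min(max(state + (1 if action == 1 else -1), 0), n_states - 1)
    return nxt, -1.0, nxt == n_states - 1


def semi_gradient_sarsa(n_states=5, n_actions=2, episodes=200,
                        alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(n_states * n_actions)
    for _ in range(episodes):
        s = int(rng.integers(n_states - 1))  # random non-exit start
        a = epsilon_greedy(w, s, n_states, n_actions, epsilon, rng)
        done = False
        while not done:
            s2, r, done = corridor_step(s, a, n_states)
            x = features(s, a, n_states, n_actions)
            if done:
                target = r  # no bootstrapping from a terminal state
            else:
                a2 = epsilon_greedy(w, s2, n_states, n_actions, epsilon, rng)
                target = r + gamma * (w @ features(s2, a2, n_states, n_actions))
            # For linear q_hat, the gradient with respect to w is x(s, a).
            w += alpha * (target - w @ x) * x
            if not done:
                s, a = s2, a2
    return w


w = semi_gradient_sarsa()
```

With one-hot features the weight `w[s * n_actions + a]` is the estimate of \(q(s, a)\), so the greedy policy can be read off by comparing weights per state. Recording, after each episode, the mean over states of \(\max_a \hat{q}(s, a, \mathbf{w})\) is one possible way to obtain a curve for the value-function plot.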
2.3. Submission
You need to submit on Moodle the following:
A LaTeX document describing the problem modeled as an MDP.
Your code in a Python file. The filename should be your last name followed by asn-sarsa.
Your plots.
2.4. Academic Integrity
Any cheating or plagiarism will be sanctioned with a zero and an automatic report.
No exceptions will be made.
You can find the academic integrity policy here: Academic integrity.
A non-exhaustive list of things that are considered cheating/plagiarism:
Submitting someone else's code, even with citations!
Asking someone else to write the code or the report.
Submitting someone else's report.
Etc.