2. Maze Escape

Due date: November 10, 2023
In this assignment, you will solve an MDP problem using semi-gradient Sarsa.

2.1. Context

A robot lands at a random position in a maze and needs to find the exit as quickly as possible.
The maze has a size of at least \(10\times 10\).
The maze has only one exit.

2.2. Assignment

Model this problem as an MDP:
- Define the state space,
- Define the action space,
- Define the transition function,
- Define the reward function.
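
For reference, a generic MDP can be written as the tuple below. The symbols follow common reinforcement-learning notation and are an assumption here; the course material may use different ones. The maze-specific sets and functions are what the four items above ask for.

\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, p, r, \gamma), \qquad
p(s' \mid s, a) = \Pr(S_{t+1} = s' \mid S_t = s, A_t = a), \qquad
r : \mathcal{S} \times \mathcal{A} \times \mathcal{S} \to \mathbb{R}, \qquad
\gamma \in [0, 1].
\]
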
Implement this problem as a Gym environment as seen in labs.
- Inherit from the Gym environment class.
- Implement all the necessary methods of the class (step, reset, etc.).
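
The reset/step interface mentioned above can be sketched on a deliberately tiny problem. This is only an illustration under stated assumptions, not the maze itself: `CorridorEnv`, its `length` and `seed` parameters, and the reward of -1 per step are all hypothetical choices. The class is a plain Python class so that the sketch runs without Gym installed; in the assignment it would inherit from the Gym environment class. The 4-value return of `step` follows the classic Gym convention; newer Gym and Gymnasium releases return 5 values from `step` and an `(observation, info)` pair from `reset`, so check the version used in the labs.

```python
import random


class CorridorEnv:
    """Toy 1-D corridor with cells 0..length-1; the exit is the last cell.

    Hypothetical example: it mirrors the reset/step interface of a Gym
    environment but does not inherit from the Gym base class.
    """

    def __init__(self, length=5, seed=None):
        self.length = length
        self.rng = random.Random(seed)
        self.pos = 0

    def reset(self):
        # Start in a random cell that is not the exit.
        self.pos = self.rng.randrange(self.length - 1)
        return self.pos

    def step(self, action):
        # action 1 moves right, any other action moves left;
        # the walls clamp the position to the corridor.
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = -1.0  # assumed cost per step, to favour short paths
        return self.pos, reward, done, {}
```

Calling `reset()` once and then `step(1)` repeatedly walks the agent to the exit, at which point `done` becomes `True`.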
Implement Semi-gradient Sarsa.
Calculate the optimal policy and plot the evolution of the expected value function.
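
As a rough illustration of the semi-gradient Sarsa step named above, the sketch below applies the episodic update \(w \leftarrow w + \alpha\,[R + \gamma\,\hat{q}(S', A', w) - \hat{q}(S, A, w)]\,\nabla \hat{q}(S, A, w)\) with a linear approximator. Everything specific in it is an assumption made for the example: the toy corridor (not the maze), the one-hot features, the reward of -1 per step, the epsilon-greedy policy, and the hyperparameter values.

```python
import random

N_CELLS = 6        # toy corridor; the exit is the last cell (hypothetical setup)
ACTIONS = (0, 1)   # 0 = left, 1 = right


def features(state, action):
    # One-hot feature vector over (state, action) pairs.
    x = [0.0] * (N_CELLS * len(ACTIONS))
    x[state * len(ACTIONS) + action] = 1.0
    return x


def q_hat(w, state, action):
    # Linear approximation: q_hat(s, a, w) = w . x(s, a)
    return sum(wi * xi for wi, xi in zip(w, features(state, action)))


def epsilon_greedy(w, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = [q_hat(w, state, a) for a in ACTIONS]
    best = max(values)
    # Break ties at random among the best actions.
    return rng.choice([a for a, v in zip(ACTIONS, values) if v == best])


def env_step(state, action):
    # Walls clamp the position; every step costs -1 (assumed reward).
    move = 1 if action == 1 else -1
    nxt = min(max(state + move, 0), N_CELLS - 1)
    return nxt, -1.0, nxt == N_CELLS - 1


def semi_gradient_sarsa(episodes=500, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    w = [0.0] * (N_CELLS * len(ACTIONS))
    for _ in range(episodes):
        state = rng.randrange(N_CELLS - 1)  # random non-exit start
        action = epsilon_greedy(w, state, epsilon, rng)
        done = False
        while not done:
            nxt, reward, done = env_step(state, action)
            if done:
                target = reward  # no bootstrap from a terminal state
            else:
                nxt_action = epsilon_greedy(w, nxt, epsilon, rng)
                target = reward + gamma * q_hat(w, nxt, nxt_action)
            delta = target - q_hat(w, state, action)
            # For a linear q_hat, the gradient w.r.t. w is the feature vector.
            for i, xi in enumerate(features(state, action)):
                w[i] += alpha * delta * xi
            if not done:
                state, action = nxt, nxt_action
    return w
```

With one-hot features this reduces to tabular Sarsa, which keeps the example easy to check by hand; a maze of the required size would more plausibly use a richer feature vector. Storing a copy of `w` (or a summary of the estimated values) after each episode is one way to obtain data for a learning-curve plot.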

2.3. Submission

You need to submit on Moodle the following:

A LaTeX document with the problem modeled as an MDP.
Your code in a Python file. The filename should be your last name followed by asn-sarsa.
Your plots.

2.4. Academic Integrity

Any cheating or plagiarism will be sanctioned with a zero and an automatic report.
No exceptions will be made.
You can find the academic integrity policy here: Academic integrity.
A non-exhaustive list of things that are considered cheating/plagiarism:
- Submitting someone else's code, even with citations.
- Asking someone else to write the code or the report.
- Submitting someone else's report.
- Etc.