2. Maze Escape

  • Due date: November 10, 2023

  • In this assignment, you will solve an MDP using semi-gradient Sarsa.

2.1. Context

  • A robot lands at a random position in a maze and needs to find the exit as quickly as possible.

  • The maze has a size of at least \(10\times 10\).

  • The maze has only one exit.

2.2. Assignment

  1. Model this problem as an MDP:

    • Define the state space,

    • Define the action space,

    • Define the transition function,

    • Define the reward function.

  2. Implement this problem as a Gym environment as seen in labs.

    • Inherit from the Gym environment class.

    • Implement all the necessary methods of the class (step, reset, etc.).

  3. Implement semi-gradient Sarsa.

  4. Calculate the optimal policy and plot the evolution of the expected value function.
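
As a rough illustration of steps 2 to 4, the sketch below pairs a tiny grid environment with semi-gradient Sarsa, whose weight update is \(\mathbf{w} \leftarrow \mathbf{w} + \alpha\,[R + \gamma\,\hat{q}(S',A',\mathbf{w}) - \hat{q}(S,A,\mathbf{w})]\,\nabla\hat{q}(S,A,\mathbf{w})\). Everything in it is an assumption made for illustration and not the expected solution: the class TinyGrid is hypothetical and only mimics a reset/step interface (it does not inherit from the Gym class, and its step returns a simplified tuple), the grid is 4x4 with no walls (smaller than the required size), the reward is -1 per step, and the features are one-hot, which makes the linear approximator equivalent to a table.

```python
import random


class TinyGrid:
    """Hypothetical 4x4 open grid with a single exit (not a real Gym class)."""
    SIZE = 4
    EXIT = (3, 3)
    MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.pos = (0, 0)

    def reset(self):
        # The robot lands in a random cell that is not the exit.
        while True:
            self.pos = (self.rng.randrange(self.SIZE), self.rng.randrange(self.SIZE))
            if self.pos != self.EXIT:
                return self.pos

    def step(self, action):
        # Moves that would leave the grid keep the robot at the border.
        dr, dc = self.MOVES[action]
        row = min(max(self.pos[0] + dr, 0), self.SIZE - 1)
        col = min(max(self.pos[1] + dc, 0), self.SIZE - 1)
        self.pos = (row, col)
        done = self.pos == self.EXIT
        return self.pos, -1.0, done  # assumed reward: -1 per step


def feature_index(state, action, size=4, n_actions=4):
    # One-hot features: exactly one weight is active per (state, action).
    return (state[0] * size + state[1]) * n_actions + action


def q_hat(w, state, action):
    # Linear q with one-hot features reduces to a single weight lookup.
    return w[feature_index(state, action)]


def epsilon_greedy(w, state, eps, rng, n_actions=4):
    if rng.random() < eps:
        return rng.randrange(n_actions)
    values = [q_hat(w, state, a) for a in range(n_actions)]
    return values.index(max(values))


def semi_gradient_sarsa(env, episodes=500, alpha=0.5, gamma=1.0,
                        eps=0.1, seed=0, max_steps=200):
    rng = random.Random(seed)
    w = [0.0] * (env.SIZE * env.SIZE * 4)
    steps_per_episode = []
    for _ in range(episodes):
        s = env.reset()
        a = epsilon_greedy(w, s, eps, rng)
        for t in range(1, max_steps + 1):
            s2, r, done = env.step(a)
            i = feature_index(s, a)  # the gradient is 1 at i and 0 elsewhere
            if done:
                w[i] += alpha * (r - w[i])  # terminal: no bootstrap term
                break
            a2 = epsilon_greedy(w, s2, eps, rng)
            w[i] += alpha * (r + gamma * q_hat(w, s2, a2) - w[i])
            s, a = s2, a2
        steps_per_episode.append(t)
    return w, steps_per_episode
```

Calling semi_gradient_sarsa(TinyGrid(seed=1)) returns the learned weights and the number of steps taken in each episode; a per-episode quantity such as this one, or the mean of the greedy action values over all states, is the kind of series that could be plotted for step 4.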

2.3. Submission

You need to submit on Moodle the following:

  • A LaTeX document with the problem modeled as an MDP.

  • Your code in a Python file. The filename should be your last name followed by asn-sarsa.

  • Your plots.

2.4. Academic Integrity

  • Any cheating or plagiarism will be sanctioned with a zero and an automatic report.

  • No exceptions will be made.

  • You can find the academic integrity policy here: Academic integrity.

  • A non-exhaustive list of things that are considered cheating/plagiarism:

    • Submitting someone else's code. Even with citations!

    • Asking someone else to write the code or the report.

    • Submitting someone else's report.

    • Etc.