Project

Important

This is an individual project that runs over the whole semester, so the expected amount of work is high.

Introduction

You will need to define the Markov Decision Process (MDP) for Sokoban.
Then you will need to define and use a reinforcement learning algorithm to solve it.
At the end of the project you will submit your code and a report explaining your model, your algorithm and its performance.
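As an illustration of what "a reinforcement learning algorithm" can look like in code, here is a minimal sketch of tabular Q-learning. Q-learning is only one possible choice and is not prescribed by this project. The function names, the `(state, action)` dictionary layout, and the default values of `alpha` and `gamma` are assumptions made for the example.

```python
import random
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update to the table q, in place."""
    # No future value is bootstrapped once the episode has ended.
    if done:
        best_next = 0.0
    else:
        best_next = max(q[(next_state, a)] for a in range(n_actions))
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])


def epsilon_greedy(q, state, n_actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[(state, a)])


# Hypothetical states "s0" and "s1", used only to exercise the update.
q = defaultdict(float)
q_learning_update(q, state="s0", action=1, reward=1.0,
                  next_state="s1", done=True, n_actions=4)
```

States only need to be hashable here, so any discrete encoding of the Sokoban state (for example, player position plus box positions) could serve as a key. Whether a tabular method scales to the levels you target is something the report would have to discuss.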

The code must be written in Python.
Sokoban must be implemented as a Gym environment.
You can find more details about the API here: Gym API.
Your RL algorithm must work with the Gym environment that you implemented and with any other Gym environment.
Your code must be able to save and load a policy.
It must also be able to load a Sokoban map/level and use it.
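The requirements above can be sketched as follows. This is a minimal, dependency-free outline under stated assumptions, not a reference solution. It assumes the common Sokoban text format (`#` wall, `@` player, `$` box, `.` goal, `*` box on goal, `+` player on goal), the newer Gym convention where `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`, and hypothetical reward values. A real submission would subclass `gym.Env` and declare `action_space` and `observation_space`, which this sketch omits.

```python
import pickle

# Action index -> (row delta, column delta): up, down, left, right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def parse_level(text):
    """Parse a level given in the common Sokoban text format."""
    walls, boxes, goals, player = set(), set(), set(), None
    for r, line in enumerate(text.splitlines()):
        for c, ch in enumerate(line):
            if ch == "#":
                walls.add((r, c))
            if ch in "$*":
                boxes.add((r, c))
            if ch in ".*+":
                goals.add((r, c))
            if ch in "@+":
                player = (r, c)
    return walls, boxes, goals, player


class SokobanEnv:
    """Gym-style sketch; it does not subclass gym.Env, to stay dependency-free."""

    def __init__(self, level_text, step_reward=-1.0, solved_reward=10.0):
        self.level_text = level_text
        self.step_reward = step_reward      # assumed value
        self.solved_reward = solved_reward  # assumed value
        self.reset()

    def _obs(self):
        # Hashable state: player position plus sorted box positions.
        return (self.player, tuple(sorted(self.boxes)))

    def reset(self, seed=None, options=None):
        self.walls, self.boxes, self.goals, self.player = parse_level(self.level_text)
        return self._obs(), {}

    def step(self, action):
        dr, dc = MOVES[action]
        target = (self.player[0] + dr, self.player[1] + dc)
        if target in self.boxes:
            # A box moves only if the cell behind it is free.
            beyond = (target[0] + dr, target[1] + dc)
            if beyond not in self.walls and beyond not in self.boxes:
                self.boxes.remove(target)
                self.boxes.add(beyond)
                self.player = target
        elif target not in self.walls:
            self.player = target
        # Solved when every box is on a goal (assumes as many boxes as goals).
        terminated = self.boxes == self.goals
        reward = self.solved_reward if terminated else self.step_reward
        return self._obs(), reward, terminated, False, {}


def save_policy(policy, path):
    """Save a policy (here assumed to be a dict mapping state to action)."""
    with open(path, "wb") as f:
        pickle.dump(policy, f)


def load_policy(path):
    """Load a policy previously written by save_policy."""
    with open(path, "rb") as f:
        return pickle.load(f)


LEVEL = "#####\n#@$.#\n#####"
env = SokobanEnv(LEVEL)
obs, info = env.reset()
# Action 3 moves right and pushes the only box onto the goal.
obs, reward, terminated, truncated, info = env.step(3)
```

Representing the policy as a plain dictionary keeps saving and loading trivial with `pickle`. A policy backed by a neural network would need that library's own serialization instead.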

The project is divided into steps with different due dates.

Step	Due Date	Worth
Project due	November 30th	80%
Final presentation	Last day	20%

The final presentation lasts 15 minutes.
No overtime will be accepted.
Common rules about presentations:
- You must stand in front of the class.
- You should not read your slides.
- The slides must not contain too much text.

The detailed marking scheme for the project can be found here: Project
The detailed marking scheme for the final presentation can be found here: Project

Important

Any cheating/plagiarism will be sanctioned with a grade of zero and an automatic report.
No exceptions will be made.
You can find the academic integrity policy here: Academic integrity.
A non-exhaustive list of things that are considered cheating/plagiarism:
- Submitting someone else's code, even with citations.
- Asking someone else to write the code or the report.
- Submitting someone else's report.
- Etc.