Frozen Lake#
Important
Due date: TBD
Context#
In this assignment, you will implement TD(0) and Monte Carlo (MC) learning, apply both methods to the Frozen Lake problem, and analyze the results by comparing their convergence and policy performance.
Assignment#
Part 1: Implement Temporal Difference (TD) Learning#
Task: Implement the TD(0) and Monte Carlo (MC) algorithms; a minimal sketch of one possible structure appears after the deliverables list below.
Environment Setup:
Frozen Lake: two versions, \(4\times 4\) and \(8\times 8\) (a setup snippet is sketched below).
Use a discount factor of \(\gamma = 0.9\).
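A minimal setup snippet, assuming the Gymnasium `FrozenLake-v1` environment (the older OpenAI Gym package exposes the same environment); the `is_slippery` flag is not specified in the assignment and is shown here only as an example:

```python
import gymnasium as gym  # `import gym` also works with classic OpenAI Gym

GAMMA = 0.9  # discount factor required by the assignment

# `is_slippery` is an assumption; toggle it if you want deterministic dynamics.
env_4x4 = gym.make("FrozenLake-v1", map_name="4x4", is_slippery=True)
env_8x8 = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
```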
Deliverables for Part 1:
Implementations of both algorithms, TD(0) and MC.
A plot showing the value function convergence over episodes.
Analysis of how different parameters affect convergence, for example:
learning rate (\(\alpha\)).
etc.
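One possible skeleton for the two algorithms, sketched here as state-value prediction under a fixed (random) behavior policy. The function names, default hyperparameters, and the Gymnasium API (5-tuple `step`, 2-tuple `reset`) are assumptions, not a required structure:

```python
import numpy as np
import gymnasium as gym  # `import gym` works similarly for older OpenAI Gym versions


def td0_prediction(env, policy, gamma=0.9, alpha=0.1, n_episodes=5000):
    """TD(0) state-value prediction for a fixed policy."""
    V = np.zeros(env.observation_space.n)
    for _ in range(n_episodes):
        s, _ = env.reset()
        done = False
        while not done:
            a = policy(s)
            s_next, r, terminated, truncated, _ = env.step(a)
            done = terminated or truncated
            # TD(0) update: V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)]
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V


def mc_prediction(env, policy, gamma=0.9, n_episodes=5000):
    """First-visit Monte Carlo state-value prediction for a fixed policy."""
    V = np.zeros(env.observation_space.n)
    visit_counts = np.zeros(env.observation_space.n)
    for _ in range(n_episodes):
        s, _ = env.reset()
        episode = []
        done = False
        while not done:
            a = policy(s)
            s_next, r, terminated, truncated, _ = env.step(a)
            episode.append((s, r))
            done = terminated or truncated
            s = s_next
        # Work backwards through the episode, accumulating the return G.
        states = [s_t for s_t, _ in episode]
        G = 0.0
        for t in reversed(range(len(episode))):
            s_t, r_t = episode[t]
            G = gamma * G + r_t
            if s_t not in states[:t]:  # first visit to s_t in this episode
                visit_counts[s_t] += 1
                V[s_t] += (G - V[s_t]) / visit_counts[s_t]  # incremental average
    return V


if __name__ == "__main__":
    env = gym.make("FrozenLake-v1", map_name="4x4")
    rng = np.random.default_rng(0)
    random_policy = lambda s: rng.integers(env.action_space.n)
    print("TD(0):", td0_prediction(env, random_policy))
    print("MC:   ", mc_prediction(env, random_policy))
```

Recording a snapshot of `V` after every episode (rather than only returning the final estimate) makes the required convergence plot straightforward to produce.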
Part 2: Comparison and Analysis#
Task: Compare the performance and behavior of both methods (TD(0) and MC).
Required Analysis:
Discuss which method converges faster and why.
Analyze under what conditions one method outperforms the other.
Provide a table comparing key metrics, such as:
Number of episodes for convergence.
Sensitivity to hyperparameters (e.g., learning rate, episode length).
Stability of results.
Deliverables:
A report with comparisons and insights.
Plots showing side-by-side performance (e.g., convergence speed, value estimates); a plotting sketch follows this list.
An explanation of the potential trade-offs between the two methods.
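A minimal plotting sketch for the side-by-side convergence comparison. It assumes you extend your prediction functions to also record a per-episode history of value estimates (`td_history`, `mc_history`) and that you have a reference value function `v_ref` (e.g., from a very long run); these names are illustrative, not part of the assignment:

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_convergence(td_history, mc_history, v_ref, title="TD(0) vs. MC convergence"):
    """Plot per-episode RMS error of the value estimates against a reference.

    td_history / mc_history: lists of V arrays, one snapshot per episode.
    v_ref: reference value function (e.g., from a very long run).
    """
    def rms(history):
        return [np.sqrt(np.mean((v - v_ref) ** 2)) for v in history]

    plt.figure(figsize=(8, 4))
    plt.plot(rms(td_history), label="TD(0)")
    plt.plot(rms(mc_history), label="MC")
    plt.xlabel("Episode")
    plt.ylabel("RMS error vs. reference V")
    plt.title(title)
    plt.legend()
    plt.tight_layout()
    plt.show()


# Example call (assuming the prediction functions were extended to return histories):
# plot_convergence(td_history, mc_history, v_ref)
```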
Evaluation Criteria#
Correctness of Implementations: 40%
TD(0) and MC methods are implemented correctly and produce expected results.
Analysis and Insights: 30%
Depth of analysis in comparing the methods.
Clear presentation of convergence behaviors and performance metrics.
Code Quality: 20%
Well-commented and organized code.
Proper use of libraries and good programming practices.
Presentation: 10%
Clear plots, tables, and visualizations.
Well-written report with proper formatting and insightful observations.
Submission Guidelines#
Code files (.py) uploaded to Moodle.
Report in PDF format with relevant plots and analysis.
Resources#
Sutton & Barto, “Reinforcement Learning: An Introduction”
Documentation for environments like OpenAI Gym (optional but useful).