8. Eligibility traces - Algorithms

In this lab you will implement your first algorithm using eligibility traces (see the related topic).
You will need to reuse the code that you produced during the previous labs.

8.1. Sarsa(\(\lambda\))

Implement the Sarsa(\(\lambda\)) algorithm.
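For reference, assuming a linear approximation \(\hat{q}(s, a, \mathbf{w}) = \mathbf{w}^\top \mathbf{x}(s, a)\), one step of Sarsa(\(\lambda\)) with accumulating traces is usually written as follows. The discount \(\gamma\) and the feature vector \(\mathbf{x}(s, a)\) are notation assumed here, since this lab does not define them:

\[
\begin{aligned}
\delta_t &= R_{t+1} + \gamma\, \hat{q}(S_{t+1}, A_{t+1}, \mathbf{w}_t) - \hat{q}(S_t, A_t, \mathbf{w}_t), \\
\mathbf{z}_t &= \gamma \lambda\, \mathbf{z}_{t-1} + \mathbf{x}(S_t, A_t), \\
\mathbf{w}_{t+1} &= \mathbf{w}_t + \alpha\, \delta_t\, \mathbf{z}_t.
\end{aligned}
\]

The trace \(\mathbf{z}\) is reset to zero at the start of each episode, and \(\hat{q}\) of a terminal state is taken to be 0.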
Create a function sarsa(env, eps, alpha, lambda, T, E):
- env: the gym environment.
- eps: the \(\epsilon\) parameter of the \(\epsilon\)-greedy policy.
- alpha: the step-size parameter.
- lambda: the trace-decay parameter \(\lambda\) (lambda is a reserved word in Python, so the argument needs another name, e.g. lam).
- T: the maximum number of steps per episode.
- E: the number of episodes.
The function must return the policy, the history of the sum of rewards for each episode, and the weight vector.
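As a starting point, here is a minimal sketch of what such a function might look like. It is not the reference solution. It assumes one-hot features over discrete states and actions (so the weight vector has one entry per state-action pair), accumulating traces, and a classic gym-style API where reset() returns a state and step() returns (state, reward, done, info). ChainEnv, its n_states and n_actions attributes, and the extra gamma and seed arguments are all hypothetical and only there to make the example self-contained. The decay argument is called lam because lambda is a reserved word in Python.

```python
import numpy as np


class ChainEnv:
    """Hypothetical 5-state corridor with a classic gym-style API (illustration only)."""

    def __init__(self, n_states=5):
        self.n_states, self.n_actions = n_states, 2  # actions: 0 = left, 1 = right
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1  # rightmost state is terminal
        return self.state, -1.0, done, {}


def sarsa(env, eps, alpha, lam, T, E, gamma=1.0, seed=0):
    """Sarsa(lambda) sketch with one-hot features and accumulating traces."""
    rng = np.random.default_rng(seed)
    n_s, n_a = env.n_states, env.n_actions  # assumed attributes, not part of gym
    w = np.zeros(n_s * n_a)  # one weight per (state, action) pair

    def q(s):
        return w[s * n_a:(s + 1) * n_a]  # view on the action values of state s

    def choose(s):
        # Epsilon-greedy action selection with random tie-breaking.
        if rng.random() < eps:
            return int(rng.integers(n_a))
        values = q(s)
        return int(rng.choice(np.flatnonzero(values == values.max())))

    history = []
    for _ in range(E):
        s = env.reset()
        a = choose(s)
        z = np.zeros_like(w)  # eligibility trace, reset every episode
        total = 0.0
        for _ in range(T):
            s2, r, done, info = env.step(a)
            total += r
            delta = r - w[s * n_a + a]
            z *= gamma * lam
            z[s * n_a + a] += 1.0  # gradient of a one-hot linear q is 1 at this index
            if done:
                w += alpha * delta * z  # terminal state has value 0
                break
            a2 = choose(s2)
            delta += gamma * w[s2 * n_a + a2]
            w += alpha * delta * z
            s, a = s2, a2
        history.append(total)

    policy = np.array([int(np.argmax(q(s))) for s in range(n_s)])  # greedy policy
    return policy, history, w
```

A call such as sarsa(ChainEnv(), eps=0.1, alpha=0.1, lam=0.9, T=100, E=200) returns the greedy policy as an array of action indices, one return per episode, and the learned weights. With the features from the previous labs, the two lines that touch z and the indexing of w would be replaced by the corresponding feature vector.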

8.2. Experiments

For different values of \(T\) and \(E\):
- Run the algorithm.
- Draw the evolution of the sum of rewards for both algorithms (Sarsa(\(\lambda\)) and Semi-gradient Sarsa).
- Draw the weight vector.
Now compare the resulting policy with the one calculated using Semi-gradient Sarsa.
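Per-episode returns tend to be noisy, so it can help to smooth them before plotting the two algorithms on the same axes. The helpers below are one possible way to do this. The names moving_average and plot_histories are hypothetical, and the plotting part assumes matplotlib is installed:

```python
import numpy as np


def moving_average(history, window=10):
    """Smooth a per-episode return history with a sliding mean."""
    history = np.asarray(history, dtype=float)
    window = max(1, min(window, len(history)))  # never wider than the data
    return np.convolve(history, np.ones(window) / window, mode="valid")


def plot_histories(histories, window=10):
    """Plot one smoothed curve per entry; `histories` maps a label to a list of returns."""
    import matplotlib.pyplot as plt  # imported lazily so moving_average has no plotting dependency

    fig, ax = plt.subplots()
    for label, history in histories.items():
        ax.plot(moving_average(history, window), label=label)
    ax.set_xlabel("Episode")
    ax.set_ylabel(f"Sum of rewards (moving average, window={window})")
    ax.legend()
    return fig
```

For example, plot_histories({"Sarsa(lambda)": h1, "Semi-gradient Sarsa": h2}) draws both curves, where h1 and h2 stand for the reward histories returned by your two implementations for a given \(T\) and \(E\).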