CSCI-531 Reinforcement Learning - Home
Index

Symbols | E | P | T | V

Symbols

  • Policy (\pi): A policy is a mapping from states to probabilities of each possible action.
  • Trajectory: A sequence of states and actions.
  • Value Function: The value function of a state s under a policy \pi is the expected return when starting in s and following \pi thereafter.
  • Environment: The world that the agent interacts with.
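
The first three definitions above can also be written as formulas. This is only a sketch: the symbols \tau, v_\pi, S_t, A_t and G_t are notational assumptions borrowed from common reinforcement learning textbooks, not notation confirmed by this index. Here S_t and A_t stand for the state and action at time step t, and G_t for the return from step t onward.

```latex
\begin{align*}
\pi(a \mid s) &= \Pr(A_t = a \mid S_t = s)
  && \text{(policy: probability of action $a$ in state $s$)} \\
\tau &= (S_0, A_0, S_1, A_1, \dots)
  && \text{(trajectory: sequence of states and actions)} \\
v_\pi(s) &= \mathbb{E}_\pi\left[ G_t \mid S_t = s \right]
  && \text{(value function: expected return from $s$ under $\pi$)}
\end{align*}
```

The expectation in the last line is taken over trajectories generated when the agent follows \pi in the environment, which is how the four index entries relate to one another.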

E

  • Environment

P

  • Policy

T

  • Trajectory

V

  • Value Function

By Dr. Jean-Alexis Delamer

© Copyright 2023.