*************
Probabilities
*************

Why are we talking about probabilities?
=======================================

.. rst-class:: bignums-tip

1. Working with mobile robots means working with uncertainty.

   * Uncertainty can come from errors in the motion control.
   * Uncertainty can come from measurement errors of the sensors.

2. To reduce the uncertainty, we need an explicit representation of it.

   * The uncertainty is often represented with probability theory.

Probabilistic inference is the process of calculating the probability laws of random variables that are derived from other random variables and from the observed data.

Discrete Random Variables
=========================

.. important::

   This is something that you should already know!

* We denote by :math:`X` a random variable.
* And by :math:`x` a value that :math:`X` could take.
* :math:`p(X=x)` or :math:`p(x)` represents the probability that :math:`X` takes the value :math:`x`.

.. admonition:: Example

   If you flip a coin, you can obtain either *heads* or *tails*:
   :math:`p(X=\text{head})=p(X=\text{tail})=\frac{1}{2}`

Discrete probabilities always sum to 1:

.. math::

   \sum_x p(X=x)=1

.. note::

   Probabilities are always non-negative: :math:`p(X=x)\ge 0`.

Continuous Random Variables
===========================

You will see that in robotics, we usually address estimation and decision-making in continuous spaces.

* We denote by :math:`X` a continuous random variable.
* We assume that all continuous random variables possess *probability density functions* (PDFs).

.. admonition:: Activity
   :class: activity

   Can you give one common density function?

A very common one:

* The one-dimensional normal distribution with mean :math:`\mu` and variance :math:`\sigma^2`.
* The PDF of a normal distribution is a Gaussian function.

Gaussian function
   **PDF**: :math:`p(x)=(2\pi\sigma^2)^{-\frac{1}{2}}\exp\{-\frac{1}{2}\frac{(x-\mu)^2}{\sigma^2}\}`

   **Abbreviation**: :math:`\mathcal{N}(x;\mu,\sigma^2)`

   :math:`x` is a scalar value.

.. admonition:: Example
   :class: example

   .. code:: python

      import math

      import matplotlib.pyplot as plt
      import numpy as np
      import scipy.stats as stats

      mu = 0
      variance = 1
      sigma = math.sqrt(variance)

      # Evaluate the Gaussian PDF on [mu - 3*sigma, mu + 3*sigma]
      x = np.linspace(mu - 3*sigma, mu + 3*sigma, 100)
      plt.plot(x, stats.norm.pdf(x, mu, sigma))
      plt.show()

   .. figure:: ./pyplots/gaussian.png
      :width: 80 %
      :align: center

If we have more than one dimension:

* :math:`x` becomes a multi-dimensional vector.
* Normal distributions over vectors are called *multivariate*.

Multivariate normal distribution
   **PDF**: :math:`p(x)=\det(2\pi\Sigma)^{-\frac{1}{2}}\exp\{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\}`

   **Covariance matrix**: :math:`\Sigma` is a *positive semidefinite* and *symmetric* matrix.

As with discrete probabilities, continuous probabilities always integrate to 1:

.. math::

   \int p(x)dx = 1

.. admonition:: Example
   :class: example

   .. code:: python

      import matplotlib.pyplot as plt
      import numpy as np
      from scipy.stats import multivariate_normal

      # Parameters to set
      mu_x = 0
      variance_x = 3
      mu_y = 0
      variance_y = 15

      # Create grid and multivariate normal
      x = np.linspace(-10, 10, 500)
      y = np.linspace(-10, 10, 500)
      X, Y = np.meshgrid(x, y)
      pos = np.empty(X.shape + (2,))
      pos[:, :, 0] = X
      pos[:, :, 1] = Y
      rv = multivariate_normal([mu_x, mu_y],
                               [[variance_x, 0], [0, variance_y]])

      # Make a 3D plot
      # (fig.gca(projection='3d') is removed in recent matplotlib)
      fig = plt.figure()
      ax = fig.add_subplot(projection='3d')
      ax.plot_surface(X, Y, rv.pdf(pos), cmap='viridis', linewidth=0)
      ax.set_xlabel('X axis')
      ax.set_ylabel('Y axis')
      ax.set_zlabel('Z axis')
      plt.show()

   .. figure:: ./pyplots/multivariate.png
      :align: center
      :width: 80 %

Joint and Conditional probability
=================================

Joint distribution
   **Formula**: :math:`p(X=x \text{ and } Y=y) = p(x,y)`

   **Definition**: Describes the probability of the event that the random variable :math:`X` takes on the value :math:`x` and that :math:`Y` takes on the value :math:`y`.

Independence
   **Formula**: :math:`p(x,y) = p(x)p(y)`
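The independence formula above can be checked numerically. A minimal sketch, assuming two independent fair coin flips (the dictionaries and names are only for illustration):

```python
from itertools import product

# Marginal distributions of two fair coins.
p_x = {"head": 0.5, "tail": 0.5}
p_y = {"head": 0.5, "tail": 0.5}

# Joint distribution of two independent flips:
# each of the four outcomes (x, y) has probability 1/4.
p_xy = {(x, y): 0.25 for x, y in product(p_x, p_y)}

# Independence holds iff p(x, y) == p(x) * p(y) for every pair.
independent = all(
    abs(p_xy[(x, y)] - p_x[x] * p_y[y]) < 1e-12
    for x, y in p_xy
)
print(independent)  # True
```

If the two flips were correlated (say, the second coin copied the first), `p_xy` would concentrate on matching pairs and the check would fail.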
Conditional probability
   **Formula**: :math:`p(X=x|Y=y) = p(x|y)`

   **Definition**: The probability that :math:`X=x` knowing that :math:`Y=y`; the probability of :math:`x` is conditioned on :math:`y`.

If :math:`p(y)>0`, then the conditional probability is defined as:

.. math::

   p(x|y) = \frac{p(x,y)}{p(y)}

If :math:`X` and :math:`Y` are independent:

.. math::

   p(x|y) = \frac{p(x)p(y)}{p(y)} = p(x)

Law of Total Probability
========================

The following result, derived from the definition of conditional probability and the axioms of probability measures, is referred to as the *theorem of total probability*.

In the discrete case: :math:`p(x)=\sum_y p(x|y)p(y)`.

In the continuous case: :math:`p(x) = \int p(x|y)p(y)dy`.

Bayes formulas
==============

Bayes rule
   **Discrete case**:

   .. math::

      p(x|y) = \frac{p(y|x)p(x)}{p(y)} = \frac{p(y|x)p(x)}{\sum_{x'}p(y|x')p(x')}

   **Continuous case**:

   .. math::

      p(x|y) = \frac{p(y|x)p(x)}{p(y)} = \frac{p(y|x)p(x)}{\int p(y|x')p(x')dx'}

Why is this important?
======================

.. rst-class:: bignums-tip

1. :math:`x` is a quantity that we would like to infer from :math:`y` (like a position).

2. The probability :math:`p(x)` is referred to as the *prior probability distribution*, and :math:`y` is called the *data* (e.g., a sensor measurement).

3. :math:`p(x)` summarizes the knowledge we have about :math:`X` prior to incorporating the data :math:`y`.

4. :math:`p(x|y)` is called the *posterior probability distribution* over :math:`X`.

The Bayes formula can be formulated as:

.. math::

   p(x|y) = \frac{p(y|x)p(x)}{p(y)} = \frac{likelihood \times prior}{evidence}

.. note::

   :math:`p(y)` does not depend on :math:`x`. Thus :math:`p(y)^{-1}` is called the normalizer and is denoted :math:`\eta`:

   :math:`p(x|y) = \eta\, p(y|x)p(x)`

.. admonition:: Activity
   :class: activity

   You are planning a picnic today, but the morning is cloudy.

   * Oh no! 50% of all rainy days start off cloudy!
   * But cloudy mornings are common (about 40% of days start cloudy).
   * And this is usually a dry month (only 3 of 30 days tend to be rainy, or 10%).

   **What is the chance of rain during the day?**

   * We will use `Rain` to mean rain during the day, and `Cloud` to mean a cloudy morning.
   * The chance of `Rain` given `Cloud` is written `P(Rain|Cloud)`.

Conditioning
============

We can condition the Bayes rule on more than one variable. For example, we can condition on :math:`Z = z`:

.. math::

   p(x|y,z) = \frac{p(y|x,z)p(x|z)}{p(y|z)}

as long as :math:`p(y|z)>0`.

It also means that :math:`p(x|y)=\int p(x|y,z)p(z|y)dz`.

Similarly, we can condition the rule for combining probabilities of independent random variables on another variable :math:`z`:

.. math::

   p(x, y | z) = p(x | z) p(y | z)

Such a relation is known as *conditional independence*. It is equivalent to

* :math:`p(x|z) = p(x|z,y)`
* :math:`p(y|z) = p(y|z,x)`

Expectations of random variables
================================

The expected value of a random variable :math:`X` is denoted :math:`E[X]`.

* You can think of it as the "average" value attained by the random variable.
* In fact, it is also called its **mean**.

Expected value
   **Discrete case**: :math:`E[X] = \sum_x xp(X=x)`

   **Continuous case**: :math:`E[X]=\int xp(x)dx`

.. admonition:: Activity
   :class: activity

   Calculate the expected value of a die roll.

Covariance
   The covariance measures the expected squared deviation from the mean:

   :math:`\text{Cov}[X] = E[(X-E[X])^2] = E[X^2]-E[X]^2`
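As a worked sketch of the activity above, the discrete formulas can be applied directly to a fair six-sided die (exact fractions are used only to keep the arithmetic readable):

```python
from fractions import Fraction

# Fair six-sided die: each face 1..6 has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

# E[X] = sum_x x * p(X = x)
mean = sum(x * p for x in faces)              # 7/2 = 3.5

# Cov[X] = E[X^2] - E[X]^2
second_moment = sum(x**2 * p for x in faces)  # 91/6
variance = second_moment - mean**2            # 35/12

print(mean, variance)
```

Note that the mean 3.5 is a value the die can never actually show: the expected value is an average over outcomes, not necessarily an attainable outcome.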