*****************************
Robot Environment Interaction
*****************************

Robot environment
=================

.. important::

   A robot is not an isolated system.

   * It evolves in an environment, or world, which is a dynamic system.
   * The only way for the robot to obtain information about the environment is to use its embedded sensors.
   * The sensors send measurements to the robot.
   * The measurements can be noisy.
   * Some things cannot be measured by the sensors.

.. tikz::
   :align: center
   :xscale: 80
   :libs: shapes, patterns, arrows, backgrounds, chains, fit, positioning, calc, intersections, through

   \tikzstyle{block_red} = [draw,rectangle,thick,minimum height=2em, anchor=north west, fill=red, text width=2.6cm,align=center, rounded corners=2, fill opacity=0.3, text opacity=1]
   \tikzstyle{block_orange} = [draw,rectangle,thick,minimum height=2em, anchor=north west, fill=orange, text width=2.6cm,align=center, rounded corners=2, fill opacity=0.3, text opacity=1]
   \tikzstyle{connector} = [->,thick]
   \tikzstyle{line} = [thick]
   \node[block_red] (A) at (0,0){Agent};
   \node[block_orange] (EN) at (0,-2.5){World};
   \node[below] (U) at ($(A.east)+(1,0.6)$){action $a$};
   \node (O) at ($(EN.west)-(1.5,0.4)$) {measurement $z$};
   \node (n1) at ($(A.east)+(3,0)$) {};
   \draw [connector] (A.east) -- ($(A.east)+(1.5,0)$) |- (EN.east);
   \draw [connector] (EN.west) -- ($(EN.west)-(1.5,0)$) |- (A.west);

Consequently, the robot needs to maintain an internal belief about the *state* of the environment.

.. important::

   Each interaction of the robot with the environment (using actuators, etc.) can change the environment.

State
=====

The state characterizes the environment and the robot.

Some variables change their value over time; we call them **dynamic variables**:

* The position of the robot.
* The whereabouts of people in the area.

Other variables, such as walls, are **static**.

.. important::

   We denote the state :math:`x`.
   As the state changes over time, the state at time :math:`t` is denoted :math:`x_t`.

.. admonition:: Activity
   :class: activity

   * Consider a 4-wheeled robot in a 3D environment.
   * Give the variables defining the state of a mobile robot evolving in a 3D environment.
   * Then, give a formal definition of the state.

.. figure:: ./img/yaw_pitch_roll.jpeg
   :align: center

.. admonition:: Definition
   :class: note

   A state :math:`x_t` is called **complete** if the knowledge of past states, measurements, or controls carries no additional information that would help us predict the future more accurately.

In the previous example, the state we defined is not complete. We are missing a few things:

1. The robot velocity.
2. The location and features of surrounding objects in the environment.
3. The location and velocities of moving objects and people.
4. Possibly other variables.

.. warning::

   The definition of completeness does not require the future to be a deterministic function of the state. The future may be stochastic, but no variable prior to :math:`x_t` may influence the stochastic evolution of future states, unless this dependence is mediated through the state :math:`x_t`.

.. figure:: ./img/complete_state_01.drawio.png
   :align: center

Environment Interaction
=======================

We can distinguish two types of interaction between a robot and its environment:

1. The robot can influence the state of the environment with its actuators.
2. It can perceive the state with its sensors.

.. admonition:: Environment measurement data
   :class: note

   The measurement data at time :math:`t` is denoted :math:`z_t`.

.. figure:: ./img/data_z.drawio.png
   :align: center

:math:`z_{t_1:t_2}=z_{t_1}, z_{t_1+1}, z_{t_1+2}, ..., z_{t_2}` denotes the set of all measurements acquired from time :math:`t_1` to time :math:`t_2`, for :math:`t_1 \leq t_2`.

.. admonition:: Control data
   :class: note

   Control data is denoted :math:`u_t`.
   It represents the control action executed by the robot at time :math:`t`.

.. figure:: ./img/data_u.drawio.png
   :align: center

:math:`u_{t_1:t_2}=u_{t_1}, u_{t_1+1}, u_{t_1+2}, ..., u_{t_2}` denotes the sequence of control data from time :math:`t_1` to time :math:`t_2`, for :math:`t_1 \leq t_2`.

When a robot executes a control action :math:`u_t` in the state :math:`x_{t-1}`, the state changes stochastically to :math:`x_{t}`.

Probabilistic Generative Laws
=============================

If the state changes stochastically, then we can calculate the probability of generating :math:`x_t`:

* We can use all the data accumulated until now: :math:`u_{1:t-1}, z_{1:t-1}`.
* Then we need to consider all the past states: :math:`x_{0:t-1}`.

.. note::

   The control and measurement data start at :math:`t=1`. It is important to specify that the robot first executes the control action :math:`u_1`, then takes a measurement :math:`z_1`.

.. admonition:: Activity
   :class: activity

   * Try to explain why we execute the action first, then measure.

The evolution of the state can be given by a probability distribution:

.. math::

   p(x_t|x_{0:t-1},z_{1:t-1},u_{1:t})

It can be visualized as follows:

.. figure:: ./img/action_sensing.drawio.png
   :align: center

|

.. admonition:: Example
   :class: example

   The problem:

   * Suppose a robot needs to move from one room to another.
   * There is a door that can be either closed or open.
   * The robot needs to know whether the door is open before trying to move.

   .. figure:: ./img/door.PNG
      :align: center

   |

   Formally:

   * We denote the state of the door :math:`x`.
   * We denote :math:`P(x=\text{open}|z)` the probability that the door is open given the measurement :math:`z`.
   * At time :math:`t=0` the robot doesn't know the state of the door, meaning:

   .. math::

      p(x_0=\text{open}) = p(x_0=\text{closed}) = 0.5

   * The robot decides to stay (:math:`u_1 = \text{stay}`) and receives a first measurement :math:`z_1`.
   * From experience you know that:

     * :math:`P(z_1 | x=\text{open}) = 0.6`
     * :math:`P(z_1 | x=\text{closed}) = 0.4`

   What is the probability that the door is open?

   .. math::

      \begin{align}
      P(x_1=\text{open}|z_1) &= \frac{P(z_1|x_1=\text{open})P(x_1=\text{open})}{P(z_1)}\\
      &= \frac{P(z_1|x_1=\text{open})P(x_1=\text{open})}{\sum_{x_1'} P(z_1|x_1')P(x_1')}\\
      &= \frac{0.6\times 0.5}{0.6\times 0.5 + 0.4\times 0.5}\\
      &= 0.6
      \end{align}

   * Now, we consider that the robot stays at the same position again (:math:`u_2 = \text{stay}`) and receives the measurement :math:`z_2`.
   * From experience you know that:

     * :math:`P(z_2 | x=\text{open}) = 0.5`
     * :math:`P(z_2 | x=\text{closed}) = 0.6`

   * The probability that the door is open is:

   .. math::

      P(x_2=\text{open}|z_2, z_1) = \frac{P(z_2|x_2=\text{open}, z_1)P(x_2=\text{open}|z_1)}{P(z_2|z_1)}

We can see a first issue with this formulation:

* The amount of data increases with each time step.
* This increases the complexity of the probability distribution.

**How do we solve this?**

If the state :math:`x` is **complete**, then :math:`x_{t-1}` is a sufficient summary of all previous controls and measurements up to this point (:math:`u_{1:t-1}, z_{1:t-1}`). Concretely, to calculate the probability of :math:`x_t`, we only need to consider :math:`u_t` if we already know :math:`x_{t-1}`. Meaning,

.. math::

   p(x_t|x_{0:t-1},z_{1:t-1},u_{1:t}) = p(x_t|x_{t-1},u_t)

|

If we want to visualize it:

.. figure:: ./img/action_sensing_02.drawio.png
   :align: center

|

**What about measurements?**

Again, if :math:`x_t` is complete, then the probability of measuring :math:`z_t` is :math:`p(z_t|x_t)`.

.. important::

   The **Markov assumption**: :math:`z_t` is conditionally independent of :math:`z_1,...,z_{t-1}` given :math:`x`.

   .. math::

      \begin{split}
      P(x|z_1,...,z_t) &= \frac{P(z_t|x)P(x|z_1,...,z_{t-1})}{P(z_t|z_1, ..., z_{t-1})}\\
      &= \eta P(z_t|x)P(x|z_1,...,z_{t-1})\\
      &= \eta_{1..t} P(x)\prod_{i=1}^{t}P(z_i|x)
      \end{split}

..
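The single-measurement update above can be checked numerically. Below is a minimal sketch in Python; the state names and probability values are taken from the door example:

```python
# Single-step Bayes update for the door example.
prior = {"open": 0.5, "closed": 0.5}       # p(x_0)
likelihood = {"open": 0.6, "closed": 0.4}  # P(z_1 | x)

# Unnormalized posterior, then normalization by the evidence P(z_1).
unnormalized = {s: likelihood[s] * prior[s] for s in prior}
evidence = sum(unnormalized.values())      # P(z_1)
posterior = {s: v / evidence for s, v in unnormalized.items()}

print(round(posterior["open"], 3))  # -> 0.6
```

The normalization step is exactly the denominator :math:`\sum_{x_1'} P(z_1|x_1')P(x_1')` of the derivation above.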
.. admonition:: Activity
   :class: activity

   * Come back to the previous example and calculate :math:`P(x_2=\text{open}|z_2, z_1)`.
   * What can you conclude?

Bayes Network
-------------

What we discussed before leads us to the dynamic Bayes network.

.. figure:: ./img/bayes_network.drawio.png
   :align: center

|

This figure illustrates the evolution of states and measurements through different probabilities.

.. note::

   The state evolves stochastically from time :math:`t-1` to :math:`t` under the control action :math:`u_t`. The measurement :math:`z_t` is also stochastic.

State transition probability
   The state transition probability specifies how the state evolves based on the robot control action :math:`u_t`. The probability is given by :math:`p(x_t|x_{t-1},u_t)`.

Measurement probability
   The measurement probability specifies the probabilistic law according to which measurements :math:`z` are generated from the state :math:`x`. The probability is given by :math:`p(z_t|x_t)`; if it does not depend on :math:`t`, it can be written :math:`p(z|x)`. The measurements can be noisy.

Belief
------

Another important notion in robotics is the **belief**. A belief represents the knowledge of the robot about the state of the world:

* Consider a possible state of the robot :math:`x_t` corresponding to a position.
* The robot **cannot** know its pose: positions are not measurable directly (even with a GPS).
* The robot needs to infer its position from the measurement data.

We need to distinguish the real state from the internal *belief* of the robot.

Belief distribution
   A belief distribution is a posterior probability over state variables conditioned on the available data. The belief over a variable :math:`x_t` is denoted :math:`bel(x_t)` or :math:`b(x_t)`.

.. math::

   b(x_t) = p(x_t|z_{1:t},u_{1:t})

.. note::

   Because a belief is a posterior probability, we calculate it after the last control action and the last measurement received.

..
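For a discrete state space such as the door example, the state transition and measurement probabilities can be stored as plain lookup tables, and a belief is just a probability per state. Below is a minimal sketch in Python; the nested-dictionary layout is our own illustrative choice, and the numbers are the door-example values used later in this section:

```python
# transition[u][x_prev][x_next] ~ p(x_t = x_next | x_{t-1} = x_prev, u_t = u)
transition = {
    "push":    {"open":   {"open": 1.0, "closed": 0.0},
                "closed": {"open": 0.8, "closed": 0.2}},
    "nothing": {"open":   {"open": 1.0, "closed": 0.0},
                "closed": {"open": 0.0, "closed": 1.0}},
}

# measurement[x][z] ~ p(z | x)
measurement = {
    "open":   {"open": 0.6, "closed": 0.4},
    "closed": {"open": 0.2, "closed": 0.8},
}

# A belief assigns a probability to each possible state.
belief = {"open": 0.5, "closed": 0.5}

# Sanity check: every conditional distribution sums to 1.
for u in transition:
    for x_prev in transition[u]:
        assert abs(sum(transition[u][x_prev].values()) - 1.0) < 1e-9
```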
.. figure:: ./img/belief.drawio.png
   :align: center

We can distinguish two steps in the belief calculation:

1. The *prediction*: :math:`\bar{b}(x_t) = p(x_t|z_{1:t-1},u_{1:t})`
2. The *correction*: updating :math:`\bar{b}(x_t)` with the measurement :math:`z_t`.

We can predict the future state using the state transition probability, but as we discussed earlier, the control action is not perfect. So we need to use the measurement to update our prediction.

.. important::

   This distinction is very important! It means that before executing a control action, you can only calculate the prediction.

Calculating the belief
======================

To calculate a belief :math:`b(x_t)` we use **filters**.

Bayes filter
------------

The most general one is the Bayes filter algorithm:

.. figure:: ./img/bayes_filter.png
   :align: center

|

It is a general filter, because it applies the belief formula as previously defined.

* The filter takes three parameters:

  * the belief at step :math:`t-1`,
  * the control action :math:`u_t`,
  * the measurement received :math:`z_t`.

* The algorithm starts by predicting the new state using the control action (line 3): the *prediction*.
* Then it updates (or corrects) the prediction using the measurement: the *correction*.

.. If the derivation of the belief formula is not clear, you can find the detail below.
.. .. admonition:: Mathematical derivation
..    Belief formulation:
..    .. math::
..       b(x_t) = p(x_t|z_{1:t},u_{1:t})
..    Bayes rule:
..    .. math::
..       b(x_t) = \eta p(z_t|x_t, z_{1:t-1}, u_{1:t})p(x_t|z_{1:t-1},u_{1:t})
..    Markov assumption:
..    .. math::
..       b(x_t) = \eta p(z_t|x_t)p(x_t|z_{1:t-1},u_{1:t})
..    Total probability:
..    .. math::
..       b(x_t) = \eta p(z_t|x_t)\int p(x_t|x_{t-1},z_{1:t-1},u_{1:t})p(x_{t-1}|z_{1:t-1}, u_{1:t})dx_{t-1}
..    Markov assumption:
..    .. math::
..       \begin{split}
..       b(x_t) &= \eta p(z_t|x_t)\int p(x_t|x_{t-1},u_{t})p(x_{t-1}|z_{1:t-1}, u_{1:t})dx_{t-1}\\
..       &= \eta p(z_t|x_t)\int p(x_t|x_{t-1},u_{t})b(x_{t-1})dx_{t-1}
..
.. \end{split}

Application
-----------

Let's try to use the filter on the previous example. We have a robot in front of a door that can be either closed or open, and the robot has two control actions available: doing nothing or pushing the door.

At the time step :math:`t=0` the robot doesn't know whether the door is open or closed, which gives us:

.. math::

   \begin{align}
   b(x_0 = \text{open}) &= 0.5\\
   b(x_0 = \text{closed}) &= 0.5
   \end{align}

.. figure:: ./img/belief_example_01.drawio.png
   :align: center

We define the state transition probabilities as:

+---------------------------------+--------------------------------+--------------------------------+------------+
| State :math:`t`                 | Control action                 | State :math:`t-1`              | Probability|
+=================================+================================+================================+============+
| :math:`x_{t} = \text{open}`     | :math:`u_{t} = \text{push}`    | :math:`x_{t-1} = \text{open}`  | 1          |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{closed}`   | :math:`u_{t} = \text{push}`    | :math:`x_{t-1} = \text{open}`  | 0          |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{open}`     | :math:`u_{t} = \text{push}`    | :math:`x_{t-1} = \text{closed}`| 0.8        |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{closed}`   | :math:`u_{t} = \text{push}`    | :math:`x_{t-1} = \text{closed}`| 0.2        |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{open}`     | :math:`u_{t} = \text{nothing}` | :math:`x_{t-1} = \text{open}`  | 1          |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{closed}`   | :math:`u_{t} = \text{nothing}` | :math:`x_{t-1} = \text{open}`  | 0          |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{open}`     | :math:`u_{t} = \text{nothing}` | :math:`x_{t-1} = \text{closed}`| 0          |
+---------------------------------+--------------------------------+--------------------------------+------------+
| :math:`x_{t} = \text{closed}`   | :math:`u_{t} = \text{nothing}` | :math:`x_{t-1} = \text{closed}`| 1          |
+---------------------------------+--------------------------------+--------------------------------+------------+

We also need to define the measurement probabilities:

+---------------------------------+--------------------------------+------------------------------+
| Measurement :math:`t`           | State :math:`t`                | Probability                  |
+=================================+================================+==============================+
| :math:`z_{t} = \text{open}`     | :math:`x_{t} = \text{open}`    | 0.6                          |
+---------------------------------+--------------------------------+------------------------------+
| :math:`z_{t} = \text{closed}`   | :math:`x_{t} = \text{open}`    | 0.4                          |
+---------------------------------+--------------------------------+------------------------------+
| :math:`z_{t} = \text{open}`     | :math:`x_{t} = \text{closed}`  | 0.2                          |
+---------------------------------+--------------------------------+------------------------------+
| :math:`z_{t} = \text{closed}`   | :math:`x_{t} = \text{closed}`  | 0.8                          |
+---------------------------------+--------------------------------+------------------------------+

.. note::

   The state transition and measurement probabilities depend on the problem; they can either be calculated or be given by experts.

.. admonition:: Example
   :class: example

   Suppose we have the following sequence: :math:`u_{1} = \text{nothing}, z_1 = \text{open}, u_{2} = \text{push}, z_2 = \text{open}`.

   We use the Bayes filter to calculate :math:`b(x_1)`:

   ..
   .. math::

      \begin{split}
      \bar{b}(x_1) &= \int p(x_1|u_1,x_0)b(x_0)dx_0\\
      &=\sum_{x_0}p(x_1|u_1,x_0)b(x_0)\\
      &=p(x_1|u_1=\text{nothing},x_0=\text{open})b(x_0=\text{open}) + p(x_1|u_1=\text{nothing},x_0=\text{closed})b(x_0=\text{closed})
      \end{split}

   We can calculate it for each possible state:

   .. math::

      \begin{split}
      \bar{b}(x_1=\text{open}) &= p(x_1=\text{open}|u_1=\text{nothing},x_0=\text{open})b(x_0=\text{open})\\
      &+ p(x_1=\text{open}|u_1=\text{nothing},x_0=\text{closed})b(x_0=\text{closed})\\
      &= 1\times 0.5 + 0\times 0.5\\
      &= 0.5
      \end{split}

   .. math::

      \begin{split}
      \bar{b}(x_1=\text{closed}) &= p(x_1=\text{closed}|u_1=\text{nothing},x_0=\text{open})b(x_0=\text{open})\\
      &+ p(x_1=\text{closed}|u_1=\text{nothing},x_0=\text{closed})b(x_0=\text{closed})\\
      &= 0\times 0.5 + 1\times 0.5\\
      &= 0.5
      \end{split}

   .. figure:: ./img/belief_example_02.drawio.png
      :align: center

   |

   Now, we can use the measurement to correct the predictions:

   .. math::

      b(x_1) = \eta\, p(z_1 = \text{open}|x_1)\bar{b}(x_1)

   For each state:

   .. math::

      \begin{split}
      b(x_1=\text{open}) &= \eta\, p(z_1 = \text{open}|x_1 = \text{open})\bar{b}(x_1 = \text{open})\\
      &= \eta\times 0.6\times 0.5\\
      &= \eta\times 0.3
      \end{split}

   .. math::

      \begin{split}
      b(x_1=\text{closed}) &= \eta\, p(z_1 = \text{open}|x_1 = \text{closed})\bar{b}(x_1 = \text{closed})\\
      &= \eta\times 0.2\times 0.5\\
      &= \eta\times 0.1
      \end{split}

   Now we can calculate :math:`\eta`:

   .. math::

      \eta = (0.3+0.1)^{-1} = 2.5

   So we have:

   .. math::

      \begin{split}
      b(x_1=\text{open}) &= 0.75\\
      b(x_1=\text{closed}) &= 0.25
      \end{split}

   .. figure:: ./img/belief_example_03.drawio.png
      :align: center

   |

   Now, we repeat the process:

   .. math::

      \begin{split}
      \bar{b}(x_2) &= \int p(x_2|u_2,x_1)b(x_1)dx_1\\
      &=\sum_{x_1}p(x_2|u_2,x_1)b(x_1)\\
      &=p(x_2|u_2=\text{push},x_1=\text{open})b(x_1=\text{open}) + p(x_2|u_2=\text{push},x_1=\text{closed})b(x_1=\text{closed})
      \end{split}

   We calculate the prediction at time :math:`t=2` for each possible state:

   ..
   .. math::

      \begin{split}
      \bar{b}(x_2=\text{open}) &= p(x_2=\text{open}|u_2=\text{push},x_1=\text{open})b(x_1=\text{open})\\
      &+ p(x_2=\text{open}|u_2=\text{push},x_1=\text{closed})b(x_1=\text{closed})\\
      &= 1\times 0.75 + 0.8\times 0.25\\
      &= 0.95
      \end{split}

   .. math::

      \begin{split}
      \bar{b}(x_2=\text{closed}) &= p(x_2=\text{closed}|u_2=\text{push},x_1=\text{open})b(x_1=\text{open})\\
      &+ p(x_2=\text{closed}|u_2=\text{push},x_1=\text{closed})b(x_1=\text{closed})\\
      &= 0\times 0.75 + 0.2\times 0.25\\
      &= 0.05
      \end{split}

   .. figure:: ./img/belief_example_04.drawio.png
      :align: center

   |

   Then, we correct:

   .. math::

      b(x_2) = \eta\, p(z_2 = \text{open}|x_2)\bar{b}(x_2)

   For each state:

   .. math::

      \begin{split}
      b(x_2=\text{open}) &= \eta\, p(z_2 = \text{open}|x_2 = \text{open})\bar{b}(x_2 = \text{open})\\
      &= \eta\times 0.6\times 0.95\\
      &= \eta\times 0.57
      \end{split}

   .. math::

      \begin{split}
      b(x_2=\text{closed}) &= \eta\, p(z_2 = \text{open}|x_2 = \text{closed})\bar{b}(x_2 = \text{closed})\\
      &= \eta\times 0.2\times 0.05\\
      &= \eta\times 0.01
      \end{split}

   Now we can calculate :math:`\eta`:

   .. math::

      \eta = (0.57+0.01)^{-1} \approx 1.72

   So we have:

   .. math::

      \begin{split}
      b(x_2=\text{open}) &\approx 0.98\\
      b(x_2=\text{closed}) &\approx 0.02
      \end{split}

   .. figure:: ./img/belief_example_05.drawio.png
      :align: center

   |

.. admonition:: Activity
   :class: activity

   1. Implement the Bayes filter in Python.
   2. Randomly generate different sequences of control actions and measurements for the previous problem over :math:`t=1..10` steps.
   3. Use the filter to calculate the belief at each time step :math:`t`. You should start with :math:`b(x_0 = \text{open}) = b(x_0 = \text{closed}) = 0.5`.
   4. Plot the evolution of the probability of each state.

Not as simple as it seems
-------------------------

.. * We are using the Markov assumption to calculate the belief state.
.. * The Markov assumption assumes that past and future data are independent if you know :math:`x_t`.
.. It is a major assumption:
..
.. * If you don't include some dynamics in :math:`x_t`, such as people moving in the environment.
.. * If you have inaccuracies in your state transition or measurement probabilities.
.. * Approximation errors in your belief.

Not every problem has discrete states, so we need to talk about the belief representation. We used the Bayes filter on a problem with discrete states; generally, this is not the case for real-life applications:

* For continuous states, the prediction step involves an integral.
* Usually, we need to approximate the belief state.
* Thus the Bayes filter cannot be applied easily.

.. figure:: ./img/bayes_robot.png
   :align: center

When approximating a belief, we need to keep some things in mind:

* Computational efficiency.
* Accuracy of the approximation.
* Ease of implementation.
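Before moving to continuous states, it helps to see the discrete case end to end. Below is a minimal Bayes filter sketch in Python for the door example of this section; the function name ``bayes_filter`` and the nested-dictionary model layout are our own choices, and it can serve as a starting point for the activity above:

```python
def bayes_filter(belief, u, z, transition, measurement):
    """One discrete Bayes filter step: prediction, then correction."""
    # Prediction: incorporate the control action u (sum replaces the integral).
    predicted = {
        x: sum(transition[u][x_prev][x] * belief[x_prev] for x_prev in belief)
        for x in belief
    }
    # Correction: incorporate the measurement z, then normalize by eta.
    unnormalized = {x: measurement[x][z] * predicted[x] for x in belief}
    eta = 1.0 / sum(unnormalized.values())
    return {x: eta * p for x, p in unnormalized.items()}

# Model of the door example (the tables from this section).
transition = {
    "push":    {"open":   {"open": 1.0, "closed": 0.0},
                "closed": {"open": 0.8, "closed": 0.2}},
    "nothing": {"open":   {"open": 1.0, "closed": 0.0},
                "closed": {"open": 0.0, "closed": 1.0}},
}
measurement = {
    "open":   {"open": 0.6, "closed": 0.4},
    "closed": {"open": 0.2, "closed": 0.8},
}

# Sequence from the worked example: u1 = nothing, z1 = open, u2 = push, z2 = open.
b = {"open": 0.5, "closed": 0.5}
b = bayes_filter(b, "nothing", "open", transition, measurement)  # b(x_1)
print(round(b["open"], 2))  # -> 0.75
b = bayes_filter(b, "push", "open", transition, measurement)     # b(x_2)
print(round(b["open"], 2))  # -> 0.98
```

Running the two steps reproduces the hand calculation: :math:`b(x_1=\text{open}) = 0.75` and :math:`b(x_2=\text{open}) \approx 0.98`.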