************************ :math:`O(1)` Hash Tables ************************ In some application, we want to guarantee a :math:`O(1)` worst-case cost. Suppose that all :math:`N` items are known in advance. * If a separate chaining implementation could guarantee that each list had at most a constant number of items we would be done. * We know that as we make more lists, the lists will be shorter on average. * If we have enough lists, then we could expect no collisions at all. This approach creates two problems: 1. The number of lists might be large. 2. We could still have collisions. Perfect Hashing =============== Perfect hashing was theorized to solve the second problem. Suppose that we have :math:`TableSize = M`, with :math:`M` sufficiently large to guarantee that the probability fo collision is at least :math:`\frac{1}{2}`. When there is a collision, we clear out the table and try with another hash function. If there is another collision, we repeat the process with a third hash function. Etc... **The expected number of trials will be at most 2** (since the probability of success is :math:`\frac{1}{2}`). How large :math:`M` needs to be? -------------------------------- Remember that :math:`M` is the number of lists/cells in the hash table. We should have a table of size :math:`M=N^2` to guarantee a probability of :math:`\frac{1}{2}`. .. admonition:: Theorem :class: important If :math:`N` balls are placed into :math:`M=N^2` bins, the probability that no bin has more than one ball is less than :math:`\frac{1}{2}`. .. admonition:: Proof :class: note * We call it a collision when two balls :math:`i` and :math:`j` are placed in the same bin. * Let :math:`C_{i,j}` the expected number of collisions produced by any two balls :math:`i` and :math:`j`. * The probability of any two specified balls to collide is :math:`\frac{1}{M}`. * Thus :math:`C_{i,j} = \frac{1}{M}`. * The expected number of collisions in the entire table is :math:`\sum_{(i,j), i