.. _lecture7:

Lecture 7: Information and Entropy
++++++++++++++++++++++++++++++++++

.. note::
   *My greatest concern was what to call it. I thought of calling it
   'information,' but the word was overly used, so I decided to call it
   'uncertainty.' When I discussed it with John von Neumann, he had a better
   idea. Von Neumann told me, 'You should call it entropy, for two reasons.
   In the first place your uncertainty function has been used in statistical
   mechanics under that name, so it already has a name. In the second place,
   and more important, no one really knows what entropy really is, so in a
   debate you will always have the advantage'.*

   -- Claude Shannon, 1971

.. warning::
   This lecture corresponds to Chapter 15 of the textbook.

Summary
-------

.. attention::
   In this lecture, we look at how one can quantify information. We all have a
   fairly intuitive understanding of the amount of information contained in a
   claim and can easily tell, between two claims, which one carries more
   information. For example, take the two statements "I live on Earth." and
   "I live in NY State." It is clear that they do not convey the same amount
   of new knowledge. Claude Shannon realized that the amount of information
   grows with the inverse of the probability of the claim: the less likely an
   event is, the more information you gain when someone tells you it has
   occurred. Formally, this leads to the definition of the information
   :math:`Q` (in units of *bits*) of a claim that has a probability :math:`P`
   of being true:

   .. math:: Q=-k \log P.

   The logarithm is needed because information should be additive: for two
   independent statements, the joint probability is the product of the
   individual probabilities, and the logarithm turns this product into a sum
   of information contents. This leads to the notion of
   :index:`average information`, or :index:`Shannon Entropy`:

   .. math:: S=\langle Q\rangle=\sum_{i} Q_{i} P_{i}=-k \sum_{i} P_{i} \log P_{i}.

   This definition is reminiscent of Gibbs' definition of entropy we saw in
   :ref:`lecture6` (the difference is that :math:`k` is no longer the
   Boltzmann constant). The big leap is that information, since it carries
   entropy, can be considered a physical quantity (Rolf Landauer). After all,
   this is not surprising: in thermodynamics, we defined entropy as a measure
   of the number of microstates a system can be in to realize a given
   macrostate. This uncertainty (that is, lack of knowledge) is certainly
   related to information! Interestingly, this allows us to resolve the issue
   of Maxwell's demon and the irreversibility of the Joule expansion we saw in
   the previous lecture: the demon must lose information, and thus increase
   entropy, while sorting the gas molecules!

   Finally, in this lecture, we look into data compression and discuss a few
   applications of Bayes' theorem for conditional probabilities:

   .. math:: P(A \mid B)=\frac{P(B \mid A) \cdot P(A)}{P(B)},

   where :math:`P(A)` and :math:`P(B)` are the independent probabilities of
   :math:`A` and :math:`B`, :math:`P(A \mid B)` is the probability of
   :math:`A` given that :math:`B` is true, and likewise :math:`P(B \mid A)`
   is the probability of :math:`B` given that :math:`A` is true.
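To make these formulas concrete, here is a minimal Python sketch that takes
:math:`k=1` and base-2 logarithms, so that information is measured in bits.
The helper names (``information``, ``entropy``, ``bayes_posterior``) and the
probabilities in the Bayes' theorem example are chosen purely for
illustration.

.. code-block:: python

   import math

   def information(p, base=2):
       """Shannon information Q = k log(1/p) of a claim with probability p (k = 1)."""
       return math.log(1.0 / p, base)

   def entropy(probs, base=2):
       """Shannon entropy S = -sum_i p_i log(p_i) of a discrete distribution."""
       return -sum(p * math.log(p, base) for p in probs if p > 0)

   def bayes_posterior(p_b_given_a, p_a, p_b):
       """Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)."""
       return p_b_given_a * p_a / p_b

   # "I live on Earth" is certain (P = 1) and carries no information;
   # a claim with P = 1/4 carries 2 bits.
   print(information(1.0))     # 0.0 bits
   print(information(0.25))    # 2.0 bits

   # A fair coin toss has 1 bit of entropy; a biased coin has less.
   print(entropy([0.5, 0.5]))  # 1.0 bit
   print(entropy([0.9, 0.1]))  # about 0.47 bits

   # Arbitrary illustrative numbers: P(B|A) = 0.8, P(A) = 0.3, P(B) = 0.5.
   print(bayes_posterior(0.8, 0.3, 0.5))  # ~0.48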
Learning Material
-----------------

Copy of Slides
~~~~~~~~~~~~~~

The slides for Lecture 7 are available in pdf format here:
:download:`pdf <_pdfs/slides/lecture7.pdf>`

Screencast
~~~~~~~~~~

This lecture is available as a YouTube recording:
`chapter 15 <https://www.youtube.com/embed/LJhwFw4fRkA>`_.

Test your knowledge
-------------------

1. Consider the following two statements: (a) students who graduate with a
   bachelor in physics do so by passing IQM, and (b) students who graduate
   with a bachelor in applied physics do so by passing IQM. Statement (a)
   occurs with probability :math:`P=1` and statement (b) occurs with a
   probability :math:`P=1/4`. What is the Shannon information of each
   statement, in bits (we use the :math:`\log_2` basis and suppose
   :math:`k=1`)?

   A. :math:`Q_a=1` and :math:`Q_b=2` bits. There is more information in statement (b).

   B. :math:`Q_a=0` and :math:`Q_b=2` bits. There is more information in statement (b).

   C. :math:`Q_a=0` and :math:`Q_b=-2` bits. There is more information in statement (a).

   D. :math:`Q_a=0` and :math:`Q_b=-2` bits. There is more information in statement (b).

2. Mrs. Bonnie T. has three kittens. Two of them are male. What is the
   probability that the third one is a female? Assume each kitten's sex is
   independent and equally likely.

   A. 75\%.

   B. 50\%.

   C. 37.5\%.

   D. 25\%.

3. Mrs. Bonnie T. has three kittens. The two tallest ones are male. What is
   the probability that the third one is a female? Assume each kitten's sex is
   independent and equally likely.

   A. 75\%.

   B. 50\%.

   C. 37.5\%.

   D. 25\%.

4. The less you know about a system, the greater its entropy.

   A. True.

   B. False.

   C. It depends.

.. hint::
   Find the answer keys on this page: :ref:`answerkeys`. Don't cheat! Try
   solving the problems on your own first!