Probability with Measure

1.3 Measure

  • Definition 1.3.1 Let \((S, \Sigma )\) be a measurable space. A measure on \((S, \Sigma )\) is a mapping \(m: \Sigma \rightarrow [0, \infty ]\) which satisfies

    • M(i) \(m(\emptyset ) = 0\),

    • M(ii) (\(\sigma \)-additivity) If \((A_{n})_{n\in \N }\) is a sequence of sets where each \(A_{n} \in \Sigma \) and if these sets are mutually disjoint, i.e. \(A_{n} \cap A_{m} = \emptyset \) if \(m \neq n\), then

      \[m\left (\bigcup _{n=1}^{\infty }A_{n}\right ) = \sum _{n=1}^{\infty }m(A_{n}).\]

M(ii) may appear to be rather strong. Our earlier discussion about length led us to \(m(A \cup B) = m(A) + m(B)\), and straightforward induction then extends this to finite additivity: \(m(A_{1} \cup A_{2} \cup \cdots \cup A_{n}) = m(A_{1}) + m(A_{2}) + \cdots + m(A_{n})\). However, if we were to replace M(ii) by this weaker finite additivity condition, we would not have an adequate tool for use in analysis, and our theory would be much less powerful.

The key point here is, of course, limits. Limits are how we rigorously justify that approximations work; consequently we need them if we are to create a theory that will, ultimately, be useful to experimentalists and modellers.

Basic Properties of Measures
  • 1. (Finite additivity) If \(A_{1}, A_{2}, \ldots , A_{r} \in \Sigma \) and are mutually disjoint then

    \[ m(A_{1} \cup A_{2} \cup \cdots \cup A_{r}) = m(A_{1}) + m(A_{2}) + \cdots + m(A_{r}).\]

    To see this define the sequence \((A_{n}^{\prime })\) by \(A_{n}^{\prime } = \left \{ \begin {array}{l l} A_{n} & \mbox {if}~1 \leq n \leq r\\ \emptyset & \mbox {if}~n > r \end {array} \right .\) Then

    \[ m\left (\bigcup _{i=1}^{r}A_{i}\right ) = m\left (\bigcup _{i=1}^{\infty }A_{i}^{\prime }\right ) = \sum _{i=1}^{\infty }m(A_{i}^{\prime }) = \sum _{i=1}^{r}m(A_{i}),\]

    where we used M(ii) and then M(i) to get the last two expressions.

  • 2. If \(A, B \in \Sigma \) with \(B \subseteq A\) and either \(m(A) < \infty \), or \(m(A) = \infty \) but \(m(B) < \infty \), then

    \begin{equation} \label {mdiff} m(A-B) = m(A) - m(B). \end{equation}

    To prove this write the disjoint union \(A = (A-B) \cup B\) and then use the result of (1) (with \(r=2\)).

  • 3. (Monotonicity) If \(A, B \in \Sigma \) with \(B \subseteq A\) then \(m(B) \leq m(A)\).

    If \(m(A) < \infty \) this follows from (1.2) using the fact that \(m(A-B) \geq 0\). If \(m(A) = \infty \), the result is immediate.

  • 4. If \(A, B \in \Sigma \) are arbitrary (i.e. not necessarily disjoint) then

    \begin{equation} \label {munion} m(A \cup B) + m(A \cap B) = m(A) + m(B). \end{equation}

    The proof of this is Problem ?? part (a). Note that if \(m(A \cap B) < \infty \) we have

    \[m(A \cup B) = m(A) + m(B) - m(A \cap B).\]
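The properties above can be checked numerically in the simplest setting, taking \(m\) to be counting measure on a finite set (Example 1 below). A minimal Python sketch, with illustrative sets \(A\), \(B\), \(C\):

```python
# A numerical sanity check of properties (2)-(4), using counting
# measure m(A) = #(A) (Example 1 below); the sets A, B, C are
# illustrative choices.

def m(A):
    """Counting measure on a finite set: m(A) = #(A)."""
    return len(A)

A = {1, 2, 3, 4}
B = {3, 4}                  # B is a subset of A

assert m(A - B) == m(A) - m(B)               # property (2): 2 == 4 - 2
assert m(B) <= m(A)                          # property (3): monotonicity

C = {3, 4, 5}               # C overlaps A without being contained in it
assert m(A | C) + m(A & C) == m(A) + m(C)    # property (4): 5 + 2 == 4 + 3
```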

Now some concepts and definitions. First, let us define the setting that we will work in for all of Chapters 1-??.

  • Definition 1.3.2 A triple \((S, \Sigma , m)\) where \(S\) is a set, \(\Sigma \) is a \(\sigma \)-field on \(S\), and \(m:\Sigma \to [0,\infty ]\) is a measure is called a measure space.

The extended real number \(m(S)\) is called the total mass of \(m\). The measure \(m\) is said to be finite if \(m(S) < \infty \).

We will start to think about probability in Chapter 3. A finite measure is called a probability measure if \(m(S) = 1\). When we have a probability measure, we use a slightly different notation.

We write \(\Omega \) instead of \(S\) and call it a sample space.

We write \(\cal F\) instead of \(\Sigma \). Elements of \(\cal F\) are called events.

We use \(\P \) instead of \(m\).

The triple \((\Omega , {\cal F}, \P )\) is called a probability space.

Examples of Measures
  • 1. Counting Measure

    Let \(S\) be a finite set and take \(\Sigma = {\cal P}(S)\). For each \(A \subseteq S\) define

    \[ m(A) = \#(A)~\mbox {i.e. the number of elements in}~A.\]
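This example can be sketched in Python, building the \(\sigma \)-field \(\Sigma = {\cal P}(S)\) explicitly for a small illustrative set:

```python
from itertools import chain, combinations

# A minimal sketch of counting measure: S is a small illustrative set
# and Sigma = P(S), the power set of S, built explicitly as frozensets.

def powerset(S):
    """All subsets of S, i.e. the sigma-field P(S)."""
    S = list(S)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))]

S = {"a", "b", "c"}
m = {A: len(A) for A in powerset(S)}   # counting measure m(A) = #(A)

assert m[frozenset()] == 0             # M(i): the empty set has measure 0
assert m[frozenset(S)] == 3            # total mass m(S) = #(S)
```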

  • 2. Dirac Measure

    This measure is named after the famous British physicist Paul Dirac (1902-84). Let \((S, \Sigma )\) be an arbitrary measurable space and fix \(x \in S\). The Dirac measure \(\delta _{x}\) at \(x\) is defined by

    \[ \delta _{x}(A) = \left \{\begin {array}{l l} 1 & \mbox {if}~x \in A\\ 0 & \mbox {if}~ x \notin A \end {array} \right .\]

    Note that we can write counting measure in terms of Dirac measures: if \(S\) is finite and \(A \subseteq S\),

    \[ \#(A) = \sum _{x \in S}\delta _{x}(A).\]
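The Dirac measure, and the identity \(\#(A) = \sum _{x \in S}\delta _{x}(A)\), can be sketched as follows; the sets \(S\) and \(A\) are illustrative choices:

```python
# A sketch of the Dirac measure delta_x, and of the identity expressing
# counting measure as a sum of Dirac measures; S and A are illustrative.

def dirac(x):
    """Return the Dirac measure delta_x as a function of sets A."""
    return lambda A: 1 if x in A else 0

S = {1, 2, 3, 4}
A = {2, 4}

assert dirac(2)(A) == 1                          # 2 lies in A
assert dirac(1)(A) == 0                          # 1 does not lie in A
assert sum(dirac(x)(A) for x in S) == len(A)     # counting measure recovered
```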

  • 3. Discrete Probability Measures

    Let \(\Omega \) be a countable set and take \({\cal F} = {\cal P}(\Omega )\). Let \(\{p_{\omega }, \omega \in \Omega \}\) be a set of real numbers which satisfies the conditions

    \[ p_{\omega } \geq 0~\mbox {for all}~\omega \in \Omega ~\mbox {and}~\sum _{\omega \in \Omega }p_{\omega } = 1.\]

    Now define the discrete probability measure \(P\) by

    \[ P(A) = \sum _{\omega \in A}p_{\omega } = \sum _{\omega \in \Omega }p_{\omega }\delta _{\omega }(A),\]

    for each \(A \in {\cal F}\).

    For example if \(\#(\Omega ) = n+1\) and \(0 < p < 1\) we can obtain the binomial distribution as a probability measure by taking \(p_{r} = {n \choose r}p^{r}(1-p)^{n-r}\) for \(r = 0, 1, \ldots , n\).
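The binomial example can be sketched as a discrete probability measure in Python; the values \(n = 5\) and \(p = 0.3\) are illustrative:

```python
from math import comb

# The binomial distribution as a discrete probability measure on
# Omega = {0, 1, ..., n}; n = 5 and p = 0.3 are illustrative choices.

n, p = 5, 0.3
p_omega = {r: comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)}

def P(A):
    """P(A) = sum of p_omega over omega in A."""
    return sum(p_omega[w] for w in A)

assert abs(P(set(range(n + 1))) - 1) < 1e-12    # total mass P(Omega) = 1
assert abs(P({0}) - (1 - p)**n) < 1e-12         # P({0}) = (1-p)^n
```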

  • 4. Measures via Integration

    Let \((S, \Sigma , m)\) be an arbitrary measure space and let \(f:S \rightarrow [0, \infty )\) be a non-negative function. In Chapter 3, we will meet a powerful integration theory that allows us to cook up a new measure \(I_{f}\) from \(m\) and \(f\) (provided that \(f\) is suitably well-behaved, which we will think about in Chapter 2) by the prescription:

    \[ I_{f}(A) = \int _{A}f(x)m(dx),\]

    for all \(A \in \Sigma \).
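In the simplest case, where \(m\) is counting measure on a finite set, the integral reduces to a finite sum, \(I_{f}(A) = \sum _{x \in A}f(x)\), so the construction can be previewed concretely. The set \(S\) and the function \(f\) below are illustrative choices:

```python
# When m is counting measure on a finite set, the integral defining I_f
# reduces to a finite sum: I_f(A) = sum over x in A of f(x).
# The set S and the function f below are illustrative choices.

S = {1, 2, 3, 4, 5}

def f(x):
    return x**2        # a non-negative function on S

def I_f(A):
    """The new measure A -> integral over A of f dm (here a finite sum)."""
    return sum(f(x) for x in A)

assert I_f(set()) == 0                            # M(i) holds for I_f
assert I_f({1, 2}) + I_f({3}) == I_f({1, 2, 3})   # additivity: 5 + 9 == 14
```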