Difference between revisions of "User:Boris Tsirelson/sandbox"

From Encyclopedia of Mathematics
Line 1: Line 1:
In [[probability theory]], a '''standard probability space''' (also called Lebesgue-Rokhlin probability space or just <!-- link to DISAMBIGUATION PAGE, intentionally; please do not "repair" --> [[Lebesgue space]]; the latter term is ambiguous) is a [[probability space]] satisfying certain assumptions introduced by [[Vladimir Rokhlin (Soviet mathematician)|Vladimir Rokhlin]] in 1940. He showed that the [[unit interval]] endowed with the [[Lebesgue measure]] has important advantages over general probability spaces, and can be used as a probability space for all practical purposes in probability theory. The dimension of the unit interval is not a concern, as was already clear to [[Norbert Wiener]]. He constructed the [[Wiener process]] (also called [[Brownian motion]]) in the form of a [[measurable]] [[map (mathematics)|map]] from the unit interval to the [[function space|space of continuous functions]].
+
In [[probability theory]], a '''standard probability space''' (also called Lebesgue&ndash;Rokhlin probability space or just <!-- link to DISAMBIGUATION PAGE, intentionally; please do not "repair" --> [[Lebesgue space]]; the latter term is ambiguous) is a [[probability space]] satisfying certain assumptions introduced by [[Vladimir Rokhlin (Soviet mathematician)|Vladimir Rokhlin]] in 1940. He showed that the [[unit interval]] endowed with the [[Lebesgue measure]] has important advantages over general probability spaces, and can be used as a probability space for all practical purposes in probability theory. The theory of standard probability spaces was started by [[John von Neumann|von Neumann]] in 1932 and shaped by [[Vladimir Rokhlin (Soviet mathematician)|Vladimir Rokhlin]] in 1940. The dimension of the unit interval is not a concern, as was already clear to [[Norbert Wiener]]. He constructed the [[Wiener process]] (also called [[Brownian motion]]) in the form of a [[measurable]] [[map (mathematics)|map]] from the unit interval to the [[function space|space of continuous functions]].
  
 
== Short history ==
 
== Short history ==
The theory of standard probability spaces was started by [[John von Neumann|von Neumann]] in 1932 [[#Notes|[A]]] and shaped by [[Vladimir Rokhlin (Soviet mathematician)|Vladimir Rokhlin]] in 1940.<ref>Published in short in 1947, in detail in 1949 in Russian and in 1952 in English, reprinted in 1962 {{harv|Rokhlin|1962}}. An unpublished text of 1940 is mentioned in {{harv|Rokhlin|1962|loc=page 2}}. "The theory of Lebesgue spaces in its present form was constructed by V.A. Rokhlin" {{harv|Sinai|1994|loc=page 16}}.</ref> For modernized presentations see {{harv|Haezendonck|1973}}, {{harv|de la Rue|1993}}, {{harv|Itô|1984|loc=Sect. 2.4}} and {{harv|Rudolph|1990|loc=Chapter 2}}.
+
The theory of standard probability spaces was started by [[John von
 +
Neumann|von Neumann]] in 1932<ref>(von Neumann 1932) and (Halmos,
 +
von Neumann 1942) are cited in (Rokhlin 1962, page 2) and (Petersen
 +
1983, page 17).</ref> and shaped by [[Vladimir Rokhlin (Soviet
 +
mathematician)|Vladimir Rokhlin]] in 1940.<ref>Published in short in
 +
1947, in detail in 1949 in Russian and in 1952 in English, reprinted
 +
in 1962 (Rokhlin 1962). An unpublished text of 1940 is mentioned in
 +
(Rokhlin 1962, page 2). "The theory of Lebesgue spaces in its present
 +
form was constructed by V. A. Rokhlin" (Sinai 1994, page 16).</ref>
 +
For modernized presentations see (Haezendonck 1973), (de la Rue 1993),
 +
(Itô 1984, Sect. 2.4) and (Rudolph 1990, Chapter 2).
  
Nowadays standard probability spaces may be (and often are) treated in the framework of [[descriptive set theory]], via [[Borel algebra|standard Borel spaces]], see for example {{harv|Kechris|1995|loc=Sect. 17}}. This approach, natural for experts in descriptive set theory, is based on the [[Borel space#Standard Borel spaces and Kuratowski theorems|isomorphism theorem for standard Borel spaces]] {{harv|Kechris|1995|loc=Theorem (15.6)}} whose proof is very difficult for non-experts in descriptive set theory. The original approach of Rokhlin, based on measure theory, leads to much simpler proofs (since measure theory may neglect [[null set]]s, in contrast to descriptive set theory).
+
Nowadays standard probability spaces may be (and often are) treated in
 +
the framework of [[descriptive set theory]], via [[Borel
 +
algebra|standard Borel spaces]], see for example (Kechris 1995,
 +
Sect. 17). This approach, natural for experts in descriptive set
 +
theory, is based on the [[Borel space#Standard Borel spaces and
 +
Kuratowski theorems|isomorphism theorem for standard Borel spaces]]
 +
(Kechris 1995, Theorem (15.6)) whose proof is very difficult for non-experts in descriptive set theory. The original approach of Rokhlin, based on measure theory, leads to much simpler proofs (since measure theory may neglect [[null set]]s, in contrast to descriptive set theory).
  
Standard probability spaces are used routinely in [[ergodic theory]],<ref>"In this book we will deal exclusively with Lebesgue spaces" {{harv|Petersen|1983|loc=page 17}}.</ref><ref>"Ergodic theory on Lebesgue spaces" is the subtitle of the book {{harv|Rudolph|1990}}.</ref> which cannot be said of probability theory. Some probabilists hold the following opinion: only standard probability spaces are pertinent to probability theory, so it is a pity that standardness is not included in the definition of probability space. Others disagree, however.
+
Standard probability spaces are used routinely in [[ergodic
 +
theory]],<ref>
 +
"In this book we will deal exclusively with Lebesgue spaces" (Petersen 1983, page 17).</ref><ref>
 +
"Ergodic theory on Lebesgue spaces" is the subtitle of the book (Rudolph 1990).
 +
</ref> which cannot be said of probability theory. Some probabilists hold the following opinion: only standard probability spaces are pertinent to probability theory, so it is a pity that standardness is not included in the definition of probability space. Others disagree, however.
  
 
Arguments against standardness:
 
Arguments against standardness:
Line 34: Line 54:
 
A probability space is '''standard''', if it is isomorphic <math>\textstyle \operatorname{mod} \, 0 </math> to an interval with Lebesgue measure, a finite or countable set of atoms, or a combination (disjoint union) of both.
 
A probability space is '''standard''', if it is isomorphic <math>\textstyle \operatorname{mod} \, 0 </math> to an interval with Lebesgue measure, a finite or countable set of atoms, or a combination (disjoint union) of both.
  
See {{harv|Rokhlin|1962|loc=Sect. 2.4 (p. 20)}}, {{harv|Haezendonck|1973|loc=Proposition 6 (p. 249) and Remark 2 (p. 250)}}, and {{harv|de la Rue|1993|loc=Theorem 4-3}}. See also {{harv|Kechris|1995|loc=Sect. 17.F}}, and {{harv|Itô|1984|loc=especially Sect. 2.4 and Exercise 3.1(v)}}. In {{harv|Petersen|1983|loc=Definition 4.5 on page 16}} the measure is assumed finite, not necessarily probabilistic. In {{harv|Sinai|1994|loc=Definition 1 on page 16}} atoms are not allowed.
+
See (Rokhlin 1962, Sect. 2.4 (p. 20)), (Haezendonck 1973, Proposition
 +
6 (p. 249) and Remark 2 (p. 250)), and (de la Rue 1993, Theorem
 +
4-3). See also (Kechris 1995, Sect. 17.F), and (Itô 1984, especially
 +
Sect. 2.4 and Exercise 3.1(v)). In (Petersen 1983, Definition 4.5 on
 +
page 16) the measure is assumed finite, not necessarily
 +
probabilistic. In (Sinai 1994, Definition 1 on page 16) atoms are not allowed.
  
 
== Examples of non-standard probability spaces ==
 
== Examples of non-standard probability spaces ==
Line 40: Line 65:
 
The space of all functions <math>\textstyle f : \mathbb{R} \to \mathbb{R} </math> may be thought of as the product <math>\textstyle \mathbb{R}^\mathbb{R} </math> of a continuum of copies of the real line <math>\textstyle \mathbb{R} </math>. One may endow <math>\textstyle \mathbb{R} </math> with a probability measure, say, the [[standard normal distribution]] <math>\textstyle \gamma = N(0,1) </math>, and treat the space of functions as the product <math>\textstyle (\mathbb{R},\gamma)^\mathbb{R} </math> of a continuum of identical probability spaces <math>\textstyle (\mathbb{R},\gamma) </math>. The [[product measure]] <math>\textstyle \gamma^\mathbb{R} </math> is a probability measure on <math>\textstyle \mathbb{R}^\mathbb{R} </math>. Many non-experts are inclined to believe that <math>\textstyle \gamma^\mathbb{R} </math> describes the so-called [[white noise]].
 
The space of all functions <math>\textstyle f : \mathbb{R} \to \mathbb{R} </math> may be thought of as the product <math>\textstyle \mathbb{R}^\mathbb{R} </math> of a continuum of copies of the real line <math>\textstyle \mathbb{R} </math>. One may endow <math>\textstyle \mathbb{R} </math> with a probability measure, say, the [[standard normal distribution]] <math>\textstyle \gamma = N(0,1) </math>, and treat the space of functions as the product <math>\textstyle (\mathbb{R},\gamma)^\mathbb{R} </math> of a continuum of identical probability spaces <math>\textstyle (\mathbb{R},\gamma) </math>. The [[product measure]] <math>\textstyle \gamma^\mathbb{R} </math> is a probability measure on <math>\textstyle \mathbb{R}^\mathbb{R} </math>. Many non-experts are inclined to believe that <math>\textstyle \gamma^\mathbb{R} </math> describes the so-called [[white noise]].
  
However, it does not. For the white noise, its integral from <math> 0 </math> to <math> 1 </math> should be a random variable distributed <math>\textstyle N(0,1) </math>. In contrast, the integral (from <math> 0 </math> to <math> 1 </math>) of <math>\textstyle f \in \textstyle (\mathbb{R},\gamma)^\mathbb{R} </math> is undefined. Even worse, <math>\textstyle f </math> fails to be [[almost surely]] measurable. Still worse, the probability of <math>\textstyle f </math> being measurable is undefined. And the worst thing: if <math>\textstyle X </math> is a random variable distributed (say) uniformly on <math>\textstyle (0,1) </math> and independent of <math>\textstyle f </math>, then <math>\textstyle f(X) </math> is not a random variable at all! (It lacks measurability.)
+
However, it does not. For the white noise, its integral from 0 to 1 should be a random variable distributed ''N''(0,&nbsp;1). In contrast, the integral (from 0 to 1) of <math>\textstyle f \in \textstyle (\mathbb{R},\gamma)^\mathbb{R} </math> is undefined. Even worse, ''&fnof;'' fails to be [[almost surely]] measurable. Still worse, the probability of ''&fnof;'' being measurable is undefined. And the worst thing: if ''X'' is a random variable distributed (say) uniformly on (0,&nbsp;1) and independent of ''&fnof;'', then ''&fnof;''(''X'') is not a random variable at all! (It lacks measurability.)
  
 
=== A perforated interval ===
 
=== A perforated interval ===
Let <math>\textstyle Z \subset (0,1) </math> be a set whose [[inner measure|inner]] Lebesgue measure is equal to <math> 0 </math> but whose [[outer measure|outer]] Lebesgue measure is equal to <math> 1 </math> (thus, <math>\textstyle Z </math> is [[nonmeasurable]] to the extreme). There exists a probability measure <math>\textstyle m </math> on <math>\textstyle Z </math> such that <math>\textstyle m(Z \cap A) = \text{mes} (A) </math> for every Lebesgue measurable <math>\textstyle A \subset (0,1) </math>. (Here <math>\textstyle \text{mes}</math> is the Lebesgue measure.) Events and random variables on the probability space <math>\textstyle (Z,m) </math> (treated <math>\textstyle \operatorname{mod} \, 0 </math>) are in a natural one-to-one correspondence with events and random variables on the probability space <math>\textstyle ((0,1),\text{mes}) </math>. Many non-experts are inclined to conclude that the probability space <math>\textstyle (Z,m) </math> is as good as <math>\textstyle ((0,1),\text{mes}) </math>.
+
Let <math>\textstyle Z \subset (0,1) </math> be a set whose [[inner measure|inner]] Lebesgue measure is equal to 0 but whose [[outer measure|outer]] Lebesgue measure is equal to 1 (thus, <math>\textstyle Z </math> is [[nonmeasurable]] to the extreme). There exists a probability measure <math>\textstyle m </math> on <math>\textstyle Z </math> such that <math>\textstyle m(Z \cap A) = \text{mes} (A) </math> for every Lebesgue measurable <math>\textstyle A \subset (0,1) </math>. (Here <math>\textstyle \text{mes}</math> is the Lebesgue measure.) Events and random variables on the probability space <math>\textstyle (Z,m) </math> (treated <math>\textstyle \operatorname{mod} \, 0 </math>) are in a natural one-to-one correspondence with events and random variables on the probability space <math>\textstyle ((0,1),\text{mes}) </math>. Many non-experts are inclined to conclude that the probability space <math>\textstyle (Z,m) </math> is as good as <math>\textstyle ((0,1),\text{mes}) </math>.
  
 
However, it is not. A random variable <math>\textstyle X </math> defined by <math>\textstyle X(\omega)=\omega </math> is distributed uniformly on <math>\textstyle (0,1) </math>. The conditional measure, given <math>\textstyle X=x </math>, is just a single atom (at <math>\textstyle x</math>), provided that <math>\textstyle ((0,1),\text{mes}) </math> is the underlying probability space. However, if <math>\textstyle (Z,m) </math> is used instead, then the conditional measure does not exist when <math>\textstyle x \notin Z </math>.
 
However, it is not. A random variable <math>\textstyle X </math> defined by <math>\textstyle X(\omega)=\omega </math> is distributed uniformly on <math>\textstyle (0,1) </math>. The conditional measure, given <math>\textstyle X=x </math>, is just a single atom (at <math>\textstyle x</math>), provided that <math>\textstyle ((0,1),\text{mes}) </math> is the underlying probability space. However, if <math>\textstyle (Z,m) </math> is used instead, then the conditional measure does not exist when <math>\textstyle x \notin Z </math>.
Line 49: Line 74:
 
A perforated circle is constructed similarly. Its events and random variables are the same as on the usual circle. The group of rotations acts on them naturally. However, it fails to act on the perforated circle.
 
A perforated circle is constructed similarly. Its events and random variables are the same as on the usual circle. The group of rotations acts on them naturally. However, it fails to act on the perforated circle.
  
See also {{harv|Rudolph|1990|loc=page 17}}.
+
See also (Rudolph 1990, page 17).
  
 
=== A superfluous measurable set ===
 
=== A superfluous measurable set ===
Line 61: Line 86:
 
  0.5 + 0.5 x &\text{for } x \in (0,1) \setminus Z
 
  0.5 + 0.5 x &\text{for } x \in (0,1) \setminus Z
 
\end{cases} </math>
 
\end{cases} </math>
 +
 
is an isomorphism between <math>\textstyle \big( (0,1), \mathcal{F}, m \big) </math> and the perforated interval corresponding to the set
 
is an isomorphism between <math>\textstyle \big( (0,1), \mathcal{F}, m \big) </math> and the perforated interval corresponding to the set
 
: <math>\displaystyle Z_1 = \{ 0.5 x : x \in Z \} \cup \{ 0.5 + 0.5 x : x \in (0,1) \setminus Z \} \, ,</math>
 
: <math>\displaystyle Z_1 = \{ 0.5 x : x \in Z \} \cup \{ 0.5 + 0.5 x : x \in (0,1) \setminus Z \} \, ,</math>
 
another set of inner Lebesgue measure 0 but outer Lebesgue measure 1.
 
another set of inner Lebesgue measure 0 but outer Lebesgue measure 1.
  
See also {{harv|Rudolph|1990|loc=Exercise 2.11 on page 18}}.
+
See also (Rudolph 1990, Exercise 2.11 on page 18).
  
 
== A criterion of standardness ==
 
== A criterion of standardness ==
Line 89: Line 115:
 
* <math> (\Omega,\mathcal{F},P) \,</math> is a standard probability space.
 
* <math> (\Omega,\mathcal{F},P) \,</math> is a standard probability space.
  
See also {{harv|Itô|1984|loc=Sect. 3.1}}.
+
See also (Itô 1984, Sect. 3.1).
  
 
=== A random vector ===
 
=== A random vector ===
Line 98: Line 124:
  
 
=== A sequence of events ===
 
=== A sequence of events ===
In particular, if the random variables <math> X_n \,</math> take on only two values 0 and 1, we deal with a measurable function <math> f : \Omega \to \{0,1\}^\infty \,</math> and a sequence of sets <math> A_1,A_2,\dots \in \mathcal{F}. \,</math> The function <math> f \,</math> is generating if and only if <math> \mathcal{F} \,</math> is the completion of the σ-algebra generated by <math> A_1,A_2,\dots. \,</math>
+
In particular, if the random variables <math> X_n \,</math> take on only two values 0 and 1, we deal with a measurable function <math> f : \Omega \to \{0,1\}^\infty \,</math> and a sequence of sets <math> A_1,A_2,\ldots \in \mathcal{F}. \,</math> The function <math> f \,</math> is generating if and only if <math> \mathcal{F} \,</math> is the completion of the σ-algebra generated by <math> A_1,A_2,\dots. \,</math>
  
In the pioneering work {{harv|Rokhlin|1962}} sequences <math> A_1,A_2,\dots \,</math> that correspond to injective, generating <math> f \,</math> are called ''bases'' of the probability space <math> (\Omega,\mathcal{F},P) \,</math> (see {{harvnb|Rokhlin|1962|loc=Sect. 2.1}}). A basis is called complete mod 0, if <math> f(\Omega) \,</math> is of full measure <math> \mu, \,</math> see {{harv|Rokhlin|1962|loc=Sect. 2.2}}. In the same section Rokhlin proved that if a probability space is complete mod 0 with respect to some basis, then it is complete mod 0 with respect to every other basis, and defined ''Lebesgue spaces'' by this completeness property. See also {{harv|Haezendonck|1973|loc=Prop. 4 and Def. 7}} and {{harv|Rudolph|1990|loc=Sect. 2.3, especially Theorem 2.2}}.
+
In the pioneering work (Rokhlin 1962) sequences <math> A_1,A_2,\ldots
 +
\,</math> that correspond to injective, generating <math> f \,</math>
 +
are called ''bases'' of the probability space <math>
 +
(\Omega,\mathcal{F},P) \,</math> (see (Rokhlin 1962, Sect. 2.1)). A
 +
basis is called complete mod 0, if <math> f(\Omega) \,</math> is of
 +
full measure <math> \mu, \,</math> see (Rokhlin 1962, Sect. 2.2). In
 +
the same section Rokhlin proved that if a probability space is
 +
complete mod 0 with respect to some basis, then it is complete mod 0
 +
with respect to every other basis, and defined ''Lebesgue spaces'' by
 +
this completeness property. See also (Haezendonck 1973, Prop. 4 and
 +
Def. 7) and (Rudolph 1990, Sect. 2.3, especially Theorem 2.2).
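
As an illustration that is not taken from the cited sources, consider the interval [0,1) with Lebesgue measure and let <math> A_n \,</math> be the set of points whose <math> n </math>-th binary digit is 1. The corresponding map <math> f \,</math> sends a point to its sequence of binary digits, and this sequence of events is a basis in the sense above. The Python sketch below computes finitely many digits with exact rational arithmetic; the helper names <code>binary_digits</code> and <code>from_digits</code> are hypothetical and introduced only for this example.

```python
from fractions import Fraction


def binary_digits(omega, n):
    """Return the first n binary digits of omega in [0, 1).

    The k-th digit is the indicator of the event A_k
    (the set of points whose k-th binary digit is 1).
    """
    digits = []
    x = Fraction(omega)
    for _ in range(n):
        x *= 2
        bit = int(x >= 1)  # 1 exactly when omega belongs to A_k
        digits.append(bit)
        x -= bit
    return digits


def from_digits(digits):
    """Return the left endpoint of the dyadic interval fixed by the digits."""
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(digits))


omega = Fraction(5, 8)
bits = binary_digits(omega, 6)
approx = from_digits(bits)
```

Knowing the first <math> n </math> digits locates the point within a dyadic interval of length <math> 2^{-n} </math>, so distinct points eventually receive distinct digits; this is the injectivity of <math> f \,</math> in this special case.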
  
 
=== Additional remarks ===
 
=== Additional remarks ===
 
The four cases treated above are mutually equivalent, and can be united, since the measurable spaces <math> \mathbb{R}, \,</math> <math> \mathbb{R}^n, \,</math> <math> \mathbb{R}^\infty \,</math> and <math> \{0,1\}^\infty \,</math> are mutually isomorphic; they all are [[Borel space#Standard Borel spaces and Kuratowski theorems|standard measurable spaces]] (in other words, standard Borel spaces).
 
The four cases treated above are mutually equivalent, and can be united, since the measurable spaces <math> \mathbb{R}, \,</math> <math> \mathbb{R}^n, \,</math> <math> \mathbb{R}^\infty \,</math> and <math> \{0,1\}^\infty \,</math> are mutually isomorphic; they all are [[Borel space#Standard Borel spaces and Kuratowski theorems|standard measurable spaces]] (in other words, standard Borel spaces).
  
Existence of an injective measurable function from <math>\textstyle (\Omega,\mathcal{F},P) </math> to a standard measurable space <math>\textstyle (X,\Sigma) </math> does not depend on the choice of <math>\textstyle (X,\Sigma). </math> Taking <math>\textstyle (X,\Sigma) =  \{0,1\}^\infty </math> we get the property well-known as being ''countably separated'' (but called ''separable'' in {{harvnb|Itô|1984}}).
+
Existence of an injective measurable function from <math>\textstyle
 +
(\Omega,\mathcal{F},P) </math> to a standard measurable space
 +
<math>\textstyle (X,\Sigma) </math> does not depend on the choice of
 +
<math>\textstyle (X,\Sigma). </math> Taking <math>\textstyle
 +
(X,\Sigma) =  \{0,1\}^\infty </math> we get the property well-known as
 +
being ''countably separated'' (but called ''separable'' in (Itô 1984)).
  
Existence of a generating measurable function from <math>\textstyle (\Omega,\mathcal{F},P) </math> to a standard measurable space <math>\textstyle (X,\Sigma) </math> also does not depend on the choice of <math>\textstyle (X,\Sigma). </math> Taking <math>\textstyle (X,\Sigma) =  \{0,1\}^\infty </math> we get the property well-known as being ''countably generated'' (mod 0), see {{harv|Durrett|1996|loc=Exer. I.5}}.
+
Existence of a generating measurable function from <math>\textstyle
 +
(\Omega,\mathcal{F},P) </math> to a standard measurable space
 +
<math>\textstyle (X,\Sigma) </math> also does not depend on the choice
 +
of <math>\textstyle (X,\Sigma). </math> Taking <math>\textstyle
 +
(X,\Sigma) =  \{0,1\}^\infty </math> we get the property well-known as
 +
being ''countably generated'' (mod 0), see (Durrett 1996, Exer. I.5).
 
{| class="wikitable" style="font-size: 90%; text-align: center; width: auto;"
 
{| class="wikitable" style="font-size: 90%; text-align: center; width: auto;"
 
!                        Probability space
 
!                        Probability space
Line 115: Line 161:
 
|-
 
|-
 
! {{rh}} | Interval with Lebesgue measure
 
! {{rh}} | Interval with Lebesgue measure
| {{yes}}
+
| yes
| {{yes}}
+
| yes
| {{yes}}
+
| yes
 
|-
 
|-
 
! {{rh}} | Naive white noise
 
! {{rh}} | Naive white noise
| {{no}}
+
| no
| {{no}}
+
| no
| {{no}}
+
| no
 
|-
 
|-
 
! {{rh}} | Perforated interval
 
! {{rh}} | Perforated interval
| {{yes}}
+
| yes
| {{yes}}
+
| yes
| {{no}}
+
| no
 
|}
 
|}
  
Every injective measurable function from a ''standard'' probability space to a ''standard'' measurable space is generating. See {{harv|Rokhlin|1962|loc=Sect. 2.5}}, {{harv|Haezendonck|1973|loc=Corollary 2 on page 253}}, {{harv|de la Rue|1993|loc=Theorems 3-4, 3-5}}. This property does not hold for the non-standard probability space dealt with in the subsection "A superfluous measurable set" above.
+
Every injective measurable function from a ''standard'' probability
 +
space to a ''standard'' measurable space is generating. See (Rokhlin
 +
1962, Sect. 2.5), (Haezendonck 1973, Corollary 2 on page 253), (de la
 +
Rue 1993, Theorems 3-4, 3-5). This property does not hold for the
 +
non-standard probability space dealt with in the subsection "A
 +
superfluous measurable set" above.
  
''Caution.'' &nbsp; The property of being countably generated is invariant under mod 0 isomorphisms, but the property of being countably separated is not. In fact, a standard probability space <math>\textstyle (\Omega,\mathcal{F},P) </math> is countably separated if and only if the [[cardinality]] of <math>\textstyle \Omega </math> does not exceed [[cardinality of the continuum|continuum]] (see {{harvnb|Itô|1984|loc=Exer. 3.1(v)}}). A standard probability space may contain a null set of any cardinality, thus, it need not be countably separated. However, it always contains a countably separated subset of full measure.
+
''Caution.'' &nbsp; The property of being countably generated is
 +
invariant under mod 0 isomorphisms, but the property of being
 +
countably separated is not. In fact, a standard probability space
 +
<math>\textstyle (\Omega,\mathcal{F},P) </math> is countably separated
 +
if and only if the [[cardinality]] of <math>\textstyle \Omega </math>
 +
does not exceed [[cardinality of the continuum|continuum]] (see (Itô
 +
1984, Exer. 3.1(v))). A standard probability space may contain a null
 +
set of any cardinality, thus, it need not be countably
 +
separated. However, it always contains a countably separated subset of
 +
full measure.
  
 
== Equivalent definitions ==
 
== Equivalent definitions ==
Line 140: Line 200:
 
'''Definition.''' &nbsp; <math>\textstyle (\Omega,\mathcal{F},P) </math> is standard if it is countably separated, countably generated, and absolutely measurable.
 
'''Definition.''' &nbsp; <math>\textstyle (\Omega,\mathcal{F},P) </math> is standard if it is countably separated, countably generated, and absolutely measurable.
  
See {{harv|Rokhlin|1962|loc=the end of Sect. 2.3}} and {{harv|Haezendonck|1973|loc=Remark 2 on page 248}}. "Absolutely measurable" means: measurable in every countably separated, countably generated probability space containing it.
+
See (Rokhlin 1962, the end of Sect. 2.3) and (Haezendonck 1973, Remark
 +
2 on page 248). "Absolutely measurable" means: measurable in every
 +
countably separated, countably generated probability space containing
 +
it.
  
 
=== Via perfectness ===
 
=== Via perfectness ===
 
'''Definition.''' &nbsp; <math>\textstyle (\Omega,\mathcal{F},P) </math> is standard if it is countably separated and perfect.
 
'''Definition.''' &nbsp; <math>\textstyle (\Omega,\mathcal{F},P) </math> is standard if it is countably separated and perfect.
  
See {{harv|Itô|1984|loc=Sect. 3.1}}. "Perfect" means that for every measurable function from <math>\textstyle (\Omega,\mathcal{F},P) </math> to <math> \mathbb{R} \,</math> the image measure is [[regular measure|regular]]. (Here the image measure is defined on all sets whose inverse images belong to <math>\textstyle \mathcal{F} </math>, irrespective of the Borel structure of <math> \mathbb{R} \,</math>).
+
See (Itô 1984, Sect. 3.1). "Perfect" means that for every measurable function from <math>\textstyle (\Omega,\mathcal{F},P) </math> to <math> \mathbb{R} \,</math> the image measure is [[regular measure|regular]]. (Here the image measure is defined on all sets whose inverse images belong to <math>\textstyle \mathcal{F} </math>, irrespective of the Borel structure of <math> \mathbb{R} \,</math>).
  
 
=== Via topology ===
 
=== Via topology ===
Line 153: Line 216:
 
* for every <math>\textstyle \varepsilon > 0 </math> there exists a compact set <math>\textstyle K </math> in <math>\textstyle (\Omega,\tau) </math> such that <math>\textstyle P(K) \ge 1-\varepsilon. </math>
 
* for every <math>\textstyle \varepsilon > 0 </math> there exists a compact set <math>\textstyle K </math> in <math>\textstyle (\Omega,\tau) </math> such that <math>\textstyle P(K) \ge 1-\varepsilon. </math>
  
See {{harv|de la Rue|1993|loc=Sect. 1}}.
+
See (de la Rue 1993, Sect. 1).
  
 
== Verifying the standardness ==
 
== Verifying the standardness ==
 
Every probability distribution on the space <math>\textstyle \mathbb{R}^n </math> turns it into a standard probability space. (Here, a probability distribution means a probability measure defined initially on the [[Borel sigma-algebra]] and completed.)
 
Every probability distribution on the space <math>\textstyle \mathbb{R}^n </math> turns it into a standard probability space. (Here, a probability distribution means a probability measure defined initially on the [[Borel sigma-algebra]] and completed.)
  
The same holds on every [[Polish space]], see {{harv|Rokhlin|1962|loc=Sect. 2.7 (p. 24)}}, {{harv|Haezendonck|1973|loc=Example 1 (p. 248)}}, {{harv|de la Rue|1993|loc=Theorem 2-3}}, and {{harv|Itô|1984|loc=Theorem 2.4.1}}.
+
The same holds on every [[Polish space]], see (Rokhlin 1962, Sect. 2.7
 +
(p. 24)), (Haezendonck 1973, Example 1 (p. 248)), (de la Rue 1993,
 +
Theorem 2-3), and (Itô 1984, Theorem 2.4.1).
  
 
For example, the Wiener measure turns the Polish space <math>\textstyle C[0,\infty) </math> (of all continuous functions <math>\textstyle [0,\infty) \to \mathbb{R}, </math> endowed with the [[topological space|topology]] of [[local uniform convergence]]) into a standard probability space.
 
For example, the Wiener measure turns the Polish space <math>\textstyle C[0,\infty) </math> (of all continuous functions <math>\textstyle [0,\infty) \to \mathbb{R}, </math> endowed with the [[topological space|topology]] of [[local uniform convergence]]) into a standard probability space.
Line 168: Line 233:
 
The [[product measure|product]] of two standard probability spaces is a standard probability space.
 
The [[product measure|product]] of two standard probability spaces is a standard probability space.
  
The same holds for the product of countably many spaces, see {{harv|Rokhlin|1962|loc=Sect. 3.4}}, {{harv|Haezendonck|1973|loc=Proposition 12}}, and {{harv|Itô|1984|loc=Theorem 2.4.3}}.
+
The same holds for the product of countably many spaces, see (Rokhlin
 +
1962, Sect. 3.4), (Haezendonck 1973, Proposition 12), and (Itô 1984, Theorem 2.4.3).
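
For the simplest case, the product of two copies of the unit interval, a well-known way to see standardness is to interleave binary digits: the pair (0.x1x2..., 0.y1y2...) is sent to 0.x1y1x2y2..., which gives an isomorphism mod 0 between the unit square and the unit interval. The Python sketch below is only a finite-precision illustration of that map, not the argument used in the cited sources; the function names are hypothetical.

```python
from fractions import Fraction


def binary_digits(x, n):
    """Return the first n binary digits of x in [0, 1)."""
    digits = []
    x = Fraction(x)
    for _ in range(n):
        x *= 2
        bit = int(x >= 1)
        digits.append(bit)
        x -= bit
    return digits


def interleave(x, y, n):
    """Map (x, y) to the number whose binary digits are x1, y1, x2, y2, ...

    Only the first n digits of each coordinate are used.
    """
    merged = []
    for bx, by in zip(binary_digits(x, n), binary_digits(y, n)):
        merged.extend([bx, by])
    return sum(Fraction(b, 2 ** (k + 1)) for k, b in enumerate(merged))


# The 16 dyadic squares of side 1/4 (indexed by their lower-left corners)
# are sent to the 16 dyadic intervals of length 1/16, one to one.
grid = [Fraction(i, 4) for i in range(4)]
images = sorted(interleave(x, y, 2) for x in grid for y in grid)
```

Each dyadic square of area 1/16 corresponds to exactly one dyadic interval of length 1/16, which is the finite-level form of measure preservation.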
  
A measurable subset of a standard probability space is a standard probability space. It is assumed that the set is not a null set, and is endowed with the conditional measure. See {{harv|Rokhlin|1962|loc=Sect. 2.3 (p. 14)}} and {{harv|Haezendonck|1973|loc=Proposition 5}}.
+
A measurable subset of a standard probability space is a standard
 +
probability space. It is assumed that the set is not a null set, and
 +
is endowed with the conditional measure. See (Rokhlin 1962, Sect. 2.3
 +
(p. 14)) and (Haezendonck 1973, Proposition 5).
  
 
Every [[probability measure]] on a [[Borel space#Standard Borel spaces and Kuratowski theorems|standard Borel space]] turns it into a standard probability space.
 
Every [[probability measure]] on a [[Borel space#Standard Borel spaces and Kuratowski theorems|standard Borel space]] turns it into a standard probability space.
Line 178: Line 247:
 
In the discrete setup, the conditional probability is another probability measure, and the conditional expectation may be treated as the (usual) expectation with respect to the conditional measure, see [[conditional expectation]]. In the non-discrete setup, conditioning is often treated indirectly, since the condition may have probability 0, see [[conditional expectation]]. As a result, a number of well-known facts have special 'conditional' counterparts. For example: linearity of the expectation; Jensen's inequality (see [[conditional expectation]]); [[Hölder's inequality]]; the [[monotone convergence theorem#Lebesgue monotone convergence theorem|monotone convergence theorem]], etc.
 
In the discrete setup, the conditional probability is another probability measure, and the conditional expectation may be treated as the (usual) expectation with respect to the conditional measure, see [[conditional expectation]]. In the non-discrete setup, conditioning is often treated indirectly, since the condition may have probability 0, see [[conditional expectation]]. As a result, a number of well-known facts have special 'conditional' counterparts. For example: linearity of the expectation; Jensen's inequality (see [[conditional expectation]]); [[Hölder's inequality]]; the [[monotone convergence theorem#Lebesgue monotone convergence theorem|monotone convergence theorem]], etc.
  
Given a random variable <math>\textstyle Y </math> on a probability space <math>\textstyle (\Omega,\mathcal{F},P) </math>, it is natural to try constructing a conditional measure <math>\textstyle P_y </math>, that is, the [[conditional distribution]] of <math>\textstyle \omega \in \Omega </math> given <math>\textstyle Y(\omega)=y </math>. In general this is impossible (see {{harvnb|Durrett|1996|loc=Sect. 4.1(c)}}). However, for a ''standard'' probability space <math>\textstyle (\Omega,\mathcal{F},P) </math> this is possible, and well-known as ''canonical system of measures'' (see {{harvnb|Rokhlin|1962|loc=Sect. 3.1}}), which is basically the same as ''conditional probability measures'' (see {{harvnb|Itô|1984|loc=Sect. 3.5}}), ''disintegration of measure'' (see {{harvnb|Kechris|1995|loc=Exercise (17.35)}}), and [[regular conditional probability|''regular conditional probabilities'']] (see {{harvnb|Durrett|1996|loc=Sect. 4.1(c)}}).
+
Given a random variable <math>\textstyle Y </math> on a probability
 +
space <math>\textstyle (\Omega,\mathcal{F},P) </math>, it is natural
 +
to try constructing a conditional measure <math>\textstyle P_y
 +
</math>, that is, the [[conditional distribution]] of <math>\textstyle
 +
\omega \in \Omega </math> given <math>\textstyle Y(\omega)=y
 +
</math>. In general this is impossible (see (Durrett 1996,
 +
Sect. 4.1(c))). However, for a ''standard'' probability space
 +
<math>\textstyle (\Omega,\mathcal{F},P) </math> this is possible, and
 +
well-known as ''canonical system of measures'' (see (Rokhlin 1962,
 +
Sect. 3.1)), which is basically the same as ''conditional probability
 +
measures'' (see (Itô 1984, Sect. 3.5)), [[disintegration
 +
theorem|''disintegration of measure'']] (see (Kechris 1995, Exercise
 +
(17.35))), and [[regular conditional probability|''regular conditional
 +
probabilities'']] (see (Durrett 1996, Sect. 4.1(c))).
  
 
The conditional Jensen's inequality is just the (usual) Jensen's inequality applied to the conditional measure. The same holds for many other facts.
 
The conditional Jensen's inequality is just the (usual) Jensen's inequality applied to the conditional measure. The same holds for many other facts.
  
 
=== Measure preserving transformations ===
 
=== Measure preserving transformations ===
Given two probability spaces <math>\textstyle (\Omega_1,\mathcal{F}_1,P_1) </math>, <math>\textstyle (\Omega_2,\mathcal{F}_2,P_2) </math> and a measure preserving map <math>\textstyle f : \Omega_1 \to \Omega_2 </math>, the image <math>\textstyle f(\Omega_1) </math> need not cover the whole <math>\textstyle    \Omega_2 </math>, it may miss a null set. It may seem that <math>\textstyle P_2(f(\Omega_1)) </math> has to be equal to 1, but it is not so. The outer measure of <math>\textstyle f(\Omega_1) </math> is equal to 1, but the inner measure may differ. However, if the probability spaces <math>\textstyle (\Omega_1,\mathcal{F}_1,P_1) </math>, <math>\textstyle (\Omega_2,\mathcal{F}_2,P_2) </math> are ''standard '' then <math>\textstyle P_2(f(\Omega_1))=1 </math>, see {{harv|de la Rue|1993|loc=Theorem 3-2}}. If <math>\textstyle f </math> is also one-to-one then every <math>\textstyle A \in \mathcal{F}_1 </math> satisfies <math>\textstyle f(A) \in \mathcal{F}_2 </math>, <math>\textstyle P_2(f(A))=P_1(A) </math>. Therefore <math>\textstyle f^{-1} </math> is measurable (and measure preserving). See {{harv|Rokhlin|1962|loc=Sect. 2.5 (p. 20)}} and {{harv|de la Rue|1993|loc=Theorem 3-5}}. See also {{harv|Haezendonck|1973|loc=Proposition 9 (and Remark after it)}}.
+
Given two probability spaces <math>\textstyle
 +
(\Omega_1,\mathcal{F}_1,P_1) </math>, <math>\textstyle
 +
(\Omega_2,\mathcal{F}_2,P_2) </math> and a measure preserving map
 +
<math>\textstyle f : \Omega_1 \to \Omega_2 </math>, the image
 +
<math>\textstyle f(\Omega_1) </math> need not cover the whole
 +
<math>\textstyle    \Omega_2 </math>, it may miss a null set. It may
 +
seem that <math>\textstyle P_2(f(\Omega_1)) </math> has to be equal to
 +
1, but it is not so. The outer measure of <math>\textstyle f(\Omega_1)
 +
</math> is equal to 1, but the inner measure may differ. However, if
 +
the probability spaces <math>\textstyle (\Omega_1,\mathcal{F}_1,P_1)
 +
</math>, <math>\textstyle (\Omega_2,\mathcal{F}_2,P_2) </math> are
 +
''standard '' then <math>\textstyle P_2(f(\Omega_1))=1 </math>, see
 +
(de la Rue 1993, Theorem 3-2). If <math>\textstyle f </math> is also
 +
one-to-one then every <math>\textstyle A \in \mathcal{F}_1 </math>
 +
satisfies <math>\textstyle f(A) \in \mathcal{F}_2 </math>,
 +
<math>\textstyle P_2(f(A))=P_1(A) </math>. Therefore <math>\textstyle
 +
f^{-1} </math> is measurable (and measure preserving). See (Rokhlin
 +
1962, Sect. 2.5 (p. 20)) and (de la Rue 1993, Theorem 3-5). See also
 +
(Haezendonck 1973, Proposition 9 (and Remark after it)).
  
"There is a coherent way to ignore the sets of measure 0 in a measure space" {{harv|Petersen|1983|loc=page 15}}. Striving to get rid of null sets, mathematicians often use equivalence classes of measurable sets or functions. Equivalence classes of measurable subsets of a probability space form a normed [[complete Boolean algebra]] called the ''measure algebra'' (or metric structure). Every measure preserving map <math>\textstyle f : \Omega_1 \to \Omega_2 </math> leads to a homomorphism <math>\textstyle F </math> of measure algebras; basically, <math>\textstyle F(B) = f^{-1}(B) </math> for <math>\textstyle B\in\mathcal{F}_2 </math>.
+
"There is a coherent way to ignore the sets of measure 0 in a measure
 +
space" (Petersen 1983, page 15). Striving to get rid of null sets, mathematicians often use equivalence classes of measurable sets or functions. Equivalence classes of measurable subsets of a probability space form a normed [[complete Boolean algebra]] called the ''measure algebra'' (or metric structure). Every measure preserving map <math>\textstyle f : \Omega_1 \to \Omega_2 </math> leads to a homomorphism <math>\textstyle F </math> of measure algebras; basically, <math>\textstyle F(B) = f^{-1}(B) </math> for <math>\textstyle B\in\mathcal{F}_2 </math>.
  
It may seem that every homomorphism of measure algebras has to correspond to some measure preserving map, but it is not so. However, for ''standard'' probability spaces each <math>\textstyle F </math> corresponds to some <math>\textstyle f </math>. See {{harv|Rokhlin|1962|loc=Sect. 2.6 (p. 23) and 3.2}}, {{harv|Kechris|1995|loc=Sect. 17.F}}, {{harv|Petersen|1983|loc=Theorem 4.7 on page 17}}.
+
It may seem that every homomorphism of measure algebras has to
 +
correspond to some measure preserving map, but it is not so. However,
 +
for ''standard'' probability spaces each <math>\textstyle F </math>
 +
corresponds to some <math>\textstyle f </math>. See (Rokhlin 1962,
 +
Sect. 2.6 (p. 23) and 3.2), (Kechris 1995, Sect. 17.F), (Petersen
 +
1983, Theorem 4.7 on page 17).
  
 
==Notes==
 
==Notes==
 
[A] (von Neumann 1932) and (Halmos & von Neumann 1942) are cited in (Rokhlin 1962, page 2) and (Petersen 1983, page 17).
 
  
 
<references />
 
<references />
Line 197: Line 301:
 
==References==
 
==References==
  
[1] Rokhlin, V. A. (1962), "On the fundamental ideas of measure theory", Translations (American Mathematical Society) Series 1 10: 1–54 . Translated from Russian: Рохлин, В. А. (1949), "Об основных понятиях теории меры", Математический Сборник (Новая Серия) 25 (67): 107–150 .
+
* Rokhlin, V. A. (1962), "On the fundamental ideas of measure theory", ''Translations (American Mathematical Society) Series'' 1 '''10''': 1–54 . Translated from Russian: Рохлин, В. А. (1949), "Об основных понятиях теории меры", ''Математический Сборник (Новая Серия)'' '''25 (67)''': 107–150.
 
+
* von Neumann, J. (1932), "Einige Sätze über messbare Abbildungen", ''Annals of Mathematics'' (3) '''33''': 574–586 .
*{{User:Boris Tsirelson/Citation|last=Rokhlin|first=V.A.|author-link=Vladimir Abramovich Rokhlin|year=1962|title=On the fundamental ideas of measure theory|journal=Translations (American Mathematical Society) Series 1|volume=10|pages=1&ndash;54}}. Translated from Russian: {{citation|last=Рохлин|first=В.А.|year=1949|title=Об основных понятиях теории меры|journal=Математический Сборник (новая серия)|volume=25 (67)|pages=107&ndash;150}}.
+
* Halmos, P. R.; von Neumann, J. (1942), "Operator methods in classical mechanics, II", ''Annals of Mathematics (2)'' (Annals of Mathematics) 43 (2): 332–350, doi:10.2307/1968872, JSTOR 1968872 .
*{{citation|last=von Neumann|first=J.|author-link=John von Neumann|year=1932|title=Einige Sätze über messbare Abbildungen|journal=Annals of Mathematics (2)|volume=33|pages=574&ndash;586}}.
+
* Haezendonck, J. (1973), "Abstract Lebesgue–Rohlin spaces", ''Bulletin de la Societe Mathematique de Belgique'' '''25''': 243–258.
*{{citation|last1=Halmos|first1=P.R.|author1-link=Paul Halmos|last2=von Neumann|first2=J.|author2-link=John von Neumann|year=1942|title=Operator methods in classical mechanics, II|journal=Annals of Mathematics (2)|volume=43|pages=332&ndash;350}}.
+
* de la Rue, T. (1993), "Espaces de Lebesgue", ''Séminaire de Probabilités XXVII'', Lecture Notes in Mathematics, '''1557''', Springer, Berlin, pp. 15–21 .
*{{citation|last=Haezendonck|first=J.|year=1973|title=Abstract Lebesgue-Rohlin spaces|journal=Bulletin de la Societe Mathematique de Belgique|volume=25|pages=243&ndash;258}}.
+
* Petersen, K. (1983), ''Ergodic theory'', Cambridge Univ. Press .
*{{citation|last=de la Rue|first=T.|year=1993|contribution=Espaces de Lebesgue|title=Séminaire de Probabilités XXVII|series=Lecture Notes in Mathematics|volume=1557|pages=15&ndash;21|place=Springer, Berlin}}.
+
* Itô, K. (1984), ''Introduction to probability theory'', Cambridge Univ. Press .
*{{citation|last=Petersen|first=K.|title=Ergodic theory|year=1983|publisher=Cambridge Univ. Press}}.
+
* Rudolph, D. J. (1990), ''Fundamentals of measurable dynamics: Ergodic theory on Lebesgue spaces'', Oxford: Clarendon Press .
*{{citation|last=Itô|first=K.|author-link=Kiyoshi Itō|title=Introduction to probability theory|year=1984|publisher=Cambridge Univ. Press}}.
+
* Sinai, Ya. G. (1994), ''Topics in ergodic theory'', Princeton Univ. Press .
*{{citation|last=Rudolph|first=D.J.|title=Fundamentals of measurable dynamics: Ergodic theory on Lebesgue spaces|year=1990|publication-place=Oxford|publisher=Clarendon Press}}.
+
* Kechris, A. S. (1995), ''Classical descriptive set theory'', Springer .
*{{citation|last=Sinai|first=Ya.G.|author-link=Yakov G. Sinai|title=Topics in ergodic theory|year=1994|publisher=Princeton Univ. Press}}.
+
* Durrett, R. (1996), ''Probability: theory and examples'' (Second ed.) .
*{{citation|last=Kechris|first=A.S.|author-link=Alexander S. Kechris|title=Classical descriptive set theory|year=1995|publisher=Springer}}.
+
* Wiener, N. (1958), ''Nonlinear problems in random theory'', M.I.T. Press .
*{{citation|last=Durrett|first=R.|author-link=Rick Durrett|title=Probability: theory and examples|edition=Second|year=1996}}.
 
*{{citation|last=Wiener|first=N.|author-link=Norbert Wiener|title=Nonlinear problems in random theory|year=1958|publisher=M.I.T. Press}}.
 
 
 
.
 
 
 
[[:Category:Probability theory]]
 
[[:Category:Measure theory]]
 

Revision as of 19:10, 3 December 2011

In probability theory, a standard probability space (called also Lebesgue–Rokhlin probability space or just Lebesgue space; the latter term is ambiguous) is a probability space satisfying certain assumptions introduced by Vladimir Rokhlin in 1940. He showed that the unit interval endowed with the Lebesgue measure has important advantages over general probability spaces, and can be used as a probability space for all practical purposes in probability theory. The theory of standard probability spaces was started by von Neumann in 1932 and shaped by Vladimir Rokhlin in 1940. The dimension of the unit interval is not a concern, which was clear already to Norbert Wiener. He constructed the Wiener process (also called Brownian motion) in the form of a measurable map from the unit interval to the space of continuous functions.

Short history

The theory of standard probability spaces was started by von Neumann in 1932[1] and shaped by Vladimir Rokhlin in 1940.[2] For modernized presentations see (Haezendonck 1973), (de la Rue 1993), (Itô 1984, Sect. 2.4) and (Rudolph 1990, Chapter 2).

Nowadays standard probability spaces may be (and often are) treated in the framework of descriptive set theory, via standard Borel spaces, see for example (Kechris 1995, Sect. 17). This approach, natural for experts in descriptive set theory, is based on the isomorphism theorem for standard Borel spaces (Kechris 1995, Theorem (15.6)) whose proof is very difficult for non-experts in descriptive set theory. The original approach of Rokhlin, based on measure theory, leads to much simpler proofs (since measure theory may neglect null sets, in contrast to descriptive set theory).

Standard probability spaces are used routinely in ergodic theory,[3][4] which cannot be said of probability theory. Some probabilists hold the following opinion: only standard probability spaces are pertinent to probability theory; thus, it is a pity that standardness is not included in the definition of probability space. Others disagree, however.

Arguments against standardness:

  • the definition of standardness is technically demanding;
  • the same about the theorems based on that definition;
  • it is possible (and natural) to build all the probability theory without the standardness;
  • events and random variables are essential, while probability spaces are auxiliary and should not be taken too seriously.

Arguments in favour of standardness:

  • conditioning is easy and natural on standard probability spaces, otherwise it becomes obscure;
  • the same for measure-preserving transformations between probability spaces, group actions on a probability space, etc.;
  • ergodic theory uses standard probability spaces routinely and successfully;
  • being unable to eliminate these (auxiliary) probability spaces, we should make them as useful as possible.

Definition

One of several well-known equivalent definitions of the standardness is given below, after some preparations. All probability spaces are assumed to be complete.

Isomorphism

An isomorphism between two probability spaces \(\textstyle (\Omega_1,\mathcal{F}_1,P_1) \), \(\textstyle (\Omega_2,\mathcal{F}_2,P_2) \) is an invertible map \(\textstyle f : \Omega_1 \to \Omega_2 \) such that \(\textstyle f \) and \(\textstyle f^{-1} \) both are (measurable and) measure preserving maps.

Two probability spaces are isomorphic, if there exists an isomorphism between them.

Isomorphism modulo zero

Two probability spaces \(\textstyle (\Omega_1,\mathcal{F}_1,P_1) \), \(\textstyle (\Omega_2,\mathcal{F}_2,P_2) \) are isomorphic \(\textstyle \operatorname{mod} \, 0 \), if there exist null sets \(\textstyle A_1 \subset \Omega_1 \), \(\textstyle A_2 \subset \Omega_2 \) such that the probability spaces \(\textstyle \Omega_1 \setminus A_1 \), \(\textstyle \Omega_2 \setminus A_2 \) are isomorphic (being endowed naturally with sigma-fields and probability measures).
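A simple illustration (an example added here, not taken from the cited sources): the closed interval \(\textstyle [0,1] \) and the open interval \(\textstyle (0,1) \), both with Lebesgue measure, are isomorphic \(\textstyle \operatorname{mod} \, 0 \). One possible choice of null sets and of the map is \[\displaystyle A_1 = \{0,1\} \subset [0,1], \qquad A_2 = \emptyset \subset (0,1), \qquad f(x) = x \;\text{ for } x \in [0,1] \setminus A_1 = (0,1) \setminus A_2 . \] The identity map is invertible, and both it and its inverse are measurable and measure preserving, since the two-point set \(\textstyle A_1 \) is a null set.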

Standard probability space

A probability space is standard, if it is isomorphic \(\textstyle \operatorname{mod} \, 0 \) to an interval with Lebesgue measure, a finite or countable set of atoms, or a combination (disjoint union) of both.

See (Rokhlin 1962, Sect. 2.4 (p. 20)), (Haezendonck 1973, Proposition 6 (p. 249) and Remark 2 (p. 250)), and (de la Rue 1993, Theorem 4-3). See also (Kechris 1995, Sect. 17.F), and (Itô 1984, especially Sect. 2.4 and Exercise 3.1(v)). In (Petersen 1983, Definition 4.5 on page 16) the measure is assumed finite, not necessarily probabilistic. In (Sinai 1994, Definition 1 on page 16) atoms are not allowed.
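The definition above can be spelled out as follows. This is only a sketch; the symbols \(\textstyle c \), \(\textstyle a_n \), \(\textstyle p_n \) are notation introduced here for illustration. Up to isomorphism \(\textstyle \operatorname{mod} \, 0 \), a standard probability space consists of an interval \(\textstyle (0,c) \) with Lebesgue measure \(\textstyle \text{mes} \) and finitely or countably many atoms \(\textstyle a_1, a_2, \dots \) of masses \(\textstyle p_1, p_2, \dots > 0 \), where \[\displaystyle c + \sum_n p_n = 1, \qquad P(A) = \text{mes} \big( A \cap (0,c) \big) + \sum_{n \, : \, a_n \in A} p_n . \] The case \(\textstyle c = 1 \) (no atoms) gives the interval with Lebesgue measure, and the case \(\textstyle c = 0 \) gives a purely atomic (discrete) probability space.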

Examples of non-standard probability spaces

A naive white noise

The space of all functions \(\textstyle f : \mathbb{R} \to \mathbb{R} \) may be thought of as the product \(\textstyle \mathbb{R}^\mathbb{R} \) of a continuum of copies of the real line \(\textstyle \mathbb{R} \). One may endow \(\textstyle \mathbb{R} \) with a probability measure, say, the standard normal distribution \(\textstyle \gamma = N(0,1) \), and treat the space of functions as the product \(\textstyle (\mathbb{R},\gamma)^\mathbb{R} \) of a continuum of identical probability spaces \(\textstyle (\mathbb{R},\gamma) \). The product measure \(\textstyle \gamma^\mathbb{R} \) is a probability measure on \(\textstyle \mathbb{R}^\mathbb{R} \). Many non-experts are inclined to believe that \(\textstyle \gamma^\mathbb{R} \) describes the so-called white noise.

However, it does not. For the white noise, its integral from 0 to 1 should be a random variable distributed N(0, 1). In contrast, the integral (from 0 to 1) of \(\textstyle f \in \textstyle (\mathbb{R},\gamma)^\mathbb{R} \) is undefined. Even worse, ƒ fails to be almost surely measurable. Still worse, the probability of ƒ being measurable is undefined. And the worst thing: if X is a random variable distributed (say) uniformly on (0, 1) and independent of ƒ, then ƒ(X) is not a random variable at all! (It lacks measurability.)

A perforated interval

Let \(\textstyle Z \subset (0,1) \) be a set whose inner Lebesgue measure is equal to 0, but whose outer Lebesgue measure is equal to 1 (thus, \(\textstyle Z \) is nonmeasurable in the extreme). There exists a probability measure \(\textstyle m \) on \(\textstyle Z \) such that \(\textstyle m(Z \cap A) = \text{mes} (A) \) for every Lebesgue measurable \(\textstyle A \subset (0,1) \). (Here \(\textstyle \text{mes}\) is the Lebesgue measure.) Events and random variables on the probability space \(\textstyle (Z,m) \) (treated \(\textstyle \operatorname{mod} \, 0 \)) are in a natural one-to-one correspondence with events and random variables on the probability space \(\textstyle ((0,1),\text{mes}) \). Many non-experts are inclined to conclude that the probability space \(\textstyle (Z,m) \) is as good as \(\textstyle ((0,1),\text{mes}) \).

However, it is not. A random variable \(\textstyle X \) defined by \(\textstyle X(\omega)=\omega \) is distributed uniformly on \(\textstyle (0,1) \). The conditional measure, given \(\textstyle X=x \), is just a single atom (at \(\textstyle x\)), provided that \(\textstyle ((0,1),\text{mes}) \) is the underlying probability space. However, if \(\textstyle (Z,m) \) is used instead, then the conditional measure does not exist when \(\textstyle x \notin Z \).

A perforated circle is constructed similarly. Its events and random variables are the same as on the usual circle. The group of rotations acts on them naturally. However, it fails to act on the perforated circle.

See also (Rudolph 1990, page 17).
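That the formula \(\textstyle m(Z \cap A) = \text{mes} (A) \) for the perforated interval is unambiguous can be checked as follows (a short verification added here; \(\textstyle \text{mes}^* \) denotes the outer Lebesgue measure, and \(\textstyle A, A', C \) are auxiliary sets). If \(\textstyle C \subset (0,1) \setminus Z \) is Lebesgue measurable, then \(\textstyle Z \subset (0,1) \setminus C \), hence \[\displaystyle 1 = \text{mes}^* (Z) \le \text{mes} \big( (0,1) \setminus C \big) = 1 - \text{mes} (C), \qquad \text{that is,} \quad \text{mes} (C) = 0 . \] Now, if \(\textstyle Z \cap A = Z \cap A' \) for Lebesgue measurable \(\textstyle A, A' \subset (0,1) \), then \(\textstyle C = A \Delta A' \) is contained in \(\textstyle (0,1) \setminus Z \), therefore \(\textstyle \text{mes} (A \Delta A') = 0 \) and \(\textstyle \text{mes} (A) = \text{mes} (A') \).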

A superfluous measurable set

Let \(\textstyle Z \subset (0,1) \) be as in the previous example. Sets of the form \(\textstyle ( A \cap Z ) \cup ( B \setminus Z ), \) where \(\textstyle A \) and \(\textstyle B \) are arbitrary Lebesgue measurable sets, are a σ-algebra \(\textstyle \mathcal{F}; \) it contains the Lebesgue σ-algebra and \(\textstyle Z. \) The formula \[\displaystyle m \big( ( A \cap Z ) \cup ( B \setminus Z ) \big) = p \, \operatorname{mes} (A) + (1-p) \operatorname{mes} (B) \] gives the general form of a probability measure \(\textstyle m \) on \(\textstyle \big( (0,1), \mathcal{F} \big) \) that extends the Lebesgue measure; here \(\textstyle p \in [0,1] \) is a parameter. To be specific, we choose \(\textstyle p = 0.5. \) Many non-experts are inclined to believe that such an extension of the Lebesgue measure is at least harmless.

However, it is the perforated interval in disguise. The map \[\displaystyle f(x) = \begin{cases} 0.5 x &\text{for } x \in Z, \\ 0.5 + 0.5 x &\text{for } x \in (0,1) \setminus Z \end{cases} \]

is an isomorphism between \(\textstyle \big( (0,1), \mathcal{F}, m \big) \) and the perforated interval corresponding to the set \[\displaystyle Z_1 = \{ 0.5 x : x \in Z \} \cup \{ 0.5 + 0.5 x : x \in (0,1) \setminus Z \} \, ,\] another set of inner Lebesgue measure 0 but outer Lebesgue measure 1.

See also (Rudolph 1990, Exercise 2.11 on page 18).
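That the map \(\textstyle f \) of this subsection is measure preserving can be verified directly. The following is a sketch; the sets \(\textstyle A_C \), \(\textstyle B_C \) are introduced here only for the computation. For a Lebesgue measurable \(\textstyle C \subset (0,1) \) put \[\displaystyle A_C = \{ x \in (0,1) : 0.5 x \in C \}, \qquad B_C = \{ x \in (0,1) : 0.5 + 0.5 x \in C \} . \] Then \(\textstyle f^{-1}(C) = ( A_C \cap Z ) \cup ( B_C \setminus Z ) \), \(\textstyle \operatorname{mes} (A_C) = 2 \operatorname{mes} \big( C \cap (0,0.5) \big) \), \(\textstyle \operatorname{mes} (B_C) = 2 \operatorname{mes} \big( C \cap (0.5,1) \big) \), and for \(\textstyle p = 0.5 \) \[\displaystyle m \big( f^{-1}(C) \big) = 0.5 \cdot 2 \operatorname{mes} \big( C \cap (0,0.5) \big) + 0.5 \cdot 2 \operatorname{mes} \big( C \cap (0.5,1) \big) = \operatorname{mes} (C) , \] which is the measure assigned to \(\textstyle Z_1 \cap C \) in the perforated interval built on \(\textstyle Z_1 \). The map \(\textstyle f \) is one-to-one from \(\textstyle (0,1) \) onto \(\textstyle Z_1 \), since its two branches take values in the disjoint intervals \(\textstyle (0,0.5) \) and \(\textstyle (0.5,1) \).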

A criterion of standardness

Standardness of a given probability space \(\textstyle (\Omega,\mathcal{F},P) \) is equivalent to a certain property of a measurable map \(\textstyle f \) from \(\textstyle (\Omega,\mathcal{F},P) \) to a measurable space \(\textstyle (X,\Sigma). \) Interestingly, the answer (standard, or not) does not depend on the choice of \(\textstyle (X,\Sigma) \) and \(\textstyle f \). This fact is quite useful; one may adapt the choice of \(\textstyle (X,\Sigma) \) and \(\textstyle f \) to the given \(\textstyle (\Omega,\mathcal{F},P). \) There is no need to examine all cases. It may be convenient to examine a random variable \(\textstyle f : \Omega \to \mathbb{R}, \) a random vector \(\textstyle f : \Omega \to \mathbb{R}^n, \) a random sequence \(\textstyle f : \Omega \to \mathbb{R}^\infty, \) or a sequence of events \(\textstyle (A_1,A_2,\dots) \) treated as a sequence of two-valued random variables, \(\textstyle f : \Omega \to \{0,1\}^\infty. \)

Two conditions will be imposed on \(\textstyle f \) (to be injective, and generating). Below it is assumed that such \(\textstyle f \) is given. The question of its existence will be addressed afterwards.

The probability space \(\textstyle (\Omega,\mathcal{F},P) \) is assumed to be complete (otherwise it cannot be standard).

A single random variable

A measurable function \(\textstyle f : \Omega \to \mathbb{R} \) induces a pushforward measure, that is, the probability measure \(\textstyle \mu \) on \(\textstyle \mathbb{R}, \) defined by \[\displaystyle \mu(B) = P \big( f^{-1}(B) \big) \] for Borel sets \(\textstyle B \subset \mathbb{R}. \) (It is nothing but the distribution of the random variable.) The image \(\textstyle f (\Omega) \) is always a set of full outer measure, \[\displaystyle \mu^* \big( f(\Omega) \big) = 1, \] but its inner measure can differ (see the perforated interval above). In other words, \(\textstyle f (\Omega) \) need not be a set of full measure \(\textstyle \mu. \)

A measurable function \(\textstyle f : \Omega \to \mathbb{R} \) is called generating if \(\textstyle \mathcal{F} \) is the completion of the σ-algebra of inverse images \(\textstyle f^{-1}(B), \) where \(\textstyle B \subset \mathbb{R} \) runs over all Borel sets.

Caution.   The following condition is not sufficient for \(\textstyle f \) to be generating: for every \(\textstyle A \in \mathcal{F} \) there exists a Borel set \(\textstyle B \subset \mathbb{R} \) such that \(\textstyle P ( A \Delta f^{-1}(B) ) = 0. \) (\(\textstyle \Delta \) means symmetric difference).

Theorem. Let a measurable function \(\textstyle f : \Omega \to \mathbb{R} \) be injective and generating, then the following two conditions are equivalent:

  • \(\textstyle f (\Omega) \) is of full measure \(\textstyle \mu; \)
  • \( (\Omega,\mathcal{F},P) \,\) is a standard probability space.

See also (Itô 1984, Sect. 3.1).
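As an illustration of the theorem (a sketch added here, under the assumption, which one can check, that the identity map is generating in both cases), take \(\textstyle f(\omega) = \omega \). On the perforated interval \(\textstyle (Z,m) \) this \(\textstyle f : Z \to \mathbb{R} \) is injective, and its distribution is the Lebesgue measure on \(\textstyle (0,1) \): \[\displaystyle \mu(B) = m \big( f^{-1}(B) \big) = m ( Z \cap B ) = \text{mes} \big( B \cap (0,1) \big) . \] The image \(\textstyle f(Z) = Z \) has outer measure 1 but inner measure 0, so it is not of full measure \(\textstyle \mu \), and the theorem shows that \(\textstyle (Z,m) \) is not standard. On \(\textstyle ((0,1),\text{mes}) \), in contrast, the same \(\textstyle f \) gives \(\textstyle f(\Omega) = (0,1) \) and \(\textstyle \mu \big( f(\Omega) \big) = 1 \), in agreement with standardness of the interval.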

A random vector

The same theorem holds for any \( \mathbb{R}^n \,\) (in place of \( \mathbb{R} \,\)). A measurable function \( f : \Omega \to \mathbb{R}^n \,\) may be thought of as a finite sequence of random variables \( X_1,\dots,X_n : \Omega \to \mathbb{R}, \,\) and \( f \,\) is generating if and only if \( \mathcal{F} \,\) is the completion of the σ-algebra generated by \( X_1,\dots,X_n. \,\)

A random sequence

The theorem still holds for the space \( \mathbb{R}^\infty \,\) of infinite sequences. A measurable function \( f : \Omega \to \mathbb{R}^\infty \,\) may be thought of as an infinite sequence of random variables \( X_1,X_2,\dots : \Omega \to \mathbb{R}, \,\) and \( f \,\) is generating if and only if \( \mathcal{F} \,\) is the completion of the σ-algebra generated by \( X_1,X_2,\dots. \,\)

A sequence of events

In particular, if the random variables \( X_n \,\) take on only two values 0 and 1, we deal with a measurable function \( f : \Omega \to \{0,1\}^\infty \,\) and a sequence of sets \( A_1,A_2,\ldots \in \mathcal{F}. \,\) The function \( f \,\) is generating if and only if \( \mathcal{F} \,\) is the completion of the σ-algebra generated by \( A_1,A_2,\dots. \,\)

In the pioneering work (Rokhlin 1962) sequences \( A_1,A_2,\ldots \,\) that correspond to injective, generating \( f \,\) are called bases of the probability space \( (\Omega,\mathcal{F},P) \,\) (see (Rokhlin 1962, Sect. 2.1)). A basis is called complete mod 0, if \( f(\Omega) \,\) is of full measure \( \mu, \,\) see (Rokhlin 1962, Sect. 2.2). In the same section Rokhlin proved that if a probability space is complete mod 0 with respect to some basis, then it is complete mod 0 with respect to every other basis, and defined Lebesgue spaces by this completeness property. See also (Haezendonck 1973, Prop. 4 and Def. 7) and (Rudolph 1990, Sect. 2.3, especially Theorem 2.2).

Additional remarks

The four cases treated above are mutually equivalent, and can be united, since the measurable spaces \( \mathbb{R}, \,\) \( \mathbb{R}^n, \,\) \( \mathbb{R}^\infty \,\) and \( \{0,1\}^\infty \,\) are mutually isomorphic; they all are standard measurable spaces (in other words, standard Borel spaces).

Existence of an injective measurable function from \(\textstyle (\Omega,\mathcal{F},P) \) to a standard measurable space \(\textstyle (X,\Sigma) \) does not depend on the choice of \(\textstyle (X,\Sigma). \) Taking \(\textstyle (X,\Sigma) = \{0,1\}^\infty \) we get the property well-known as being countably separated (but called separable in (Itô 1984)).

Existence of a generating measurable function from \(\textstyle (\Omega,\mathcal{F},P) \) to a standard measurable space \(\textstyle (X,\Sigma) \) also does not depend on the choice of \(\textstyle (X,\Sigma). \) Taking \(\textstyle (X,\Sigma) = \{0,1\}^\infty \) we get the property well-known as being countably generated (mod 0), see (Durrett 1996, Exer. I.5).

Probability space                 Countably separated   Countably generated   Standard
Interval with Lebesgue measure    yes                   yes                   yes
Naive white noise                 no                    no                    no
Perforated interval               yes                   yes                   no

Every injective measurable function from a standard probability space to a standard measurable space is generating. See (Rokhlin 1962, Sect. 2.5), (Haezendonck 1973, Corollary 2 on page 253), (de la Rue 1993, Theorems 3-4, 3-5). This property does not hold for the non-standard probability space dealt with in the subsection "A superfluous measurable set" above.

Caution.   The property of being countably generated is invariant under mod 0 isomorphisms, but the property of being countably separated is not. In fact, a standard probability space \(\textstyle (\Omega,\mathcal{F},P) \) is countably separated if and only if the cardinality of \(\textstyle \Omega \) does not exceed continuum (see (Itô 1984, Exer. 3.1(v))). A standard probability space may contain a null set of any cardinality, thus, it need not be countably separated. However, it always contains a countably separated subset of full measure.

Equivalent definitions

Let \(\textstyle (\Omega,\mathcal{F},P) \) be a complete probability space such that the cardinality of \(\textstyle \Omega \) does not exceed continuum (the general case is reduced to this special case, see the caution above).

Via absolute measurability

Definition.   \(\textstyle (\Omega,\mathcal{F},P) \) is standard if it is countably separated, countably generated, and absolutely measurable.

See (Rokhlin 1962, the end of Sect. 2.3) and (Haezendonck 1973, Remark 2 on page 248). "Absolutely measurable" means: measurable in every countably separated, countably generated probability space containing it.

Via perfectness

Definition.   \(\textstyle (\Omega,\mathcal{F},P) \) is standard if it is countably separated and perfect.

See (Itô 1984, Sect. 3.1). "Perfect" means that for every measurable function from \(\textstyle (\Omega,\mathcal{F},P) \) to \( \mathbb{R} \,\) the image measure is regular. (Here the image measure is defined on all sets whose inverse images belong to \(\textstyle \mathcal{F} \), irrespective of the Borel structure of \( \mathbb{R} \,\)).

Via topology

Definition.   \(\textstyle (\Omega,\mathcal{F},P) \) is standard if there exists a topology \(\textstyle \tau \) on \(\textstyle \Omega \) such that

  • the topological space \(\textstyle (\Omega,\tau) \) is metrizable;
  • \(\textstyle \mathcal{F} \) is the completion of the σ-algebra generated by \(\textstyle \tau \) (that is, by all open sets);
  • for every \(\textstyle \varepsilon > 0 \) there exists a compact set \(\textstyle K \) in \(\textstyle (\Omega,\tau) \) such that \(\textstyle P(K) \ge 1-\varepsilon. \)

See (de la Rue 1993, Sect. 1).

Verifying the standardness

Every probability distribution on the space \(\textstyle \mathbb{R}^n \) turns it into a standard probability space. (Here, a probability distribution means a probability measure defined initially on the Borel sigma-algebra and completed.)

The same holds on every Polish space, see (Rokhlin 1962, Sect. 2.7 (p. 24)), (Haezendonck 1973, Example 1 (p. 248)), (de la Rue 1993, Theorem 2-3), and (Itô 1984, Theorem 2.4.1).

For example, the Wiener measure turns the Polish space \(\textstyle C[0,\infty) \) (of all continuous functions \(\textstyle [0,\infty) \to \mathbb{R}, \) endowed with the topology of local uniform convergence) into a standard probability space.

Another example: for every sequence of random variables, their joint distribution turns the Polish space \(\textstyle \mathbb{R}^\infty \) (of sequences; endowed with the product topology) into a standard probability space.

(Thus, the idea of dimension, very natural for topological spaces, is utterly inappropriate for standard probability spaces.)

The product of two standard probability spaces is a standard probability space.

The same holds for the product of countably many spaces, see (Rokhlin 1962, Sect. 3.4), (Haezendonck 1973, Proposition 12), and (Itô 1984, Theorem 2.4.3).

A measurable subset of a standard probability space is a standard probability space. It is assumed that the set is not a null set, and is endowed with the conditional measure. See (Rokhlin 1962, Sect. 2.3 (p. 14)) and (Haezendonck 1973, Proposition 5).

Every probability measure on a standard Borel space turns it into a standard probability space.

Using the standardness

Regular conditional probabilities

In the discrete setup, the conditional probability is another probability measure, and the conditional expectation may be treated as the (usual) expectation with respect to the conditional measure, see conditional expectation. In the non-discrete setup, conditioning is often treated indirectly, since the condition may have probability 0, see conditional expectation. As a result, a number of well-known facts have special 'conditional' counterparts. For example: linearity of the expectation; Jensen's inequality (see conditional expectation); Hölder's inequality; the monotone convergence theorem, etc.

Given a random variable \(\textstyle Y \) on a probability space \(\textstyle (\Omega,\mathcal{F},P) \), it is natural to try constructing a conditional measure \(\textstyle P_y \), that is, the conditional distribution of \(\textstyle \omega \in \Omega \) given \(\textstyle Y(\omega)=y \). In general this is impossible (see (Durrett 1996, Sect. 4.1(c))). However, for a standard probability space \(\textstyle (\Omega,\mathcal{F},P) \) this is possible, and well known as the canonical system of measures (see (Rokhlin 1962, Sect. 3.1)), which is basically the same as conditional probability measures (see (Itô 1984, Sect. 3.5)), disintegration of measure (see (Kechris 1995, Exercise (17.35))), and regular conditional probabilities (see (Durrett 1996, Sect. 4.1(c))).

The conditional Jensen's inequality is just the (usual) Jensen's inequality applied to the conditional measure. The same holds for many other facts.
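In formulas, this may be sketched as follows. The notation is introduced here for illustration: \(\textstyle \mu \) is the distribution of \(\textstyle Y \), \(\textstyle X \) is an integrable random variable, and \(\textstyle \varphi \) is a convex function such that \(\textstyle \varphi(X) \) is integrable. The conditional measures \(\textstyle P_y \) satisfy, for \(\textstyle \mu \)-almost all \(\textstyle y \), \[\displaystyle P_y \big( \{ \omega : Y(\omega) = y \} \big) = 1, \qquad P(A) = \int P_y (A) \, \mu(dy) \quad \text{for } A \in \mathcal{F}, \qquad E ( X \mid Y = y ) = \int X \, dP_y . \] The usual Jensen's inequality, applied to the probability measure \(\textstyle P_y \), then reads \[\displaystyle \varphi \big( E ( X \mid Y = y ) \big) = \varphi \Big( \int X \, dP_y \Big) \le \int \varphi (X) \, dP_y = E \big( \varphi (X) \mid Y = y \big) , \] which is the conditional Jensen's inequality.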

Measure preserving transformations

Given two probability spaces \(\textstyle (\Omega_1,\mathcal{F}_1,P_1) \), \(\textstyle (\Omega_2,\mathcal{F}_2,P_2) \) and a measure preserving map \(\textstyle f : \Omega_1 \to \Omega_2 \), the image \(\textstyle f(\Omega_1) \) need not cover the whole \(\textstyle \Omega_2 \), it may miss a null set. It may seem that \(\textstyle P_2(f(\Omega_1)) \) has to be equal to 1, but it is not so. The outer measure of \(\textstyle f(\Omega_1) \) is equal to 1, but the inner measure may differ. However, if the probability spaces \(\textstyle (\Omega_1,\mathcal{F}_1,P_1) \), \(\textstyle (\Omega_2,\mathcal{F}_2,P_2) \) are standard then \(\textstyle P_2(f(\Omega_1))=1 \), see (de la Rue 1993, Theorem 3-2). If \(\textstyle f \) is also one-to-one then every \(\textstyle A \in \mathcal{F}_1 \) satisfies \(\textstyle f(A) \in \mathcal{F}_2 \), \(\textstyle P_2(f(A))=P_1(A) \). Therefore \(\textstyle f^{-1} \) is measurable (and measure preserving). See (Rokhlin 1962, Sect. 2.5 (p. 20)) and (de la Rue 1993, Theorem 3-5). See also (Haezendonck 1973, Proposition 9 (and Remark after it)).

"There is a coherent way to ignore the sets of measure 0 in a measure space" (Petersen 1983, page 15). Striving to get rid of null sets, mathematicians often use equivalence classes of measurable sets or functions. Equivalence classes of measurable subsets of a probability space form a normed complete Boolean algebra called the measure algebra (or metric structure). Every measure preserving map \(\textstyle f : \Omega_1 \to \Omega_2 \) leads to a homomorphism \(\textstyle F \) of measure algebras; basically, \(\textstyle F(B) = f^{-1}(B) \) for \(\textstyle B\in\mathcal{F}_2 \).
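
The correspondence \(f \mapsto F\) can be checked by hand on a finite toy example. The following is a minimal sketch, not taken from the cited sources: the spaces, the map `f` and all helper names are hypothetical choices made for illustration. It takes two uniform finite probability spaces and verifies that \(B \mapsto f^{-1}(B)\) preserves measure, unions and complements.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy spaces: uniform measures on {0,...,5} and {0,1,2}.
omega1 = frozenset(range(6))
omega2 = frozenset(range(3))
P1 = {w: Fraction(1, 6) for w in omega1}
P2 = {w: Fraction(1, 3) for w in omega2}


def f(w):
    """Illustrative measure preserving map from omega1 to omega2."""
    return w % 3


def measure(P, A):
    """Probability of the event A under the point masses P."""
    return sum((P[w] for w in A), Fraction(0))


def preimage(B):
    """F(B) = f^{-1}(B), the induced map on events."""
    return frozenset(w for w in omega1 if f(w) in B)


def subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]


events2 = subsets(omega2)

# f is measure preserving: P1(f^{-1}(B)) = P2(B) for every event B.
is_measure_preserving = all(
    measure(P1, preimage(B)) == measure(P2, B) for B in events2
)

# F respects the Boolean operations.
respects_union = all(
    preimage(A | B) == preimage(A) | preimage(B)
    for A in events2 for B in events2
)
respects_complement = all(
    preimage(omega2 - B) == omega1 - preimage(B) for B in events2
)

print(is_measure_preserving, respects_union, respects_complement)
```

On finite spaces there are no non-empty null sets here, so passing to equivalence classes changes nothing; the example only illustrates the algebraic properties of \(F\), not the subtleties that arise for general probability spaces.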

It may seem that every homomorphism of measure algebras has to correspond to some measure preserving map, but it is not so. However, for standard probability spaces each \(\textstyle F \) corresponds to some \(\textstyle f \). See (Rokhlin 1962, Sect. 2.6 (p. 23) and 3.2), (Kechris 1995, Sect. 17.F), (Petersen 1983, Theorem 4.7 on page 17).

Notes

  1. (von Neumann 1932) and (Halmos, von Neumann 1942) are cited in (Rokhlin 1962, page 2) and (Petersen 1983, page 17).
  2. Published in short in 1947, in detail in 1949 in Russian and in 1952 in English, reprinted in 1962 (Rokhlin 1962). An unpublished text of 1940 is mentioned in (Rokhlin 1962, page 2). "The theory of Lebesgue spaces in its present form was constructed by V. A. Rokhlin" (Sinai 1994, page 16).
  3. "In this book we will deal exclusively with Lebesgue spaces" (Petersen 1983, page 17).
  4. "Ergodic theory on Lebesgue spaces" is the subtitle of the book (Rudolph 1990).

References

  • Rokhlin, V. A. (1962), "On the fundamental ideas of measure theory", Translations (American Mathematical Society), Series 1, 10: 1–54. Translated from Russian: Рохлин, В. А. (1949), "Об основных понятиях теории меры", Математический Сборник (Новая Серия) 25 (67): 107–150.
  • von Neumann, J. (1932), "Einige Sätze über messbare Abbildungen", Annals of Mathematics (3) 33: 574–586.
  • Halmos, P. R.; von Neumann, J. (1942), "Operator methods in classical mechanics, II", Annals of Mathematics (2) 43 (2): 332–350, doi:10.2307/1968872, JSTOR 1968872.
  • Haezendonck, J. (1973), "Abstract Lebesgue–Rohlin spaces", Bulletin de la Société Mathématique de Belgique 25: 243–258.
  • de la Rue, T. (1993), "Espaces de Lebesgue", Séminaire de Probabilités XXVII, Lecture Notes in Mathematics, 1557, Springer, Berlin, pp. 15–21.
  • Petersen, K. (1983), Ergodic theory, Cambridge Univ. Press.
  • Itô, K. (1984), Introduction to probability theory, Cambridge Univ. Press.
  • Rudolph, D. J. (1990), Fundamentals of measurable dynamics: Ergodic theory on Lebesgue spaces, Oxford: Clarendon Press.
  • Sinai, Ya. G. (1994), Topics in ergodic theory, Princeton Univ. Press.
  • Kechris, A. S. (1995), Classical descriptive set theory, Springer.
  • Durrett, R. (1996), Probability: theory and examples (Second ed.).
  • Wiener, N. (1958), Nonlinear problems in random theory, M.I.T. Press.
How to Cite This Entry:
Boris Tsirelson/sandbox. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Boris_Tsirelson/sandbox&oldid=19712