Independence

in probability theory

One of the most important notions in probability theory. Other terms occasionally used are statistical independence, stochastic independence. The assumption that the events, trials and random variables being considered are independent has long been a common premise, from the very beginnings of mathematical probability theory.

Independence of two random events is defined as follows. Let $ A $ and $ B $ be two random events, and let $ {\mathsf P} ( A) $ and $ {\mathsf P} ( B) $ be their probabilities. The conditional probability of $ B $ given that $ A $ has occurred is defined by

$$ {\mathsf P} ( B \mid A) = \ \frac{ {\mathsf P} ( A \cap B) }{ {\mathsf P} ( A) } , $$

where $ {\mathsf P} ( A \cap B) $ is the probability of the joint occurrence of $ A $ and $ B $. The events $ A $ and $ B $ are said to be independent if

$$ \tag{1 } {\mathsf P} ( A \cap B) = {\mathsf P} ( A) {\mathsf P} ( B). $$

If $ {\mathsf P} ( A) > 0 $ this is equivalent to

$$ \tag{2 } {\mathsf P} ( B \mid A) = {\mathsf P} ( B). $$

The meaning of this definition can be explained as follows. On the assumption that a large number $ N $ of trials is being carried out, and assuming for the moment that (2) refers to relative frequencies rather than probabilities, one may conclude that the relative frequency of the event $ B $ in all $ N $ trials must be equal to the relative frequency of its occurrences in the trials in which $ A $ also occurs. Thus, independence of two events indicates that there is no discernible connection between the occurrence of the one event and that of the other. For example, the event "a randomly-selected person has a family name beginning, say, with the letter A" and the event "the same person will win the grand prize in the next play of the state lottery" are independent.
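
This frequency interpretation is easy to examine numerically. The following Python sketch simulates a large number of trials with two events that are independent by construction (they depend on different dice; the choice of events is purely illustrative) and compares the relative frequency of $ B $ among all trials with its relative frequency among the trials in which $ A $ occurs.

```python
import random

random.seed(0)
N = 200_000  # number of simulated trials

count_A = count_B = count_A_and_B = 0
for _ in range(N):
    die1 = random.randint(1, 6)   # event A depends only on die1
    die2 = random.randint(1, 6)   # event B depends only on die2
    A = (die1 == 6)
    B = (die2 % 2 == 0)
    count_A += A
    count_B += B
    count_A_and_B += A and B

freq_B = count_B / N                      # relative frequency of B in all trials
freq_B_given_A = count_A_and_B / count_A  # relative frequency of B among trials where A occurred

print(f"freq(B)         = {freq_B:.4f}")
print(f"freq(B | A)     = {freq_B_given_A:.4f}")
print(f"freq(A)*freq(B) = {count_A / N * freq_B:.4f}")
print(f"freq(A and B)   = {count_A_and_B / N:.4f}")
```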

The definition of independence of $ n $ random events $ A _ {1} \dots A _ {n} $, $ n > 2 $, may be presented in several equivalent versions. According to one version, these events are said to be independent if, for any $ m $, $ 2 \leq m \leq n $, and for any $ m $ pairwise distinct natural numbers $ k _ {1} \dots k _ {m} \leq n $, the probability of the joint occurrence of the events $ A _ {k _ {1} } \dots A _ {k _ {m} } $ is equal to the product of their probabilities:

$$ \tag{3 } {\mathsf P} ( A _ {k _ {1} } \cap \dots \cap A _ {k _ {m} } ) = \ {\mathsf P} ( A _ {k _ {1} } ) \dots {\mathsf P} ( A _ {k _ {m} } ). $$

Hence, as before, one may conclude that the conditional probability of each event given the occurrence of any combination of the others is equal to its "unconditional" probability.

Sometimes, besides the independence (mutual independence) of the events $ A _ {1} \dots A _ {n} $, one also considers the notion known as pairwise independence: Any two of these events, say $ A _ {i} $ and $ A _ {j} $, $ i \neq j $, are independent. Independence of events implies pairwise independence, but the converse need not be true.
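
A standard counterexample, often attributed to S.N. Bernstein, shows that pairwise independence does not imply mutual independence. The sketch below verifies this by direct enumeration on a four-point sample space (two fair coin tosses, chosen only for illustration).

```python
from itertools import product
from fractions import Fraction

# Sample space: two independent fair coin tosses, each outcome with probability 1/4.
omega = list(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega}

A = {w for w in omega if w[0] == "H"}    # first toss is heads
B = {w for w in omega if w[1] == "H"}    # second toss is heads
C = {w for w in omega if w[0] == w[1]}   # the two tosses agree

def prob(event):
    return sum(P[w] for w in event)

# Pairwise independence: P(X ∩ Y) = P(X) P(Y) for every pair of the three events.
for X, Y in [(A, B), (A, C), (B, C)]:
    assert prob(X & Y) == prob(X) * prob(Y)

# But they are not mutually independent: P(A ∩ B ∩ C) = 1/4, while the product is 1/8.
print(prob(A & B & C), prob(A) * prob(B) * prob(C))  # prints 1/4 and 1/8
```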

Prior to the axiomatic construction of probability theory, the independence concept was not interpreted in an adequately clear-cut fashion. In the words of A.A. Markov ([1], p. 24): "The concept of independent events may be considered quite clear in known theoretical problems; in other problems, however, the concept may of course become quite obscured, in keeping with the obscurity of the fundamental notion of probability".

In the context of an axiomatic approach, the most natural definition of independence is the following. Let $ ( \Omega , {\mathcal A} , {\mathsf P} ) $ be some probability space, where $ \Omega $ is the set of elementary events, $ {\mathcal A} $ a $ \sigma $-algebra of events and $ {\mathsf P} $ a probability measure defined on $ {\mathcal A} $. One first defines independence of classes of events (the only classes $ {\mathcal B} $ considered here will be sub-$ \sigma $-algebras of $ {\mathcal A} $). Classes $ {\mathcal B} _ {1} \dots {\mathcal B} _ {n} $ are said to be independent (relative to $ {\mathsf P} $) if any events $ A _ {1} \in {\mathcal B} _ {1} \dots A _ {n} \in {\mathcal B} _ {n} $ are independent in the sense of (3); the classes $ {\mathcal B} _ {t} $ ($ t \in T $, where $ T $ is an arbitrary index set) are said to be independent if, for any integer $ n \geq 2 $ and any pairwise distinct $ t _ {1} \dots t _ {n} \in T $, the classes $ {\mathcal B} _ {t _ {1} } \dots {\mathcal B} _ {t _ {n} } $ are independent. Independence of events $ A _ {k} $, $ 1 \leq k \leq n $, is equivalent to independence of the classes

$$ {\mathcal B} _ {k} = \{ \emptyset , A _ {k} , \overline{A _ {k} } , \Omega \} . $$

In the case of trials, independence is precisely the independence of the $ \sigma $-algebras generated by the trials.
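
As an illustration of the equivalence just stated, the following sketch checks on a small finite probability space (chosen purely for illustration) that independence of two events $ A $ and $ B $ carries over to every pair of events taken from the two generated classes $ \{ \emptyset , A , \overline{A} , \Omega \} $ and $ \{ \emptyset , B , \overline{B} , \Omega \} $.

```python
from fractions import Fraction
from itertools import product

# Finite probability space chosen for illustration: two fair coin tosses.
omega = set(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega}

def prob(event):
    return sum(P[w] for w in event)

A = frozenset(w for w in omega if w[0] == "H")   # first toss heads
B = frozenset(w for w in omega if w[1] == "H")   # second toss heads
assert prob(A & B) == prob(A) * prob(B)          # A and B are independent

def generated_class(E):
    """The class {∅, E, complement of E, Ω} generated by the event E."""
    return [frozenset(), E, frozenset(omega - E), frozenset(omega)]

# Independence of A and B extends to every pair taken from the two classes.
for E in generated_class(A):
    for F in generated_class(B):
        assert prob(E & F) == prob(E) * prob(F)
print("every pair from the two generated classes is independent")
```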

For random variables $ X _ {t} $, $ t \in T $, independence is defined as independence of the sub-$ \sigma $-algebras $ {\mathcal B} ( X _ {t} ) $, where $ {\mathcal B} ( X _ {t} ) $ is the pre-image under $ X _ {t} $ of the $ \sigma $-algebra of Borel sets on the real line. Independence of random events $ A _ {1} \dots A _ {n} $ is equivalent to independence of their indicators $ I _ {A _ {k} } $, i.e. independence of the random variables defined by

$$ I _ {A _ {k} } ( \omega ) = 1 \ \textrm{ for } \omega \in A _ {k} $$

and

$$ I _ {A _ {k} } ( \omega ) = 0 \ \textrm{ for } \omega \notin A _ {k} . $$

There are various necessary and sufficient conditions for the independence of random variables $ X _ {1} \dots X _ {n} $:

1) For arbitrary real numbers $ a _ {1} \dots a _ {n} $, the value of the distribution function

$$ F _ {X _ {1} \dots X _ {n} } ( a _ {1} \dots a _ {n} ) = \ {\mathsf P} \{ \omega : {X _ {1} ( \omega ) < a _ {1} \dots X _ {n} ( \omega ) < a _ {n} } \} $$

is equal to the product of the values of the individual distribution functions

$$ F _ {X _ {1} \dots X _ {n} } ( a _ {1} \dots a _ {n} ) = \ F _ {X _ {1} } ( a _ {1} ) \dots F _ {X _ {n} } ( a _ {n} ). $$

2) If there exist densities $ p _ {X _ {1} \dots X _ {n} } ( a _ {1} \dots a _ {n} ) $ (cf. Density of a probability distribution), then the density is equal to the product $ p _ {X _ {1} } ( a _ {1} ) \dots p _ {X _ {n} } ( a _ {n} ) $ of the individual densities for almost all $ ( a _ {1} \dots a _ {n} ) $ with respect to Lebesgue measure on $ \mathbf R ^ {n} $.

3) The characteristic function

$$ f _ {X _ {1} \dots X _ {n} } ( u _ {1} \dots u _ {n} ) = \ {\mathsf E} e ^ {iu _ {1} X _ {1} + \dots + iu _ {n} X _ {n} } $$

is equal, for all real numbers $ u _ {1} \dots u _ {n} $, to the product $ f _ {X _ {1} } ( u _ {1} ) \dots f _ {X _ {n} } ( u _ {n} ) $, $ f _ {X _ {k} } ( u _ {k} ) = {\mathsf E} e ^ {iu _ {k} X _ {k} } $, of the individual characteristic functions.
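
Criterion 3) lends itself to a simple numerical check. The sketch below estimates the joint and the individual characteristic functions of two independent samples by Monte Carlo (the particular distributions and the values of $ u _ {1} , u _ {2} $ are illustrative) and compares the joint value with the product of the individual ones.

```python
import cmath
import random

random.seed(1)
n = 100_000

# Two independent samples: a standard normal and a uniform on [0, 1] (illustrative choice).
X1 = [random.gauss(0.0, 1.0) for _ in range(n)]
X2 = [random.random() for _ in range(n)]

def emp_cf_joint(u1, u2):
    """Empirical joint characteristic function E[exp(i(u1*X1 + u2*X2))]."""
    return sum(cmath.exp(1j * (u1 * x1 + u2 * x2)) for x1, x2 in zip(X1, X2)) / n

def emp_cf(sample, u):
    """Empirical characteristic function of a single sample."""
    return sum(cmath.exp(1j * u * x) for x in sample) / len(sample)

u1, u2 = 0.7, -1.3
joint = emp_cf_joint(u1, u2)
factored = emp_cf(X1, u1) * emp_cf(X2, u2)
print(abs(joint - factored))   # small for independent X1, X2 (up to Monte Carlo error)
```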

The most important schemes of probability theory are based on the assumption that various events and random variables are independent: sequences of independent random variables (see, e.g., Bernoulli random walk; Law of large numbers; Limit theorems of probability theory), stochastic processes with independent increments (see, e.g., Wiener process; Stochastic process), etc. (see also Zero-one law).

General remarks about the concept of independence.

1) Independence of functions of independent random variables. Given the independence of random variables $ X _ {1} \dots X _ {n} $, one may deduce various propositions that are fairly obvious and in full intuitive agreement with the idea of independence. For example, a function of $ X _ {1} \dots X _ {k} $ and a function of $ X _ {k + 1 } \dots X _ {n} $, $ 1 \leq k < n $, are independent random variables. Functions of other types may be independent only if certain additional assumptions are made; such independence may serve in the definition of various classes of distributions. For example, if $ X _ {1} \dots X _ {n} $ are independent, identically distributed and have a normal distribution, then the functions

$$ \tag{4 } \overline{X} = \frac{X _ {1} + \dots + X _ {n} }{n} $$

and

$$ \tag{5 } \frac{1}{n} \sum _ {j = 1 } ^ { n } ( X _ {j} - \overline{X} ) ^ {2} $$

(these are statistical estimators for the expectation and the variance of the $ X _ {k} $, respectively) are independent random variables. The converse is also true: If $ X _ {1} \dots X _ {n} $ are independent and identically distributed and if the functions (4) and (5) are independent, then the $ X _ {k} $ are normally distributed. In exactly the same way, if it is known that $ X _ {1} \dots X _ {n} $ are independent and identically distributed, that the two linear forms

$$ Y _ {1} = \ \sum _ {j = 1 } ^ { n } a _ {j} X _ {j} \ \ \textrm{ and } \ \ Y _ {2} = \ \sum _ {j = 1 } ^ { n } b _ {j} X _ {j} $$

are independent random variables, that $ ( a _ {1} \dots a _ {n} ) \neq ( b _ {1} \dots b _ {n} ) $, and that none of the coefficients $ a _ {j} $, $ b _ {j} $ vanishes, then all the $ X _ {j} $ are normally distributed. (This kind of theorem can be used to deduce, under minimal assumptions, say, Maxwell's law for the distribution of molecular velocities.) The above propositions are examples of what are known as characterization theorems, and were most thoroughly studied by Yu.V. Linnik and his school.
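
The independence of (4) and (5) in the normal case, and its failure otherwise, can be observed in simulation. The sketch below estimates the correlation between (4) and (5) over repeated samples; correlation is only a crude indicator of dependence, but it suffices to show the contrast between normal and exponential samples (the sample sizes and distributions are illustrative).

```python
import random
import statistics  # statistics.correlation requires Python 3.10+

random.seed(2)

def mean_var_correlation(sampler, n=5, reps=20_000):
    """Correlation between the sample mean (4) and the estimator (5) over repeated samples."""
    means, variances = [], []
    for _ in range(reps):
        xs = [sampler() for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / n)
    return statistics.correlation(means, variances)

# For normal samples, (4) and (5) are independent, so the correlation is near 0.
print("normal:      ", round(mean_var_correlation(lambda: random.gauss(0, 1)), 3))
# For a skewed (exponential) distribution they are dependent; the correlation is clearly positive.
print("exponential: ", round(mean_var_correlation(lambda: random.expovariate(1.0)), 3))
```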

2) Existence of independent random variables on a given probability space. If the set of elementary events $ \Omega $ consists of three elements, each of which is assigned probability $ 1 / 3 $, then there do not exist non-constant independent random variables on $ \Omega $. If, on the other hand, the probability space is the interval $ [ 0, 1] $ with Lebesgue measure $ m $, then, given any sequence of distribution functions $ F _ {1} ( x), F _ {2} ( x) \dots $ one can define measurable functions $ X _ {k} ( \omega ) $ on $ [ 0, 1] $ that are independent random variables with respect to $ m $ and such that

$$ m \{ \omega : {0 \leq \omega \leq 1, X _ {k} ( \omega ) < x } \} = \ F _ {k} ( x). $$
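
One standard way to carry out this construction is to split the binary digits of $ \omega $ into disjoint subsequences: each subsequence, read as a binary expansion, gives a variable uniformly distributed on $ [ 0, 1] $, these variables are independent with respect to Lebesgue measure, and composing them with inverse distribution functions yields variables with the prescribed distributions. The sketch below follows this route for two illustrative choices of $ F _ {k} $ (floating-point precision limits it to finitely many digits).

```python
import math

def binary_digits(omega, count):
    """First `count` binary digits of omega in [0, 1)."""
    digits = []
    for _ in range(count):
        omega *= 2
        bit = int(omega)
        digits.append(bit)
        omega -= bit
    return digits

def independent_uniforms(omega, k, digits_per_variable=26):
    """Split the binary digits of omega into k interleaved subsequences; each
    subsequence, read again as a binary expansion, is uniformly distributed on
    [0, 1], and the k variables so obtained are independent with respect to
    Lebesgue measure."""
    bits = binary_digits(omega, k * digits_per_variable)
    return [sum(bit / 2 ** (j + 1) for j, bit in enumerate(bits[i::k]))
            for i in range(k)]

# Composing with inverse distribution functions gives the prescribed distributions;
# as an illustration, F_1 is the exponential law and F_2 the uniform law on [0, 2].
inverse_cdfs = [lambda u: -math.log(1 - u), lambda u: 2 * u]

omega = 0.6180339887
U = independent_uniforms(omega, k=2)
X = [F_inv(u) for F_inv, u in zip(inverse_cdfs, U)]
print(U, X)
```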

The simplest example of this kind of statistically-independent functions on $ [ 0, 1] $ is furnished by the digits of the binary expansion of a number $ \omega $, $ 0 \leq \omega \leq 1 $, or by the related Rademacher functions:

$$ r _ {k} ( \omega ) = \ \mathop{\rm sign} \sin \ ( 2 \pi \cdot 2 ^ {k - 1 } \omega ),\ \ k = 1, 2 ,\dots . $$
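
A quick empirical check, with Lebesgue measure on $ [ 0, 1] $ approximated by uniform random sampling, shows that the joint sign patterns of the first few Rademacher functions occur with the frequencies expected of independent fair $ \pm 1 $ variables.

```python
import math
import random

def rademacher(k, omega):
    """r_k(omega) = sign(sin(2*pi*2^(k-1)*omega)); the value at the zeros of sin is immaterial."""
    s = math.sin(2 * math.pi * 2 ** (k - 1) * omega)
    return 1 if s > 0 else -1

# The joint frequencies of the sign patterns of (r_1, r_2, r_3) should all be close
# to 1/8, as for three independent fair +/-1 variables.
random.seed(3)
counts = {}
n = 100_000
for _ in range(n):
    omega = random.random()
    pattern = tuple(rademacher(k, omega) for k in (1, 2, 3))
    counts[pattern] = counts.get(pattern, 0) + 1

for pattern, c in sorted(counts.items()):
    print(pattern, round(c / n, 3))
```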

It should be noted that the existence of some probability space on which one can define independent random variables with given distributions is a corollary of Kolmogorov's theorem on probabilities in infinite-dimensional spaces (see [3], Chapt. III, Sect. 4).

3) Independent random variables as a source of other schemes. Let $ Y _ {1} \dots Y _ {n} \dots $ be a sequence of independent random variables and set

$$ X _ {0} = 0 ,\ \ X _ {n} = \sum _ {k = 1 } ^ { n } Y _ {k} \ \ ( n \geq 1 ) ; $$

then one obtains a sequence of random variables forming a Markov chain. A similar procedure will yield a Markov process, for example, beginning with a Wiener process and using a stochastic differential equation. Starting with Gaussian random measures with independent values and using the Fourier transform, one can construct Gaussian stationary stochastic processes, etc.
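
A minimal sketch of this construction, with fair $ \pm 1 $ steps chosen only for illustration, produces the simple random walk.

```python
import random

random.seed(4)

# Independent steps Y_1, Y_2, ... (here: fair +/-1 steps, a particular choice for illustration).
n = 20
Y = [random.choice([-1, 1]) for _ in range(n)]

# Partial sums X_0 = 0, X_n = Y_1 + ... + Y_n form a Markov chain (a simple random walk):
# the conditional distribution of X_{n+1} given the whole past depends only on X_n.
X = [0]
for y in Y:
    X.append(X[-1] + y)

print(X)
```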

4) Weak dependence. The asymptotic laws of probability theory that are established for sequences of independent random variables can usually be extended to sequences of so-called weakly-dependent variables, i.e. to sequences $ X _ {1} \dots X _ {n} \dots $ in which there is a suitably measured dependence between "distant" segments of the sequence that is "small" (in the simplest cases, these may be sequences of $ m $-dependent random variables, where $ X _ {k} $ and $ X _ {l} $ are independent if $ | k - l | > m $; or sequences of random variables forming an ergodic Markov chain (cf. Markov chain, ergodic); etc.). One of the main methods for proving theorems of this type is reduction to the situation of independence.
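
A simple concrete instance of $ m $-dependence (here with $ m = 2 $, chosen for illustration) is a moving average of independent variables, as sketched below.

```python
import random

random.seed(5)

# An illustration of m-dependence with m = 2: a moving average of independent variables.
# X_k = Y_k + Y_{k+1} + Y_{k+2} depends only on Y_k, ..., Y_{k+2}, so X_k and X_l are
# built from disjoint groups of the Y's, hence independent, whenever |k - l| > 2.
n = 10
Y = [random.gauss(0, 1) for _ in range(n + 2)]
X = [Y[k] + Y[k + 1] + Y[k + 2] for k in range(n)]
print(X)
```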

5) Independence in number theory. Let $ p \geq 2 $ and $ q \geq 2 $ be two relatively-prime numbers. Let $ N $ be a natural number, and suppose that a number between 1 and $ N $ is chosen at random (the probability of each being chosen is assumed to be $ 1/N $). Let $ A _ {p} $ ($ A _ {q} $) be the event that the chosen number is divisible by $ p $ (by $ q $). Then

$$ {\mathsf P} ( A _ {p} ) = \ { \frac{1}{N} } \left [ { \frac{N}{p} } \right ] ,\ \ {\mathsf P} ( A _ {q} ) = \ { \frac{1}{N} } \left [ { \frac{N}{q} } \right ] , $$

$$ {\mathsf P} ( A _ {p} \cap A _ {q} ) = { \frac{1}{N} } \left [ { \frac{N}{pq} } \right ] , $$

and if one lets $ N \rightarrow \infty $, then the events $ A _ {p} $ and $ A _ {q} $ become "almost independent". A much more profound proposition is the following: Letting $ N \rightarrow \infty $, one can choose $ S = S _ {N} \rightarrow \infty $ such that the events $ A _ {2} \dots A _ {p _ {S} } $ (where $ A _ {p _ {j} } $ denotes divisibility by the $ j $-th prime $ p _ {j} $) are jointly "almost independent"; this proposition provides the basis for studying the value distribution of arithmetic functions (see Number theory, probabilistic methods in). There are also other branches of number theory in which the idea of independence plays an explicit or implicit part.
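
The approximate factorization can be computed exactly for any finite $ N $; the sketch below does so for an illustrative pair of relatively prime numbers.

```python
from fractions import Fraction

def divisibility_probabilities(N, p, q):
    """Exact probabilities when a number is chosen uniformly from 1, ..., N
    and A_p (A_q) is the event that it is divisible by p (by q)."""
    P_Ap = Fraction(N // p, N)           # (1/N) * floor(N/p)
    P_Aq = Fraction(N // q, N)
    P_both = Fraction(N // (p * q), N)   # valid since p and q are relatively prime
    return P_Ap, P_Aq, P_both

# Illustrative choice: p = 3, q = 5.
for N in (10, 100, 10_000):
    P_Ap, P_Aq, P_both = divisibility_probabilities(N, 3, 5)
    print(N, float(P_both), float(P_Ap * P_Aq))   # the two values approach each other as N grows
```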

6) For the testing of hypotheses about the independence of the results of observations, see Statistical hypotheses, verification of.

References

[1] A.A. Markov, "Wahrscheinlichkeitsrechnung", Teubner (1912) (Translated from Russian)
[2] A.N. Kolmogorov, "Foundations of the theory of probability", Chelsea, reprint (1950) (Translated from Russian)
[3] A.N. Kolmogorov, "The theory of probability", in Mathematics, its content, methods and meaning, 4, Amer. Math. Soc. (1963), Chapt. 6 (Translated from Russian)
[4] M. Kac, "Statistical independence in probability, analysis and number theory", Math. Assoc. Amer. (1963)
[5] W. Feller, "An introduction to probability theory and its applications", 1–2, Wiley (1957–1971)

This article was adapted from an original article by Yu.V. Prokhorov (originator), which appeared in Encyclopedia of Mathematics, ISBN 1402006098.