# Confidence estimation

A method in mathematical statistics for the construction of a set of approximate values of the unknown parameters of probability distributions.

Let $ X $ be a random vector assuming values in a set $ {\mathcal X} $ in a Euclidean space and let the probability distribution of this vector belong to the parametric family of distributions defined by the densities $ p ( x \mid \theta ) $, $ x \in {\mathcal X} $, $ \theta \in \Theta $, with respect to some measure $ \mu (x) $. It is assumed that the true value of the parametric point $ \theta $ corresponding to the result of observations of $ X $ is unknown. Confidence estimation consists in constructing a certain set $ C(X) $, depending on $ X $, containing the value of a given function $ u ( \theta ) $ corresponding to the unknown true value of $ \theta $.

Let $ U $ be the range of values of the function $ u ( \theta ) $, $ \theta \in \Theta $, and let $ C (x) $, $ x \in {\mathcal X} $, be a family of subsets of $ U $; moreover, it is assumed that for an arbitrary element $ u \in U $ and any value of $ \theta \in \Theta $ the probability of the event $ \{ C (X) \ni u \} $ is defined. This probability is given by the integral

$$ {\mathsf P} _ {C} ( u ,\ \theta ) \ = \ \int\limits _ {C (x) \ni u } p (x \mid \theta ) \ d \mu (x) ,\ \ u \in U ,\ \ \theta \in \Theta , $$

and is said to be the covering probability of the value of $ u $ by the set $ C (X) $ for a given value of $ \theta $.

If the true value of $ \theta $ is unknown, the set $ C (X) $ (from the family of sets $ C (x) $, $ x \in {\mathcal X} $) which corresponds to the results of observations of $ X $ is said to be a confidence set or an interval estimator for the unknown true value of the function $ u ( \theta ) $. The confidence probability $ {\mathsf P} _ {C} ( \theta ) $, which can be expressed, in terms of the covering probability, by the equation

$$ {\mathsf P} _ {C} ( \theta ) \ = \ {\mathsf P} _ {C} [ u ( \theta ) ,\ \theta ] ,\ \ \theta \in \Theta , $$

is employed as the probability characteristic of the interval estimator $ C (X) $ constructed according to the above rule. In other words, $ {\mathsf P} _ {C} ( \theta ) $ is the probability that the set $ C (X) $ covers the value of the given function $ u ( \theta ) $ corresponding to the unknown true parametric point $ \theta $.
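As a minimal illustration of these definitions, consider an assumed model that is not taken from the text above: a single observation $ X $ from the normal distribution with mean $ \theta $ and unit variance, $ u ( \theta ) = \theta $, and the family $ C (x) = ( x - c ,\ x + c ) $. The event $ \{ C (X) \ni u \} $ is then $ \{ u - c < X < u + c \} $, so the covering probability has a closed form in terms of the standard normal distribution function. The function names and the half-width `c = 1.96` are illustrative choices only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, used for its CDF


def covering_probability(u, theta, c=1.96):
    """P_C(u, theta) for X ~ N(theta, 1) and C(x) = (x - c, x + c).

    C(X) contains u exactly when u - c < X < u + c.
    """
    return STD_NORMAL.cdf(u - theta + c) - STD_NORMAL.cdf(u - theta - c)


def confidence_probability(theta, c=1.96):
    """P_C(theta) = P_C(u(theta), theta) with u(theta) = theta."""
    return covering_probability(theta, theta, c)


# Confidence probability at several parameter values (assumed for illustration).
for theta in (-3.0, 0.0, 5.0):
    print(theta, round(confidence_probability(theta), 4))

# Covering probability of a value u that differs from the true theta.
print(round(covering_probability(1.0, 0.0), 4))
```

In this assumed model the confidence probability comes out the same for every $ \theta $, while a value $ u \neq \theta $ is covered with a smaller probability.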

If the confidence probability $ {\mathsf P} _ {C} ( \theta ) $ is independent of $ \theta $, the interval estimator $ C (X) $ is said to be similar to the sampling space. This name is due to the analogy between the formulas

$$ {\mathsf P} _ {C} ( \theta ) \ = \ {\mathsf P} \{ C (X) \ni u ( \theta ) \mid \theta \} \ = \ \textrm{ const } $$

and

$$ {\mathsf P} \{ X \in {\mathcal X} \mid \theta \} \ = \ \textrm{ const } \ = \ 1 . $$

In a more general situation $ {\mathsf P} _ {C} ( \theta ) $ depends on the unknown $ \theta $, and for this reason the quality of the interval estimator is usually characterized in practical work by the confidence level

$$ P _ {C} \ = \ \inf \ {\mathsf P} _ {C} ( \theta ) , $$

where the infimum is taken over all $ \theta \in \Theta $. (The confidence level is sometimes called the confidence coefficient.)
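To see a confidence probability that does depend on $ \theta $, the following sketch uses an assumed example that is not part of the text above: $ X $ binomial with $ n $ trials and success probability $ \theta $, $ u ( \theta ) = \theta $, and the textbook Wald interval $ \widehat{p} \pm z \sqrt { \widehat{p} ( 1 - \widehat{p} ) / n } $ with $ \widehat{p} = X / n $. Because $ X $ takes finitely many values, the integral defining $ {\mathsf P} _ {C} ( \theta ) $ reduces to a finite sum. The minimum over a finite grid only bounds the infimum over $ \Theta $ from above; the values of `n`, `z`, and the grid are illustrative.

```python
from math import comb, sqrt


def wald_interval(k, n, z=1.96):
    """Wald interval for a binomial proportion from k successes in n trials."""
    p_hat = k / n
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def confidence_probability(theta, n, z=1.96):
    """Exact P_C(theta): total binomial probability of the outcomes k
    whose interval C(k) contains theta."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = wald_interval(k, n, z)
        if lo < theta < hi:
            total += comb(n, k) * theta**k * (1.0 - theta) ** (n - k)
    return total


n = 20
grid = [i / 100 for i in range(1, 100)]  # theta = 0.01, ..., 0.99
coverage = [confidence_probability(theta, n) for theta in grid]
grid_minimum = min(coverage)  # an upper bound for the infimum over Theta
print(f"smallest confidence probability on the grid: {grid_minimum:.3f}")
```

For this interval the grid minimum falls well below the nominal value 0.95, which is why the infimum, rather than the value at any single $ \theta $, is used to characterize the estimator.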

Optimization of confidence estimation is defined by the requirements to be met by interval estimators. For instance, if the objective is to construct confidence sets similar to sampling spaces, with a given confidence level $ \omega $ ($ 0.5 \leq \omega < 1 $), the first requirement is expressed by the identity

$$ {\mathsf P} _ {C} [ u ( \theta ) ,\ \theta ] \ \equiv \ \omega ,\ \ \theta \in \Theta . $$

It is natural to look for an interval estimator that covers the true value of $ u ( \theta ) $ with a probability at least that of covering an arbitrary value $ u \in U $. In other words, the second requirement, known as the requirement of unbiasedness, is expressed by the inequality

$$ {\mathsf P} _ {C} ( u ,\ \theta ) \ \leq \ \omega ,\ \ u \in U ,\ \ \theta \in \Theta . $$
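For a concrete check of this inequality, suppose (as an illustrative assumption, not part of the general setting) that $ X $ is a single observation from the normal distribution with mean $ \theta $ and unit variance, that $ u ( \theta ) = \theta $, and that $ C (x) = ( x - c ,\ x + c ) $ with $ c > 0 $ chosen so that $ 2 \Phi (c) - 1 = \omega $, where $ \Phi $ denotes the standard normal distribution function. Then

$$ {\mathsf P} _ {C} ( u ,\ \theta ) \ = \ {\mathsf P} \{ u - c < X < u + c \mid \theta \} \ = \ \Phi ( u - \theta + c ) - \Phi ( u - \theta - c ) \ \leq \ 2 \Phi (c) - 1 \ = \ \omega , $$

with equality only for $ u = \theta $, since the integral of the standard normal density over an interval of fixed length $ 2c $ is largest when the interval is centred at zero. Under these assumptions the interval estimator is therefore unbiased.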

Under these conditions, the "best" interval estimator $ C $ may reasonably be taken to be the one that covers any value $ u $ other than the true value $ u ( \theta ) $ with the smallest possible probability. Hence the third requirement, that of "highest accuracy": for any set $ C ^ {\ \prime } $ other than $ C $ and meeting the condition

$$ {\mathsf P} _ {C ^ {\ \prime } } [ u ( \theta ) ,\ \theta ] \ \geq \ \omega ,\ \ \theta \in \Theta , $$

the inequality

$$ {\mathsf P} _ {C} ( u ,\ \theta ) \ \leq \ {\mathsf P} _ {C ^ {\ \prime } } ( u ,\ \theta ) ,\ \ u \in U ,\ \ \theta \in \Theta , $$

must be valid.

The task of finding interval estimators $ C $ satisfying all three requirements is equivalent to the task of constructing unbiased most-powerful statistical tests similar to the sampling space and having significance level $ 1 - \omega $. The question of whether such a solution exists, together with its constructive description, forms the basis of the general theory of statistical hypothesis testing.
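The correspondence with tests can be sketched numerically. The example below assumes a setting simpler than the general one: independent normal observations with a known standard deviation `sigma`, and the two-sided z-test of the hypothesis $ \theta = u $. The confidence set collects every $ u $ on a grid that the test does not reject at significance level $ 1 - \omega $. All function names, the data, and the grid are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def accepts(u, sample, sigma, omega):
    """Two-sided z-test of H0: theta = u at significance level 1 - omega."""
    z_crit = NormalDist().inv_cdf(0.5 + omega / 2.0)
    z_stat = sqrt(len(sample)) * (fmean(sample) - u) / sigma
    return abs(z_stat) < z_crit


def confidence_set_by_inversion(sample, sigma, omega, grid):
    """All grid values u for which the test does not reject H0: theta = u."""
    return [u for u in grid if accepts(u, sample, sigma, omega)]


sample = [4.9, 5.3, 5.1, 4.7, 5.4, 5.0, 4.8, 5.2]  # made-up data
sigma, omega = 0.25, 0.95
grid = [4.0 + i / 1000 for i in range(2001)]  # u = 4.000, ..., 6.000
accepted = confidence_set_by_inversion(sample, sigma, omega, grid)

# Closed-form interval for comparison: mean +/- z * sigma / sqrt(n).
half_width = NormalDist().inv_cdf(0.5 + omega / 2.0) * sigma / sqrt(len(sample))
print("by inversion:", min(accepted), max(accepted))
print("closed form: ", fmean(sample) - half_width, fmean(sample) + half_width)
```

Up to the grid resolution, the set of accepted values coincides with the familiar interval around the sample mean, which is the duality between confidence sets and acceptance regions in this special case.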

Confidence estimation is most often used when $ u ( \theta ) $ is a scalar function. Let $ X _ {1} , \dots , X _ {n} $, $ n \geq 2 $, be independent random variables subject to the same normal distribution, with unknown mean $ {\mathsf E} X _ {i} = \theta _ {1} $ and unknown variance $ {\mathsf D} X _ {i} = \theta _ {2} $. The problem is to construct an interval estimator for $ u ( \theta ) = \theta _ {1} $. Let

$$ \overline{X}\; \ = \ \frac{1}{n} \sum _ {i = 1 } ^ { n } X _ {i} \ \ \textrm{ and } \ \ s ^ {2} \ = \ \frac{1}{n-1} \sum _ {i = 1 } ^ { n } ( X _ {i} - \overline{X}\; ) ^ {2} . $$

Since the random variable $ T = \sqrt n ( \overline{X}\; - \theta _ {1} ) / s $ is subject to the Student distribution with $ n - 1 $ degrees of freedom, and since this distribution does not depend on the unknown parameters $ \theta _ {1} $ and $ \theta _ {2} $ ($ | \theta _ {1} | < \infty $, $ \theta _ {2} > 0 $), it follows that, for any positive $ t $, the probability of occurrence of the event

$$ \left \{ \overline{X}\; - \frac{ts}{\sqrt n } \ < \ \theta _ {1} \ < \ \overline{X}\; + \frac{ts}{\sqrt n } \right \} $$

depends only on $ t $. If this interval is taken as the interval estimator $ C $ for $ \theta _ {1} $, it will correspond to the confidence probability

$$ {\mathsf P} _ {C} ( \theta _ {1} ,\ \theta _ {2} ) \ = \ {\mathsf P} \{ | T | < t \} , $$

which is independent of $ \theta = ( \theta _ {1} ,\ \theta _ {2} ) $. Such an interval estimator is known as a confidence interval, while its end points are known as confidence bounds; in this case the confidence interval is a confidence estimator similar to the sampling space. In this example the interval estimator is also the most accurate unbiased one.
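The claim that the confidence probability does not depend on $ ( \theta _ {1} ,\ \theta _ {2} ) $ can be checked by simulation. The sketch below is only an illustration: the sample size $ n = 10 $, the parameter values, the seed, and the number of trials are arbitrary choices, and $ t = 2.262 $ is the tabulated 0.975-quantile of the Student distribution with 9 degrees of freedom, so that $ {\mathsf P} \{ | T | < t \} $ is approximately 0.95.

```python
import random
from math import sqrt


def student_interval(sample, t):
    """Interval (mean - t*s/sqrt(n), mean + t*s/sqrt(n)),
    where s^2 is the sample variance with divisor n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    s = sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    half_width = t * s / sqrt(n)
    return mean - half_width, mean + half_width


def simulated_coverage(theta1, theta2, n, t, trials, rng):
    """Fraction of simulated normal samples whose interval contains theta1."""
    sigma = sqrt(theta2)  # theta2 is the variance
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(theta1, sigma) for _ in range(n)]
        lo, hi = student_interval(sample, t)
        if lo < theta1 < hi:
            hits += 1
    return hits / trials


rng = random.Random(0)  # fixed seed so that the run is reproducible
for theta1, theta2 in [(0.0, 1.0), (10.0, 0.01), (-5.0, 400.0)]:
    estimate = simulated_coverage(theta1, theta2, n=10, t=2.262, trials=20000, rng=rng)
    print(f"theta1={theta1}, theta2={theta2}: estimated coverage {estimate:.3f}")
```

Up to simulation error, the estimated coverage should be close to 0.95 for each parameter pair, in line with the statement that the confidence probability depends only on $ t $.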

**How to Cite This Entry:**

Confidence estimation.

*Encyclopedia of Mathematics.* URL: http://www.encyclopediaofmath.org/index.php?title=Confidence_estimation&oldid=44400