# Sufficient statistic

for a family of probability distributions $\{P_\theta : \theta \in \Theta\}$ or for a parameter $\theta \in \Theta$

A statistic $T$ (a vector random variable) such that for any event $A$ there exists a version of the conditional probability $P_\theta(A \mid T)$ which is independent of $\theta$. This is equivalent to the requirement that the conditional distribution, given $T = t$, of any other statistic $S$ is independent of $\theta$.

Knowledge of the sufficient statistic $T$ yields exhaustive material for statistical inferences about the parameter $\theta$, since no complementary statistical data can add anything to the information about the parameter contained in the distribution of $T$. This property is expressed mathematically by one of the results of the theory of statistical decision making, which says that the set of decision rules based on a sufficient statistic forms an essentially complete class. The transition from the initial family of distributions to the family of distributions of the sufficient statistic is known as reduction of the statistical problem. The point of the reduction is a decrease (sometimes a very significant one) in the dimension of the observation space.

In practice, a sufficient statistic is found from the following factorization theorem. Let a family $\{P_\theta\}$ be dominated by a $\sigma$-finite measure $\mu$ and let $p_\theta = dP_\theta / d\mu$ be the density of $P_\theta$ with respect to the measure $\mu$. A statistic $T$ is sufficient for the family $\{P_\theta\}$ if and only if

$$p_\theta(x) = g_\theta(T(x))\, h(x), \tag{*}$$

where $g_\theta$ and $h$ are non-negative measurable functions ($h$ is independent of $\theta$). For discrete distributions the "counting" measure may be taken as $\mu$, and $p_\theta(x)$ in relation (*) has the meaning of the probability of the elementary event $\{x\}$.

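E.g., let $X_1, \dots, X_n$ be a sequence of independent random variables which assume the value one with an unknown probability $\theta$ and the value zero with probability $1 - \theta$ (a Bernoulli scheme). Then

$$p_\theta(x_1, \dots, x_n) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} = \theta^{\sum_{i=1}^{n} x_i} (1 - \theta)^{n - \sum_{i=1}^{n} x_i}.$$

Equation (*) is satisfied if

$$T = \sum_{i=1}^{n} X_i, \qquad g_\theta(t) = \theta^{t} (1 - \theta)^{n - t}, \qquad h(x) = 1.$$

Thus, the empirical frequency

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i,$$

which is a one-to-one function of $T$, is a sufficient statistic for the unknown probability $\theta$ in the Bernoulli scheme.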
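The defining property can be checked by brute force for a small Bernoulli scheme. The following is a minimal sketch, not a general-purpose routine: the function name `conditional_given_sum`, the sample size 4 and the two parameter values are assumptions chosen for illustration. It enumerates all binary sequences, conditions on the sum of the observations, and compares the resulting conditional laws for two values of the parameter using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def conditional_given_sum(n, theta):
    """Return {x: P_theta(X = x | T = sum(x))} for every binary sequence x of length n."""
    # Joint probability of each sequence under the Bernoulli scheme.
    joint = {
        x: theta ** sum(x) * (1 - theta) ** (n - sum(x))
        for x in product((0, 1), repeat=n)
    }
    # Marginal probability of each value t of the statistic T = x_1 + ... + x_n.
    marginal = {}
    for x, p in joint.items():
        marginal[sum(x)] = marginal.get(sum(x), 0) + p
    # Conditional probability of each sequence given the value of T.
    return {x: p / marginal[sum(x)] for x, p in joint.items()}


# Two hypothetical parameter values, chosen only for illustration.
cond_a = conditional_given_sum(4, Fraction(1, 5))
cond_b = conditional_given_sum(4, Fraction(7, 10))

print(cond_a == cond_b)        # the conditional law is the same for both values
print(cond_a[(1, 0, 1, 0)])    # one of the C(4, 2) = 6 sequences with sum 2
```

Given the sum, every sequence with that sum is equally likely, whatever the parameter, which is exactly what sufficiency of the sum (and hence of the empirical frequency) requires.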

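Let $X_1, \dots, X_n$ be a sequence of independent, normally distributed variables with unknown mean $a$ and unknown variance $\sigma^2$. The joint density of the distributions of $X_1, \dots, X_n$ with respect to Lebesgue measure is given by the expression

$$p_{a, \sigma^2}(x_1, \dots, x_n) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} x_i^2 + \frac{a}{\sigma^2} \sum_{i=1}^{n} x_i - \frac{n a^2}{2\sigma^2} \right\},$$

which depends on $x_1, \dots, x_n$ only by means of the variables

$$\sum_{i=1}^{n} x_i \quad \text{and} \quad \sum_{i=1}^{n} x_i^2.$$

For this reason the vector statistic

$$T = \left( \sum_{i=1}^{n} X_i, \ \sum_{i=1}^{n} X_i^2 \right)$$

is a sufficient statistic for the two-dimensional parameter $(a, \sigma^2)$. Here, the pair: sample mean

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

and sample variance

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

will also be a sufficient statistic, since the variables

$$\sum_{i=1}^{n} X_i = n \bar{X} \quad \text{and} \quad \sum_{i=1}^{n} X_i^2 = (n - 1) s^2 + n \bar{X}^2$$

can be expressed in terms of $\bar{X}$ and $s^2$.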
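The reduction in the normal example can be illustrated numerically. The sketch below assumes made-up data, hypothetical function names, and the $n - 1$ normalisation of the sample variance. It evaluates the normal log-likelihood once from the full sample and once from the sample size, the sum and the sum of squares alone, and then recovers the sample mean and sample variance from the same two sums.

```python
import math


def loglik_full(xs, a, sigma2):
    """Normal log-likelihood computed observation by observation."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - a) ** 2 / (2 * sigma2)
        for x in xs
    )


def loglik_from_stats(n, t1, t2, a, sigma2):
    """Same log-likelihood, using only n, t1 = sum(x) and t2 = sum(x^2)."""
    return (
        -0.5 * n * math.log(2 * math.pi * sigma2)
        - (t2 - 2 * a * t1 + n * a * a) / (2 * sigma2)
    )


def mean_and_variance(n, t1, t2):
    """Recover the sample mean and the (n - 1)-normalised sample variance."""
    mean = t1 / n
    return mean, (t2 - n * mean * mean) / (n - 1)


xs = [1.2, -0.4, 3.1, 0.7, 2.5]   # made-up data for illustration
n, t1, t2 = len(xs), sum(xs), sum(x * x for x in xs)

# The two evaluations agree for any parameter values; (0.5, 2.0) is arbitrary.
print(math.isclose(loglik_full(xs, 0.5, 2.0), loglik_from_stats(n, t1, t2, 0.5, 2.0)))
print(mean_and_variance(n, t1, t2))
```

Once the two sums are stored, the individual observations are no longer needed to evaluate the likelihood at any parameter value, which is the practical content of the reduction.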

Many sufficient statistics may exist for a given family of distributions. In particular, the totality of all observations (in the example discussed above, $X_1, \dots, X_n$) is a trivial sufficient statistic. However, of main interest are statistics which permit a real reduction of the statistical problem. A sufficient statistic is known as minimal or necessary if it is a function of any other sufficient statistic. A necessary sufficient statistic realizes the utmost possible reduction of a statistical problem. In the examples discussed above the obtained sufficient statistics are also necessary.

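An important application of the concept of sufficiency is the method of improvement of unbiased estimators, based on the Rao–Blackwell–Kolmogorov theorem: If $T$ is a sufficient statistic for the family $\{P_\theta\}$, and if $S$ is an arbitrary statistic assuming values in the vector space $\mathbb{R}^k$, then the inequality

$$\mathsf{E}_\theta\, g(S) \geq \mathsf{E}_\theta\, g(S^*), \qquad \theta \in \Theta,$$

where $S^* = \mathsf{E}(S \mid T)$ is the conditional expectation of the statistic $S$ with respect to $T$ (which is in fact independent of $\theta$ by virtue of the sufficiency of $T$), holds for any real continuous convex function $g$ on $\mathbb{R}^k$. Often the loss function $g$ is taken to be a positive-definite quadratic form on $\mathbb{R}^k$. Since $x \mapsto g(x - c)$ is again continuous and convex for every fixed $c \in \mathbb{R}^k$, the inequality applies equally to the loss $g(S - c)$ incurred when $S$ is used to estimate $c$.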
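The improvement can be computed exactly in the Bernoulli scheme. The sketch below is an illustration under stated assumptions: the crude unbiased estimator is taken to be the first observation, the sufficient statistic is the sum of all observations, the loss is the squared error, and the function name, sample size 5 and parameter value 3/10 are hypothetical choices. The conditional expectation is obtained by direct enumeration rather than from a formula.

```python
from fractions import Fraction
from itertools import product


def rao_blackwell_check(n, theta):
    """Exact squared-error risks of S = X_1 and of E(S | T), where T = X_1 + ... + X_n."""
    seqs = list(product((0, 1), repeat=n))
    prob = {x: theta ** sum(x) * (1 - theta) ** (n - sum(x)) for x in seqs}

    # Conditional expectation E(X_1 | T = t), computed by direct enumeration.
    cond_mean = {}
    for t in range(n + 1):
        p_t = sum(prob[x] for x in seqs if sum(x) == t)
        cond_mean[t] = sum(x[0] * prob[x] for x in seqs if sum(x) == t) / p_t

    # Risks under the convex loss g(x) = (x - theta)^2.
    risk_raw = sum(prob[x] * (x[0] - theta) ** 2 for x in seqs)
    risk_improved = sum(prob[x] * (cond_mean[sum(x)] - theta) ** 2 for x in seqs)
    return cond_mean, risk_raw, risk_improved


cond_mean, risk_raw, risk_improved = rao_blackwell_check(5, Fraction(3, 10))

print(cond_mean == {t: Fraction(t, 5) for t in range(6)})   # E(X_1 | T = t) = t / n
print(risk_raw, risk_improved)   # theta(1 - theta) versus theta(1 - theta) / n
```

The conditional expectation does not involve the parameter and coincides with the empirical frequency, and the risk drops by a factor equal to the sample size.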

A statistic $T$ is said to be a complete statistic if it follows from $\mathsf{E}_\theta f(T) = 0$, $\theta \in \Theta$, that $f(T) = 0$ almost surely with respect to $P_\theta$, $\theta \in \Theta$. A corollary of the Rao–Blackwell–Kolmogorov theorem states that if a complete sufficient statistic $T$ exists, then it is the best unbiased estimator, uniformly in $\theta$, of its expectation $\mathsf{E}_\theta T$. The examples above describe such a situation. Thus, the empirical frequency is the uniformly best unbiased estimator of the probability $\theta$ in the Bernoulli scheme, while the sample mean and the sample variance are the uniformly best unbiased estimators of the parameters $a$ and $\sigma^2$ of the normal distribution.
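For the Bernoulli scheme, completeness of the sum of the observations can be verified directly. The derivation below is a sketch under the assumptions that $\Theta = (0, 1)$, that $T = X_1 + \dots + X_n$ (so that $T$ has the binomial distribution with parameters $n$ and $\theta$), and that $\rho$ is an auxiliary variable introduced here for illustration.

```latex
\begin{align*}
\mathsf{E}_\theta f(T)
  &= \sum_{t=0}^{n} f(t) \binom{n}{t} \theta^{t} (1 - \theta)^{n - t} \\
  &= (1 - \theta)^{n} \sum_{t=0}^{n} f(t) \binom{n}{t} \rho^{t},
  \qquad \rho = \frac{\theta}{1 - \theta} \in (0, \infty).
\end{align*}
```

If the left-hand side vanishes for every $\theta \in (0, 1)$, then the polynomial in $\rho$ on the right vanishes for every $\rho > 0$, so each coefficient $f(t) \binom{n}{t}$ is zero and hence $f(t) = 0$ for $t = 0, \dots, n$.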

On the theoretical level it may be more convenient to deal with sufficient $\sigma$-algebras rather than with sufficient statistics. If $\{P_\theta\}$ is a family of distributions on a measurable space $(\Omega, \mathcal{A})$, then a sub-$\sigma$-algebra $\mathcal{B} \subset \mathcal{A}$ is said to be sufficient for $\{P_\theta\}$ if for any event $A \in \mathcal{A}$ there exists a version of the conditional probability $P_\theta(A \mid \mathcal{B})$ which is independent of $\theta$. A statistic $T$ is sufficient if and only if the sub-$\sigma$-algebra generated by it is sufficient.

#### References

 [1] P.R. Halmos, L.I. Savage, "Application of the Radon–Nikodym theorem to the theory of sufficient statistics", Ann. Math. Stat., 20 (1949) pp. 225–241

 [2] A.N. Kolmogorov, "Unbiased estimators", Izv. Akad. Nauk SSSR Ser. Mat., 14 : 4 (1950) pp. 303–326 (In Russian) (English translation in: Selected Works, Vol. 2 (Probability Theory and Mathematical Statistics), Kluwer, 1992, pp. 369–394)

 [3] C.R. Rao, "Linear statistical inference and its applications", Wiley (1973)