Kullback-Leibler-type distance measures
In mathematical statistics one usually considers, among other problems, estimation, hypothesis testing and discrimination. When considering the statistical problem of discrimination, S. Kullback and R.A. Leibler [a13] introduced a measure of the "distance" or "divergence" between statistical populations, known variously as information for discrimination, $I$-divergence, the error, or the directed divergence. While the Shannon entropy is fundamental in information theory, several generalizations of Shannon's entropy have also been proposed. In statistical estimation problems, measures between probability distributions play a significant role. The Chernoff coefficient, the Hellinger–Bhattacharyya coefficient, the Jeffreys distance, the directed divergence and its symmetrization, the $J$-divergence, the $f$-divergence, etc. are examples of such measures. These measures have many applications in statistics, pattern recognition, numerical taxonomy, etc.
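Let
$$\Gamma_n = \left\{ P = (p_1,\dots,p_n) : p_i > 0,\ \sum_{i=1}^n p_i = 1 \right\}$$
be the set of all complete discrete probability distributions of length $n \geq 2$ (cf. Density of a probability distribution). Let $I = (0,1)$ and let $\mathbb{R}$ be the set of real numbers. For $P, Q$ in $\Gamma_n$, Kullback and Leibler [a13] defined the directed divergence as
$$D_n(P\|Q) = \sum_{i=1}^n p_i \log\frac{p_i}{q_i} = \sum_{i=1}^n p_i\,(\log p_i - \log q_i). \tag{a1}$$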
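As a small numerical illustration of the directed divergence, the following sketch evaluates $\sum_i p_i \log(p_i/q_i)$ for two distributions with strictly positive entries. The function name `directed_divergence`, the example vectors and the use of the natural logarithm are assumptions made for illustration; the article does not fix the base of the logarithm.

```python
import math


def directed_divergence(p, q):
    """Directed divergence sum_i p_i * log(p_i / q_i) (natural log assumed)."""
    if len(p) != len(q):
        raise ValueError("P and Q must have the same length")
    for dist in (p, q):
        # Both arguments must be complete distributions with positive entries.
        if any(x <= 0 for x in dist):
            raise ValueError("all probabilities must be strictly positive")
        if not math.isclose(sum(dist), 1.0, abs_tol=1e-9):
            raise ValueError("probabilities must sum to 1")
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Illustrative distributions of length 3 (hypothetical values).
P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]

d_pq = directed_divergence(P, Q)  # divergence of Q from P
d_qp = directed_divergence(Q, P)  # arguments swapped
d_pp = directed_divergence(P, P)  # a distribution against itself
```

In this example `d_pq` and `d_qp` are both positive but differ from each other, while `d_pp` is zero.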
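Usually, measures are characterized by means of the algebraic properties they possess; see, for example, [a8] for (a1). A sequence of measures $\mu_n : \Gamma_n \times \Gamma_n \to \mathbb{R}$ is said to have the sum property if there exists a function $f : I^2 \to \mathbb{R}$ such that $\mu_n(P\|Q) = \sum_{i=1}^n f(p_i, q_i)$ for $P, Q \in \Gamma_n$. In this case $f$ is said to be a generating function of $\{\mu_n\}$. A stronger version of the sum property is the $f$-divergence [a6]. The measure $\mu_n$ is an $f$-divergence if and only if it has a representation
$$\mu_n(P\|Q) = \sum_{i=1}^n p_i\, f\!\left(\frac{p_i}{q_i}\right)$$
for some $f : (0,\infty) \to \mathbb{R}$. The measures $\mu_n$ are said to be $(m,n)$-additive if $\mu_{mn}(P \star R \,\|\, Q \star S) = \mu_n(P\|Q) + \mu_m(R\|S)$, where $P, Q \in \Gamma_n$, $R, S \in \Gamma_m$ and $P \star R = (p_1 r_1, \dots, p_1 r_m, p_2 r_1, \dots, p_n r_m)$.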
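The sum property and the additivity condition can be checked numerically. The sketch below assumes that additivity compares the measure of two product distributions with the sum of the measures of their factors, and it uses the generating function $f(x, y) = x \log(x/y)$ of the directed divergence. The helper names `star`, `sum_form` and `kl_generator` are hypothetical and chosen only for this illustration.

```python
import math


def star(p, r):
    """Product distribution (p_1 r_1, ..., p_1 r_m, p_2 r_1, ..., p_n r_m)."""
    return [pi * rj for pi in p for rj in r]


def sum_form(f, p, q):
    """Measure with the sum property: sum_i f(p_i, q_i)."""
    return sum(f(pi, qi) for pi, qi in zip(p, q))


def kl_generator(x, y):
    """Generating function of the directed divergence (natural log assumed)."""
    return x * math.log(x / y)


# Illustrative distributions of lengths n = 3 and m = 2 (hypothetical values).
P, Q = [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]
R, S = [0.7, 0.3], [0.6, 0.4]

# Left side: measure of the product distributions, of length m * n = 6.
lhs = sum_form(kl_generator, star(P, R), star(Q, S))
# Right side: sum of the measures of the factors.
rhs = sum_form(kl_generator, P, Q) + sum_form(kl_generator, R, S)
```

Up to floating-point rounding, `lhs` and `rhs` agree, which is the additivity of the directed divergence for this pair of lengths.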
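Measures $\mu_n$ having the sum property with a Lebesgue-measurable generating function $f$ are $(2,2)$-additive if and only if they are given by
$$\mu_n(P\|Q) = 4a H_n^3(P) + 4a' H_n^3(Q) - 9a H_n^2(P) - 9a' H_n^2(Q) + b H_n(P) + b' H_n(Q) + c I_n(P\|Q) + c' I_n(Q\|P) + d n,$$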
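where $a$, $a'$, $b$, $b'$, $c$, $c'$, $d$ are constants, $H_n(P) = -\sum_{i=1}^n p_i \log p_i$ (Shannon entropy), $H_n^\beta(P) = (2^{1-\beta} - 1)^{-1}\left(\sum_{i=1}^n p_i^\beta - 1\right)$ (entropy of degree $\beta \neq 1$) and $I_n(P\|Q) = -\sum_{i=1}^n p_i \log q_i$ (inaccuracy). However, (a1) is not symmetric and does not satisfy the triangle inequality, so its use as a metric is limited. In [a7], the symmetric divergence or $J$-divergence $J_n(P\|Q) = D_n(P\|Q) + D_n(Q\|P)$ was introduced to restore symmetry.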
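The failure of symmetry and of the triangle inequality, and the way the symmetrized divergence repairs the former, can be seen on small examples. The sketch below is only illustrative: the function names and the three two-point distributions are assumptions, and the natural logarithm is used.

```python
import math


def directed_divergence(p, q):
    """Directed divergence sum_i p_i * log(p_i / q_i) (natural log assumed)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def j_divergence(p, q):
    """Symmetrized divergence: directed divergence taken in both directions."""
    return directed_divergence(p, q) + directed_divergence(q, p)


# Three illustrative two-point distributions (hypothetical values).
P = [0.5, 0.5]
Q = [0.25, 0.75]
R = [0.1, 0.9]

# Asymmetry: swapping the arguments changes the value.
asymmetry_gap = abs(directed_divergence(P, Q) - directed_divergence(Q, P))

# Triangle inequality fails here: the direct "distance" from P to R
# exceeds the sum of the two legs through Q.
direct = directed_divergence(P, R)
via_q = directed_divergence(P, Q) + directed_divergence(Q, R)

# The symmetrized divergence does not depend on the order of its arguments.
j_pq = j_divergence(P, Q)
j_qp = j_divergence(Q, P)
```

Here `direct` is larger than `via_q`, so the directed divergence is not a metric, whereas `j_pq` and `j_qp` coincide.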
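A sequence of measures $\{\mu_n\}$ is said to be symmetrically additive if
$$\mu_{nm}(P \star R \,\|\, Q \star S) + \mu_{nm}(P \star S \,\|\, Q \star R) = 2\mu_n(P\|Q) + 2\mu_m(R\|S)$$
for all $P, Q \in \Gamma_n$, $R, S \in \Gamma_m$, with $\star$ the same product of distributions as in the additivity condition.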
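A numerical check of symmetric additivity for the symmetrized divergence can be sketched as follows. It assumes that the condition reads $\mu(P \star R \| Q \star S) + \mu(P \star S \| Q \star R) = 2\mu(P\|Q) + 2\mu(R\|S)$, with $\star$ the product of distributions; this reading, the helper names and the example vectors are assumptions of this illustration.

```python
import math


def star(p, r):
    """Product distribution (p_1 r_1, ..., p_1 r_m, p_2 r_1, ..., p_n r_m)."""
    return [pi * rj for pi in p for rj in r]


def directed_divergence(p, q):
    """Directed divergence sum_i p_i * log(p_i / q_i) (natural log assumed)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def j_divergence(p, q):
    """Symmetrized divergence: directed divergence taken in both directions."""
    return directed_divergence(p, q) + directed_divergence(q, p)


# Illustrative distributions of lengths n = 3 and m = 2 (hypothetical values).
P, Q = [0.5, 0.3, 0.2], [0.4, 0.4, 0.2]
R, S = [0.7, 0.3], [0.6, 0.4]

# Left side: both pairings of the second factors (R with P, then S with P).
lhs = (j_divergence(star(P, R), star(Q, S))
       + j_divergence(star(P, S), star(Q, R)))
# Right side: twice the measure of each pair of factors.
rhs = 2 * j_divergence(P, Q) + 2 * j_divergence(R, S)
```

The two sides agree up to rounding for these inputs, which is consistent with the symmetrized divergence being symmetrically additive under the assumed form of the condition.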
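Sum-form measures $\mu_n$ with a measurable symmetric generating function $f : I^2 \to \mathbb{R}$ which are symmetrically additive for all pairs of integers $m, n \geq 2$ have the form [a5]
$$\mu_n(P\|Q) = \sum_{i=1}^n \left[ p_i\,(a \log p_i + b \log q_i) + q_i\,(a \log q_i + b \log p_i) \right],$$
where $a$ and $b$ are constants.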
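It is well known that $H_n(P) \leq I_n(P\|Q)$, that is,
$$-\sum_{i=1}^n p_i \log p_i \leq -\sum_{i=1}^n p_i \log q_i,$$
which is known as the Shannon inequality. This inequality gives rise to the error $D_n(P\|Q) \geq 0$ in (a1). A function $\mu_n : \Gamma_n^2 \to \mathbb{R}$ is called a separability measure if and only if $\mu_n(P\|Q) \geq 0$ and $\mu_n(P\|Q)$ attains a minimum if $P = Q$, for all $P, Q \in \Gamma_n$ with $n \geq 2$. A separability measure $\mu_n$ is a distance measure of Kullback–Leibler type if there exists an $f : I \to \mathbb{R}$ such that $\mu_n(P\|Q) = \sum_{i=1}^n p_i\,(f(p_i) - f(q_i))$. Any Kullback–Leibler-type distance measure with generating function $f$ satisfies the inequality $\sum_{i=1}^n p_i f(q_i) \leq \sum_{i=1}^n p_i f(p_i)$ (see [a10], [a2]).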
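The Shannon inequality can be probed empirically in the form $\sum_i p_i (f(p_i) - f(q_i)) \geq 0$ with $f = \log$. The sketch below samples random pairs of distributions and counts violations. The sampling scheme, the seed, the helper names and the small tolerance for rounding error are all assumptions of this illustration, and a finite random search is evidence rather than a proof.

```python
import math
import random


def random_distribution(n, rng):
    """Random distribution of length n with strictly positive entries."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_type_measure(f, p, q):
    """Sum-form measure sum_i p_i * (f(p_i) - f(q_i)) for a one-place f."""
    return sum(pi * (f(pi) - f(qi)) for pi, qi in zip(p, q))


rng = random.Random(0)  # fixed seed so the run is repeatable
violations = 0
for _ in range(1000):
    n = rng.randint(2, 6)
    p = random_distribution(n, rng)
    q = random_distribution(n, rng)
    # With f = log this is the directed divergence; allow tiny rounding error.
    if kl_type_measure(math.log, p, q) < -1e-12:
        violations += 1

# The measure vanishes when both arguments coincide.
at_equality = kl_type_measure(math.log, [0.2, 0.8], [0.2, 0.8])
```

No violations are expected with $f = \log$, and `at_equality` is zero, matching the requirement that the minimum is attained at $P = Q$.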
[a1] J. Aczél, Z. Daróczy, "On measures of information and their characterizations", Acad. Press (1975) Zbl 0345.94022
[a2] J. Aczél, A.M. Ostrowski, "On the characterization of Shannon's entropy by Shannon's inequality", J. Austral. Math. Soc., 16 (1973) pp. 368–374
[a3] A. Bhattacharyya, "On a measure of divergence between two statistical populations defined by their probability distributions", Bull. Calcutta Math. Soc., 35 (1943) pp. 99–109
[a4] A. Bhattacharyya, "On a measure of divergence between two multinomial populations", Sankhya, 7 (1946) pp. 401–406
[a5] J.K. Chung, P.L. Kannappan, C.T. Ng, P.K. Sahoo, "Measures of distance between probability distributions", J. Math. Anal. Appl., 139 (1989) pp. 280–292 DOI 10.1016/0022-247X(89)90335-1 Zbl 0669.60025
[a6] I. Csiszár, "Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten", Magyar Tud. Kutato Int. Közl., 8 (1963) pp. 85–108
[a7] H. Jeffreys, "An invariant form for the prior probability in estimation problems", Proc. Roy. Soc. London A, 186 (1946) pp. 453–461 DOI 10.1098/rspa.1946.0056 Zbl 0063.03050
[a8] Pl. Kannappan, P.N. Rathie, "On various characterizations of directed divergence", Proc. Sixth Prague Conf. on Information Theory, Statistical Decision Functions and Random Process (1971)
[a9] Pl. Kannappan, C.T. Ng, "Representation of measures information", Trans. Eighth Prague Conf., C, Prague (1979) pp. 203–206
[a10] Pl. Kannappan, P.K. Sahoo, "Kullback–Leibler type distance measures between probability distributions", J. Math. Phys. Sci., 26 (1993) pp. 443–454
[a11] Pl. Kannappan, P.K. Sahoo, J.K. Chung, "On a functional equation associated with the symmetric divergence measures", Utilita Math., 44 (1993) pp. 75–83
[a12] S. Kullback, "Information theory and statistics", Peter Smith, reprint, Gloucester MA (1978)
[a13] S. Kullback, R.A. Leibler, "On information and sufficiency", Ann. Math. Stat., 22 (1951) pp. 79–86 DOI 10.1214/aoms/1177729694 Zbl 0042.38403
[a14] C.E. Shannon, "A mathematical theory of communication", Bell System J., 27 (1948) pp. 379–423; 623–656
Kullback–Leibler-type distance measures. Encyclopedia of Mathematics. URL: http://www.encyclopediaofmath.org/index.php?title=Kullback%E2%80%93Leibler-type_distance_measures&oldid=22684