Spearman rho metric

Spearman rho

The non-parametric correlation coefficient (or measure of association) known as Spearman's rho was first discussed by the psychologist C. Spearman in 1904 [a4] as a coefficient of correlation on ranks (cf. also Correlation coefficient; Rank statistic). In modern use, the term "correlation" refers to a measure of a linear relationship between variates (such as the Pearson product-moment correlation coefficient), while "measure of association" refers to a measure of a monotone relationship between variates (such as the Kendall tau metric and Spearman's rho). For an historical review of Spearman's rho and related coefficients, see [a2].

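Spearman's rho, denoted $r_S$, is computed by applying the Pearson product-moment correlation coefficient procedure to the ranks associated with a sample $\{(x_i,y_i)\}_{i=1}^{n}$. Let $R_i=\operatorname{rank}(x_i)$ and $S_i=\operatorname{rank}(y_i)$; then computing the sample (Pearson) correlation coefficient $r$ for $\{(R_i,S_i)\}_{i=1}^{n}$ yields

$$r_S=\frac{\sum_{i=1}^{n}(R_i-\bar R)(S_i-\bar S)}{\sqrt{\sum_{i=1}^{n}(R_i-\bar R)^2\,\sum_{i=1}^{n}(S_i-\bar S)^2}}=1-\frac{6\sum_{i=1}^{n}(R_i-S_i)^2}{n(n^2-1)},$$

where $\bar R=\sum_{i=1}^{n}R_i/n=(n+1)/2=\sum_{i=1}^{n}S_i/n=\bar S$. When ties exist in the data, the following adjusted formula for $r_S$ is used:

$$r_S=\frac{n(n^2-1)-6\sum_{i=1}^{n}(R_i-S_i)^2-6(T+U)}{\sqrt{n(n^2-1)-12T}\,\sqrt{n(n^2-1)-12U}},$$

where $T=\sum_t t(t^2-1)/12$ for $t$ the number of $X$ observations that are tied at a given rank, and $U=\sum_u u(u^2-1)/12$ for $u$ the number of $Y$ observations that are tied at a given rank. For details on the use of $r_S$ in hypothesis testing, and for large-sample theory, see [a1].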
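The computation described above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the helper names `midranks`, `tie_term` and `spearman_rho` are hypothetical, tied observations are assumed to receive the average of the ranks they occupy (midranks), and the tie-adjusted formula is used throughout, since it reduces to $1-6\sum(R_i-S_i)^2/(n(n^2-1))$ when there are no ties.

```python
from collections import Counter
from math import sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the positions i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def tie_term(values):
    """Sum of t(t^2 - 1)/12 over groups of t tied observations."""
    return sum(t * (t * t - 1) for t in Counter(values).values()) / 12


def spearman_rho(x, y):
    """Tie-adjusted sample Spearman rho (undefined if x or y is constant)."""
    n = len(x)
    r, s = midranks(x), midranks(y)
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))
    T, U = tie_term(x), tie_term(y)
    base = n * (n * n - 1)
    return (base - 6 * d2 - 6 * (T + U)) / sqrt((base - 12 * T) * (base - 12 * U))
```

For example, `spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])` has $\sum(R_i-S_i)^2=4$ and $n(n^2-1)=120$, so the result is $1-24/120=0.8$.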

If $X$ and $Y$ are random variables (cf. Random variable) with respective distribution functions $F_X$ and $F_Y$, then the population parameter estimated by $r_S$, usually denoted $\rho_S$, is defined to be the Pearson product-moment correlation coefficient of the random variables $F_X(X)$ and $F_Y(Y)$:

$$\rho_S=\operatorname{corr}[F_X(X),F_Y(Y)]=12\,E[F_X(X)F_Y(Y)]-3.$$

Spearman's $\rho_S$ is occasionally referred to as the grade correlation coefficient, since $F_X(X)$ and $F_Y(Y)$ are sometimes called the "grades" of $X$ and $Y$.

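Like Kendall's tau, $\rho_S$ is a measure of association based on the notion of concordance. One says that two pairs $(x_1,y_1)$ and $(x_2,y_2)$ of real numbers are concordant if $x_1<x_2$ and $y_1<y_2$ or if $x_1>x_2$ and $y_1>y_2$ (i.e., if $(x_1-x_2)(y_1-y_2)>0$); and discordant if $x_1<x_2$ and $y_1>y_2$ or if $x_1>x_2$ and $y_1<y_2$ (i.e., if $(x_1-x_2)(y_1-y_2)<0$). Now, let $(X_1,Y_1)$, $(X_2,Y_2)$ and $(X_3,Y_3)$ be independent random vectors with the same distribution as $(X,Y)$. Then

$$\rho_S=3\,P[(X_1-X_2)(Y_1-Y_3)>0]-3\,P[(X_1-X_2)(Y_1-Y_3)<0],$$

that is, $\rho_S$ is proportional to the difference between the probabilities of concordance and discordance between the random vectors $(X_1,Y_1)$ and $(X_2,Y_3)$ (clearly, $(X_2,Y_3)$ can be replaced by $(X_3,Y_2)$).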
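The concordance representation can be checked by simulation. The sketch below is only an illustration under stated assumptions: `rho_by_concordance` and its parameters are hypothetical names, and `sample_pair` is assumed to be a user-supplied function that draws one pair from the joint distribution using the given random generator. Each iteration draws three independent pairs, keeps the first coordinate of the second and the second coordinate of the third, and averages the sign of the product of the differences.

```python
import random


def rho_by_concordance(sample_pair, n_triples=20000, seed=0):
    """Monte Carlo estimate of 3 * (P[concordant] - P[discordant]).

    Concordance is taken between (X1, Y1) and (X2, Y3), where the three
    pairs are drawn independently from the same joint distribution.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_triples):
        x1, y1 = sample_pair(rng)
        x2, _unused_y2 = sample_pair(rng)
        _unused_x3, y3 = sample_pair(rng)
        prod = (x1 - x2) * (y1 - y3)
        total += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return 3 * total / n_triples
```

With independent coordinates the estimate should be close to 0, and with identical coordinates (a pair of the form $(X,X)$) it should be close to 1, up to Monte Carlo error.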

When $X$ and $Y$ are continuous,

$$\rho_S=12\int_0^1\!\!\int_0^1 uv\,dC_{X,Y}(u,v)-3=12\int_0^1\!\!\int_0^1\left[C_{X,Y}(u,v)-uv\right]du\,dv,$$

where $C_{X,Y}$ is the copula of $X$ and $Y$. Consequently, $\rho_S$ is invariant under strictly increasing transformations of $X$ and $Y$, a property $\rho_S$ shares with Kendall's tau but not with the Pearson product-moment correlation coefficient. Note that $\rho_S$ is proportional to the signed volume between the graphs of the copula $C_{X,Y}(u,v)$ and the "product" copula $\Pi(u,v)=uv$, the copula of independent random variables. For a survey of copulas and their relationship with measures of association, see [a3].
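The "signed volume" interpretation lends itself to a simple numerical check: approximate $12\int_0^1\int_0^1[C(u,v)-uv]\,du\,dv$ with a midpoint rule on a grid. The following is a rough sketch; the function names are hypothetical, and the Farlie-Gumbel-Morgenstern family $C_\theta(u,v)=uv+\theta uv(1-u)(1-v)$, $-1\le\theta\le1$, is used purely as a test case chosen here (it is not discussed in this article) because its value $\rho_S=\theta/3$ is easy to verify by hand.

```python
def rho_from_copula(C, n=200):
    """Midpoint-rule approximation of 12 * integral of [C(u, v) - u*v] over [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += C(u, v) - u * v
    return 12 * total * h * h


def fgm_copula(theta):
    """Farlie-Gumbel-Morgenstern copula (illustrative choice, -1 <= theta <= 1)."""
    return lambda u, v: u * v + theta * u * v * (1 - u) * (1 - v)


def product_copula(u, v):
    """Copula of independent random variables."""
    return u * v
```

Under these assumptions, `rho_from_copula(product_copula)` is 0, `rho_from_copula(fgm_copula(0.6))` is approximately 0.2, and `rho_from_copula(min)` (the copula $\min(u,v)$ of comonotone variables) is approximately 1.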

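Spearman [a5] also proposed an $L_1$ version of $r_S$, known as Spearman's footrule, based on absolute differences $|R_i-S_i|$ in ranks rather than squared differences:

$$f_S=1-\frac{3\sum_{i=1}^{n}|R_i-S_i|}{n^2-1}.$$

The population parameter $\phi_S$ estimated by $f_S$ is given by

$$\phi_S=1-3\int_0^1\!\!\int_0^1|u-v|\,dC_{X,Y}(u,v)=6\int_0^1 C_{X,Y}(u,u)\,du-2.$$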
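A minimal sketch of the sample footrule, computed as $1-3\sum|R_i-S_i|/(n^2-1)$, follows. The function names are hypothetical, and the sketch assumes that neither sample contains ties, so that ranks are simply the positions $1,\dots,n$ in sorted order.

```python
def simple_ranks(values):
    """1-based ranks, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_footrule(x, y):
    """Sample footrule: 1 - 3 * sum |R_i - S_i| / (n^2 - 1), assuming no ties."""
    n = len(x)
    r, s = simple_ranks(x), simple_ranks(y)
    abs_diff = sum(abs(ri - si) for ri, si in zip(r, s))
    return 1 - 3 * abs_diff / (n * n - 1)
```

For identical rankings the result is 1, while for a fully reversed ranking with $n=5$ the absolute rank differences sum to 12 and the result is $1-36/24=-0.5$, so, unlike $r_S$, the footrule does not reach $-1$ in this case.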