Regression matrix



The matrix $ B $ of regression coefficients (cf. Regression coefficient) $ \beta _ {ji} $, $ j = 1 \dots m $, $ i = 1 \dots r $, in a multi-dimensional linear regression model,

$$ \tag{* } X = B Z + \epsilon . $$

Here $ X $ is a matrix with elements $ X _ {jk} $, $ j = 1 \dots m $, $ k = 1 \dots n $, where $ X _ {jk} $, $ k = 1 \dots n $, are observations of the $ j $-th component of the original $ m $-dimensional random variable, $ Z $ is a matrix of known regression variables $ z _ {ik} $, $ i = 1 \dots r $, $ k = 1 \dots n $, and $ \epsilon $ is the matrix of errors $ \epsilon _ {jk} $, $ j = 1 \dots m $, $ k = 1 \dots n $, with $ {\mathsf E} \epsilon _ {jk} = 0 $. The elements $ \beta _ {ji} $ of the regression matrix $ B $ are unknown and have to be estimated. The model (*) is a generalization to the $ m $-dimensional case of the general linear model of regression analysis.
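The following numerical sketch (not part of the original article; the names, sizes and noise level are illustrative assumptions) simulates the model (*) and recovers $ B $ by least squares. Minimizing $ \| X - BZ \| ^ {2} $ gives the standard estimator $ \widehat{B} = X Z ^ {T} ( Z Z ^ {T} ) ^ {-1} $, assuming $ Z Z ^ {T} $ is invertible:

    import numpy as np

    rng = np.random.default_rng(0)

    m, r, n = 2, 3, 500                  # responses, regressors, observations (illustrative)
    B_true = rng.normal(size=(m, r))     # hypothetical "true" regression matrix
    Z = rng.normal(size=(r, n))          # known regression variables z_ik
    eps = 0.1 * rng.normal(size=(m, n))  # zero-mean errors
    X = B_true @ Z + eps                 # the model (*): X = BZ + eps

    # Least squares: B_hat (Z Z^T) = X Z^T, i.e. B_hat = X Z^T (Z Z^T)^{-1};
    # solving the symmetric system avoids forming the inverse explicitly.
    B_hat = np.linalg.solve(Z @ Z.T, Z @ X.T).T

    print(np.abs(B_hat - B_true).max())  # small when the noise is small

For small noise the recovered $ \widehat{B} $ is close to the matrix used to generate the data.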


Comments

In econometrics, for example, a frequently used model is one in which $ m $ variables $ y _ {1} \dots y _ {m} $ to be explained (endogenous variables) are expressed in terms of $ n $ explanatory variables $ x _ {1} \dots x _ {n} $ (exogenous variables) by means of a linear relationship $ y = Ax $. Given $ N $ sets of measurements (with errors), $ ( y _ {t} , x _ {t} ) $, $ t = 1 \dots N $, the matrix of relation coefficients $ A $ is to be estimated. The model is therefore

$$ y _ {t} = A x _ {t} + \epsilon _ {t} . $$

Under the assumption that the $ \epsilon _ {t} $ have zero mean and are independent and identically distributed with a normal distribution, this is the so-called standard linear multiple regression model or, briefly, the linear model or standard linear model. The least squares method yields the optimal estimator:

$$ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} , $$

where $ M _ {xx} = N ^ {-1} ( \sum _ {t=1} ^ {N} x _ {t} x _ {t} ^ {T} ) $, $ M _ {yx} = N ^ {-1} ( \sum _ {t=1} ^ {N} y _ {t} x _ {t} ^ {T} ) $. In the case of a single endogenous variable, $ y = a ^ {T} x $, this can be conveniently written as

$$ \widehat{a} = ( X ^ {T} X) ^ {-1} X ^ {T} Y , $$

where $ Y $ is the column vector of observations $ ( y _ {1} \dots y _ {N} ) ^ {T} $ and $ X $ is the $ ( N \times n ) $ observation matrix consisting of the rows $ x _ {t} ^ {T} $, $ t = 1 \dots N $. Numerous variants and generalizations are considered in [a1], [a2]; cf. also Regression analysis.
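As a numerical cross-check (illustrative only; all variable names below are assumptions, not from the original text), the moment-matrix form $ \widehat{A} = M _ {yx} M _ {xx} ^ {-1} $ and the normal-equations form $ \widehat{a} = ( X ^ {T} X ) ^ {-1} X ^ {T} Y $ can both be computed for a single endogenous variable; they coincide because the factors $ N ^ {-1} $ cancel:

    import numpy as np

    rng = np.random.default_rng(1)

    N, n = 200, 4                        # N measurements, n exogenous variables
    a_true = rng.normal(size=n)          # hypothetical coefficients in y = a^T x
    X = rng.normal(size=(N, n))          # observation matrix with rows x_t^T
    Y = X @ a_true + 0.1 * rng.normal(size=N)   # observations y_t, with noise

    # Moment-matrix form: a_hat = M_yx M_xx^{-1}
    M_xx = X.T @ X / N                   # N^{-1} * sum_t x_t x_t^T
    M_yx = Y @ X / N                     # N^{-1} * sum_t y_t x_t^T
    a_hat_moments = np.linalg.solve(M_xx, M_yx)      # M_xx is symmetric

    # Normal-equations form: a_hat = (X^T X)^{-1} X^T Y
    a_hat_normal = np.linalg.solve(X.T @ X, X.T @ Y)

    print(np.allclose(a_hat_moments, a_hat_normal))  # True: the 1/N factors cancel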

References

[a1] E. Malinvaud, "Statistical methods of econometrics", North-Holland (1970) (Translated from French)
[a2] H. Theil, "Principles of econometrics", North-Holland (1971)