# Matrix

2010 Mathematics Subject Classification: Primary: 15Axx [MSN][ZBL]

A matrix is a rectangular array $$\begin{pmatrix} a_{11}&\dots&a_{1n}\\ \dots&\dots&\dots\\ a_{m1}&\dots&a_{mn}\\ \end{pmatrix}\tag{1}$$ consisting of $m$ rows and $n$ columns, the entries $a_{ij}$ of which belong to some set $K$. (1) is called also an $(m\times n)$-dimensional matrix over $K$, or a matrix of dimensions $m\times n$ over $K$. Let $\def\M{\mathrm{M}}\M_{m,n}(K)$ denote the set of all $(m\times n)$-dimensional matrices over $K$. If $m=n$, then (1) is called a square matrix of order $n$. The set of all square matrices of order $n$ over $K$ is denoted by $\M_n(K)$.

Alternative notations for matrices are: $$\begin{bmatrix} a_{11} & \dots & a_{1n}\\ \dots & \dots & \dots \\ a_{m1} & \dots & a_{mn} \end{bmatrix},\quad \begin{Vmatrix} a_{11} & \dots & a_{1n}\\ \dots & \dots & \dots \\ a_{m1} & \dots & a_{mn} \end{Vmatrix}, {\rm\ \ and\ }\quad (a_{ij}).$$ In the most important cases the role of $K$ is played by the field of real numbers, the field of complex numbers, an arbitrary field, a ring of polynomials, the ring of integers, a ring of functions, or an arbitrary associative ring. The operations of addition and multiplication defined on $K$ are carried over naturally to matrices over $K$, and in this way one is led to the matrix calculus — the subject matter of the theory of matrices.

The notion of a matrix arose first in the middle of the 19th century in the investigations of W. Hamilton, and A. Cayley. Fundamental results in the theory of matrices are due to K. Weierstrass, C. Jordan and G. Frobenius. I.A. Lappo-Danilevskii has developed the theory of analytic functions of several matrix variables and has applied it to the study of systems of linear differential equations.

## Operations with matrices.

Let $K$ be an associative ring and let $A=(a_{ij}),\; B=(b_{ij})\in \M_{m,n}(K)$. Then the sum of the matrices $A$ and $B$ is, by definition, $$A+B = (a_{ij}+b_{ij}).$$ Clearly, $A+B\in\M_{m,n}(K)$ and addition of matrices is associative and commutative. The null matrix in $\M_{m,n}(K)$ is the matrix 0, all entries of which are zero. For every $A\in\M_{m,n}(K)$, $$A+0=0+A=A$$ Let $A=(a_{ij})\in\M_{m,k}(K)$ and $B=(b_{ij})\in\M_{k,n}(K)$. The product of the two matrices $A$ and $B$ is defined by the rule $$AB=(c_{\mu,\nu})\in\M_{m,n}(K)$$ where $$c_{\mu,\nu} = \sum_{j=1}^k a_{\mu j}b_{j\nu}.$$ The product of two elements of $\M_n(K)$ is always defined and belongs to $\M_n(K)$. Multiplication of matrices is associative: If $A\in\M_{m,k}(K)$, $B\in\M_{k,n}(K)$ and $C\in\M_{n,p}(K)$, then $$(AB)C = A(BC)$$ and $ABC\in\M_{m,p}(K)$. The distributivity rule also holds: For $A\in\M_{m,n}(K)$ and $B,C\in\M_{n,m}(K)$, $$A(B+C)=AB+AC,\quad (B+C)A=BA+CA.\tag{2}$$ In particular, (2) holds also for $A,B,C\in\M_{n}(K)$. Consequently, $\M_{n}(K)$ is an associative ring. If $K$ is a ring with an identity, then the matrix $$\def\E{\mathrm{E}}\E_n = \begin{pmatrix}1&\dots&0\\ \vdots&\ddots&\vdots\\ 0&\dots&1\end{pmatrix}$$ is the identity of the ring $\M_{n}(K)$: $$\E_n A = A\E_n = A$$ for all $A\in\M_{n}(K)$. Multiplication of matrices is not commutative: If $n\ge 2$, for every associative ring $K$ with an identity there are matrices $A,B\in\M_{n}(K)$ such that $AB\ne BA$.

Let $\alpha\in K$, $A=(a_{ij})\in\M_{m,n}(K)$; the product of the matrix $A$ by the element (number, scalar) $\alpha$ is, by definition, the matrix $\alpha A= (\alpha a_{ij})$. Then $$(\alpha+\beta)A = \alpha A+\beta A,\quad \alpha(\beta A)=(\alpha\beta)A, \quad\alpha (A+B) = \alpha A+\beta B.$$ Let $K$ be a ring with an identity. The matrix $e_{ij}$ is defined as the element of $\M_{m,n}(K)$ the only non-zero entry of which is the entry $(i,j)$, which equals 1, $1\le i\le m$, $1\le j\le n$. For every $A=(a_{ij})\in \M_{m,n}(K)$, $$A=\sum_{i=1}^m\sum_{j=1}^n a_{ij}e_{ij}.$$ If $K$ is a field, then $\M_{m,n}(K)$ is an $nm$-dimensional vector space over $K$, and the matrices $e_{ij}$ form a basis in this space.

## Block matrices.

Let $m=m_1+\dots+m_k$, $n=n_1+\dots+n_l$, where $m_\mu$ and $n_\nu$ are positive integers. Then a matrix $A\in \M_{m,n}(K)$ can be written in the form $$A=\begin{pmatrix}A_{11}&\dots&A_{1l}\\ \dots &\dots &\dots \\ A_{k1}&\dots &A_{kl}\\ \end{pmatrix}\tag{3}$$ where $A_{\mu\nu}\in \M_{m_\mu,n_\nu}(K)$, $\mu=1,\dots,k$, $\nu=1,\dots,l$. The matrix (3) is called a block matrix. If $B\in \M_{n,p}(K)$, $p=p_1+\dots+p_t$, $p_i>0$, and $B$ is written in the form $$B=\begin{pmatrix}B_{11}&\dots&B_{1t}\\ \dots &\dots &\dots \\ B_{l1}&\dots &B_{lt}\\ \end{pmatrix},\quad B_{ij}\in\M_{n_\nu p_j}(K),$$ then $$AB=C=\begin{pmatrix}C_{11}&\dots&C_{1t}\\ \dots &\dots &\dots \\ C_{k1}&\dots &B_{kt}\\ \end{pmatrix},\quad C_{\mu j} = \sum_{i=1}^l A_{\mu i}B_{ij}.$$ For example, if $n=kl$, then $\M_n(K)$ may be regarded as $\M_k(\Sigma)$, where $\Sigma = M_l(K)$.

The matrix $A\in\M_n(K)$ of the form $$\begin{pmatrix} A_{1} & 0_{12} & \dots & 0_{1k}\\ 0_{21} & A_{2} & \dots & 0_{2k}\\ \dots &\dots &\dots & \dots \\ 0_{k1} & 0_{k2} & \dots & A_{k}\\ \end{pmatrix},$$ where $A_i\in\M_{n_i}(K)$ and $0_{ij}\in \M_{n_i n_j}(K)$ is the null matrix, is denoted by $\def\diag{\mathrm{diag}}\diag[A_1,\dots,A_k]$ and is called block diagonal. The following holds:

$$\diag[A_1,\dots,A_k] +\diag[B_1,\dots,B_k] = \diag[A_1+B_1,\dots,A_k+B_k],$$

$$\diag[A_1,\dots,A_k] \diag[B_1,\dots,B_k] = \diag[A_1 B_1,\dots,A_k B_k],$$

provided that the orders of $A_i$ and $B_i$ coincide for $i=1,\dots,k$.

## Square matrices over a field.

Let $K$ be a field, let $A\in\M_n(K)$ and let $\det A$ be the determinant of the matrix $A$. $A$ is said to be non-degenerate (or non-singular) if $\det A \ne 0$. A matrix $A^{-1}\in\M_n(K)$ is called the inverse of $A$ if $AA^{-1}=A^{-1} A = \E_n$. The invertibility of $A$ in $\M_n(K)$ is equivalent to its non-degeneracy, and $$A^{-1} = (c_{ij}),\quad c_{ij} = \frac{A_{ji}}{\det A},$$ where $A_{ji}$ is the cofactor of the entry $a_{ji}$, $\det(A^{-1})=(\det A)^{-1}$. For $A,B \in\M_n(K)$, $$AB=\E_n \Leftrightarrow BA=\E_n$$ The set of all invertible elements of $\M_n(K)$ is a group under multiplication, called the general linear group and denoted by $\def\GL{\mathrm{GL}}\GL(n,K)$. The powers of a matrix $A$ are defined as follows $$A^0 = \E_n,$$

$$A^k = A^{k-1} A {\rm\ for\ } k>0,$$ and if $A$ is invertible, then $A^{-k} = (A^{-1})^k$. For the polynomial $$f(x) = \alpha_0 + \alpha_1 x + \dots + \alpha_t x^t,\quad f(x)\in K[x],$$ the matrix polynomial $$f(A) = \alpha_0 \E_n + \alpha_1 A + \dots + \alpha_t A^t$$ is defined.

Every matrix from $\M_n(K)$ gives rise to a linear transformation of the $n$-dimensional vector space $V$ over $K$. Let $v_1,\dots v_n$ be a basis in $V$ and let $\sigma:V\to V$ be a linear transformation of $V$. Then $\sigma$ is uniquely determined by the set of vectors $$u_1=\sigma(v_1),\dots,u_n=\sigma(v_n).$$ Moreover, $$\displaylines{\sigma (v_1) = v_1 a_{11} +\cdots+v_n a_{n1},\cr \cdots\cdots\cr \sigma (v_n) = v_1 a_{1n} +\cdots+v_n a_{nn},}\tag{4}$$ where $a_{ij}\in K$. The matrix $A=(a_{ij})$ is called the matrix of the transformation $\sigma$ in the basis $v_1,\dots,v_n$. For a fixed basis, the matrix $A+B$ is the matrix of the linear transformation $\sigma+\tau$, while $AB$ is the matrix of $\sigma\tau$ if $B$ is the matrix of the linear transformation $\tau$. Equality (4) may be written in the form $$[\sigma(v_1),\dots,\sigma(v_n)] = [v_1,\dots,v_n]A.$$ Suppose that $w_1,\dots,w_n$ is a second basis in $V$. Then $[w_1,\dots,w_n]=[v_1,\dots,v_n] T$, $T\in\GL(n,K)$, and $T^{-1}AT$ is the matrix of the transformation $\sigma$ in the basis $[w_1,\dots,w_n]$. Two matrices $A,B\in\M_n(K)$ are similar if there is a matrix $T\in\GL(n,K)$ such that $B = T^{-1}AT$. Here, also, $\det A = \det\; T^{-1}AT$ and the ranks of the matrices $A$ and $B$ coincide. The linear transformation $\sigma$ is called non-degenerate, or non-singular, if $\sigma(V)=V$; $\sigma$ is non-degenerate if and only if its matrix is non-degenerate. If $V$ is regarded as the space of columns $\M_{n,1}(K)$, then every linear transformation in $V$ is given by left multiplication of the columns $\nu\in V$ by some $A\in\M_n(K)$: $\sigma(v)=Av$, and the matrix of $\sigma$ in the basis $$v_1 = \begin{pmatrix}1\\0\\ \vdots\\0\end{pmatrix},\dots, v_n = \begin{pmatrix}0\\ \vdots\\0\\1\end{pmatrix}$$ coincides with $A$. A matrix $A\in\M_n(K)$ is singular (or degenerate) if and only if there is a column $v\in\M_{n,1}(K)$, $v\ne 0$, such that $Av=0$.

## Transposition and matrices of special form.

Let $A=(a_{ij})\in \M_{m,n}(K)$. Then the matrix $A^T = (a_{ij})^T\in \M_{n,m}(K)$, where $a_{ij}^T = a_{ji}$, is called the transpose of $A$. Alternative notations are ${}^tA$ and $A'$. Let $A=(a_{ij})\in \M_{m,n}(\C)$. Then $\bar A = (\bar a_{ij})$, where $\bar a_{ij}$ is the complex conjugate of the number $a_{ij}$, is called the complex conjugate of $A$. The matrix $A^* = {\bar A}^T$, where $A\in\M_n(\C)$, is called the Hermitian conjugate of $A$. Many matrices used in applications are given special names:'

 Name of the matrix: Defining condition: symmetric $A^T = A$ skew-symmetric $A^T = -A$ orthogonal $A^T = A^{-1}$ Hermitian $A^* = A$ unitary $A^* = A^{-1}$ normal $A^*A = A A^*$ unipotent $(A-\mathrm{E}_n)^n = 0$ stochastic $A = (a_{ij})\in \mathrm{M}_n(\C^n),\; a_{ij}\ge0,\; \sum_{j=1}^n = 1, \ {\rm for }\ i=1,\dots,n$ doubly-stochastic $A$ and $A^T$ are stochastic $(0,1)$-matrix every entry of $A$ is either $0$ or $1$

## Polynomial matrices.

Let $K$ be a field and let $K[x]$ be the ring of all polynomials in the variable $x$ with coefficients from $K$. A matrix over $K[x]$ is called a polynomial matrix. For the elements of the ring $M_n(K[x])$ one introduces the following elementary operations: 1) multiplication of a row or column of a matrix by a non-zero element of the field $K$; and 2) addition to a row (column) of another row (respectively, column) of the given matrix, multiplied by a polynomial from $K[x]$. Two matrices $A,B\in\M_n(K[x])$ are called equivalent $(A\sim B)$ if $B$ can be obtained from $A$ through a finite number of elementary operations.

Let $$N=\diag[f_1(x),\dots,f_r(x),0,\dots,0]\in M_n(K[x]),$$ where a) $f_i(x)\ne 0$; b) $f_j(x)$ is divisible by $f_i(x)$ for $j>i$; and c) the coefficient of the leading term in $f_i(x)$ is equal to 1. Then $N$ is called a canonical polynomial matrix. Every equivalence class of elements of the ring $\M_n(K[x])$ contains a unique canonical matrix. If $A\sim N$, where $$N=\diag[f_1(x),\dots,f_r(x),0,\dots,0]$$ is a canonical matrix, then the polynomials $$f_1(x),\dots,f_r(x)$$ are called the invariant factors of $A$; the number $r$ is identical with the rank of $A$. A matrix $A\in \M_n(K[x])$ has an inverse in $\M_n(K[x])$ if and only if $A\sim E_n$. The last condition is in turn equivalent to $\det A \in K\setminus 0$. Two matrices $A,B\in \M_n(K[x])$ are equivalent if and only if $$B=PAQ,$$ where $P,Q\in \M_n(K[x])$, $P\sim Q\sim E_n$.

Let $A\in \M_n(K[x])$. The matrix $$xE_n - A\in \M_n(K[x])$$ is called the characteristic matrix of $A$ and $\det(xE_n-A)$ is called the characteristic polynomial of $A$. For every polynomial of the form $$f(x)=\alpha_0+\alpha_1x+\cdots+\alpha_{n-1}x^{n-1}+x^n\in K[x]$$ there is an $F\in \M_n(K[x])$ such that $$\det(xE_n-A) = f(x)$$ Such is, for example, the matrix $$F=\begin{pmatrix} 0 & 0 &\dots& 0 & -\alpha_0\\ 1 & 0 &\dots& 0 & -\alpha_1\\ 0 & 1 &\dots& 0 & -\alpha_2\\ \ddots&\ddots&\ddots&\vdots&\vdots\\ 0 & 0 &\dots& 1 & -\alpha_0\\ \end{pmatrix}$$ The characteristic polynomials of two similar matrices coincide. However, the fact that two matrices have identical characteristic polynomials does not necessarily entail the fact that the matrices are similar. A similarity criterion is: Two matrices $A,B\in \M_n(K[x])$ are similar if and only if the polynomial matrices $xE_n-A$ and $xE_n-B$ are equivalent. The set of all matrices from $\M_n(K[x])$ having a given characteristic polynomial $f(x)$ is partitioned into a finite number of classes of similar matrices; this set reduces to a single class if and only if $f(x)$ does not have multiple factors in $K[x]$.

Let $A\in \M_n(K)$, $v\in \M_{n,1}(K)$, $v\ne 0$, and suppose that $Av=\lambda v$, where $\lambda\in K$. Then $v$ is called an eigen vector of $A$ and $\lambda$ is called an eigen value of $A$. An element $\lambda\in K$ is an eigen value of a matrix $A$ if and only if it is a root of the characteristic polynomial of $A$. The set of all columns $u\in \M_{n,1}(K)$ such that $Au=\lambda u$ for a fixed eigen value $\lambda$ of $A$ is a subspace of $\M_{n,1}(K)$. The dimension of this subspace equals the defect (or deficiency) $d$ of the matrix $\lambda \E_n - A\$ ($d=n-r$, where $r$ is the rank of $\lambda \E_n - A$). The number $d$ does not exceed the multiplicity of the root $\lambda$, but need not coincide with it. A matrix $A\in \M_{n}(K)$ is similar to a diagonal matrix if and only if it has $n$ linearly independent eigen vectors. If for an $A\in \M_{n}(K)$,

$$\det(x\E_n - A) = (x-\lambda_1)^{n_1}\cdots(x-\lambda_t)^{n_t},\quad \lambda_j\in K$$ and the roots $\lambda_j$ are distinct, then the following holds: $A$ is similar to a diagonal matrix if and only if for each $\lambda_j$, $j=1,\dots,t$, the defect of $\lambda_j \E_n - A$ coincides with $n_j$. In particular, every matrix with $n$ distinct eigen values is similar to a diagonal matrix. Over an algebraically closed field every matrix from $\M_n(K)$ is similar to some triangular matrix from $\M_n(K)$. The Hamilton–Cayley theorem: If $f(x)$ is the characteristic polynomial of a matrix $A$, then $f(A)$ is the null matrix.

By definition, the minimum polynomial of a matrix $A\in \M_{n}(K)$ is the polynomial $m(x)\in K[x]$ with the properties:

$\alpha)$) $m(A)=0$;

$\beta$) the coefficient of the leading term equals 1; and

$\gamma$) if $0\ne \psi(x)\in K[x]$ and the degree of $\psi(x)$ is smaller than the degree of $m(x)$, then $\psi(A)\ne 0$.

Every matrix has a unique minimum polynomial. If $g(x)\in K[x]$ and $g(A)=0$, then the minimum polynomial $m(x)$ of $A$ divides $g(x)$. The minimum polynomial and the characteristic polynomial of $A$ coincide with the last invariant factor, and, respectively, the product of all invariant factors, of the matrix $x\E_n -A$. The minimum polynomial of $A$ equals $$\frac{\det(x\E_n -A)}{d_{n-1}(x)}$$ where $d_{n-1}(x)$ is the greatest common divisor of the minors (cf. Minor) of order $n-1$ of the matrix $x\E_n-A$. A matrix $A\in \M_n(K)$ is similar to a diagonal matrix over the field $K$ if and only if its minimum polynomial is a product of distinct linear factors in the ring $K[x]$.

A matrix $A\in \M_n(K)$ is called nilpotent if $A^k=0$ for some integer $k$. A matrix $A$ is nilpotent if and only if $\det(x\E_n-A) = x^n$. Every nilpotent matrix from $M_n(K)$ is similar to some triangular matrix with zeros on the diagonal.

#### References

 [Be] R. Bellman, "Introduction to matrix analysis", McGraw-Hill (1970) MR0258847 Zbl 0216.06101 [Bo] N. Bourbaki, "Elements of mathematics. Algebra: Algebraic structures. Linear algebra", 1, Addison-Wesley (1974) pp. Chapt.1;2 (Translated from French) MR0354207 [Ga] F.R. [F.R. Gantmakher] Gantmacher, "The theory of matrices", 1, Chelsea, reprint (1977) (Translated from Russian) [Ko] A.I. Kostrikin, "Introduction to algebra", Springer (1982) (Translated from Russian) MR0661256 Zbl 0482.00001 [Ku] A.G. Kurosh, "Higher algebra", MIR (1972) (Translated from Russian) Zbl 0237.13001 [La] P. Lancaster, "Theory of matrices", Acad. Press (1969) MR0245579 Zbl 0186.05301 [Ma] A.I. Mal'tsev, "Foundations of linear algebra", Freeman (1963) (Translated from Russian) [MaMi] M. Marcus, H. Minc, "A survey of matrix theory and matrix inequalities", Allyn & Bacon (1964) MR0162808 Zbl 0126.02404 [Pr] I.B. Proskuryakov, "Higher algebra. Linear algebra, polynomials, general algebra", Pergamon (1965) (Translated from Russian) MR0184948 Zbl 0132.25004 [Ty] A.R.I. Tyshkevich, "Linear algebra and analytic geometry", Minsk (1976) (In Russian) [Vo] V.V. Voevodin, "Algèbre linéare", MIR (1976) (Translated from Russian)

The result on canonical polynomial matrices quoted above has a natural generalization to matrices over principal ideal domains. An $(m\times n)$-matrix $A$ over a principal ideal domain $R$ of the form $$\diag [d_1,\dots,d_r,0,\dots,0]$$ with $d_i$ divisible by $d_{i+1}$, $i=1,\dots,r-1$, is said to be in Smith canonical form. Every matrix $A$ over a principal ideal domain $R$ is equivalent to one in Smith canonical form in the sense that there are an $(m\times m)$-matrix $P$ and an $(n\times n)$-matrix $Q$ such that $P$ and $Q$ are invertible in $\M_m(R)$ and $\M_n(R)$, respectively, and such that $PAQ$ is in Smith canonical form.