Difference between revisions of "Normal form (for matrices)"

From Encyclopedia of Mathematics
{{TEX|done}}

The normal form of a matrix $A$ is a matrix $N$ of a pre-assigned special form obtained from $A$ by means of transformations of a prescribed type. One distinguishes various normal forms, depending on the type of transformations in question, on the domain $K$ to which the coefficients of $A$ belong, on the form of $A$, and, finally, on the specific nature of the problem to be solved (for example, on the desirability of extending or not extending $K$ on transition from $A$ to $N$, on the necessity of determining $N$ from $A$ uniquely or with a certain amount of arbitrariness). Frequently, instead of "normal form" one uses the term "canonical form". Among the classical normal forms are the following. (Henceforth $M_{m \times n}(K)$ denotes the set of all matrices of $m$ rows and $n$ columns with coefficients in $K$.)
  
 
==The Smith normal form==
 
Let $K$ be either the ring of integers $\mathbf Z$ or the ring $F[\lambda]$ of polynomials in $\lambda$ with coefficients in a field $F$. A matrix $B \in M_{m \times n}(K)$ is called equivalent to a matrix $A \in M_{m \times n}(K)$ if there are invertible matrices $C \in M_{m \times m}(K)$ and $D \in M_{n \times n}(K)$ such that $B = CAD$. Here $B$ is equivalent to $A$ if and only if $B$ can be obtained from $A$ by a sequence of elementary row-and-column transformations, that is, transformations of the following three types: a) permutation of the rows (or columns); b) addition to one row (or column) of another row (or column) multiplied by an element of $K$; or c) multiplication of a row (or column) by an invertible element of $K$. For transformations of this kind the following propositions hold: Every matrix $A \in M_{m \times n}(K)$ is equivalent to a matrix $N \in M_{m \times n}(K)$ of the form

$$
N = \left \|
\begin{array}{cccccccc}
d_{1} & {} & {} & {} & {} & {} & {} & 0 \\
{} & \cdot & {} & {} & {} & {} & {} & {} \\
{} & {} & \cdot & {} & {} & {} & {} & {} \\
{} & {} & {} & d_{r} & {} & {} & {} & {} \\
{} & {} & {} & {} & 0 & {} & {} & {} \\
{} & {} & {} & {} & {} & \cdot & {} & {} \\
{} & {} & {} & {} & {} & {} & \cdot & {} \\
0 & {} & {} & {} & {} & {} & {} & 0 \\
\end{array}
\right \| ,
$$
  
where $d_{i} \neq 0$ for all $i$; $d_{i}$ divides $d_{i+1}$ for $i = 1, \dots, r-1$; and if $K = \mathbf Z$, then all $d_{i}$ are positive; if $K = F[\lambda]$, then the leading coefficients of all polynomials $d_{i}$ are 1. This matrix is called the Smith normal form of $A$. The $d_{i}$ are called the invariant factors of $A$ and the number $r$ is called its rank. The Smith normal form of $A$ is uniquely determined and can be found as follows. The rank $r$ of $A$ is the order of the largest non-zero [[Minor|minor]] of $A$. Suppose that $1 \leq j \leq r$; then among all minors of $A$ of order $j$ there is at least one non-zero. Let $\Delta_{j}$, $j = 1, \dots, r$, be the greatest common divisor of all non-zero minors of $A$ of order $j$ (normalized by the condition $\Delta_{j} > 0$ for $K = \mathbf Z$ and such that the leading coefficient of $\Delta_{j}$ is 1 for $K = F[\lambda]$), and let $\Delta_{0} = 1$. Then $d_{j} = \Delta_{j} / \Delta_{j-1}$, $j = 1, \dots, r$. The invariant factors form a full set of invariants of the classes of equivalent matrices: Two matrices in $M_{m \times n}(K)$ are equivalent if and only if their ranks and their invariant factors with equal indices are equal.
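
The recipe $d_{j} = \Delta_{j} / \Delta_{j-1}$ can be tried out directly for $K = \mathbf Z$. The following is only a minimal sketch under stated assumptions: the function names `det` and `invariant_factors` are hypothetical, the determinant uses plain cofactor expansion, and all minors are enumerated, so it is suitable for small matrices only and is not the practical method of the references.

```python
from itertools import combinations
from math import gcd


def det(M):
    """Integer determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 0:
        return 1
    total = 0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total


def invariant_factors(A):
    """Invariant factors d_1, ..., d_r of an integer matrix A (list of rows)."""
    m, n = len(A), len(A[0])
    deltas = [1]  # Delta_0 = 1
    for j in range(1, min(m, n) + 1):
        g = 0
        # gcd of all minors of order j; zero minors do not change the gcd
        for rows in combinations(range(m), j):
            for cols in combinations(range(n), j):
                sub = [[A[r][c] for c in cols] for r in rows]
                g = gcd(g, det(sub))
        if g == 0:  # every minor of order j vanishes, so the rank is j - 1
            break
        deltas.append(g)
    return [deltas[j] // deltas[j - 1] for j in range(1, len(deltas))]


print(invariant_factors([[2, 4, 4], [-6, 6, 12], [10, 4, 16]]))  # [2, 2, 156]
```

For the sample matrix one has $\Delta_{1} = 2$, $\Delta_{2} = 4$ and $\Delta_{3} = 624$, which gives the invariant factors 2, 2, 156 and rank 3.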
  
The invariant factors $d_{1}, \dots, d_{r}$ split (in a unique manner, up to the order of the factors) into the product of powers of irreducible elements $e_{1}, \dots, e_{s}$ of $K$ (which are positive integers $> 1$ when $K = \mathbf Z$, and polynomials of positive degree with leading coefficient 1 when $K = F[\lambda]$):
  
$$
d_{i} = e_{1}^{n_{i1}} \dots e_{s}^{n_{is}} , \quad i = 1, \dots, r ,
$$
  
where the $n_{ij}$ are non-negative integers. Every factor $e_{j}^{n_{ij}}$ for which $n_{ij} > 0$ is called an elementary divisor of $A$ (over $K$). Every elementary divisor of $A$ occurs in the set ${\mathcal E}_{A, K}$ of all elementary divisors of $A$ with multiplicity equal to the number of invariant factors having this divisor in their decompositions. In contrast to the invariant factors, the elementary divisors depend on the ring $K$ over which $A$ is considered: If $K = F[\lambda]$, $\widetilde{F}$ is an extension of $F$ and $\widetilde{K} = \widetilde{F}[\lambda]$, then, in general, a matrix $A \in M_{m \times n}(K) \subset M_{m \times n}(\widetilde{K})$ has distinct elementary divisors (but the same invariant factors), depending on whether $A$ is regarded as an element of $M_{m \times n}(K)$ or of $M_{m \times n}(\widetilde{K})$. The invariant factors can be recovered from the complete collection of elementary divisors, and vice versa.
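
For $K = \mathbf Z$ the passage from invariant factors to elementary divisors amounts to splitting each $d_{i}$ into prime powers. The sketch below illustrates this, assuming trial division is acceptable for the small integers involved; the names `prime_power_factors` and `elementary_divisors` are hypothetical and introduced only for this example.

```python
def prime_power_factors(d):
    """Prime powers p**k that exactly divide the positive integer d."""
    powers = []
    p = 2
    while p * p <= d:
        if d % p == 0:
            q = 1
            while d % p == 0:  # collect the full power of p
                d //= p
                q *= p
            powers.append(q)
        p += 1
    if d > 1:  # what remains is a single prime
        powers.append(d)
    return powers


def elementary_divisors(invariant_factors):
    """Elementary divisors over Z, listed with multiplicity in increasing order."""
    return sorted(q for d in invariant_factors for q in prime_power_factors(d))


print(elementary_divisors([2, 2, 156]))  # [2, 2, 3, 4, 13]
```

Here $156 = 2^{2} \cdot 3 \cdot 13$, so the invariant factors 2, 2, 156 yield the elementary divisors 2, 2, 4, 3, 13; the divisor 2 has multiplicity two because it occurs in two invariant factors.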
  
 
For a practical method of finding the Smith normal form see, for example, [[#References|[1]]].
 
The main result on the Smith normal form was obtained for $K = \mathbf Z$ (see [[#References|[7]]]) and $K = F[\lambda]$ (see [[#References|[8]]]). With practically no changes, the theory of Smith normal forms goes over to the case when $K$ is any principal ideal ring (see [[#References|[3]]], [[#References|[6]]]). The Smith normal form has important applications; for example, the structure theory of finitely-generated modules over principal ideal rings is based on it (see [[#References|[3]]], [[#References|[6]]]); in particular, this holds for the theory of finitely-generated Abelian groups and the theory of the Jordan normal form (see below).

==The natural normal form==

Let $K$ be a field. Two square matrices $A, B \in M_{n \times n}(K)$ are called [[Similar matrices|similar]] over $K$ if there is a non-singular matrix $C \in M_{n \times n}(K)$ such that $B = C^{-1} A C$. There is a close link between similarity and equivalence: Two matrices $A, B \in M_{n \times n}(K)$ are similar if and only if the matrices $\lambda E - A$ and $\lambda E - B$, where $E$ is the identity matrix, are equivalent. Thus, for the similarity of $A$ and $B$ it is necessary and sufficient that all invariant factors, or, what is the same, the collection of [[elementary divisors]] over $K[\lambda]$ of $\lambda E - A$ and $\lambda E - B$, are the same. For a practical method of finding a $C$ for similar matrices $A$ and $B$, see [[#References|[1]]], [[#References|[4]]].
  
The matrix $\lambda E - A$ is called the characteristic matrix of $A \in M_{n \times n}(K)$, and the invariant factors of $\lambda E - A$ are called the similarity invariants of $A$; there are $n$ of them, say $d_{1}, \dots, d_{n}$. The product $d_{1} \cdots d_{n}$ is the determinant of $\lambda E - A$ and is called the characteristic polynomial of $A$. Suppose that $d_{1} = \dots = d_{q} = 1$ and that for $j \geq q + 1$ the degree of $d_{j}$ is at least 1. Then $A$ is similar over $K$ to a block-diagonal matrix $N_{1} \in M_{n \times n}(K)$ of the form

$$
N_{1} = \left \|
\begin{array}{ccccc}
L(d_{q+1}) & {} & {} & {} & 0 \\
{} & \cdot & {} & {} & {} \\
{} & {} & \cdot & {} & {} \\
{} & {} & {} & \cdot & {} \\
0 & {} & {} & {} & L(d_{n}) \\
\end{array}
\right \| ,
$$
  
where $L(f)$ for a polynomial

$$
f = \lambda^{p} + \alpha_{1} \lambda^{p-1} + \dots + \alpha_{p}
$$
  
 
denotes the so-called companion matrix
 
$$
L(f) = \left \|
\begin{array}{cccccc}
0 & 1 & 0 & \dots & 0 & 0 \\
0 & 0 & 1 & \dots & 0 & 0 \\
\cdot & \cdot & \cdot & \dots & \cdot & \cdot \\
0 & 0 & 0 & \dots & 0 & 1 \\
{- \alpha_{p}} & {- \alpha_{p-1}} & {- \alpha_{p-2}} & \dots & {- \alpha_{2}} & {- \alpha_{1}} \\
\end{array}
\right \| .
$$
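
The layout of the companion matrix can be made concrete with a short script. This is a sketch only: the helper names `companion`, `matmul` and `evaluate_at` are hypothetical, and the polynomial is passed as the list of coefficients $\alpha_{1}, \dots, \alpha_{p}$ (the leading coefficient 1 is implicit). The second part checks, for one sample polynomial, the standard fact that a monic polynomial annihilates its own companion matrix.

```python
def companion(alphas):
    """Companion matrix L(f) of f = x^p + alphas[0] x^(p-1) + ... + alphas[p-1]."""
    p = len(alphas)
    # first p - 1 rows: ones on the superdiagonal, zeros elsewhere
    rows = [[1 if c == r + 1 else 0 for c in range(p)] for r in range(p - 1)]
    # last row: -alpha_p, -alpha_{p-1}, ..., -alpha_1
    rows.append([-a for a in reversed(alphas)])
    return rows


def matmul(X, Y):
    """Product of two square matrices of the same size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate_at(alphas, M):
    """Evaluate f(M) by Horner's scheme, with f given as in companion()."""
    n = len(M)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity
    for a in alphas:
        result = matmul(result, M)
        result = [[result[i][j] + a * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# f = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
L = companion([-6, 11, -6])
print(L)                              # [[0, 1, 0], [0, 0, 1], [6, -11, 6]]
print(evaluate_at([-6, 11, -6], L))   # the zero matrix
```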
  
The matrix $N_{1}$ is uniquely determined from $A$ and is called the first natural normal form of $A$ (see [[#References|[1]]], [[#References|[2]]]).
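
As a small illustration (the matrix is chosen here for the example and is not taken from the references), let $K = \mathbf Q$ and $A = \mathop{\rm diag}(1, 2, 1, 2)$. The greatest common divisors of the minors of $\lambda E - A$ of orders 1 to 4 are $\Delta_{1} = \Delta_{2} = 1$, $\Delta_{3} = (\lambda - 1)(\lambda - 2)$ and $\Delta_{4} = (\lambda - 1)^{2} (\lambda - 2)^{2}$, so the similarity invariants are $d_{1} = d_{2} = 1$ and $d_{3} = d_{4} = \lambda^{2} - 3 \lambda + 2$. Hence $q = 2$, and the first natural normal form should consist of two copies of the companion matrix of $\lambda^{2} - 3 \lambda + 2$:

$$
N_{1} = \left \|
\begin{array}{cccc}
0 & 1 & 0 & 0 \\
-2 & 3 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & -2 & 3 \\
\end{array}
\right \| .
$$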
  
Now let $  {\mathcal E} _ {A , K [ \lambda ] }  $
 +
be the collection of all elementary divisors of $  \lambda E - A $.  
 +
Then $  A $
 +
is similar over $  K $
 +
to a block-diagonal matrix $  N _ {2} $(
 +
cf. [[Block-diagonal operator|Block-diagonal operator]]) whose blocks are the companion matrices of all elementary divisors $  e _ {j} ^ {n _ {ij} } \in {\mathcal E} _ {A , K [ \lambda ] }  $
 +
of $  \lambda E - A $:
  
$$
N _ {2}  = \left \|
\begin{array}{ccccc}
\cdot  &{}  &{}  &{}  & 0 \\
{}  &\cdot  &{}  &{}  &{}  \\
{}  &{}  &L ( e _ {j} ^ {n _ {ij} } )  &{}  &{}  \\
{}  &{}  &{}  &\cdot  &{}  \\
0  &{}  &{}  &{}  &\cdot  \\
\end{array}
\right \| .
$$
  
The matrix $  N _ {2} $ is determined from $  A $ only up to the order of the blocks along the main diagonal; it is called the second natural normal form of $  A $ (see [[#References|[1]]], [[#References|[2]]]), or its Frobenius, rational or quasi-natural normal form (see [[#References|[4]]]). In contrast to the first, the second natural form changes, generally speaking, on transition from $  K $ to an extension.
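
A small worked example (added here for illustration, not taken from the cited references) shows this dependence on the field. It assumes, as the last row of the companion matrix displayed above suggests, that $  L ( f ) $ is formed for $  f = \lambda ^ {p} + \alpha _ {1} \lambda ^ {p-1} + \dots + \alpha _ {p} $, and that the first natural normal form is built, as usual, from the invariant factors of $  \lambda E - A $. Let

$$
A  = \left \|
\begin{array}{rr}
0  & 1  \\
- 1  & 0  \\
\end{array}
\right \| ,\ \ 
\mathop{\rm det} ( \lambda E - A )  = \lambda ^ {2} + 1 .
$$

Over $  K = \mathbf R $ the polynomial $  \lambda ^ {2} + 1 $ is irreducible, so it is the only elementary divisor and $  N _ {2} = L ( \lambda ^ {2} + 1 ) = A $. Over the extension $  \mathbf C $ the elementary divisors are $  \lambda - i $ and $  \lambda + i $, and, up to the order of the blocks,

$$
N _ {2}  = \left \|
\begin{array}{cr}
i  & 0  \\
0  &- i  \\
\end{array}
\right \| .
$$

The first natural normal form, by contrast, is $  L ( \lambda ^ {2} + 1 ) = A $ over both fields, since the only non-constant invariant factor of $  \lambda E - A $ is $  \lambda ^ {2} + 1 $ in either case.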
  
 
==The Jordan normal form.==
Let $  K $ be a field, let $  A \in M _ {n \times n }  ( K) $, and let $  {\mathcal E} _ {A , K [ \lambda ] }  = \{ e _ {i} ^ {n _ {ij} } \} $ be the collection of all elementary divisors of $  \lambda E - A $ over $  K [ \lambda ] $. Suppose that $  K $ has the property that the characteristic polynomial $  d _ {n} $ of $  A $ splits in $  K [ \lambda ] $ into linear factors. (This is so, for example, if $  K $ is the field of complex numbers or, more generally, any algebraically closed field.) Then each of the polynomials $  e _ {i} $ has the form $  \lambda - a _ {i} $ for some $  a _ {i} \in K $, and, accordingly, $  e _ {i} ^ {n _ {ij} } $ has the form $  ( \lambda - a _ {i} ) ^ {n _ {ij} } $. The matrix $  J ( f  ) $ in $  M _ {s \times s }  ( K) $ of the form
  
$$
J ( f  )  = \left \|
\begin{array}{cccccc}
a  & 1  &{}  &{}  &{}  & 0  \\
{}  &\cdot  &{}  &{}  &{}  &{}  \\
{}  &{}  &\cdot  &{}  &{}  &{}  \\
{}  &{}  &{}  &\cdot  &{}  &{}  \\
{}  &{}  &{}  &{}  &\cdot  & 1  \\
0 &{}  &{}  &{}  &{}  & a  \\
\end{array}
\right \| ,
$$
  
where $  f = ( \lambda - a )  ^ {s} $, $  a \in K $, is called the hypercompanion matrix of $  f $ (see [[#References|[1]]]) or the Jordan block of order $  s $ with eigenvalue $  a $. The following fundamental proposition holds: A matrix $  A $ is similar over $  K $ to a block-diagonal matrix $  J \in M _ {n \times n }  ( K) $ whose blocks are the hypercompanion matrices of all elementary divisors of $  \lambda E - A $:
  
$$
J  = \left \|
\begin{array}{ccccc}
\cdot  &{}  &{}  &{}  & 0 \\
{}  &\cdot  &{}  &{}  &{}  \\
{}  &{}  &J ( e _ {i} ^ {n _ {ij} } )  &{}  &{}  \\
{}  &{}  &{}  &\cdot  &{}  \\
0  &{}  &{}  &{}  &\cdot  \\
\end{array}
\right \| .
$$
  
The matrix $  J $ is determined only up to the order of the blocks along the main diagonal; it is a [[Jordan matrix|Jordan matrix]] and is called the Jordan normal form of $  A $. If $  K $ does not have the property mentioned above, then $  A $ cannot be brought, over $  K $, to the Jordan normal form (but it can over a finite extension of $  K $). See [[#References|[4]]] for information about the so-called generalized Jordan normal form, reduction to which is possible over any field $  K $.
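
The defining shape of the hypercompanion matrix can be checked by direct computation. The following Python sketch (not taken from the cited references; the helper names `jordan_block`, `mat_mul`, `mat_pow`, `shifted` and `is_zero` are chosen here purely for illustration) builds $  J ( f ) $ for $  f = ( \lambda - a ) ^ {s} $ over the rationals and verifies that $  ( J ( f ) - a E ) ^ {s} = 0 $ while $  ( J ( f ) - a E ) ^ {s-1} \neq 0 $, so that the minimal polynomial of $  J ( f ) $ is $  f $ itself.

```python
from fractions import Fraction


def jordan_block(a, s):
    """Return the s x s matrix with a on the diagonal and 1 on the superdiagonal."""
    return [[Fraction(a) if i == j else Fraction(1) if j == i + 1 else Fraction(0)
             for j in range(s)] for i in range(s)]


def mat_mul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def mat_pow(x, e):
    """e-th power of a square matrix, by repeated multiplication."""
    n = len(x)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, x)
    return result


def shifted(x, a):
    """Return x - a*E, where E is the identity matrix."""
    return [[x[i][j] - (a if i == j else 0) for j in range(len(x))]
            for i in range(len(x))]


def is_zero(x):
    return all(v == 0 for row in x for v in row)


a, s = Fraction(3, 2), 4          # illustrative values only
J = jordan_block(a, s)
N = shifted(J, a)                 # nilpotent part J - a*E
print(is_zero(mat_pow(N, s - 1)), is_zero(mat_pow(N, s)))  # prints: False True
```

Exact rational arithmetic is used so that the zero tests are not affected by rounding.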
  
Apart from the various normal forms for arbitrary matrices, there are also special normal forms of special matrices. Classical examples are the normal forms of symmetric and skew-symmetric matrices. Let $  K $ be a field. Two matrices $  A , B \in M _ {n \times n }  ( K) $ are called congruent (see [[#References|[1]]]) if there is a non-singular matrix $  C \in M _ {n \times n }  ( K) $ such that $  B = C  ^ {T} A C $. Normal forms under the congruence relation have been investigated most thoroughly for the classes of symmetric and skew-symmetric matrices. Suppose that $  \mathop{\rm char}  K \neq 2 $ and that $  A $ is skew-symmetric, that is, $  A  ^ {T} = - A $. Then $  A $ is congruent to a uniquely determined matrix $  H $ of the form
  
$$
H  = \left \|
\begin{array}{rcrccrcccc}
0  & 1  &{}  &{}  &{}  &{}  &{}  &{}  &{}  &{}  \\
- 1  & 0  &{}  &{}  &{}  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  & 0  & 1  &{}  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  &- 1  & 0  &{}  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  &\cdot  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  & 0  & 1  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  &- 1  & 0  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  &{}  &{}  & 0  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  &{}  &{}  &{}  &\cdot  &{}  \\
{}  &{}  &{}  &{}  &{}  &{}  &{}  &{}  &{}  & 0 \\
\end{array}
\right \| ,
$$
  
which can be regarded as the normal form of $  A $ under congruence. If $  A $ is symmetric, that is, $  A  ^ {T} = A $, then it is congruent to a matrix $  D $ of the form
  
$$
D  = \left \|
\begin{array}{cccccccc}
\epsilon _ {1}  &{}  &{}  &{}  &{}  &{}  &{}  & 0  \\
{}  &\cdot  &{}  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  &\cdot  &{}  &{}  &{}  &{}  &{}  \\
{}  &{}  &{}  &\epsilon _ {r}  &{}  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  & 0  &{}  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  &\cdot  &{}  &{}  \\
{}  &{}  &{}  &{}  &{}  &{}  &\cdot  &{}  \\
0  &{}  &{}  &{}  &{}  &{}  &{}  & 0 \\
\end{array}
\right \| ,
$$
  
where $  \epsilon _ {i} \neq 0 $ for all $  i $. The number $  r $ is the rank of $  A $ and is uniquely determined. The subsequent finer choice of the $  \epsilon _ {i} $ depends on the properties of $  K $. Thus, if $  K $ is algebraically closed, one may assume that $  \epsilon _ {1} = \dots = \epsilon _ {r} = 1 $; if $  K $ is the field of real numbers, one may assume that $  \epsilon _ {1} = \dots = \epsilon _ {p} = 1 $ and $  \epsilon _ {p+1} = \dots = \epsilon _ {r} = - 1 $ for a certain $  p $. The matrix $  D $ is uniquely determined by these properties and can be regarded as the normal form of $  A $ under congruence. See [[#References|[6]]], [[#References|[10]]] and [[Quadratic form|Quadratic form]] for information about the normal forms of symmetric matrices for a number of other fields, and also about Hermitian analogues of this theory.
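
The reduction of a symmetric matrix to diagonal form under congruence can be sketched in a few lines. The Python code below is an illustration only, not a method taken from the cited references: it works over the rationals (so $  \mathop{\rm char}  K \neq 2 $), applies each elementary column operation together with the matching row operation, and the names `congruence_diagonalize` and `signature` are hypothetical. For a real symmetric matrix the counts of positive and negative diagonal entries give the numbers $  p $ and $  r - p $ of the normal form; rescaling the entries to $  \pm 1 $ would require square roots and is not carried out here.

```python
from fractions import Fraction


def congruence_diagonalize(a):
    """Return (c, d) with d = c^T a c diagonal, for a symmetric rational matrix a."""
    n = len(a)
    d = [[Fraction(v) for v in row] for row in a]
    c = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_col(src, dst, t):
        # Column operation (dst += t*src) on d and c, then the same row operation on d.
        for m in (d, c):
            for row in m:
                row[dst] += t * row[src]
        for j in range(n):
            d[dst][j] += t * d[src][j]

    def swap(p, q):
        # Swap columns p, q in d and c, then rows p, q in d.
        for m in (d, c):
            for row in m:
                row[p], row[q] = row[q], row[p]
        d[p], d[q] = d[q], d[p]

    for k in range(n):
        if d[k][k] == 0:
            j = next((j for j in range(k + 1, n) if d[j][j] != 0), None)
            if j is not None:
                swap(k, j)                  # bring a non-zero diagonal entry to position k
            else:
                j = next((j for j in range(k + 1, n) if d[k][j] != 0), None)
                if j is None:
                    continue                # row and column k are already zero
                add_col(j, k, Fraction(1))  # now d[k][k] = 2*d[k][j] != 0
        for j in range(k + 1, n):
            if d[k][j] != 0:
                add_col(k, j, -d[k][j] / d[k][k])
    return c, d


def signature(a):
    """Numbers of positive and negative diagonal entries of the diagonalized form."""
    _, d = congruence_diagonalize(a)
    diag = [d[i][i] for i in range(len(d))]
    return sum(v > 0 for v in diag), sum(v < 0 for v in diag)


A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(signature(A))  # prints: (1, 1)
```

Here the rank is $  r = 2 $ and $  p = 1 $, so over the real numbers the normal form of this matrix is the diagonal matrix with entries $  1 , - 1 , 0 $.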
  
A common feature in the theories of normal forms considered above (and also in others) is the fact that the admissible transformations over the relevant set of matrices are determined by the action of a certain group, so that the classes of matrices that can be carried into each other by means of these transformations are the orbits (cf. [[Orbit|Orbit]]) of this group, and the appropriate normal form is the result of selecting in each orbit a certain canonical representative. Thus, the classes of equivalent matrices are the orbits of the group $  G = \mathop{\rm GL} _ {m} ( K) \times  \mathop{\rm GL} _ {n} ( K) $ (where $  \mathop{\rm GL} _ {s} ( K) $ is the group of invertible square matrices of order $  s $ with coefficients in $  K $), acting on $  M _ {m \times n }  ( K) $ by the rule $  A \rightarrow C  ^ {-1} A D $, where $  ( C , D ) \in G $. The classes of similar matrices are the orbits of $  \mathop{\rm GL} _ {n} ( K) $ on $  M _ {n \times n }  ( K) $ acting by the rule $  A \rightarrow C  ^ {-1} A C $, where $  C \in  \mathop{\rm GL} _ {n} ( K) $. The classes of congruent symmetric or skew-symmetric matrices are the orbits of the group $  \mathop{\rm GL} _ {n} ( K) $ on the set of all symmetric or skew-symmetric matrices of order $  n $, acting by the rule $  A \rightarrow C  ^ {T} A C $, where $  C \in  \mathop{\rm GL} _ {n} ( K) $. From this point of view every normal form is a specific example of the solution of part of the general problem of orbital decomposition for the action of a certain transformation group.
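
That the rules above really define group actions can be checked on small examples. The following Python sketch is an illustration under stated assumptions (2 x 2 rational matrices only; the helper names `mul`, `transpose`, `inv2`, `similar` and `congruent` are hypothetical): it verifies that transforming first by $  C $ and then by $  D $ gives the same result as transforming by the product $  C D $, both for similarity and for congruence.

```python
from fractions import Fraction


def mul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    return [list(col) for col in zip(*x)]


def inv2(c):
    """Inverse of an invertible 2 x 2 matrix."""
    (p, q), (r, s) = c
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]


def similar(a, c):
    """The rule A -> C^{-1} A C."""
    return mul(mul(inv2(c), a), c)


def congruent(a, c):
    """The rule A -> C^T A C."""
    return mul(mul(transpose(c), a), c)


A = [[1, 2], [2, 5]]   # symmetric test matrix
C = [[1, 1], [0, 1]]   # invertible
D = [[2, 1], [1, 1]]   # invertible
CD = mul(C, D)

print(similar(similar(A, C), D) == similar(A, CD))        # prints: True
print(congruent(congruent(A, C), D) == congruent(A, CD))  # prints: True
```

Both rules are therefore right actions of $  \mathop{\rm GL} _ {n} ( K) $, and the congruence rule maps symmetric matrices to symmetric matrices.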
  
 
====References====
<table><TR><TD valign="top">[1]</TD> <TD valign="top"> M. Marcus, H. Minc, "A survey of matrix theory and matrix inequalities" , Allyn &amp; Bacon (1964)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top"> P. Lancaster, "Theory of matrices" , Acad. Press (1969) {{MR|0245579}} {{ZBL|0186.05301}} </TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top"> S. Lang, "Algebra" , Addison-Wesley (1974) {{MR|0783636}} {{ZBL|0712.00001}} </TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top"> A.I. Mal'tsev, "Foundations of linear algebra" , Freeman (1963) (Translated from Russian) {{ZBL|0396.15001}} </TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top"> N. Bourbaki, "Elements of mathematics. Algebra: Modules. Rings. Forms" , '''2''' , Addison-Wesley (1975) pp. Chapt.4;5;6 (Translated from French) {{MR|0643362}} {{ZBL|1139.12001}} </TD></TR><TR><TD valign="top">[6]</TD> <TD valign="top"> N. Bourbaki, "Elements of mathematics. Algebra: Algebraic structures. Linear algebra" , '''1''' , Addison-Wesley (1974) pp. Chapt.1;2 (Translated from French) {{MR|0354207}} </TD></TR><TR><TD valign="top">[7]</TD> <TD valign="top"> H.J.S. Smith, "On systems of linear indeterminate equations and congruences" , ''Collected Math. Papers'' , '''1''' , Chelsea, reprint (1979) pp. 367–409</TD></TR><TR><TD valign="top">[8]</TD> <TD valign="top"> G. Frobenius, "Theorie der linearen Formen mit ganzen Coeffizienten" ''J. Reine Angew. Math.'' , '''86''' (1879) pp. 146–208</TD></TR><TR><TD valign="top">[9]</TD> <TD valign="top"> F.R. [F.R. Gantmakher] Gantmacher, "The theory of matrices" , '''1''' , Chelsea, reprint (1977) (Translated from Russian) {{MR|1657129}} {{MR|0107649}} {{MR|0107648}} {{ZBL|0927.15002}} {{ZBL|0927.15001}} {{ZBL|0085.01001}} </TD></TR><TR><TD valign="top">[10]</TD> <TD valign="top"> J.-P. Serre, "A course in arithmetic" , Springer (1973) (Translated from French) {{MR|0344216}} {{ZBL|0256.12001}} </TD></TR></table>
  
====Comments====
The Smith canonical form and a canonical form related to the first natural normal form are of substantial importance in linear control and system theory [[#References|[a1]]], [[#References|[a2]]]. Here one studies systems of equations $  \dot{x} = A x + B u $, $  x \in \mathbf R  ^ {n} $, $  u \in \mathbf R  ^ {m} $, and the similarity relation is: $  ( A , B ) \sim ( S A S  ^ {-1} , S B ) $. A pair of matrices $  A \in \mathbf R ^ {n \times n } $, $  B \in \mathbf R ^ {n \times m } $ is called completely controllable if the rank of the block matrix

$$
( B , A B , \dots , A  ^ {n} B )  =  R ( A , B )
$$
  
is $  n $. Observe that $  R ( S A S  ^ {-1} , S B ) = S R ( A , B ) $, so that a canonical form can be obtained by selecting $  n $ independent column vectors from $  R ( A , B ) $. This can be done in many ways. The most common one is to test the columns of $  R ( A , B ) $ for independence in the order in which they appear in $  R ( A , B ) $. This yields the following so-called Brunovskii–Luenberger canonical form or block companion canonical form for a completely-controllable pair $  ( A , B ) $:
  
$$
\overline{A}\;  = S  ^ {-1} A S  = \left \|
\begin{array}{ccc}
\overline{A}\; _ {11}  &\dots  &\overline{A}\; _ {1m}  \\
\cdot  &{}  &\cdot  \\
\cdot  &{}  &\cdot  \\
\cdot  &{}  &\cdot  \\
\overline{A}\; _ {m1}  &\dots  &\overline{A}\; _ {mm}  \\
\end{array}
\right \| ,
$$
  
$$
\overline{B}\;  = S  ^ {-1} B  = ( \overline{b}\; _ {1} , \dots , \overline{b}\; _ {m} ) ,
$$
  
where $  \overline{A}\; _ {ij} $ is a matrix of size $  d _ {i} \times d _ {j} $ for certain $  d _ {i} \in \mathbf N \cup \{ 0 \} $, $  \sum _ {i=1}  ^ {m} d _ {i} = n $, of the form
  
$$
\overline{A}\; _ {ii}  = \left \|
\begin{array}{ccccccc}
0 & 0 & 0 &\dots  & 0  & 0  &*  \\
1  & 0  & 0  &\dots  & 0  & 0  &*  \\
0  & 1  & 0  &\dots  & 0  & 0  &*  \\
\cdot  &\cdot  &\cdot  &{}  &\cdot  &\cdot  &\cdot  \\
0  & 0  & 0  &\dots  & 0 & 1  &*  \\
\end{array}
\right \| ,
$$
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/n/n067/n067520/n067520247.png" /></td> </tr></table>
+
$$
 +
\overline{A}\; _ {ij}  = \left \|
 +
\begin{array}{cccc}
 +
0  &\dots  & 0  &*  \\
 +
0  &\dots  & 0  &*  \\
 +
\cdot  &{}  &\cdot  &\cdot  \\
 +
0  &\dots  & 0 &*  \\
 +
\end{array}
 +
\right \| \  \textrm{ for }  i \neq j ,
 +
$$
  
and $ \overline{b} _ {j} $ for $ d _ {j} \neq 0 $ is the $ ( d _ {1} + \dots + d _ {j-1} + 1 ) $-th standard basis vector of $ \mathbf R ^ {n} $; the $ \overline{b} _ {j} $ with $ d _ {j} = 0 $ have arbitrary coefficients $ * $. Here the $ * $'s denote coefficients which can take any value. If $ d _ {j} $ or $ d _ {i} $ is zero, the block $ \overline{A} _ {ij} $ is empty (does not occur). Instead of $ \mathbf R $ any field can be used. The $ d _ {j} $ are called controllability indices or Kronecker indices. They are invariants.
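The column-selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the columns are tested in the order $ b _ {1} \dots b _ {m} , A b _ {1} \dots A b _ {m} , A ^ {2} b _ {1} \dots $ (taken here to be the column order of the controllability matrix), exact rational arithmetic is used, and the names `controllability_indices`, `reduce_against` and `mat_vec` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Product of the square matrix A (list of rows) with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reduce_against(basis, v):
    """Subtract multiples of the stored vectors so that v vanishes at their pivots."""
    v = list(v)
    for pivot, b in basis:
        if v[pivot] != 0:
            factor = v[pivot] / b[pivot]
            v = [x - factor * y for x, y in zip(v, b)]
    return v


def controllability_indices(A, B):
    """Count, for each j, how many columns A^k b_j are selected as independent.

    Columns are tested in the assumed order b_1..b_m, A b_1..A b_m, ...
    """
    n, m = len(A), len(B[0])
    A = [[Fraction(x) for x in row] for row in A]
    cols = [[Fraction(B[i][j]) for i in range(n)] for j in range(m)]
    basis = []          # pairs (pivot index, reduced vector)
    d = [0] * m
    for _ in range(n):  # powers A^0, ..., A^(n-1)
        for j in range(m):
            r = reduce_against(basis, cols[j])
            if any(r):  # a non-zero remainder means the column is independent
                pivot = next(i for i, x in enumerate(r) if x != 0)
                basis.append((pivot, r))
                d[j] += 1
        cols = [mat_vec(A, c) for c in cols]
    return d


A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
B = [[0, 0], [0, 1], [1, 0]]
print(controllability_indices(A, B))
```

For a completely-controllable pair the returned numbers add up to $ n $; a zero entry corresponds to a column $ b _ {j} $ that was never selected.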
  
 
Canonical forms are often used in (numerical) computations. This must be done with caution, because they may not depend continuously on the parameters [[#References|[a3]]]. For example, the Jordan canonical form is not continuous; an example of this is:
 
  
$$
\left \|
\begin{array}{cc}
1 & t \\
0 & 1 \\
\end{array}
\right \| \ \mapsto \left \|
\begin{array}{cc}
1 & 1 \\
0 & 1 \\
\end{array}
\right \| \ \textrm{ for } t \neq 0 ,
$$

$$
\left \|
\begin{array}{cc}
1 & 0 \\
0 & 1 \\
\end{array}
\right \| \ \mapsto \left \|
\begin{array}{cc}
1 & 0 \\
0 & 1 \\
\end{array}
\right \| .
$$
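The jump at $ t = 0 $ can be made concrete with a small Python sketch. It covers only the $ 2 \times 2 $ family from the example above and uses the fact that, for a double eigenvalue $ 1 $, the Jordan form is decided by the rank of $ A - I $; the function name `jordan_form_of_shear` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction


def jordan_form_of_shear(t):
    """Jordan canonical form of the matrix [[1, t], [0, 1]] (double eigenvalue 1)."""
    t = Fraction(t)
    # A - I = [[0, t], [0, 0]] has rank 1 when t != 0 and rank 0 when t == 0;
    # rank 1 gives one 2x2 Jordan block, rank 0 gives two 1x1 blocks.
    rank = 1 if t != 0 else 0
    return [[1, rank], [0, 1]]


# The input tends to the identity matrix, but the output does not.
for k in (1, 6, 12):
    t = Fraction(1, 10 ** k)
    print(t, jordan_form_of_shear(t))
print(0, jordan_form_of_shear(0))
```

However small $ t $ becomes, the computed form keeps the entry $ 1 $ above the diagonal, which is why a rank decision made in floating-point arithmetic near $ t = 0 $ is unreliable.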
  
 
The matter of continuous canonical forms has much to do with moduli problems (cf. [[Moduli theory|Moduli theory]]). Related is the matter of canonical forms for families of objects, e.g. canonical forms for holomorphic families of matrices under similarity [[#References|[a4]]]. For a survey of moduli-type questions in linear control theory cf. [[#References|[a5]]].
 
  
In the case of a controllable pair $ ( A , B ) $ with $ m = 1 $, i.e. $ B $ is a vector $ b \in \mathbf R ^ {n} $, the matrix $ A $ is cyclic (see also the section below on normal forms for operators). In this special case there is just one block $ \overline{A} _ {11} $ (and one vector $ \overline{b} _ {1} $). This canonical form for a cyclic matrix with a cyclic vector is also called the Frobenius canonical form or the companion canonical form.
  
 
====References====
 
 
<table><TR><TD valign="top">[a1]</TD> <TD valign="top"> W.A. Wolovich, "Linear multivariable systems" , Springer (1974) {{MR|0359881}} {{ZBL|0291.93002}} </TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top"> J. Klamka, "Controllability of dynamical systems" , Kluwer (1990) {{MR|2461640}} {{MR|1325771}} {{MR|1134783}} {{MR|0707724}} {{MR|0507539}} {{ZBL|0911.93015}} {{ZBL|0876.93016}} {{ZBL|0930.93008}} {{ZBL|1043.93509}} {{ZBL|0853.93020}} {{ZBL|0852.93007}} {{ZBL|0818.93002}} {{ZBL|0797.93004}} {{ZBL|0814.93012}} {{ZBL|0762.93006}} {{ZBL|0732.93008}} {{ZBL|0671.93040}} {{ZBL|0667.93007}} {{ZBL|0666.93009}} {{ZBL|0509.93012}} {{ZBL|0393.93041}} </TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top"> G.H. Golub, J.H. Wilkinson, "Ill-conditioned eigensystems and the computation of the Jordan canonical form" ''SIAM Rev.'' , '''18''' (1976) pp. 578–619 {{MR|0413456}} {{ZBL|0341.65027}} </TD></TR><TR><TD valign="top">[a4]</TD> <TD valign="top"> V.I. Arnol'd, "On matrices depending on parameters" ''Russ. Math. Surv.'' , '''26''' : 2 (1971) pp. 29–43 ''Uspekhi Mat. Nauk'' , '''26''' : 2 (1971) pp. 101–114 {{MR|}} {{ZBL|0259.15011}} </TD></TR><TR><TD valign="top">[a5]</TD> <TD valign="top"> M. Hazewinkel, "(Fine) moduli spaces for linear systems: what are they and what are they good for" C.I. Byrnes (ed.) C.F. Martin (ed.) , ''Geometrical Methods for the Theory of Linear Systems'' , Reidel (1980) pp. 125–193 {{MR|0608993}} {{ZBL|0481.93023}} </TD></TR><TR><TD valign="top">[a6]</TD> <TD valign="top"> H.W. Turnbull, A.C. Aitken, "An introduction to the theory of canonical matrices" , Blackie &amp; Son (1932)</TD></TR></table>
  
==Self-adjoint operator on a Hilbert space==

A normal form of an operator is a representation, up to an isomorphism, of a [[Self-adjoint operator|self-adjoint operator]] $ A $ acting on a Hilbert space $ {\mathcal H} $ as an orthogonal sum of multiplication operators by the independent variable.
  
To begin with, suppose that $ A $ is a cyclic operator; this means that there is an element $ h _ {0} \in {\mathcal H} $ such that every element $ h \in {\mathcal H} $ has a unique representation in the form $ F ( A) h _ {0} $, where $ F ( \xi ) $ is a function for which

$$
\int\limits _ {- \infty } ^ {+ \infty } | F ( \xi ) | ^ {2} d ( E _ \xi h _ {0} , h _ {0} ) < \infty ;
$$
  
here $ E _ \xi $, $ - \infty < \xi < \infty $, is the [[Spectral resolution|spectral resolution]] of $ A $. Let $ {\mathcal L} _ \rho ^ {2} $ be the space of square-integrable functions on $ ( - \infty , + \infty ) $ with weight $ \rho ( \xi ) = ( E _ \xi h _ {0} , h _ {0} ) $, and let $ K _ \rho F = \xi F ( \xi ) $ be the multiplication operator by the independent variable, with domain of definition

$$
D _ {K _ \rho } = \left \{ F ( \xi ) : \int\limits _ {- \infty } ^ {+ \infty } \xi ^ {2} | F ( \xi ) | ^ {2} d \rho ( \xi ) < \infty \right \} .
$$
  
Then the operators $ A $ and $ K _ \rho $ are isomorphic, $ A \simeq K _ \rho $; that is, there exists an isomorphic and isometric mapping $ U : {\mathcal H} \rightarrow {\mathcal L} _ \rho ^ {2} $ such that $ U D _ {A} = D _ {K _ \rho } $ and $ A = U ^ {-1} K _ \rho U $.
  
Suppose, next, that $ A $ is an arbitrary self-adjoint operator. Then $ {\mathcal H} $ can be split into an orthogonal sum of subspaces $ {\mathcal H} _ \alpha $ on each of which $ A $ induces a cyclic operator $ A _ \alpha $, so that $ {\mathcal H} = \sum \oplus {\mathcal H} _ \alpha $, $ A = \sum \oplus A _ \alpha $ and $ A _ \alpha \simeq K _ {\rho _ \alpha } $. If the operator $ K = \sum \oplus K _ {\rho _ \alpha } $ is given on $ {\mathcal L} ^ {2} = \sum \oplus {\mathcal L} _ {\rho _ \alpha } ^ {2} $, then $ A \simeq K $.

The operator $ K $ is called the normal form or canonical representation of $ A $. The theorem on the canonical representation extends to the case of arbitrary normal operators (cf. [[Normal operator|Normal operator]]).
  
 
====References====
 
 
''V.I. Sobolev''
 
  
The normal form of an operator $ A $ is a representation of $ A $, acting on a [[Fock space|Fock space]] constructed over a certain space $ L _ {2} ( M , \sigma ) $, where $ ( M , \sigma ) $ is a [[Measure space|measure space]], in the form of a sum

$$ \tag{1}
A = \sum _ {m , n \geq 0 } \int\limits K _ {n,m} ( x _ {1} \dots x _ {n} ; \ y _ {1} \dots y _ {m} ) \times
$$

$$
\times a ^ {*} ( x _ {1} ) \dots a ^ {*} ( x _ {n} ) a ( y _ {1} ) \dots a ( y _ {m} ) \prod _ {i=1} ^ {n} d \sigma ( x _ {i} ) \prod _ {j=1} ^ {m} d \sigma ( y _ {j} ) ,
$$
  
where $ a ( x) , a ^ {*} ( x) $ ($ x \in M $) are operator-valued generalized functions generating families of [[annihilation operators]] $ \{ {a ( f ) } : {f \in L _ {2} ( M , \sigma ) } \} $ and [[Creation operators|creation operators]] $ \{ {a ^ {*} ( f ) } : {f \in L _ {2} ( M , \sigma ) } \} $:

$$
a ( f ) = \int\limits _ { M } a ( x) f ( x) d \sigma ( x) ,\ \ a ^ {*} ( f ) = \int\limits _ { M } a ^ {*} ( x) \overline{f} ( x) d \sigma ( x) .
$$
  
In each term of expression (1) all factors $ a ( y _ {j} ) $, $ j = 1 \dots m $, stand to the right of all factors $ a ^ {*} ( x _ {i} ) $, $ i = 1 \dots n $, and the (possibly generalized) functions $ K _ {n,m} ( x _ {1} \dots x _ {n} ; y _ {1} \dots y _ {m} ) $ in the two sets of variables $ ( x _ {1} \dots x _ {n} ) \in M ^ {n} $, $ ( y _ {1} \dots y _ {m} ) \in M ^ {m} $, $ n , m = 0 , 1 \dots $ are, in the case of a symmetric (Boson) Fock space, symmetric in the variables of each set separately, and, in the case of an anti-symmetric (Fermion) Fock space, anti-symmetric in these variables.

For any bounded operator $ A $ the normal form exists and is unique.
  
 
The representation (1) can be rewritten in a form containing the annihilation and creation operators directly:
 
  
$$ \tag{2}
A = \sum _ {m , n } \sum _ {
\begin{array}{c}
\{ i _ {1} \dots i _ {n} \} \\
\{ j _ {1} \dots j _ {m} \}
\end{array}
} c _ {i _ {1} \dots i _ {n} j _ {1} \dots j _ {m} } a ^ {*} ( f _ {i _ {1} } ) \dots a ^ {*} ( f _ {i _ {n} } ) a ( f _ {j _ {1} } ) \dots a ( f _ {j _ {m} } ) ,
$$
  
where $ \{ {f _ {i} } : {i = 1 , 2 ,\dots } \} $ is an orthonormal basis in $ L _ {2} ( M , \sigma ) $ and the summation in (2) is over all pairs of finite collections $ \{ f _ {i _ {1} } \dots f _ {i _ {n} } \} $, $ \{ f _ {j _ {1} } \dots f _ {j _ {m} } \} $ of elements of this basis.

In the case of an arbitrary (separable) Hilbert space $ H $ the normal form of an operator $ A $ acting on the Fock space $ \Gamma ( H) $ constructed over $ H $ is determined for a fixed basis $ \{ {f _ {i} } : {i = 1 , 2 ,\dots } \} $ in $ H $ by means of the expression (2), where $ a ( f ) $, $ a ^ {*} ( f ) $, $ f \in H $, are families of annihilation and creation operators acting on $ \Gamma ( H) $.
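As a toy illustration of expression (2), the following Python sketch brings a product of operators into normal form in the simplest possible setting. The assumptions go beyond what this section states: a single Boson mode (a basis with one element $ f $) and the canonical commutation relation $ a a ^ {*} = a ^ {*} a + 1 $. A product is encoded as a string in which `c` stands for $ a ^ {*} $ and `a` for $ a $; the function name `normal_order` is a hypothetical helper.

```python
from collections import Counter


def normal_order(word):
    """Normal form of a product of single-mode Boson operators.

    word: string over {'c', 'a'} ('c' = creation a*, 'a' = annihilation a).
    Returns a Counter mapping (n, m) to the coefficient of (a*)^n a^m,
    assuming the relation a a* = a* a + 1.
    """
    i = word.find("ac")  # an annihilation operator standing left of a creation operator
    if i < 0:
        # No such pair: every 'c' already precedes every 'a'.
        return Counter({(word.count("c"), word.count("a")): 1})
    swapped = word[:i] + "ca" + word[i + 2:]   # the term a* a
    contracted = word[:i] + word[i + 2:]       # the term 1
    result = Counter()
    result.update(normal_order(swapped))       # Counter.update adds coefficients
    result.update(normal_order(contracted))
    return result


print(normal_order("ac"))    # the product a a*
print(normal_order("aacc"))  # the product a a a* a*
```

The keys of the result play the role of the index pairs in (2), and the values play the role of the coefficients $ c $; in every term all annihilation operators stand to the right of all creation operators.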
  
 
====References====
 
  
 
====Comments====
 
 
  
 
====References====
 
 
<table><TR><TD valign="top">[a1]</TD> <TD valign="top"> N.N. [N.N. Bogolyubov] Bogolubov, A.A. Logunov, I.T. Todorov, "Introduction to axiomatic quantum field theory" , Benjamin (1975) (Translated from Russian) {{MR|0452276}} {{MR|0452277}} {{ZBL|}} </TD></TR><TR><TD valign="top">[a2]</TD> <TD valign="top"> G. Källén, "Quantum electrodynamics" , Springer (1972) {{MR|0153346}} {{MR|0056465}} {{MR|0051156}} {{MR|0039581}} {{ZBL|0116.45005}} {{ZBL|0074.44202}} {{ZBL|0050.43001}} {{ZBL|0046.21402}} {{ZBL|0041.57104}} </TD></TR><TR><TD valign="top">[a3]</TD> <TD valign="top"> J. Glimm, A. Jaffe, "Quantum physics, a functional integral point of view" , Springer (1981) {{MR|}} {{ZBL|0461.46051}} </TD></TR></table>
  
The normal form of a recursive function is a method for specifying an <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/n/n067/n067520/n067520342.png" />-place [[Recursive function|recursive function]] <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/n/n067/n067520/n067520343.png" /> in the form
+
==Recursive functions==
  
The normal form of a recursive function is a method for specifying an $n$-place [[Recursive function|recursive function]] $\phi$ in the form

$$ \tag{*}
\phi ( x_1, \dots, x_n ) = g ( \mu z ( f ( x_1, \dots, x_n, z ) = 0 ) ) ,
$$
  
where $f$ is an $(n + 1)$-place [[Primitive recursive function|primitive recursive function]], $g$ is a $1$-place primitive recursive function and $\mu z ( f ( x_1, \dots, x_n, z ) = 0 )$ is the result of applying the [[Least-number operator|least-number operator]] to $f$. Kleene's normal form theorem asserts that there is a primitive recursive function $g$ such that every recursive function $\phi$ can be represented in the form (*) with a suitable function $f$ depending on $\phi$; that is,
  
$$
( \exists g ) ( \forall \phi ) ( \exists f ) ( \forall x_1, \dots, x_n ) : \
[ \phi ( x_1, \dots, x_n ) = g ( \mu z ( f ( x_1, \dots, x_n, z ) = 0 ) ) ] .
$$
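As a small illustration of the shape of (*), and not of Kleene's universal $g$, the following Python sketch implements the least-number operator by unbounded search and writes the integer square root in the form $g ( \mu z ( f ( x , z ) = 0 ) )$. The names `mu`, `f`, `g` and `phi` are chosen here for illustration only; `f` and `g` are simple primitive recursive functions picked for this one example.

```python
from itertools import count


def mu(predicate):
    """Least-number operator: smallest z >= 0 for which predicate(z) holds.

    The search is unbounded, so it terminates only if such a z exists.
    """
    for z in count():
        if predicate(z):
            return z


def f(x, z):
    # Primitive recursive test: equals 0 exactly when (z + 1)^2 exceeds x.
    return 0 if (z + 1) ** 2 > x else 1


def g(z):
    # Outer 1-place function; the identity is enough for this example.
    return z


def phi(x):
    # phi(x) = g(mu z (f(x, z) = 0)) is the integer square root of x.
    return g(mu(lambda z: f(x, z) == 0))


print([phi(x) for x in range(10)])
```

In the theorem itself a single $g$ serves for all recursive functions $\phi$ and only $f$ varies; the sketch fixes one pair $( f , g )$ to show how the search and the outer function fit together.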
  
 
The normal form theorem is one of the most important results in the theory of recursive functions.
 
A.A. Markov [[#References|[2]]] obtained a characterization of those functions $g$ that can be used in the normal form theorem for the representation (*). A function $g$ can be used as the function whose existence is asserted in the normal form theorem if and only if the equation $g ( x ) = n$ has infinitely many solutions for each $n$. Such functions are called functions of great range.
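For instance, the function sending $x$ to the exponent of the highest power of $2$ dividing $x + 1$ takes every value infinitely often, since it equals $n$ at every $x = 2^n ( 2 j + 1 ) - 1$, $j \geq 0$. The Python sketch below, with the hypothetical names `g_example` and `solutions`, checks this for a few values; it illustrates the great-range property only and is not claimed to be the function used in any particular proof.

```python
def g_example(x):
    """Exponent of the highest power of 2 dividing x + 1 (for x >= 0)."""
    n, y = 0, x + 1
    while y % 2 == 0:
        y //= 2
        n += 1
    return n


def solutions(n, how_many):
    """First `how_many` solutions x of g_example(x) = n, from the closed form."""
    return [2 ** n * (2 * j + 1) - 1 for j in range(how_many)]


for n in range(4):
    xs = solutions(n, 5)
    # Every listed x really is a solution of g_example(x) = n.
    assert all(g_example(x) == n for x in xs)
    print(n, xs)
```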
  
 
====References====
 
  
 
====Comments====
 
 
  
 
====References====
 
 
<table><TR><TD valign="top">[a1]</TD> <TD valign="top"> S.C. Kleene, "Introduction to metamathematics" , North-Holland (1951) pp. 288 {{MR|1234051}} {{MR|1570642}} {{MR|0051790}} {{ZBL|0875.03002}} {{ZBL|0604.03002}} {{ZBL|0109.00509}} {{ZBL|0047.00703}} </TD></TR></table>
 
==Normal form of a system of differential equations==
  
 
A normal form of a system of differential equations
 
$$ \tag{1}
\dot{x}_i = \phi_i ( x_1, \dots, x_n ) , \quad i = 1, \dots, n ,
$$
  
near an invariant manifold $M$ is a formal system

$$ \tag{2}
\dot{y}_i = \psi_i ( y_1, \dots, y_n ) , \quad i = 1, \dots, n ,
$$
  
 
that is obtained from (1) by an invertible formal change of coordinates
 
$$ \tag{3}
x_i = \xi_i ( y_1, \dots, y_n ) , \quad i = 1, \dots, n ,
$$
  
in which the Taylor–Fourier series $\psi_i$ contain only [[Resonance terms|resonance terms]]. In a particular case, normal forms occurred first in the dissertation of H. Poincaré (see [[#References|[1]]]). By means of a normal form (2) some systems (1) can be integrated, and many can be investigated for stability and integrated approximately; normal forms have also been used to search for periodic solutions and families of conditionally periodic solutions of (1) and to study their [[Bifurcation|bifurcation]].
  
 
==Normal forms in a neighbourhood of a fixed point.==
 
Suppose that $M$ contains a fixed point $X \equiv ( x_1, \dots, x_n ) = 0$ of the system (1) (that is, $\phi_i ( 0 ) = 0$), that the $\phi_i$ are analytic at it and that $\lambda_1, \dots, \lambda_n$ are the eigenvalues of the matrix $\| \partial \phi_i / \partial x_j \|$ for $X = 0$. Let $\Lambda \equiv ( \lambda_1, \dots, \lambda_n ) \neq 0$. Then in a full neighbourhood of $X = 0$ the system (1) has the following normal form (2): the matrix $\| \partial \psi_i / \partial y_j \|$ has for $Y \equiv ( y_1, \dots, y_n ) = 0$ a normal form (for example, the Jordan normal form) and the Taylor series
  
$$ \tag{4}
\psi_i = y_i \sum_{Q \in N_i} g_{iQ} Y^Q , \quad i = 1, \dots, n ,
$$
  
 
contain only resonance terms for which
 
$$ \tag{5}
( Q , \Lambda ) \equiv q_1 \lambda_1 + \dots + q_n \lambda_n = 0 .
$$
  
Here $Q \equiv ( q_1, \dots, q_n )$, $Y^Q \equiv y_1^{q_1} \cdots y_n^{q_n}$, $N_i = \{ Q : \textrm{ integers } q_j \geq 0 \ ( j \neq i ) , \ q_i \geq -1 , \ q_1 + \dots + q_n \geq 0 \}$. If equation (5) has no solutions $Q \neq 0$ in $N = N_1 \cup \dots \cup N_n$, then the normal form (2) is linear:
  
$$
\dot{y}_i = \lambda_i y_i , \quad i = 1, \dots, n .
$$
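To see which monomials survive in (4) for a concrete $\Lambda$, one can enumerate the solutions of (5) by brute force. The Python sketch below is an illustration under stated assumptions: the function name `resonant_exponents` and the cut-off `max_order` are introduced here, and $\Lambda$ is given by exact numbers (integers or fractions) so that the test $( Q , \Lambda ) = 0$ is exact.

```python
from itertools import product


def resonant_exponents(lam, max_order):
    """Nonzero Q in N_1, ..., N_n with (Q, lam) = 0 and q_1 + ... + q_n <= max_order.

    `lam` must contain exact numbers (ints or Fractions). Each result is a
    pair (i, Q) with Q in N_i, meaning that the monomial y_i * Y^Q may
    appear in psi_i; the index i is 1-based.
    """
    n = len(lam)
    found = []
    for i in range(n):
        # In N_i the i-th exponent may be -1; all other exponents are >= 0.
        ranges = [range(-1 if j == i else 0, max_order + 2) for j in range(n)]
        for q in product(*ranges):
            if not 0 <= sum(q) <= max_order or not any(q):
                continue
            if sum(qj * lj for qj, lj in zip(q, lam)) == 0:
                found.append((i + 1, q))
    return found


print(resonant_exponents((1, 2), 2))   # resonance lambda_2 = 2 * lambda_1
print(resonant_exponents((1, -1), 4))  # resonance lambda_1 + lambda_2 = 0
print(resonant_exponents((2, 3), 6))   # no resonances up to this order
```

For $\Lambda = ( 1 , 2 )$ the only hit is $Q = ( 2 , -1 ) \in N_2$, which corresponds to the term $y_2 Y^Q = y_1^2$ in $\psi_2$. A finite search of this kind can exhibit resonances but cannot prove their absence in all orders.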
  
Every system (1) with $\Lambda \neq 0$ can be reduced in a neighbourhood of a fixed point to its normal form (2) by some formal transformation (3), where the $\xi_i$ are (possibly divergent) power series, $\xi_i ( 0 ) = 0$ and $\mathop{\rm det} \| \partial \xi_i / \partial y_j \| \neq 0$ for $Y = 0$.
  
Generally speaking, the normalizing transformation (3) and the normal form (2) (that is, the coefficients $g_{iQ}$ in (4)) are not uniquely determined by the original system (1). A normal form (2) preserves many properties of the system (1), such as being real, symmetric, Hamiltonian, etc. (see , [[#References|[3]]]). If the original system contains small parameters, one can include them among the coordinates $x_j$, and then $\dot{x}_j = 0$. Such coordinates do not change under a normalizing transformation (see [[#References|[3]]]).
  
If $k$ is the number of linearly independent solutions $Q \in N$ of equation (5), then by means of a transformation

$$
y_i = z_1^{\alpha_{i1}} \cdots z_n^{\alpha_{in}} , \quad i = 1, \dots, n ,
$$
  
where the $\alpha_{ij}$ are integers and $\mathop{\rm det} \| \alpha_{ij} \| = \pm 1$, the normal form (2) is carried to a system

$$
\dot{z}_i = z_i f_i ( z_1, \dots, z_k ) , \quad i = 1, \dots, n
$$
  
(see , [[#References|[3]]]). The solution of this system reduces to a solution of the subsystem of the first $k$ equations and to $n - k$ quadratures. The subsystem has to be investigated in the neighbourhood of the multiple singular point $z_1 = \dots = z_k = 0$, because the $f_1, \dots, f_k$ do not contain linear terms. This can be done by a local method (see [[#References|[3]]]).
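As a worked illustration (an example constructed here, not taken from the references), let $n = 2$ and $\Lambda = ( 1 , -1 )$, with diagonal linear part. The solutions $Q \in N$ of (5) are $Q = ( q , q )$, so $k = 1$ and the normal form (2) reads

$$
\dot{y}_1 = y_1 g_1 ( y_1 y_2 ) , \quad \dot{y}_2 = y_2 g_2 ( y_1 y_2 ) , \quad g_1 ( 0 ) = 1 , \ g_2 ( 0 ) = -1 ,
$$

where $g_1$ and $g_2$ are power series in one variable. The power transformation $y_1 = z_1 z_2^{-1}$, $y_2 = z_2$ (so that $z_1 = y_1 y_2$ and $\mathop{\rm det} \| \alpha_{ij} \| = 1$) carries it to

$$
\dot{z}_1 = z_1 ( g_1 ( z_1 ) + g_2 ( z_1 ) ) , \quad \dot{z}_2 = z_2 g_2 ( z_1 ) .
$$

The first equation is a closed subsystem whose right-hand side has no linear term, because $g_1 ( 0 ) + g_2 ( 0 ) = 0$; once $z_1 ( t )$ is known, $z_2$ follows from the single quadrature $\mathop{\rm log} z_2 = \int g_2 ( z_1 ( t ) ) \, dt$.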
  
 
The following problem has been examined (see ): Under what conditions on the normal form (2) does the normalizing transformation of an analytic system (1) converge (be analytic)? Let
 
$$
\omega_k = \min | ( Q , \Lambda ) |
$$

for those $Q \in N$ for which

$$
( Q , \Lambda ) \neq 0 , \quad q_1 + \dots + q_n < 2^k .
$$
  
Condition $\omega$: $\sum_{k=1}^\infty 2^{-k} \mathop{\rm log} \omega_k^{-1} < \infty$.

Condition $\overline\omega$: $\limsup 2^{-k} \mathop{\rm log} \omega_k^{-1} < \infty$ as $k \rightarrow \infty$.
  
Condition $\overline\omega$ is weaker than $\omega$. Both are satisfied for almost all $\Lambda$ (relative to Lebesgue measure) and are very weak arithmetic restrictions on $\Lambda$.
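The quantities $\omega_k$ can be approximated numerically for small $n$ and $k$, which gives a feeling for how slowly the small denominators $| ( Q , \Lambda ) |$ decrease. The Python sketch below is a brute-force illustration under assumptions made here: the names `omega` and `partial_sum` and the tolerance `tol` (below which a value is treated as an exact resonance) are not part of the theory, and a finite computation of this kind cannot verify condition $\omega$ or $\overline\omega$.

```python
from itertools import product
import math


def omega(lam, k, tol=1e-12):
    """Approximate omega_k = min |(Q, lam)| over Q in N with
    (Q, lam) != 0 and q_1 + ... + q_n < 2**k.

    `lam` holds real or complex floats; values of |(Q, lam)| not exceeding
    `tol` are treated as resonances. The cost is exponential in n and k.
    """
    n, bound = len(lam), 2 ** k
    best = math.inf
    for i in range(n):
        # In N_i the i-th exponent may be -1; all other exponents are >= 0.
        ranges = [range(-1 if j == i else 0, bound + 1) for j in range(n)]
        for q in product(*ranges):
            if not 0 <= sum(q) < bound:
                continue
            value = abs(sum(qj * lj for qj, lj in zip(q, lam)))
            if tol < value < best:
                best = value
    return best


def partial_sum(lam, k_max):
    """Partial sum of 2^(-k) * log(1 / omega_k) for k = 1, ..., k_max."""
    return sum(2.0 ** -k * math.log(1.0 / omega(lam, k)) for k in range(1, k_max + 1))


lam = (1.0, -math.sqrt(2.0))
for k in range(1, 6):
    print(k, omega(lam, k), partial_sum(lam, k))
```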
  
In case $\mathop{\rm Re} \Lambda = 0$ there is also condition $A$ (for the general case, see in ): There exists a power series $a ( Y )$ such that in (4), $\psi_i = \lambda_i y_i a$, $i = 1, \dots, n$.
  
If for an analytic system (1) $\Lambda$ satisfies condition $\omega$ and the normal form (2) satisfies condition $A$, then there exists an analytic transformation of (1) to a certain normal form. If (2) is obtained from an analytic system and fails to satisfy either condition $\overline\omega$ or condition $A$, then there exists an analytic system (1) that has (2) as its normal form, and every transformation to a normal form diverges (is not analytic).
  
Thus, the problem raised above is solved for all normal forms except those for which $\Lambda$ satisfies condition $\overline\omega$, but not $\omega$, while the remaining coefficients of the normal form satisfy condition $A$. The latter is a very rigid restriction on the coefficients of a normal form, and for large $n$ it holds, generally speaking, only in degenerate cases. That is, the basic reason for divergence of a transformation to normal form is not [[Small denominators|small denominators]], but degeneracy of the normal form.
  
But even in cases of divergence of the normalizing transformation (3) with respect to (2), one can study properties of the solutions of the system (1). For example, a real system (1) has a smooth transformation to the normal form (2) even when it is not analytic. The majority of results on smooth normalization have been obtained under the condition that all $\mathop{\rm Re} \lambda_j \neq 0$. Under this condition, with the help of a change $X \rightarrow V$ of finite smoothness class, a system (1) can be brought to a truncated normal form
  
$$ \tag{6}
\dot{v}_i = \widetilde\psi_i ( V ) , \quad i = 1, \dots, n ,
$$
  
where the $\widetilde\psi_i$ are polynomials of degree $m$ (see [[#References|[4]]]–). If in the normalizing transformation (3) all terms of degree higher than $m$ are discarded, the result is a transformation
  
$$ \tag{7}
x_i = \widetilde\xi_i ( U ) , \quad i = 1, \dots, n
$$

(the $\widetilde\xi_i$ are polynomials), that takes (1) to the form
  
$$ \tag{8}
\dot{u}_i = \widetilde\psi_i ( U ) + \widetilde\phi_i ( U ) , \quad i = 1, \dots, n ,
$$
  
where the $\widetilde\psi_i$ are polynomials containing only resonance terms and the $\widetilde\phi_i$ are convergent power series containing only terms of degree higher than $m$. Solutions of the truncated normal form (6) are approximations for solutions of (8) and, after the transformation (7), give approximations of solutions of the original system (1). In many cases one succeeds in constructing for (6) a [[Lyapunov function|Lyapunov function]] (or [[Chetaev function|Chetaev function]]) $f ( V )$ such that
  
$$
| f ( V ) | \leq c_1 | V |^\gamma \quad \textrm{ and } \quad
\left | \sum_{j=1}^{n} \frac{\partial f}{\partial v_j} \widetilde\phi_j \right | > c_2 | V |^{\gamma + m} ,
$$

where $c_1$ and $c_2$ are positive constants. Then $f ( U )$ is a Lyapunov (Chetaev) function for the system (8); that is, the point $X = 0$ is stable (unstable). For example, if all $\mathop{\rm Re} \lambda_i < 0$, one can take $m = 1$, $f = \sum_{i=1}^{n} v_i^2$ and obtain Lyapunov's theorem on stability under linear approximation (see [[#References|[7]]]; for other examples see the survey [[#References|[8]]]).
  
From the normal form (2) one can find invariant analytic sets of the system (1). In what follows it is assumed for simplicity of exposition that $ \mathop{\rm Re} \Lambda = 0 $.

From the normal form (2) one extracts the formal set

$$ {\mathcal A} = \{ {Y } : {\psi _ {i} = \lambda _ {i} y _ {i} a ,\ i = 1 \dots n } \} , $$

where $ a $ is a free parameter. Condition $ A $ is satisfied on the set $ {\mathcal A} $. Let $ X $ be the union of subspaces of the form $ \{ {Y } : {y _ {i} = 0, i = i _ {1} \dots i _ {l} } \} $ such that the corresponding eigenvalues $ \lambda _ {j} $, $ j \neq i _ {1} \dots i _ {l} $, $ 1 \leq j \leq n $, are pairwise commensurable. The formal set $ \widetilde{\mathcal A} = {\mathcal A} \cap K $ is analytic in the system (1). From $ {\mathcal A} $ one selects the subset $ {\mathcal B} $ that is analytic in (1) if condition $ \omega $ holds (see [[#References|[3]]]). On the sets $ \widetilde{\mathcal A} $ and $ {\mathcal B} $ lie periodic solutions and families of conditionally-periodic solutions of (1). By considering the sets $ \widetilde{\mathcal A} $ and $ {\mathcal B} $ in systems with small parameters, one can study all analytic perturbations and bifurcations of such solutions (see [[#References|[9]]]).
  
 
==Generalizations.==
 
If a system (1) does not lead to a normal form (2) but to a system whose right-hand sides contain certain non-resonance terms, then the resulting simplification is less substantial, but can improve the quality of the transformation. Thus, the reduction to a "semi-normal form" is analytic under a weakened condition $ A $ (see ). Another version is a transformation that normalizes a system (1) only on certain submanifolds (for example, on certain coordinate subspaces; see ). A combination of these approaches makes it possible to prove for (1) the existence of invariant submanifolds and of solutions of specific form (see [[#References|[9]]]).

Suppose that a system (1) is defined and analytic in a neighbourhood of an invariant manifold $ M $ of dimension $ k + l $ that is fibred into $ l $-dimensional invariant tori. Then close to $ M $ one can introduce local coordinates

$$ S = ( s _ {1} \dots s _ {k} ) ,\ \ Y = ( y _ {1} \dots y _ {l} ) ,\ \ Z = ( z _ {1} \dots z _ {m} ) , $$

$$ k + l + m = n , $$

such that $ Z = 0 $ on $ M $, $ y _ {j} $ is of period $ 2 \pi $, $ S $ ranges over a certain domain $ H $, and (1) takes the form

$$ \tag{9}
\left . \begin{array}{c}
\dot{S} = \Phi ^ {( 1)} ( S , Y , Z ) ,
\\
\dot{Y} = \Omega ( S , Y ) + \Phi ^ {( 2)} ( S , Y , Z ) ,
\\
\dot{Z} = A ( S , Y ) Z + \Phi ^ {( 3)} ( S , Y , Z ) ,
\end{array}
\right \}
$$
  
where $ \Phi ^ {( j)} = O ( | Z | ) $, $ j = 1 , 2 $, $ \Phi ^ {( 3)} = O ( | Z | ^ {2} ) $ and $ A $ is a matrix. If $ \Omega = \textrm{ const } $ and $ A $ is triangular with constant main diagonal $ \Lambda ( \lambda _ {1} \dots \lambda _ {n} ) $, then (under a weak restriction on the small denominators) there is a formal transformation of the local coordinates $ S , Y , Z \rightarrow U , V , W $ that takes the system (9) to the normal form
  
$$ \tag{10}
\left . \begin{array}{c}
\dot{U} = \sum \Psi _ {PQ} ^ {( 1)} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) ,
\\
\dot{V} = \sum \Psi _ {PQ} ^ {( 2)} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) ,
\\
\dot{w} _ {j} = w _ {j} \sum g _ {jPQ} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) ,\ \ j = 1 \dots m ,
\end{array}
\right \}
$$
  
where $ P \in \mathbf Z ^ {l} $, $ Q \in \mathbf N ^ {m} $, $ U \in H $, and $ i ( P , \Omega ) + ( Q , \Lambda ) = 0 $.
  
If among the coordinates $ Z $ there is a small parameter, (9) can be averaged by the [[Krylov–Bogolyubov method of averaging|Krylov–Bogolyubov method of averaging]] (see [[#References|[10]]]), and the averaged system is a normal form. More generally, perturbation theory can be regarded as a special case of the theory of normal forms, when one of the coordinates is a small parameter (see [[#References|[11]]]).
  
Theorems on the convergence of a normalizing change, on the existence of analytic invariant sets, etc., carry over to the systems (9) and (10). Here the best studied case is when $ M $ is a periodic solution, that is, $ k = 0 $, $ l = 1 $. In this case the theory of normal forms is in many respects identical with the case when $ M $ is a fixed point. Poincaré suggested that one should consider a pointwise mapping of a normal section across the periods. In this context arose a theory of normal forms of pointwise mappings, which is parallel to the corresponding theory for systems (1). For other generalizations of normal forms see [[#References|[3]]], , [[#References|[12]]]–[[#References|[14]]].
  
 
====References====
 

Latest revision as of 07:39, 14 January 2024


The normal form of a matrix $ A $ is a matrix $ N $ of a pre-assigned special form obtained from $ A $ by means of transformations of a prescribed type. One distinguishes various normal forms, depending on the type of transformations in question, on the domain $ K $ to which the coefficients of $ A $ belong, on the form of $ A $, and, finally, on the specific nature of the problem to be solved (for example, on the desirability of extending or not extending $ K $ on transition from $ A $ to $ N $, on the necessity of determining $ N $ from $ A $ uniquely or with a certain amount of arbitrariness). Frequently, instead of "normal form" one uses the term "canonical form". Among the classical normal forms are the following. (Henceforth $ M _ {m \times n } ( K) $ denotes the set of all matrices of $ m $ rows and $ n $ columns with coefficients in $ K $.)

The Smith normal form.

Let $ K $ be either the ring of integers $ \mathbf Z $ or the ring $ F[ \lambda ] $ of polynomials in $ \lambda $ with coefficients in a field $ F $. A matrix $ B \in M _ {m \times n } ( K) $ is called equivalent to a matrix $ A \in M _ {m \times n } ( K) $ if there are invertible matrices $ C \in M _ {m \times m } ( K) $ and $ D \in M _ {n \times n } ( K) $ such that $ B = C A D $. Here $ B $ is equivalent to $ A $ if and only if $ B $ can be obtained from $ A $ by a sequence of elementary row-and-column transformations, that is, transformations of the following three types: a) permutation of the rows (or columns); b) addition to one row (or column) of another row (or column) multiplied by an element of $ K $; or c) multiplication of a row (or column) by an invertible element of $ K $. For transformations of this kind the following propositions hold: Every matrix $ A \in M _ {m \times n } ( K) $ is equivalent to a matrix $ N \in M _ {m \times n } ( K) $ of the form

$$ N = \left \| \begin{array}{cccccccc} d _ {1} &{} &{} &{} &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} &{} &{} &{} \\ {} &{} &\cdot &{} &{} &{} &{} &{} \\ {} &{} &{} &d _ {r} &{} &{} &{} &{} \\ {} &{} &{} &{} & 0 &{} &{} &{} \\ {} &{} &{} &{} &{} &\cdot &{} &{} \\ {} &{} &{} &{} &{} &{} &\cdot &{} \\ 0 &{} &{} &{} &{} &{} &{} & 0 \\ \end{array} \right \| , $$

where $ d _ {i} \neq 0 $ for all $ i $; $ d _ {i} $ divides $ d _ {i+1} $ for $ i = 1 \dots r - 1 $; and if $ K = \mathbf Z $, then all $ d _ {i} $ are positive; if $ K = F [ \lambda ] $, then the leading coefficients of all polynomials $ d _ {i} $ are 1. This matrix is called the Smith normal form of $ A $. The $ d _ {i} $ are called the invariant factors of $ A $ and the number $ r $ is called its rank. The Smith normal form of $ A $ is uniquely determined and can be found as follows. The rank $ r $ of $ A $ is the order of the largest non-zero minor of $ A $. Suppose that $ 1 \leq j \leq r $; then among all minors of $ A $ of order $ j $ there is at least one non-zero. Let $ \Delta _ {j} $, $ j = 1 \dots r $, be the greatest common divisor of all non-zero minors of $ A $ of order $ j $ (normalized by the condition $ \Delta _ {j} > 0 $ for $ K = \mathbf Z $ and such that the leading coefficient of $ \Delta _ {j} $ is 1 for $ K = F [ \lambda ] $), and let $ \Delta _ {0} = 1 $. Then $ d _ {j} = \Delta _ {j} / \Delta _ {j-1} $, $ j = 1 \dots r $. The invariant factors form a full set of invariants of the classes of equivalent matrices: Two matrices in $ M _ {m \times n } ( K) $ are equivalent if and only if their ranks and their invariant factors with equal indices are equal.
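
The recipe $ d _ {j} = \Delta _ {j} / \Delta _ {j-1} $ can be tried out directly for $ K = \mathbf Z $. The following is a minimal sketch, not an efficient algorithm: the function names `det` and `invariant_factors` are illustrative choices, the determinant is computed by plain Laplace expansion, and the gcd is taken over all minors of order $ j $ (zero minors do not change the gcd).

```python
from itertools import combinations
from math import gcd


def det(m):
    """Determinant of a square integer matrix (Laplace expansion, first row)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total


def invariant_factors(a):
    """Invariant factors d_1, ..., d_r of an integer matrix via gcds of minors."""
    rows, cols = len(a), len(a[0])
    deltas = [1]  # Delta_0 = 1
    for j in range(1, min(rows, cols) + 1):
        g = 0  # gcd(0, x) == |x|, so g accumulates the gcd of all minors
        for rs in combinations(range(rows), j):
            for cs in combinations(range(cols), j):
                g = gcd(g, det([[a[r][c] for c in cs] for r in rs]))
        if g == 0:  # every minor of order j vanishes: the rank is j - 1
            break
        deltas.append(g)
    return [deltas[j] // deltas[j - 1] for j in range(1, len(deltas))]


example = [[2, 4, 4], [-6, 6, 12], [10, -4, -16]]
print(invariant_factors(example))
```

The length of the returned list is the rank $ r $; for the zero matrix the list is empty.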

The invariant factors $ d _ {1} \dots d _ {r} $ split (in a unique manner, up to the order of the factors) into the product of powers of irreducible elements $ e _ {1} \dots e _ {s} $ of $ K $ (which are positive integers $ > 1 $ when $ K = \mathbf Z $, and polynomials of positive degree with leading coefficient 1 when $ K = F [ \lambda ] $):

$$ d _ {i} = \ e _ {1} ^ {n _ {i1} } \dots e _ {s} ^ {n _ {is} } ,\ \ i = 1 \dots r , $$

where the $ n _ {ij} $ are non-negative integers. Every factor $ e _ {j} ^ {n _ {ij} } $ for which $ n _ {ij} > 0 $ is called an elementary divisor of $ A $ (over $ K $). Every elementary divisor of $ A $ occurs in the set $ {\mathcal E} _ {A , K } $ of all elementary divisors of $ A $ with multiplicity equal to the number of invariant factors having this divisor in their decompositions. In contrast to the invariant factors, the elementary divisors depend on the ring $ K $ over which $ A $ is considered: If $ K = F [ \lambda ] $, $ \widetilde{F} $ is an extension of $ F $ and $ \widetilde{K} = \widetilde{F} [ \lambda ] $, then, in general, a matrix $ A \in M _ {m \times n } ( K) \subset M _ {m \times n } ( \widetilde{K} ) $ has distinct elementary divisors (but the same invariant factors), depending on whether $ A $ is regarded as an element of $ M _ {m \times n } ( K) $ or of $ M _ {m \times n } ( \widetilde{K} ) $. The invariant factors can be recovered from the complete collection of elementary divisors, and vice versa.
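
For $ K = \mathbf Z $ the passage between invariant factors and elementary divisors can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, factorization is done by trial division, and the recovery step is given the rank $ r $ explicitly, since invariant factors equal to 1 contribute no elementary divisors.

```python
def prime_power_factors(d):
    """Prime-power factors p**k of a positive integer d (trial division)."""
    out = []
    p = 2
    while p * p <= d:
        if d % p == 0:
            q = 1
            while d % p == 0:
                d //= p
                q *= p
            out.append(q)
        p += 1
    if d > 1:
        out.append(d)
    return out


def elementary_divisors(invariant_factors):
    """All elementary divisors, with multiplicity, in increasing order."""
    return sorted(q for d in invariant_factors for q in prime_power_factors(d))


def invariant_factors_from(divisors, rank):
    """Rebuild d_1 | d_2 | ... | d_rank from the elementary divisors."""
    by_prime = {}
    for q in divisors:
        # the smallest divisor > 1 of a prime power is its underlying prime
        p = next(p for p in range(2, q + 1) if q % p == 0)
        by_prime.setdefault(p, []).append(q)
    d = [1] * rank
    for powers in by_prime.values():
        # the highest power of each prime goes into d_rank, the next into d_(rank-1), ...
        for k, q in enumerate(sorted(powers, reverse=True)):
            d[rank - 1 - k] *= q
    return d


divs = elementary_divisors([2, 6, 12])
print(divs, invariant_factors_from(divs, 3))
```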

For a practical method of finding the Smith normal form see, for example, [1].

The main result on the Smith normal form was obtained for $ K = \mathbf Z $ (see [7]) and $ K = F [ \lambda ] $ (see [8]). With practically no changes, the theory of Smith normal forms goes over to the case when $ K $ is any principal ideal ring (see [3], [6]). The Smith normal form has important applications; for example, the structure theory of finitely-generated modules over principal ideal rings is based on it (see [3], [6]); in particular, this holds for the theory of finitely-generated Abelian groups and the theory of the Jordan normal form (see below).

The natural normal form.

Let $ K $ be a field. Two square matrices $ A , B \in M _ {n \times n } ( K) $ are called similar over $ K $ if there is a non-singular matrix $ C \in M _ {n \times n } ( K) $ such that $ B = C ^ {-1} A C $. There is a close link between similarity and equivalence: Two matrices $ A , B \in M _ {n \times n } ( K) $ are similar if and only if the matrices $ \lambda E - A $ and $ \lambda E - B $, where $ E $ is the identity matrix, are equivalent. Thus, for the similarity of $ A $ and $ B $ it is necessary and sufficient that all invariant factors, or, what is the same, the collection of elementary divisors over $ K [ \lambda ] $ of $ \lambda E - A $ and $ \lambda E - B $, are the same. For a practical method of finding a $ C $ for similar matrices $ A $ and $ B $, see [1], [4].

The matrix $ \lambda E - A $ is called the characteristic matrix of $ A \in M _ {n \times n } ( K) $, and the invariant factors of $ \lambda E - A $ are called the similarity invariants of $ A $; there are $ n $ of them, say $ d _ {1} \dots d _ {n} $. The product $ d _ {1} \cdots d _ {n} $ is the determinant of $ \lambda E - A $ and is called the characteristic polynomial of $ A $; the last invariant factor $ d _ {n} $ is the minimal polynomial of $ A $. Suppose that $ d _ {1} = \dots = d _ {q} = 1 $ and that for $ j \geq q + 1 $ the degree of $ d _ {j} $ is at least 1. Then $ A $ is similar over $ K $ to a block-diagonal matrix $ N _ {1} \in M _ {n \times n } ( K) $ of the form

$$ N _ {1} = \left \| \begin{array}{ccccc} L ( d _ {q+1} ) &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} \\ {} &{} &\cdot &{} &{} \\ {} &{} &{} &\cdot &{} \\ 0 &{} &{} &{} &L ( d _ {n} ) \\ \end{array} \right \| , $$

where $ L ( f ) $ for a polynomial

$$ f = \lambda ^ {p} + \alpha _ {1} \lambda ^ {p-1} + \dots + \alpha _ {p} $$

denotes the so-called companion matrix

$$ L ( f ) = \ \left \| \begin{array}{cccccc} 0 & 1 & 0 &\dots & 0 & 0 \\ 0 & 0 & 1 &\dots & 0 & 0 \\ \cdot &\cdot &\cdot &\dots &\cdot &\cdot \\ 0 & 0 & 0 &\dots & 0 & 1 \\ {- \alpha _ {p} } &{- \alpha _ {p-1} } &{- \alpha _ {p-2} } &\dots &{- \alpha _ {2} } &{- \alpha _ {1} } \\ \end{array} \right \| . $$

The matrix $ N _ {1} $ is uniquely determined from $ A $ and is called the first natural normal form of $ A $ (see [1], [2]).
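
As an illustration of the companion matrix $ L ( f ) $ in the layout used above (ones on the superdiagonal, negated coefficients in the last row), here is a small sketch. The function name `companion` and the convention of passing the coefficients as the list $ [ \alpha _ {1} \dots \alpha _ {p} ] $ are assumptions made for this example.

```python
def companion(alphas):
    """Companion matrix L(f) of f = x^p + alphas[0] x^(p-1) + ... + alphas[p-1].

    Rows 0..p-2 carry a single 1 on the superdiagonal; the last row is
    (-alpha_p, -alpha_(p-1), ..., -alpha_1).
    """
    p = len(alphas)
    rows = [[1 if j == i + 1 else 0 for j in range(p)] for i in range(p - 1)]
    rows.append([-alphas[p - 1 - j] for j in range(p)])
    return rows


# f = x^3 - 6 x^2 + 11 x - 6 = (x - 1)(x - 2)(x - 3)
for row in companion([-6, 11, -6]):
    print(row)
```

For $ p = 1 $ the matrix reduces to the single entry $ - \alpha _ {1} $.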

Now let $ {\mathcal E} _ {A , K [ \lambda ] } $ be the collection of all elementary divisors of $ \lambda E - A $. Then $ A $ is similar over $ K $ to a block-diagonal matrix $ N _ {2} $ (cf. Block-diagonal operator) whose blocks are the companion matrices of all elementary divisors $ e _ {j} ^ {n _ {ij} } \in {\mathcal E} _ {A , K [ \lambda ] } $ of $ \lambda E - A $:

$$ N _ {2} = \ \left \| \begin{array}{ccccc} \cdot &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} \\ {} &{} &L ( e _ {j} ^ {n _ {ij} } ) &{} &{} \\ {} &{} &{} &\cdot &{} \\ {} &{} &{} &{} &{} \\ 0 &{} &{} &{} &\cdot \\ \end{array} \right \| . $$

The matrix $ N _ {2} $ is determined from $ A $ only up to the order of the blocks along the main diagonal; it is called the second natural normal form of $ A $ (see [1], [2]), or its Frobenius, rational or quasi-natural normal form (see [4]). In contrast to the first, the second natural form changes, generally speaking, on transition from $ K $ to an extension.

The Jordan normal form.

Let $ K $ be a field, let $ A \in M _ {n \times n } ( K) $, and let $ {\mathcal E} _ {A , K [ \lambda ] } = \{ e _ {i} ^ {n _ {ij} } \} $ be the collection of all elementary divisors of $ \lambda E - A $ over $ K [ \lambda ] $. Suppose that $ K $ has the property that the characteristic polynomial of $ A $ splits in $ K [ \lambda ] $ into linear factors. (This is so, for example, if $ K $ is the field of complex numbers or, more generally, any algebraically closed field.) Then every one of the polynomials $ e _ {i} $ has the form $ \lambda - a _ {i} $ for some $ a _ {i} \in K $, and, accordingly, $ e _ {i} ^ {n _ {ij} } $ has the form $ ( \lambda - a _ {i} ) ^ {n _ {ij} } $. The matrix $ J ( f ) $ in $ M _ {s \times s } ( K) $ of the form

$$ J ( f ) = \ \left \| \begin{array}{cccccc} a & 1 &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} &{} \\ {} &{} &\cdot &{} &{} &{} \\ {} &{} &{} &\cdot &{} &{} \\ {} &{} &{} &{} &\cdot & 1 \\ 0 &{} &{} &{} &{} & a \\ \end{array} \right \| , $$

where $ f = ( \lambda - a ) ^ {s} $, $ a \in K $, is called the hypercompanion matrix of $ f $ (see [1]) or the Jordan block of order $ s $ with eigenvalue $ a $. The following fundamental proposition holds: A matrix $ A $ is similar over $ K $ to a block-diagonal matrix $ J \in M _ {n \times n } ( K) $ whose blocks are the hypercompanion matrices of all elementary divisors of $ \lambda E - A $:

$$ J = \left \| \begin{array}{ccccc} \cdot &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} \\ {} &{} &J ( e _ {i} ^ {n _ {ij} } ) &{} &{} \\ {} &{} &{} &\cdot &{} \\ 0 &{} &{} &{} &\cdot \\ \end{array} \right \| . $$

The matrix $ J $ is determined only up to the order of the blocks along the main diagonal; it is a Jordan matrix and is called the Jordan normal form of $ A $. If $ K $ does not have the property mentioned above, then $ A $ cannot be brought, over $ K $, to the Jordan normal form (but it can over a finite extension of $ K $). See [4] for information about the so-called generalized Jordan normal form, reduction to which is possible over any field $ K $.
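
A Jordan block is easy to write down explicitly. The sketch below (function names `jordan_block`, `matmul` and `nilpotency_index` are illustrative, not taken from any library) builds $ J ( f ) $ for $ f = ( \lambda - a ) ^ {s} $ and checks the property that makes $ f $ its only elementary divisor: $ J - a E $ is nilpotent of index exactly $ s $.

```python
def jordan_block(a, s):
    """Jordan block of order s: a on the diagonal, 1 on the superdiagonal."""
    return [[a if j == i else 1 if j == i + 1 else 0 for j in range(s)]
            for i in range(s)]


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(block, a):
    """Smallest k >= 1 with (block - a*E)^k == 0, or None if there is none."""
    n = len(block)
    shifted = [[block[i][j] - (a if i == j else 0) for j in range(n)]
               for i in range(n)]
    power = shifted
    for k in range(1, n + 1):
        if all(entry == 0 for row in power for entry in row):
            return k
        power = matmul(power, shifted)
    return None


J = jordan_block(2, 3)
print(J, nilpotency_index(J, 2))
```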

Apart from the various normal forms for arbitrary matrices, there are also special normal forms of special matrices. Classical examples are the normal forms of symmetric and skew-symmetric matrices. Let $ K $ be a field. Two matrices $ A , B \in M _ {n \times n } ( K) $ are called congruent (see [1]) if there is a non-singular matrix $ C \in M _ {n \times n } ( K) $ such that $ B = C ^ {T} A C $. Normal forms under the congruence relation have been investigated most thoroughly for the classes of symmetric and skew-symmetric matrices. Suppose that $ \mathop{\rm char} K \neq 2 $ and that $ A $ is skew-symmetric, that is, $ A ^ {T} = - A $. Then $ A $ is congruent to a uniquely determined matrix $ H $ of the form

$$ H = \left \| \begin{array}{rcrccrcccc} 0 & 1 &{} &{} &{} &{} &{} &{} &{} &{} \\ - 1 & 0 &{} &{} &{} &{} &{} &{} &{} &{} \\ {} &{} & 0 & 1 &{} &{} &{} &{} &{} &{} \\ {} &{} &- 1 & 0 &{} &{} &{} &{} &{} &{} \\ {} &{} &{} &{} &\cdot &{} &{} &{} &{} &{} \\ {} &{} &{} &{} &{} & 0 & 1 &{} &{} &{} \\ {} &{} &{} &{} &{} &- 1 & 0 &{} &{} &{} \\ {} &{} &{} &{} &{} &{} &{} & 0 &{} &{} \\ {} &{} &{} &{} &{} &{} &{} &{} &\cdot &{} \\ {} &{} &{} &{} &{} &{} &{} &{} &{} & 0 \\ \end{array} \right \| , $$

which can be regarded as the normal form of $ A $ under congruence. If $ A $ is symmetric, that is, $ A ^ {T} = A $, then it is congruent to a matrix $ D $ of the form

$$ D = \left \| \begin{array}{cccccccc} \epsilon _ {1} &{} &{} &{} &{} &{} &{} & 0 \\ {} &\cdot &{} &{} &{} &{} &{} &{} \\ {} &{} &\cdot &{} &{} &{} &{} &{} \\ {} &{} &{} &\epsilon _ {r} &{} &{} &{} &{} \\ {} &{} &{} &{} & 0 &{} &{} &{} \\ {} &{} &{} &{} &{} &\cdot &{} &{} \\ {} &{} &{} &{} &{} &{} &\cdot &{} \\ 0 &{} &{} &{} &{} &{} &{} & 0 \\ \end{array} \right \| , $$

where $ \epsilon _ {i} \neq 0 $ for all $ i $. The number $ r $ is the rank of $ A $ and is uniquely determined. The subsequent finer choice of the $ \epsilon _ {i} $ depends on the properties of $ K $. Thus, if $ K $ is algebraically closed, one may assume that $ \epsilon _ {1} = \dots = \epsilon _ {r} = 1 $; if $ K $ is the field of real numbers, one may assume that $ \epsilon _ {1} = \dots = \epsilon _ {p} = 1 $ and $ \epsilon _ {p+1} = \dots = \epsilon _ {r} = - 1 $ for a certain $ p $. $ D $ is uniquely determined by these properties and can be regarded as the normal form of $ A $ under congruence. See [6], [10] and Quadratic form for information about the normal forms of symmetric matrices for a number of other fields, and also about Hermitian analogues of this theory.
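
For a real symmetric matrix with rational entries, the pair $ ( p , r - p ) $ that fixes the normal form $ D $ can be computed by simultaneous row and column operations, each of which is a congruence $ A \rightarrow C ^ {T} A C $. The following is a sketch under these assumptions; the function name `signature` is illustrative, and exact arithmetic with `fractions.Fraction` is used instead of floating point.

```python
from fractions import Fraction


def signature(a):
    """Return (p, q): the numbers of +1 and -1 in the real congruence normal form."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]

    def swap(i, j):
        # swap rows i, j and then columns i, j (a congruence)
        m[i], m[j] = m[j], m[i]
        for row in m:
            row[i], row[j] = row[j], row[i]

    def add(src, dst, f):
        # row_dst += f * row_src, then col_dst += f * col_src (a congruence)
        m[dst] = [x + f * y for x, y in zip(m[dst], m[src])]
        for row in m:
            row[dst] += f * row[src]

    p = q = 0
    for k in range(n):
        pivot = next((i for i in range(k, n) if m[i][i] != 0), None)
        if pivot is None:
            pair = next(((i, j) for i in range(k, n) for j in range(i + 1, n)
                         if m[i][j] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, j = pair
            add(j, i, 1)  # the diagonal entry m[i][i] becomes 2 * m[i][j] != 0
            pivot = i
        if pivot != k:
            swap(k, pivot)
        for i in range(k + 1, n):
            if m[i][k] != 0:
                add(k, i, -m[i][k] / m[k][k])  # clear row i and column i at k
        if m[k][k] > 0:
            p += 1
        else:
            q += 1
    return p, q


print(signature([[0, 1], [1, 0]]))
```

The rank is $ r = p + q $, so the normal form $ D $ has $ p $ entries $ 1 $ followed by $ q $ entries $ - 1 $ and then zeros.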

A common feature in the theories of normal forms considered above (and also in others) is the fact that the admissible transformations over the relevant set of matrices are determined by the action of a certain group, so that the classes of matrices that can be carried into each other by means of these transformations are the orbits (cf. Orbit) of this group, and the appropriate normal form is the result of selecting in each orbit a certain canonical representative. Thus, the classes of equivalent matrices are the orbits of the group $ G = \mathop{\rm GL} _ {m} ( K) \times \mathop{\rm GL} _ {n} ( K) $ (where $ \mathop{\rm GL} _ {s} ( K) $ is the group of invertible square matrices of order $ s $ with coefficients in $ K $), acting on $ M _ {m \times n } ( K) $ by the rule $ A \rightarrow C ^ {-1} A D $, where $ ( C , D ) \in G $. The classes of similar matrices are the orbits of $ \mathop{\rm GL} _ {n} ( K) $ on $ M _ {n \times n } ( K) $ acting by the rule $ A \rightarrow C ^ {-1} A C $, where $ C \in \mathop{\rm GL} _ {n} ( K) $. The classes of congruent symmetric or skew-symmetric matrices are the orbits of the group $ \mathop{\rm GL} _ {n} ( K) $ on the set of all symmetric or skew-symmetric matrices of order $ n $, acting by the rule $ A \rightarrow C ^ {T} A C $, where $ C \in \mathop{\rm GL} _ {n} ( K) $. From this point of view every normal form is a specific example of the solution of part of the general problem of orbital decomposition for the action of a certain transformation group.

References

[1] M. Marcus, H. Minc, "A survey of matrix theory and matrix inequalities" , Allyn & Bacon (1964)
[2] P. Lancaster, "Theory of matrices" , Acad. Press (1969) MR0245579 Zbl 0186.05301
[3] S. Lang, "Algebra" , Addison-Wesley (1974) MR0783636 Zbl 0712.00001
[4] A.I. Mal'tsev, "Foundations of linear algebra" , Freeman (1963) (Translated from Russian) Zbl 0396.15001
[5] N. Bourbaki, "Elements of mathematics. Algebra: Modules. Rings. Forms" , 2 , Addison-Wesley (1975) pp. Chapt.4;5;6 (Translated from French) MR0643362 Zbl 1139.12001
[6] N. Bourbaki, "Elements of mathematics. Algebra: Algebraic structures. Linear algebra" , 1 , Addison-Wesley (1974) pp. Chapt.1;2 (Translated from French) MR0354207
[7] H.J.S. Smith, "On systems of linear indeterminate equations and congruences" , Collected Math. Papers , 1 , Chelsea, reprint (1979) pp. 367–409
[8] G. Frobenius, "Theorie der linearen Formen mit ganzen Coeffizienten" J. Reine Angew. Math. , 86 (1879) pp. 146–208
[9] F.R. [F.R. Gantmakher] Gantmacher, "The theory of matrices" , 1 , Chelsea, reprint (1977) (Translated from Russian) MR1657129 MR0107649 MR0107648 Zbl 0927.15002 Zbl 0927.15001 Zbl 0085.01001
[10] J.-P. Serre, "A course in arithmetic" , Springer (1973) (Translated from French) MR0344216 Zbl 0256.12001

Comments

The Smith canonical form and a canonical form related to the first natural normal form are of substantial importance in linear control and system theory [a1], [a2]. Here one studies systems of equations $ \dot{x} = A x + B u $, $ x \in \mathbf R ^ {n} $, $ u \in \mathbf R ^ {m} $, and the similarity relation is: $ ( A , B ) \sim ( S A S ^ {-1} , S B ) $. A pair of matrices $ A \in \mathbf R ^ {n \times n } $, $ B \in \mathbf R ^ {n \times m } $ is called completely controllable if the rank of the block matrix

$$ ( B , A B \dots A ^ {n} B ) = R ( A , B ) $$

is $ n $. Observe that $ R ( S A S ^ {-1} , S B ) = S R ( A , B ) $, so that a canonical form can be formed by selecting $ n $ independent column vectors from $ R ( A , B ) $. This can be done in many ways. The most common one is to test the columns of $ R ( A , B ) $ for independence in the order in which they appear in $ R ( A , B ) $. This yields the following so-called Brunovskii–Luenberger canonical form or block companion canonical form for a completely-controllable pair $ ( A , B ) $:

$$ \overline{A}\; = S ^ {-1} A S = \ \left \| \begin{array}{ccc} \overline{A}\; _ {11} &\dots &\overline{A}\; _ {1m} \\ \cdot &{} &\cdot \\ \cdot &{} &\cdot \\ \cdot &{} &\cdot \\ \overline{A}\; _ {m1} &\dots &\overline{A}\; _ {mm} \\ \end{array} \right \| , $$

$$ \overline{B}\; = S ^ {-1} B = ( \overline{b}\; _ {1} \dots \overline{b}\; _ {m} ) , $$

where $ \overline{A}\; _ {ij} $ is a matrix of size $ d _ {i} \times d _ {j} $ for certain $ d _ {i} \in \mathbf N \cup \{ 0 \} $, $ \sum _ {i=1} ^ {m} d _ {i} = n $, of the form

$$ \overline{A}\; _ {ii} = \left \| \begin{array}{ccccccc} 0 & 0 & 0 &\dots & 0 & 0 &* \\ 1 & 0 & 0 &\dots & 0 & 0 &* \\ 0 & 1 & 0 &\dots & 0 & 0 &* \\ \cdot &\cdot &\cdot &{} &\cdot &\cdot &\cdot \\ 0 & 0 & 0 &\dots & 0 & 1 &* \\ \end{array} \right \| , $$

$$ \overline{A}\; _ {ij} = \left \| \begin{array}{cccc} 0 &\dots & 0 &* \\ 0 &\dots & 0 &* \\ \cdot &{} &\cdot &\cdot \\ 0 &\dots & 0 &* \\ \end{array} \right \| \ \textrm{ for } i \neq j , $$

and $ \overline{b}\; _ {j} $ for $ d _ {j} \neq 0 $ is the $ ( d _ {1} + \dots + d _ {j-1} + 1 ) $-th standard basis vector of $ \mathbf R ^ {n} $; the $ \overline{b}\; _ {j} $ with $ d _ {j} = 0 $ have arbitrary coefficients $ * $. Here the $ * $'s denote coefficients which can take any value. If $ d _ {j} $ or $ d _ {i} $ is zero, the block $ \overline{A}\; _ {ij} $ is empty (does not occur). Instead of $ \mathbf R $ any field can be used. The $ d _ {j} $ are called controllability indices or Kronecker indices. They are invariants.
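
The selection rule described above (test the columns of $ R ( A , B ) $ for independence in the order in which they appear) can be sketched as follows for rational data. The function names are illustrative, and the sketch assumes that only the columns $ B , A B \dots A ^ {n-1} B $ need to be scanned, which follows from the Cayley–Hamilton theorem; it returns the number of selected columns of the form $ A ^ {k} b _ {j} $ for each $ j $.

```python
from fractions import Fraction


def matvec(a, v):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def controllability_indices(a, b):
    """Count, per input j, the columns A^k b_j kept when scanning B, AB, ... in order."""
    n, m = len(a), len(b[0])
    cols = [[Fraction(b[i][j]) for i in range(n)] for j in range(m)]  # b_1 .. b_m
    basis = []  # (pivot position, reduced vector) for every selected column
    d = [0] * m
    for _ in range(n):  # powers A^0 .. A^(n-1)
        for j in range(m):
            v = cols[j][:]
            for piv, w in basis:  # reduce v against the columns chosen so far
                if v[piv] != 0:
                    f = v[piv] / w[piv]
                    v = [x - f * y for x, y in zip(v, w)]
            piv = next((i for i, x in enumerate(v) if x != 0), None)
            if piv is not None:  # v is independent of the earlier choices
                basis.append((piv, v))
                d[j] += 1
        cols = [matvec(a, c) for c in cols]  # move on to the next power of A
    return d


A = [[0, 1], [0, 0]]
B = [[0], [1]]
d = controllability_indices(A, B)
print(d, sum(d) == len(A))
```

The pair $ ( A , B ) $ is completely controllable exactly when the returned numbers add up to $ n $.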

Canonical forms are often used in (numerical) computations. This must be done with caution, because they may not depend continuously on the parameters [a3]. For example, the Jordan canonical form is not continuous; an example of this is:

$$ \left \| \begin{array}{cc} 1 & t \\ 0 & 1 \\ \end{array} \right \| \ \mapsto \left \| \begin{array}{cc} 1 & 1 \\ 0 & 1 \\ \end{array} \right \| \ \textrm{ for } t \neq 0 , $$

$$ \left \| \begin{array}{cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right \| \ \mapsto \left \| \begin{array}{cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right \| . $$
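The jump in this example can be checked numerically. The sketch below is an illustration only: it uses the standard fact that the number of Jordan blocks belonging to an eigenvalue $ \lambda $ equals $ n - \mathop{\rm rank} ( M - \lambda I ) $, and the helper names `rank` and `jordan_block_count` are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def jordan_block_count(M, eigenvalue):
    """Number of Jordan blocks of M for the given eigenvalue: n - rank(M - eigenvalue * I)."""
    n = len(M)
    shifted = [[Fraction(M[i][j]) - (eigenvalue if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


def family(t):
    """The matrix [[1, t], [0, 1]] from the example above."""
    return [[1, t], [0, 1]]


# One block for every t != 0, however small, but two blocks at t = 0.
print(jordan_block_count(family(Fraction(1, 10**6)), 1))
print(jordan_block_count(family(0), 1))
```

The block count is an integer-valued function of $ t $ that changes at $ t = 0 $, which is one way to see that the Jordan form cannot depend continuously on the parameter.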

The matter of continuous canonical forms has much to do with moduli problems (cf. Moduli theory). Related is the matter of canonical forms for families of objects, e.g. canonical forms for holomorphic families of matrices under similarity [a4]. For a survey of moduli-type questions in linear control theory cf. [a5].

In the case of a controllable pair $ ( A , B ) $ with $ m = 1 $, i.e. $ B $ is a vector $ b \in \mathbf R ^ {n} $, the matrix $ A $ is cyclic, see also the section below on normal forms for operators. In this special case there is just one block $ \overline{A}\; _ {11} $ (and one vector $ \overline{b}\; _ {1} $). This canonical form for a cyclic matrix with a cyclic vector is also called the Frobenius canonical form or the companion canonical form.

References

[a1] W.A. Wolovich, "Linear multivariable systems" , Springer (1974) MR0359881 Zbl 0291.93002
[a2] J. Klamka, "Controllability of dynamical systems" , Kluwer (1990) MR2461640 MR1325771 MR1134783 MR0707724 MR0507539 Zbl 0911.93015 Zbl 0876.93016 Zbl 0930.93008 Zbl 1043.93509 Zbl 0853.93020 Zbl 0852.93007 Zbl 0818.93002 Zbl 0797.93004 Zbl 0814.93012 Zbl 0762.93006 Zbl 0732.93008 Zbl 0671.93040 Zbl 0667.93007 Zbl 0666.93009 Zbl 0509.93012 Zbl 0393.93041
[a3] G.H. Golub, J.H. Wilkinson, "Ill conditioned eigensystems and the computation of the Jordan canonical form" SIAM Rev. , 18 (1976) pp. 578–619 MR0413456 Zbl 0341.65027
[a4] V.I. Arnol'd, "On matrices depending on parameters" Russ. Math. Surv. , 26 : 2 (1971) pp. 29–43 Uspekhi Mat. Nauk , 26 : 2 (1971) pp. 101–114 Zbl 0259.15011
[a5] M. Hazewinkel, "(Fine) moduli spaces for linear systems: what are they and what are they good for" C.I. Byrnes (ed.) C.F. Martin (ed.) , Geometrical Methods for the Theory of Linear Systems , Reidel (1980) pp. 125–193 MR0608993 Zbl 0481.93023
[a6] H.W. Turnbull, A.C. Aitken, "An introduction to the theory of canonical matrices" , Blackie & Son (1932)

Self-adjoint operator on a Hilbert space

A normal form of an operator is a representation, up to an isomorphism, of a self-adjoint operator $ A $ acting on a Hilbert space $ {\mathcal H} $ as an orthogonal sum of multiplication operators by the independent variable.

To begin with, suppose that $ A $ is a cyclic operator; this means that there is an element $ h _ {0} \in {\mathcal H} $ such that every element $ h \in {\mathcal H} $ has a unique representation in the form $ F ( A) h _ {0} $, where $ F ( \xi ) $ is a function for which

$$ \int\limits _ {- \infty } ^ { {+ } \infty } | F ( \xi ) | ^ {2} d ( E _ \xi h _ {0} , h _ {0} ) < \infty ; $$

here $ E _ \xi $, $ - \infty < \xi < \infty $, is the spectral resolution of $ A $. Let $ {\mathcal L} _ \rho ^ {2} $ be the space of square-integrable functions on $ ( - \infty , + \infty ) $ with weight $ \rho ( \xi ) = ( E _ \xi h _ {0} , h _ {0} ) $, and let $ K _ \rho F = \xi F ( \xi ) $ be the multiplication operator by the independent variable, with domain of definition

$$ D _ {K _ \rho } = \ \left \{ { F ( \xi ) } : { \int\limits _ {- \infty } ^ { {+ } \infty } \xi ^ {2} | F ( \xi ) | ^ {2} d \rho ( \xi ) < \infty } \right \} . $$

Then the operators $ A $ and $ K _ \rho $ are isomorphic, $ A \simeq K _ \rho $; that is, there exists an isomorphic and isometric mapping $ U : {\mathcal H} \rightarrow {\mathcal L} _ \rho ^ {2} $ such that $ U D _ {A} = D _ {K _ \rho } $ and $ A = U ^ {-1} K _ \rho U $.

Suppose, next, that $ A $ is an arbitrary self-adjoint operator. Then $ {\mathcal H} $ can be split into an orthogonal sum of subspaces $ {\mathcal H} _ \alpha $ on each of which $ A $ induces a cyclic operator $ A _ \alpha $, so that $ {\mathcal H} = \sum \oplus {\mathcal H} _ \alpha $, $ A = \sum \oplus A _ \alpha $ and $ A _ \alpha \simeq K _ {\rho _ \alpha } $. If the operator $ K = \sum \oplus K _ {\rho _ \alpha } $ is given on $ {\mathcal L} ^ {2} = \sum \oplus {\mathcal L} _ {\rho _ \alpha } ^ {2} $, then $ A \simeq K $.

The operator $ K $ is called the normal form or canonical representation of $ A $. The theorem on the canonical representation extends to the case of arbitrary normal operators (cf. Normal operator).

References

[1] A.I. Plesner, "Spectral theory of linear operators" , F. Ungar (1965) (Translated from Russian) MR0194900 Zbl 0188.44402 Zbl 0185.21002
[2] N.I. Akhiezer, I.M. Glazman, "Theory of linear operators in Hilbert spaces" , 1–2 , Pitman (1981) (Translated from Russian) MR0615737 MR0615736

V.I. Sobolev

The normal form of an operator $ A $ is a representation of $ A $, acting on a Fock space constructed over a certain space $ L _ {2} ( M , \sigma ) $, where $ ( M , \sigma ) $ is a measure space, in the form of a sum

$$ \tag{1 } A = \sum _ {m , n \geq 0 } \int\limits K _ {n,m} ( x _ {1} \dots x _ {n} ; \ y _ {1} \dots y _ {m} ) \times $$

$$ \times a ^ {*} ( x _ {1} ) \dots a ^ {*} ( x _ {n} ) a ( y _ {1} ) \dots a ( y _ {m} ) \prod_{i=1}^ { n } d \sigma ( x _ {i} ) \prod_{j=1}^ { m } d \sigma ( y _ {j} ) , $$

where $ a ( x) , a ^ {*} ( x) $ ($ x \in M $) are operator-valued generalized functions generating families of annihilation operators $ \{ {a ( f ) } : {f \in L _ {2} ( M , \sigma ) } \} $ and creation operators $ \{ {a ^ {*} ( f ) } : {f \in L _ {2} ( M , \sigma ) } \} $:

$$ a ( f ) = \int\limits _ { M } a ( x) f ( x) d \sigma ( x) ,\ \ a ^ {*} ( f ) = \int\limits _ { M } a ^ {*} ( x) \overline{f}\; ( x) d \sigma ( x) . $$

In each term of expression (1) all factors $ a ( y _ {j} ) $, $ j = 1 \dots m $, stand to the right of all factors $ a ^ {*} ( x _ {i} ) $, $ i = 1 \dots n $, and the (possibly generalized) functions $ K _ {n,m} ( x _ {1} \dots x _ {n} ; y _ {1} \dots y _ {m} ) $ in the two sets of variables $ ( x _ {1} \dots x _ {n} ) \in M ^ {n} $, $ ( y _ {1} \dots y _ {m} ) \in M ^ {m} $, $ n , m = 0 , 1 \dots $ are, in the case of a symmetric (Boson) Fock space, symmetric in the variables of each set separately, and, in the case of an anti-symmetric (Fermion) Fock space, anti-symmetric in these variables.

For any bounded operator $ A $ the normal form exists and is unique.
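As a toy illustration of the ordering requirement, consider the simplest Boson situation with a single mode, that is, one annihilation operator $ a $ and one creation operator $ a ^ {*} $. The sketch below assumes the canonical commutation relation $ a a ^ {*} - a ^ {*} a = 1 $, which is not stated in the text above, and rewrites a finite product of these operators as a sum of terms $ ( a ^ {*} ) ^ {n} a ^ {m} $; the function name `normal_order` and the letters `'c'`, `'a'` used to encode words are hypothetical conventions.

```python
def normal_order(word):
    """Normally order a word over 'c' (creation a*) and 'a' (annihilation a).

    Returns a dict mapping (n, m) to the integer coefficient of (a*)^n a^m.
    Assumes the single-mode Boson relation a a* - a* a = 1.
    """
    terms = {(0, 0): 1}  # start from the identity operator
    for symbol in word:
        updated = {}
        for (n, m), coeff in terms.items():
            if symbol == "a":
                # (a*)^n a^m a is already normally ordered.
                key = (n, m + 1)
                updated[key] = updated.get(key, 0) + coeff
            elif symbol == "c":
                # a^m a* = a* a^m + m a^(m-1), which follows from a a* - a* a = 1.
                key = (n + 1, m)
                updated[key] = updated.get(key, 0) + coeff
                if m > 0:
                    key = (n, m - 1)
                    updated[key] = updated.get(key, 0) + m * coeff
            else:
                raise ValueError("word may contain only 'a' and 'c'")
        terms = updated
    return terms


# a a* = a* a + 1
print(normal_order("ac"))
# a a a* a* = (a*)^2 a^2 + 4 a* a + 2
print(normal_order("aacc"))
```

Each output term has all creation operators to the left of all annihilation operators, which is the single-mode analogue of the ordering in (1).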

The representation (1) can be rewritten in a form containing the annihilation and creation operators directly:

$$ \tag{2 } A = $$

$$ = \ \sum _ {m , n } \sum _ { \begin{array}{c} \{ i _ {1} \dots i _ {n} \} \\ \{ j _ {1} \dots j _ {m} \} \end{array} } c _ {i _ {1} \dots i _ {n} j _ {1} \dots j _ {m} } a ^ {*} ( f _ { i _ {1} } ) \dots a ^ {*} ( f _ {i _ {n} } ) a ( f _ {j _ {1} } ) \dots a ( f _ {j _ {m} } ) , $$

where $ \{ {f _ {i} } : {i = 1 , 2 ,\dots } \} $ is an orthonormal basis in $ L _ {2} ( M , \sigma ) $ and the summation in (2) is over all pairs of finite collections $ \{ f _ {i _ {1} } \dots f _ {i _ {n} } \} $, $ \{ f _ {j _ {1} } \dots f _ {j _ {m} } \} $ of elements of this basis.

In the case of an arbitrary (separable) Hilbert space $ H $ the normal form of an operator $ A $ acting on the Fock space $ \Gamma ( H) $ constructed over $ H $ is determined for a fixed basis $ \{ {f _ {i} } : {i = 1 , 2 ,\dots } \} $ in $ H $ by means of the expression (2), where $ a ( f ) $, $ a ^ {*} ( f ) $, $ f \in H $, are families of annihilation and creation operators acting on $ \Gamma ( H) $.

References

[1] F.A. Berezin, "The method of second quantization" , Acad. Press (1966) (Translated from Russian) (Revised (augmented) second edition: Kluwer, 1989) MR0208930 Zbl 0151.44001

R.A. Minlos

Comments

References

[a1] N.N. [N.N. Bogolyubov] Bogolubov, A.A. Logunov, I.T. Todorov, "Introduction to axiomatic quantum field theory" , Benjamin (1975) (Translated from Russian) MR0452276 MR0452277
[a2] G. Källen, "Quantum electrodynamics" , Springer (1972) MR0153346 MR0056465 MR0051156 MR0039581 Zbl 0116.45005 Zbl 0074.44202 Zbl 0050.43001 Zbl 0046.21402 Zbl 0041.57104
[a3] J. Glimm, A. Jaffe, "Quantum physics, a functional integral point of view" , Springer (1981) Zbl 0461.46051

Recursive functions

The normal form of a recursive function is a method for specifying an $ n $-place recursive function $ \phi $ in the form

$$ \tag{* } \phi ( x _ {1} \dots x _ {n} ) = \ g ( \mu z ( f ( x _ {1} \dots x _ {n} , z ) = 0 ) ) , $$

where $ f $ is an $ ( n + 1 ) $-place primitive recursive function, $ g $ is a $ 1 $-place primitive recursive function and $ \mu z ( f ( x _ {1} \dots x _ {n} , z ) = 0 ) $ is the result of applying the least-number operator to $ f $. Kleene's normal form theorem asserts that there is a primitive recursive function $ g $ such that every recursive function $ \phi $ can be represented in the form (*) with a suitable function $ f $ depending on $ \phi $; that is,

$$ ( \exists g ) ( \forall \phi ) ( \exists f ) ( \forall x _ {1} \dots x _ {n} ) : $$

$$ [ \phi ( x _ {1} \dots x _ {n} ) = g ( \mu z ( f ( x _ {1} \dots x _ {n} , z ) = 0 ) ) ] . $$

The normal form theorem is one of the most important results in the theory of recursive functions.
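The shape of the representation (*) can be illustrated by a small program. This is only a sketch of the form $ g ( \mu z ( f ( x , z ) = 0 ) ) $ for one particular function, the integer square root, with $ g $ chosen as the identity; it does not construct the universal $ g $ of Kleene's theorem, and the names `mu` and `normal_form` are hypothetical.

```python
def mu(h):
    """Least-number operator: the smallest z >= 0 with h(z) == 0 (may loop forever if none exists)."""
    z = 0
    while h(z) != 0:
        z += 1
    return z


def normal_form(f, g):
    """Build phi(x_1, ..., x_n) = g(mu z (f(x_1, ..., x_n, z) = 0))."""
    def phi(*xs):
        return g(mu(lambda z: f(*xs, z)))
    return phi


def f(x, z):
    """Primitive recursive test: 0 exactly when (z + 1)^2 exceeds x."""
    return 0 if (z + 1) * (z + 1) > x else 1


def g(z):
    """Outer function; here simply the identity."""
    return z


integer_sqrt = normal_form(f, g)
print([integer_sqrt(x) for x in (0, 10, 15, 16)])
```

The unbounded search in `mu` is what allows (*) to describe partial functions as well: if no zero of $ f $ exists, the computation does not terminate.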

A.A. Markov [2] obtained a characterization of those functions $ g $ that can be used in the normal form theorem for the representation (*). A function $ g $ can be used as the function whose existence is asserted in the normal form theorem if and only if the equation $ g ( x) = n $ has infinitely many solutions for each $ n $. Such functions are called functions of great range.
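A simple function with the great-range property is the exponent of $ 2 $ in $ x + 1 $: the equation $ g ( x) = n $ is solved by every $ x = 2 ^ {n} ( 2 k + 1 ) - 1 $, $ k = 0 , 1 ,\dots $. The sketch below only illustrates this property on finitely many values; the choice of $ g $ and the helper `solutions` are assumptions made for the example and are not taken from [2].

```python
def g(x):
    """Exponent of 2 in x + 1, for integers x >= 0."""
    y, exponent = x + 1, 0
    while y % 2 == 0:
        y //= 2
        exponent += 1
    return exponent


def solutions(n, count):
    """The first `count` solutions x of g(x) = n, namely x = 2^n (2k + 1) - 1."""
    return [2 ** n * (2 * k + 1) - 1 for k in range(count)]


for n in range(4):
    xs = solutions(n, 5)
    # Every listed x is mapped to n by g.
    print(n, xs, [g(x) for x in xs])
```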

References

[1] A.I. Mal'tsev, "Algorithms and recursive functions" , Wolters-Noordhoff (1970) (Translated from Russian) Zbl 0198.02501
[2] A.A. Markov, "On the representation of recursive functions" Izv. Akad. Nauk SSSR Ser. Mat. , 13 : 5 (1949) pp. 417–424 (In Russian) MR0031444

V.E. Plisko

Comments

References

[a1] S.C. Kleene, "Introduction to metamathematics" , North-Holland (1951) pp. 288 MR1234051 MR1570642 MR0051790 Zbl 0875.03002 Zbl 0604.03002 Zbl 0109.00509 Zbl 0047.00703

Normal form of a system of differential equations

A normal form of a system of differential equations

$$ \tag{1 } \dot{x} _ {i} = \phi _ {i} ( x _ {1} \dots x _ {n} ) ,\ \ i = 1 \dots n , $$

near an invariant manifold $ M $ is a formal system

$$ \tag{2 } \dot{y} _ {i} = \psi _ {i} ( y _ {1} \dots y _ {n} ) ,\ \ i = 1 \dots n , $$

that is obtained from (1) by an invertible formal change of coordinates

$$ \tag{3 } x _ {i} = \xi _ {i} ( y _ {1} \dots y _ {n} ) ,\ \ i = 1 \dots n , $$

in which the Taylor–Fourier series $ \psi _ {i} $ contain only resonance terms. In a particular case, normal forms occurred first in the dissertation of H. Poincaré (see [1]). By means of a normal form (2) some systems (1) can be integrated, and many can be investigated for stability and can be integrated approximately; for systems (1) a search has been made for periodic solutions and families of conditionally periodic solutions, and their bifurcation has been studied.

Normal forms in a neighbourhood of a fixed point.

Suppose that $ M $ contains a fixed point $ X \equiv ( x _ {1} \dots x _ {n} ) = 0 $ of the system (1) (that is, $ \phi _ {i} ( 0) = 0 $), that the $ \phi _ {i} $ are analytic at it and that $ \lambda _ {1} \dots \lambda _ {n} $ are the eigenvalues of the matrix $ \| \partial \phi _ {i} / \partial x _ {j} \| $ for $ X = 0 $. Let $ \Lambda \equiv ( \lambda _ {1} \dots \lambda _ {n} ) \neq 0 $. Then in a full neighbourhood of $ X = 0 $ the system (1) has the following normal form (2): the matrix $ \| \partial \psi _ {i} / \partial y _ {j} \| $ has for $ Y \equiv ( y _ {1} \dots y _ {n} ) = 0 $ a normal form (for example, the Jordan normal form) and the Taylor series

$$ \tag{4 } \psi _ {i} = y _ {i} \sum _ {Q \in N _ {i} } g _ {i Q } Y ^ {Q} ,\ \ i = 1 \dots n , $$

contain only resonance terms for which

$$ \tag{5 } ( Q , \Lambda ) \equiv \ q _ {1} \lambda _ {1} + \dots + q _ {n} \lambda _ {n} = 0 . $$

Here $ Q \equiv ( q _ {1} \dots q _ {n} ) $, $ Y ^ {Q} \equiv y _ {1} ^ {q _ {1} } \dots y _ {n} ^ {q _ {n} } $, $ N _ {i} = \{ {Q } : {\textrm{ integers } q _ {j} \geq 0, q _ {i} \geq - 1, q _ {1} + \dots + q _ {n} \geq 0 } \} $. If equation (5) has no solutions $ Q \neq 0 $ in $ N = N _ {1} \cup \dots \cup N _ {n} $, then the normal form (2) is linear:

$$ \dot{y} _ {i} = \lambda _ {i} y _ {i} ,\ \ i = 1 \dots n . $$
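The resonance condition (5) is easy to explore by brute force when the eigenvalues are integers. The following sketch enumerates the exponents $ Q \in N _ {i} $ with $ ( Q , \Lambda ) = 0 $ up to a chosen total degree; it reads the definition of $ N _ {i} $ as $ q _ {j} \geq 0 $ for $ j \neq i $ and $ q _ {i} \geq - 1 $, and the function name `resonant_exponents` and the cut-off `max_order` are assumptions made for the illustration.

```python
from itertools import product


def resonant_exponents(eigenvalues, i, max_order):
    """List all Q in N_i with (Q, Lambda) = 0 and 0 <= q_1 + ... + q_n <= max_order.

    `eigenvalues` must be exact numbers (e.g. integers); `i` is 1-based as in the text.
    """
    n = len(eigenvalues)
    # q_i may be -1, every other component is non-negative.
    ranges = [range(-1 if j == i - 1 else 0, max_order + 2) for j in range(n)]
    found = []
    for Q in product(*ranges):
        if 0 <= sum(Q) <= max_order and sum(q * lam for q, lam in zip(Q, eigenvalues)) == 0:
            found.append(Q)
    return found


# Lambda = (1, 2): besides Q = 0 there is the resonance Q = (2, -1) in N_2,
# i.e. the term y_2 * Y^Q = y_1^2 may appear in psi_2.
print(resonant_exponents((1, 2), 1, 3))
print(resonant_exponents((1, 2), 2, 3))

# Lambda = (2, 3): only Q = 0 occurs, so the normal form is linear.
print(resonant_exponents((2, 3), 1, 6), resonant_exponents((2, 3), 2, 6))
```

The solution $ Q = 0 $ is always present and corresponds to the linear term $ \lambda _ {i} y _ {i} $.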

Every system (1) with $ \Lambda \neq 0 $ can be reduced in a neighbourhood of a fixed point to its normal form (2) by some formal transformation (3), where the $ \xi _ {i} $ are (possibly divergent) power series, $ \xi _ {i} ( 0) = 0 $ and $ \mathop{\rm det} \| \partial \xi _ {i} / \partial y _ {j} \| \neq 0 $ for $ Y = 0 $.

Generally speaking, the normalizing transformation (3) and the normal form (2) (that is, the coefficients $ g _ {iQ} $ in (4)) are not uniquely determined by the original system (1). A normal form (2) preserves many properties of the system (1), such as being real, symmetric, Hamiltonian, etc. (see , [3]). If the original system contains small parameters, one can include them among the coordinates $ x _ {j} $, and then $ \dot{x} _ {j} = 0 $. Such coordinates do not change under a normalizing transformation (see [3]).

If $ k $ is the number of linearly independent solutions $ Q \in N $ of equation (5), then by means of a transformation

$$ y _ {i} = \ z _ {1} ^ {\alpha _ {i1} } \dots z _ {n} ^ {\alpha _ {in} } ,\ \ i = 1 \dots n , $$

where the $ \alpha _ {ij} $ are integers and $ \mathop{\rm det} \| \alpha _ {ij} \| = \pm 1 $, the normal form (2) is carried to a system

$$ \dot{z} _ {i} = z _ {i} f _ {i} ( z _ {1} \dots z _ {k} ) ,\ \ i = 1 \dots n $$

(see , [3]). The solution of this system reduces to a solution of the subsystem of the first $ k $ equations and to $ n - k $ quadratures. The subsystem has to be investigated in the neighbourhood of the multiple singular point $ z _ {1} = \dots = z _ {k} = 0 $, because the $ f _ {1} \dots f _ {k} $ do not contain linear terms. This can be done by a local method (see [3]).

The following problem has been examined (see ): Under what conditions on the normal form (2) does the normalizing transformation of an analytic system (1) converge (be analytic)? Let

$$ \omega _ {k} = \min | ( Q , \Lambda ) | $$

for those $ Q \in N $ for which

$$ ( Q , \Lambda ) \neq 0 ,\ \ q _ {1} + \dots + q _ {n} < 2 ^ {k} . $$

Condition $ \omega $: $ \sum _ {k=1} ^ \infty 2 ^ {-k} \mathop{\rm log} \omega _ {k} ^ {-1} < \infty $.

Condition $ \overline \omega \; $: $ \limsup 2 ^ {-k} \mathop{\rm log} \omega _ {k} ^ {-1} < \infty $ as $ k \rightarrow \infty $.

Condition $ \overline \omega \; $ is weaker than $ \omega $. Both are satisfied for almost all $ \Lambda $ (relative to Lebesgue measure) and are very weak arithmetic restrictions on $ \Lambda $.
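The small denominators $ \omega _ {k} $ can be tabulated for a concrete $ \Lambda $. The sketch below is a floating-point illustration only: it takes $ N $ to consist of integer vectors with all components $ \geq - 1 $, at most one component equal to $ - 1 $ and non-negative sum, treats values below a tolerance `tol` as resonant, and uses real eigenvalues; the names `omega` and `partial_sum` are hypothetical.

```python
import math
from itertools import product


def omega(eigenvalues, k, tol=1e-12):
    """min |(Q, Lambda)| over Q in N with (Q, Lambda) != 0 and q_1 + ... + q_n < 2^k."""
    n = len(eigenvalues)
    bound = 2 ** k
    best = math.inf
    for Q in product(range(-1, bound + 1), repeat=n):
        if sum(1 for q in Q if q == -1) > 1:
            continue  # Q must lie in some N_i: at most one component equals -1
        if not (0 <= sum(Q) < bound):
            continue
        value = abs(sum(q * lam for q, lam in zip(Q, eigenvalues)))
        if tol < value < best:  # skip (numerically) resonant Q
            best = value
    return best


def partial_sum(eigenvalues, K):
    """Partial sum of 2^(-k) * log(1 / omega_k) for k = 1..K, as in condition omega."""
    return sum(2.0 ** -k * math.log(1.0 / omega(eigenvalues, k)) for k in range(1, K + 1))


lam = (1.0, -math.sqrt(2.0))
for k in range(1, 5):
    print(k, omega(lam, k))
print(partial_sum(lam, 4))
```

Since the set of admissible $ Q $ grows with $ k $, the values $ \omega _ {k} $ can only decrease; a finite table of this kind cannot, of course, decide whether condition $ \omega $ holds.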

In case $ \mathop{\rm Re} \Lambda = 0 $ there is also condition $ A $ (for the general case, see in ): There exists a power series $ a ( Y) $ such that in (4), $ \psi _ {i} = \lambda _ {i} y _ {i} a $, $ i = 1 \dots n $.

If for an analytic system (1) $ \Lambda $ satisfies condition $ \omega $ and the normal form (2) satisfies condition $ A $, then there exists an analytic transformation of (1) to a certain normal form. If (2) is obtained from an analytic system and fails to satisfy either condition $ \overline \omega \; $ or condition $ A $, then there exists an analytic system (1) that has (2) as its normal form, and every transformation to a normal form diverges (is not analytic).

Thus, the problem raised above is solved for all normal forms except those for which $ \Lambda $ satisfies condition $ \overline \omega \; $, but not $ \omega $, while the remaining coefficients of the normal form satisfy condition $ A $. The latter is a very rigid restriction on the coefficients of a normal form, and for large $ n $ it holds, generally speaking, only in degenerate cases. That is, the basic reason for divergence of a transformation to normal form is not small denominators, but degeneracy of the normal form.

But even in cases of divergence of the normalizing transformation (3) with respect to (2), one can study properties of the solutions of the system (1). For example, a real system (1) has a smooth transformation to the normal form (2) even when it is not analytic. The majority of results on smooth normalization have been obtained under the condition that all $ \mathop{\rm Re} \lambda _ {j} \neq 0 $. Under this condition, with the help of a change $ X \rightarrow V $ of finite smoothness class, a system (1) can be brought to a truncated normal form

$$ \tag{6 } \dot{v} _ {i} = \widetilde \psi _ {i} ( V) ,\ \ i = 1 \dots n , $$

where the $ \widetilde \psi _ {i} $ are polynomials of degree $ m $ (see [4]–). If in the normalizing transformation (3) all terms of degree higher than $ m $ are discarded, the result is a transformation

$$ \tag{7 } x _ {i} = \widetilde \xi _ {i} ( U) ,\ \ i = 1 \dots n $$

(the $ \widetilde \xi _ {i} $ are polynomials), that takes (1) to the form

$$ \tag{8 } \dot{u} _ {i} = \widetilde \psi _ {i} ( U) + \widetilde \phi _ {i} ( U) ,\ \ i = 1 \dots n , $$

where the $ \widetilde \psi _ {i} $ are polynomials containing only resonance terms and the $ \widetilde \phi _ {i} $ are convergent power series containing only terms of degree higher than $ m $. Solutions of the truncated normal form (6) are approximations for solutions of (8) and, after the transformation (7), give approximations of solutions of the original system (1). In many cases one succeeds in constructing for (6) a Lyapunov function (or Chetaev function) $ f ( V) $ such that

$$ | f ( V) | \leq c _ {1} | V | ^ \gamma \ \ \textrm{ and } \ \ \left | \sum_{j=1}^ { n } \frac{\partial f }{\partial v _ {j} } \widetilde \phi _ {j} \right | > c _ {2} | V | ^ {\gamma + m } , $$

where $ c _ {1} $ and $ c _ {2} $ are positive constants. Then $ f ( U) $ is a Lyapunov (Chetaev) function for the system (8); that is, the point $ X = 0 $ is stable (unstable). For example, if all $ \mathop{\rm Re} \lambda _ {i} < 0 $, one can take $ m = 1 $, $ f = \sum _ {i=1} ^ {n} v _ {i} ^ {2} $ and obtain Lyapunov's theorem on stability under linear approximation (see [7]; for other examples see the survey [8]).

From the normal form (2) one can find invariant analytic sets of the system (1). In what follows it is assumed for simplicity of exposition that $ \mathop{\rm Re} \Lambda = 0 $. From the normal form (2) one extracts the formal set

$$ {\mathcal A} = \{ {Y } : {\psi _ {i} = \lambda _ {i} y _ {i} a ,\ i = 1 \dots n } \} , $$

where $ a $ is a free parameter. Condition $ A $ is satisfied on the set $ {\mathcal A} $. Let $ K $ be the union of subspaces of the form $ \{ {Y } : {y _ {i} = 0, i = i _ {1} \dots i _ {l} } \} $ such that the corresponding eigenvalues $ \lambda _ {j} $, $ j \neq i _ {1} \dots i _ {l} $, $ 1 \leq j \leq n $, are pairwise commensurable. The formal set $ \widetilde{\mathcal A} = {\mathcal A} \cap K $ is analytic in the system (1). From $ {\mathcal A} $ one selects the subset $ {\mathcal B} $ that is analytic in (1) if condition $ \omega $ holds (see [3]). On the sets $ \widetilde{\mathcal A} $ and $ {\mathcal B} $ lie periodic solutions and families of conditionally-periodic solutions of (1). By considering the sets $ \widetilde{\mathcal A} $ and $ {\mathcal B} $ in systems with small parameters, one can study all analytic perturbations and bifurcations of such solutions (see [9]).

Generalizations.

If a system (1) does not lead to a normal form (2) but to a system whose right-hand sides contain certain non-resonance terms, then the resulting simplification is less substantial, but can improve the quality of the transformation. Thus, the reduction to a "semi-normal form" is analytic under a weakened condition $ A $ (see ). Another version is a transformation that normalizes a system (1) only on certain submanifolds (for example, on certain coordinate subspaces; see ). A combination of these approaches makes it possible to prove for (1) the existence of invariant submanifolds and of solutions of specific form (see [9]).

Suppose that a system (1) is defined and analytic in a neighbourhood of an invariant manifold $ M $ of dimension $ k + l $ that is fibred into $ l $-dimensional invariant tori. Then close to $ M $ one can introduce local coordinates

$$ S = ( s _ {1} \dots s _ {k} ) ,\ \ Y = ( y _ {1} \dots y _ {l} ) ,\ \ Z = ( z _ {1} \dots z _ {m} ) , $$

$$ k + l + m = n , $$

such that $ Z = 0 $ on $ M $, $ y _ {j} $ is of period $ 2 \pi $, $ S $ ranges over a certain domain $ H $, and (1) takes the form

$$ \tag{9 } \left . \begin{array}{c} \dot{S} = \Phi ^ {( 1)} ( S , Y , Z ) , \\ \dot{Y} = \Omega ( S , Y ) + \Phi ^ {( 2)} ( S , Y , Z ) , \\ \dot{Z} = A ( S , Y ) Z + \Phi ^ {( 3)} ( S , Y , Z ) , \end{array} \right \} $$

where $ \Phi ^ {( j)} = O ( | Z | ) $, $ j = 1 , 2 $, $ \Phi ^ {( 3)} = O ( | Z | ^ {2} ) $ and $ A $ is a matrix. If $ \Omega = \textrm{ const } $ and $ A $ is triangular with constant main diagonal $ \Lambda \equiv ( \lambda _ {1} \dots \lambda _ {m} ) $, then (under a weak restriction on the small denominators) there is a formal transformation of the local coordinates $ S , Y , Z \rightarrow U , V , W $ that takes the system (9) to the normal form

$$ \tag{10 } \left . \begin{array}{c} \dot{U} = \sum \Psi _ {PQ} ^ {( 1)} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) , \\ \dot{V} = \sum \Psi _ {PQ} ^ {( 2)} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) , \\ \dot{w} _ {j} = w _ {j} \sum g _ {jPQ} ( U) W ^ {Q} \mathop{\rm exp} i ( P , V ) ,\ \ j = 1 \dots m , \end{array} \right \} $$

where $ P \in \mathbf Z ^ {l} $, $ Q \in \mathbf N ^ {m} $, $ U \in H $, and $ i ( P , \Omega ) + ( Q , \Lambda ) = 0 $.

If among the coordinates $ Z $ there is a small parameter, (9) can be averaged by the Krylov–Bogolyubov method of averaging (see [10]), and the averaged system is a normal form. More generally, perturbation theory can be regarded as a special case of the theory of normal forms, when one of the coordinates is a small parameter (see [11]).

Theorems on the convergence of a normalizing change, on the existence of analytic invariant sets, etc., carry over to the systems (9) and (10). Here the best studied case is when $ M $ is a periodic solution, that is, $ k = 0 $, $ l = 1 $. In this case the theory of normal forms is in many respects identical with the case when $ M $ is a fixed point. Poincaré suggested that one should consider a pointwise mapping of a normal section across the periods. In this context arose a theory of normal forms of pointwise mappings, which is parallel to the corresponding theory for systems (1). For other generalizations of normal forms see [3], , [12]–[14].

References

[1] H. Poincaré, "Thèse, 1928" , Oeuvres , 1 , Gauthier-Villars (1951) pp. IL-CXXXII
[2a] A.D. [A.D. Bryuno] Bruno, "Analytical form of differential equations" Trans. Moscow Math. Soc. , 25 (1971) pp. 131–288 Trudy Moskov. Mat. Obshch. , 25 (1971) pp. 119–262
[2b] A.D. [A.D. Bryuno] Bruno, "Analytical form of differential equations" Trans. Moscow Math. Soc. (1972) pp. 199–239 Trudy Moskov. Mat. Obshch. , 26 (1972) pp. 199–239
[3] A.D. Bryuno, "Local methods in nonlinear differential equations" , 1 , Springer (1989) (Translated from Russian) MR0993771
[4] P. Hartman, "Ordinary differential equations" , Birkhäuser (1982) MR0658490 Zbl 0476.34002
[5a] V.S. Samovol, "Linearization of a system of differential equations in the neighbourhood of a singular point" Soviet Math. Dokl. , 13 (1972) pp. 1255–1259 Dokl. Akad. Nauk SSSR , 206 (1972) pp. 545–548 Zbl 0667.34041
[5b] V.S. Samovol, "Equivalence of systems of differential equations in the neighbourhood of a singular point" Trans. Moscow Math. Soc. (2) , 44 (1982) pp. 217–237 Trudy Moskov. Mat. Obshch. , 44 (1982) pp. 213–234
[6a] G.R. Belitskii, "Equivalence and normal forms of germs of smooth mappings" Russian Math. Surveys , 33 : 1 (1978) pp. 95–155 Uspekhi Mat. Nauk. , 33 : 1 (1978) MR0490708
[6b] G.R. Belitskii, "Normal forms relative to a filtering action of a group" Trans. Moscow Math. Soc. , 40 (1979) pp. 3–46 Trudy Moskov. Mat. Obshch. , 40 (1979) pp. 3–46
[6c] G.R. Belitskii, "Smooth equivalence of germs of vector fields with a single zero eigenvalue or a pair of purely imaginary eigenvalues" Funct. Anal. Appl. , 20 : 4 (1986) pp. 253–259 Funkts. Anal. i Prilozen. , 20 : 4 (1986) pp. 1–8
[7] A.M. [A.M. Lyapunov] Liapunoff, "Problème général de la stabilité du mouvement" , Princeton Univ. Press (1947) (Translated from Russian)
[8] A.L. Kunitsyn, A.P. Markeev, "Stability in resonant cases" Itogi Nauk. i Tekhn. Ser. Obsh. Mekh. , 4 (1979) pp. 58–139 (In Russian)
[9] J.N. Bibikov, "Local theory of nonlinear analytic ordinary differential equations" , Springer (1979) MR0547669 Zbl 0404.34005
[10] N.N. Bogolyubov, Yu.A. Mitropol'skii, "Asymptotic methods in the theory of non-linear oscillations" , Hindustan Publ. Comp. , Delhi (1961) (Translated from Russian) MR0100379 Zbl 0151.12201
[11] A.D. [A.D. Bryuno] Bruno, "Normal form in perturbation theory" , Proc. VIII Internat. Conf. Nonlinear Oscillations, Prague, 1978 , 1 , Academia (1979) pp. 177–182 (In Russian)
[12] V.V. Kostin, Le Dinh Thuy, "Some tests of the convergence of a normalizing transformation" Dopovidi Akad. Nauk URSR Ser. A : 11 (1975) pp. 982–985 (In Russian) MR407356
[13] E.J. Zehnder, "C.L. Siegel's linearization theorem in infinite dimensions" Manuscr. Math. , 23 (1978) pp. 363–371 MR0501144 Zbl 0374.47037
[14] N.V. Nikolenko, "The method of Poincaré normal forms in problems of integrability of equations of evolution type" Russian Math. Surveys , 41 : 5 (1986) pp. 63–114 Uspekhi Mat. Nauk , 41 : 5 (1986) pp. 109–152 MR0878327 Zbl 0632.35026

A.D. Bryuno

Comments

For more on various linearization theorems for ordinary differential equations and canonical form theorems for ordinary differential equations, as well as generalizations to the case of non-linear representations of nilpotent Lie algebras, cf. also Poincaré–Dulac theorem and Analytic theory of differential equations, and [a1].

References

[a1] V.I. Arnol'd, "Geometrical methods in the theory of ordinary differential equations" , Springer (1983) (Translated from Russian)
How to Cite This Entry:
Normal form (for matrices). Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Normal_form_(for_matrices)&oldid=23915
This article was adapted from an original article by V.L. Popov (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article