Namespaces
Variants
Actions

Difference between revisions of "Code with correction of deletions and insertions"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Importing text file)
 
m (tex done)
 
Line 1: Line 1:
 +
{{TEX|done}}
 +
 
''code for correction of deletions and insertions''
 
''code for correction of deletions and insertions''
  
A code intended for the correction of errors of two types encountered in the transmission of information. By a deletion of a letter in a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228601.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228602.png" /> over some alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228603.png" /> one means a transformation of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228604.png" /> into a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228605.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228606.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228607.png" />. The numbers <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228608.png" /> of words obtainable from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c0228609.png" /> by deletion of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286010.png" /> letters satisfies the following estimates
+
A code intended for the correction of errors of two types encountered in the transmission of information. By a deletion of a letter in a word $  \beta = b _ {1} \dots b _ {n} $
 +
of length $  n $
 +
over some alphabet $  B $
 +
one means a transformation of $  \beta $
 +
into a word $  \beta  ^  \prime  = b _ {1} \dots b _ {i-1} b _ {i+1} \dots b _ {n} $
 +
of length  $  n - 1 $,
 +
$  1 \leq  i \leq  n $.  
 +
The numbers  $  N _ {s} ( \beta ) $
 +
of words obtainable from  $  \beta $
 +
by deletion of  $  s $
 +
letters satisfies the following estimates
 +
 
 +
$$
 +
\left ( \begin{array}{c}
 +
\tau ( \beta ) - s + 1 \\
 +
s
 +
\end{array}
 +
\right ) \  \leq  \  N _ {s} ( \beta ) \  \leq  \  \left ( \begin{array}{c}
 +
\tau ( \beta ) + s - 1 \\
 +
s
 +
\end{array}
 +
\right ) .
 +
$$
 +
 
 +
Here  $  \tau ( \beta ) $
 +
is the number of series of $  \beta $(
 +
by a series of a word  $  \beta = b _ {1} \dots b _ {n} $
 +
one means a word  $  b _ {i+1} \dots b _ {j} $,
 +
$  j \geq  i + 1 $,
 +
such that 1)  $  b _ {i+1} = \dots = b _ {j} $;
 +
2) if  $  i \geq  1 $,
 +
then  $  b _ {i} \neq b _ {i+1} $;
 +
and 3) if  $  j < n $,
 +
then  $  b _ {j+1} \neq b _ {j} $).  
 +
In particular, $  N _ {1} ( \beta ) = \tau ( \beta ) $.  
 +
By an insertion of a letter in a word  $  \beta = b _ {1} \dots b _ {n} $
 +
one means a transformation of  $  \beta $
 +
into a word  $  \beta  ^  \prime  = b _ {1} \dots b _ {i} b b _ {i+1} \dots b _ {n} $
 +
of length  $  n + 1 $,
 +
where  $  b \in B $
 +
and  $  0 \leq  i \leq  n $.  
 +
The number of words obtainable from an arbitrary word  $  \beta $
 +
of length  $  n $
 +
by the insertion of $  s $
 +
letters of an alphabet  $  B $
 +
is equal to
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286011.png" /></td> </tr></table>
+
$$
 +
\sum _ { j=0 } ^ { s }
 +
\left ( \begin{array}{c}
 +
n + s \\
 +
j
 +
\end{array}
 +
\right )
 +
( r - 1 )  ^ {j} ,
 +
$$
  
Here <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286012.png" /> is the number of series of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286013.png" /> (by a series of a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286014.png" /> one means a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286015.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286016.png" />, such that 1) <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286017.png" />; 2) if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286018.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286019.png" />; and 3) if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286020.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286021.png" />). In particular, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286022.png" />. By an insertion of a letter in a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286023.png" /> one means a transformation of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286024.png" /> into a word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286025.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286026.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286027.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286028.png" />. The number of words obtainable from an arbitrary word <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286029.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286030.png" /> by the insertion of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286031.png" /> letters of an alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286032.png" /> is equal to
+
where  $  r $
 +
is the number of letters in  $  B $.
 +
A set  $  K $
 +
of words over the alphabet  $  B $
 +
is called a code for correction of $  s $
 +
deletions (insertions, deletions or insertions) if no word in  $  B $
 +
can be obtained from two distinct words of  $  K $
 +
as the result of  $  s $
 +
or fewer deletions (insertions, deletions or insertions) of letters in each of them. The function defined on the pairs  $  ( \beta _ {1} ,\  \beta _ {2} ) $
 +
of words in  $  B $
 +
and equal to the minimum number of deletions and insertions of letters converting  $  \beta _ {1} $
 +
into  $  \beta _ {2} $
 +
is a metric. A set  $  K $
 +
of words over the alphabet  $  B $
 +
is a code for correction of  $  s $
 +
deletions (insertions, deletions or insertions) if and only if the distance between any two distinct words of  $  K $
 +
is greater than  $  2 s $,  
 +
so that the above three definitions of a code are equivalent. An example of a code for correction of one deletion or one insertion is the set of words  $  \{ \beta = b _ {1} \dots b _ {n} \} $
 +
of length $  n $
 +
over the alphabet  $  B = \{ 0 ,\  1 \} $
 +
for which  $  \sum _ {i=1}  ^ {n} i b _ {i} \equiv 0 $(
 +
$  \mathop{\rm mod} \  n + 1 $).  
 +
The number of words in this code is equal to
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286033.png" /></td> </tr></table>
+
$$
  
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286034.png" /> is the number of letters in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286035.png" />. A set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286036.png" /> of words over the alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286037.png" /> is called a code for correction of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286038.png" /> deletions (insertions, deletions or insertions) if no word in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286039.png" /> can be obtained from two distinct words of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286040.png" /> as the result of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286041.png" /> or fewer deletions (insertions, deletions or insertions) of letters in each of them. The function defined on the pairs <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286042.png" /> of words in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286043.png" /> and equal to the minimum number of deletions and insertions of letters converting <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286044.png" /> into <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286045.png" /> is a metric. A set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286046.png" /> of words over the alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286047.png" /> is a code for correction of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286048.png" /> deletions (insertions, deletions or insertions) if and only if the distance between any two distinct words of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286049.png" /> is greater than <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286050.png" />, so that the above three definitions of a code are equivalent. An example of a code for correction of one deletion or one insertion is the set of words <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286051.png" /> of length <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286052.png" /> over the alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286053.png" /> for which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286054.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286055.png" />). The number of words in this code is equal to
+
\frac{1}{2 ( n + 1 )}
  
<table class="eq" style="width:100%;"> <tr><td valign="top" style="width:94%;text-align:center;"><img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286056.png" /></td> </tr></table>
+
\sum \phi (d) 2 ^ {( n + 1 ) / d} ,
 +
$$
  
where the sum is taken over all odd divisors <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286057.png" /> of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286058.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286059.png" /> is the Euler function; it is asymptotically maximal as <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c022/c022860/c02286060.png" />.
+
where the sum is taken over all odd divisors $  d $
 +
of $  n + 1 $
 +
and $  \phi (d) $
 +
is the Euler function; it is asymptotically maximal as $  n \rightarrow \infty $.

Latest revision as of 13:23, 8 February 2020


code for correction of deletions and insertions

A code intended for the correction of errors of two types encountered in the transmission of information. By a deletion of a letter in a word $ \beta = b _ {1} \dots b _ {n} $ of length $ n $ over some alphabet $ B $ one means a transformation of $ \beta $ into a word $ \beta ^ \prime = b _ {1} \dots b _ {i-1} b _ {i+1} \dots b _ {n} $ of length $ n - 1 $, $ 1 \leq i \leq n $. The numbers $ N _ {s} ( \beta ) $ of words obtainable from $ \beta $ by deletion of $ s $ letters satisfies the following estimates

$$ \left ( \begin{array}{c} \tau ( \beta ) - s + 1 \\ s \end{array} \right ) \ \leq \ N _ {s} ( \beta ) \ \leq \ \left ( \begin{array}{c} \tau ( \beta ) + s - 1 \\ s \end{array} \right ) . $$

Here $ \tau ( \beta ) $ is the number of series of $ \beta $( by a series of a word $ \beta = b _ {1} \dots b _ {n} $ one means a word $ b _ {i+1} \dots b _ {j} $, $ j \geq i + 1 $, such that 1) $ b _ {i+1} = \dots = b _ {j} $; 2) if $ i \geq 1 $, then $ b _ {i} \neq b _ {i+1} $; and 3) if $ j < n $, then $ b _ {j+1} \neq b _ {j} $). In particular, $ N _ {1} ( \beta ) = \tau ( \beta ) $. By an insertion of a letter in a word $ \beta = b _ {1} \dots b _ {n} $ one means a transformation of $ \beta $ into a word $ \beta ^ \prime = b _ {1} \dots b _ {i} b b _ {i+1} \dots b _ {n} $ of length $ n + 1 $, where $ b \in B $ and $ 0 \leq i \leq n $. The number of words obtainable from an arbitrary word $ \beta $ of length $ n $ by the insertion of $ s $ letters of an alphabet $ B $ is equal to

$$ \sum _ { j=0 } ^ { s } \left ( \begin{array}{c} n + s \\ j \end{array} \right ) ( r - 1 ) ^ {j} , $$

where $ r $ is the number of letters in $ B $. A set $ K $ of words over the alphabet $ B $ is called a code for correction of $ s $ deletions (insertions, deletions or insertions) if no word in $ B $ can be obtained from two distinct words of $ K $ as the result of $ s $ or fewer deletions (insertions, deletions or insertions) of letters in each of them. The function defined on the pairs $ ( \beta _ {1} ,\ \beta _ {2} ) $ of words in $ B $ and equal to the minimum number of deletions and insertions of letters converting $ \beta _ {1} $ into $ \beta _ {2} $ is a metric. A set $ K $ of words over the alphabet $ B $ is a code for correction of $ s $ deletions (insertions, deletions or insertions) if and only if the distance between any two distinct words of $ K $ is greater than $ 2 s $, so that the above three definitions of a code are equivalent. An example of a code for correction of one deletion or one insertion is the set of words $ \{ \beta = b _ {1} \dots b _ {n} \} $ of length $ n $ over the alphabet $ B = \{ 0 ,\ 1 \} $ for which $ \sum _ {i=1} ^ {n} i b _ {i} \equiv 0 $( $ \mathop{\rm mod} \ n + 1 $). The number of words in this code is equal to

$$ \frac{1}{2 ( n + 1 )} \sum \phi (d) 2 ^ {( n + 1 ) / d} , $$

where the sum is taken over all odd divisors $ d $ of $ n + 1 $ and $ \phi (d) $ is the Euler function; it is asymptotically maximal as $ n \rightarrow \infty $.

How to Cite This Entry:
Code with correction of deletions and insertions. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Code_with_correction_of_deletions_and_insertions&oldid=44396
This article was adapted from an original article by V.I. Levenshtein (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article