Namespaces
Variants
Actions

Difference between revisions of "User:Joachim Draeger/sandbox"

From Encyclopedia of Mathematics
Jump to: navigation, search
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
  
 +
{{MSC|68Q05}}
  
 +
{{TEX|done}}
  
{{MSC|68P05}}
+
A probabilistic Turing machine (PTM) is a [[Turing machine]] (TM) modified for executing a randomized [[Computable function|computation]]. From the computability point of view, a PTM is equivalent to a TM. In other respects, however, the behavior of a PTM is profoundly different from the deterministic computation of a TM; false results, for example, can only be excluded statistically in this model. The physical realization of a true random number generation is possible by performing a measurement process in quantum theory.
  
{{TEX|done}}
+
Some applications of computer science can be better modeled by a PTM than by a classical TM. An example are environments with strong radiation like space missions crossing the radiation belt of Jupiter or robots for handling accidents in a nuclear plant.  But even a usual calculation involving a very large number of single operations (e.g. calculations of $\pi$ with record precision) may be potentially influenced by cosmic rays making the calculation probabilistic.
  
 +
===Definition of a Probabilistic Turing Machine===
  
 +
A PTM $(Q,\Sigma,\Gamma,\sqcup,q_0,q_f,\delta)$ has the same components as a TM. The set $Q$ is a finite set of states, $\Sigma$ is a finite input/output alphabet, $\Gamma$ is a finite tape alphabet with $\Sigma\subseteq\Gamma$, $\sqcup\in \Gamma$ is a blank symbol with $\sqcup \notin \Sigma$, the state $q_0 \in Q$ is a start state, and $q_f \in Q$ is a stop state. The transition function $\delta$, however, does not define deterministic transitions as in the case of a Turing machine, but gives a probability distribution of possible transitions according to $ \delta: Q \times \Sigma \times Q \times \Sigma \times \{L,R\} \longrightarrow [0,1]$.
  
 +
For probabilistic Turing machines, the set $C$ of <i>configurations</i> is defined in the same way as for Turing machines. It is also called the set of <i>basic states</i>. The set $\Omega$ of <i>states</i> is the set of possible probability distributions on the basic states, i.e. 
 +
$$\Omega=\left\{(p_c)_{c\in C}\in [0,1]^C \,\,\,\left| \,\,\,\sum\limits_{c\in C} p_c=1\right.\right\}.$$
 +
The set of states serves as memory for the computation history. Since the run of the computation is probabilistic, the definition of a state must be probabilistic as well. Thus the distinction between basic states and states.
  
The name attached to abstract computers (cf. [[Computer,  abstract|Computer, abstract]]) of a specific type. The concept of a machine of such a kind originated in the middle of the 1930's from A.M.  Turing
+
The transition function $\delta$ can be considered as [[Stochastic matrix|stochastic matrix]] $M_{ji}$ defined on the space $C$ of configurations with $ M_{ji} = \mathrm{Prob}[\delta\colon c_i \mapsto c_j] \in [0,1]$As a stochastic matrix, the $L_1$-norm of each column of $M_{ji}$ is equal to 1, i.e. $\sum_i M_{ji} = 1$. $L_1$-norms are preserved by $M$ according to $L_1(M\cdot c) = L_1(c) = \sum_{i} c_i$ for a configuration $c\in C$Not every stochastic matrix provides the transition function $\delta$ of a PTM, however, because such a $\delta$ must fulfill additionally a locality constraint. A Turing machine changes only a single symbol in each step and moves its head to a new position in its immediate neighborhood.
  
as the result of an analysis carried out by him  of the actions of a human being carrying out some or other calculations  in accordance with a plan worked out in advance, that is, carrying out  successive transformations of complexes of symbols. This analysis, in  turn, was carried out by him with the aim of solving the then urgent  problem of finding a precise mathematical equivalent for the general  intuitive idea of an [[Algorithm|algorithm]]. In the course of  development of the theory of algorithms (cf. [[Algorithms, theory  of|Algorithms, theory of]]), there emerged a number of modifications of  the original definition of Turing. The version given here goes back to  E. Post [[#References|[2]]]; in this form the definition of a Turing machine has achieved widespread popularity (the Turing machine has been  described in detail, for example, in [[#References|[3]]] and  [[#References|[4]]]).
+
Some alternative definitions of probabilistic Turing machines can be shown to be equivalent to the definition given here.
 +
* A probabilistic Turing machine can also be understood as a Turing machine $(Q,\Sigma,\Gamma,\sqcup,q_0,q_f,\delta_0,\delta_1)$ having two transition functions $\delta_0$ and $\delta_1$. Which one of these two functions has to be applied in the next transition step is chosen randomly with probability $1/2$ each. This can be understood as a random number generator executing a coin toss for the binary decision between two possible continuations.
 +
* In a slight variation of the above approach, a probabilsitic Turing machine is a deterministic Turing machine with an additional tape (usually considered as read-only and its head moving only to the right) containing binary random numbers. Though $\delta$ is a deterministic transition function, the additional tape introduces a random decision for each step.
  
A Turing machine is conveniently  represented as an automatically-functioning system capable of being in a  finite number of internal states and endowed with an infinite external  memory, called a tape. Two of the states are distinguished, the initial  state and the final state. The tape is divided into cells and is  unbounded to the left and to the right. Any letter of some finite  [[Alphabet|alphabet]] <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944601.png" /> can be printed on  each cell of the tape (for the sake of uniformity, it is convenient to  regard an empty cell as being printed with a  "blank" ). At each moment  of discrete time the Turing machine is in one of its states, and by  scanning one of the cells of its tape it perceives the symbol written  there (a letter of the alphabet <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944602.png" /> or the blank).
+
===Complexity Theory of Probabilistic Turing Machines===
  
If  the Turing machine is in a non-final state at some moment of time, it  completes a step, which is completely determined by its current state  and the symbol that is perceived on the tape at this moment. A step  consists of the following: 1) print a new symbol in the scanned cell, which may be the same as the old symbol or a blank; 2) go to a new  state, which may be the same as the old one or the final state; and 3)  move the tape to the left or to the right by one cell, or keep it in the  same place. The list of all possible steps of the Turing machine in  dependence on the current combination of  "non-final state + symbol  perceived"  can be represented, for example, by a special table with two  inputs, called the program, or scheme, of the given Turing machine. The  codes of the corresponding steps of the machine, called its commands,  are placed in the cells of this table. The program of the Turing machine  is an object with a given structure, and one can stipulate that the  Turing machine be identified with its program. If one wants to emphasize  the connection of such a Turing machine with the alphabet <img  align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944603.png" />, then one usually  says that this machine is a Turing machine in the alphabet <img  align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944604.png" />.
+
For a TM, the sequence of computation steps is uniquely determined. Such a machine accepts an input $x\in\Sigma^\ast$, if the terminating state of the computation is an accepting state. For a nondeterministic Turing machine, the input $x$ is accepted if it exists a computation sequence starting with $x$ and terminating in an accepting state. For probabilistic Turing machines, such a computation sequence exists in each case, even though its probability may be zero. Thus for defining acceptance, the probability of computation sequences is taken into consideration. This leads to the following definition.
  
The  complete description of the current state of a Turing machine is given  by its configuration, consisting of the following information at the  given moment: 1) the actual symbols filling the cells of the tape; 2) the cell currently being scanned by the machine; and 3) the internal  state of the machine. A configuration corresponding to the final state of the Turing machine is also called final.
+
For $T\colon \mathbb{N} \longrightarrow \mathbb{N}$, a PTM $M$ [[Decidable predicate|decides]] a language $L\subseteq \Sigma^\ast$ in time $T(n)$ if
 +
* For each $x\in \Sigma^\ast$ and each possible computation sequence resulting from input $x$, $M$ terminates after at most $T(|x|)$ computation steps.
 +
* $\forall x \in L \colon \mathrm{Prob}[M(x)=1 ] \ge 2/3$
 +
* $\forall x \notin L \colon \mathrm{Prob}[M(x)=0 ] \ge 2/3$
 +
In this definition, $M(x)$ designates the result of the processing of input $x$ by $M$. The expression $M(x)=1 $ indicates a termination in an accepting state, whereas $M(x)=0$ indicates a termination in a nonaccepting state. $\mathrm{Prob}[M(x)=1 ]$ denotes the fraction of computations leading to $M(x)=1$The class of languages decided by PTMs in $O(T(n))$ computation steps is designated as $\mathrm{BPTIME}(T(n))$.
  
If some non-final configuration of the Turing machine is fixed as the initial  configuration, then the functioning of this machine will consist of a (step by step) sequential transformation starting with the initial  configuration in accordance with the machine's program until the time of attaining a final configuration. After this, the functioning of the  Turing machine is considered ended and the final configuration attained  is regarded as the result of the functioning of the machine. Of course, the functioning of the Turing machine does not, in general, terminate  for every initial configuration.
+
Based on $\mathrm{BPTIME}(T(n))$, the [[Complexity theory|complexity class]] $\mathrm{BPP}$ (an abbreviation of bounded-error, probabilistic, polynomial-time) is formally defined as  
 +
$$\mathrm{BPP}:=\bigcup\limits_{c\in\mathbb{R},c>0} \mathrm{BPTIME}(|x|^c).$$
 +
This means it holds $L\in \mathrm{BPP}$ if a polynomial-time PTM $M$ exists with
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3. \end{align*}
 +
Since the transition function $\delta$ can be chosen in such a way that a specific continuation is preferred with a probability of $1$, a deterministic TM is a special case of a PTM. Thus it holds $\mathrm{P}\subseteq \mathrm{BPP}$. Up to know (2013) it is unknown, whether it holds $\mathrm{BPP} = \mathrm{P}$ or not.
  
The notion of a Turing  machine can be used for making precise the general idea of an algorithm  in a given alphabet, as follows. By a Turing algorithm in an alphabet  <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944605.png" /> is meant any  algorithm <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944606.png" /> of the following  kind. One takes a fixed Turing machine <img align="absmiddle"  border="0" src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944607.png" /> in the  alphabet <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944608.png" />. Let <img  align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t0944609.png" /> be the word in  <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446010.png" /> taken as the initial data for the algorithm <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446011.png" />. The following  initial configuration of the machine <img align="absmiddle"  border="0" src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446012.png" /> is  constructed: 1) the word <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446013.png" /> is written on the  tape without gaps, the remaining cells being left empty (i.e. blank);  2) the machine <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446014.png" /> is set up to scan  the cell with the first letter of the word <img align="absmiddle"  border="0" src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446015.png" />; and  3) <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446016.png" /> is put into the initial state (if <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446017.png" /> is empty, then  the tape is chosen to be empty, and the scanned cell is any cell).  Suppose that <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446018.png" />, starting from  this initial configuration, completes its functi
+
The complexity class $\mathrm{BPP}$ defines the polynomial-time complexity for a PTM $M$ based on a two-sided error, i.e. $M$ may indicate $0$ despite of $x\in L$ and $1$ despite of $x\notin L$. It is also possible to define complexity classes with one-sided error. In this case, $M(x)$ may still indicate, say, a false reject, but not a false accept. This leads to the definition of the complexity class $\mathrm{RP}$ (abbreviation for random polynomial-time). It holds $L\in \mathrm{RP}$ if a polynomial-time PTM $M$ exists with
oning. Consider the cell  of the tape being scanned by <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446019.png" /> in the final  configuration. If the symbol printed on it is blank, then <img  align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446020.png" /> is taken to be  the empty word. Otherwise, <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446021.png" /> is taken to be  the word printed on the maximum segment of the tape including the  scanned cell and not containing any blanks.
+
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0. \end{align*}
 +
This is equivalent to  
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] =1. \end{align*}
 +
An immediate consequence of the definition is the inclusion $\mathrm{RP} \subseteq \mathrm{NP}$, whereby $\mathrm{NP}$ is the complexity class of nondeterministically polynomial-time languages. Analogously, it holds $L\in \mathrm{coRP}$ if a polynomial-time PTM $M$ exists with
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 1 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3 \end{align*}
 +
or, equivalently,
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 0] = 0 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3. \end{align*}
 +
One can show both $\mathrm{RP}\subseteq \mathrm{BPP}$ and $\mathrm{coRP}\subseteq \mathrm{BPP}$. The members of $\mathrm{RP}$ gives no false accepts, while the members of $\mathrm{coRP}$ gives no false rejects. For avoiding both false accepts and rejects, i.e. false answers at all, one has to use algorithms belonging to the complexity class $\mathrm{ZPP}$.  
  
There are  strong grounds for supposing that the precise description of the general  idea of an [[Algorithm|algorithm]] in an alphabet carried out by means  of the notion of a Turing machine is adequate. Namely, it is held that for every algorithm <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446022.png" /> in some alphabet  it is possible to construct a Turing algorithm giving the same results  under the same initial data as the algorithm <img align="absmiddle"  border="0" src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446023.png" />. This  convention is known in the theory of algorithms as the Turing thesis.  The acceptance of the Turing thesis is equivalent to the acceptance of  the [[Church thesis|Church thesis]] (for partial recursive functions) or  the normalization principle (for normal algorithms, cf. [[Normal  algorithm|Normal algorithm]]). However, in contrast to the latter two, the Turing thesis is immediately highly convincing. In fact, by carrying  out computations according to a selected plan, the mathematician acts  in a way similar to a Turing machine: in considering some position in his writings and being in a certain  "state of mind" , he makes the  necessary alterations in his writing, is inspired by a new  "state of  mind" , and goes on to contemplate further writing. The fact that he  completes more complicated steps than a Turing machine seems not  principally significant.
+
The complexity class $\mathrm{ZPP}$ of zero-sided error, expected polynomial-time languages consists of all laguages $L$ for which it exists a $c\in\mathbb{R},c>0$ such that for all $x\in L$ the average running time is $|x|^c$ while the probability of providing the correct answer is equal to $1$, i.e.
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 1 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] = 1. \end{align*}
 +
For $L\in \mathrm{ZPP}$, the probability that $M(x)$ does not terminate for $x\in L$ is equal to $0$. It holds $\mathrm{ZPP} = \mathrm{RP}\cap \mathrm{coRP}$.
  
In terms of the structure of  their description and the type of functioning, Turing machines are  automata of a very general kind, so that Turing's conception has to a  considerable extent stimulated the origin of the abstract theory of  automata and largely predetermined their particular properties (cf.  [[Automaton|Automaton]]; [[Automata, theory of|Automata, theory of]]).
+
===Improvement of Probabilistic Computations===
  
There  are many modifications of Turing machines. The most widespread are multi-tape Turing machines, with one or several heads for each of its  tapes. The motion of the heads and the printing of the letters on the  tape are carried out simultaneously according to the program of the control system. Multi-tape Turing machines are conveniently used in the formalization of the notion of a relative algorithm. Thus, a function  <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446024.png" /> is  (algorithmically) computable relative to a function <img  align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446025.png" /> if there exists a  multi-tape Turing machine that computes <img align="absmiddle"  border="0" src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446026.png" /> under  the condition that in any initial configuration all the values of <img align="absmiddle" border="0"  src="https://www.encyclopediaofmath.org/legacyimages/t/t094/t094460/t09446027.png" /> are printed in fixed order on one of the tapes. In this form one can, in terms of  relative computations, introduce the important notion of Turing  reducibility in the theory of algorithms, as well as other forms of  [[Algorithmic reducibility|algorithmic reducibility]]. It is natural to  formalize the concept of a probabilistic algorithm by means of  multi-tape Turing machines. A common approach consists of the following:  A random sequence is printed on one of the tapes of the multi-tape  Turing machine; the Turing machine then deals with exactly one symbol of  this sequence at each instant. In a second approach, the program of the  control system of the Turing machine will allow the existence of  several commands with the same left-hand sides, the choice of one or  other of the commands then being carried out with prescribed  probabilities. The notion of a non-deterministic Turing machine is based  on a similar idea. Here again, the program of the control system can have several commands with the same left-hand
+
The definitions of probabilistic complexity classes given above use the specific value $2/3$ as required minimal probabilityThis somewhat arbitrarily chosen value can be replaced by any other value $1/2+\epsilon$, $\epsilon > 0$, without changing the essential meaning of the definitions. In the case of $\mathrm{RP}$ for example, an [[Algorithm|algorithm]] fulfilling
sides. In both cases,  instead of a single computation for a given input, one considers the  class of all possible computations compatible with the program. For  probabilistic Turing machines the probability of such computations is  considered; for non-deterministic Turing machines one considers the  possibility of the computation itself.
+
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0 \end{align*}
 +
iterated $m$ times in the case of $M(x) = 1$ leads to an algorithm fulfilling
 +
\begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge (2/3)^m \\
 +
\forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0. \end{align*}
 +
In the same way, algorithms belonging to the complexity class $\mathrm{coRP}$ can be modified.
  
See also [[Algorithm, complexity of description of an|Algorithm, complexity of  description of an]]; [[Algorithm, computational complexity of an|Algorithm, computational complexity of an]]; [[Computable  function|Computable function]]; [[Machine|Machine]].
+
Algorithms belonging to the complexity class $\mathrm{BPP}$ require some more effort for modifying the probability of correctness. Here, an $m$-fold repetition is used, whereby the results $b_1,\ldots,b_m$ are evaluated using a voting mechanism. Assuming that $M(x)$ decides the predicate $x\in L$ by producing the result $0$ or $1$ and that $m$ is an odd number, the modified algorithm gives $1$ if $\sum_i b_i > m/2$ and $0$ otherwise. The probability of correctness is modified according to Chernoff bounds as follows.
  
====References====
+
Let $x_1,\ldots,x_m$ be independent random variables having the same probability distribution with image set $\{0,1\}$. For $p:= \mathrm{Prob}[x_i=1]$, $X:=\sum_{i=1}^mx_i$, and $\Theta \in [0,1]$ it holds
<table><TR><TD  valign="top">[1a]</TD> <TD valign="top">  A.M. Turing,    "On computable numbers, with an application to the  Entscheidungsproblem"  ''Proc. London Math. Soc. (2)'' , '''42'''  (1937) pp. 230–265</TD></TR><TR><TD  valign="top">[1b]</TD> <TD valign="top">  A.M. Turing,    "On computable numbers with an application to the Entscheidungsproblem, a  correction"  ''Proc. London Math. Soc. (2)'' , '''43'''  (1937)  pp544–546</TD></TR><TR><TD  valign="top">[2]</TD> <TD valign="top">  E.L. Post,    "Finite combinatory processes - formulation 1" ''J. Symbolic Logic'' , '''1''' :  3 (1936)  pp. 103–105</TD></TR><TR><TD  valign="top">[3]</TD> <TD valign="top">  S.C. Kleene,    "Introduction to metamathematics" , North-Holland  (1951)</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  A.I. Mal'tsev,    "Algorithms and recursive functions" , Wolters-Noordhoff  (1970)   (Translated from Russian)</TD></TR><TR><TD  valign="top">[5]</TD> <TD valign="top">  E. Mendelson,    "Introduction to mathematical logic" , v. Nostrand  (1964)</TD></TR><TR><TD  valign="top">[6]</TD> <TD valign="top">  M. Minsky,    "Computation: finite and infinite machines" , Prentice-Hall  (1967)</TD></TR><TR><TD  valign="top">[7]</TD> <TD valign="top"> , ''Turing  machines and recursive functions'' , Moscow  (1972) (In Russian;  translated from German)</TD></TR></table>
+
$$\begin{array}{rcl}
 +
\mathrm{Prob}[X\ge (1+\Theta)pm] &\le & \exp\left(-{\Theta^2\over 3}pm\right) \\
 +
\mathrm{Prob}[X\le (1-\Theta)pm] &\le & \exp\left(-{\Theta^2\over 2}pm\right)  
 +
\end{array}$$
 +
The random variables $x_i$ are now interpreted as error variables, i.e$x_i=1$ if the $i$-th repetition of the decision algorithm gives a wrong answer and $x_i=0$ otherwiseAccording to the definition of the class $\mathrm{BPP}$, it holds $p=1-2/3=1/3$Taking $\Theta=1/2$ in the first Chernoff bound gives
 +
$$\mathrm{Prob}[X\ge m/2] \le \exp\left(-{\Theta^2\over 3}pm\right) = \exp\left(-{1\over 36}m\right) $$
 +
i.e. the error of the voting algorithm is smaller or equal to $\exp(-{m/36})$.
  
 +
===Applications of Probabilistic Computations===
  
 +
Some examples for probabilistic algorithms may be given. Only their basic ideas are presented.
 +
* The probabilistic primality testing for a natural number $n\in\mathbb{N}$ can be realized as follows: Generate random natural numbers $k_1, \ldots, k_r$ with $1 < k_i < n$. For each $k_i$, calculate the greatest common divisor $g_i:=\gcd(n,k_i)$. If it exists a $i$ with $g_i>1$, output $0$ for 'not prime'. Otherwise, output $1$.
 +
* The question, whether two polynomials $f(x)$, $g(x)$ are equal on a region $D$ can be reduced to the question, whether $f(x)-g(x)=0$ for $x\in D$. Thus, the algorithm generates random numbers $x_1, \ldots, x_r$ with $x_i \in D$. For each $x_i$, the algorithm calculates the difference $d_i:= f(x_i)-g(x_i)$. If it exists a $i$ with $d_i\neq 0$, output $0$ representing 'unequal'. Otherwise, output $1$ representing 'equal'.
  
====Comments====
+
===References===
See  also [[Complexity theory|Complexity theory]]; [[Formal languages and  automata|Formal languages and automata]];  [[Undecidability|Undecidability]]. Consult [[#References|[a1]]] and  [[#References|[a2]]] for the importance of a Turing machine as a  formalization of the intuitive notion of an algorithm and for the Church  thesis, as well as for the relation of Turing machines to complexity  theory in general.
 
  
====References====
+
{|
<table><TR><TD  valign="top">[a1]</TD> <TD valign="top">  A. Salomaa,   "Formal languages" , Acad. Press  (1973)</TD></TR><TR><TD  valign="top">[a2]</TD> <TD valign="top">  A. Salomaa,   "Computation and automata" , Cambridge Univ. Press   (1985)</TD></TR><TR><TD  valign="top">[a3]</TD> <TD valign="top">  M. Davis,    "Computability and unsolvability" , McGraw-Hill  (1958)</TD></TR><TR><TD  valign="top">[a4]</TD> <TD valign="top">  J.E. Hopcroft,    J.D. Ulman,  "Introduction to automata theory, languages and  computation" , Addison-Wesley  (1979)</TD></TR><TR><TD  valign="top">[a5]</TD> <TD valign="top">  C.H.  Papdimitriou,   "Elements of the theory of computation" , Prentice-Hall  (1981)</TD></TR></table>
+
|-
 +
|valign="top"|{{Ref|DK2000}}||valign="top"| Ding-Zhu Du, Ker-I Ko, "Theory of Computational Complexity", Wiley 2000
 +
|-
 +
|valign="top"|{{Ref|AB2009}}||valign="top"| Sanjeev Arora, Boaz Barak, "Computational Complexity: A Modern Approach", Cambridge University Press 2009
 +
|-
 +
|valign="top"|{{Ref|BC2006}}||valign="top"| [http://parlevink.cs.utwente.nl/~vdhoeven/CCC/bCC.pdf Daniel Pierre Bovet, Pierluigi Crescenzim, "Introduction to the Theory of Complexity", 2006]
 +
|-
 +
|}

Latest revision as of 17:42, 26 December 2013

2020 Mathematics Subject Classification: Primary: 68Q05 [MSN][ZBL]


A probabilistic Turing machine (PTM) is a Turing machine (TM) modified for executing a randomized computation. From the computability point of view, a PTM is equivalent to a TM. In other respects, however, the behavior of a PTM is profoundly different from the deterministic computation of a TM; false results, for example, can only be excluded statistically in this model. The physical realization of a true random number generation is possible by performing a measurement process in quantum theory.

Some applications of computer science can be better modeled by a PTM than by a classical TM. An example are environments with strong radiation like space missions crossing the radiation belt of Jupiter or robots for handling accidents in a nuclear plant. But even a usual calculation involving a very large number of single operations (e.g. calculations of $\pi$ with record precision) may be potentially influenced by cosmic rays making the calculation probabilistic.

Definition of a Probabilistic Turing Machine

A PTM $(Q,\Sigma,\Gamma,\sqcup,q_0,q_f,\delta)$ has the same components as a TM. The set $Q$ is a finite set of states, $\Sigma$ is a finite input/output alphabet, $\Gamma$ is a finite tape alphabet with $\Sigma\subseteq\Gamma$, $\sqcup\in \Gamma$ is a blank symbol with $\sqcup \notin \Sigma$, the state $q_0 \in Q$ is a start state, and $q_f \in Q$ is a stop state. The transition function $\delta$, however, does not define deterministic transitions as in the case of a Turing machine, but gives a probability distribution of possible transitions according to $ \delta: Q \times \Sigma \times Q \times \Sigma \times \{L,R\} \longrightarrow [0,1]$.

For probabilistic Turing machines, the set $C$ of configurations is defined in the same way as for Turing machines. It is also called the set of basic states. The set $\Omega$ of states is the set of possible probability distributions on the basic states, i.e. $$\Omega=\left\{(p_c)_{c\in C}\in [0,1]^C \,\,\,\left| \,\,\,\sum\limits_{c\in C} p_c=1\right.\right\}.$$ The set of states serves as memory for the computation history. Since the run of the computation is probabilistic, the definition of a state must be probabilistic as well. Thus the distinction between basic states and states.

The transition function $\delta$ can be considered as stochastic matrix $M_{ji}$ defined on the space $C$ of configurations with $ M_{ji} = \mathrm{Prob}[\delta\colon c_i \mapsto c_j] \in [0,1]$. As a stochastic matrix, the $L_1$-norm of each column of $M_{ji}$ is equal to 1, i.e. $\sum_i M_{ji} = 1$. $L_1$-norms are preserved by $M$ according to $L_1(M\cdot c) = L_1(c) = \sum_{i} c_i$ for a configuration $c\in C$. Not every stochastic matrix provides the transition function $\delta$ of a PTM, however, because such a $\delta$ must fulfill additionally a locality constraint. A Turing machine changes only a single symbol in each step and moves its head to a new position in its immediate neighborhood.

Some alternative definitions of probabilistic Turing machines can be shown to be equivalent to the definition given here.

  • A probabilistic Turing machine can also be understood as a Turing machine $(Q,\Sigma,\Gamma,\sqcup,q_0,q_f,\delta_0,\delta_1)$ having two transition functions $\delta_0$ and $\delta_1$. Which one of these two functions has to be applied in the next transition step is chosen randomly with probability $1/2$ each. This can be understood as a random number generator executing a coin toss for the binary decision between two possible continuations.
  • In a slight variation of the above approach, a probabilsitic Turing machine is a deterministic Turing machine with an additional tape (usually considered as read-only and its head moving only to the right) containing binary random numbers. Though $\delta$ is a deterministic transition function, the additional tape introduces a random decision for each step.

Complexity Theory of Probabilistic Turing Machines

For a TM, the sequence of computation steps is uniquely determined. Such a machine accepts an input $x\in\Sigma^\ast$, if the terminating state of the computation is an accepting state. For a nondeterministic Turing machine, the input $x$ is accepted if it exists a computation sequence starting with $x$ and terminating in an accepting state. For probabilistic Turing machines, such a computation sequence exists in each case, even though its probability may be zero. Thus for defining acceptance, the probability of computation sequences is taken into consideration. This leads to the following definition.

For $T\colon \mathbb{N} \longrightarrow \mathbb{N}$, a PTM $M$ decides a language $L\subseteq \Sigma^\ast$ in time $T(n)$ if

  • For each $x\in \Sigma^\ast$ and each possible computation sequence resulting from input $x$, $M$ terminates after at most $T(|x|)$ computation steps.
  • $\forall x \in L \colon \mathrm{Prob}[M(x)=1 ] \ge 2/3$
  • $\forall x \notin L \colon \mathrm{Prob}[M(x)=0 ] \ge 2/3$

In this definition, $M(x)$ designates the result of the processing of input $x$ by $M$. The expression $M(x)=1 $ indicates a termination in an accepting state, whereas $M(x)=0$ indicates a termination in a nonaccepting state. $\mathrm{Prob}[M(x)=1 ]$ denotes the fraction of computations leading to $M(x)=1$. The class of languages decided by PTMs in $O(T(n))$ computation steps is designated as $\mathrm{BPTIME}(T(n))$.

Based on $\mathrm{BPTIME}(T(n))$, the complexity class $\mathrm{BPP}$ (an abbreviation of bounded-error, probabilistic, polynomial-time) is formally defined as $$\mathrm{BPP}:=\bigcup\limits_{c\in\mathbb{R},c>0} \mathrm{BPTIME}(|x|^c).$$ This means it holds $L\in \mathrm{BPP}$ if a polynomial-time PTM $M$ exists with \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3. \end{align*} Since the transition function $\delta$ can be chosen in such a way that a specific continuation is preferred with a probability of $1$, a deterministic TM is a special case of a PTM. Thus it holds $\mathrm{P}\subseteq \mathrm{BPP}$. Up to know (2013) it is unknown, whether it holds $\mathrm{BPP} = \mathrm{P}$ or not.

The complexity class $\mathrm{BPP}$ defines the polynomial-time complexity for a PTM $M$ based on a two-sided error, i.e. $M$ may indicate $0$ despite of $x\in L$ and $1$ despite of $x\notin L$. It is also possible to define complexity classes with one-sided error. In this case, $M(x)$ may still indicate, say, a false reject, but not a false accept. This leads to the definition of the complexity class $\mathrm{RP}$ (abbreviation for random polynomial-time). It holds $L\in \mathrm{RP}$ if a polynomial-time PTM $M$ exists with \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0. \end{align*} This is equivalent to \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] =1. \end{align*} An immediate consequence of the definition is the inclusion $\mathrm{RP} \subseteq \mathrm{NP}$, whereby $\mathrm{NP}$ is the complexity class of nondeterministically polynomial-time languages. Analogously, it holds $L\in \mathrm{coRP}$ if a polynomial-time PTM $M$ exists with \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 1 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3 \end{align*} or, equivalently, \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 0] = 0 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] \ge 2/3. \end{align*} One can show both $\mathrm{RP}\subseteq \mathrm{BPP}$ and $\mathrm{coRP}\subseteq \mathrm{BPP}$. The members of $\mathrm{RP}$ gives no false accepts, while the members of $\mathrm{coRP}$ gives no false rejects. For avoiding both false accepts and rejects, i.e. false answers at all, one has to use algorithms belonging to the complexity class $\mathrm{ZPP}$.

The complexity class $\mathrm{ZPP}$ of zero-sided error, expected polynomial-time languages consists of all laguages $L$ for which it exists a $c\in\mathbb{R},c>0$ such that for all $x\in L$ the average running time is $|x|^c$ while the probability of providing the correct answer is equal to $1$, i.e. \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 1 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 0] = 1. \end{align*} For $L\in \mathrm{ZPP}$, the probability that $M(x)$ does not terminate for $x\in L$ is equal to $0$. It holds $\mathrm{ZPP} = \mathrm{RP}\cap \mathrm{coRP}$.

Improvement of Probabilistic Computations

The definitions of probabilistic complexity classes given above use the specific value $2/3$ as required minimal probability. This somewhat arbitrarily chosen value can be replaced by any other value $1/2+\epsilon$, $\epsilon > 0$, without changing the essential meaning of the definitions. In the case of $\mathrm{RP}$ for example, an algorithm fulfilling \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge 2/3 \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0 \end{align*} iterated $m$ times in the case of $M(x) = 1$ leads to an algorithm fulfilling \begin{align*}\forall x \in L \colon & \,\, \mathrm{Prob}[M(x) = 1] \ge (2/3)^m \\ \forall x \notin L \colon & \,\, \mathrm{Prob}[M(x) = 1] = 0. \end{align*} In the same way, algorithms belonging to the complexity class $\mathrm{coRP}$ can be modified.

Algorithms belonging to the complexity class $\mathrm{BPP}$ require some more effort for modifying the probability of correctness. Here, an $m$-fold repetition is used, whereby the results $b_1,\ldots,b_m$ are evaluated using a voting mechanism. Assuming that $M(x)$ decides the predicate $x\in L$ by producing the result $0$ or $1$ and that $m$ is an odd number, the modified algorithm gives $1$ if $\sum_i b_i > m/2$ and $0$ otherwise. The probability of correctness is modified according to Chernoff bounds as follows.

Let $x_1,\ldots,x_m$ be independent random variables having the same probability distribution with image set $\{0,1\}$. For $p:= \mathrm{Prob}[x_i=1]$, $X:=\sum_{i=1}^mx_i$, and $\Theta \in [0,1]$ it holds $$\begin{array}{rcl} \mathrm{Prob}[X\ge (1+\Theta)pm] &\le & \exp\left(-{\Theta^2\over 3}pm\right) \\ \mathrm{Prob}[X\le (1-\Theta)pm] &\le & \exp\left(-{\Theta^2\over 2}pm\right) \end{array}$$ The random variables $x_i$ are now interpreted as error variables, i.e. $x_i=1$ if the $i$-th repetition of the decision algorithm gives a wrong answer and $x_i=0$ otherwise. According to the definition of the class $\mathrm{BPP}$, it holds $p=1-2/3=1/3$. Taking $\Theta=1/2$ in the first Chernoff bound gives $$\mathrm{Prob}[X\ge m/2] \le \exp\left(-{\Theta^2\over 3}pm\right) = \exp\left(-{1\over 36}m\right) $$ i.e. the error of the voting algorithm is smaller or equal to $\exp(-{m/36})$.

Applications of Probabilistic Computations

Some examples for probabilistic algorithms may be given. Only their basic ideas are presented.

  • The probabilistic primality testing for a natural number $n\in\mathbb{N}$ can be realized as follows: Generate random natural numbers $k_1, \ldots, k_r$ with $1 < k_i < n$. For each $k_i$, calculate the greatest common divisor $g_i:=\gcd(n,k_i)$. If it exists a $i$ with $g_i>1$, output $0$ for 'not prime'. Otherwise, output $1$.
  • The question, whether two polynomials $f(x)$, $g(x)$ are equal on a region $D$ can be reduced to the question, whether $f(x)-g(x)=0$ for $x\in D$. Thus, the algorithm generates random numbers $x_1, \ldots, x_r$ with $x_i \in D$. For each $x_i$, the algorithm calculates the difference $d_i:= f(x_i)-g(x_i)$. If it exists a $i$ with $d_i\neq 0$, output $0$ representing 'unequal'. Otherwise, output $1$ representing 'equal'.

References

[DK2000] Ding-Zhu Du, Ker-I Ko, "Theory of Computational Complexity", Wiley 2000
[AB2009] Sanjeev Arora, Boaz Barak, "Computational Complexity: A Modern Approach", Cambridge University Press 2009
[BC2006] Daniel Pierre Bovet, Pierluigi Crescenzim, "Introduction to the Theory of Complexity", 2006
How to Cite This Entry:
Joachim Draeger/sandbox. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Joachim_Draeger/sandbox&oldid=29989