formal programming language
A formal language intended for describing computational processes or, equivalently, for writing down algorithms to be executed by computers. One distinguishes between problem-oriented algorithmic languages (high-level languages), which are not related to any specific machine, and machine-oriented algorithmic languages (low-level languages), which take the specific features of a given machine into account (instruction set, addressing modes, etc.). The term "algorithmic language" usually refers to a problem-oriented language, as opposed to machine code, which is a notation that is directly interpreted by a machine. For the well-formed texts of an algorithmic language (programs, cf. Program) a general algorithm defines their execution in a unique way, which is the distinction between algorithmic languages and non-algorithmic programming languages, for which the execution process for a text is fully undetermined or the text merely serves as material for the synthesis of an algorithm.
As in natural languages, an algorithmic language is constructed over an alphabet of basic symbols (in which the program is written down) in the form of a hierarchical system of grammatical elements, between which relations are given (similarly to the words, phrases and sentences in a natural language, whose connections are given by syntactic rules). The lowest level elements, formed by chains of basic symbols, are called lexemes, or lexical units. For the lexemes occurring in a program the class to which they belong is defined, and for certain classes of lexemes (e.g. identifiers) also their scope — some uniquely identifiable part of the program to which all occurrences of a given lexeme belong (a block). Exactly one occurrence of such a lexeme is said to be defining; the other occurrences of the lexeme in its scope are called applied.
The further levels of elements of an algorithmic language are formed by notions, or non-terminals. The relation that may hold between the notions of an algorithmic language is that of being a (direct) constituent of (i.e. an immediate constituting part), while the individual constituents of a given notion are related to each other by concatenation (textual sequence). The transitive closure of the constituent relation uniquely assigns to each notion some subword of the text of the program, which is said to be the (terminal) production of this notion. There is one initial notion, the production of which is the entire program text. A tree whose root is the initial notion, whose terminal vertices (leafs) are lexemes and basic symbols, whose internal vertices are concepts and whose branches are constituent relations, is called a production or syntax tree of a program. The construction of such a tree is known as the syntactic analysis or parsing of a program.
Notions and lexemes have attributes, i.e. certain sets fixed by the description of the algorithmic language. Determining the attributes of the elements occurring in a program is called its semantic analysis. Finding the attributes of lexemes begins with the analysis of their defining occurrences, which usually contain explicit information about the attributes. The attribute information is transferred to all the applied occurrences of the lexeme within the corresponding scope (identification). The attributes of a certain notion are found, by induction over the production tree, as a function of the attributes of its constituents. A substantial part of the semantic analysis, which is very valuable for checking the correctness of a program, is a check on the compatibility of the attributes.
The rules of syntactic analysis are laid down by the generative grammar of the algorithmic language (cf. Grammar, generative) or by an analyzing automaton (more precisely, by its different generalizations) for converting the program text into a production tree. The rules of semantic analysis are usually described informally, but attempts have been made to formalize the definition of attribute information and to take the context into consideration with the aid of the mechanism of two-level grammars (cf. Algol-68).
The execution algorithm of the programs of an algorithmic language is usually given in the form of a correspondence between the vertices of the production tree and various execution procedures (also called transducers or semantic procedures). Each procedure performs certain operations as a function of the attributes and of some context of the given notion (or lexeme), and then determines the next procedure. The operations of the procedure may specify the computation process directly in terms of some abstract machine (semantics of the interpretation type), or may transform the production of a given notion into a fragment of the program in some high-level language (translational type).
In specific algorithmic languages, the alphabet of basic symbols usually consists of the letters of the Roman alphabet (which may be replaced or augmented by characters from a national set), digits, pairs of delimiters (parentheses), separators (punctuation marks) and symbols for certain operations. The principal classes of lexemes are numerals to represent numbers, literals to represent texts, and identifiers to denote the various objects of the program that are defined in the program itself. The principal such objects are variables, labels (which name different parts of the program) and procedures (which denote functions). The meanings and the names of certain identifiers are defined by the description of an algorithmic language (reserved words).
Among the concepts of an algorithmic language are basic structures — definitions, expressions and operators. Descriptions are sources of attribute information which is assigned to the defining occurrence of the lexeme. Essentially, attributes characterize the type of values calculated by executing the program, their representation and the way in which they are stored in the memory of the computer. For composite values like arrays (vectors, matrices) and structured values (records) the mode of access to their elementary components is also specified. Expressions are the source of values; operators are units of finite operations in the program; basic operators are the operators of assignment of the value of an expression to a variable, the operator of control transfer (unconditional or conditional jump, or goto), the operator of procedure call, and the operator of a loop.
Loops and procedures are the most characteristic tools of an algorithmic language, providing as they do a concise notation for very long computations. The loop operator prescribes a repetitive execution of some part of the program (the body of the loop), the number of repetitions being determined by some given condition. The definition of a procedure introduces an abbreviated, parametrized notation for the function of some part of the program (the procedure body). The execution of the procedure body may be subsequently treated as an elementary operation triggered by a call of the procedure, at which time the actual values are assigned to the parameters included in the procedure.
An important property of an algorithmic language is its capacity to structure a program text, which makes it easier to write, to learn and to check. The principal means of structuring are limitations on the structure of scopes (block structure), on procedure and loop, bodies and rules imposed on the use of goto operators, which limits the number of branches on the execution paths.
Programming a computer in an algorithmic language necessitates the availability of special software in the form of programming processors (compilers, interpreters) — which are intermediaries between the program in the algorithmic language and the machine. A complete processor executes the following functions: input of the program, lexical analysis (isolation and classification of lexemes), syntactic and semantic analysis, with indications of formal errors, synthesis of an intermediate form (representation of the program in the internal language of some abstract computer, which is convenient for the subsequent processing or execution of the program), optimization (systematic transformations of the intermediate forms which improves certain characteristics of the program such as its size, speed and demands on memory), code generation (construction of the machine program) and execution of the program. In processors of the translational type (translators, compilers, program generators) the execution of the program takes place after the completion of the construction of the machine program. In processors of the interpretative type (incremental translators, interactive translators) the program is executed with the aid of some mechanism of interpretation of either its intermediate form, or of the production tree or even of the initial text.
One distinguishes between one-phase, two-phase and multi-phase translation schemes of programs. In the one-phase scheme all functions of the translator are executed during a single pass of the text of the program. There is no optimization and no intermediate form, while the isolated vertices of the production tree immediately generate machine commands of the program. In two-phase translators, the production tree is usually built during the first pass of the program, while the semantic analysis, combined with the generation, is effected during the second pass. Multi-phase schemes, involving the utilization of an intermediate form, are employed in optimizing translators or in systems with mixed generation for various types of machines; in certain very complex algorithmic languages syntactic and semantic analysis is also required.
The number of algorithmic languages which may be employed with computers is very large (more than one thousand), but only a few of them are extensively used. These include Algol; Algol-68; Cobol; Lisp; PL/I; Simula; Fortran; and in the USSR also Algams; Al'fa and Refal.
|||P. Ingerman, "A syntax-oriented translator" , Acad. Press (1966)|
|||, Programming languages , Moscow (1972) (In Russian; translated from English)|
|||F.R.A. Hopgood, "Compilation techniques" , American Elsevier , New York (1969)|
|||B. Higman, "A comparitive study of programming languages" , American Elsevier (1968)|
In English-speaking computer science community, usually no specific distinction is made between the concepts of "programming language" and "algorithmic language" , although the latter term historically refers mainly to languages of the Algol family. The terminology and concepts of the theory of programming languages are not standardized; the article above leans heavily on the somewhat idiomatic approach employed in the definition of Algol-68.
A language that has gained widespread use since the 1970s is Pascal, which is a close relative of Algol-60 offering some additional facilities. Next to the class of procedural (or operational) languages, which basically prescribe the sequence of operations to be performed, there are functional (or applicative) languages, in which a program is basically a collection of recursive function definitions, and deductive (or logic-programming) languages, for which a program consists of a collection of recursive definitions of predicates. Most existing algorithmic languages are procedural. An example of a functional language is Lisp, whereas Prolog is a deductive language.
|[a1]||T.W. Pratt, "Programming languages: design and implementation" , Prentice-Hall (1975)|
Algorithmic language. A.P. Ershov (originator), Encyclopedia of Mathematics. URL: http://www.encyclopediaofmath.org/index.php?title=Algorithmic_language&oldid=11836