Systems that follow the second law of thermodynamics evolve with time to maximal entropy and complete disorder, destroying any order initially present. Cellular automata are examples of mathematical systems which may instead exhibit ''self-organizing'' behaviour (1). Even starting from complete disorder, their irreversible evolution can spontaneously generate ordered structure. One coarse indication of such self-organization is a decrease of entropy with time. This paper discusses an approach to a more complete mathematical characterization of self-organizing processes in cellular automata, and possible quantitative measures of the ''complexity'' generated by them. The evolution of cellular automata is viewed as a computation which processes the information specified by the initial state. The structure of the output from such information processing is then described using the mathematical theory of formal languages (e.g. [6--8]). Detailed results and examples for the simpler cases are presented, and some general conjectures are outlined. Computation and formal language theory may in general be expected to play a role in the theory of non-equilibrium and self-organizing systems analogous to the role of information theory in conventional statistical mechanics.
A one-dimensional cellular automaton consists of a line of sites, with each site taking on one of a finite set of $k$ possible values, updated in discrete time steps according to a deterministic rule involving a local neighbourhood of sites around it. The value of site $i$ at time step $t$ is denoted $a_i^{(t)}$, and is a symbol chosen from the alphabet $\Sigma = \{0, 1, \ldots, k-1\}$.
The possible sequences of these symbols form the set of cellular automaton configurations $\Omega^{(0)}$. Most of this paper concerns the evolution of infinite sequences; finite sequences flanked by quiescent sites (with, say, value 0) may also be considered. At each time step, each site value is updated according to the values of a neighbourhood of $2r+1$ sites around it by a local rule $\phi$
of the form (2)

$$a_i^{(t+1)} = \phi\left[ a_{i-r}^{(t)}, a_{i-r+1}^{(t)}, \ldots, a_{i+r}^{(t)} \right].$$
This local rule leads to a global mapping $\Phi$ on complete cellular automaton configurations. Then in general

$$\Omega^{(t)} = \Phi^t \, \Omega^{(0)}$$

is the set (ensemble) of configurations generated after $t$ iterated applications of $\Phi$ ($t$ time steps).
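The update scheme above can be sketched in a few lines of code. This is an illustrative example only, assuming $k = 2$ and $r = 1$, with the additive ''rule 90'' (new site value given by the XOR of the two neighbours) standing in for a generic local rule $\phi$; periodic boundary conditions replace the infinite line.

```python
# Minimal sketch of one-dimensional cellular automaton evolution
# (assumptions: k = 2, r = 1, rule 90 as the local rule phi).

def step(config, phi, r=1):
    """Apply the global map Phi once, with periodic boundary conditions."""
    n = len(config)
    return [phi([config[(i + j) % n] for j in range(-r, r + 1)])
            for i in range(n)]

def rule90(neighbourhood):
    left, _, right = neighbourhood
    return left ^ right          # additive rule: XOR of the two neighbours

config = [0] * 21
config[10] = 1                   # a single nonzero site
for t in range(4):
    config = step(config, rule90)
```

After four steps the single seed has spread into row 4 of Pascal's triangle mod 2: nonzero sites only at distance 4 on either side of the seed.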
Formal languages consist of sets of words formed from strings of symbols in a finite alphabet according to definite grammatical rules. Sets of cellular automaton configurations may thus be considered as formal languages, with each word in the language representing a cellular automaton configuration. Such infinite sets of configurations are then completely specified by finite sets of grammatical rules. (This descriptive use of formal grammars may be contrasted with the use of their transformation rules to define the dynamical evolution of developmental (''L'') systems (e.g. ).)
Figure 1.1 gives typical examples of the evolution of cellular automata from disordered initial states according to various rules $\phi$. Structure of varying complexity is seen to be formed. Four basic classes of behaviour are found in these and other cellular automata. In order of increasing apparent complexity, qualitative characterizations of these classes are as follows:
Approaches based on dynamical systems theory (e.g. [10, 11]) suggest some quantitative characterizations of these classes: the first three are analogous to the limit points, limit cycles, and chaotic (''strange'') attractors found in continuous dynamical systems. The fourth class exhibits more complex behaviour, and, as discussed below, is conjectured to be capable of universal computation (e.g. [6--8]). The formal language theory approach discussed in this paper provides more precise and complete characterizations of the classes and their complexity.
The four classes of cellular automata generate distinctive patterns by evolution from finite initial configurations, as illustrated in Fig. 1.2:
The classes are also distinguished by the effects of small changes in initial configurations:
''Information'' associated with the initial state thus propagates only a finite distance in classes 1 and 2, but may propagate an infinite distance in classes 3 and 4. In class 3, it typically propagates at a fixed positive speed.
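The propagation of information from a single-site change can be made concrete with a small numerical experiment. The sketch below again assumes $k = 2$, $r = 1$ and the additive rule 90 (an illustrative class 3 choice, not taken from the text): since the rule is XOR-linear, the difference pattern between two configurations itself evolves like a single-seed pattern, and spreads at unit speed.

```python
# Sketch: track where two configurations differing at one site disagree
# (assumptions: k = 2, r = 1, additive rule 90; ring of 41 sites).

import random

def step(config):
    n = len(config)
    return [config[(i - 1) % n] ^ config[(i + 1) % n] for i in range(n)]

random.seed(0)
a = [random.randint(0, 1) for _ in range(41)]
b = list(a)
b[20] ^= 1                       # flip a single site

widths = []
for t in range(10):
    diff = [i for i in range(41) if a[i] != b[i]]
    widths.append(max(diff) - min(diff))
    a, b = step(a), step(b)
```

For this additive rule the width of the difference region grows by exactly two sites per time step, i.e. the change propagates at speed one in each direction.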
The grammar of a formal language gives rules for generating or recognizing the words in the language. An idealized computer (such as a Turing machine) may be constructed to implement these rules. Such a computer may be taken to consist of a ''central processing unit'' with a fixed finite number of internal states, together with a ''memory'' or ''tape.'' Four types of formal language are conventionally identified, roughly characterized by the size of the memory in computers that implement them (e.g. ):
These four types of languages (essentially) form a hierarchy, with type 0 the most general. Only type 0 languages require full universal computers; the other three types of language are associated with progressively simpler types of computer (linear-bounded automata, pushdown automata, and finite automata, respectively).
The grammatical rules for a formal language may be specified as ''productions'' which define transformations or rewriting rules for strings of symbols. In addition to the set of ''terminal'' symbols which appear directly in the words of the language, one introduces a set of intermediate ''non-terminal'' symbols. To generate words in the language, one begins with a particular non-terminal ''start'' symbol, then applies applicable productions in turn, eventually obtaining strings containing only terminal symbols. The different types of languages involve productions of different kinds:
Words in languages are recognized (or ''parsed'') by finding sequences of inverse productions that transform the words back to the start symbol.
The grammars for regular (type 3) languages may be specified by the finite state transition graphs for the finite automata that recognize them. Each arc in such a graph carries a symbol from the alphabet $\Sigma$. The nodes in the graph are labelled by non-terminal symbols, and connected according to the production rules of the grammar. Words in the language correspond to paths through the state transition graph. The (set) entropy of the language, defined as the asymptotic exponential growth rate of the number of words with length $n$ (see Sect. 3), is then given by the logarithm of the largest eigenvalue of the adjacency matrix for the state transition graph. This eigenvalue is always an algebraic integer.
The set of all possible sequences of zeroes and ones forms a trivial regular language, corresponding to a finite automaton with the state transition graph of Fig. 1.3a. Exclusion of all sequences with pairs of adjacent ones (so that any 1 must be followed by a 0) yields the regular language of Fig. 1.3b. The set of sequences in which, say, an even number of isolated ones appears between every 0110 block again forms a regular language, now specified by the graph of Fig. 1.3c.
Regular expressions provide a convenient notation for regular languages. For example, $(0+1)^*$ represents all possible sequences of zeroes and ones, corresponding to Fig. 1.3a. Here $s^*$ denotes an arbitrary number of repetitions of the string $s$, and $+$ separates alternatives. With this notation, $(0+10)^*$ represents Fig. 1.3b, and a longer expression of the same kind represents Fig. 1.3c.
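The connection between the transition graph, its adjacency matrix, and the set entropy can be checked directly for the language of Fig. 1.3b. In this sketch (the two-state encoding is an assumed convention) the number of allowed words of length $n$ grows like the largest eigenvalue of the adjacency matrix, here the golden ratio, so the entropy is $\log[(1+\sqrt{5})/2]$.

```python
# Sketch for the Fig. 1.3b language (no adjacent ones): word counts from
# the adjacency matrix of a two-state finite automaton.
# State 0: last symbol was 0 (or start); state 1: last symbol was 1.

import math

# adjacency[i][j] = number of arcs from state i to state j
adjacency = [[1, 1],
             [1, 0]]

def count_words(n):
    """Number of length-n strings over {0, 1} containing no '11' block."""
    counts = [1, 0]                        # start in state 0
    for _ in range(n):
        counts = [counts[0] + counts[1],   # arcs into state 0 (symbol 0)
                  counts[0]]               # arc into state 1 (symbol 1)
    return sum(counts)

golden = (1 + math.sqrt(5)) / 2            # largest eigenvalue of the matrix
ratio = count_words(20) / count_words(19)  # approaches the eigenvalue
```

The counts are Fibonacci numbers, and the ratio of successive counts converges rapidly to the golden ratio, an algebraic integer as stated above.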
Many regular grammars may in general yield the same regular language. However, it is always possible to find the simplest grammar for a given regular language (by the Myhill-Nerode theorem), whose corresponding finite automaton has the minimal number of states (nodes). This minimal number of states provides a measure of the ''complexity'' of the regular language. The regular languages of Fig. 1.3a--c are thus deemed to have progressively greater regular language complexities.
Section 2 shows that the sets of configurations generated by any finite number of steps in the evolution of a cellular automaton form a regular language. For some cellular automata, the complexities of the regular languages obtained tend to a fixed limit after a few time steps, yielding a large-time limiting set of configurations corresponding to a regular language. In general, it appears that the limit sets for all cellular automata that exhibit only class 1 or 2 behaviour are given by regular languages. For most class 3 and 4 cellular automata, however, the regular language complexities of the sets increase rapidly with time, presumably leading to limit sets that are not regular languages.
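One way to see finite-time sets taking shape is to enumerate which blocks survive one time step. The sketch below (with elementary rule 4 chosen purely as an illustration, and the standard elementary-rule bit numbering assumed) maps every length-$(L+2)$ block through the local rule; length-$L$ blocks with no pre-image are excluded from the $t = 1$ set, and these exclusions are exactly what a regular grammar for the set encodes.

```python
# Sketch: find length-L blocks excluded after one step of an elementary
# (k = 2, r = 1) cellular automaton. Rule 4 is an illustrative choice.

from itertools import product

def local_rule(rule_number):
    """Standard elementary-rule numbering: neighbourhood (a, b, c)
    reads off bit 4a + 2b + c of the rule number."""
    return {(a, b, c): (rule_number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def reachable_blocks(rule_number, length):
    phi = local_rule(rule_number)
    seen = set()
    for block in product((0, 1), repeat=length + 2):
        seen.add(tuple(phi[block[i:i + 3]] for i in range(length)))
    return seen

blocks = reachable_blocks(4, 3)
excluded = 2 ** 3 - len(blocks)      # length-3 blocks absent at t = 1
```

For rule 4 (only the neighbourhood 010 yields a 1), every length-3 block containing the pair 11 is excluded after one step, so the $t = 1$ set is a nontrivial regular language of the Fig. 1.3b kind.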
Sets of symbol sequences analogous to sets of cellular automaton configurations are obtained from the ''symbolic dynamics'' of continuous dynamical systems, in which the values of real number parameters are divided into discrete bins, labelled by symbols (e.g. [10, 11]). The simplest symbol sequences obtained in this way are ''full shifts,'' corresponding to trivial regular languages containing all possible sequences of symbols. More complicated systems yield finite complement languages, or ''subshifts of finite type,'' in which a finite set of fixed blocks of symbols is excluded. ''Sofic'' systems, equivalent to general regular languages, have also been studied. There is nevertheless evidence that, just as in cellular automata, regular languages are inadequate to describe the complete symbolic dynamics of even quite simple continuous dynamical systems.
Context-free (type 2) languages are generalizations of regular languages. Words in context-free languages may be viewed as sequences of terminal nodes (leaves) in trees constructed according to context-free grammatical rules. Each non-terminal symbol in the context-free grammar is taken to correspond to a type of tree node. The production rules for the non-terminal symbol then specify its possible descendents in the tree. For each word in the language, there corresponds such a ''derivation'' tree, rooted at the start symbol. (In most context-free languages, there are ''ambiguous'' words, obtained from multiple distinct derivation trees.) The syntax of most practical computer languages is usually taken to be context-free. Each grammatical production rule corresponds to a subexpression with a particular structure; the subexpressions may be arbitrarily nested, corresponding to arbitrary derivation trees.
Regular languages correspond to context-free languages whose derivation trees consist only of a ''trunk'' sprouting a sequence of leaves, one at a time. An example of a context-free language not represented by any regular grammar is the set of strings of the form $0^n 1^n$ for any $n$. (Here, as elsewhere, $s^n$ represents $n$-fold repetition of the string $s$.) A derivation tree for a word in this language is shown in Fig. 1.4. In general, the productions of any context-free language may in fact be arranged so that all derivation trees are binary (Chomsky normal form) (4).
At each point in the generation of a word in a regular language, the next symbol depends only on the current finite automaton state, and not on any previous history. (Regular language words may thus be considered as Markov chains.) To generate words in a context-free language, however, one must maintain a ''stack'' (last-in first-out memory), which at each point represents the part of the derivation tree above the symbol (tree leaf) just generated. In this way, words in context-free languages may exhibit certain long-range correlations, as illustrated in Fig. 1.4. (In practical computer languages, these long-range correlations are typically manifest in the pairing of parentheses separated by many subexpressions.)
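The role of the stack can be illustrated with a pushdown recognizer for the language of strings $0^n 1^n$ mentioned above. This is a minimal sketch, not taken from the text: the stack height records the part of the derivation tree still ''open,'' and the long-range correlation between the block of zeroes and the block of ones is checked when the stack is emptied.

```python
# Sketch of a pushdown (single stack) recognizer for {0^n 1^n : n >= 1}.

def accepts(word):
    stack = []
    seen_one = False
    for symbol in word:
        if symbol == "0":
            if seen_one:
                return False         # a 0 after a 1: wrong shape
            stack.append("0")
        elif symbol == "1":
            seen_one = True
            if not stack:
                return False         # more 1s than matching 0s
            stack.pop()
        else:
            return False
    return seen_one and not stack    # every 0 matched by a later 1
```

No finite automaton can perform this check, since the stack height (the count of unmatched zeroes) is unbounded.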
The production rules of a context-free grammar specify transformations for individual non-terminal symbols, independent of the ''context'' in which they appear. Context-sensitive grammars represent a generalization in which the transformations for a particular symbol may depend on the strings of symbols that precede or follow it (its ''context''). However, the transformation or production rule $u \to v$ for any string $u$ is required to yield a string $v$ of greater or equal length. The set of all strings of the form $0^n 1^n 0^n$ for any $n$ forms a context-sensitive language, not represented by any context-free or simpler language. The words in a context-sensitive language may be viewed as formed from sequences of terminal nodes in a directed graph. The graph is a derivation tree rooted at the start symbol, but with connections representing context sensitivities added. The requirement $|v| \ge |u|$ implies that there are progressively more nodes at each stage: the length of a word in a context-sensitive language thus gives an upper bound on the number of nodes that occur at any stage in its derivation. A machine that recognizes words in a context-sensitive language by enumerating all applicable derivation graphs therefore need only have a memory as large as the words to be recognized.
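The linear memory bound can be illustrated with a recognizer for the context-sensitive language of strings $0^n 1^n 0^n$ (used here as an illustrative example). The sketch below works directly on a copy of the input, so its memory use is bounded by the word length, as for a linear-bounded automaton.

```python
# Sketch: recognize {0^n 1^n 0^n : n >= 1} using memory proportional
# to the input word itself.

def accepts(word):
    n = len(word)
    if n == 0 or n % 3 != 0:
        return False
    m = n // 3                       # each of the three blocks has length m
    return (word[:m] == "0" * m and
            word[m:2 * m] == "1" * m and
            word[2 * m:] == "0" * m)
```

A single stack does not suffice here: matching the first block of zeroes against both the ones and the trailing zeroes requires re-reading the workspace, which a pushdown automaton cannot do.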
Unrestricted (type 0) languages are associated with universal computers. A system is considered capable of ''universal computation'' if, with some particular input, it can simulate the behaviour of any other computational system (5). A universal computer may thus be ''programmed'' to implement any finite algorithm. A universal Turing machine has an infinite memory, and a central processing unit with a particular ''instruction set.'' (The ''simplest'' known universal Turing machine has seven internal states, and a memory arranged as a line of sites, each having four possible values, with one site accessible to the central processing unit at each time step (e.g. ).) Several quite different systems capable of universal computation have also been found. Among these are string manipulation systems which directly apply the production rules of type 0 languages; machines with a single infinite-precision arithmetic register; logic circuits analogous to those of practical digital electronic computers; and mathematical systems such as the $\lambda$-calculus (general recursive functions). Some cellular automata have also been proved capable of universal computation. For example, a one-dimensional cellular automaton (with suitable $k$ and $r$) is equivalent to the simplest known universal Turing machine (e.g. ). (A two-dimensional cellular automaton, the ''Game of Life,'' with $k=2$ and a nine-site neighbourhood, has also been proved computationally universal (e.g. ).) It is conjectured that all cellular automata in the fourth class indicated above are in fact capable of universal computation.
There are many problems which can be stated in finite terms, but which are ''undecidable'' in a finite time, even for a universal computer (6). An example is the ''halting problem'': to determine whether a particular computer will ''halt'' in a finite time, given particular input. The only way to predict the behaviour of some system $S$ is in general to execute some procedure in a universal computer; but if, for example, $S$ is itself a universal computer, then the procedure must reduce to a direct simulation, and can run no more than a finite amount faster than the evolution of $S$ itself. The infinite time behaviour of $S$ cannot therefore be determined in general in a finite time. For a cellular automaton, an analogue of the halting problem is to determine whether a particular finite initial configuration will ultimately evolve to the null configuration.
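The cellular automaton halting analogue can be probed only by bounded direct simulation, as in this sketch (rule numbers 0 and 204, the identity rule, are chosen purely as illustrations; the finite ring of sites stands in for an infinite quiescent background). No fixed step bound settles the unbounded question in general.

```python
# Sketch: test by direct simulation whether a finite initial configuration
# reaches the null (all-zero) configuration within a fixed step bound.

def local_rule(rule_number):
    """Standard elementary-rule numbering for k = 2, r = 1."""
    return {(a, b, c): (rule_number >> (4 * a + 2 * b + c)) & 1
            for a in (0, 1) for b in (0, 1) for c in (0, 1)}

def reaches_null(initial, rule_number, max_steps, width=32):
    phi = local_rule(rule_number)
    config = list(initial) + [0] * (width - len(initial))
    for _ in range(max_steps):
        if not any(config):
            return True
        config = [phi[(config[(i - 1) % width], config[i],
                       config[(i + 1) % width])] for i in range(width)]
    return not any(config)
```

Rule 0 erases any configuration in one step; the identity rule 204 never reaches the null configuration from a nonzero state, and the simulation can only ever report failure up to its bound.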
Any problem which depends on the results of infinite information processing may potentially be undecidable. However, when the information processing is sufficiently simple, there may be a finite ''short-cut'' procedure to determine the solution. For example, the information processing corresponding to the evolution of cellular automata with only class 1 or 2 behaviour appears to be sufficiently simple that their infinite time behaviour may be found by finite computation. Many problems concerning the infinite time behaviour of class 3 and 4 cellular automata may, however, be undecidable. For example, the entropies of the invariant sets for class 3 and 4 cellular automata may in general be non-computable numbers. This would be the case if the languages corresponding to these limit sets were of type 0 or 1.
It seems likely, in fact, that the consequences of infinite evolution in many dynamical systems may not be described in finite mathematical terms, so that many questions concerning their limiting behaviour are formally undecidable. Many features of the behaviour of such systems may be determined effectively only by explicit simulation: no general predictions are possible.
Even for results that can in principle be obtained by finite computation, there is a wide variation in the magnitude of time (or memory resources) required. Several classes of finite computations may be distinguished (e.g. ).
The first class (denoted P) consists of problems that can be solved by a deterministic procedure in a time given by some polynomial function of the size of their input. For example, finding the successor of a length $n$ sequence in a cellular automaton takes (at most) a time linear in $n$, and is therefore a problem in the class P. Since most universal computers can simulate any other computer in a polynomial time, the times required on different computers usually differ by at most a polynomial transformation, and the set of problems in the class P is defined almost independently of the computer used.
Nondeterministic polynomial time problems (NP) form a second class. Solutions to such problems may not necessarily be obtained in a polynomial time by a systematic procedure, but the correctness of a candidate solution, once guessed, can be tested in a polynomial time. Clearly P $\subseteq$ NP, and there is considerable circumstantial evidence that P $\neq$ NP. The problem of finding a pre-image for a length $n$ sequence under cellular automaton evolution is in the class NP.
The problem classes P and NP are characterized by the times required for computations. One may also consider the class PSPACE of problems that require memory space given by a polynomial function of the size of the input, but may take an arbitrary time. There is again circumstantial evidence that NP $\neq$ PSPACE.
Just as there exist universal computers which, when given particular input, can simulate any other computer, so, analogously, there exist ''NP-complete'' (or ''PSPACE-complete'') problems which, with particular input, correspond to any NP (or PSPACE) problem of a particular size (e.g. [6, 7]). Many NP- and PSPACE-complete problems are known. An example of an NP-complete problem is ''satisfiability'': finding truth values for $n$ variables which make a particular Boolean expression true. If P $\neq$ NP, then there is essentially no faster method to solve this problem than to try all $2^n$ possible sets of truth values. (It appears that any method must then require a time larger than any polynomial in $n$.) As discussed in Sect. 6, it is likely that the problem of finding pre-images for sequences in certain cellular automata, or of determining whether particular sequences are ever generated, is NP-complete. This would imply that no simple description exists even for some finite time properties of cellular automata: results may be found essentially only by explicit simulation of all possibilities.
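The brute-force character of satisfiability can be sketched directly: try all $2^n$ truth assignments and test each in polynomial time. The clause encoding and the example formula below are hypothetical conventions introduced for illustration, not taken from the text.

```python
# Sketch of brute-force satisfiability testing over all 2^n assignments.

from itertools import product

def satisfiable(clauses, n):
    """Each clause is a list of literals: +i means variable i true,
    -i means variable i false; a clause is the OR of its literals."""
    for assignment in product((False, True), repeat=n):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True              # a satisfying assignment exists
    return False

# hypothetical formula: (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
formula = [[1, -2], [2, 3], [-1, -3]]
```

Checking one guessed assignment takes time linear in the formula size (the NP verification step); only the enumeration over assignments is exponential.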