Stephen Wolframstephenwolfram.com
Publications by Stephen Wolfram * Articles * Cellular Automata * Computation Theory of Cellular Automata (1984)
Computation Theory of Cellular Automata (1984)


3. Properties of Finite Time Sets

This section discusses some properties of the regular language sets generated by a finite number of steps of cellular automaton evolution, and constructed by the procedure of Sect. 2.

We consider as a sample set the 32 ''legal'' cellular automaton rules with and . A rule is considered legal if it is symmetric, and maps the null configuration (with all site values 0) to itself. Each of the 256 possible , cellular automaton rules is conveniently labelled by a ''rule number,'' defined as the decimal equivalent of the sequence of binary digits (analogous to Eq. (2.1)) [1].

Tables 3.1, 3.2 and 3.3 give some properties of the sets generated by a few time steps in the evolution of the 32 legal , cellular automata (7).These properties are deduced from the minimal DFA which describe the , obtained according to the construction of Sect. 2.

The minimal DFA corresponding to the trivial language illustrated in Fig. 1.1a has just one state. The minimal DFA corresponding to the minimal regular grammars for more complicated languages have progressively more states. The total number of states in the minimal DFA that generates a set provides a measure of the ''complexity'' of the set , considered as a regular language. gives the size of the shortest specification of the set in terms of regular languages: this shortest specification becomes longer as the complexity of the set increases.



[ Figure 3.1 ] Non-deterministic finite automaton graphs (de Bruijn graphs) analogous to Fig. 2.1 for the cases , , and , .

Table 3.1 gives the ''regular language complexities'' for the sets generated at the first few time steps in the evolution of the legal , cellular automata. In all the cases given, is seen to be non-decreasing with time. Cellular automata with only class 1 or 2 appear to give which tend to constants after one or two time steps, or increase linearly or quadratically with time. Class 3 and 4 cellular automata usually give which increase rapidly with time. In general,



[ Table 3.1 ] Numbers of nodes (and arcs) in minimal deterministic finite automata (DFA) representing regular languages corresponding to sets of configurations generated after time steps in the evolution of legal , cellular automata. Each configuration corresponds to a path through the DFA state transition graph. The construction of Sect. 2 yields the DFA with the minimal number of nodes (states) that generates a given regular language . This DFA may be considered to give the shortest specification of viewed as a regular language. Its size measures the ''complexity'' of . The initial () set of configurations include all possible sequences of zeroes and ones, and correspond to a trivial regular language. Cellular automata with only class 1 or 2 behaviour yield regular languages whose complexities become constant, or increase as polynomials in . Cellular automata capable of class 3 or 4 behaviour usually lead to rapidly-increasing complexities. Bounds on these complexities are given when their exact calculation exceeded available computational resources. Some of the results in this table were obtained using the methods of [52] and [53].




[ Table 3.2 ] Characteristic polynomials for the adjacency matrices of state transition graphs for minimal DFA representing regular languages generated after one time step in the evolution of legal , cellular automata. The nonzero roots of these polynomials determine the number of distinct symbol sequences that can appear in configurations generated by the cellular automaton evolution. The maximal root determines the limiting entropy of the sequences.




[ Table 3.3 ] The length of the shortest distinct blocks of site values newly-excluded after exactly time steps in the evolution of legal , cellular automata. The notation indicates that the set of cellular automaton configurations forms a finite complement language (finite number of distinct excluded blocks). The notation signifies no new excluded blocks.

The upper bound is found to be attained in several cases for ; for larger , appears to grow at most exponentially with .

All possible sequences of symbols occur in the trivial language . In more complicated regular languages, only some number of the possible sequences of symbols may occur. Each sequence which occurs corresponds to a distinct path in the minimal DFA graph for the language. (Note that all distinct paths in a DFA correspond to different symbol sequences; this need not be the case in a NDFA graph.) The number of such paths is conveniently computed using a matrix representation for the DFA.

Consider as an example the set obtained by one time step in the evolution of the cellular automaton (2.1). The minimal DFA graph for this set is given in Fig. 2.4, and may be represented by the adjacency matrix

The elements of give the numbers of possible length paths in . For lengths from 1 to 10 these numbers are 2, 4, 7, 13, 24, 44, 81, 149, 274, and 504. In general, at least for large ,

where the are the eigenvalues of , and is the largest of them. These eigenvalues are determined from the characteristic polynomial for the minimal DFA adjacency matrix, given in the case of Eq. (3.2) by (8)

The largest (real) root of this characteristic polynomial (known as the ''index'' of the graph [19]) is given by the cubic algebraic integer

The set of infinite configurations generated by cellular automaton evolution may be considered to form a Cantor set. The dimension of this Cantor set is given by

and is equal to the topological entropy of the shift mapping restricted to this set (e.g. [20]). For any regular language, this entropy is given according to Eqs. (3.2) by [21]

For the case of Eq. (3.2), the entropy is thus

Table 3.2 gives the characteristic polynomials for the regular languages obtained after one time step in the evolution of the 32 legal , cellular automata, together with their largest real roots . All the nonzero roots of the appear in the expression (3.3) for , and are therefore the same for all possible DFA corresponding to a particular regular language. (They may thus be considered ''topological invariants.'') Additional powers of may appear in the characteristic polynomials obtained from non-minimal DFA.

The characteristic polynomials such as those in Table 3.2 obtained from regular languages are always monic (the term with the highest power of that appears in them always has unit coefficient). The largest roots of the for regular languages are thus always algebraic integers (e.g. [22]) (9) ,so that the entropies for regular languages are always the logarithms of algebraic integers. The minimal polynomial with as a root has a degree not greater than the size of the minimal DFA for a regular language . This bound is usually not reached, since the characteristic polynomial is usually reducible, as seen in Table 3.2. Notice that in many cases, has several factors with equal degrees. (The factorizations of the are related to the colouring properties of the corresponding graphs [19]. Note that graphs corresponding to minimal DFA always have trivial automorphism groups.) Factors (other than ) with smaller degrees appear to be associated with transient subgraphs in the minimal DFA graph.

The entropy (3.6) characterizes the number of distinct symbol sequences generated by cellular automaton evolution, without regard to the probabilities with which they occur. One may also define a measure entropy (e.g. [20])

in terms of the probabilities for length sequences. Starting from an initial ensemble in which all symbol sequences of a given length occur with equal probabilities, the probability for a sequence after time steps is given by

where is the number of (length ) -step preimages of the sequence under the cellular automaton mapping . This number is equal to the number of distinct paths through the NDFA graph analogous to in Fig. 2.1 that yield the sequence . It may also be computed from reduced NDFA graphs analogous to of Fig. 2.2 by including a weight for each path, equal to the product of weights giving the number of unreduced nodes combined into each node on the path.

The set of configurations generated by cellular automaton evolution always contracts or remains unchanged with time, as implied by Eq. (1.5). The entropies associated with the sets are therefore non-increasing with time. Class 1 cellular automata are characterized by (spatial) entropies that tend to zero with time [2]. Class 2, 3 and 4 cellular automata generate sets of configurations with nonzero limiting spatial entropy. (Class 2 cellular automata nevertheless yield patterns essentially periodic in time, with zero temporal entropy.)

Some cellular automata have the special property that

so that all possible configurations can occur at any time in their evolution, and the entropies of the are always equal to one. Such surjective cellular automaton rules may be recognized by the presence of all possible outgoing arcs at each node in a DFA representing the grammar of the set obtained after one time step in its evolution. The finite maximum size for such a DFA, constructed as in Sect. 2, ensures that this procedure (cf. [24--27, 2]) for determining the surjectiveness of any cellular automaton rule is a finite one (10).

Since there are outgoing arcs at each node in the original NDFA analogous to Fig. 2.1 for any cellular automaton rule, the rule is surjective if in all cases these arcs carry distinct symbols (so that the NDFA is in fact a DFA). This occurs whenever the local cellular automaton mapping is injective with respect to its first or last argument (as for additive rules [29, 16] such as 90, 150 or 204 in Tables 3.1--3.3). However, at least when or , there exist surjective cellular automata for which this does not occur [25, 30]. Since all surjective cellular automata must yield the same trivial minimal DFA, it is possible that a reversal of the minimization and subset algorithms discussed in Sect. 2 could be used to generate all NDFA analogous to Fig. 2.1 that correspond to surjective rules.

Surjective cellular automata yield trivial regular languages, in which all possible blocks of symbols may appear. Some cellular automata generate the slightly more complicated ''finite complement'' regular languages, in which a finite set of distinct blocks are excluded. (Such languages are equivalent to ''subshifts of finite type'' (e.g. [10, 11]).) An example of a finite complement language, illustrated in Fig. 1.1b, consists of all sequences from which the block of sites 11 is absent. To construct the grammar for a finite complement language in which blocks of length are excluded, first form a graph analogous to Fig. 2.1, but with sequences of length at each node. Each arc then corresponds to a length sequence, and may be labelled by the last symbol in the sequence. With this labelling, one arc carrying each of the possible symbols emanates from each of the nodes, so that the graph represents a DFA. Removing arcs corresponding to the excluded length blocks then yields the graph for a DFA that recognizes the finite complement language with these blocks absent. Examples of the resulting graphs for two simple cases are shown in Fig. 3.1.



[ Figure 3.2 ] Non-deterministic finite automata (NDFA) corresponding to finite complement regular languages consisting of sequences of zeroes and ones in which a the block 11 is excluded, and b the block 111 is excluded. The graphs are constructed from analogues of Fig. 2.1 by dropping arcs corresponding to excluded blocks.

The minimal DFA for a finite complement language with a maximal distinct excluded block of length has at most states, and at least states. An excluded block is considered ''distinct'' if it contains no excluded sub-blocks. (Hence, for example, in the language of Figs. 1.1a and 3.1a, the excluded block 11 is considered distinct, but 110, 111 and so on, are not.)

Any path through the minimal DFA graph for a regular language of length greater than must contain a cycle, which retraverses some arcs. If no symbol sequence of length less than is excluded, then no sequence of any length can therefore be excluded, and the corresponding language must be trivial. If some symbol sequences of length less than are excluded, but no distinct sequences with lengths between and are excluded, then no longer distinct sequences can be excluded, and the corresponding language must be a finite complement one. If further distinct excluded blocks with lengths between and are found, then an infinite series of longer distinct excluded blocks must exist, and the language cannot be a finite complement one.

The language of Fig. 2.4, generated by the evolution of the , cellular automaton with rule number 76, is a finite complement one, in which 111 is the only distinct excluded block. The language of Fig. 2.5, obtained after one time step in the evolution of rule number 18, is not a finite complement one. The block 111 is the shortest excluded in this case. But the distinct length 7 block 1101011 is also excluded, as are the two distinct length 8 blocks 11001011 and 11010011, three distinct length 9 blocks (11010100011, 110001011, 110010011), four distinct length 10 blocks, and so on.

The length of the shortest excluded block in a language generated by cellular automaton evolution (denoted in [2]) is in general given by the shortest distance from the start node in the corresponding DFA graph to an ''incomplete'' node, with less than outgoing arcs. If the cellular automaton rule is not surjective, then

Whenever cellular automaton evolution is irreversible, the set of configurations generated contracts with time, and progressively more distinct blocks are excluded. One may define to be the length of the shortest newly-excluded block at time step in the evolution of a cellular automaton. The values of obtained in the first few time steps of evolution according to the 32 legal , cellular automaton rules are given in Table 3.3. In most cases, is seen to increase with time, indicating that progressively finer subsets of are excluded, and qualitatively reflecting the increase of . In general, however, need not increase monotonically with time. A length block is excluded after time steps if there is no initial length block that evolves into it. A length block is newly excluded at time step if and only if no length blocks allowed at time step evolve to it, but at least one length block newly excluded at time step would evolve to it. The length of the shortest newly excluded block at time is thus bounded by

Table 3.3 includes several cases for which the lower bound is realized.

The sets of infinite symbol sequences generated by cellular automaton evolution are characterized in part by the numbers and lengths of allowed and excluded finite blocks which appear in them. A further characterization may be given in terms of the number of infinite sequences with (spatial) period that appear. This number is related to the number of distinct cycles in the minimal DFA graph for . Cycles are considered distinct if the sequences of symbols that appear in them are distinct. The enumeration of cycles thus requires knowledge of the arc labelling as well as connectivity of the DFA graph.

Just as the number of finite blocks for all may be summarized in the characteristic polynomial , so also the number of periodic configurations may be summarized in the zeta function (e.g. [10, 11])

For all regular languages is a rational function of [31]. For the special case of finite complement languages,

A finite procedure may be given [32] to compute for any regular language.

previous  l  next