![]() ![]() ![]() |
This section describes quantitative statistical measures of order and chaos in patterns generated by cellular automaton evolution. These measures may be used to distinguish the four classes of behaviour identified qualitatively above.
Consider first the statistical properties of configurations generated at a particular time step in cellular automaton evolution. A disordered initial state, in which each site takes on its
possible values with equal independent probabilities, is statistically random. Irreversible cellular automaton evolution generates deviations from statistical randomness. In a random sequence, all
possible subsequences (``blocks'') of length
must occur with equal probabilities. Deviations from randomness imply unequal probabilities for different subsequences. With probabilities
for the
possible sequences of site values in a length
block, one may define a specific ``spatial set entropy''

where
for
and
, and a specific ``spatial measure entropy''

In both cases, the superscript
indicates that ``spatial'' sequences (obtained at a particular time step) are considered. The ``set entropy'' (4.1) is determined directly by the total number
of length
blocks generated (with any nonzero probability) in cellular automaton evolution, according to

In the ``measure entropy'' (4.2) each block is weighted with its probability, so that the result depends explicitly on the probability measure for different cellular automaton configurations, as indicated by the subscript
. Set entropy is often called ``topological entropy''; measure entropy is sometimes referred to as ``metric entropy'' (2) (e.g. [6]). For blocks of length 1, the measure entropy
is related to the densities
, of sites with each of the
possible values
.
is related to the densities of ``digrams'' (blocks of length 2), and so on. In general, the measure entropy gives the average ``information content'' per site computed by allowing for correlations in blocks of sites up to length
. Note that the entropies (4.1) and (4.2) may be considered to have units of (
-ary) bits per unit distance.
In the equations below,
stands for either set entropy
or for measure entropy
.
The definitions (4.1) and (4.2) yield immediately

The first inequality is saturated (equality holds) only for ``equidistributed'' systems, in which all nonzero block probabilities
are equal. The second inequality is saturated if all possible length
blocks of site values occur, but perhaps with unequal probabilities.
only for ``
-random'' sequences [7], in which all
possible sequences of
site values occur with equal probabilities. In addition to (4.4), the definitions (4.1) and (4.2) imply

if and only if just one length
block occurs with nonzero probability, so that
also. As discussed below, the inequality (4.5) is saturated for class 1 cellular automata.
Both set and measure entropies satisfy the subadditivity condition

The inequality is saturated if successive blocks of sites are statistically uncorrelated. In general, it implies some decrease in
with
(for example,
). For cellular automata with translation invariant initial probability measures, stronger constraints may be obtained (analogous to those for ``stationary'' processes in communication theory [8]). First, note that bounds on
valid for any set of probabilities
also apply to
, since
may formally be reproduced from the definition (4.2) for
by a suitable (extreme) choice of the
. The probability
for the sequence of site values
is given in general by

where
denotes the conditional probability for a site value
, preceded by site values
. Defining a total entropy

and corresponding conditional total entropy

one obtains

Hence

so that the set and measure entropies for a translationally invariant system decrease monotonically with the block size
. One finds in addition in this case that

so that
is a convex function of
.
With the definition
, this implies that there exists a critical block size
, such that

The significance and values of the critical block size
will be discussed in section 7.
The entropies
and
may be evaluated either for many blocks in a single cellular automaton configuration, or for blocks in an ensemble of different configurations. For smooth probability measures on the ensemble of possible initial configurations, the results obtained in these two ways are almost always the same. (A probability measure will be considered ``smooth'' if changes in the values of a few sites in an infinite configuration lead only to infinitesimal changes in the probability for the configuration.) The set entropy
is typically independent of the probability measure on the ensemble, for any smooth measure. The measure entropy
in general depends on the probability measure for initial configurations, although for class 3 cellular automata, it is typically the same for at least a large class of smooth measures. Notice that with smooth measures, the values of
and
are the same whether the length
blocks used in their computation are taken disjoint or overlapping.
The entropies (4.1) and (4.2) are defined for infinite cellular automata. A corresponding definition may be given for finite cellular automata, with a maximum block length given by the total number of sites
in the cellular automaton. The entropies
and
are related to global properties of the state transition diagram for the finite cellular automaton. The value of
at a particular time is determined by the fraction of possible configurations which may be reached at that time by evolution from any initial configuration. The limiting value of
at large times is determined by the fraction of configuration on cycles in the state transition graph. Starting from an initial ensemble in which all
configurations occur with equal probabilities, the limiting value of
is equal to the limiting value of
if all transient trees in the state transition graph for the finite cellular automaton are identical, so that all configurations with nonzero probabilities are generated with the same probability (cf. [2]).
As mentioned in section 2, the configurations of an infinite cellular automaton may be considered as elements of a Cantor set. For an ensemble of disordered configurations (in which each site takes on its
possible values with equal independent probabilities), this Cantor set has fractal dimension 1. Irreversible cellular automaton evolution may lead to an ensemble of configurations corresponding to elements of a Cantor set with dimension less than one. The limiting value of
as
gives the fractal or ``set'' dimension of this set.
Relations between entropy and dimension may be derived in many ways (e.g. [6], [7], [8] and [9]). Consider a set of numbers in the interval
of the real line. Divide this interval into
bins of width
, and let the fraction of bins containing numbers in the set be
. For large
(small bin width), this number grows as
. The exponent
is the Kolmogorov dimension (or ``capacity'' (cf. [8])) of the set. If the set contains all real numbers in the interval
, then
, and
, as expected. If the set contains only a finite number of points, then
must tend to a constant for large
, yielding
. The classic Cantor set consists of real numbers in the interval
, whose ternary decomposition contains only the digits
and
. Dividing the interval into
equal bins, it is clear that
of these bins contain points in the set. The dimension of the set is thus
. This dimension may also be found by an explicit recursive geometrical construction, using the fact that the set is ``self-similar'', in the sense that with appropriate magnification, its parts are identical to the whole.
The example above suggests that one may define a ``set dimension''
according to

where
is the number of bins which contain elements of the set. The bins are of equal size, and their total number is taken as
. Except in particularly pathological examples, (3) the dimension obtained with this definition is equal to the more usual Hausdorff (or ``fractal'') dimension (e.g. [11]) obtained by considering the number of patches at arbitrary positions required to cover the set (rather than the number of fixed bins containing elements of the set).
The definition (4.14) may be applied directly to cellular automaton configurations. The
``bins'' may be taken to consist of cellular automaton configurations in which a block of
sites has a particular sequence of values. The definition (4.3) of set entropy then shows that the set dimension is given by

A disordered cellular automaton configuration, in which all possible sequences of site values occur with nonzero probability (or an ensemble of such configurations), gives
, as expected. Similarly, a homogeneous configuration, such as the null configuration, gives
.
The set of configurations which appear at large times in the evolution of a cellular automaton constitute the attractors for the cellular automaton. The set dimension of these attractors is given in terms of the entropies for configurations appearing at large times by eq. (4.15).
Accurate direct evaluation of the set entropy
from cellular automaton configurations typically requires sampling of many more than
length
blocks. Inadequate samples yield systematic underestimates of
. Direct estimates are most accurate when all nonzero probabilities for length
blocks are equal. In this case, a sample of
blocks yields an entropy underestimated on average by approximately

Unequal probabilities increase the magnitude of this error, and typically prevent the generation of satisfactory estimates of
from direct simulations of cellular automaton evolution. (If the probabilities follow a log normal distribution, as in many continuous chaotic dynamical systems [12], then the exponential in eq. (4.16) is apparently replaced by a power [13].)
The dimension (4.15) is given as the limiting exponent with which
increases for large
. In the formula (4.15), this exponent is obtained as the limit of
for large
. If
indeed increases roughly exponentially with
, then the alternative formula

is typically more accurate if entropy values are available only for small
.
The set dimension (4.15) may be used to characterize the set of configurations occurring on the attractor for a cellular automaton, without regard to their probabilities. One may also define a ``measure dimension''
which characterizes the probability measure for the configurations (cf. [12]):

It is clear that

The measure dimension
is equal to the ``average information per symbol'' contained in the sequence of site values in a cellular automaton configuration. If the sequence is completely random (or ``
-random'' [7]), then the probabilities
for all
sequences of length
must be equal for all
, so that
. In this case, there is no redundancy or pattern in the sequence of site values, so that determination of each site value represents accquisition of one (
-ary) bit of information. A cellular automaton configuration with any structure or pattern must give
.
In direct simulations of cellular automaton evolution, the probabilities
for each possible length
block are estimated from the frequencies with which the blocks occur. These estimated probabilities are thus subject to Gaussian errors. Although the individual estimated probabilities are unbiased, the measure entropy deduced from them according to eq. (4.2), is systematically biased. Its mean typically yields a systematic underestimate of the true measure entropy, and with fixed sample size, the underestimate deteriorates rapidly with increasing
, making an accurate estimate of
impossible. However, since an unbiased estimate may be given for any polynomial function of the
, unbiased estimated upper and lower bounds for the measure entropy may be obtained from estimates for polynomials in
just larger and just smaller than
for
[14]. In this way, it may be possible to obtain more accurate estimates of
for large
, and thus of
.
The ``spatial'' entropies (4.1) and (4.2) were defined in terms of the sequence of site values in a cellular automaton configuration at a particular time step. One may also define ``temporal'' entropies which characterize the sequence of values taken on by a particular site through many time steps of cellular automaton evolution, as illustrated in fig. 7. With probabilities
for the
possible sequences of values for a site at
successive time steps, one may define a specific temporal set entropy in analogy with eq. (4.1) by

and a specific temporal measure entropy in analogy with eq. (4.2) by


These entropies satisfy relations directly analogous to those given in eqs. (4.3) through (4.6) for spatial entropies. They obey relations analogous to (4.11) and (4.12) only for cellular automata in ``equilibrium'', statistically independent of time. The temporal entropies (4.20) and (4.21) may be considered to have units of (
-ary) bits per unit time.
Sequences of values in particular cellular automaton configurations typically have little similarity with the ``time series'' of values attained by a particular site under cellular automaton evolution. The spatial and temporal entropies for a cellular automaton are therefore in general quite different. Notice that the spatial entropy of a cellular automaton configuration may be considered as the temporal entropy of a pure shift mapping applied to the cellular automaton configuration.
Just as dimensions may be assigned to the set of spatial configurations generated in cellular automaton evolution, so also one may assign dimensions to the set of temporal sequences generated by the evolution. The temporal set dimension may be defined in analogy with eq. (4.15) by

and the temporal measure dimension may be defined by

If the evolution of a cellular automaton is periodic, so that each site takes on a fixed cycle of values, then

As discussed in section 6 below, class 2 cellular automata yield periodic structures at large times, so that the correspondingly temporal entropies vanish.
As a generalization of the spatial and temporal entropies introduced above, one may consider entropies associated with space-time ``patches'' in the patterns generated by cellular automaton evolution, as illustrated in fig. 7. With probabilities
for the
possible patches of spatial width
and temporal extent
, one may define a set entropy

and a measure entropy

Clearly


If no relation existed between configurations at successive time steps then the entropies (4.25) and (4.26) would be bounded simply by

The cellular automaton rules introduce definite relations between successive configurations and tighten this bound. In fact, the values of all sites in a
space-time patch are determined according to the cellular automaton rules by the values in the ``rind'' of the patch, as indicated in fig. 7. The rind contains only
sites (where
is the ``range'' of the cellular automaton rule, defined in section 2), so that

For large
(and fixed
), therefore

If both
and
tend to infinity with
fixed, eq. (4.30) implies that the ``information per site''
in a
patch must tend to zero. The evolution of cellular automata can therefore never generate random space-time patterns.
With
,
fixed, the length
horizontal section of the rind makes a negligible contribution to the entropies. The entropy is maximal if the
vertical columns in the rind are statistically independent, so that

In addition,

where the bounds are saturated for large
if the time series associated with different sets of sites are statistically uncorrelated.
The limiting set entropy

for temporally-extended patches is a fundamental quantity equivalent to the set (or topological) entropy of the cellular automaton mapping in symbolic dynamics.
may be considered as a dimension for the mapping. It specifies the asymptotic rate at which the number of possible histories for the cellular automaton increases with time. The limiting measure entropy

gives the average amount of ``new information'' contained in each cellular automaton configuration, and not already determined from previous configurations. Equations (4.31) and (4.32) show that

In addition,

The basic cellular automaton time evolution rule (2.1) implies that the value
of a site
at a particular time step depends on sites a maximum distance
away on the previous time step according to the function
. After
time steps, the values of the site could depend on sites at distances up to
, so that features in patterns generated by cellular automaton evolution could propagate at ``speeds'' up to
sites per time step. For many rules, however, the value of a site after many time steps depends on fewer initial site values, and features may propagate only at lower speeds. In general, let
denote the minimum
for which the value of site
depends only on the initial values of sites
. Then the maximum propagation speed associated with the cellular automaton rule
may be defined as

(The rule is assumed symmetric; for nonsymmetric rules, distinct left and right propagation speeds may be defined.) Clearly,

When
, finite regions of the cellular automaton must ultimately become isolated, so that

The construction of fig. 8 shows that for any
,

In the limit
, the construction implies

The ratio of temporal to spatial entropy is thus bounded by the maximum propagation speed in the cellular automaton. The relation is consistent with the assignment of units to the spatial and temporal entropies mentioned above.
The corresponding inequalities for mapping entropies are:


The quantity
defined by eq. (4.37) gives the maximum speed with which any feature in a cellular automaton may propagate. With many cellular automaton rules, however, almost all ``features'' propagate much more slowly. To define an appropriate maximum average propagation speed, consider the effect after many time steps of changes in the initial state. Let
denote the probability that the value of a site at position
is changed when the value of a site at position
is changed
time steps before. The form of
for various cellular automaton rules is suggested by fig. 3.
may be considered as a Green function for the cellular automaton evolution. For large
,
typically vanishes outside a ``cone'' defined by
.
may then be considered as a maximum average propagation speed. In analogy with eqs. (4.41) and (4.42), one expects

Mapping and temporal entropies thus vanish for cellular automata with zero maximum average propagation speed. Cellular automata in class 2 have this property.

The maximum average propagation speed
specifies a cone outside which
almost always vanishes. One may also define a minimum average propagation speed
, such that
for almost any
.
The Green function
gives the probability that a particular site is affected by changes in a previous configuration. The total effect of changes may be measured by the ``Hamming distance''
between configurations before and after the changes, defined as the total number of site values which differ between the configurations after
time steps. (
is analogous to Lyapunov exponents for continuous dynamical systems.) Changing the values of initial sites in a small region,
may be given as a space integral of the Green function, and for large
obeys the inequality

to be compared with the result (4.43) obtained above.
The definitions and properties of dimension given above suggest that the behaviour of these quantities determines the degree of ``chaotic'' behaviour associated with cellular automaton evolution. ``Spatial chaos'' occurs when
, and ``temporal chaos'' when
. Temporal chaos requires a nonzero maximum average propagation speed for features in cellular automaton patterns, and implies that small changes in initial conditions lead to effects ever-increasing with time.