**Christian Apologetics Related to Science**

Intelligent Design as a Theory of Information

William A. Dembski*

Center for the Renewal of Science and Culture

Discovery Institute

1201 Third Ave., Suite 3950

Seattle, WA 98101

*PSCF* 49 (September 1997): 180.

For the scientific community, intelligent design represents creationism's latest grasp at scientific legitimacy. Accordingly, intelligent design is viewed as yet another ill-conceived attempt by creationists to straightjacket science within a religious ideology. But, in fact, intelligent design can be formulated as a scientific theory having empirical consequences and devoid of religious commitments. Intelligent design can be unpacked as a theory of information. Within such a theory, information becomes a reliable indicator of design as well as a proper object for scientific investigation. In my paper, I shall (1) show how information can be reliably detected and measured, and (2) formulate a conservation law that governs the origin and flow of information. My broad conclusion is that information is not reducible to natural causes, and that the origin of information is best sought in intelligent causes. Intelligent design, thereby, becomes a theory for detecting and measuring information, explaining its origin, and tracing its flow.

Information

In *Steps Towards Life,* Manfred Eigen identifies
what he regards as the central problem facing origins-of-life research: "Our task is
to find an algorithm, a natural law that leads to the origin of information."^{1} Eigen is only half right. To determine how life began, it is
indeed necessary to understand the origin of information. Even so, neither algorithms nor
natural laws can produce information. The great myth of modern evolutionary biology is
that information can be gotten on the cheap without recourse to intelligence. It is this
myth I seek to dispel, but to do so I shall need to give an account of information. No one
disputes that there is such a thing as information. As Keith Devlin remarks:

Our very lives depend upon it, upon its gathering, storage, manipulation, transmission, security, and so on. Huge amounts of money change hands in exchange for information. People talk about it all the time. Lives are lost in its pursuit. Vast commercial empires are created in order to manufacture equipment to handle it.^{2}

But what exactly is information? The burden of this paper is to answer this question, presenting an account of information that is relevant to biology.

The fundamental intuition underlying information is not, as is sometimes thought, the transmission of signals across a communication channel, but rather the actualization of one possibility to the exclusion of others. As Fred Dretske puts it:

Information theory identifies the amount of information associated with, or generated by, the occurrence of an event (or the realization of a state of affairs) with the reduction in uncertainty, the elimination of possibilities, represented by that event or state of affairs.^{3}

To be sure, whenever signals are transmitted across a
communication channel, one possibility is actualized to the exclusion of others, namely,
the signal that was transmitted to the exclusion of those that weren't. But this is only a
special case. Information in the first instance presupposes not some medium of
communication, but contingency. Robert Stalnaker makes this point clearly: "Content
requires contingency. To learn something, to acquire information, is to rule out
possibilities. To understand the information conveyed in a communication is to know what
possibilities would be excluded by its truth."^{4} For there
to be information, there must be a multiplicity of distinct possibilities, any one of
which might happen. When one of these possibilities does happen and the others are ruled
out, information becomes actualized. Indeed, information in its most general sense can be
defined as the actualization of one possibility to the exclusion of others (observe that
this definition encompasses both syntactic and semantic information).

This way of defining information may seem counterintuitive since we often speak of the information inherent in possibilities that are never actualized. Thus we may speak of the information inherent in flipping one-hundred heads in a row with a fair coin even if this event never happens. There is no difficulty here. In counterfactual situations, the definition of information needs to be applied counterfactually. Thus to consider the information inherent in flipping one-hundred heads in a row with a fair coin, we treat this event/possibility as though it were actualized. Information needs to be referenced not just to the actual world, but also cross-referenced with all possible worlds.

Complex Information

How does our definition of information apply to biology,
and to science more generally? To render information a useful concept for science we need
to do two things: first, show how to measure information; second, introduce a crucial
distinction: the distinction between *specified* and *unspecified* information.
First, let us show how to measure information. In measuring information, it is not enough
to count the number of possibilities excluded, and offer this number as the relevant
measure of information. The problem is that a simple enumeration of excluded possibilities
tells us nothing about how those possibilities were individuated in the first place.
Consider, for instance, the following individuation of poker hands:

#1 A royal flush. #2 Everything else.

To learn that something other than a royal flush was dealt (i.e., possibility #2) is clearly to acquire less information than to learn that a royal flush was dealt (i.e., possibility #1). Yet if our measure of information is simply an enumeration of excluded possibilities, the same numerical value must be assigned in both instances since in both instances a single possibility is excluded.

It follows, therefore, that how we measure information needs to be independent of whatever procedure we use to individuate the possibilities under consideration. The way to do this is not simply to count possibilities, but to assign probabilities to these possibilities. For a thoroughly shuffled deck of cards, the probability of being dealt a royal flush (i.e., possibility #1) is approximately .000002 whereas the probability of being dealt anything other than a royal flush (i.e., possibility #2) is approximately .999998. Probabilities by themselves, however, are not information measures. Although probabilities properly distinguish possibilities according to the information they contain, nonetheless probabilities remain an inconvenient way of measuring information. There are two reasons for this. First, the scaling and directionality of the numbers assigned by probabilities need to be recalibrated. We are clearly acquiring more information when we learn someone was dealt a royal flush than when we learn someone wasn't dealt a royal flush. And yet the probability of being dealt a royal flush (i.e., .000002) is minuscule compared to the probability of being dealt something other than a royal flush (i.e., .999998). Smaller probabilities signify more information, not less.
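The figure of roughly .000002 can be checked by direct counting. The following is a minimal sketch, assuming five-card hands dealt from a standard 52-card deck, so that exactly four hands (one per suit) are royal flushes. The variable names are illustrative and not part of the original argument.

```python
from math import comb

# Assumption: five-card hands dealt from a thoroughly shuffled 52-card deck.
total_hands = comb(52, 5)   # number of distinct five-card hands
royal_flushes = 4           # one royal flush per suit

p_royal = royal_flushes / total_hands   # possibility #1
p_other = 1 - p_royal                   # possibility #2

print(total_hands)          # 2598960
print(f"{p_royal:.2e}")     # on the order of 2 in a million
print(f"{p_other:.6f}")
```

Counting possibilities alone would treat #1 and #2 alike; the probabilities are what separate them.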

The second reason probabilities are inconvenient for measuring information is that they are multiplicative rather than additive. If I learn that Alice was dealt a royal flush playing poker at Caesar's Palace and that Bob was dealt a royal flush playing poker at the Mirage, the probability that both Alice and Bob were dealt royal flushes is the product of the individual probabilities. Nonetheless, it is convenient for information to be measured additively so that the measure of information assigned to Alice and Bob jointly being dealt royal flushes equals the measure of information assigned to Alice being dealt a royal flush plus the measure of information assigned to Bob being dealt a royal flush.

An obvious way to transform probabilities that
circumvents both these difficulties is to apply a negative logarithm to the probabilities.
Applying a negative logarithm assigns the more information to the less probability and
transforms multiplicative probability measures into additive information measures, because
the logarithm of a product is the sum of the logarithms. What's more, in deference to
communication theorists, it is customary to use the logarithm to the base 2. The rationale
for this choice of logarithmic base is as follows. The most convenient way for
communication theorists to measure information is in bits. Any message sent across a
communication channel can be viewed as a string of 0's and 1's. For instance, the ASCII
code uses strings of eight 0's and 1's to represent the characters on a typewriter, with
whole words and sentences in turn represented as strings of such character strings. In
like manner, all communication may be reduced to the transmission of sequences of 0's and
1's. Given this reduction, the obvious way for communication theorists to measure
information is in the number of bits transmitted across a communication channel. Since the
negative logarithm to the base 2 of a probability corresponds to the average number of
bits needed to identify an event of that probability, the logarithm to the base 2 is the
canonical logarithm for communication theorists. Thus, we define the measure of
information in an event of probability *p* as −log_{2}*p*.^{5}

What about the additivity of this information measure?
Recall the example of Alice being dealt a royal flush playing poker at Caesar's Palace and
Bob being dealt a royal flush playing poker at the Mirage. Let's call the first event A
and the second B. Since randomly dealt poker hands are probabilistically independent, the
probability of A and B taken jointly equals the product of the probabilities of A and B
taken individually. Symbolically, P(A&B) = P(A) × P(B). Given our logarithmic
definition of information, we thus define the amount of information in an event E as I(E)
=_{def} −log_{2}P(E). It then follows that P(A&B) = P(A) × P(B)
if and only if I(A&B) = I(A) + I(B). Since in the example of Alice and Bob P(A) = P(B)
= .000002, I(A) = I(B) ≈ 19, and I(A&B) = I(A) + I(B) ≈ 19 + 19 = 38.
Thus the amount of information inherent in Alice and Bob jointly obtaining royal flushes
is 38 bits.
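The arithmetic can be reproduced in a few lines. This is only a sketch: the function name `information` is an illustrative choice, and the probability used is the rounded value .000002 quoted in the text rather than an exact count.

```python
from math import isclose, log2

def information(p: float) -> float:
    """Bits of information in an event of probability p, i.e. -log2(p)."""
    if not 0 < p <= 1:
        raise ValueError("probability must lie in (0, 1]")
    return -log2(p)

p_a = 0.000002   # Alice's royal flush (rounded probability from the text)
p_b = 0.000002   # Bob's royal flush

i_a = information(p_a)
i_b = information(p_b)
i_ab = information(p_a * p_b)   # independence: P(A&B) = P(A) x P(B)

print(round(i_a), round(i_b), round(i_ab))   # 19 19 38
print(isclose(i_ab, i_a + i_b))              # True: the measure is additive here
```

The unrounded values are slightly under 19 and 38 bits; the text's figures are these values rounded to the nearest bit.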

Since many events are probabilistically independent, information measures exhibit much additivity. But since many events are also correlated, information measures exhibit much nonadditivity as well. In the example of Alice and Bob, Alice being dealt a royal flush is probabilistically independent of Bob being dealt a royal flush, and so the amount of information in Alice and Bob both being dealt royal flushes equals the sum of the individual amounts of information.

Now let's consider a different example. Alice and Bob together toss a coin five times. Alice observes the first four tosses but is distracted, and so misses the fifth toss. On the other hand, Bob misses the first toss, but observes the last four tosses. Let's say the actual sequence of tosses is 11001 (1 = heads, 0 = tails). Thus Alice observes 1100* and Bob observes *1001. Let A denote the first observation, B the second. It follows that the amount of information in A&B is the amount of information in the completed sequence 11001, namely, 5 bits. On the other hand, the amount of information in A alone is the amount of information in the incomplete sequence 1100*, namely 4 bits. Similarly, the amount of information in B alone is the amount of information in the incomplete sequence *1001, also 4 bits. This time the information doesn't add up: 5 = I(A&B) ≠ I(A) + I(B) = 4 + 4 = 8.

Here A and B are correlated. Alice knows all but the last bit of information in the completed sequence 11001. Thus when Bob gives her the incomplete sequence *1001, all Alice really learns is the last bit in this sequence. Similarly, Bob knows all but the first bit of information in the completed sequence 11001. Thus when Alice gives him the incomplete sequence 1100*, all Bob really learns is the first bit in this sequence. What appears to be four bits of information actually ends up being only one bit of information once Alice and Bob factor in the prior information they possess about the completed sequence 11001. If we introduce the idea of conditional information, this is just to say that 5 = I(A&B) = I(A) + I(B|A) = 4 + 1. I(B|A), the conditional information of B given A, is the amount of information in Bob's observation once Alice's observation is taken into account. This, as we just saw, is 1 bit.
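The coin example can be verified by brute-force enumeration. The sketch below assumes a fair coin with independent tosses, so that all 32 five-toss sequences are equally likely. The helper names (`prob`, `info`, `alice_sees`, `bob_sees`) are illustrative only.

```python
from itertools import product
from math import log2

# Assumption: five independent tosses of a fair coin, all 32 sequences equally likely.
outcomes = ["".join(bits) for bits in product("01", repeat=5)]

def prob(event) -> float:
    """Probability that a uniformly random sequence satisfies the predicate."""
    return sum(1 for s in outcomes if event(s)) / len(outcomes)

def info(p: float) -> float:
    """Information in bits of an event of probability p."""
    return -log2(p)

def alice_sees(s: str) -> bool:
    return s[:4] == "1100"    # event A: Alice observes 1100*

def bob_sees(s: str) -> bool:
    return s[1:] == "1001"    # event B: Bob observes *1001

def both(s: str) -> bool:
    return alice_sees(s) and bob_sees(s)

i_a = info(prob(alice_sees))
i_b = info(prob(bob_sees))
i_ab = info(prob(both))
i_b_given_a = info(prob(both) / prob(alice_sees))   # uses P(B|A) = P(A&B)/P(A)

print(i_a, i_b, i_ab, i_b_given_a)   # 4.0 4.0 5.0 1.0
```

The joint information (5 bits) falls short of the naive sum (8 bits) and equals I(A) plus the conditional information I(B|A).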

I(B|A), like I(A&B), I(A), and I(B), can be
represented as the negative logarithm to the base two of a probability, only this time the
probability under the logarithm is a conditional, as opposed to an unconditional,
probability. By definition, I(B|A) =_{def} −log_{2}P(B|A),
where P(B|A) is the conditional probability of B given A. But since
P(B|A) =_{def} P(A&B)/P(A) and since the logarithm of
a quotient is the difference of the logarithms, log_{2}P(B|A) = log_{2}P(A&B) − log_{2}P(A),
and so −log_{2}P(B|A) = −log_{2}P(A&B) + log_{2}P(A),
which is just I(B|A) = I(A&B) − I(A). This last
equation is equivalent to

I(A&B) = I(A) + I(B|A)    (*)

Formula (*) holds with full generality, reducing to I(A&B) = I(A) + I(B) when A and B are probabilistically independent, in which case P(B|A) = P(B) and thus I(B|A) = I(B).

Formula (*) asserts that the information in both A and B jointly is the information in A plus the information in B that is not in A. Its point, therefore, is to spell out how much additional information B contributes to A. As such, this formula places tight constraints on the generation of new information. Does, for instance, a computer program, call the program A, by outputting some data, call the data B, generate new information? Computer programs are fully deterministic, and so B is fully determined by A. It follows that P(B|A) = 1, and thus I(B|A) = 0 (the logarithm of 1 is always 0). From Formula (*) it therefore follows that I(A&B) = I(A), and that the amount of information in A and B jointly is no more than the amount of information in A by itself.

For an example in the same spirit, consider that there
is no more information in two copies of Shakespeare's *Hamlet* than in a single copy.
This is patently obvious, and any formal account of information had better agree. To see
that our formal account does indeed agree, let A denote the printing of the first copy of *Hamlet*,
and B the printing of the second copy. Once A is given, B is entirely determined. Indeed,
the correlation between A and B is perfect. Probabilistically this is expressed by saying
the conditional probability of B given A is 1, namely, P(B|A) = 1. In
information-theoretic terms this is to say that I(B|A) = 0. As a result
I(B|A) drops out of Formula (*), and so I(A&B) = I(A). Our
information-theoretic formalism, therefore, agrees with our intuition that two copies of *Hamlet*
contain no more information than a single copy.
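The same bookkeeping can be run on a toy model. The sketch below is hypothetical throughout: it assumes the first printing is one of eight equally likely texts, and that the second printing comes from a deterministic `copier` function standing in for the press. None of these names or numbers come from the text.

```python
from math import log2

# Hypothetical toy model: the first printing is one of eight equally likely texts.
texts = [f"text-{i}" for i in range(8)]

def copier(text: str) -> str:
    """Deterministic process: the output is fully fixed by the input."""
    return text

def info(p: float) -> float:
    # log2(1/p) equals -log2(p); written this way so that p = 1 gives 0.0, not -0.0
    return log2(1 / p)

first = "text-3"           # event A: this text is printed first
second = copier(first)     # event B: the second copy that results

p_a = sum(1 for t in texts if t == first) / len(texts)
p_ab = sum(1 for t in texts if t == first and copier(t) == second) / len(texts)
p_b_given_a = p_ab / p_a   # equals 1 because B is determined by A

print(info(p_a), info(p_b_given_a), info(p_ab))   # 3.0 0.0 3.0
```

Under these assumptions I(B|A) is zero, so I(A&B) equals I(A), matching the claim that a second copy adds nothing.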

Information is a complexity-theoretic notion. As a
purely formal object, the information measure described here is a complexity measure.^{6} Complexity measures arise whenever we assign numbers to
degrees of complication. A set of possibilities will often admit varying degrees of
complication, ranging from extremely simple to extremely complicated. Complexity measures
assign non-negative numbers to these possibilities so that 0 corresponds to the most
simple and ∞ to the most complicated. For instance,
computational complexity is always measured in terms of either time (i.e., number of
computational steps) or space (i.e., size of memory, usually measured in bits or bytes) or
some combination of the two. The more difficult a computational problem, the more time and
space are required to run the algorithm that solves the problem. For information measures,
the degree of complication is measured in bits. Given an event A of probability P(A),
I(A) = −log_{2}P(A) measures the number of bits associated with
the probability P(A). We therefore speak of the "complexity of information" and
say that the complexity of information increases as I(A) increases (or, correspondingly,
as P(A) decreases). We also speak of "simple" and "complex"
information according to whether I(A) signifies few or many bits of information. This
notion of complexity is important to biology since not just the origin of information
stands in question, but also the origin of complex information.

**Complex Specified Information**

Given a means of measuring information and determining
its complexity, we turn now to the distinction between *specified* and *unspecified*
information. This is a vast topic whose full elucidation is beyond the scope of this paper.^{7} Nonetheless, in what follows I shall try to make this
distinction intelligible, and offer some hints on how to make it rigorous. For an
intuitive grasp of the difference between specified and unspecified information, consider
the following example. Suppose an archer stands 50 meters from a large blank wall with bow
and arrow in hand. The wall, let us say, is sufficiently large that the archer cannot help
but hit it. Consider now two alternative scenarios. In the first scenario, the archer
simply shoots at the wall. In the second scenario, the archer first paints a target on the
wall, and then shoots at the wall, squarely hitting the target's bull's-eye. Let us
suppose that in both scenarios the arrow lands in the same spot. In both scenarios, the
arrow might have landed anywhere on the wall. What's more, any place where it might land
is highly improbable. It follows that in both scenarios highly complex information is
actualized. Yet the conclusions we draw from these scenarios are very different. In the
first scenario, we can conclude absolutely nothing about the archer's ability as an
archer, whereas in the second scenario, we have evidence of the archer's skill.

The obvious difference between the two scenarios is that
in the first, the information follows no pattern, whereas in the second, it does. Now the
information that tends to interest us as rational inquirers generally, and scientists in
particular, is not the actualization of arbitrary possibilities which correspond to no
patterns, but the actualization of circumscribed possibilities which *do* correspond
to patterns. There's more. Patterned information, though a step in the right direction,
still doesn't quite get us specified information. The problem is that patterns can be
concocted after the fact so that instead of helping explain information, the patterns are
merely read off already actualized information.

To see this, consider a third scenario in which an
archer shoots at a wall. As before, we suppose the archer stands 50 meters from a large
blank wall with bow and arrow in hand, the wall being so large that the archer cannot help
but hit it. As in the first scenario, the archer shoots at the wall while it is still
blank. This time suppose that after having shot the arrow, and finding the arrow stuck in
the wall, the archer paints a target around the arrow so that the arrow sticks squarely in
the bull's-eye. Let us suppose further that the precise place where the arrow lands in
this scenario is identical with where it landed in the first two scenarios. Since any
place where the arrow might land is highly improbable, highly complex information has been
actualized as in the other scenarios. What's more, since the information corresponds to a
pattern, we can even say that in this third scenario highly complex patterned information
has been actualized. Nevertheless, it would be wrong to say that highly complex specified
information has been actualized. Of the three scenarios, only the information in the
second scenario is specified. In that scenario, by *first* painting the target and *then*
shooting the arrow, the pattern is given independently of the information. On the other
hand, in the third scenario, by first shooting the arrow and then painting the target
around it, the pattern is merely read off the information.

Specified information is always patterned information,
but patterned information is not always specified information. For specified information,
not just any pattern will do. Therefore we must distinguish between the "good"
patterns and the "bad" patterns. We will call the "good" patterns *specifications*.
Specifications are the independently given patterns that are not simply read off
information. By contrast, we will call the "bad" patterns *fabrications*.
Fabrications are the *post hoc* patterns that are simply read off already existing
information.

Unlike specifications, fabrications are wholly unenlightening. We are no better off with a fabrication than without one. This is clear from comparing the first and third scenarios. Whether an arrow lands on a blank wall and the wall stays blank (as in the first scenario), or an arrow lands on a blank wall and a target is then painted around the arrow (as in the third scenario), any conclusions we draw about the arrow's flight remain the same. In either case, chance is as good an explanation as any for the arrow's flight. The fact that the target in the third scenario constitutes a pattern makes no difference since the pattern is constructed entirely in response to where the arrow lands. Only when the pattern is given independently of the arrow's flight does a hypothesis other than chance come into play. Thus only in the second scenario does it make sense to ask whether we are dealing with a skilled archer. Only in the second scenario does the pattern constitute a specification. In the third scenario, the pattern constitutes a mere fabrication.

The distinction between specified and unspecified information may now be defined as follows: the actualization of a possibility (i.e., information) is specified if the possibility's actualization is independently identifiable by means of a pattern. If not, then the information is unspecified. Note that this definition implies an asymmetry between specified and unspecified information: specified information cannot become unspecified information, though unspecified information can become specified information. Unspecified information can become specified as our background knowledge increases. For example, a cryptographic transmission, whose cryptosystem we have yet to break, will constitute unspecified information. However, as soon as we break the cryptosystem, the cryptographic transmission becomes specified information.

What is it for a possibility to be identifiable by means
of an independently given pattern? A full exposition of specification requires a
detailed answer to this question. Unfortunately, such an exposition is beyond the scope of
this paper. The key conceptual difficulty here is to characterize the independence
condition between patterns and information. This independence condition breaks into two
subsidiary conditions: (1) a condition of stochastic conditional independence between the
information in question and particular relevant background knowledge; and (2) a
tractability condition by which the pattern in question can be constructed from the
aforementioned background knowledge. Though these conditions make good intuitive sense,
they are not easily formalized.^{8}

If formalizing what it means for a pattern to be given
independently of a possibility is difficult, determining in practice whether a pattern is
given independently of a possibility is much easier. If the pattern is given prior to the
possibility being actualized (as in the second scenario above, where the target was painted
before the arrow was shot), then the pattern is automatically independent of the
possibility, and we are dealing with specified information. Patterns given prior to the
actualization of a possibility are just the rejection regions of statistics. There is a
well-established statistical theory that describes such patterns and their use in
probabilistic reasoning. These are clearly specifications since having been given prior to
the actualization of some possibility, they have already been identified, and thus are
identifiable independently of the possibility being actualized.^{9}
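To make the link to rejection regions concrete, here is a minimal sketch. It assumes an experiment of 100 tosses of a fair coin and a hypothetical pattern, "at least 90 heads," fixed before any toss is made. Neither the experiment nor the threshold comes from the text; they are chosen only for illustration.

```python
from math import comb, log2

N_TOSSES = 100    # assumed experiment: 100 tosses of a fair coin
THRESHOLD = 90    # hypothetical pattern fixed beforehand: at least 90 heads

def in_rejection_region(heads: int) -> bool:
    """The pattern is defined without reference to any observed outcome."""
    return heads >= THRESHOLD

# Probability, under the chance hypothesis, that the outcome matches the pattern.
matching = sum(comb(N_TOSSES, k) for k in range(N_TOSSES + 1)
               if in_rejection_region(k))
p_region = matching / 2 ** N_TOSSES

bits = -log2(p_region)   # information measure of the pattern being matched
print(p_region, bits)
```

Because `in_rejection_region` is written before any data are seen, an outcome that lands in it matches an independently given pattern, and the small value of `p_region` makes that match complex in the sense defined earlier.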

Many interesting cases of specified information,
however, are those in which the pattern is given *after* a possibility has been
actualized. This is the case with the origin of life: life originates first and only
afterwards do pattern-forming, rational agents (like ourselves) enter the scene. It
remains the case, however, that a pattern corresponding to a possibility, though
formulated after the possibility has been actualized, can constitute a specification.
Certainly this was not so in the third scenario above, where the target was painted around
the arrow only after it hit the wall. But consider the following example. Alice and Bob
are celebrating their fiftieth wedding anniversary. Their six children all show up bearing
gifts. Each gift is part of a matching set of china. There is no duplication of gifts, and
together the gifts constitute a complete set of china. Suppose Alice and Bob were
satisfied with their old set of china, and had no inkling prior to opening their gifts
that they might expect a new set of china. Alice and Bob are therefore without a relevant
pattern to which to refer their gifts prior to actually receiving the gifts from their
children. Nevertheless, the pattern they explicitly formulate only after receiving the
gifts could be formed independently of receiving the gifts. We all know about matching
sets of china and how to distinguish them from unmatched sets. This pattern therefore
constitutes a specification. What's more, there is an obvious inference connected with
this specification: Alice and Bob's children were in collusion, and did not present their
gifts as random acts of kindness.

But what about the origin of life? Is life specified? If
so, to what patterns does life correspond, and how are these patterns given independently
of life's origin? Obviously, pattern-forming rational agents like ourselves don't enter
the scene till after life originates. Nonetheless, there are functional patterns to which
life corresponds, and which are given independently of the actual living systems. An
organism is a functional system comprising many functional subsystems. The functionality
of organisms can be cashed out in any number of ways. Arno Wouters cashes it out globally
in terms of viability of whole organisms.^{10} Michael Behe
cashes it out in terms of the irreducible complexity and minimal function of biochemical
systems.^{11} Even the staunch Darwinist Richard Dawkins admits
that life is specified functionally, cashing out the functionality of organisms in terms
of reproduction of genes. Thus he writes: "Complicated things have some quality,
specifiable in advance, that is highly unlikely to have been acquired by random chance
alone. In the case of living things, the quality that is specified in advance
is ... the ability to propagate genes in reproduction."^{12}

Information can be specified, complex, or both complex
and specified. Information that is both complex and specified I call "complex
specified information," or CSI for short. CSI is what all the fuss over information
has been about in recent years, not just in biology, but in science generally. It is CSI
that for Manfred Eigen constitutes the great mystery of biology, and one he hopes
eventually to unravel in terms of algorithms and natural laws. It is CSI that for
cosmologists underlies the fine-tuning of the universe, and which the various anthropic
principles attempt to understand.^{13} It is CSI that David
Bohm's quantum potentials are extracting when they scour the microworld for what Bohm
calls "active information."^{14} It is CSI that
enables Maxwell's demon to outsmart a thermodynamic system tending toward thermal
equilibrium.^{15} It is CSI on which David Chalmers hopes to
base a comprehensive theory of human consciousness.^{16} It is
CSI that within the Kolmogorov-Chaitin theory of algorithmic information takes the form of
highly compressible, nonrandom strings of digits.^{17}

CSI is not just confined to science. It is indispensable
in our everyday lives. The 16-digit number on your VISA card is an example of CSI. The
complexity of this number ensures that a would-be thief cannot randomly pick a number and
have it turn out to be a valid VISA card number. What's more, the specification of this
number ensures that it is your number, and not anyone else's. Even your telephone number
constitutes CSI. As with the VISA card number, the complexity ensures that this number
won't be dialed randomly (at least not too often), and the specification ensures that this
number is yours and yours only. All the numbers on our bills, credit slips, and purchase
orders represent CSI. CSI makes the world go round. It follows that CSI is a rife field
for criminality. CSI is what motivated the greedy Michael Douglas character in the movie *Wall
Street* to lie, cheat, and steal. CSI's total and absolute control was the objective of
the monomaniacal Ben Kingsley character in the movie *Sneakers*. CSI is the artifact
of interest in most techno-thrillers. Ours is an information age, and the information that
captivates us is CSI.

**Intelligent Design**

Where does complex specified
information come from? In this section, I shall argue that intelligent causation, or
equivalently design, accounts for the origin of complex specified information. My argument
focuses on the nature of intelligent causation, and specifically, on what it is about
intelligent causes that makes them detectable. To see why CSI is a reliable indicator of
design, we need to examine the nature of intelligent causation. The principal
characteristic of intelligent causation is *directed contingency*, or what we call *choice*.
Whenever an intelligent cause acts, it chooses from a range of competing possibilities.
This is true not just of humans, but also of animals and extraterrestrial intelligences. A
rat navigating a maze must choose whether to go right or left at various points in the
maze. When SETI (Search for Extra-Terrestrial Intelligence) researchers try to discover
intelligence in the extraterrestrial radio transmissions they are monitoring, they first
assume that an extraterrestrial intelligence could have chosen any number of possible
radio transmissions. Then they try to match the transmissions they observe with certain
patterns as opposed to others (patterns that presumably are markers of intelligence).
Whenever a human being utters meaningful speech, a choice is made from a range of possible
sound-combinations that might have been uttered. Intelligent causation always entails
discrimination, choosing certain things, ruling out others.

Given this characterization of intelligent causes, the
crucial question is how to recognize their operation. Intelligent causes act by making a
choice. How then do we recognize that an intelligent cause has made a choice? A bottle of
ink spills accidentally onto a sheet of paper; someone takes a fountain pen and writes a
message on a sheet of paper. In both instances, ink is applied to paper. In both
instances, one among an almost infinite set of possibilities is realized. In both
instances, a contingency is actualized and others are ruled out. Yet in one instance we
infer design, in the other chance. What is the relevant difference? Not only do we need to
observe that a contingency was actualized, but we ourselves need also to be able to
specify that contingency. The contingency must conform to an independently given pattern,
and we must be able to independently formulate that pattern. A random ink blot is
unspecifiable; a message written with ink on paper is specifiable. Wittgenstein made the
same point: "We tend to take the speech of a Chinese for inarticulate gurgling.
Someone who understands Chinese will recognize *language* in what he hears. Similarly
I often cannot discern the *humanity* in man."^{18}

In hearing a Chinese utterance, someone who understands Chinese not only recognizes that one from a range of all possible utterances was actualized, but is also able to specify the utterance as coherent Chinese speech. Contrast this with someone who does not understand Chinese. In hearing a Chinese utterance, such a person also recognizes that one from a range of possible utterances was actualized, but, lacking the ability to understand Chinese, is unable to specify the utterance as coherent speech. To someone who does not understand Chinese, the utterance will appear to be gibberish. Gibberish (the utterance of nonsense syllables uninterpretable within any natural language) always actualizes one utterance from the range of possible utterances. Nevertheless, gibberish, by corresponding to nothing we can understand in any language, cannot be specified. As a result, gibberish is never taken for intelligent communication, but always for what Wittgenstein calls "inarticulate gurgling."

The actualization of one among several competing possibilities, the exclusion of the rest, and the specification of the possibility actualized encapsulate how we recognize intelligent causes, or equivalently, how we detect design. The Actualization-Exclusion-Specification triad constitutes a general criterion for detecting intelligence, be it animal, human, or extraterrestrial. Actualization establishes that the possibility in question is the one that actually occurred. Exclusion establishes that there was genuine contingency (i.e., that there were other live possibilities, and that these were ruled out). Specification establishes that the actualized possibility conforms to a pattern given independently of its actualization.

Now where does choice, which we've cited as the principal characteristic of intelligent causation, figure into this criterion? The problem is that we never witness choice directly. Instead, we witness actualizations of contingency which might be the result of choice (i.e., directed contingency) or the result of chance (i.e., blind contingency). Specification is the only means available to us for distinguishing choice from chance, directed contingency from blind contingency. Actualization and exclusion together guarantee that we are dealing with contingency. Specification guarantees that we are dealing with a directed contingency. The Actualization-Exclusion-Specification triad is therefore precisely what we need to identify choice and with it intelligent causation.

Psychologists who study animal learning and behavior
have known of the Actualization-Exclusion-Specification triad all along, even if
implicitly. For these psychologists, known as learning theorists, learning is
discrimination.^{19} To learn a task, an animal must acquire the
ability to actualize behaviors suitable for the task as well as the ability to exclude
behaviors unsuitable for the task. Moreover, for a psychologist to recognize that an
animal has learned a task, it is necessary not only to observe the animal making the
appropriate behavior, but also to specify this behavior. Thus to recognize whether a rat
has successfully learned how to traverse a maze, a psychologist must first specify the
sequence of right and left turns that conducts the rat out of the maze. No doubt, a rat
randomly wandering a maze also discriminates a sequence of right and left turns. But by
randomly wandering the maze, the rat gives no indication that it can discriminate the
appropriate sequence of right and left turns for exiting the maze. Consequently, the
psychologist studying the rat will have no reason to think the rat has learned how to
traverse the maze. Only if the rat executes the sequence of right and left turns specified
by the psychologist will the psychologist recognize that the rat has learned how to
traverse the maze. We regard these learned behaviors as intelligent causes in animals.
Thus, it is no surprise that the same scheme for recognizing animal learning recurs for
recognizing intelligent causes generally, to wit, actualization, exclusion, and
specification.

This general scheme for recognizing intelligent causes coincides precisely with how we recognize complex specified information. First, the basic precondition for information to exist must hold, namely, contingency. Thus one must establish that any one of a multiplicity of distinct possibilities might obtain. Next, one must establish that the possibility which was actualized after the others were excluded was also specified. So far the match between this general scheme for recognizing intelligent causation and how we recognize complex specified information is exact. Only one loose end remains: complexity. Although complexity is essential to CSI (corresponding to the first letter of the acronym), its role in this general scheme for recognizing intelligent causation is not immediately evident. In this scheme, one among several competing possibilities is actualized, the rest are excluded, and the possibility which was actualized is specified. Where does complexity figure in this scheme?

The answer is that it is there implicitly. To see this, consider again a rat traversing a maze, but now take a very simple maze in which two right turns conduct the rat out of the maze. How will a psychologist studying the rat determine whether it has learned to exit the maze? Just putting the rat in the maze will not be enough. Because the maze is so simple, the rat could by chance just happen to take two right turns, and thereby exit the maze. The psychologist will therefore be uncertain whether the rat actually learned to exit this maze, or whether the rat just got lucky. But contrast this now with a complicated maze in which a rat must take just the right sequence of left and right turns to exit the maze. Suppose the rat must take one hundred appropriate right and left turns, and that any mistake will prevent the rat from exiting the maze. A psychologist who sees the rat take no erroneous turns and in short order exit the maze will be convinced that the rat has indeed learned how to exit the maze, and that this was not dumb luck. With the simple maze, there is a substantial probability that the rat will exit the maze by chance; with the complicated maze, this is exceedingly improbable. The role of complexity in detecting design is now clear, since improbability is precisely what we mean by complexity (cf. section "Complex Information").
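The contrast between the two mazes can be made numerical. The following is a minimal sketch, assuming that a randomly wandering rat chooses left or right with equal probability at each turn, that the turns are independent, and that information is measured in bits as the negative base-2 logarithm of probability (as in the section "Complex Information"). The function names are illustrative only.

```python
import math
from fractions import Fraction


def chance_of_exit(required_turns: int) -> Fraction:
    """Probability that a rat choosing left/right at random (assumed 50/50,
    independent at each turn) makes every required turn correctly."""
    return Fraction(1, 2) ** required_turns


def information_bits(probability: Fraction) -> float:
    """Information in bits, taken here as -log2(probability)."""
    return -math.log2(float(probability))


simple_maze = chance_of_exit(2)         # two right turns
complicated_maze = chance_of_exit(100)  # one hundred correct turns

print("simple maze:      p =", simple_maze,
      " bits =", information_bits(simple_maze))
print("complicated maze: p =", float(complicated_maze),
      " bits =", information_bits(complicated_maze))
```

Under these assumptions the simple maze is exited by chance one time in four, whereas the complicated maze corresponds to a probability of one in 2^100, which is why a flawless run through it is not credited to luck.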

We can summarize this argument for showing that CSI is a reliable indicator of design as follows: CSI is a reliable indicator of design because its recognition coincides with how we recognize intelligent causation generally. To recognize intelligent causation, we must establish that one possibility from a range of competing possibilities was actualized, determine which possibilities were excluded, and then specify the actualized possibility. What's more, the competing possibilities that were excluded must be live possibilities, sufficiently numerous so that specifying the actualized possibility cannot be attributed to chance. In terms of probability, this means that the specified possibility is highly improbable. In terms of complexity, this means that the specified possibility is highly complex. All the elements in the general scheme for recognizing intelligent causation (i.e., Actualization-Exclusion-Specification) find their counterpart in complex specified information (CSI). CSI pinpoints what we need to be looking for when we detect design.

As a postscript, I call the reader's attention to the
etymology of the word "intelligent." It derives from two Latin words, the
preposition *inter*, meaning between, and the verb *lego*, meaning to choose or
select. Thus, according to its etymology, intelligence consists in *choosing between*.
It follows that the etymology of the word "intelligent" parallels the formal
analysis of intelligent causation just given. Thus "intelligent design" is a
thoroughly apt phrase, signifying that design is inferred precisely because an intelligent
cause has done what only an intelligent cause can do: make a choice.

**The Law of Conservation of Information**

Evolutionary biology has steadfastly resisted
attributing CSI to intelligent causation. Though Eigen recognizes that the central problem
of evolutionary biology is the origin of CSI, he has no thought of attributing CSI to
intelligent causation. According to Eigen, natural causes are adequate to explain the
origin of CSI. The only question for him is which natural causes explain the origin of
CSI. Eigen ignores the logically prior question of whether natural causes can even, in
principle, explain the origin of CSI. Yet this is a question that undermines his entire
project.^{20} Natural causes are, in principle, incapable of
explaining the origin of CSI. They can explain the flow of CSI, being ideally suited for
transmitting already existing CSI. What they cannot do, however, is originate CSI. This
strong proscriptive claim, that natural causes can only transmit CSI but never originate
it, I call the Law of Conservation of Information. It is this law that gives definite
scientific content to the claim that CSI is intelligently caused. The aim of this last
section is briefly to sketch the Law of Conservation of Information.^{21}

To see that natural causes cannot account for CSI is
straightforward. Natural causes comprise chance and necessity.^{22} Because
information presupposes contingency, necessity is by definition incapable of producing
information, much less complex specified information. For there to be information, there
must be a multiplicity of live possibilities, one of which is actualized, and the rest of
which are excluded. This is contingency. But if some outcome B is necessary given
antecedent conditions A, then the probability of B given A is one, and the information in
B given A is zero. If B is necessary given A, Formula (*) reduces to
I(A&B) = I(A), which is to say that B contributes no new information
to A. It follows that necessity is incapable of generating new information. Observe that
what Eigen calls "algorithms" and "natural laws" fall under necessity.
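The reduction can be written out step by step. This is a sketch under two assumptions carried over from the earlier sections: that information is measured as the negative base-2 logarithm of probability, and that Formula (*) is the additivity rule I(A&B) = I(A) + I(B|A). With B necessary given A, so that P(B|A) = 1:

```latex
\begin{aligned}
I(B \mid A)   &= -\log_2 P(B \mid A) = -\log_2 1 = 0, \\
I(A \,\&\, B) &= I(A) + I(B \mid A) = I(A) + 0 = I(A).
\end{aligned}
```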

Since information presupposes contingency, let us take a closer look at contingency. Contingency can assume only one of two forms. Either the contingency is a blind, purposeless contingency, which is chance; or it is a guided, purposeful contingency, which is intelligent causation. Since we already know that intelligent causation is capable of generating CSI (cf. section "Intelligent Design"), let us next consider whether chance might also be capable of generating CSI. First notice that pure chance, entirely unsupplemented and left to its own devices, is incapable of generating CSI. Chance can generate complex unspecified information, and chance can generate noncomplex specified information. What chance cannot generate is information that is jointly complex and specified.

Biologists by and large do not dispute this claim. Most
agree that pure chance, what Hume called the Epicurean hypothesis, does not adequately
explain CSI. Jacques Monod is one of the few exceptions, arguing that the origin of life,
though vastly improbable, can nonetheless be attributed to chance because of a selection
effect.^{23} Just as the winner of a lottery is shocked at
winning, so we are shocked to have evolved. But the lottery was bound to have a winner,
and so too something was bound to have evolved. Something vastly improbable was bound to
happen, and so the fact that it happened to us (i.e., that we were selected, hence the name
selection effect) does not preclude chance. This is Monod's argument and it is fallacious.
It utterly fails to come to grips with specification. Moreover, it confuses a necessary
condition for life's existence with its explanation. Monod's argument has been refuted by
the philosophers John Leslie,^{24} John Earman,^{25}
and Richard Swinburne.^{26} It has also been refuted by the
biologists Francis Crick,^{27} Bernd-Olaf Küppers,^{28}
and Hubert Yockey.^{29} Selection effects do nothing to render
chance an adequate explanation of CSI.

Most biologists, therefore, reject pure chance as an
adequate explanation of CSI. The problem here is not simply one of faulty statistical
reasoning. Pure chance as an explanation of CSI is also scientifically unsatisfying. To
explain CSI in terms of pure chance is no more instructive than pleading ignorance or
proclaiming CSI a mystery. It is one thing to explain the occurrence of heads on a single
coin toss by appealing to chance. It is quite another, as Küppers points out, to follow
Monod and take the view that "the specific sequence of the nucleotides in the DNA
molecule of the first organism came about by a purely random process in the early history
of the earth."^{30} CSI cries out for an explanation, and
pure chance won't do. As Richard Dawkins correctly notes: "We can accept a certain
amount of luck in our [scientific] explanations, but not too much."^{31}

If chance and necessity left to themselves cannot
generate CSI, is it possible that chance and necessity working together might generate
CSI? The answer is "No." Whenever chance and necessity work together, the
respective contributions of chance and necessity can be arranged sequentially. But by
arranging them sequentially, it becomes clear that at no point in the sequence is CSI
generated. Consider the case of trial-and-error (trial corresponds to necessity and error
to chance). Once considered a crude method of problem solving, trial-and-error has so
risen in the estimation of scientists that it is now regarded as the ultimate source of
wisdom and creativity in nature. The probabilistic algorithms of computer science all
depend on trial-and-error.^{32} So too, the Darwinian mechanism
of mutation and natural selection is a trial-and-error combination in which mutation
supplies the error and selection the trial. An error is committed after which a trial is
made. But at no point is CSI generated.

Natural causes are therefore incapable of generating CSI. This broad conclusion I call the Law of Conservation of Information, or LCI for short. LCI has profound implications for science. Among its corollaries are the following: (1) The CSI in a closed system of natural causes remains constant or decreases; (2) CSI cannot be generated spontaneously, originate endogenously, or organize itself (as these terms are used in origins-of-life research); (3) The CSI in a closed system of natural causes either has been in the system eternally or was at some point added exogenously (implying that the system though now closed was not always closed); (4) In particular, any closed system of natural causes that is also of finite duration received whatever CSI it contains before it became a closed system.

This last corollary is especially pertinent to the
nature of science for it shows that scientific explanation is not coextensive with
reductive explanation. Richard Dawkins, Daniel Dennett, and many scientists are convinced
that proper scientific explanations must be reductive, moving from the complex to the
simple. Dawkins writes: "The one thing that makes evolution such a neat theory is
that it explains how organized complexity can arise out of primeval simplicity."^{33} Dennett views any scientific explanation that moves from
simple to complex as "question-begging."^{34} Thus
Dawkins explicitly equates proper scientific explanation with what he calls
"hierarchical reductionism," according to which "a complex entity at any
particular level in the hierarchy of organization" must properly be explained
"in terms of entities only one level down the hierarchy."^{35} While
no one will deny that reductive explanation is extremely effective within science, it is
hardly the only type of explanation available to science. The divide-and-conquer mode of
analysis behind reductive explanation has strictly limited applicability within science.
In particular, this mode of analysis is utterly incapable of making headway with CSI. CSI
demands an intelligent cause. Natural causes will not do.

Notes

^{1}Manfred Eigen, *Steps
Towards Life: A Perspective on Evolution*, translated by Paul Woolley (Oxford: Oxford
University Press, 1992), 12.

^{2}Keith J. Devlin, *Logic
and Information* (New York: Cambridge University Press, 1991), 1.

^{3}Fred I. Dretske, *Knowledge
and the Flow of Information* (Cambridge, MA: MIT Press, 1981), 4.

^{4}Robert Stalnaker, *Inquiry*
(Cambridge, MA: MIT Press, 1984), 85.

^{5}See Claude E. Shannon and W.
Weaver, *The Mathematical Theory of Communication* (Urbana, IL: University of
Illinois Press, 1949), 32; R. W. Hamming, *Coding and Information Theory*, 2d edition
(Englewood Cliffs, NJ: Prentice-Hall, 1986); or any mathematical introduction to
information theory.

^{6}Cf. William A. Dembski, *The
Design Inference: Eliminating Chance through Small Probabilities* (Forthcoming,
Cambridge University Press, 1998), Chap. 4.

^{7}The details can be found in
my monograph, *The Design Inference*.

^{8}For the details refer to my
monograph, *The Design Inference*.

^{9}Cf. Ian Hacking, *Logic of
Statistical Inference* (Cambridge: Cambridge University Press, 1965).

^{10}Arno Wouters,
"Viability Explanation," *Biology and Philosophy,* 10 (1995): 435-457.

^{11}Michael Behe, *Darwin's
Black Box: The Biochemical Challenge to Evolution*. (New York: The Free Press, 1996).

^{12}Richard Dawkins, *The
Blind Watchmaker* (New York: Norton, 1987), 9.

^{13}Cf. John D. Barrow and
Frank J. Tipler, *The Anthropic Cosmological Principle* (Oxford: Oxford University
Press, 1986).

^{14}Cf. David Bohm, *The
Undivided Universe: An Ontological Interpretation of Quantum Theory* (London:
Routledge, 1993), 35-38.

^{15}Cf. Rolf Landauer,
"Information is Physical," *Physics Today* (May 1991): 23-29, at 26.

^{16}Cf. David J. Chalmers, *The
Conscious Mind: In Search of a Fundamental Theory* (New York: Oxford University Press,
1996), Chap. 8.

^{17}Cf. Andrei N. Kolmogorov,
"Three Approaches to the Quantitative Definition of Information," *Problemy
Peredachi Informatsii* (in translation), 1(1) (1965): 3-11; Gregory J. Chaitin,
"On the Length of Programs for Computing Finite Binary Sequences," *Journal of
the ACM*, 13 (1966): 547-569.

^{18}Ludwig Wittgenstein, *Culture
and Value*, edited by G. H. von Wright, translated by P. Winch (Chicago: University of
Chicago Press, 1980), 1e.

^{19}Cf. James E. Mazur, *Learning
and Behavior*, 2d ed. (Englewood Cliffs, NJ: Prentice Hall, 1990); Barry Schwartz, *Psychology
of Learning and Behavior*, 2d edition (New York: Norton, 1984).

^{20}Manfred Eigen, *Steps
Towards Life.*

^{21}A full treatment will
be given in *Uncommon Descent*, a book I am jointly authoring with Stephen Meyer and
Paul Nelson.

^{22}Cf. Jacques Monod, *Chance
and Necessity* (New York: Vintage, 1972).

^{23}Ibid.

^{24}John Leslie, *Universes*
(London: Routledge, 1989).

^{25}John Earman, "The
Sap Also Rises: A Critical Examination of the Anthropic Principle," *American
Philosophical Quarterly*, 24(4) (1987): 307-317.

^{26}Richard Swinburne, *The
Existence of God* (Oxford: Oxford University Press, 1979).

^{27}Francis Crick, *Life
Itself: Its Origin and Nature* (New York: Simon and Schuster, 1981), Chap. 7.

^{28}Bernd-Olaf Küppers, *Information
and the Origin of Life* (Cambridge, MA: MIT Press, 1990), Chap. 6.

^{29}Hubert P. Yockey, *Information
Theory and Molecular Biology* (Cambridge: Cambridge University Press, 1992), Chap. 9.

^{30}Bernd-Olaf Küppers, *Information
and the Origin of Life*, 59.

^{31}Richard Dawkins, *The
Blind Watchmaker*, 139.

^{32}E.g., genetic
algorithms; see Stephanie Forrest, "Genetic Algorithms: Principles of Natural
Selection Applied to Computation," *Science*, 261 (1993): 872-878.

^{33}Richard Dawkins, *The
Blind Watchmaker*, 316.

^{34}Daniel C. Dennett, *Darwin's
Dangerous Idea: Evolution and the Meanings of Life* (New York: Simon & Schuster,
1995), 153.

^{35}Richard Dawkins, *The
Blind Watchmaker*, 13.