Evolutionary Information

Glenn R. Morton (grmorton@waymark.net)
Sun, 05 Jul 1998 15:54:43 -0500

Evolutionary information comes from the environment.

Copyright 1998 G.R. Morton
This can be freely distributed so long as no charge is made and the text is
left unaltered.

With all the talk about information theory in the air, I have been spurred
to attempt to show how information arises in an evolutionary context. It
comes from the environment. Most of this post can be understood with no
mathematical background. Only the last third contains mathematics. One
caution: the word information is used in its mathematical definition. It
is not synonymous with 'meaning', 'specificity', 'knowledge' etc. This is
the proper meaning of information in relation to the mathematical
definition in information theory.

One of the most fascinating areas of the creation/evolution debate concerns
the specificity of the DNA, RNA and proteins upon which life is
constructed (specificity is the ability of a molecule to perform a given
function). Anti-evolutionists have traditionally believed that the
information for the manufacture of life forms and their further evolution
could only come directly, rather than indirectly, from the hand of the
creator. The anti-evolutionary position has traditionally excluded God
dealing in an indirect fashion. Why this is so has more to do with history
than with science or scripture. In the Scripture God acted
indirectly by commanding both the water and land to bring forth life. In
Genesis 1, God didn't create the fish or the land animals directly. And as
far as science is concerned, I am going to attempt to show that information
for the development of new structures in living forms comes from the
environment, not directly from the hand of God. In fact, some creationists
have recognized this is the case with evolution for over 30 years. Other
anti-evolutionists seem not to be aware of it. I would suggest that God
created the environment so the information can be transmitted from the
environment into the genome. The information is ultimately from God as is
everything else in this world. Bradley and Thaxton define two types of
information (a distinction which is not contained in information theory).
The first kind of information is order, the information required to form a
crystal. The second kind of information is complex specificity, that is,
information which contains 'meaning' or 'biological specificity.' They write:

"Order with low information content (the first kind) does arise by natural
processes. However, there is no convincing experimental evidence that
order with high information content (the second kind or specified
complexity) can arise by natural processes. Indeed, the only evidence we
have in the present is that it takes intelligence to produce the second
kind of order."
"Scientists can synthesize proteins suitable for life, Research chemists
produce things like insulin for medical problems in greater quantities.
The question is, How do they do it? Certainly not by means of chance or
natural causes. Only by highly constraining their experiments can chemists
produce proteins like those found in living things. Placing constraints on
the experiment limits the 'choices' at each step of the way. That is, it
adds information.
"If we want to speculate on how the first informational molecules came
into being, the most reasonable speculation is there was some form of
intelligence around at the time. We cannot identify that source any further
from a scientific analysis alone. Science cannot supply a name for that
intelligent cause." Walter Bradley and Charles Thaxton, "Information and the
Origin of Life," in J. P. Moreland, editor, The Creation Hypothesis (Downers
Grove, Illinois: InterVarsity Press, 1994), p. 209

Lane Lester and Ray Bohlin write:

"Intelligence is a necessity in the origin of any informational code,
including the genetic code, no matter how much time is given." Lane Lester
and Ray Bohlin, The Natural Limits to Biological Change, (Grand Rapids:
Zondervan, 1984), p. 157

Let's examine two things in the above quotations. The scientists are
adding information to the molecules they are creating. This is absolutely
correct. But the scientist is doing it INDIRECTLY by creating an
environment in which the molecules behave in particular ways. The
scientist does not directly take an individual nucleotide and holding it
carefully in his fingers attach it to another nucleotide. The creation of
DNA or RNA or proteins in a test tube is a case of indirect creativity by
an intelligence. This is analogous to God creating the universe in a
fashion in which biomolecules would be constrained by their environment to
behave in certain manners.

In the Lester and Bohlin quote, they say that intelligence is a necessity
for any informational code. This would seem to require an intelligence to
produce the genetic code for a wing, or a feather, or a leg. If the direct
application of intelligence is necessary for each part of the genetic
code then it is also a necessity for the evolution of any new information
which was not in the genome earlier. Lester and Bohlin's requirement, if
proven, would require at the very least, progressive creation, at the most,
special creation of each species.

However, there is a weakness in the requirement that all information must
come directly, rather than indirectly, from the Creator. In my view
information is from the creator regardless of the directness of the
transmission. Information can come from the environment and natural
selection is the informational pump. When a new predator begins preying on
a plant, the plant population has a better chance of surviving if a
mutation occurs in some members to produce a toxin which makes the predator
sick. As those plants with toxin become more prevalent, the predator now
may find itself better able to survive if a few mutations take place in
some of their members which enables the species to produce an antidote to
the toxin. In this way the environment of the plant has transmitted
information into the genome of the plant and the environment of the
predator then receives information that an antidote is beneficial. Note
that the information received by the predator is not information on how to
manufacture the toxin, but the opposite, how to nullify it. The creationist
Lee Spetner wrote:

"If we consider the genetic information of a species we find evolutionary
theory implies that as the environment changes, so the genetic information
changes. This is a requirement of any evolutionary theory that attempts to
account for the widespread adaptivity found in the biological world. In
particular in the modern synthetic theory of evolution, random mutation
produces an assortment of genotypes of which one or more is favored by the
environment. The result is that there is a kind of information-transmission
taking place from the environment to the genetic information storage of the
species. The mechanism by means of which this information is transmitted is
natural selection." ~ L. M. Spetner, "Natural Selection: An
Information-Transmission Mechanism for Evolution," Journal of Theoretical
Biology, 7(1964):412-429, p. 412

and

"The process by which such evolution is realized can be considered a
transmission of information from the environment into the genetic storage
of the organism. The information so transmitted will be referred to as
adaptive information and any theory of evolution can be characterized by
the mechanism it proposes for the transmission of such information." ~ Lee
M. Spetner, "Information Transmission in Evolution," IEEE Transactions on
Information Theory Vol IT-14 January 1968, p. 3-6, p. 3

Spetner is correct that natural selection acts as an information
transmission mechanism taking information from the environment and placing
it into the genome. This, from a creationist, is exactly where the
information comes from which drives evolution. But Spetner's contention
requires that there be information in the environment. And there is.
Spetner is not dealing with the origin of life, but with the creation and
transmission of new information after life has already been created.
Spetner's methodology requires that there be information in the environment
of an animal which is able to be transmitted. Since this will probably be
a novel concept to many we will examine this aspect of the environment.

How much information exists in the environment of an animal? Well the
environment consists both of the organic and inorganic world with which the
organism interacts. Predominantly a living organism relates to other living
organisms more than to the inorganic world. So, in the spirit of Spetner's
observation, we will estimate a lower limit to the information in the
biosphere.

Humans have approximately 100,000 genes (150,000 by some accounts [Isaac
Asimov, The Genetic Code, (New York: The New American Library, 1962), p.
179]). Assume that the average protein is 200 amino acids long and let us
use Yockey's value of 2.119 bits/amino acid (calculated from cytochrome c)
as the amount of information of the average site in a protein (H. P.
Yockey, Information Theory and Molecular Biology, (New York: Cambridge
University Press, 1992), p. 172).

What we find is that the amount of protein information in human proteins is:

100,000 x 200 x 2.119 = 4.2 x 10^7 = 42 million bits of information.

If we assume that half this value represents the proteinaceous information
of the average organism, then given 10,000,000 species on earth, we have

21,000,000 x 10,000,000 = 2.1 x 10^14 bits of information in the protein
instructions, counting one genome for each species.

But this is not all. Given that our genomes are not identical and are not
merely clones (each individual of each species has a unique genetic
inheritance), then assuming that there are 100 billion individual organisms
on earth (this figure would exclude microscopic life forms), this leads us
to the conclusion that there are at least 2.1 x 10^25 bits of information
in the proteins of the biosphere.
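
The chain of multiplications above can be checked with a short Python
sketch. It only repeats the steps as the text states them; the variable
names are illustrative and do not come from any cited source. In
particular, the last step multiplies the all-species total by the number of
individuals, because that is how the text arrives at its figure.

```python
# Figures quoted in the text; the names are illustrative only.
GENES = 100_000              # approximate human gene count used in the text
RESIDUES_PER_PROTEIN = 200   # assumed average protein length
BITS_PER_RESIDUE = 2.119     # Yockey's value, calculated from cytochrome c
SPECIES = 10_000_000         # assumed number of species on earth
INDIVIDUALS = 100e9          # assumed number of individual organisms

human_bits = GENES * RESIDUES_PER_PROTEIN * BITS_PER_RESIDUE
average_bits = human_bits / 2                    # text assumes half the human value
all_species_bits = average_bits * SPECIES        # one genome per species
biosphere_bits = all_species_bits * INDIVIDUALS  # multiplication as in the text

for label, value in [
    ("human proteins", human_bits),
    ("average organism", average_bits),
    ("one genome per species", all_species_bits),
    ("biosphere estimate", biosphere_bits),
]:
    print(f"{label}: {value:.1e} bits")
```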

Information is not conserved as is energy. If information were conserved,
we humans would be unable to double the database of human knowledge every
few years. Information is created by the expenditure of energy according to
the relation

Del(I) <= Del(E) / (k T ln 2)

Where Del(I) is the change in information
Del(E) is the change in energy
k is the Boltzmann constant
T is the temperature in Kelvin.

(Barrow and Tipler, The Anthropic Principle, (New York: Oxford University
Press, 1986) p. 660-62)
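
To get a feel for the size of this bound, here is a minimal Python sketch
that evaluates it. The function name, the 1 joule of energy, and the 300 K
temperature are my own illustrative assumptions, not values from Barrow and
Tipler.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (SI value)

def max_information_bits(delta_energy_joules, temperature_kelvin):
    """Upper bound Del(I) <= Del(E) / (k T ln 2), returned in bits."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return delta_energy_joules / (BOLTZMANN_K * temperature_kelvin * math.log(2))

# Hypothetical example: 1 joule expended at 300 K.
bound = max_information_bits(1.0, 300.0)
print(f"at most {bound:.2e} bits")
```

Even a single joule at room temperature allows on the order of 10^20 bits,
so the bound places no practical limit on the quantities discussed here.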

What this means is that the information content in our genes is not the
entire complement of information that we will have throughout our lives.
The expenditure of energy creates our bodies, our memories, and human
knowledge. In this fashion the total informational content of the adult
human vastly exceeds that of the genome itself. The human brain has 10^15
dendritic connections (Frank J. Tipler, The Physics of Immortality, (New
York: Doubleday, 1994), p. 22). If each connection requires 1 bit (and I
think it requires more than 1 bit), these connections alone represent tens
of millions of times more information in a living human brain than the 42
million bits in the proteins. How is this information developed? It is
developed by the
expenditure of energy during human development. So, starting with an
initial quantity of information, each individual creates more during growth
and development. As I have noted elsewhere there is not enough information
in the entire DNA to code for the locations of the synaptic connections in
the brain.
(http://www.calvin.edu/archive/asa/199710/0403.html) the original post
(http://www.calvin.edu/archive/asa/199710/0405.html) A correction.

Given 6 billion people on earth, the information stored in the synaptic
connections of the brains yields 6 x 10^24 bits of information in human
brains alone! This doesn't account for the brains of other animals.
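
As a quick check of that product, here is a small Python sketch. The
one-bit-per-connection value is the lower-bound assumption made above, and
the names are illustrative.

```python
CONNECTIONS_PER_BRAIN = 10**15   # Tipler's figure quoted in the text
BITS_PER_CONNECTION = 1          # lower-bound assumption from the text
PEOPLE = 6 * 10**9               # world population used in the text

bits_per_brain = CONNECTIONS_PER_BRAIN * BITS_PER_CONNECTION
total_brain_bits = bits_per_brain * PEOPLE
print(f"{total_brain_bits:.0e} bits in human brains")
```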

Given the above, the environment of any animal contains a minimum of 10^25
bits of information in the biosphere which is capable of being transmitted
to any population in its evolution.

So, we can conclude that there is much information which can flow from the
environment into the genome.

But can the transmission of information occur rapidly enough to drive
evolution? Spetner, in his Journal of Theoretical Biology article says
'No'. But there is a flaw in his argument. His argument was constructed
prior to one important discovery and failed to properly account for a fact
about biological systems from an information point of view. Spetner
attempts to show that given the observed reproduction rates, the
informational transmission rate is too slow. His assumption is where he
fails to be convincing. He writes: "Let the genetic information storage F,
consist of a sequence of n+l symbols, where l of them represent essential
information that has already been transmitted from the environment while
the remaining n are random. These two component sequences need not, of
course, be physically separated, but they each may be distributed in any
way over the entire sequence. We shall compute the average number of
trials necessary to achieve by chance alone a specific sequence of n
symbols." ~ L. M. Spetner, "Natural Selection: An Information-Transmission
Mechanism for Evolution," Journal of Theoretical Biology, 7(1964):412-429,
p. 415

The last sentence is his assumption and it is where he errs in his quest.
By this assumption he is stating that there is one and ONLY ONE sequence of
n symbols which will perform the FUNCTION he desires to evolve. Indeed
Spetner makes this claim in his abstract. He states:

"The information-transmission rate possible for natural selection is
computed as the average number of trials (i.e. births) necessary to specify
a unique nucleotide sequence of length n."~L. M. Spetner, "Natural
Selection: An Information-Transmission Mechanism for Evolution," Journal of
Theoretical Biology, 7(1964):412-429, p. 412

But any anti-evolutionary probability argument is erroneous if it claims
that only a single unique sequence can perform a given function. This
simply isn't observationally true. In almost all functional biological
molecules there are hundreds of billions of FUNCTIONAL versions. For
instance, cow, sheep, pig, etc. insulins are all different from human insulin
in their sequence, but they can all be injected into a human and they will
perform well enough to keep the person alive for years. Do they work as
efficiently as human insulin? No, but they do perform the function of
human insulin in spite of having a different sequence. In 1977 Yockey
calculated that there were 10^61 different sequences which could perform
the function of cytochrome c. (Yockey, "A Calculation of the Probability of
Spontaneous Biogenesis by Information Theory," Journal of Theoretical
Biology, 67(1977):377-398). By 1992 new discoveries had increased that
number to 10^93 different protein sequences which perform the specific
function of cytochrome c. (Yockey, Information Theory and Molecular
Biology, (New York: Cambridge University Press, 1992), p. 59) This is an
increase of 32 orders of magnitude in the likelihood of finding a
cytochrome c sequence in only 15 years!

To Spetner's credit, he does acknowledge that his assumption is a weakness
in his theory. He wrote:

"If one assumes that all existing enzymes were evolved sequentially in
accordance with the present model, then one must conclude that if the
probability of evolution taking place is to be at all reasonable, then at
each step, there must have been a vast number of good choices available.
This is because P [the probability of finding a useful protein-GRM] can be
large only if n [the additional sequence length-GRM] is very small, or m
[the number of useful sequences-GRM] is very large. Small n means that
only a small amount of information is needed to specify the given enzyme
function, i.e., many different proteins could do the same job. Large m
means that there are many different ways in which the given species could
be improved in a given environment. In either of these cases we can
conclude that if life were to start over, the biological world would very
probably be far different from what it is now." ~L. M. Spetner, "Natural
Selection: An Information-Transmission Mechanism for Evolution," Journal of
Theoretical Biology, 7(1964):412-429, p. 421

The implications of the above weakness are what we will show below.

What this means is that in Spetner's article when he calculated the
probability of mutating a molecule towards the more beneficial version, his
probability is way too low. He says,

"If we let p be the probability of mutation of a single nucleotide, then
the probability of achieving in one trial the desired sequence of length n
while not disturbing the existing sequence of length l is

P[n,l;k] = (1-p)^l (1-p)^k (p/3)^(n-k)

where we start with a sequence that just happens to have k of the n symbols
correct to begin with." ~ L. M. Spetner, "Natural Selection: An
Information-Transmission Mechanism for Evolution," Journal of Theoretical
Biology, 7(1964):412-429, p. 415
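
Spetner's expression is easy to evaluate numerically. The Python sketch
below works in log10 so that very small probabilities do not underflow. The
function name and the example mutation rate of 10^-6 are my own
illustrative assumptions; only the formula itself comes from the quotation.

```python
import math

def log10_spetner_probability(p, n, l, k):
    """log10 of P[n,l;k] = (1-p)^l (1-p)^k (p/3)^(n-k).

    p: per-nucleotide mutation probability
    n: length of the desired new sequence
    l: length of the existing sequence that must not be disturbed
    k: number of the n symbols that already happen to be correct
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if l < 0 or not 0 <= k <= n:
        raise ValueError("need l >= 0 and 0 <= k <= n")
    return (l + k) * math.log10(1.0 - p) + (n - k) * math.log10(p / 3.0)

# Hypothetical example: p = 1e-6, two new symbols, 1000 existing symbols.
example = log10_spetner_probability(1e-6, n=2, l=1000, k=0)
print(f"log10 P = {example:.2f}")
```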

Now, this is the probability of moving toward one particular solution, but
since there are others one must sum over the entire panoply of other
solutions. Assuming that there are j different solutions, this requires
that the total probability, TP, be greater than what Spetner gives.

The relation of TP to P[n,l;k] can be expressed as TP = J P[n,l;k], where J
is a suitable multiplier determined from the sum over the j solutions. J
will be greater than 1.

Then Spetner attempts to calculate how many 'trials' or generations are
necessary to add a length n to the working molecule. A trial is defined
here as the birth of the next generation, as that is the only time that
mutations can either occur or be passed on to the next generation. Spetner
goes on to say,

"Then the average number of trials necessary to achieve the desired
sequence of length n, starting with k of the symbols correct, and not
disturbing the existing l is the reciprocal of P[n,l;k]." ~ L. M. Spetner,
"Natural Selection: An Information-Transmission Mechanism for Evolution,"
Journal of Theoretical Biology, 7(1964):412-429, p. 415-416

The smaller P[n,l;k] is the greater is the number of trials, but with J
greater than 1 (accounting for functionality rather than desiring a given
sequence), the number of trials becomes less than for P[n,l;k] alone.
Mathematically it is:

N = 1/(J P[n,l;k]) << 1/P[n,l;k].
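
The effect of the multiplier J on the expected number of trials can be
sketched in a few lines of Python. The function works with log10 values,
and the numbers in the example (P = 10^-12, J = 10^3) are hypothetical and
chosen only to show the direction of the effect.

```python
def log10_expected_trials(log10_p_single, log10_j=0.0):
    """log10 of N = 1 / (J * P), given log10(P) and log10(J)."""
    if log10_j < 0:
        raise ValueError("J is assumed to be at least 1")
    if log10_p_single + log10_j > 0:
        raise ValueError("J * P cannot exceed 1")
    return -(log10_p_single + log10_j)

# Hypothetical numbers: one target sequence versus 10^3 acceptable ones.
single_target = log10_expected_trials(-12.0)
many_targets = log10_expected_trials(-12.0, 3.0)
print(f"one target:   10^{single_target:.0f} trials")
print(f"many targets: 10^{many_targets:.0f} trials")
```

Every factor of ten in J removes a factor of ten from the expected number
of trials.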

Spetner, using his much underestimated probability for FUNCTIONALITY
concludes:

"If we use this value for p in equation (4); then we obtain

n~ 0.12 log[base10]Nbar. (5)

Here Nbar represents the average number of trials [or GENERATIONS-GRM] in
the evolutionary step and n represents the number of symbols in the
sequence that was added to the genetic message. Let us for example
consider an evolutionary step in a population of animals with an annual
birth rate of 10^7 over a time of 10^7 years, then equation (5) tells us
that the information that could be transmitted to the species is less than
that which corresponds to two nucleotides." ~ L. M. Spetner, "Natural
Selection: An Information-Transmission Mechanism for Evolution," Journal of
Theoretical Biology, 7(1964):412-429, p. 416
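
Spetner's example can be reproduced directly from his equation (5). This
Python sketch takes the number of trials to be the annual birth rate times
the number of years, which is how I read his example; the function name is
illustrative.

```python
import math

def spetner_sequence_length(mean_trials):
    """Spetner's equation (5): n ~ 0.12 * log10(Nbar)."""
    if mean_trials < 1:
        raise ValueError("need at least one trial")
    return 0.12 * math.log10(mean_trials)

births_per_year = 1e7
years = 1e7
n_symbols = spetner_sequence_length(births_per_year * years)
print(f"n ~ {n_symbols:.2f} nucleotides")
```

The result is below two nucleotides, which matches his statement.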

Nbar in the above citation is the average or expected value of N, the
number of generations in the animal's lineage needed to construct a
sequence of length n from the information provided by the environment.
Where he misses the boat is the increased efficiency of finding A
functional sequence rather than THE functional sequence. The factor in
equation 5 of his quote above must be significantly greater than 0.12,
which is calculated as if there is one and only one functional sequence.

How can we calculate an estimate of how far off Spetner is? There is a
way. We will illustrate this by looking at the cytochrome c molecule.
Cytochrome c is a 101 amino acid long protein. In Spetner's terminology
above, this is the case where l=0 and n=101. There are 10^180 different
sequences with 10^168 being in the high probability set (some amino acids
are rare and are unlikely to be drawn, but I will use the entire set of
permutations in the following calculations.)

Spetner's methodology would lead us to believe that only 1 sequence out of
10^180 performs the function of cytochrome c. But as noted above, this is
not true. Yockey notes that there are 10^93 different molecules that will
perform the cytochrome c function. (Information Theory and Molecular
Biology, p. 255). Using this, we can estimate how far off Spetner is.
Compare the fraction of acceptable sequences, 10^93/10^180, with the
fraction he assumes, 1/10^180: he is off by a factor of 10^93 in acceptable
sequences and thus off by a factor of 10^93 in his estimate of the speed of
information transmission from the environment to the species! Indeed the
weakness that Spetner worried about is the weakness that brings down his
argument: the fact that billions of sequences perform the same function.
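
Because these numbers are far too large or small for ordinary
floating-point arithmetic, the comparison is easiest to check with
exponents. The Python sketch below uses only the two figures given in the
text; the names are illustrative.

```python
LOG10_ALL_SEQUENCES = 180   # 10^180 possible sequences (from the text)
LOG10_FUNCTIONAL = 93       # Yockey's 10^93 functional sequences

# Fraction of sequences that work, under each assumption (as log10).
log10_fraction_functional = LOG10_FUNCTIONAL - LOG10_ALL_SEQUENCES
log10_fraction_single = 0 - LOG10_ALL_SEQUENCES

# How much larger the first fraction is than the second.
log10_underestimate = log10_fraction_functional - log10_fraction_single

print(f"functional fraction: 10^{log10_fraction_functional}")
print(f"single-sequence fraction: 10^{log10_fraction_single}")
print(f"underestimate: a factor of 10^{log10_underestimate}")
```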

To conclude, Spetner is correct that the environment is what provides the
information for the evolution of new structures and new molecules, but he
vastly underestimates the speed of that transmission, by 93 orders of
magnitude. Applying this multiplier to Spetner's transmission rate, we see
that given 10^7 births (trials), n, the length of sequence that can be
constructed, is

n ~ 10^93 * 0.12 log[base10](10^7)

n ~ 10^93 * 0.12 * 7 = 8.4 x 10^92

What this shows is that natural selection's information transmission
capacity is sufficient for the construction of any viable sequence length
less than 10^92 nucleotides long in 10^7 generations. If a generation
consists of a year, then this represents 10 million years of evolution.
Since most genomes are only of the order of 10^12 nucleotides, this is
plenty of channel capacity.
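
The final estimate can be reproduced with a short Python sketch. It follows
the way the multiplier is applied in this post, multiplying Spetner's
result directly by 10^93. Keeping the factor 0.12 exactly gives about
8.4 x 10^92, and rounding 0.12 to 0.1 gives 7 x 10^92. The names are
illustrative.

```python
import math

MULTIPLIER = 1e93        # correction factor argued for in the text
SPETNER_FACTOR = 0.12    # coefficient in Spetner's equation (5)
TRIALS = 1e7             # births (trials) used in the text
GENOME_LENGTH = 1e12     # order-of-magnitude genome size used in the text

n_exact = MULTIPLIER * SPETNER_FACTOR * math.log10(TRIALS)
n_rounded = MULTIPLIER * 0.1 * math.log10(TRIALS)

print(f"n with 0.12: {n_exact:.1e}")
print(f"n with 0.1:  {n_rounded:.1e}")
print(f"exceeds genome length: {n_exact > GENOME_LENGTH}")
```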

The information which the environment is transmitting to the genome is
information concerning alterations of the environment. Once an organism has
received the 'message' and is adapted to the environment, there is little
morphological change because the already received 'message' is repeated.
But when the environment alters, the channel capacity is not a limiting
factor in the organism changing.

I would like to thank the three gentlemen who helped me on this. Their
help was invaluable in shaping this report. I will not name them since in
this fashion they can dissociate themselves from the views (one is not an
evolutionist) and from any errors that might be found in this in the
future. I am the responsible party for any errors.

glenn

Adam, Apes and Anthropology
Foundation, Fall and Flood
& lots of creation/evolution information
http://www.isource.net/~grmorton/dmd.htm