Re: Information: a very technical definition (was Dawkins' video)

Stephen Jones (sejones@ibm.net)
Tue, 23 Jun 1998 22:18:48 +0800

Brian

On Tue, 16 Jun 1998 17:05:07 -0400, Brian D Harper wrote:

[...]

BH>Information theory tends to get very confusing. One
>point of possible confusion in recent discussions is the
>simultaneous usage of two related but different theories
>and definitions of information, classical information
>theory a la Shannon and the more recent algorithmic
>information theory of Kolmogorov and Chaitin.

Thanks for this. I think an even greater source of confusion
is the layman's (and even the biologist's) idea that
"information" involves *meaning*:

"Some have attempted to develop a theory of meaning from
these ideas. Shannon (1949) warned against this at the outset
of his paper. The assignment of meaning or in biology,
specificity, to certain particular member of the ensemble lies
outside information theory." (Yockey H.P., "An Application
of Information Theory to the Central Dogma and the
Sequence Hypothesis," Journal of Theoretical Biology, Vol.
46, 1974, pp371-372)

"Information theory. Created by Claude E. Shannon in 1948,
the theory provided a way to quantify the information content
in a message. The hypothesis still serves as the theoretical
foundation for information coding, compression, encryption
and other aspects of information processing. Efforts to apply
information theory to other fields, ranging from physics and
biology to psychology and even the arts, have generally failed -
in large part because the theory cannot address the issue of
meaning." (Horgan J., "From Complexity to Perplexity",
Scientific American, Vol. 272 No. 6, June 1995, p78-79)

BH>If one is discussing info content in terms of
>compressibility then algorithmic information theory is
>appropriate. So, what I plan to do here is reach the same
>conclusion Glenn does by a different route. This example
>will hopefully illustrate an important point that is often
>missed. The algorithmic information content is not given
>directly by the compressibility. A sequence that is 90%
>compressible may contain more information than one that is
>only 10% compressible.
>
>OK, what I did was follow up on Steve's example. First
>I started with two files that contained 640 repetitions of
>the two sequences, hereinafter referred to as -1- and
>-2-,
>
>-1- => repetitions of AAAAAAAAAAA
>-2- => repetitions of AAAAATAAAAA
>
>I then began to make larger and larger files. My initial
>plan was to double the size each time, but I quickly
>exceeded the size that I could copy to the clipboard, thus
>the size doubling stops after awhile.
>
>Anyway, the results are in the following table. Compression
>was accomplished using gzip with default settings.

[...]
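
For anyone who wants to repeat this kind of experiment, here is a
minimal sketch in Python. It assumes Python's standard gzip module as a
stand-in for the gzip program Brian used, so exact byte counts may
differ from his table. The repetition counts beyond 640 simply follow
the doubling plan he mentions, and the function names are only
illustrative.

```python
import gzip

SEQ_1 = b"AAAAAAAAAAA"  # sequence -1-
SEQ_2 = b"AAAAATAAAAA"  # sequence -2-


def compressed_size(data):
    """Return the size in bytes of data after gzip at the default level."""
    return len(gzip.compress(data))


def compare(repetitions):
    """Return (raw size, compressed size of -1-, compressed size of -2-)."""
    raw = len(SEQ_1) * repetitions
    size_1 = compressed_size(SEQ_1 * repetitions)
    size_2 = compressed_size(SEQ_2 * repetitions)
    return raw, size_1, size_2


# Repetition counts are assumed: 640 is from the post, the rest double it.
for reps in (640, 1280, 2560, 5120):
    raw, size_1, size_2 = compare(reps)
    print(reps, raw, size_1, size_2, size_2 - size_1)
```

Each printed row gives the number of repetitions, the uncompressed
size, the two compressed sizes and their difference, all in bytes.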

BH>First note that as the file size increases the %
>compressibility also increases and approaches
>100%, though it will never get to 100% of course.
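
As a rough illustration of that point, the sketch below computes %
compressibility as 100 * (original size - compressed size) / original
size. That formula is an assumption about how the percentage was
worked out, Python's gzip module stands in for the gzip program, and
the function names are hypothetical.

```python
import gzip


def percent_compressibility(raw_size, compressed_size):
    """Percentage by which compression shrinks the file."""
    return 100.0 * (raw_size - compressed_size) / raw_size


def measure(repetitions, unit=b"AAAAAAAAAAA"):
    """Return (compressed size in bytes, % compressibility) for unit repeated."""
    data = unit * repetitions
    packed = len(gzip.compress(data))
    return packed, percent_compressibility(len(data), packed)


# Repetition counts are assumed for illustration only.
for reps in (640, 5120, 40960):
    packed, pct = measure(reps)
    print(reps, packed, round(pct, 3))
```

The percentage can never reach 100 because the compressed file always
occupies at least a few bytes of header and trailer.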

My son Brad stands by his claim that "compressibility" is
the true mark of an increase in information. He says that the
size of the compressed file is a function of the algorithm,
which builds a dictionary of increasing size.

BH>The important point though is that the information
>content is given not by % compressibility but
>rather by the size of the compressed file. Even
>though the compressibility is approaching 100 %,
>the information content continues to increase.
>(see second two columns)
>
>It is also interesting to note that the difference
>between the two compressed files (last column) remains
>virtually constant, varying between 7 and 9 bytes.
>
>While I was at it, I decided to do another little
>experiment. I took a single file with 168 repetitions
>of AAAAATAAAAA and began to modify it according to
>the following rule: I rolled a twenty faced die and
>then moved to the right by however many spaces were
>indicated by the roll. I then changed whatever character
>I happened to land on, A to T or T to A. After every
>10 changes I saved the file and then looked at the
>change in information content versus number of random
>changes. Here are the results:

[...]
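
The die-rolling procedure can be imitated with a short Python sketch.
Several details are assumptions, since the post does not state them:
the walk starts at the first character, it stops when a roll would run
past the end of the file, a seeded pseudo-random generator replaces the
physical die, and Python's gzip module stands in for the gzip program.
The function name is hypothetical.

```python
import gzip
import random


def mutate_and_measure(repetitions=168, changes_per_save=10, seed=0):
    """Flip A<->T at positions chosen by d20 rolls, recording gzip sizes.

    Returns a list of (number of changes, compressed size in bytes),
    starting with the unmodified file at 0 changes.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated
    text = bytearray(b"AAAAATAAAAA" * repetitions)
    results = [(0, len(gzip.compress(bytes(text))))]
    position = 0  # assumed starting point: the first character
    changes = 0
    while True:
        position += rng.randint(1, 20)  # roll the twenty-faced die
        if position >= len(text):
            break  # assumed stopping rule: ran off the end of the file
        # Change whatever character we landed on: A to T or T to A.
        text[position] = ord("T") if text[position] == ord("A") else ord("A")
        changes += 1
        if changes % changes_per_save == 0:
            results.append((changes, len(gzip.compress(bytes(text)))))
    return results


for changes, size in mutate_and_measure():
    print(changes, size)
```

Changing the seed gives a different sequence of rolls, so the sizes
will vary from run to run in the same way that repeated runs with a
real die would.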

>This should show clearly enough that random changes
>increase the information content.

Thanks again. I am now convinced that the "information" you
are talking about in Information Theory is not necessarily the
same as what the layman and biologists are talking about,
namely meaning.

Indeed, it is possible that Glenn's whole argument (and
maybe Spetner's) rests on a misconception of what Neo-Darwinian
evolution really is. NDE claims that mutations are random
with respect to the needs of the organism. It is only when
those mutations are filtered and preserved by natural selection
that Darwinism claims they become "information" in the
biological sense. Mayr emphasises that NDE is a "two-step
process":

"Evolution through natural selection is (I repeat!) a two-step
process. The first step is the production (through
recombination, mutation and chance events) of genetic
variability; the second is the ordering of that variability by
selection. Most of the variation produced by the first step is
random in that it is not caused by, and is unrelated to, the
current needs of the organism or the nature of its
environment." (Mayr E., "Evolution," Scientific American,
Vol. 239, No. 3, September 1978, p44)

If mutation alone could generate information, then there
would be only a minor role for natural selection, and this
would actually defeat Neo-Darwinism:

"The claim for creativity has important consequences and
prerequisites that also become part of the Darwinian
corpus....If new species characteristically arise all at once,
then the fit are formed by the process of variation itself, and
natural selection only plays the negative role of executioner
for the unfit. True saltationist theories have always been
considered anti-Darwinian on this basis..." (Gould S.J.,
"Darwinism and the Expansion of Evolutionary Theory,"
Science, Vol. 216, 23 April 1982, p381)

Steve

"Evolution is the greatest engine of atheism ever invented."
--- Dr. William Provine, Professor of History and Biology, Cornell University.
http://fp.bio.utk.edu/darwin/1998/slides_view/Slide_7.html

--------------------------------------------------------------------
Stephen E (Steve) Jones ,--_|\ sejones@ibm.net
3 Hawker Avenue / Oz \ Steve.Jones@health.wa.gov.au
Warwick 6024 ->*_,--\_/ Phone +61 8 9448 7439
Perth, West Australia v "Test everything." (1Thess 5:21)
--------------------------------------------------------------------