Re: pure chance

Brian D. Harper (harper.10@osu.edu)
Sat, 28 Dec 1996 23:50:39 -0500

At 12:47 PM 12/27/96 -0600, Steve Clark wrote:
>At 01:04 PM 12/27/96 -0500, Brian wrote:
>>
>> Do critters try to maximize the information in
>> their genome? Doesn't seem obvious--genomes are ordered, and a
>> good thing, too!
>>
>>Of course critters are not trying to maximize the information in their
>>genome, the interesting question is whether it happens
>>anyway. My own view is that the information content tends
>>to increase during evolution
>
>Do you mean that the absolute information content increases or that the
>ratio of information vs noninformation in the genomes increases? For an
>example of the latter, I remind you of a message I recently posted regarding
>intriguing data that support the idea that bacteria have eliminated
>intervening sequences of DNA from their genes, and hence appear to have a
>greater relative information content compared to the mammalian genome.
>

This is an interesting question and I'm not sure of the best way
to approach an answer. Perhaps before I start speculating I should
give my standard disclaimer, namely that information theory is just
a hobby for me (i.e. my priestly robes are short and badly tattered).

First of all, wrt the study I was citing, it would be the absolute
information content that would be increasing. Here we need to
remind ourselves that the study considered only point mutations,
primarily because the mathematics is relatively simple for that
case. What you are describing for bacteria seems to me a far more
complex operation than a point mutation. Nevertheless, my intuition
suggests an answer that many will find surprising, one that I think
goes to the heart of what some consider to be the "problem" with
info theory that I mentioned earlier.

My intuition is that this elimination of intervening sequences
would decrease the information content, just as tearing out pages
would decrease the information content of a book. Perhaps this is
why you are introducing the idea of a ratio of info to non-info;
that is, you may protest that a better analogy might be a printing
error wherein 10 pages of gobbledegook were inadvertently inserted
after every 20 pages of the real text. Obviously, removing these
pages of gobbledegook is going to result in a better book. The
problem, though, is that this "betterness" cannot be measured by
information theory. The "problem" that I keep referring to is that
information theory cannot address meaning, value, function etc.
Your description of some sequences as containing no information
applies a criterion external to information theory. Actually, these
sequences still contain information in the Shannon sense. Whether
they have any value cannot be determined by the theory.
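
To make this concrete, here is a little toy sketch (my own
illustration, in Python; the sequences are made up and this is not
anything from Yockey or from the study) showing that a scrambled
"gobbledegook" sequence carries about as much information in the
Shannon sense as a "meaningful" one of the same length:

=========
import math
import random
from collections import Counter

def shannon_info(seq):
    # Total Shannon information in bits: sequence length times the
    # entropy of the observed symbol frequencies.
    n = len(seq)
    return n * -sum((c / n) * math.log2(c / n)
                    for c in Counter(seq).values())

# A short "meaningful" coding-style stretch (made up for illustration)
coding = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# A scrambled stretch of the same length -- pure gobbledegook
random.seed(0)
gobbledegook = "".join(random.choice("ACGT") for _ in range(len(coding)))

print(shannon_info(coding))        # ~76 bits for this particular string
print(shannon_info(gobbledegook))  # about the same -- the measure is
                                   # blind to meaning, value and function
=========

Nothing in that calculation knows or cares whether the sequence
does anything useful; it only sees symbol frequencies.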

Yockey gives a number of illustrations of this. For example, the
phone company charges you by the quantity rather than the quality
of your phone calls. Assuming you are running Windows, you can go
to your file manager and click on the C drive. What you will see is
the amount of information stored on your hard drive measured in
bytes or, more likely, kilobytes. There is no indication of the
value of this information. It could be that a virus infected your
hard drive, filling it with nonsensical goo. Or, even worse, you may
have archived 20 Megs of posts by Glenn Moron :). One of my favorite
Yockey illustrations occurs in the context of his scathing criticism
of Manfred Eigen's hypercycles. Yockey accuses Eigen of violating
this fundamental aspect of information theory by introducing, ad
hoc, a "value parameter" into his theory. Yockey's illustration is
that he could not send a package labeled "gift" to Eigen, since
"Gift" means poison in German, the idea being that meaning is not
contained "in the words themselves".

I keep putting "problem" in quotes because I don't really think this
is a problem at all; the real problem is that everyone seems to think it's
a problem :). For example, in Horgan's critique of complexology he
writes:

=========
Efforts to apply information theory to other fields,
ranging from physics and biology to psychology and
even the arts, have generally failed--in large part
because the theory cannot address the issue of meaning.
-- John Horgan, "From Complexity to Perplexity,"
Scientific American June 1995, 272(6):104-109.
==========

This is really a strange criticism. Since when has the failure of a
theory to address meaning been a crucial factor in physics or
biology? Newtonian mechanics is, I guess, a failure since it can't
address meaning.

Actually, I think this business makes a rather good analogy for
the recent discussions on methodological naturalism. A limitation
has been imposed upon information theory. It can measure only
the intrinsic, or "structural" properties of messages without any
regard to meaning or value. Is this a weakness or a strength?
I say it's a strength. You have something concrete and objective:
"information" is just the name given to a mathematical expression.
The problem comes when people forget this and start playing word
games. We might say that this limitation is just an arbitrary
convention which could be removed (as some say MN imposes
arbitrary limitations). But to remove this limitation is to destroy
information theory, even though you may still call the result by the
same name. Likewise, I believe the limitations imposed by MN are
not arbitrary; in fact they are imposed for reasons very similar to
those behind the exclusion of value and meaning from information
theory. Further, I believe that to remove these limitations would
be to destroy science, though we may continue to call the result by
the same name.

To carry this analogy even further, let's suppose that an
ultra-reductionist, the Dawkins of information theory, writes a book
called The Blind Playwright arguing that, since information theory
cannot determine the meaning of a Shakespearean play, the play thus
has no meaning except for apparent meaning. We all recognize the
silliness of such a proposal, but to say that, since science cannot
discover any purpose or meaning behind physical processes or
mechanisms, there is no purpose or meaning amounts to the same
mistake.

Well, I've digressed quite a bit from your original question. I don't
mean to imply that your observation about bacteria is not interesting
or without merit. I believe what you are applying is some type of
efficiency criterion that might be measurable, as you suggest, by
the ratio of the length of coding DNA to total length. Apparently,
these bacteria would have a ratio close to 1 and humans something
much less (just out of curiosity, does anyone know roughly what
this ratio would be?). But then another question might be whether
it is really "good" for a genome to have a high efficiency of this
type. I think this is probably a complicated and difficult question
to answer.
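
Just to pin down what I mean by that ratio, the arithmetic is
trivial (toy numbers of my own, not real genome sizes):

=========
def coding_ratio(coding_length, total_length):
    # Ad hoc "efficiency ratio": the fraction of the genome that is
    # coding sequence. A value of 1.0 would mean no intervening
    # sequences at all.
    return coding_length / total_length

# Made-up numbers, purely for illustration
with_introns    = coding_ratio(1000, 3000)   # ~0.33
introns_removed = coding_ratio(1000, 1000)   # 1.0

print(with_introns, introns_removed)
# The ratio goes up when the intervening sequences are cut out, but
# the total Shannon information (which scales with sequence length)
# goes down -- the "torn out pages" effect again.
=========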

Returning to the paper I cited: while a point mutation may
generally be expected to increase the information content, I don't
think we can say anything about whether it is likely to increase
the "efficiency ratio" defined above. It seems quite possible to
increase the information content and decrease the efficiency
simultaneously. For example, it's entirely possible that a random
mutation would increase the information content and yet be lethal
to the organism.
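
Here is the sort of thing I have in mind (again a toy calculation
of my own, with a deliberately exaggerated sequence): a single
substitution that makes the base composition less lopsided raises
the Shannon measure, while the calculation says nothing about
whether the change helps or kills the critter:

=========
import math
from collections import Counter

def shannon_info(seq):
    # Same toy measure as before: length times the entropy of the
    # symbol frequencies, in bits.
    n = len(seq)
    return n * -sum((c / n) * math.log2(c / n)
                    for c in Counter(seq).values())

original = "AAAAAAAAAC"   # a highly ordered (low-information) stretch
mutated  = "AAAAAAAAGC"   # one point mutation: the ninth A becomes a G

print(shannon_info(original))   # ~4.7 bits
print(shannon_info(mutated))    # ~9.2 bits -- information went up, but
                                # nothing here says the change is "good"
=========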

>
>P.S. Brian, I seriously doubt that after talking about their subject,
>information theorists were "almost stoned" by biologists. Any
>self-respecting biologist I know would have been asleep after the talk.

Hmm... I wonder how many read my long-winded ramblings above
without falling asleep :).

Brian Harper                  | "If you don't understand
Associate Professor           | something and want to
Applied Mechanics             | sound profound, use the
The Ohio State University     | word 'entropy'"
                              | -- Morowitz
Bastion for the naturalistic  |
rulers of science             |