More on information, meaning, and biological application

Greg Billock (billgr@cco.caltech.edu)
Mon, 29 Jun 1998 14:22:33 -0700 (PDT)

The difference between information as defined for communications systems
purposes in the Shannon theory and colloquial uses of the word causes
some problems, and these have very real effects on attempted
applications of info theory to genetics.

In the Shannon theory, information is divorced from meaning, and is related
exclusively to the resolution of uncertainty regarding the message
(i.e. integrating -p log p over the distribution of possible incoming
messages). Shannon theorems are limit theorems regarding channels and
how much information you can stuff through them, and they assume you
will have a great deal of freedom in codeword assignment. That's why the
theory works so well even though it ignores meaning--the meaning can
be taken out of the system more-or-less arbitrarily by the users (the
source and sink), and doesn't depend upon the transmission system.
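
To make the "-p log p" business concrete, here is a minimal sketch in
Python. The function name, the example distributions, and the message
labels are all my own illustrative assumptions, not anything from
Shannon. It just shows that the number depends only on the
probabilities, never on what the messages say.

```python
import math

def shannon_entropy(probs):
    """Return -sum(p * log2(p)) in bits for a discrete distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (the limit of p*log p is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical sources: same probabilities, very different "meanings".
weather = {"sun": 0.25, "rain": 0.25, "snow": 0.25, "fog": 0.25}
orders = {"attack": 0.25, "retreat": 0.25, "hold": 0.25, "flank": 0.25}

print(shannon_entropy(weather.values()))  # 2.0 bits
print(shannon_entropy(orders.values()))   # 2.0 bits, labels are irrelevant

# A lopsided source resolves much less uncertainty per message.
skewed = [0.97, 0.01, 0.01, 0.01]
print(shannon_entropy(skewed))
```

Relabeling the messages changes nothing in the calculation, which is
the sense in which the theory is divorced from meaning.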

In applying this to genetics, there is an initial problem of figuring
out (in order to use the Shannon framework) what exactly the messages
are which are to be expected, and thus how much information is stored
in the genome, and where. The problem is that this distribution is
intimately tied up with a very meaning-full process: survival on
Planet Earth. We can be absolutely sure that no species of bird will
turn out to have genomes which code for methane metabolism, so a vast
number of possible genomes is ruled out of consideration. But we
rule them out because of environmental constraints on the
planet which we didn't choose, the birds didn't choose, and neither of
us can control. We don't have the freedom to live with whatever
genome we like--constraints relating to the 'meaning' of the 'message'
of our genomes (decoded into proteins) are very strict. As we all
appreciate, it doesn't take much going genomically awry to produce a
much-less-viable organism. This means we occupy a relatively small
territory in genome-space, and the information required to resolve
*exactly* where we are is much less.
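
A toy calculation may help here. This is only a sketch under loud
assumptions: the sequence length, the count of "viable" sequences, and
the idea that all candidates are equally likely are invented for
illustration and say nothing about real genomes.

```python
import math

def bits_to_locate(n_possible):
    """Bits needed to single out one item among n_possible equally
    likely candidates (log2 of the size of the space)."""
    return math.log2(n_possible)

# Hypothetical toy numbers, not real biology.
ALPHABET = 4                 # A, C, G, T
LENGTH = 10                  # toy sequence length
all_sequences = ALPHABET ** LENGTH   # every string of that length
viable_sequences = 1024      # assumed: the few that "survive"

unconstrained_bits = bits_to_locate(all_sequences)
constrained_bits = bits_to_locate(viable_sequences)

print(unconstrained_bits)  # 20.0
print(constrained_bits)    # 10.0
```

In this toy case, shrinking the allowed territory halves the bits
needed to say where you are, and the shrinking was done entirely by
what the sequences mean for survival, not by the channel.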

This is a point Gould often highlights--much less evolution may be
due to natural selection than we thought, because it could be that
the path species are travelling in this much-reduced genomic territory
is more like a 'one-way street' than a network of roads in a
large subdivision.

The same sorts of problems crop up in applying info theory to behavioral
biological processes. It is unclear what role the theory will have,
because the information processing in brains uses information in a
very colloquial sense--a meaning-laden construct--rather than in an
information theoretic sense.

These sorts of puzzles have tended to limit the fruitfulness of
applying information theory to biology.

-Greg