Re: order, complexity, entropy and evolution

Don N Page (don@phys.ualberta.ca)
Tue, 23 Dec 97 11:15:59 -0700

Glenn asked whether I may be counting the same information twice. This
indeed gets to the heart of my proposal, which is essentially to compare J_m,
the naive sum of the i_m's (the information in the separate molecules, with no
discount for overlap), with I_m, the minimum information needed to specify the
m most complicated molecules without counting the same information twice. If
the difference is large, then there is a
lot of duplicate information in the molecules, and this duplicate information
requires both organization (for the information that is duplicated) and order
(for the duplication).

Yesterday I thought of a somewhat simpler scheme for quantifying what I
am after: Now just label the different distinct types of molecules by j, and
suppose that there are n_j molecules of type j in the entity. Let i_j be the
information in each type separately (without considering the duplicated
information from one type of molecule to another). This will be essentially
the information c_j needed to specify the type of molecule (the molecular
structure), plus the information, of order log n_j, needed to specify how many
such molecules there are in the entity. Let J be the sum of the i_j's, the sum
of all the information in the separate species of molecules, without regard for
the correlations between the different types and numbers of molecules. And let
I be the true minimum information needed to specify all the types j of
molecules and their numbers n_j within the entity, not counting duplicated
information more than once. Then let L = J - I, the duplicate information.
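
(To make this concrete, here is a little sketch in Python of how one might
estimate these quantities numerically. The compressed length under an
off-the-shelf compressor stands in, very crudely, for the minimum
information; the choice of compressor is mine and plays exactly the role of
the "background knowledge" I worry about below, and the function names and
encoding are just for illustration.)

    import zlib

    def info(s):
        """Crude proxy, in bits, for the minimum information in the
        byte string s: its compressed length under zlib."""
        return 8 * len(zlib.compress(s, 9))

    def duplicate_information(molecules):
        """molecules maps each type j (a byte-string description of the
        molecular structure) to its count n_j in the entity.
        Returns (J, I, L):
          J -- sum of the i_j's, with i_j ~ c_j + log2(n_j), each type
               compressed separately;
          I -- compressed length of one joint description, so that
               information shared between types is counted only once;
          L -- J - I, the duplicate information."""
        J = sum(info(desc) + max(n, 1).bit_length()
                for desc, n in molecules.items())
        joint = b"\n".join(desc + b" x " + str(n).encode()
                           for desc, n in molecules.items())
        I = info(joint)
        return J, I, J - I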

It would seem that for L to be large, one needs both large organization
and large order. If none of the molecules is very complex, then each i_j would
be small (perhaps dominated by the log n_j term, which would be at most of
order 300 bits even for all the ~10^80 particles in the observable universe,
since log_2 10^80 is about 266), and so would be their sum J and hence
L = J - I. On the other hand, if one just had a random
collection of complex molecules with essentially no correlations (very little
duplicate information) between the different types, then even if J were large,
I would be essentially equal to J, and so L would again be small.
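
With the toy sketch above, the two limits come out as expected (the data here
are just random byte strings standing in for molecular structures, so the
numbers are purely illustrative):

    import os

    core = os.urandom(300)   # a shared "ancestral" substructure

    # Complex types sharing a common core: each i_j is large, but I
    # counts the core only once, so L = J - I is large.
    related = {core + os.urandom(20): 1 for _ in range(50)}

    # Complex but unrelated random types: essentially no duplicate
    # information, so I is close to J and L is small.
    unrelated = {os.urandom(320): 1 for _ in range(50)}

    # Simple types: every i_j is small, so J, and hence L, is small.
    simple = {bytes([b]) * 320: 1 for b in range(50)}

    for name, mols in (("related", related), ("unrelated", unrelated),
                       ("simple", simple)):
        J, I, L = duplicate_information(mols)
        print(name, J, I, L)

The first case is also a cartoon of the point about evolution below: common
ancestry is precisely what makes the joint description shorter.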

Now an interesting question is whether the biosphere on earth is the
entity in our region of the universe that, for that same number of molecules,
has the largest L. (The assumption of evolution would go into getting the
simplest description of all the molecules in the different species and so would
make I lower, and hence L larger, than if the different species were not
related by having common ancestors.) Another interesting question would be
whether for its number of molecules, a human has the largest L. If these and
various other questions were answered affirmatively, L might be a good measure
of the complexity (organization plus order) of life.

Unfortunately, there is still the subjective element of how to quantify
the information in a finite collection of molecules. The answer depends on
the background knowledge. Usually in definitions of complexity one assumes a
universal computer with a fixed amount of background knowledge and then takes
the limit of an infinite string, so the background knowledge of the finite
computer is only an infinitesimal fraction of the information in the infinite
string (if it is at least partially random). But if the string is finite, as
would be a specification of the type (numbers of various atoms and their
configuration) of any finite collection of finite molecules, the background
information is not an infinitesimal fraction of the total and so makes a
difference. This makes it impossible to give an unambiguously unique definition
of how much information there is in any finite system. But maybe one could
come up with a pretty good subjective definition. Can anyone tell me if there
is one for the complexities of collections of molecules?
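
(For what it is worth, the standard expression of this ambiguity in
algorithmic information theory is the invariance theorem: for any two
universal computers U and V there is a constant c_{UV}, independent of the
string x, such that

    K_U(x) <= K_V(x) + c_{UV} .

For an infinite random string, c_{UV} is a negligible fraction of the
information, but for a string specifying a finite collection of molecules it
can be comparable to the information itself, which is just the subjectivity
described above.)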

Don Page