Re: Conservation of information (long)

David Bowman (dbowman@tiger.georgetowncollege.edu)
Fri, 20 Aug 1999 20:52:19 -0400 (EDT)

Glenn's review of the gedanken experiment involving Maxwell's demon was
quite nice. He is also correct that it is the forgetting/memory-clearing
step that carries the thermodynamic cost which raises the entropy of the
universe by at least as many bits as are cleared in memory. Any computer
that operates in an environment at any finite absolute temperature T will
therefore be required to consume at least kT*ln(2) of energy per
bit erased (where ln(...) represents a natural logarithm and k is
Boltzmann's constant k = 1.380658 x 10^(-23) J/K). Thus the way to beat
this minimal energy requirement is to make the computer operation
*reversible*. It has been shown (although I don't have the references
handy) that a Turing-type general purpose finite state computer can, *in
principle*, be modified to perform its calculations in a reversible
manner. Such a computer would never forget/erase anything and would
keep all stored information, including all data and partial results,
intact
throughout the calculation. The behavior of such a reversible computer
could be controlled by a Hamiltonian dynamical system (also reversible,
in principle) which by its nature conserves energy when it is isolated
from its environment. Such an energy-conserving computer would neither
take in any energy from the outside nor dissipate any energy to its
surroundings. Needless to say we are a *very* long way from making such a
functioning computer that doesn't require any external power supply to
operate. In fact, ordinary computers still consume/dissipate *very*
much more energy per bit erased than the usual theoretical bound of
kT*ln(2).
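
To put a rough number on that bound, here is a little back-of-the-
envelope sketch in Python (my own illustration, assuming an ambient
temperature of 300 K):

    import math

    k = 1.380658e-23   # Boltzmann's constant in J/K
    T = 300.0          # assumed ambient temperature in kelvin

    # Landauer bound: minimum energy dissipated per erased bit
    e_per_bit = k * T * math.log(2)
    print(e_per_bit)   # ~2.87 x 10^(-21) joules per bit erased

For comparison, present-day logic and memory dissipate many orders of
magnitude more than this per bit operation.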

BTW, *I think* that Landauer is a physicist, not a mathematician as
stated in the quote from the article by Lowenstein.

>Information is gained at the cost of an increase in entropy for the
>universe. So information is not constant. It can be increased. But there is
>a price to pay. I would also be interested in David Bowman's comments on
>this aspect of information theory.

On the surface it would seem that Spetner has a point about the non-
increasing nature of information. But a closer look shows, IMO, that the
idea is misleading and has no relevance to the various natural
mechanisms of biological evolution. The reason that it looks
like Spetner may have a point is, essentially, that the thermodynamic
entropy represents (as all forms of entropy must) an amount of *missing*
information. Since thermodynamic entropy is subject to the 2nd law
(which requires it to be non-decreasing), the missing information
represented by that entropy must increase (or at least not decrease
for a thermodynamically isolated system). If the amount of
missing information increases, it would seem to stand to reason that the
amount of available or possessed information would, correspondingly,
decrease.

I believe there are at least a couple of things wrong with this idea.
Before explaining the problems let me, as a point of background, review
the nature of thermodynamic entropy a little. Entropy (of any kind) is a
measure of an amount of missing information. What kind of information is
it that is missing? It is the average minimal amount of further
information required to specify, with certainty, which specific outcome
obtains for some probabilistic process governed by a known probability
distribution. Each probability distribution has associated with it a
property which is a measure of its 'randomness' called its entropy.
Suppose that a particular random process has a number of distinct
disjoint outcomes {i | i=1,...N} associated with it. Let
{p_i | i=1,...N} be the set of probabilities such that p_i is the
probability for outcome i of the process. These probabilities are
positive (or at least non-negative) and they add up to exactly 1. This
is because the outcomes are disjoint and the sum of any subset of these
probabilities is just the composite probability of *any* one of the
outcomes in the subset happening. So if we add them all up we get the
probability for any one of all the outcomes happening. Since one of
these possible outcomes *must* happen it is certain (i.e. the probability
is 1) that one of them does, indeed, happen. Now the entropy of this
distribution is the sum (as i ranges from 1 to N) of the expression
p_i*log(1/p_i) where log(...) is a logarithmic function. The base of the
logarithms gives the units in which the entropy is to be measured.
Changing bases will multiply the entropy by a fixed unit conversion
factor. The *meaning* of the entropy is that it is the *average* minimal
amount of further information required to determine, with certainty,
which of the outcomes of the random process obtains given that all that
is known about the process is the probability distribution (i.e. the
values of {p_i} are assumed known) for it. If the base of the logarithms
is 2 then the entropy is measured in bits. If the base is 256 then the
entropy is measured in bytes. If the information is encoded using a
set of b distinct symbols, then the base of the logarithms is b and the
entropy is the average minimal number of these symbols needed to uniquely
distinguish which outcome obtains when the process operates.
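
As a concrete little sketch (my own illustrative code, nothing from the
original discussion), here is that entropy computed for a couple of toy
distributions in different bases:

    import math

    def entropy(probs, base=2):
        # Average missing information: sum over i of p_i*log(1/p_i),
        # with the log taken in the chosen base.
        return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

    print(entropy([0.5, 0.5], base=2))     # fair coin: 1.0 bit per toss
    print(entropy([0.9, 0.1], base=2))     # loaded coin: ~0.469 bits
    print(entropy([0.5, 0.5], base=256))   # same fair coin in bytes: 0.125

The fair coin is maximally random, so a full bit of information is
missing per toss; the loaded coin is more predictable, so less
information is missing.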

In thermodynamics the relevant probability distribution is the
probability that the thermodynamic system is in a particular
*microscopically* specified state given that all that is known about the
system is its *macroscopic* description. For a quantum system specifying
the microscopic state means specifying which wave function actually
describes the system's exact quantum state. For a classical system
specifying the microscopic state means specifying the exact position and
momentum vectors (3 components each) for every single microscopic
particle making up the system. Thus, a classical system of n-particles
requires that 6*n numbers be specified to determine the microstate of the
system. The macroscopic state of the system is specified by giving a few
globally defined parameters for it such as the total energy, total number
of particles, total volume, etc., etc. For each macroscopic description
there is a *huge* number of microscopic states consistent with it that
the system could actually be in. The system's entropy is the average
amount of further information required to exactly determine what the
system's microscopic state really is given just its coarse
macroscopic-level description. Each macroscopic description determines a
particular probability distribution for the possible microscopic states
consistent with it. The entropy of that distribution is the
thermodynamic entropy of the system. Now usually thermodynamic entropy
is measured in units of energy per (absolute) temperature rather than in
units of information such as the bit. The conversion factor between
these different kinds of units is related to Boltzmann's constant. If we
use the SI unit of entropy i.e. the joule/kelvin we find that
1 J/K = 1.044933 x 10^(23) bits = 1/(k*ln(2)) bits.
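
That conversion factor is just the reciprocal of k*ln(2); a one-line
sketch of the arithmetic:

    import math

    k = 1.380658e-23                  # Boltzmann's constant, J/K
    print(1.0 / (k * math.log(2)))    # ~1.044933 x 10^23 bits per J/K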

Since the time of Clausius it has been known that if a thermodynamic
system quasistatically absorbs a small amount of heat q while it is at
an absolute temperature T, then its entropy increases by q/T. Thus
suppose we have a bottle of some gas sitting on a table at room
temperature (298 K), and the temperature of the room increases by a
tiny fraction of a degree so that the gas absorbs a small amount of
heat from the room before re-equilibrating. Let's suppose that the
total amount of heat absorbed
by the gas is only 1 microjoule. Then the entropy of the gas will have
increased by q/T = 3.356 x 10^(-9) J/K = 3.506 x 10^(14) bits =
40821 gigabytes. Thus after the gas absorbs this tiny amount of energy
it now requires 40821 gigabytes more information than it required before
to specify just which exact microscopic state the system is in. The
reason for this increase is that when the gas particles get some extra
energy from the outside there are many more ways for those particles to
share the total energy that is available to them than there was before,
when they had less energy to partition among themselves. (More energy
means more ways to divide it up among the constituent particles.) We see
that even a little thermodynamic dissipation results in a huge increase
in entropy when it is measured in bits rather than in joules/kelvin.
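
Here is a quick sketch that reproduces those numbers (my own
illustration, using the figures quoted above):

    import math

    k = 1.380658e-23          # Boltzmann's constant, J/K
    q = 1.0e-6                # heat absorbed by the gas, joules
    T = 298.0                 # room temperature, kelvin

    dS_JK   = q / T                        # ~3.356 x 10^(-9) J/K
    dS_bits = dS_JK / (k * math.log(2))    # ~3.506 x 10^(14) bits
    dS_GB   = dS_bits / 8 / 2**30          # ~40821 gigabytes (2^30 bytes each)
    print(dS_JK, dS_bits, dS_GB)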

Now that we understand what entropy is we can better understand how the
2nd law relates to the information-theoretic understanding of a system's
entropy. Suppose we have a thermodynamically isolated system that is not
quite yet in equilibrium. As time goes on the system's entropy increases
until equilibrium is reached and the entropy stops rising. As the system
gains entropy, the distribution of possible microscopic states it can be
in becomes ever more random: more and more microscopic states become
available to the system as it approaches equilibrium, and so ever more
extra information is required to find out which microscopic state it is
actually in, since there are ever more possibilities to choose from.
Once the system reaches equilibrium the
probability distribution of microscopic states is maximally random (or
equivalently, the microscopic state is maximally uncertain, or
equivalently, the system's entropy is maximal).
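
A toy illustration of why more accessible microstates means more
missing information (my own sketch, assuming for simplicity that all W
accessible microstates are equally likely):

    import math

    # With W equally likely microstates the missing information is
    # log2(W) bits, so opening up more microstates raises the entropy.
    for W in (2, 1024, 10**6):
        print(W, math.log2(W))   # 1 bit, 10 bits, ~19.93 bits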

We thus see that the operation of the 2nd law in nature creates ever more
missing information (about the precise microscopic state of the system or
of the universe). But (and this is one place where I believe Spetner is
wrong) this creation of missing information does not necessarily mean
that the amount of possessed or known information must correspondingly go
down. The amount of possessed information is contained in the
macroscopic description of the system. This possessed information is a
*very* tiny fraction of the missing information contained in the system's
entropy. (IOW, our net ignorance greatly exceeds our net knowledge.) As
the system equilibrates, its macroscopic state changes by a little bit
and the amount of possessed information needed to describe the new
situation may also change a little accordingly. But while this is
happening, the missing info in the entropy is skyrocketing towards its
equilibrium value. There is *not* a conservation of the total amount of
potential information present whereby all possible info is either
possessed or missing (with a fixed sum of the two), and an increase in one
means a decrease in the other. Rather, what happens is the amount of
possessed info changes a little one way or the other while the amount of
missing info grows tremendously, and at all times greatly exceeds the
amount of possessed information. The only possible exception to this is
when the system is at absolute zero (not really attainable anyway) and the
system's entropy vanishes. In this weird case we essentially don't have
any missing info at all because we already know that a system at exactly
absolute zero must be in its lowest energy allowed microscopic ground
state.

The second problem with the idea/argument (based on the 2nd law) that
supposedly nature cannot create biological information is simply that
biological information is not the kind of micro-level info that is
relevant to the 2nd law. The kind of info needed to make new proteins or
nucleic acids is quasi-macroscopic info corresponding to particular
sequences of particular types of monomer units. This info is not the
microscopic-level info that is the currency of the entropy. Knowing the
monomer sequence of a given protein or nucleic acid might seem like a
very detailed microscopic thing to know, and it is much more detailed
than just specifying the total energy content, composition, volume and
mass of a given bottle of gas. *But* compared to the kind of detail
necessary to relate to the level of specification appropriate to the
thermodynamic entropy, it is quite a coarse description. In order to get
at the entropy-level of required info one would need to not only specify
the sequence of the types of monomers, one would also have to specify the
detailed microscopic state of each of the atoms and electrons inside each
monomer unit, as well as the particular state of the bond angles for each
joint between those monomers, as well as the exact state of each of the
particles that make up each of the surrounding solvent molecules that the
macromolecule finds itself enveloped in, as well as the exact state of
all the impurity molecular species dispersed throughout that solvent
mixture, as well as the exact microscopic atomic/electron-level
description of the rest of the relevant cellular environment of those
macromolecules. Such a microscopically detailed level of description
involves so much extra information beyond that contained in merely the
monomer sequence that adding or subtracting the entire genome's
sequence data to/from that microscopic entropy-level info is a drop in
the bucket, as irrelevant as the shape of the bottle of gas on the
table is to the information-theoretic processes going on at the level
of the entropy. The amount of sequence info
represented by the genome of some organism is maybe some
10^9 ~ 10^10 bits or so. The microscopics of the organism loses track
one way or the other (increasing or decreasing) of a comparable amount of
information as it gains or loses heat to/from its environment every time
about 10^(-11) joules is exchanged with that environment. (Try comparing
this to a daily diet for a human of about 8 megajoules total throughput
carrying a thermodynamic entropy of some 3 x 10^27 bits representing some
10^18 times more info than is contained in the whole genome).
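
A rough sketch of that comparison (my own numbers: I'm assuming a body
temperature of about 310 K, and taking the genome and diet figures
quoted above):

    import math

    k = 1.380658e-23                 # Boltzmann's constant, J/K
    T_body = 310.0                   # assumed body temperature, kelvin
    bits_per_JK = 1.0 / (k * math.log(2))

    genome_bits = 1.0e10             # upper rough figure for genome sequence info

    # Heat exchange that shuffles roughly a genome's worth of microscopic info:
    q_small = genome_bits * T_body / bits_per_JK
    print(q_small)                   # ~3 x 10^(-11) J, the order quoted above

    # Daily food throughput of ~8 MJ expressed as thermodynamic entropy:
    diet_bits = 8.0e6 / T_body * bits_per_JK
    print(diet_bits)                 # ~2.7 x 10^27 bits
    print(diet_bits / genome_bits)   # ~2.7 x 10^17; ~10^18 for a 10^9-bit genome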

David Bowman
dbowman@georgetowncollege.edu