Re: Entropy

From: David_Bowman@georgetowncollege.edu
Date: Wed Oct 25 2000 - 02:42:44 EDT


    Since responding to DNAunion's long 2-part post of 22 OCT would require
    more time than I have, and since any response I would make to it would
    most likely be met with a further interminable barrage of low signal-to-
    noise ratio responses that would be demanding of further cycles of
    correspondence on my part, I have decided to spare the reflector
    readership from suffering through it all by not responding to those
    posts in the first place.

    However, DNAunion's last post of 24 OCT on this thread has been much more
    focused and to the point, and since it can possibly be responded to without
    miring the reflector in another worthless argument, I have taken the risk
    of responding to it.

    Regarding:

    >>> DNAunion: All life requires that it actively maintain itself far above
    >thermodynamic equilibrium. For an acorn to grow into an oak, it must fight
    >against, and "overcome", entropic tendencies at every moment along the way.
    >This example does not contradict my statements.
    >
    >>>FMAJ: Exactly. This far-from-equilibrium thermodynamics is exactly what
    >drives evolution and the creation of complexity. So what does this show?
    >
    >>>DNAunion: It shows that there *is* something that opposes matter's being
    >organized in complex ways, which must be continually fought: when it is
    >battled, it *can* be "overcome".
    >
    >How many times do I have to explain this? I am *not* stating that increases
    >in order or complexity *cannot* occur, just that in order for them to
    >occur, entropy must be *overcome*. Entropy *is* something that opposes
    >matter's being arranged in organized and complex ways.

    I think DNAunion's point here has been clear all along. My point is that
    entropy does *not* have to be 'overcome' for matter to organize itself
    in complex ways if it so organizes. Rather, the organization itself is a
    result of the system in interaction with its environment generating
    entropy *according to* the 2nd law (not in opposition to it). In the
    case of the interestingly organized dissipative structures that occur in
    far-from-equilibrium systems the entropy generated by the action of the
    system in conjunction with the relevant part of its environment is
    greater and is generated at a greater rate than if the system did not
    organize itself in such a dissipative structure. The system is not
    'fighting' entropy. Rather, the 2nd law is constraining and guiding the
    system's behavior in conjunction with the system's own particular
    internal dynamics.

    >>>Chris: Actually, I don't think that the complexity of the Universe as a
    >whole changes at all. *Organization* changes, of course. But, randomness is
    >as complex as you can get; it's just not organized in ways that we would
    >recognize as such.

    I could quibble with the first part of this if I was in the mood to
    debate the point. But I'm not so inclined. The last part is a good
    point.

    >The complexity of randomness is what makes the claims that random processes
    >cannot produce complexity ironic; that's what random processes are *best* at
    >producing. What they are not so good at is producing simplicity and
    >systematic organization.
    >
    >DNAunion: Good point about complexity related to randomness.
    >
    >Seeing that David Bowman is so much more informed on the subject than I (not
    >being smart, he obviously does know much more than I do on entropy,
    >thermodynamics, and its relation to complexity), I would like to ask him a
    >question.
    >
    >I have heard both complexity and randomness defined in terms of a measure of
    >the degree of algorithmic compressibility.

    Like many of the other key terms involved in information-theoretic
    considerations, the notions of 'complexity' and 'randomness' have been
    given multiple incompatible definitions by different authors. The lack
    of a single nomenclature for the field helps make it more arcane than it
    needs to be.

    But I think usually the notions of 'complexity' and 'randomness',
    although related, are not synonymous. Usually the notion of 'complexity'
    is taken to mean (a la Kolmogorov & Chaitin) some measure of the length
    of the shortest possible algorithm required to reproduce or reconstruct
    the thing whose 'complexity' is being discussed. The ideas of
    'compressibility' and 'complexity' (using the above definition) *tend* to
    be typically *antithetical*. Suppose that the things being described are
    sequences of symbols. A highly compressible sequence is one whose
    shortest algorithm for reproducing it (i.e. its maximally zipped version)
    is much shorter than the original sequence. Such a sequence is not very
    complex (relatively speaking) because the shortest reproduction
    algorithm is relatively short. Conversely, a highly incompressible
    sequence is one whose complexity (length of shortest reproduction
    algorithm) is nearly as long as the sequence itself. It is possible,
    though, for a sequence to be *both* highly compressible and yet still be
    complex. An example of this is a sequence so very long that, even with a
    high compression ratio that makes its shortest reproduction algorithm
    much shorter than its original length, that shortest algorithm is *still*
    very long on some relevant scale of measurement. Also it is possible for
    a sequence to be both
    incompressible and not complex. Any sequence that is relatively
    unpatterned and originally very short will be both incompressible and
    non-complex.
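
    To make the compressibility idea concrete, here is a rough Python sketch
    of my own (true Kolmogorov complexity is uncomputable, so the output
    length of a general-purpose compressor is used only as a crude
    upper-bound stand-in for the length of the minimal reproduction
    algorithm):

      import zlib

      def compressed_length(s):
          # Length in bytes of the zlib-compressed form of s -- a crude
          # upper bound on the length of its shortest reproduction algorithm.
          return len(zlib.compress(s.encode("ascii"), 9))

      patterned   = "AB" * 500_000     # very long, but highly compressible
      short_messy = "XQJVZKWPYFNGRD"   # short and without obvious pattern

      print(len(patterned), compressed_length(patterned))
      # ~1,000,000 symbols squeeze down to a few thousand bytes: compressible,
      # yet the compressed form is still sizable, so some complexity remains.
      print(len(short_messy), compressed_length(short_messy))
      # compression buys nothing here (overhead can even make it longer):
      # incompressible, but also not complex, because it is so short.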

    The term 'randomness' has a wider range of meanings. Often the property
    of 'randomness' is taken to be characteristic of a parent ensemble or
    distribution that is devoid of mutual dependences, correlations and
    nonuniformities in probabilities among the possible outcomes.
    For distributions with a high randomness it is very hard to predict the
    outcomes (with any accuracy above that of wild guesses) prior to their
    being realized. Distributions with this property tend to have a high
    entropy, in that the background prior information about the distribution
    is woefully inadequate to determine the outcomes--necessitating much
    further information (on average) to be supplied
    to reliably identify any given outcome.
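
    As a rough illustration of my own (taking base-2 Shannon entropy as the
    measure), the 'randomness' of a distribution in this sense can be
    quantified by its entropy in bits:

      from math import log2

      def entropy_bits(probs):
          # Shannon entropy H = -sum p*log2(p), in bits, over nonzero p.
          return -sum(p * log2(p) for p in probs if p > 0)

      print(entropy_bits([0.5, 0.5]))   # 1.0   -- fair coin: maximally random
      print(entropy_bits([0.9, 0.1]))   # ~0.47 -- biased coin: more predictable
      print(entropy_bits([1.0]))        # 0.0   -- certain outcome: no randomness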

    Another usage of the term 'randomness' applies to sequences
    or time series generated by some stochastic process. Time series
    drawn from stochastic processes having a high randomness tend to pass
    various statistical tests designed to detect it. But it is the nature
    of the beast that any battery of tests for randomness can easily be
    fooled by nonrandom sequences that are sufficiently enciphered, and
    false positives are the bane of any test for randomness. Sequences
    with the property of randomness are 'white noise' in the sense that
    the mean power spectrum of such sequences is flat in frequency
    space, and there are no correlations among the individual
    time-dependent members of the sequences in the 'time domain'. Each
    member of a sequence is statistically independent of all the other
    members. This latter usage of the term 'randomness' has a connection
    with the notion of complexity in that a stochastic process that
    generates random sequences, i.e. a fully random process, has the
    property that the sequence so generated is maximally complex, i.e. the
    shortest algorithm for reproducing the sequence is just a listed copy of
    the sequence itself. Random sequences (i.e. sequences generated by fully
    random processes) are both incompressible and maximally complex.
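
    As one small illustration of my own (not a serious test suite) of the
    kind of check such a battery of tests might include: map H/T to +1/-1
    and estimate the autocorrelation at some lag, which should be near zero
    for a white-noise sequence.

      def autocorrelation(seq, lag):
          # Sample autocorrelation of an H/T string at the given lag.
          x = [1 if c == "H" else -1 for c in seq]
          n = len(x)
          m = sum(x) / n
          num = sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag))
          den = sum((xi - m) ** 2 for xi in x)
          return num / den

      print(autocorrelation("HHTT" * 7, 4))
      # strongly positive: the period-4 pattern betrays itself at lag 4
      print(autocorrelation("HTTHHHTHHTTHTHTHHHHTHHTTTHTT", 4))
      # much smaller for a 28-flip string with no comparable lag-4 structure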

    >HHTTHHTTHHTTHHTTHHTTHHTTHHTT
    >
    >HTTHHHTHHTTHTHTHHHHTHHTTTHTT
    >
    >The first coin flip example can be described by "repeat HHTT 7 times". Even
    >if the sequence were extended out to a million symbols, the description would
    >become hardly longer, something like "repeat HHTT 250,000 times". The
    >sequence has a very short minimal algorithm that fully describes it, so is
    >not random (nor complex?).

    The first sequence is neither complex nor random (with near certainty).
    It's not complex because its minimal length reproduction algorithm is
    short; it's not random (with overwhelming probability) because it
    easily fails many tests for randomness. But there *is* a *very* tiny
    probability (1/2^28) that the first sequence could have been generated
    by a random process, if the process that generated it just happened to
    turn up the HHTT pattern 7 times in a row just when the sequence
    was being sampled.
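
    Just to put numbers on it (my own illustration): the chance that 28
    independent fair flips land in exactly that order, and the short
    'algorithm' that reproduces the whole sequence:

      print(1 / 2**28)      # about 3.7e-09
      print("HHTT" * 7)     # HHTTHHTTHHTTHHTTHHTTHHTTHHTT, i.e. 'repeat HHTT 7 times'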

    >But the second coin flip example has no such shortcut. The shortest
    >algorithm that fully describes the sequence is (for all we can tell), the
    >sequence itself. Therefore, it is random (and/or complex?).

    It *seems to be* possibly maximally complex. But even this degree of
    complexity is only at most 28 bits. In the overall scheme of things
    28 bits is not very much complexity. Again, I haven't applied any
    test for randomness to it, but I suspect that the probability that it
    could have been generated by a fully random process is much greater
    than the example of the first sequence. But it, presumably, could
    also easily have been generated by a decidedly non-random process that
    only looks random (for at least these 28 bits worth) w.r.t. whatever
    tests for randomness that it passes.

    >PS: I also understand that a more detailed look for randomness would involve
    >taking the symbols 1 at a time, then 2 at a time, then 3 at a time, ... and
    >seeing how well the observed distribution of possible symbol grouping matches
    >the expected distribution.

    Yes, there are tests for randomness that check these things.
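
    A minimal sketch of my own along the lines DNAunion describes: count the
    k-symbol blocks actually observed and compare them with the uniform
    count expected if all 2^k blocks were equally likely in a fair,
    independent process.

      from collections import Counter

      def block_counts(seq, k):
          # Observed counts of overlapping k-symbol blocks, plus the count
          # each of the 2**k possible blocks would have if all were equally
          # likely.
          blocks = [seq[i:i + k] for i in range(len(seq) - k + 1)]
          return Counter(blocks), len(blocks) / 2**k

      observed, expected = block_counts("HHTT" * 7, 4)
      print(observed, expected)
      # only 4 of the 16 possible 4-symbol blocks ever occur -- a glaring
      # departure from the roughly 1.6 occurrences apiece that uniformity
      # would predict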

    > Does this also apply to complexity?

    If the sequence is at all compressible, it is possible that an algorithm
    that generates it makes use of patterns, correlations, and
    nonuniformities in the frequencies of various multi-symbol subsequences
    to save space in its maximally compressed form.

    >Basically, in terms of symbol sequences, what is the difference between
    >randomness and complexity?

    The complexity is effectively the length of its maximally compressed form
    (length of minimal length reproduction algorithm). Randomness is
    typically not a property of a given sequence. Rather it may be a
    property of the parent process that generated it. If a very long
    sequence is incompressible in that its complexity is nearly its original
    length then that sequence is a viable candidate for having been generated
    by a fully random process. Recall that a sequence generated by a good
    (pseudo)random number generator will tend to not have any obvious
    patterns that could be used to fail a test for randomness. Such a
    pseudo-random sequence *looks* random and may pass a battery of tests for
    randomness, but nevertheless such a sequence is highly compressible and
    is not very complex because the algorithm that generates it is much
    shorter than the sequence generated. Also, using the other meaning
    mentioned above for the term 'randomness' it is possible to consider a
    numerical measure for the randomness of a probability distribution that
    governs the realizations of an ensemble of possible sequences of a given
    length to be just the entropy of that generating distribution.
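
    For instance (a sketch of my own, using Python's built-in generator): a
    few lines of code plus a seed reproduce a million 'coin flips' exactly,
    so the output is highly compressible -- and hence not very complex --
    even though it will typically pass simple statistical tests for
    randomness.

      import random

      random.seed(2000)   # any fixed seed; the choice here is arbitrary
      seq = "".join(random.choice("HT") for _ in range(1_000_000))
      print(len(seq))     # 1,000,000 symbols, all reproducible from
                          # these few lines of code plus the seed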

    >I have heard even physicists (in works directed
    >at a general audiences) use the terms interchangeably.

    Yes, and lamentably, such interchanging of key terminology is not always
    restricted to works aimed at general audiences.

    David Bowman
    David_Bowman@georgetowncollege.edu


