Re: [Fwd: Re: Kirk Durston on information theory]

From: NSS (kirk@newscholars.com)
Date: Mon Nov 10 2003 - 14:36:54 EST


    Denyse, if a question is posted publicly, then make my answer public.
    Otherwise, private correspondence will do. Since I do not participate in
    whatever list this is going on, I'll leave this up to you. I've posted
    this to the ASA email address, but since I am not a member of that list
    (and do not have time to join, alas) it may be bounced, so please ensure
    that my response is posted. Thanks.

    Kirk

    Intro:

    >> This hypothesis yields two falsifiable predictions: a) natural processes
    >> will never be observed to produce more than 70 bits of information,
    >> and b) any configuration or sequence that is actually observed to be
    >> produced that contains more than 70 bits, will always be produced by
    >> an intelligent agent.

    Question:
    > Please explain exactly how this could be falsified. If someone says,
    > "look at all these natural proteins which to all appearances
    > developed gradually during evolution" you will say, "they contain
    > over 70 bits of CSI, therefore they must be designed." If we could
    > directly observe that they came about by a string of individual
    > mutations, you could still say, "it contains 70 bits of CSI,
    > therefore God was directing each mutation and its fixation along the
    > way."
    >
    > So what could possibly count as falsification?
    >
    > Preston G.

    Response:

    First, just a couple points of clarification.

    I'm not familiar enough with Dembski's work to use the term 'CSI'. I would
    prefer to simply use the term 'functional information' as discussed in Jack
    Szostak's very brief article 'Molecular information', Nature 423 (2003),
    689. As for why I prefer Shannon's approach to information over the
    Kolmogorov-Chaitin approach, see Adami & Cerf, 'Physical complexity of
    symbolic sequences', Physica D 137 (2000), 62-69. Also, I think Shannon
    information is reasonably well understood by biologists and relatively
    uncontroversial. I do look forward to reading Dembski's works, but have not
    yet had the time to do so.
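
    Szostak's measure can be sketched in a few lines: the functional
    information of a function is the negative log of the fraction of all
    possible sequences that perform it. The sketch below is my illustrative
    reading of that definition, and the numbers in it are hypothetical.

```python
import math

def functional_information(n_functional, n_total):
    """Functional information in Szostak's sense:
    I = -log2(F), where F is the fraction of all possible
    sequences that perform the function in question."""
    return -math.log2(n_functional / n_total)

# Hypothetical example: if 1 in 2**20 random sequences
# performs the function, the function carries 20 bits.
print(functional_information(1, 2**20))  # -> 20.0
```

    The rarer the functional sequences are among all possible sequences, the
    more bits of functional information they carry.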

    There are some problems with the question. I will respond to each of them
    and then suggest two empirical approaches to verification or falsification.
    First, note that both predictions use the term 'observed'. Of course,
    proteins already exist; we did not observe their initial construction. All
    we can observe today is the copying or replication of existing functional
    information, which itself requires no new information. There is also a
    danger of circular reasoning when proteins are described as 'natural', as
    they are in the question. Referring to proteins as 'natural' assumes that
    they were produced by natural processes, and we do not know that; we did
    not observe how they came into existence. The problem we actually face is
    that the average 300-residue protein requires roughly 500 bits to encode,
    and we observe no natural process that can produce such a thing, so we are
    left wondering how it happened. Calling proteins 'natural' and then
    concluding that, since they are natural, natural processes can produce
    them is circular reasoning, so we cannot simply assume that they are
    'natural'. Of course, it is an empirical fact that intelligent agents,
    such as humans, can produce vast amounts of functional information. Couple
    that with our empirical observation that nature cannot seem to produce
    more than a few dozen bits of information at best, and the hypothesis I
    suggested is the most rational position to hold, supported by empirical
    observation.
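
    As a back-of-envelope check on the figures above (my arithmetic, supplied
    for illustration): a fully conserved site can carry at most log2(20) bits,
    so the ~500-bit figure for a 300-residue protein implies an average of
    well under 2 bits per residue, i.e. most sites tolerate some substitution.

```python
import math

max_bits_per_residue = math.log2(20)  # a fully conserved site among 20 amino acids

residues = 300    # average protein length cited above
total_bits = 500  # encoding requirement cited above

print(round(max_bits_per_residue, 2))   # ~4.32 bits at a fully conserved site
print(round(total_bits / residues, 2))  # ~1.67 bits per residue on average
```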

    With respect to the possibility that God might be tinkering with
    experiments behind the scenes: when we do science, we make a tacit
    assumption that He is not. There would be ways to test for this if we had
    such suspicions, but I see no reason to believe that He is tinkering.
    Furthermore, the same objection could be raised against any possible
    falsification of any hypothesis; one could always claim that hypothesis X
    was not really falsified because God was tinkering behind the scenes. So
    if we are going to do science, we must assume that He is not.

    There are probably many ways to test the hypothesis I presented, but I
    will outline only two. The first is to focus on a bacterial protein that
    has at least six highly conserved residues, any one of which, if
    substituted, renders the protein non-functional. Population A would have
    one of these residues substituted, population B two, population C three,
    and so on. Knowing the size of each population and its replication times,
    and having in place some way of detecting when the gene (protein) regains
    function within a population, one could observe how many trials were
    required to mutate the gene back to functionality and compare the
    experimental results with the number of trials predicted from probability
    calculations. For amino acids that must be conserved, the functional
    information they carry is maximized at 4.3 bits per highly conserved
    residue. Armed with the experimental data, one could plot the results
    against what probability theory predicts and see whether there is an upper
    limit to the amount of functional information that natural processes can
    produce. Similar work has been underway for a number of years at the U of
    Wisconsin, and the data thus far correlate well with what one would
    predict from probability theory. There also appears to be an upper limit
    for the amount of functional information that can be regenerated, and it
    is surprisingly modest: significantly less than 40 bits. Note, too, that
    the more functional information that must be regenerated, the more likely
    it is that other areas of the temporarily non-functional gene will be
    degraded while it is non-functional, so this is not simply a matter of
    running the experiment longer; the experiment works against time. The
    proposed experiment does two things. First, since the experimental data
    correlate well with what theory predicts, we can be more confident in our
    assumption that God is not tinkering behind the scenes. Second, since the
    data indicate an upper limit for the regeneration of functional
    information well below 70 bits, the hypothesis is not falsified.
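
    The probability side of this experiment can be sketched as follows. This
    is a simplistic model I am supplying for illustration (each trial is
    assumed to draw each knocked-out site uniformly from the 20 amino acids),
    not the exact calculation used in the Wisconsin work.

```python
import math

def expected_trials(k):
    """Expected number of random trials to simultaneously restore
    k knocked-out residues, each drawn uniformly from 20 amino acids."""
    return 20 ** k

def bits_regenerated(k):
    """Functional information regenerated: log2(20) ~ 4.32 bits
    per fully conserved residue."""
    return k * math.log2(20)

for k in range(1, 7):  # populations A through F in the proposed experiment
    print(k, expected_trials(k), round(bits_regenerated(k), 1))
```

    Under this toy model the six-residue population already needs about
    20**6 = 64 million trials to regenerate only ~26 bits; a 40-bit ceiling
    would correspond to roughly nine conserved residues and on the order of
    10**11 trials, which illustrates why the observed limit is so modest.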

    There is a second, computational approach. One can run a simulation, such
    as that of Lenski et al., 'The evolutionary origin of complex features',
    Nature 423 (2003), 139-144, to see what sort of informational jumps can be
    accomplished through a random walk. (Of course, I am assuming that
    computers work within the laws of nature, that God is not tinkering with
    the processor, and that the simulation is not designed to cheat.) We can
    then compare the results with theoretical predictions and see whether
    there is a correlation. Again, the results appear to correlate well, and
    there seems to be an upper limit. Lenski's simulation could achieve
    smaller, intermediate jumps in functional information, but it could not
    achieve a 32-bit jump. I have worked the numbers, and in theory Lenski's
    simulation can achieve a 32-bit jump; he just needs to run his program a
    bit longer. He will never, however, achieve a 70-bit jump, and I can say
    that with confidence. Incidentally, something Lenski either does not
    realize, or chose not to discuss, is that one can achieve 32 bits by
    building in intermediate, selected-for states, exactly as he did. What he
    neglected to note is that doing so always requires inputting at least 32
    bits of information into the virtual fitness landscape. In general, one
    can reduce the amount of information acquired through a random walk by
    inputting information into a fitness landscape that guides the walk. The
    problem is then merely shifted from explaining where the information in
    the protein came from to explaining where the information in the fitness
    landscape came from. The point is moot in the real world, however, because
    the fitness factor in the regions of sequence space between real-world
    regions of folding sequence space is zero and perfectly flat: there appear
    to be no closely spaced islands of stable, folding sequences between the
    major 3-D structural families in protein topologies.
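
    The arithmetic behind the point about intermediate, selected-for states
    can be sketched as follows. This is my illustration, not Lenski's Avida
    code: a direct b-bit jump costs on the order of 2**b random trials, while
    a landscape that rewards intermediates lets the walk pay for each step
    separately, so the costs add instead of multiply.

```python
def direct_trials(bits):
    # Expected order of magnitude of trials for one unguided b-bit jump.
    return 2 ** bits

def stepped_trials(step_bits):
    # With a selected-for intermediate after each step, the walk
    # pays for the steps one at a time: costs add, not multiply.
    return sum(2 ** b for b in step_bits)

print(direct_trials(32))             # 4294967296 trials: one direct 32-bit jump
print(stepped_trials([8, 8, 8, 8]))  # 1024 trials: same 32 bits, but the
                                     # landscape itself had to encode the steps
```

    The four-step walk is cheaper by a factor of about four million, which is
    exactly why the information built into the fitness landscape, not the
    random walk, is doing the work.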

    If either of the two experiments proposed above falsified one of the
    predictions, the hypothesis would have to be abandoned or reworked. For
    the present, however, all the data seem to support my hypothesis.

    Cheers,

    Kirk



    This archive was generated by hypermail 2.1.4 : Tue Nov 11 2003 - 20:16:24 EST