RE: Design detection and minimum description length

From: Adrian Teo (ateo@whitworth.edu)
Date: Tue Nov 26 2002 - 18:35:19 EST


    This has been a fascinating discussion thus far. If you will indulge
    me for a moment, and let me think out loud as I try to make sense of
    what Glenn and Iain are saying:

    Iain offers the example of a "perfect deal" in a game of bridge.
    Anyone who sees this situation would become highly suspicious,
    BUT, only if that person has some knowledge of card games and
    numbers. And that is precisely Glenn's point. For a person who has
    never seen a pack of cards, and doesn't know anything about written
    numbers, the "perfect deal" is unintelligible and random.

    BUT, even this naive person would become suspicious of the fact that
    each person received all the cards with the same shapes (suits) and
    colors. No additional side knowledge is necessary for one to detect
    similarities in shape or color. And in my conversations with Steve
    Meyers, a colleague of Dembski, this is the sense of what they are
    trying to get at (although it may not be precisely what Dembski was
    trying to get at in his book).

    Of course, I think that Glenn may further argue that there is in fact
    additional side knowledge required - the knowledge that comes from
    personal experience in the natural world; the knowledge that any
    large enough collection of objects of different shapes and colors
    almost never sorts itself out into neat categories without
    external assistance.

    At this point, I think the IDer would respond by agreeing with Glenn
    and say that this is the minimum requirement - the knowledge that
    comes from personal experience in the natural world (a loose way of
    describing Dembski's specificity). The IDer would perhaps add that
    when trying to detect design in any specific domain (e.g. card games
    or biological systems), the person with greater experience in that
    domain (i.e. an expert card player, or analogously, a biologist) would be
    better able to detect design than the one without the experience
    (i.e. non-card player, non-scientist).

    Therein lies (I think) Glenn's problem with ID. One would have to
    acquire *sufficient* information/experience in that particular domain
    before one is able to detect design. In Glenn's case, he is saying
    that sufficiency is defined as being told that the system is in fact
    designed, and if so, then the method is totally useless. I believe
    the IDer would say that sufficiency is any level of knowledge that
    allows one to form a base line, a norm, so that what is extraordinary
    would pop up and become immediately obvious.

    At this point, I think the IDer, minimally, has to make the implicit
    assumption that there are two orders in the natural world - the
    extraordinary designed and detectable, and the ordinary
    design-undetectable. The sophisticated IDer would further make the
    disclaimer that the ordinary design-undetectable may also in fact be
    designed, but there is no way we can detect them to be as such. Glenn
    is right - the IDer would have to assume fundamentally that there is
    intelligent design in the natural world, although some instances are
    detectable and some aren't.

    Here is a problem for ID:
    As scientific knowledge increases, the base line changes along with
    it. What would have appeared as designed to a scientist living 25
    years ago with limited scientific knowledge would now appear to be
    just a part of ordinary physical occurrences (i.e. part of the base
    line) to a scientist living today, with greater knowledge and
    experience of that particular domain. Whether I can detect something
    as designed is so dependent on my a priori knowledge of how nature
    works in that particular domain. How then can one be sure that one's
    conclusion of design in a particular case is not really just another
    case of the ordinary?

    Iain's examples of detecting mathematical relationships/correlations
    seem irrelevant to me. What would perhaps be a better analogy would
    be the detection of causality. As in the case of attempting to detect
    design, one may be able to say that an event (no pattern) is so
    improbable that we have to reject the null and conclude that there is
    a pattern, just as in detecting a correlation, one concludes that the
    null hypothesis is so improbable that we reject it and therefore
    conclude that there is a relationship. But it is an entirely
    different matter to go from pattern to design, which would be
    analogous to going from relationship (correlation) to causation.

    Adrian.

    -----Original Message-----
    From: Iain Strachan [mailto:iain.strachan@eudoramail.com]
    Sent: Sun 11/24/2002 4:17 PM
    To: Iain Strachan; asa@calvin.edu; Glenn Morton
    Cc:
    Subject: RE: Design detection and minimum description length

            I wrote:

    >>Here are the two positions:
    >>
    >>Glenn's position:
    >>
    >>Dembski's method is no good because it can never eliminate the
    >>possibility of design. If I send him a text that is encoded with a
    >>Vigenère cipher that is the same length as the text, it will appear
    >>random, and he will say it is undesigned, until I tell him that it is
    >>designed. It is therefore totally useless because it fails to
    >>discriminate between designed and undesigned.
    >>
    >>My position:
    >>
    >>Dembski's method only seeks to verify design that can be verified by
    >>observing something that has low probability. If the methodology
    >>fails to detect design, all it will say is that we can't make a
    >>design inference. Saying "we cannot make a design inference" is not
    >>the same as saying "we infer that it is not designed".
    >

            Glenn wrote:

    >No, this is not Dembski's methodology. He defines terms like 'complex'
    >and 'specified' and puts the emphasis on specified.

            <Dembski quote snipped>

    >Thus, it is not merely improbability that indicates design.

            I agree that it is not merely improbability that indicates design;
            that specification and complexity are both required. But I don't
            think that is the bit of the methodology that you were criticizing.
            As I understand it, you are criticizing Dembski for being unable to
            detect design when it is there, as in the case of a Vigenère
            cipher, with the length of the key equal to the length of the text.
            You further imply that Dembski will say that such a text is
            "undesigned". I am saying that the answer would be that we simply
            don't have enough data in this case to make a design inference, and I
            really can't see what's wrong with that. What is at issue is whether
            you can positively say something is obviously designed, not whether
            you can always detect it.

            I further argued that it is exactly analogous to attempting to fit a
            polynomial through a set of 10 data points. If the data were
            generated by a polynomial of degree 10 or more, then the problem is
            underdetermined, just as it is underdetermined when the Vigenère
            cipher is the same length as the text. There are an infinite number
            of sets of polynomial coefficients to choose from to get an exact fit
            to the data in the data fitting case, and in addition you can make
            the curve do anything you like between the data points. Similarly it
            is possible to generate any meaningful message one wants from a
            random sequence of letters by the appropriate choice of cipher key in
            the Vigenère cipher case.
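
            The point about the cipher key can be sketched in a few lines of
            Python. This is only an illustration under my own assumptions: an
            uppercase 26-letter alphabet, a key exactly as long as the text,
            and made-up strings (the "random" ciphertext and both messages are
            hypothetical). The helper names are mine as well.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: uppercase A-Z only


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Shift each plaintext letter forward by the matching key letter."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, key)
    )


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Shift each ciphertext letter back by the matching key letter."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, key)
    )


def key_for(ciphertext: str, wanted_plaintext: str) -> str:
    """Build the full-length key that turns ciphertext into wanted_plaintext."""
    if len(ciphertext) != len(wanted_plaintext):
        raise ValueError("the key must be as long as the text")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(p)) % 26]
        for c, p in zip(ciphertext, wanted_plaintext)
    )


ciphertext = "QXJZKWPMVB"  # hypothetical 10-letter "random" string
for message in ("ATTACKDAWN", "RETREATNOW"):
    key = key_for(ciphertext, message)
    print(key, "->", vigenere_decrypt(ciphertext, key))
```

            Both messages come out of the same ten letters, each with its own
            key, which is why the ciphertext alone cannot settle which one (if
            any) was intended.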

            <Dembski second quote snipped concerning specification and complexity>

    >
    >Thus a sequence of meaningless alphabetic gobbledygook 107 characters
    >long has a 1 out of 10^151 chance of occurring. It is an exceedingly
    >low probability. Indeed the last sentence has 130 characters (excluding
    >spaces). That is an extremely low probability event. Dembski would say
    >it is specified because it has meaning. But an equally long sequence of
    >random characters, he would say is not specified. Your definition above
    >totally forgets the specified part of Dembski's method.
    >
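
            As a rough check on the figure quoted above, here is the
            arithmetic in Python. It assumes a 26-letter alphabet with every
            letter equally likely and independent, which is my assumption and
            not something stated in the quote; the function name is mine.

```python
import math

ALPHABET_SIZE = 26  # assumption: 26 equally likely letters, no spaces


def log10_sequences(length: int) -> float:
    """Return log10 of the number of distinct letter strings of this length."""
    return length * math.log10(ALPHABET_SIZE)


for length in (107, 130):
    exponent = log10_sequences(length)
    print(f"{length} letters: about 1 chance in 10^{exponent:.1f}")
```

            For 107 letters the exponent comes out a little above 151, so the
            chance of any one particular string is about 1 in 10^151.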

            I don't think it does. If I conclude that a third order polynomial
            gives the best fit to my data, then I have specified the three
            coefficients required. The analogue of "it is specified because it
            has meaning" is "it is specified because it was generated by a third
            order polynomial (i.e. had a recognisably intelligible mathematical
            meaning)".

    >Iain, I will absolutely agree with you that mathematical functions
    >numbers can be detected. Much of science is built upon such things.

            Well, here's something we can agree on, thank goodness :-)

    >One observes a quantifiable phenomenon in nature and then discovers an
    >equation which will match the behavior. Fine. We all know that can
    >occur. But does that mean it is designed?

            No, I agree it doesn't mean it was designed. In most of science it
            means that there is a physical relationship that gives rise to the
            correlation. But my point wasn't to say that the existence of a
            mathematical law proves design; just to say that the statistical
            method (e.g. minimum description length) which serves to detect
            natural laws can equally be applied to the detection of design. Of
            course it then becomes debatable as to whether it was _intelligent_
            design. It might be the case that you could say "evolution designed
            it". But the trouble was that you were attacking the basic
            methodology, and not the conclusion that was drawn from it.

    >
    >Now, having yielded on the point in mathematics, I will point out to
    >you that none of my examples have been mathematical. They have been
    >sequences of letters as indeed, DNA is. Neither is determined by
    >equation or mathematical functions. So, in my opinion, your
    >mathematical equations are irrelevant to what I have been talking about.

            OK, here's a non-mathematical example, to which exactly the same
            ideas can be applied. The following is a representation of a bridge
            hand that I copied out of the Sunday paper:

            North:
            S: 632
            H: J532
            D: 93
            C: AJ43

            South:
            S: A-J75
            H: AQT9
            D: QJ
            C: 9

            East:
            S: T4
            H: 8-64
            D: T62
            C: K752

            West:
            S: 98
            H: K
            D: AK8754
            C: QT86

            The convention I have adopted is to represent each card's rank as a
            single character, with T representing a 10. I've simply listed the
            cards, but if there is a run of 3 or more in a row, I put the first
            and last card with a dash in the middle. A-J -> AKQJ. Leaving aside
            the "Framework" of "North" etc, and the suit indicator, which will be
            constant in the description of a hand, there will be a varying number
            of symbols to describe the actual data. In this particular hand
            there are 51 symbols out of a maximum of 52, due to the single run of
            four cards, the A K Q J of spades held by South. There is nothing
            unusual about the distribution of the cards, as far as I'm aware.
            Now here's another bridge hand, just as likely to occur as the first
            one:

            North:
            S: A-2
            H:
            D:
            C:

            South:
            S:
            H: A-2
            D:
            C:

            East:
            S:
            H:
            D: A-2
            C:

            West:
            S:
            H:
            D:
            C: A-2

            This is what is termed the "perfect deal" where all four players
            receive 13 cards of the same suit. Now the variable part of my
            descriptor has gone down to 12 symbols rather than 51.
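
            The coding scheme described above can be sketched in Python to
            check the two symbol counts. The data layout (each hand as a list
            of four strings in the order S, H, D, C) and the function names
            are my own choices for illustration.

```python
RANKS = "AKQJT98765432"  # high to low, with T standing for 10


def encode_suit(cards: str) -> str:
    """Encode one suit holding; a run of 3 or more becomes first-dash-last."""
    positions = sorted(RANKS.index(c) for c in cards)
    out = []
    i = 0
    while i < len(positions):
        j = i
        # Extend j while the next card held is the next rank down.
        while j + 1 < len(positions) and positions[j + 1] == positions[j] + 1:
            j += 1
        if j - i + 1 >= 3:
            out.append(f"{RANKS[positions[i]]}-{RANKS[positions[j]]}")
        else:
            out.extend(RANKS[p] for p in positions[i : j + 1])
        i = j + 1
    return "".join(out)


def symbol_count(deal: dict) -> int:
    """Count symbols in the variable part of the description."""
    return sum(len(encode_suit(suit)) for hand in deal.values() for suit in hand)


newspaper_deal = {
    "North": ["632", "J532", "93", "AJ43"],
    "South": ["AKQJ75", "AQT9", "QJ", "9"],
    "East": ["T4", "8764", "T62", "K752"],
    "West": ["98", "K", "AK8754", "QT86"],
}
perfect_deal = {
    "North": [RANKS, "", "", ""],
    "South": ["", RANKS, "", ""],
    "East": ["", "", RANKS, ""],
    "West": ["", "", "", RANKS],
}

print(symbol_count(newspaper_deal), symbol_count(perfect_deal))
```

            Under this scheme the newspaper hand needs 51 symbols and the
            perfect deal needs 12, matching the counts given above.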

            The same argument applies; what is the probability that, using this
            coding scheme, you can describe a bridge hand in 12 symbols or less
            (actually you can't do it in less than 12). The probability is
            staggeringly low & you conclude that the deck was stacked. If four
            players came up and said they received such a hand, in a normal game
            of bridge, you would not believe them; you would conclude that either:

            (1) It was a brand new pack of cards that they had forgotten to
            shuffle & that was how the cards were sorted in the unopened pack
            (design by the manufacturer), or

            (2) The dealer deliberately cheated, and arranged the cards in an
            order so as to give that result (maybe by sleight of hand in swapping
            the decks over).
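
            To put a number on "staggeringly low", here is a short Python
            calculation. It assumes a well-shuffled pack, so that every way of
            splitting the 52 cards into four labelled hands of 13 is equally
            likely; that assumption and the variable names are mine.

```python
from fractions import Fraction
from math import factorial

# Ways to split 52 distinct cards into four labelled hands of 13 cards each.
total_deals = factorial(52) // factorial(13) ** 4

# A perfect deal gives each player one complete suit, and there are
# 4! = 24 ways to assign the four suits to the four players.
perfect_deals = factorial(4)

probability = Fraction(perfect_deals, total_deals)

print(f"possible deals : {total_deals:.3e}")
print(f"perfect deals  : {perfect_deals}")
print(f"probability    : {float(probability):.3e}")
```

            Only the perfect deals reach 12 symbols under this coding scheme,
            because a hand can get down to 3 symbols only by holding a whole
            suit. The chance is therefore about 1 in 2.2 x 10^27.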

            Glenn wrote:

    >Indeed, the entire basis upon which we must recognize alien life is
    >mathematics. If we hear the Alpha Centaurians mooing in their
    >microphones, we probably won't understand anything and probably won't
    >know it is a language. Dembski's goal of course is to apply his
    >methodology to a sequence of letters: A,C,T and G. Merely being low
    >probability doesn't mean that the sequence is designed according to
    >what Dembski says above. It must also be specified. I see no way to
    >determine if it was specified save being told that it is so.
    >
    >If your method outlined here is useful at telling design of the things
    >I have been discussing, then please show me the mathematical equation
    >for an E. coli which was used to design it. And then show the different
    >equation for each and every strain of E. coli. Mathematics simply isn't
    >what DNA is and it isn't generated by a mathematical formula.

            I disagree; just about anything; text, music, DNA sequences or
            whatever can be generated by mathematical models. They are called
            "generative models" (the generation relies on sampling a random
            variable from a probability distribution that is specified by the
            model). Since this was the topic of my recently completed PhD
            thesis, I feel pretty confident that I can talk about it. You can
            use a type of probabilistic model called a "Hidden Markov Model" to
            produce models for speech (and they are used in speech recognition
            software), for visualization of high-dimensional time-dependent data,
            which was the topic in my research, or indeed for modelling DNA
            sequences. Check out the web page at

            http://www.csse.monash.edu.au/~lloyd/tildeMML/Structured/HMM.html

            for a list of such applications. These models are indeed highly
            mathematical, and yet have applications in the analysis of data that
            isn't inherently describable by a simple mathematical function, such
            as speech, DNA sequences and so forth.
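
            To make "generative model" concrete, here is a toy Hidden Markov
            Model in Python that emits a string of A, C, G and T. Everything
            in it is invented for illustration: the two hidden states, their
            names and all the probabilities are hypothetical, not fitted to
            any real DNA.

```python
import random

# Hypothetical two-state model: an "AT-rich" region and a "GC-rich" region.
TRANSITIONS = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
EMISSIONS = {
    "AT_rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}


def draw(distribution: dict, rng: random.Random) -> str:
    """Sample one outcome from a {outcome: probability} table."""
    outcomes = list(distribution)
    weights = list(distribution.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_sequence(length: int, seed: int = 0) -> str:
    """Generate letters by walking the hidden states and emitting from each."""
    rng = random.Random(seed)
    state = "AT_rich"  # arbitrary starting state
    letters = []
    for _ in range(length):
        letters.append(draw(EMISSIONS[state], rng))
        state = draw(TRANSITIONS[state], rng)
    return "".join(letters)


print(sample_sequence(60))
```

            The output is a letter sequence, not a curve, yet it is produced
            entirely by a mathematical model: two small probability tables and
            a random number generator.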

            The reason I gave a simple example of a mathematical function was to
            explain the analogy in as simple terms as possible. But the same
            methodology can be applied to determine how complex or simple to make
            the state transition matrix of your Hidden Markov Model. But the end
            goal is the same; to be able to model regularities in your data.

            Whether "regularity" can be reasonably interpreted as "design" is of
            course what the whole debate is about; but that's not the specific
            issue that you raised with the Caesar cipher example, which was what
            prompted me to challenge you in the first place.

            Hope you find the above information to be of some interest.

            Best wishes,
            Iain.




    This archive was generated by hypermail 2.1.4 : Wed Nov 27 2002 - 20:45:01 EST