RE: Design detection and minimum description length

From: Iain Strachan (iain.strachan@eudoramail.com)
Date: Sat Nov 25 2000 - 00:17:19 EST


    I wrote:

    >>Here are the two positions:
    >>
    >>Glenn's position:
    >>
    >>Dembski's method is no good because it can never eliminate the
    >>possibility of design. If I send him a text that is encoded with a
    >>Vigenère cipher whose key is the same length as the text, it will appear
    >>random, and he will say it is undesigned, until I tell him that it is
    >>designed. It is therefore totally useless because it fails to
    >>discriminate between designed and undesigned.
    >>
    >>My position:
    >>
    >>Dembski's method only seeks to verify design that can be verified by
    >>observing something that has low probability. If the methodology
    >>fails to detect design, all it will say is that we can't make a
    >>design inference. Saying "we cannot make a design inference" is not
    >>the same as saying "we infer that it is not designed".
    >

    Glenn wrote:

    >No, this is not Dembski's methodology. He defines terms like 'complex' and
    >'specified' and puts the emphasis on specified.

    <Dembski quote snipped>

    >Thus, it is not merely improbability that indicates design.

    I agree that it is not merely improbability that indicates design;
    that specification and complexity are both required. But I don't
    think that is the bit of the methodology that you were criticizing.
    As I understand it, you are criticizing Dembski for being unable to
    detect design when it is there, as in the case of a Vigenère
    cipher, with the length of the key equal to the length of the text.
    You further imply that Dembski will say that such a text is
    "undesigned". I am saying that the answer would be that we simply
    don't have enough data in this case to make a design inference, and I
    really can't see what's wrong with that. What is at issue is whether
    you can positively say something is obviously designed, not whether
    you can always detect it.

    I further argued that it is exactly analogous to attempting to fit a
    polynomial through a set of 10 data points. If the data were
    generated by a polynomial of degree 10 or more, then the problem is
    underdetermined, just as it is underdetermined when the Vigenère
    cipher is the same length as the text. There are an infinite number
    of sets of polynomial coefficients to choose from to get an exact fit
    to the data in the data fitting case, and in addition you can make
    the curve do anything you like between the data points. Similarly it
    is possible to generate any meaningful message one wants from a
    random sequence of letters by the appropriate choice of cipher key in
    the Vigenère cipher case.
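
    The cipher half of the analogy can be sketched in a few lines of
    Python. This is an illustration only, assuming a 26-letter alphabet
    with no spaces; the ciphertext and the two target messages are made
    up for the example. For any ciphertext and any wanted message of the
    same length, a full-length key exists that "decrypts" one into the
    other.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: 26 letters, no spaces


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the key from the ciphertext letter by letter (mod 26)."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, key)
    )


def key_for(ciphertext: str, wanted_plaintext: str) -> str:
    """Find the full-length key that turns ciphertext into wanted_plaintext."""
    assert len(ciphertext) == len(wanted_plaintext)
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(p)) % 26]
        for c, p in zip(ciphertext, wanted_plaintext)
    )


ciphertext = "XQZJKWPLMRTY"  # arbitrary letters, invented for this sketch
for message in ("DESIGNEDTEXT", "RANDOMJUMBLE"):
    key = key_for(ciphertext, message)
    print(key, "->", vigenere_decrypt(ciphertext, key))
```

    Both target messages come out of the same ciphertext, which is why a
    key as long as the text leaves the problem underdetermined.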

    <Dembski second quote snipped concerning specification and complexity>

    >
    >Thus a sequence of meaningless alphabetic gobbledygook 107 characters long
    >has a 1 out of 10^151 chance of occurring. It is an exceedingly low
    >probability. Indeed the last sentence has 130 characters (excluding spaces).
    >That is an extremely low probability event. Dembski would say it is
    >specified because it has meaning. But an equally long sequence of random
    >characters, he would say is not specified. Your definition above totally
    >forgets the specified part of Dembski's method.
    >
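
    The order of magnitude in the quoted passage can be checked with a
    short calculation. This is a sketch, assuming a 26-letter alphabet in
    which every letter is equally likely; neither assumption is stated in
    the quoted text.

```python
import math

ALPHABET_SIZE = 26  # assumption: letters only, all equally likely
LENGTH = 107        # sequence length taken from the quoted text

sequences = ALPHABET_SIZE ** LENGTH            # exact count of possible sequences
exponent = LENGTH * math.log10(ALPHABET_SIZE)  # the same count as a power of ten

print(f"26^107 has {len(str(sequences))} digits")
print(f"chance of one particular sequence: about 1 in 10^{exponent:.1f}")
```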

    I don't think it does. If I conclude that a third order polynomial
    gives the best fit to my data, then I have specified the four
    coefficients required. The analogue of "it is specified because it
    has meaning" is "it is specified because it was generated by a third
    order polynomial (i.e. had a recognisably intelligible mathematical
    meaning)".

    >Iain, I will absolutely agree with you that mathematical functions numbers
    >can be detected. Much of science is built upon such things.

    Well, here's something we can agree on, thank goodness :-)

    >One observes a
    >quantifiable phenomenon in nature and then discovers an equation which will
    >match the behavior. Fine. We all know that can occur. But does that mean
    >it is designed?

    No, I agree it doesn't mean it was designed. In most of science it
    means that there is a physical relationship that gives rise to the
    correlation. But my point wasn't to say that the existence of a
    mathematical law proves design; just to say that the statistical
    method (e.g. minimum description length) which serves to detect
    natural laws can equally be applied to the detection of design. Of
    course it then becomes debatable as to whether it was _intelligent_
    design. It might be the case that you could say "evolution designed
    it". But the trouble was that you were attacking the basic
    methodology, and not the conclusion that was drawn from it.

    >
    >Now, having yielded on the point in mathematics, I will point out to you
    >that none of my examples have been mathematical. They have been sequences
    >of letters as indeed, DNA is. Neither is determined by equation or
    >mathematical functions. So, in my opinion, your mathematical equations are
    >irrelevant to what I have been talking about.

    OK, here's a non-mathematical example, to which exactly the same
    ideas can be applied. The following is a representation of a bridge
    hand that I copied out of the Sunday paper:

    North:
    S: 632
    H: J532
    D: 93
    C: AJ43

    South:
    S: A-J75
    H: AQT9
    D: QJ
    C: 9

    East:
    S: T4
    H: 8-64
    D: T62
    C: K752

    West:
    S: 98
    H: K
    D: AK8754
    C: QT86

    The convention I have adopted is to represent each card's rank as a
    single character, with T representing a 10. I've simply listed the
    cards, but if there is a run of 3 or more in a row, I put the first
    and last card with a dash in the middle. A-J -> AKQJ. Leaving aside
    the "Framework" of "North" etc, and the suit indicator, which will be
    constant in the description of a hand, there will be a varying number
    of symbols to describe the actual data. In this particular hand
    there are 51 symbols out of a maximum of 52, due to the single run of
    four cards, the A K Q J of spades held by South (East's three-card
    run in hearts, 8-6, takes three symbols either way). There is nothing
    unusual about the distribution of the cards, as far as I'm aware.
    Now here's another bridge hand, just as likely to occur as the first
    one:

    North:
    S: A-2
    H:
    D:
    C:

    South:
    S:
    H: A-2
    D:
    C:

    East:
    S:
    H:
    D: A-2
    C:

    West:
    S:
    H:
    D:
    C: A-2

    This is what is termed the "perfect deal" where all four players
    receive 13 cards of the same suit. Now the variable part of my
    descriptor has gone down to 12 symbols rather than 51.
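
    The symbol counts for the two deals can be reproduced with a small
    encoder. This is a sketch of the convention described above, not code
    from the original discussion; the function names and the way the
    deals are stored are choices made for the example.

```python
RANKS = "AKQJT98765432"  # high to low; T stands for 10


def encode_holding(cards: str) -> str:
    """Encode one suit holding, collapsing runs of 3+ adjacent ranks to 'X-Y'."""
    ordered = sorted(cards, key=RANKS.index)
    out, i = [], 0
    while i < len(ordered):
        j = i
        # Extend j while the next card is exactly one rank lower.
        while (j + 1 < len(ordered)
               and RANKS.index(ordered[j + 1]) == RANKS.index(ordered[j]) + 1):
            j += 1
        run = ordered[i:j + 1]
        out.append(f"{run[0]}-{run[-1]}" if len(run) >= 3 else "".join(run))
        i = j + 1
    return "".join(out)


def symbol_count(deal: dict) -> int:
    """Count the variable symbols (ranks and dashes) over all four hands."""
    return sum(len(encode_holding(h)) for hand in deal.values() for h in hand)


# Holdings are listed in the order spades, hearts, diamonds, clubs.
newspaper_deal = {
    "North": ["632", "J532", "93", "AJ43"],
    "South": ["AKQJ75", "AQT9", "QJ", "9"],
    "East": ["T4", "8764", "T62", "K752"],
    "West": ["98", "K", "AK8754", "QT86"],
}
perfect_deal = {
    "North": [RANKS, "", "", ""],
    "South": ["", RANKS, "", ""],
    "East": ["", "", RANKS, ""],
    "West": ["", "", "", RANKS],
}

print(symbol_count(newspaper_deal), symbol_count(perfect_deal))
```

    Run as written, this prints 51 and 12, matching the counts given for
    the newspaper deal and the perfect deal.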

    The same argument applies: what is the probability that, using this
    coding scheme, you can describe a bridge hand in 12 symbols or fewer
    (actually you can't do it in fewer than 12)? The probability is
    staggeringly low, and you conclude that the deck was stacked. If four
    players came up and said they received such a hand, in a normal game
    of bridge, you would not believe them; you would conclude that either:

    (1) It was a brand new pack of cards that they had forgotten to
    shuffle & that was how the cards were sorted in the unopened pack
    (design by the manufacturer), or

    (2) The dealer deliberately cheated, and arranged the cards in an
    order so as to give that result (maybe by sleight of hand in swapping
    the decks over).
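
    To put a number on "staggeringly low", here is a rough calculation.
    It assumes a fair shuffle, and it assumes that only perfect deals
    reach 12 symbols under this coding scheme, so that there are 4! = 24
    such deals (one for each way of assigning the four suits to the four
    players). Both assumptions belong to this sketch rather than to the
    original argument.

```python
import math
from fractions import Fraction

# Distinct deals: ways to split 52 cards into four labelled hands of 13.
total_deals = math.factorial(52) // math.factorial(13) ** 4

# Assumption of this sketch: only perfect deals reach 12 symbols, and there
# are 4! = 24 of them (one per assignment of suits to players).
perfect_deals = math.factorial(4)

p = Fraction(perfect_deals, total_deals)
print(f"total deals   : {total_deals:.3e}")
print(f"P(12 symbols) : {float(p):.3e}")
```

    The exact fraction is kept in p, so the only rounding happens when
    the result is printed.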

    Glenn wrote:

    >Indeed, the entire basis upon
    >which we must recognize alien life is mathematics. If we hear the Alpha
    >Centaurians mooing in their microphones, we probably won't understand
    >anything and probably won't know it is a language. Dembski's goal of course
    >is to apply his methodology to a sequence of letters: A,C,T and G. Merely
    >being low probability doesn't mean that the sequence is designed according
    >to what Dembski says above. It must also be specified. I see no way to
    >determine if it was specified save being told that it is so.
    >
    >If your method outlined here is useful at telling design of the things I
    >have been discussing, then please show me the mathematical equation for an
    >E. coli which was used to design it. And then show the different equation
    >for each and every strain of E. coli. Mathematics simply isn't what DNA is
    >and it isn't generated by a mathematical formula.

    I disagree; just about anything (text, music, DNA sequences or
    whatever) can be generated by mathematical models. They are called
    "generative models" (the generation relies on sampling a random
    variable from a probability distribution that is specified by the
    model). Since this was the topic of my recently completed PhD
    thesis, I feel pretty confident that I can talk about it. You can
    use a type of probabilistic model called a "Hidden Markov Model" to
    produce models for speech (and they are used in speech recognition
    software), for visualization of high-dimensional time-dependent data,
    which was the topic in my research, or indeed for modelling DNA
    sequences. Check out the web page at

    http://www.csse.monash.edu.au/~lloyd/tildeMML/Structured/HMM.html

    for a list of such applications. These models are indeed highly
    mathematical, and yet have applications in the analysis of data that
    isn't inherently describable by a simple mathematical function, such
    as speech, DNA sequences and so forth.
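
    As a toy illustration of a generative model, the sketch below samples
    a DNA-like string from a two-state Hidden Markov Model. The state
    names and all of the probabilities are hypothetical, chosen only to
    make the example run; they are not taken from any real model of DNA
    or from the page linked above.

```python
import random

# Hypothetical two-state model; every number here is made up for illustration.
START = {"AT_rich": 0.5, "GC_rich": 0.5}
TRANSITION = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
EMISSION = {
    "AT_rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}


def draw(dist: dict, rng: random.Random) -> str:
    """Sample one key of dist with probability proportional to its value."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]


def sample_sequence(length: int, seed: int = 0) -> str:
    """Walk the hidden states, emitting one letter per step."""
    rng = random.Random(seed)
    state = draw(START, rng)
    letters = []
    for _ in range(length):
        letters.append(draw(EMISSION[state], rng))
        state = draw(TRANSITION[state], rng)
    return "".join(letters)


print(sample_sequence(60))
```

    The hidden state is never printed; only the emitted letters are, which
    is what makes the model "hidden".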

    The reason I gave a simple example of a mathematical function was to
    explain the analogy in as simple terms as possible. But the same
    methodology can be applied to determine how complex or simple to make
    the state transition matrix of your Hidden Markov Model. But the end
    goal is the same; to be able to model regularities in your data.
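
    A much simplified version of that model-selection idea can be sketched
    as follows. It compares the cost in bits of sending a letter sequence
    with no model against the cost of sending a fitted first-order Markov
    chain plus the data encoded under it. Note the substitutions: this is
    a plain (not hidden) Markov chain, and the parameter cost uses a crude
    BIC-style penalty of half a log per free parameter, which is an
    assumption of this sketch and only an approximation to minimum
    description length.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"


def uniform_bits(seq: str) -> float:
    """Cost of sending the sequence with no model: log2(4) bits per letter."""
    return len(seq) * math.log2(len(ALPHABET))


def markov_bits(seq: str) -> float:
    """Two-part cost: bits for the fitted transition table plus bits for the data."""
    pair_counts = Counter(zip(seq, seq[1:]))
    from_counts = Counter(seq[:-1])
    data_bits = math.log2(len(ALPHABET))  # first letter, sent uniformly
    for (a, _b), n in pair_counts.items():
        data_bits -= n * math.log2(n / from_counts[a])
    free_params = len(ALPHABET) * (len(ALPHABET) - 1)
    model_bits = 0.5 * free_params * math.log2(len(seq))  # BIC-style penalty
    return model_bits + data_bits


regular = "ACGT" * 50
shuffled = "".join(random.Random(1).choices(ALPHABET, k=200))
for name, seq in (("regular", regular), ("random", shuffled)):
    print(name, round(uniform_bits(seq)), round(markov_bits(seq)))
```

    For the regular sequence the richer model pays for itself many times
    over; for the random one the extra parameters cost more than they
    save, so the simpler description wins.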

    Whether "regularity" can be reasonably interpreted as "design" is of
    course what the whole debate is about; but that's not the specific
    issue that you raised with the Caesar cipher example, which was what
    prompted me to challenge you in the first place.

    Hope you find the above information to be of some interest.

    Best wishes,
    Iain.



