RE: Dembski and Caesar cyphers

From: Iain Strachan (iain.strachan@eudoramail.com)
Date: Mon Nov 20 2000 - 21:37:56 EST

    Glenn wrote:

    And I still think you miss the key point. If I present Dembski a random
    sequence, WITHOUT the key, and even without the knowledge that there is a
    key, Dembski will conclude that there is no design. That is what Dembski
    says over and over in the books. Random sequences mean no design. But,
    then AFTER this conclusion, I provide him with a Vigenere keyword which
    turns that into a readable sentence. By providing him a key, I have told him
    that this is a designed sequence. His methodology didn't detect the design,
    I TOLD HIM IT WAS DESIGNED!!! What kind of methodology is it that needs me
    to tell him it is designed in order for his methodology to work?????

    My reply:

    Just stop right there and take a look at those last couple of
    sentences that you wrote.
    Let me make it plain that I'm not in the business of being some sort
    of a Dembski cheer-leader, dumbly saying yes to his every
    pronouncement. I have unresolved issues with Dembski's use of the No
    Free Lunch theorems that I hope he will address in due course. But
    neither am I prepared to give you the opportunity to rubbish him in
    immoderate language of that kind, complete with upper case letters
    (generally netiquette considers this bad manners, and the equivalent
    of shouting) and multiple question marks. If your use of this kind
    of language is meant to imply that either Dembski or I or both
    of us are too stupid to listen to reasoned argument and you have to
    shout, then I'm not interested in continuing this conversation. All
    three of us are committed Christians and we should be able to
    continue a discourse in a brotherly manner. I challenged your
    original email purely on scientific grounds because I thought your
    point was not relevant and could not be used as a challenge to
    Dembski's methodology, and I still do. I was interested to
    see if we could continue the discourse and enhance our mutual
    understanding of the subject. If that's what you want to do, fine,
    let's continue, but if all you really want to do is to discredit
    Dembski at whatever cost, then count me out.

    Next point. What you refer to as "Dembski's methodology" in this
    example is not even down to Dembski. It is simply an elaboration of
    the "Minimum Description Length Principle". It is a well-established
    bit of theory that no-one would seriously question. A good web
    resource on MDL is at

    http://www.mdl-research.org/

    I quote from the homepage of this website:

    -------------
    The purpose of statistical modeling is to discover regularities in
    observed data. The success in finding such regularities can be
    measured by the length with which the data can be described. This is
    the rationale behind the Minimum Description Length (MDL) Principle
    introduced by Jorma Rissanen (Rissanen, 1978).

    `` The MDL Principle is a relatively recent method for inductive
    inference. The fundamental idea behind the MDL Principle is that any
    regularity in a given set of data can be used to compress the data,
    i.e. to describe it using fewer symbols than needed to describe the
    data literally. '' (Grünwald, 1998)
    ------------

    Where Dembski uses the term "Design", here the term "regularities in
    observed data" is used instead. However, this is only part of
    Dembski's methodology for detecting design.

    If you look up Rissanen on Citeseer, you will find 301 citations to
    his original paper:

    J.Rissanen, Modeling by shortest data description. Automatica, vol.
    14 (1978), pp. 465-471.

    So this is peer-reviewed standard stuff, not the offbeat ideas of
    some crackpot. In my own field of academic research, the MDL
    principle can be used to assist in model-order selection for data
    fitting by neural networks.

    What it means is that if a compact model can be found, then the data
    exhibits significant non-random patterns (read "design" if you wish;
    though the patterns might be naturally occurring of course). How
    does this work? Simply by probabilities deduced from a counting
    argument.

    Consider tossing a coin 500 times and it comes up heads 500 times.
    Can you detect "design", "cheating", "a biased coin" call it what you
    will, from this? On the face of it, a sequence of 500 heads in a row
    is just as (un)likely to occur as any other sequence (p =
    3.05x10^(-151)). So why are we surprised if we get 500 heads in a row
    as opposed to a random looking sequence? It is because we can
    describe the sequence in a compact form "500 heads in a row" for
    example, which is 18 characters in ASCII, or 144 bits of information.
    That 144 bits has been used to specify 500 bits of information (the
    sequence of coin-toss results). Now by a simple counting argument
    you can get an upper bound on the probability that a sequence of 500
    coin-tosses can be described in 144 bits or less. The total number
    of possible descriptors is clearly 2^144 (of which of course the
    vast majority will not be descriptors, such as "he sells sea shells",
    but we only want an upper bound). These 2^144 possible descriptors
    can only account for a maximum of 2^144 of
    the 2^500 possible sequences. Hence the probability that a sequence
    of 500 coin tosses can be described in 144 bits or less is at most
    2^(144-500), or 6.8x10^(-108).
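
    The arithmetic in this counting argument can be checked with a
    minimal Python sketch. The function names are only illustrative,
    and it assumes, as above, 8 bits per ASCII character and a fair
    coin.

```python
from fractions import Fraction

N_TOSSES = 500                      # bits needed to write the tosses out literally
DESCRIPTION = "500 heads in a row"  # the compact description quoted in the text


def description_bits(text: str, bits_per_char: int = 8) -> int:
    """Length of a text description in bits, at 8 bits per ASCII character."""
    return len(text) * bits_per_char


def sequence_probability(n_tosses: int) -> Fraction:
    """Probability of any one particular sequence of fair coin tosses."""
    return Fraction(1, 2 ** n_tosses)


def short_description_bound(desc_bits: int, n_tosses: int) -> Fraction:
    """Upper bound on P(a random sequence has a description of desc_bits bits).

    At most 2**desc_bits distinct descriptions exist, so they can cover at
    most 2**desc_bits of the 2**n_tosses equally likely sequences.
    """
    return Fraction(2 ** desc_bits, 2 ** n_tosses)


if __name__ == "__main__":
    bits = description_bits(DESCRIPTION)
    print("description length:", bits, "bits")
    print("P(this exact sequence): %.2e" % float(sequence_probability(N_TOSSES)))
    print("P(describable in %d bits) <= %.2e"
          % (bits, float(short_description_bound(bits, N_TOSSES))))
```

    The sketch counts descriptions of exactly 144 bits, as the argument
    above does. Allowing every shorter description as well raises the
    count to 2^145 - 1, which less than doubles the bound and leaves
    the conclusion unchanged.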

    That is why you suspect some "design" or "non-randomness" when you
    get 500 heads in a row, not because of the intrinsic probability of
    the sequence itself, but because it is staggeringly unlikely that you
    can describe the sequence in such a small amount of information.
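
    One rough way to see the same effect in practice is to hand both
    kinds of sequence to a general-purpose compressor. The sketch below
    uses zlib purely as a stand-in for "length of the shortest
    description I managed to find"; it is not the MDL principle itself,
    and the seed and the H/T encoding are arbitrary assumptions.

```python
import random
import zlib


def compressed_bits(tosses: str) -> int:
    """Size in bits of the zlib-compressed toss string (a crude description length)."""
    return 8 * len(zlib.compress(tosses.encode("ascii"), 9))


rng = random.Random(2000)  # fixed, arbitrary seed so the run is repeatable
all_heads = "H" * 500      # the suspicious sequence
fair_coin = "".join(rng.choice("HT") for _ in range(500))  # a typical sequence

if __name__ == "__main__":
    print("500 heads :", compressed_bits(all_heads), "bits")
    print("fair coin :", compressed_bits(fair_coin), "bits")
```

    The all-heads string shrinks to far fewer than 500 bits, while the
    fair-coin string does not compress to anything like that size. A
    compressor failing to shrink the data does not prove there is no
    pattern; it only means this particular compressor did not find one.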

    Glenn wrote:
    You miss the point again because my point is not that a random sequence of
    letters can produce a Shakespearean Sonnet, but that Dembski's methodology
    simply doesn't detect design without being told that something is designed.

    My reply:

    Now as far as whether Dembski says you have to tell him that it's
    designed, I think perhaps he phrased it badly in the book when he
    says that someone tells him it's a Caesar cypher. But someone
    telling him that is not necessary to deduce design, and I can't think
    that he meant it literally. What happens if you get a
    sequence of letters like that in a letter, or if you see them
    engraved on a stone? Do you dismiss it as random junk, or do you
    wonder if it's a code? If you think it might be a code, then you
    start looking for means to break the code. You start at the simplest
    idea of all (a Caesar cipher, for example), and see if that works.
    If it doesn't, you try something more complex, e.g. a fixed letter
    substitution code, etc. You wouldn't start by trying a Vigenere
    cipher with a key the length of the text because you know you can
    produce any text you want that way. You might try Vigenere ciphers
    with repeating keys of length 2, then 3, then 4, however (and of
    course the search gets exponentially harder the further you take
    this).
    The simpler the code, the more likely you are to find it, and the
    more confident you can be of design. But if your decoding scheme
    occupies the same length as your message (such as a Vigenere cypher
    with a key as long as the text), then you clearly can't make any
    design deduction, and Dembski correctly
    says as much in the quotation from No Free Lunch that I gave. The
    longer the description of the decoding scheme, the less confident you
    can be of a design, because the probability in the above counting
    argument increases.
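
    Both halves of this can be sketched in a few lines of Python. The
    function names and the sample strings are made up for illustration,
    and the text is assumed to be upper-case letters only. The first
    part shows how quickly the key search grows; the second shows why a
    key as long as the text proves nothing, since one exists for any
    target you care to name.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere_decrypt(cipher: str, key: str) -> str:
    """Subtract the repeating key from the ciphertext, letter by letter (A=0 ... Z=25)."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key[i % len(key)])) % 26]
        for i, c in enumerate(cipher)
    )


def caesar_candidates(cipher: str) -> list:
    """All 26 Caesar decryptions: a Caesar cipher is a Vigenere cipher with a one-letter key."""
    return [vigenere_decrypt(cipher, k) for k in ALPHABET]


def keys_to_try(key_length: int) -> int:
    """Size of the search space for a repeating key of the given length."""
    return 26 ** key_length


def key_producing(cipher: str, target: str) -> str:
    """The full-length key that 'decodes' cipher into any chosen target of equal length."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(t)) % 26]
        for c, t in zip(cipher, target)
    )


if __name__ == "__main__":
    print([keys_to_try(n) for n in (1, 2, 3, 4)])
    junk = "QWKXZJRMPD"  # arbitrary made-up letters, not a real message
    for target in ("HELLOWORLD", "RANDOMJUNK"):
        key = key_producing(junk, target)
        print(key, "->", vigenere_decrypt(junk, key))
```

    Trying all 26 Caesar shifts costs almost nothing, so a readable
    result there is strong evidence. The full-length key, by contrast,
    carries as much information as the message itself, so finding one
    tells you nothing about the original letters.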

    But go back to the Lottery example I gave. Here are the numbers:

    14 17 22 29 38 49 (13)

    No one needs to tell me that this is or is not designed. I have only
    to do some simple arithmetical transformations to the numbers, like
    take the differences between successive numbers, to see a possible
    pattern:

    3 5 7 9 11 (-36)

    From that it's a short step to deduce it's a Caesar cipher. But the
    sequence isn't nearly long enough to discount coincidence. The
    description that it's a Caesar cipher with a shift of 13 on the
    squares is hard to express in less length than it takes to specify
    the seven numbers (something like B(n) = n^2 + 13 mod 49, with 0
    read as 49).

    But if it happens next week and the week after, then we can get to be
    more confident.
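
    The lottery arithmetic is easy to check in a short Python sketch.
    The model function is an assumption about what the rule would look
    like: the squares shifted by 13 and wrapped into the range 1 to 49,
    which is the form that reproduces all seven numbers.

```python
DRAW = [14, 17, 22, 29, 38, 49, 13]  # six main numbers plus the bonus ball


def differences(numbers: list) -> list:
    """Differences between successive numbers."""
    return [b - a for a, b in zip(numbers, numbers[1:])]


def model(n: int, shift: int = 13, top: int = 49) -> int:
    """n-th ball under the assumed rule: n squared plus the shift, wrapped into 1..top."""
    return (n * n + shift - 1) % top + 1


if __name__ == "__main__":
    print(differences(DRAW))
    print([model(n) for n in range(1, len(DRAW) + 1)])
```

    The first list is the run of odd numbers followed by the
    wrap-around step, and the second reproduces the draw itself.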

    So here's the conclusion. If a simple description of the data exists
    it will be easy to find, and a "design", or "non-random" conclusion
    can be made with confidence. If you need a complex model then no
    design detection can be made unless you tell me the design.

    None of this is a Dembski-an idea; it all follows from the minimum
    description length principle.

    Your comments, please, but please steer clear of the CAPITAL LETTERS
    and ?????? stuff. There is nothing wrong with my hearing :-)

    Iain.




    This archive was generated by hypermail 2.1.4 : Wed Nov 20 2002 - 21:27:37 EST