RE: Dembski and Caesar cyphers

From: Iain Strachan (iain.strachan@eudoramail.com)
Date: Sat Nov 18 2000 - 21:44:33 EST

    On Sun, 17 Nov 2002 18:58:14 Glenn Morton wrote:
    >
    >Iain, you must realize, that Dembski's argument absolutely fails because
    >Caesar cyphers are not the only encoding routines available. What I
    >presented was merely an illustration of the ease with which one can find a
    >code which will turn any random sequence into a 'designed' sequence. Indeed,
    >the best available cypher code involves the use of a keyword which is
    >precisely as long as the code. This gives very little ability to decipher
    >the text. Such keywords perform a modulo arithmetic between two
    >corresponding letters (one from the text and one from the code). The output
    >is another letter but which appears meaningless. Now, given any random
    >text, one can design a keyword code which will transform it into a
    >meaningful sentence or paragraph. So, if one has the random sequence:
    >frinmrcsxmktybevvrmhutevmuk
    >
    >And apply a Vigenere cypher to it of:
    >
    >yymeqnckxkynlhpvypryifrdsyb
    >
    >It turns that sequence into the first line of a limerick!
    >
    >therewasayoungladyfromniger
    >
    >Or take a really long sequence:
    >jwfldusfglttbncxpnzoqvvbavd
    >fwjlqrfghanhvmfljmfrqqanm
    >ljujouoyeacanagkxhkbw
    >nuojpewzlmklj
    >apkspzavxgrmctobrbcnfedttxbty
    >
    >Using the keyword:
    >
    >cdjchqsxgjhnotoxcjgcivdjgjg
    >rqbejyntriglsojpmusxbooqp
    >tobntcbbwtjagtnspabfa
    >pibcxrqddfymn
    >wwofigembyeaqgsbspwgnrwaxnvtb
    >
    >Becomes a famous limerick on quantum:
    >
    >therewasayoungmanwhosaidgod
    >mustthinkitexceedinglyodd
    >ifhefindsthatthistree
    >continuestobe
    >whentheresnooneaboutinthequad
    >
    >For the spatially challenged:
    >
    >there was a young man who said god
    >must think it exceedingly odd
    >if he finds that this tree
    >continues to be
    >when theres no one about in the quad
    >
    >Every single sequence has the capacity to be a designed spy code.
    >
    >My point is this. Dembski claims that if one can provide a spy code which
    >turns a random sequence into a meaningful sequence, then what his method
    >says isn't designed suddenly becomes designed. Since I have shown that
    >every single random sequence can be turned into a meaningful sentence, it
    >means that Dembski's method can't distinguish designed sequences from
    >non-designed.
    >
    >I don't care how long you make the random sequence, I can turn it into a
    >meaningful sentence of that length with great ease. Thus, Dembski's claim to
    >be able to detect design is meaningless. Since he can't rule out that any
    >given random sequence isn't a designed sequence because he must always worry
    >about a Vigenere keyword which is capable of turning his random, non-designed

    Glenn,

    With all due respect, I think you're missing the point of Dembski's argument.
    The whole point is that one can detect design in an apparently random sequence
    of letters precisely because there exists a _compact_ description of the
    encryption key (in the case of a Caesar cypher, a single number in the
    range 1-25 that gives you the letter shift required to produce the encoded
    message).

    Your example of having a cypher key that is the same length as the message is
    irrelevant. Of course, given any random sequence, one can produce a key of
    the same length that turns it into a Shakespeare sonnet, or a limerick, or
    the Vladivostok train timetable, or whatever you want. But that does not
    prove design, because it is clear you can produce such a key with
    probability one. By contrast, a simple cypher key can only represent a very
    small fraction of all the possibilities, so the probability that a simple
    cypher turns a long sequence of random letters into an intelligible message
    is close to zero. Hence if such a key does exist, you have a good case for
    making a design inference.
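
    To make the contrast concrete, here is a minimal sketch in Python. The
    function names (key_for, decode, caesar_candidates) are my own, and the
    rule plain = (key - cypher) mod 26 is an assumption inferred from the
    letters in your first example rather than anything you stated. It shows
    that a full-length key can always be manufactured for any target text,
    whereas a Caesar shift offers only 26 candidate readings.

```python
import string

ALPHABET = string.ascii_lowercase


def key_for(cipher, target):
    """Build the full-length key that maps `cipher` onto `target`.

    Assumed convention: plain = (key - cipher) mod 26,
    hence key = (plain + cipher) mod 26.
    """
    if len(cipher) != len(target):
        raise ValueError("cipher and target must have the same length")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(c)) % 26]
        for c, p in zip(cipher, target)
    )


def decode(cipher, key):
    """Apply a full-length key under the same assumed convention."""
    return "".join(
        ALPHABET[(ALPHABET.index(k) - ALPHABET.index(c)) % 26]
        for c, k in zip(cipher, key)
    )


def caesar_candidates(cipher):
    """Every reading a single-number Caesar shift can produce (26 in all)."""
    return [
        "".join(ALPHABET[(ALPHABET.index(c) - shift) % 26] for c in cipher)
        for shift in range(26)
    ]


random_text = "frinmrcsxmktybevvrmhutevmuk"
target = "therewasayoungladyfromniger"

key = key_for(random_text, target)      # always succeeds, whatever the target
print(key)
print(decode(random_text, key))
print(len(caesar_candidates(random_text)), "Caesar readings versus",
      26 ** len(random_text), "possible full-length keys")
```

    Because key_for succeeds for every target of the right length, finding
    such a key tells you nothing about the sequence. A hit among the 26
    Caesar readings of a long sequence would be a different matter.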

    With your 10 letter sequences generated by your random number program, the
    probability of there existing a Caesar cypher giving an intelligible message
    was reasonably high, but with a longer message, the chance becomes
    vanishingly small.

    In the same way, if you apply a file compression utility such as "WinZip" to a
    "designed" computer file (such as some English text, or a graphical design, or
    a piece of compiled computer code), then it can reduce the size of the original
    quite effectively. On the other hand, if you generate a binary file of random
    numbers and try to compress it, WinZip will not make it any smaller; or if
    it does, then you immediately suspect there is something wrong with your
    pseudo-random number generator.
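
    The compression point can be tried out with a short Python sketch. I am
    assuming here that zlib's DEFLATE is a fair stand-in for WinZip, and the
    helper name ratio, the sample text and the seed are my own choices for
    illustration.

```python
import random
import zlib

# A sample of "designed" data: ordinary English prose.
english = (
    b"With all due respect, I think you're missing the point of Dembski's "
    b"argument. The whole point is that one can detect design in an "
    b"apparently random sequence of letters precisely because there exists "
    b"a compact description of the encryption key. Your example of having a "
    b"cypher key that is the same length as the message is irrelevant. Of "
    b"course, given any random sequence, one can produce a key that is the "
    b"same length as the original that can turn it into a Shakespeare "
    b"sonnet, or a limerick, or the Vladivostok train timetable."
)

# The same number of pseudo-random bytes (seed fixed so runs are repeatable).
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(len(english)))


def ratio(data):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


print(f"English text: {ratio(english):.2f}")
print(f"random bytes: {ratio(noise):.2f}")
```

    The English sample should shrink noticeably, while the random bytes
    should come out no smaller than they went in.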

    But I think you will find that Dembski has already made precisely the same
    point as mine, on pp. 78-79 of "No Free Lunch":

    Quote:

    To see what is at stake with the complexity measures used in defining
    specificational resources, consider two cryptosystems: first, an extremely
    simple one, like the Caesar cipher, in which each letter of the alphabet is
    moved a fixed number of notches up or down the alphabet; second, an extremely
    complicated one, like the one-time pad, in which each letter of the alphabet is
    moved a random number of notches up or down the alphabet and where the
    alteration of one letter at one location is probabilistically independent of
    other alterations at other locations. With the Caesar cipher the complexity of
    the cryptosystem is very low and leads to low specificational resources. On
    the other hand, with the one-time pad the complexity of the cryptosystem leads
    to specificational resources as numerous as the possible encrypted alphabetic
    strings. The low complexity of the Caesar cipher enables it to be easily
    broken. The high complexity of the one-time pad secures it against
    cryptographic attacks. The low complexity of the Caesar cipher removes any
    doubts about underdetermination in the breaking of the cryptosystem and thus
    whether a decrypted string contains a meaningful message and is therefore
    designed. The high complexity of the one-time pad guarantees that
    underdetermination cannot be avoided and that we can never be sure that an
    alphabetic string is other than random. The huge specificational resources
    associated with the one-time pad mean we can never draw a design inference for
    its encrypted messages.

    End Quote.

    Here's how it works in practice. Let's go back to the lottery
    example. Suppose one week you find that the lottery draw produces
    the following numbers:

    14 17 22 29 38 49 (13) [The seventh ball is a "bonus ball"]

    And suppose you noticed (because you like doing those "what's the next
    number" problems in IQ tests) that if you subtracted 13 from each
    number and took the answer modulo 49 (writing 0 as 49), then you
    generated a sequence of squares:

    1 4 9 16 25 36 (49)

    Now you might well just think that is an amusing coincidence. But
    what if it happened the next week, only this time the magic number to
    subtract was 29, and then the third week the same but it was 3?

    You would conclude after a very short time that the lottery wasn't
    random at all, but was a sequence of squares that was encrypted by a
    Caesar cipher. Furthermore your "design inference" is useful. All
    you have to do next week is to buy 49 tickets, trying out all
    possible combinations allowed by the Caesar cipher, and, if the
    pattern continues, you would be guaranteed to hit the jackpot.
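
    As a rough sketch of that procedure in Python (the function names are
    mine, and I am assuming the six main balls are compared as a set, with
    the bonus ball required to decode to 49):

```python
SQUARES = [1, 4, 9, 16, 25, 36, 49]


def unshift(numbers, key):
    """Subtract `key` modulo 49, writing a result of 0 as 49."""
    return [(n - key - 1) % 49 + 1 for n in numbers]


def find_shift(main, bonus):
    """Return the single shift that turns the draw into squares, or None."""
    for key in range(49):
        main_ok = sorted(unshift(main, key)) == SQUARES[:6]
        bonus_ok = unshift([bonus], key) == [49]
        if main_ok and bonus_ok:
            return key
    return None


def tickets_for_next_week():
    """One ticket per possible shift: 49 tickets cover every case."""
    return [[(s + key - 1) % 49 + 1 for s in SQUARES] for key in range(49)]


print(find_shift([14, 17, 22, 29, 38, 49], 13))
print(len(tickets_for_next_week()))
```

    Only 49 keys have to be tried, which is why a hit is informative and
    why the inference can be turned into a betting strategy.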

    But now suppose the "key" is as long as the original sequence. So
    you get the numbers

    1 7 13 14 19 23 (41)

    The magic numbers to subtract are now 0, 3, 4, 47, 43, 36, (41)

    The next week the magic numbers used to encrypt the sequence of squares are

    1, 3, 7, 11, 22, 36, (1)

    So all you have to do next week is to try out all the combinations of
    "keys" used to encrypt the sequence of squares to be sure of getting
    the jackpot. How many tickets do you have to buy? Around 14
    million, which is every possible ticket. Hence there is no design
    inference.
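
    The same thing in a short Python sketch (per_ball_key and apply_key are
    names I have made up for illustration). A per-ball key can be computed
    for any draw at all, and the number of distinct six-number tickets is
    the binomial coefficient 49 choose 6.

```python
import math

SQUARES = [1, 4, 9, 16, 25, 36, 49]


def per_ball_key(draw):
    """The 'magic numbers' that turn any seven-ball draw into the squares."""
    return [(n - s) % 49 for n, s in zip(draw, SQUARES)]


def apply_key(draw, key):
    """Subtract each key entry modulo 49, writing a result of 0 as 49."""
    return [(n - k - 1) % 49 + 1 for n, k in zip(draw, key)]


draw = [1, 7, 13, 14, 19, 23, 41]
key = per_ball_key(draw)
print(key)
print(apply_key(draw, key))
print(math.comb(49, 6), "possible six-number tickets")
```

    Since per_ball_key never fails, the existence of such a key carries no
    information about whether the draw was rigged.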

    The design inference can be applied the other way, of course. We are
    pretty certain that the lottery numbers are genuinely random, but
    what about the numbers chosen by the players? The first time they
    ever did the UK National Lottery, they published the figures, how
    many tickets sold, how many winners (you win if 3 or more of your
    numbers match the draw). There were 1.1 million winners. I
    calculated the expected number of winners, given a random choice, and
    it came to 800,000 with a standard deviation of around 5,000. That
    meant that on an assumption of "no design" in the choice of numbers
    on the tickets, the probability of getting that many winners was
    around 10^-250. It wasn't hard to see the reason for the fluke
    result. All the numbers in the draw happened to be less than 32; not
    that unlikely as a result, but it biased the results because people
    tend to choose birthdays of loved ones for their numbers, and hence a
    disproportionately high number are in the region 1-31. This seems as
    clear a design inference, based on low probability, as one could wish
    for. And even though I'm using
    hindsight and knowledge that people choose birthdays, I submit that
    if a Martian came to earth and saw the result then s/he would
    conclude that for some unknown reason people were choosing low
    numbers in preference to high ones.

    This "design inference" is also useful in maximizing your expected
    winnings. One combination is as likely to win as any other, so all
    you have to do is choose a combination that is unlikely to be chosen
    ("design" it the other way). I've a colleague who puts this into
    practice. He always chooses high numbers, and makes a habit of
    putting three in a row, like 42,43,44, because the Great British
    Public don't consider that to be "random". It paid off; the one time
    he got a 4-ball match, where the prizes vary according to the number
    of winners, he got a payout of around twice as much as the average
    for a match 4.



    This archive was generated by hypermail 2.1.4 : Tue Nov 19 2002 - 01:31:02 EST