RE: Dembski and Caesar cyphers

From: Iain Strachan (iain.strachan@eudoramail.com)
Date: Mon Nov 20 2000 - 21:37:56 EST

Next message: Glenn Morton: "RE: Dembski and Caesar cyphers"

Previous message: Dawsonzhu@aol.com: "RE: Dembski and Caesar cyphers"
Maybe in reply to: Glenn Morton: "Dembski and Caesar cyphers"
Next in thread: Glenn Morton: "RE: Dembski and Caesar cyphers"

Glenn wrote:

And I still think you miss the key point. If I present Dembski a random
sequence, WITHOUT the key, and even without the knowledge that there is a
key, Dembski will conclude that there is no design. That is what Dembski
says over and over in the books. Random sequences mean no design. But,
then AFTER this conclusion, I provide him with a Vigenere keyword which
turns that into a readable sentence. By providing him a key, I have told him
that this is a designed sequence. His methodology didn't detect the design,
I TOLD HIM IT WAS DESIGNED!!! What kind of methodology is it that needs me
to tell him it is designed in order for his methodology to work?????

My reply:

Just stop right there and take a look at that last couple of
sentences that you wrote.
Let me make it plain that I'm not in the business of being some sort
of a Dembski cheer-leader, dumbly saying yes to his every
pronouncement. I have unresolved issues with Dembski's use of the No
Free Lunch theorems that I hope he will address in due course. But
neither am I prepared to give you the opportunity to rubbish him in
immoderate language of that kind, complete with upper case letters
(generally netiquette considers this bad manners, and the equivalent
of shouting) and multiple question marks. If your use of this kind
of language is meant to imply that either Dembski or I or both
of us are too stupid to listen to reasoned argument and you have to
shout, then I'm not interested in continuing this conversation. All
three of us are committed Christians and we should be able to
continue a discourse in a brotherly manner. I challenged your
original email purely on scientific grounds because I thought your
point was not relevant and could not be used as a challenge to
Dembski's methodology, and I still do. I was interested to
see if we could continue the discourse and enhance our mutual
understanding of the subject. If that's what you want to do, fine,
let's continue, but if all you really want to do is to discredit
Dembski at whatever cost, then count me out.

Next point. What you refer to as "Dembski's methodology" in this
example is not even down to Dembski. It is simply an elaboration of
the "Minimum Description Length Principle". It is a well-established
bit of theory that no-one would seriously question. A good web
resource on MDL is at

http://www.mdl-research.org/

I quote from the homepage of this website:

-------------
The purpose of statistical modeling is to discover regularities in
observed data. The success in finding such regularities can be
measured by the length with which the data can be described. This is
the rationale behind the Minimum Description Length (MDL) Principle
introduced by Jorma Rissanen (Rissanen, 1978).

`` The MDL Principle is a relatively recent method for inductive
inference. The fundamental idea behind the MDL Principle is that any
regularity in a given set of data can be used to compress the data,
i.e. to describe it using fewer symbols than needed to describe the
data literally. '' (Grünwald, 1998)
------------

Where Dembski uses the term "Design", here the term "regularities in
observed data" is used instead. However, this is only part of
Dembski's methodology for detecting design.

If you look up Rissanen on Citeseer, you will find 301 citations to
his original paper:

J.Rissanen, Modeling by shortest data description. Automatica, vol.
14 (1978), pp. 465-471.

So this is peer-reviewed standard stuff, not the offbeat ideas of
some crackpot. In my own field of academic research, the MDL
principle can be used to assist in model-order selection for data
fitting by neural networks.

What it means is that if a compact model can be found, then the data
exhibits significant non-random patterns (read "design" if you wish;
though the patterns might be naturally occurring of course). How
does this work? Simply by probabilities deduced from a counting
argument.

Consider tossing a coin 500 times and it comes up heads 500 times.
Can you detect "design", "cheating", "a biased coin" call it what you
will, from this? On the face of it, a sequence of 500 heads in a row
is just as (un)likely to occur as any other sequence (p =
3.05x10^(-151)). So why are we surprised if we get 500 heads in a row
as opposed to a random looking sequence? It is because we can
describe the sequence in a compact form "500 heads in a row" for
example, which is 18 characters in ASCII, or 144 bits of information.
That 144 bits has been used to specify 500 bits of information (the
sequence of coin-toss results). Now by a simple counting argument
you can get an upper bound on the probability that a sequence of 500
coin-tosses can be described in 144 bits or less. The total number
of possible descriptors is clearly 2^144 (of which of course the
vast majority will not be descriptors, such as "he sells sea shells",
but we only want an upper bound). These 2^144 possible
descriptors can only account for a maximum of 2^144 of
the 2^500 possible sequences. Hence the probability that a sequence
of 500 coin tosses can be described in 144 bits or less is at most
2^(144-500), or 6.8x10^(-108).
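
The arithmetic in the coin-toss argument can be checked with a few lines of Python. This is only a sketch of the counting bound as described above; the variable names are chosen for illustration, and the 8-bits-per-character ASCII encoding is the same assumption made in the text.

```python
# Counting bound for the coin-toss example (illustrative names only).
N_TOSSES = 500
description = "500 heads in a row"

# Assumption from the text: 8 bits per ASCII character.
description_bits = 8 * len(description)

# Probability of any one specific sequence of 500 fair tosses.
p_sequence = 2.0 ** (-N_TOSSES)

# At most 2^144 descriptors exist, so at most 2^144 of the 2^500
# sequences can have a description this short.
p_bound = 2.0 ** (description_bits - N_TOSSES)

print(f"description length: {len(description)} chars = {description_bits} bits")
print(f"P(specific sequence)      = {p_sequence:.2e}")
print(f"P(describable in <= {description_bits} bits) <= {p_bound:.2e}")
```

Counting the shorter descriptors as well (143 bits, 142 bits, and so on) would at most double the bound, which makes no difference to the conclusion.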

That is why you suspect some "design" or "non-randomness" when you
get 500 heads in a row, not because of the intrinsic probability of
the sequence itself, but because it is staggeringly unlikely that you
can describe the sequence in such a small amount of information.

Glenn wrote:
You miss the point again because my point is not that a random sequence of
letters can produce a Shakespearean Sonnet, but that Dembski's methodology
simply doesn't detect design without being told that something is designed.

My reply:

Now as far as whether Dembski says you have to tell him that it's
designed, I think perhaps he phrased it badly in the book when he
says that someone tells him it's a Caesar cypher. But someone
telling him that is not necessary to deduce design, and I can't think
that he meant it literally. What happens if you get a
sequence of letters like that in a letter, or if you saw them
engraved on a stone? Do you dismiss it as random junk, or do you
wonder if it's a code? If you think it might be a code, then you
start looking for means to break the code. You start at the simplest
idea of all (a Caesar cipher, for example), and see if that works.
If it doesn't, you try something more complex, e.g. a fixed letter
substitution code, etc. You wouldn't start by trying a Vigenere
cipher the length of the text because you know you can produce any
text you want that way. You might try Vigenere ciphers of repeating
keys of length 2, then 3, then 4, however (and of course the search
gets exponentially harder the further you take this).
The simpler the code, the more likely you are to find it, and the
more confident you can be of design. But if your decoding scheme
occupies the same length as your message (such as a Vigenere cypher
with a key as long as the text), then you clearly can't make any
design deduction, and Dembski correctly
says as much in the quotation from No Free Lunch that I gave. The
longer the description of the decoding scheme, the less confident you
can be of a design, because the probability in the above counting
argument increases.
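
To make the contrast concrete, here is a small Python sketch. It tries all 26 Caesar shifts and keeps the one that yields the most recognisable words, then shows that a Vigenere key as long as the text can turn one ciphertext into any message of the same length. The function names, the tiny word list and the sample strings are all made up for illustration; a real attack would use a proper dictionary or letter-frequency statistics.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_shift(text: str, shift: int) -> str:
    """Shift each capital letter by `shift` places; leave other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )

# Hypothetical, deliberately tiny word list used to judge "readability".
COMMON_WORDS = {"THE", "AND", "IS", "OF", "TO", "IN"}

def score(text: str) -> int:
    """Count how many words of the candidate plaintext are common words."""
    return sum(word in COMMON_WORDS for word in text.split())

def crack_caesar(ciphertext: str):
    """Try every shift (only 26 possibilities) and keep the best-scoring one."""
    best_score, best_key = max(
        (score(caesar_shift(ciphertext, -k)), k) for k in range(26)
    )
    return best_key, caesar_shift(ciphertext, -best_key)

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Decrypt letters-only text with a (repeating) Vigenere key."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key[i % len(key)])) % 26]
        for i, c in enumerate(ciphertext)
    )

def full_length_key_for(ciphertext: str, target: str) -> str:
    """Build a key as long as the text that 'decodes' ciphertext into target."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(t)) % 26]
        for c, t in zip(ciphertext, target)
    )

# A short key is found by search alone; nobody has to supply it.
secret = caesar_shift("THE KEY IS IN THE STONE", 3)
found_key, recovered = crack_caesar(secret)

# A full-length key "explains" the same gibberish as any message you like.
gibberish = "QXZPLMWRTK"
key_a = full_length_key_for(gibberish, "HELLOWORLD")
key_b = full_length_key_for(gibberish, "GOODBYEALL")
```

The Caesar key is a single number recovered from the text itself, whereas `key_a` and `key_b` each carry as much information as the message they produce, so finding one tells you nothing about the gibberish.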

But go back to the Lottery example I gave. Here are the numbers:

14 17 22 29 38 49 (13)

No one needs to tell me that this is or is not designed. I have only
to do some simple arithmetical transformations to the numbers, like
take the differences between successive numbers, to see a possible
pattern:

3 5 7 9 11 (-36)

From that it's a short step to deduce it's a Caesar cipher. But the
sequence isn't nearly long enough to discount coincidence. The
description that it's a Caesar cipher with a shift of 13 on the
squares is hard to express in less length than to specify the seven
numbers (something like B(n) = (n^2 + 13) mod 49).

But if it happens next week and the week after, then we can get to be
more confident.
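
For anyone who wants to check the lottery arithmetic, here is a short Python sketch. It takes the rule to be the squares shifted by 13, i.e. n^2 + 13, wrapped back into the range 1 to 49; the exact wrap-around convention and the function name are my assumptions for illustration.

```python
# The six main numbers followed by the bonus ball, as given above.
numbers = [14, 17, 22, 29, 38, 49, 13]

# Differences between successive numbers: the odd numbers, then a wrap-around.
differences = [b - a for a, b in zip(numbers, numbers[1:])]

def shifted_square(n: int) -> int:
    """Hypothetical rule: n^2 + 13, wrapped into the lottery range 1..49."""
    return (n * n + 13 - 1) % 49 + 1

predicted = [shifted_square(n) for n in range(1, 8)]

print(differences)
print(predicted == numbers)
```

The rule reproduces all seven numbers, but writing it down takes about as much space as listing them, which is why a single draw is not enough to rule out coincidence.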

So here's the conclusion. If a simple description of the data exists
it will be easy to find, and a "design", or "non-random" conclusion
can be made with confidence. If you need a complex model then no
design detection can be made unless you tell me the design.

None of this is a Dembski-an idea; it all follows from the minimum
description length principle.

Your comments, please, but please steer clear of the CAPITAL LETTERS
and ?????? stuff. There is nothing wrong with my hearing :-)

Iain.



This archive was generated by hypermail 2.1.4 : Wed Nov 20 2002 - 21:27:37 EST