Re: Probability and apologetics

Brian D. Harper (bharper@postbox.acs.ohio-state.edu)
Thu, 7 Sep 1995 23:19:33 -0400

Bill wrote:

"I'm in the midst of a crisis. I sent a brief reply to Brian.
As I told him, I agree with much of what he said -- I'm just
not as sanguine about declaring a finite length sequence nonrandom."
=================

First of all, in view of Bill's crisis I will understand if he chooses
not to continue on with this and will not interpret the absence of a
reply as a concession.

The reason that I am dogmatic (not sanguine!) in declaring that some
finite sequences are nonrandom is that this follows immediately from
the definition I'm using for random. A sequence that is compressible
is by definition nonrandom. The problem is that one cannot prove a
given sequence to be random since it may well be that it is not
random and you just haven't been clever enough to find the means to
compress it. It is, on the other hand, possible to prove a given sequence
is compressible simply by compressing it.
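
Just to illustrate what I mean by proving compressibility, here is a
little Python sketch (zlib is only a stand-in for "the cleverest
possible compressor", so it gives an upper bound on the shortest
description, nothing more):

  import zlib, random

  n_bits = 4096
  ordered_bits = "01" * (n_bits // 2)            # an "ordered" string
  coinflip_bits = "".join(random.choice("01") for _ in range(n_bits))

  def packed_then_compressed(bits):
      # pack the 0/1 characters into real bytes (8 bits per byte), then
      # see how small a general-purpose compressor can make them
      raw = int(bits, 2).to_bytes(len(bits) // 8, "big")
      return len(raw), len(zlib.compress(raw, 9))

  print("ordered  :", packed_then_compressed(ordered_bits))
  print("coin flip:", packed_then_compressed(coinflip_bits))

The ordered string collapses to a handful of bytes, which *proves* it is
compressible. If the coin-flip string refuses to shrink, that proves
nothing; maybe we just weren't clever enough.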

As I mentioned before, this definition has only to do with the structure
of the sequence and not how the sequence was generated. It is for this
reason that I'm going to try to start saying stochastic process instead
of random process. It is possible, though unlikely, for a stochastic
process to yield a sequence whose structure is nonrandom.

Another thing I tend to be dogmatic about is that practically all
sequences are random (incompressible). Note that this doesn't mean
that practically all strings are generated by a stochastic process.
Instead, it means that if you went to the trouble of writing out
all the possible sequences of a given length, most would be
incompressible. The reason I'm dogmatic about it is that this
is a basic result from algorithmic information theory and can be
proven.
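
The bookkeeping behind that proof is simple enough to show directly
(the cutoffs below are arbitrary; this is just the counting, not a
simulation):

  # There are 2**n binary strings of length n, but fewer than 2**(n - c)
  # descriptions (programs) shorter than n - c bits.  So the fraction of
  # length-n strings compressible by c or more bits is below 2**(-c),
  # no matter how large n is.
  n = 64
  for c in (1, 5, 10, 20):
      short_descriptions = 2**(n - c) - 1     # every bit string shorter than n - c
      print("compressible by", c, "bits: at most", short_descriptions,
            "of", 2**n, "strings  (fraction <", 2.0**-c, ")")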

Let me mention here another possible source of confusion which I
should have mentioned previously. Information theory can address
only the structure of a sequence and not its meaning. This should
be obvious if one considers the fact that exactly the same word
can mean different things in different languages. Meaning is
established by mutual consent between two or more parties and
cannot be deduced from the structure itself.

Now for another interesting and perhaps somewhat surprising result
that has some bearing on Glenn's proposal [I'm still thinking about
the best way to answer Glenn --- later!]. It turns out that an
incompressible string *must* contain compressible substrings. This
is no doubt what the author of the New Scientist article was getting
at when he talked about fleeting glimpses of seeming order buried in
random strings. Also, the 11 consecutive 0's in my string would be
an example of a compressible substring.
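
A quick simulation makes the point (the length 2048 below is just a
choice for illustration, not the length of my earlier string, and the
exact numbers will wobble from run to run):

  import random
  from math import log2

  n, trials = 2048, 200
  longest = []
  for _ in range(trials):
      bits = [random.randint(0, 1) for _ in range(n)]
      best = run = 0
      for b in bits:
          run = run + 1 if b == 0 else 0     # count consecutive 0's
          best = max(best, run)
      longest.append(best)

  print("average longest run of 0's:", sum(longest) / trials)
  print("log2(n) for comparison    :", log2(n))

The two numbers typically land within a symbol or two of each other:
stretches of ten-or-so consecutive 0's are exactly what a fair coin is
expected to produce at this length, not fleeting glimpses of design.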

Now I will discuss Bill's post of 9/7/95.

>>bill:========
>> "The point is that no sequence is inherently random or nonrandom.
>> The sequence Brian included could have come from tosses of a fair
>> coin."
>>end:=========
>>

brian:=========
>>Now I'm really intrigued. Is this your interpretation of what the author
>>wrote or did he actually say something like this? In any event I must
>>strongly disagree with the first sentence, practically all sequences are
>>random.
>

bill:==========
>I succumbed to one of my favorite vices: making bald generalizations. The
>point I was trying to make is that it's difficult to determine if a finite
>sequence -- especially a relatively short one -- is random.

Yes, this is precisely the problem mentioned above. It is in fact
impossible to prove that any specific sequence is random even though
almost all of them are :). Interesting paradox. Fortunately, the most
useful thing (to me anyway) is determining if a sequence is nonrandom
(i.e. compressible) and this can be done.

> I agree that
>your sequence B looks much more like a random sequence than sequence A. I
>think you discuss how practically all sequences are random below. At this
>point I will merely note that I suspect we will have to agree on some
>definitions before I will concede _that_.

Actually, one only has to define random as incompressible and then
introduce c-incompressibility as a precise way of defining what you
mean by incompressible. The rest follows from this. See Li and
Vitanyi p.96 [references at the end].

>
>>The second sentence I agree with, sequence (A) *could* have
>>been generated by tossing a fair coin, although this possibility is
>>extremely unlikely. This is not due to the improbability of obtaining
>>that *specific* sequence [the creationist's fallacy] but rather due to
>>the improbability of getting *any* ordered sequence from a random
>>process [failing to recognize this is what I referred to earlier
>>as the evolutionists fallacy]
>

bill:===============
>Perhaps you need to define ordered. Remember, you just said practically
>all sequences are random. Can a random sequence be ordered?

No, random sequences cannot be ordered as a matter of definition. Random
means incompressible, ordered means highly compressible. Random and
ordered are at opposite ends of the compressibility yardstick.

> You have also
>used the term "organized" in other posts to mean something different from
>"ordered". I've been meaning to ask for a definition of "organized" also,
>and maybe I should just do that now.

I must be the world's worst procrastinator (or should I say best :); in
any event I have a bad habit of putting stuff off. I've been meaning
to put together a summary of what I've found on this. This has turned
out to be difficult since many authors use the word without defining
it. I think I will save details for a separate post (i.e. I'm putting
it off again :) and just try to give some brief comments here.

It seems to me that organized is used in several ways depending on the
subject. In the case of algorithmic information theory (what we are
discussing here) organized generally just means something in between
ordered and random, but closer to the random end of the scale. In other
words, somewhat compressible.

In terms of classical information theory (Shannon), organized would
mean large information content but not maximal. The maximal information
content would be that for a random (incompressible) sequence. This is
very similar to the above.

Using typical complexity/chaos terminology we would say organization
occurs "at the edge of chaos". This is also very close to the meaning
above.

In books about organization in biological systems, organization seems to
mean essentially what Mike Behe and others mean by irreducible complexity.

I am being somewhat reserved in my statements above because, as I said,
not too many authors come right out and say "now this is exactly what
I mean by organized, pay attention", i.e. I have to try to infer what
they mean from the context.

>>
>>I think the confusion here is due to my using random in two different
>>ways in the same post. I do this out of habit since this terminology
>>is common in the literature. Anyway, the random in random sequence means
>>something different than the random in random process. I think its
>>clear that this must be the case since it is in fact possible for
>>a random process to produce a non-random sequence, an apparent
>>contradiction. Random sequence takes on the definition of random
>>used in information theory. Whether a sequence is random or not
>>depends solely on its structure and is completely independent of
>>the process used to produce the sequence. Thus, one can discuss
>>the orderliness or randomness of a sequence knowing absolutely
>>nothing about how it was generated. To avoid this confusion, many
>>authors have started substituting stochastic process for random
>>process, reserving the word random for its information theoretic
>>definition.
>
>Unfortunately, the courses I took on probability, random numbers and
>stochastic processes in grad school 30 years ago were quite vague about
>getting definitions nailed down. Maybe it's time to read some more modern
>references. Any suggestions?

The bionet.info-theory FAQ suggests Cover&Thomas and Pierce [refs below]
as good introductory books. I would probably recommend Cover&Thomas
since it also goes into Kolmogorov Complexity (algorithmic info-theory).

For something quick and also cheap, there are primers on info-theory
on the world wide web:

http://www-lmmb.ncifcrf.gov/~toms/

http://www.math.washington.edu/~hillman/entropy.html

There is also a new book on probability theory, written by none other
than E. T. Jaynes, available for free on the web. The title is:

Probability Theory: The Logic of Science

It can be found by starting at the first address above and clicking
the following:

On Line Papers/Papers by other people/Jaynes Book

> To me the idea of saying that a substring of
>the output from a true random number generator is nonrandom seems
>foolhardy. To me, saying it's nonrandom implies something about the
>underlying process which has generated the string.

Bill, Bill, Bill ...... :-). This is it! [well, almost]

As I mentioned at the beginning, any incompressible (random) string
must contain compressible (nonrandom) substrings. Let's forget
substrings for the moment since it complicates things unnecessarily.
I will note however that I'm going to have to try to deal with
this before answering Glenn. His little scenario is strengthened by
this observation.

Let me modify your statement above ever so slightly:

"To me the idea of saying that [....] the output from a
true random number generator is nonrandom seems foolhardy. To me,
[observing] it's nonrandom implies something about the underlying
process which has generated the string"

[brackets] indicate where I have changed your statement.

The modified statement is exactly the point I have been trying to get
across. Now let's imagine that we do not know _a priori_ that a sequence
has been generated by a true stochastic process. Observing that this
sequence is nonrandom definitely does imply something about the
underlying process: it implies the process is not truly stochastic,
and to say otherwise would be foolhardy indeed ;-). But the conclusion
that the underlying process is not stochastic is never 100% certain,
because it is possible, though very unlikely, for a truly stochastic
process to give a nonrandom result.

But it's important here to consider all the available information and
not just a subset (i.e. a substring). As an experimentalist, I would
say "keep all the data and not just the part that fits your theory".
Again, let's imagine that we are receiving a sequence from some
unknown source one symbol at a time.

0101010101

At this point we might formulate the hypothesis that the message is not
being generated by a stochastic process. From this hypothesis we make a
prediction about what will come next and we continue to watch:

010101010101010101010101010101

The prediction comes true, strengthening our hypothesis:

0101010101010101010101010101010101010101010101

Our confidence grows:

0101010101010101010101010101010101010101010101010101111001011

oops, hypothesis falsified, back to the drawing board ;-)

The conclusion that the underlying process is not stochastic is always
tentative and can be falsified with additional information. But the
longer the sequence the greater the certainty.
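
One way to put numbers on this little game (my own toy framing, a
simple likelihood comparison between two made-up hypotheses, nothing
more):

  from fractions import Fraction

  # H1: the source deterministically alternates 0101...
  # H2: the source is a fair coin
  observed = "0101010101010101010101010101010101010101010101010101111001011"

  p_h1 = Fraction(1)     # probability of everything seen so far under H1
  p_h2 = Fraction(1)     # same, under H2
  for k, symbol in enumerate(observed):
      p_h1 *= 1 if symbol == "01"[k % 2] else 0
      p_h2 *= Fraction(1, 2)
      if p_h1 == 0:
          print("symbol", k + 1, ": H1 falsified, back to the drawing board")
          break
      if (k + 1) % 10 == 0:
          print("after", k + 1, "symbols, odds for H1 over H2:", p_h1 / p_h2)

The odds in favor of "not a fair coin" double with every symbol that
fits the pattern, and collapse to zero the moment one symbol doesn't;
with a longer unbroken run the tentative conclusion just gets harder
and harder to overturn.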

> If it doesn't, what
>does it buy you? Suppose the string 11110000 occurs in a long sequence of
>outputs from a true random number generator -- flips of a fair coin. I can
>certainly claim that string is nonrandom, and I can easily make a finite
>state machine that can generate it. But what's the point? None of that
>tells me anything about when I can expect to see such a string again. It
>seems to me that what we're doing when we try to understand nature is
>looking at data and searching for underlying regularity. Frequently that
>involves filtering operations to remove noise. If there's no underlying
>regularity, then you can have any pattern you want in a string and it
>doesn't tell you anything about the process by which the string was
>generated, (except that it's random or chaotic :-)) which is generally the
>objective in studying random sequences.

Agreed.

[... deleted Gell-Mann]

bill:=================
>Okay. Fair enough. However, even the incompressibility definition is
>potentially problematic. For example Michael Barnsley has shown that
>fractals can be used to compress scenes from nature. That makes sense
>because fairly natural-looking pictures of mountains, clouds and islands
>can be generated using fractals. In other words, there may be a certain
>nonlinear recurrence that, if found (and that's the rub with Barnsley's
>approach) can achieve compression. As I've said before, the beautiful
>patterns based on the Mandelbrot set can be expressed in terms of a simple
>recursion: x(n+1) = x(n)^2 + c and a coloring rule based on escape times.
>But if you tried to compress one of these patterns without knowing the
>generation rule, you'd be stuck.

Here's an interesting experiment I've been intending to try. Generate
a Mandelbrot set and use a screen capture program to save it. Now
write a program that fills in various dots with different colors at
random. Generate a screen and capture it. Now use a compression program
to see which of these can be compressed the most. I'll wager that the file
generated by the Mandelbrot set can be compressed more, but obviously
not to x(n+1) = x(n)^2 + c ;-).

Even if this were not the case it would not change the points I've
been making. It is always possible that something is compressible
and you just aren't clever enough to compress it. The power comes
when you are able to compress something because you know for sure
that anything that can be compressed is in fact compressible ;-).
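
Here is a rough version of the experiment for anyone who wants to skip
the screen-capture step (an escape-time array stands in for the captured
screen, zlib stands in for the compression program, and the sizes and
iteration count are just my choices):

  import random, zlib

  W, H, MAXIT = 200, 200, 50

  # "screen 1": the Mandelbrot recursion z -> z**2 + c, one byte per
  # pixel, colored by how quickly the point escapes
  mandel = bytearray()
  for j in range(H):
      for i in range(W):
          c = complex(-2.0 + 3.0 * i / W, -1.5 + 3.0 * j / H)
          z, n = 0j, 0
          while abs(z) <= 2.0 and n < MAXIT:
              z = z * z + c
              n += 1
          mandel.append(n * 255 // MAXIT)

  # "screen 2": the same number of pixels filled in at random
  noise = bytes(random.randrange(256) for _ in range(W * H))

  print("Mandelbrot screen:", len(zlib.compress(bytes(mandel), 9)), "bytes")
  print("random screen    :", len(zlib.compress(noise, 9)), "bytes")

I would expect the first number to be far smaller than the second,
though, as I said, nowhere near as small as x(n+1) = x(n)^2 + c itself.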

[...]

>>Some time ago someone on bionet.info-theory suggested a way of looking
>>at this that I have found very useful.
>
>[interesting stuff about c-incompressibility snipped. I want to read it
>and respond later]
>

Just wanted to point out that only the suggestion that one imagine
actually receiving the sequence as a message one symbol at a time
was borrowed from someone in bionet.info-theory.

The c-incompressibility stuff comes from Li and Vitanyi, page 96.

[...]

brian:===
>>Let's consider another example. Evolutionists make a big deal about
>>about the patterns seen in the fossil record, claiming that these
>>patterns give strong evidence for common descent. Were I to use
>>the evolutionists fallacy I could simply say "so what". The pattern
>>seen is no less likely to have resulted from a random placement
>>of fossils here and there.
>
bill:=========
>But what the evolutionists claim is that they see patterns that are
>strongly suggestive of nonrandomness.

Exactly!!!

Let's review the evolutionist's fallacy. Any *specific* string of length
n has exactly the same probability of occurring in a stochastic process
as any other *specific* string of length n. Therefore the two specific
strings A & B that I gave earlier have exactly the same probability
of occurring. This is true. The conclusion is that one should not be
surprised to flip a coin 64 times and get alternating heads and tails.
This conclusion fails to recognize that practically all results
from a stochastic process are random. A is obviously nonrandom so
you should be surprised.
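
Putting the two probabilities side by side for the 64-flip case (the
10-bit cutoff for "ordered" is arbitrary; this is the same counting
bound as before):

  n, c = 64, 10
  print("P(the specific string A) = P(the specific string B) =", 1.0 / 2**n)
  print("P(getting *some* string compressible by 10+ bits)  <", 2.0**-c)

Either specific string is equally (and astronomically) improbable; the
difference is that fewer than one coin-flip run in a thousand is even
10 bits compressible, and a string as ordered as A is compressible by
far more than that.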

So, if the fallacy is applied consistently, then the patterns are not
"strongly suggestive of nonrandomness". There is no reason to suspect
that there is an underlying nonrandom process called evolution that
can be used to explain the pattern.

Note that I was not using this example to argue that evolutionists
are mistaken in seeing a pattern in the fossil record or that this
pattern is attributable to noise. I was merely pointing out that
patterns don't mean anything if the fallacy is applied consistently.

> But your point is well-taken: it's
>very easy to attribute conclusions we disagree with to "noise". To take
>your example a bit further, a theist who doesn't accept the young-earth
>creationist scenario could look at the fossil record and conclude that some
>sort of coherent developmental plan seems to be documented -- which is
>consistent with his belief in God as creator and sovereign. And he'd have
>both evolutionists and young-earth creationists mad at him.

Ah, caught in the middle, sounds like fun to me ;-).

references:========================================================
T.M. Cover and J.A. Thomas, _Elements of Information
Theory_, Wiley, 1991.

M. Li and P. Vitanyi, _An Introduction to Kolmogorov
Complexity and Its Applications_, Springer-Verlag, 1993.

J.R. Pierce, _Symbols, Signals, and Noise: The Nature and Process
of Communication_, Dover, 1980.

H.P. Yockey, _Information Theory and Molecular Biology_,
Cambridge University Press, 1992.

========================================================================

Brian Harper:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=
"I believe there are 15,747,724,136,275,002,577,605,653,961,181,555,468,
044,717,914,527,116,709,366,231,425,076,185,631,031,296 protons in the
Universe and the same number of electrons." Arthur Stanley Eddington
:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=:=