Re: question on algorithmic complexity

Brian D. Harper (bharper@magnus.acs.ohio-state.edu)
Thu, 18 Jan 1996 13:56:13 -0500

Bill Hamilton wrote:

>This question is for Brian Harper, but it may be of interest to others.
>You have suggested that there is a connection between algorithmic
>complexity and design, and that raises a question in my mind. I am
>presuming that one possible definition of algorithmic complexity might be
>applied to a sequence, and the algorithmic complexity in this case might be
>some measure of the length of the algorithm required to generate it (or,
>more properly, maybe we should say the minimum length algorithm which will
>generate it, but I suppose that problem might be NP hard). In any case, if
>that definition is suitable, what would be the algorithmic complexity of a
>pure random sequence (not psueudorandom, but truly random?) Might it not
>be very large, or even infinite? If so, and if my earlier assumptions are
>reasonable, shouldn't we be willing to consider that the algorithmic
>complexity definition of design is incomplete? As I said earlier, one of
>the things human designers strive for is elegance, which might be defined
>as achieving simplicity in design. A truly elegant design might, if it can
>be expressed in terms of algorithmic complexity, might be quite a bit less
>complex than a poor, patched up design.
>

First, let me say that I'm in the process of writing a detailed
description of algorithmic complexity (AC) which I hope to post
soon. I've decided to wait on my promised list of concept
definitions until I post this stuff as an understanding of
algorithmic info theory will be helpful in reading thru the
other definitions.

So, let me give just a brief answer to your question at this
point.I think you've got the definition of AC down but are
uneasy about randomness. As we will see, random takes on a
different meaning in algorithmic information theory (AIT).

AC was proposed independently (and at about the same time) by
Ray Solomonoff, Andrei Kolmogorov, and Gregory Chaitin. Chaitin
was an undergrad at the time and I've read in a couple of places
that he was only 15 at the time.

Kolmogorov and Chaitin were interested in the complexity of numbers
(finite binary strings) and in this context we can write the
"short" form definition of AC as

"the complexity of a number is the length (in bits) of the shortest
algorithm that prints out the number"

This can be readily generalized to

"the complexity of an object is the length (in bits) of the
shortest algorithm that completely describes the object"

Solomonoff came up with the idea from a somewhat different direction
which, I think, better addresses your question. Solomonoff was interested
in finding an objective measure for deciding the simplicity of a
scientific theory. The idea being (a la Occam) that given a choice
between several competing theories, each explaining the data, we select
the "simplist". But what precisely does "simplest" mean?

I'll go into this in more detail in my forthcoming post, here I'll
give an illustration that should show the tie-in with Kolmogorov and
Chaitin and also address your question.

Suppose you have a mountain of data describing the location of all
the planets at time intervals of one hour for say 10 years. What is
the best theory to explain this data? By "theory" here we mean a
computer program that, when executed, reproduces the original data
plus makes predictions for an arbitrary length of time into the
future. The best "theory" is the shortest such program and in our
case would be a program containing Newton's laws [as a sidelight,
note that if our data is extensive, General Relativity would
have to replace Newton's laws as the best theory].

So, the theory can be viewed as a highly compressed version of the
original data. Now for randomness, in AIT something is said to
be random if it is incompressible. The original data is not random
since it is highly compressible. The best "theory" *is* random by
definition, i.e. if it could be compressed further then it isn't
the "best". So, in the context of AIT, Newton's laws would be
either random or very nearly random. This is obviously not the
usual meaning of "random" :-).

Now, let's say we have another mountain of data obtained say by
flipping a fair coin 10^8 times. This set of data will most
likely be random in the AIT sense and the "best" theory will
be a "program" that says

"PRINT HTHHTHHHTHHTHTHTHTHTHTHTHTHTHTTHTHHHTTHHHHHHTHHTHTHTHTTT
........................................................
........................................................
......................................"

Of course, in the usual sense of the word, this is no theory at
all. So, in terms of "simplicity" one should compare the compressed
form of the theory with the original data. If the amount of compression
is very very high then we have a "simple" theory.

Now lets look at the last part of your question, which I'll repeat
for convenience:

Bill H:====================================
>If so, and if my earlier assumptions are
>reasonable, shouldn't we be willing to consider that the algorithmic
>complexity definition of design is incomplete? As I said earlier, one of
>the things human designers strive for is elegance, which might be defined
>as achieving simplicity in design. A truly elegant design might, if it can
>be expressed in terms of algorithmic complexity, might be quite a bit less
>complex than a poor, patched up design.
>

I think AIT may have a lot to say about design, but it cannot be the
whole story, i.e. it cannot address functionality, meaning, goodness
of design etc. Also, to the extent I've discussed it so far, it really
cannot even address "design" in the sense, say, of organization. Something
can be extremely complex yet unorganized. Chaitin discusses this in
some of his papers (tentatively) and there seems to be a possibility
for relating organization to the AIT analog of "mutual" information.
In other words, it is possible to define "algorithmic complexities"
which give a measure not only of complexity as defined above but
also of the "interconnectedness" or "interdependence" of "subcomponents".