RE: Information: Brad's reply (was Information: a very

Brian D Harper (bharper@postbox.acs.ohio-state.edu)
Fri, 03 Jul 1998 13:32:50 -0400

At 06:33 PM 7/1/98 +0800, Brad wrote:
>>
>> At 01:57 PM 6/30/98 +0800, Brad wrote:
>>
>> [...]
>>
>> >
>> >Brian,
>> >
>> >I'd like to add something about how information theory deals
>> with meaning.
>> >
>>
>> First of all, welcome to the group and thanks for your contributions.
>>
>> Before getting specifically to your points here I would like to
>> take a look at your quiz question that you gave in another post:
>> _____________________________________________________________
>> Do you agree or disagree with the following statements?
>>
>> "Information theory is pure nonsense! Noise is usually modelled
>> as a random
>> source and a random source contains the most information since all symbols
>> are equiprobable. Thus the most informative information source is noise."
>> _____________________________________________________________
>>
>> This is a great example of what has been, IMHO, a plague in information
>> theory, namely a play on words. In this question the word information
>> is being used with two different meanings. If one were to insist on
>> using only one definition for information in this question and
>> were to take that as the technical definition, then the answer
>> is clear. The most informative source would be the one containing
>> the most information in the technical sense. Thus the most informative
>> source would be a stochastic one with equal probabilities for
>> all outcomes. This is very clear and becomes nonsense only when
>> one plays a word game and switches the meaning of informative
>> to its everyday meaning.
>
>No, the difference is that a source can be modeled by a random source
>without itself being a random source, i.e. a source that can be modeled as an
>equiprobable random source will transmit the most information. But an actual
>random source does not.
>

This makes no sense, Brad. Generally speaking, one wants models
that model that which is to be modeled. If what you are saying
is correct, then the model does not possess the most important
feature of that which is being modeled, namely information.
Let's look at it another way. If one's model of an equiprobable
random source contains maximal information, then a real equiprobable
random source will contain the most information, at least according to
the model.
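
To make this concrete, here is a quick sketch (Python, purely as an
illustration; the particular numbers are mine) computing the Shannon
entropy of a couple of source models. The equiprobable one comes out
highest, exactly as the technical definition of information says it should:
_____________________________________________________________
import math

def entropy(probs):
    # Shannon entropy in bits of a discrete distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# An equiprobable source over 11 symbols vs. a skewed one.
uniform = [1.0 / 11] * 11
skewed = [0.5] + [0.05] * 10

print(entropy(uniform))  # log2(11), about 3.46 bits/symbol -- the maximum
print(entropy(skewed))   # about 2.66 bits/symbol -- strictly less
_____________________________________________________________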

>
>>
>> Now, random is another word which leads to great confusion. Not
>> only is the technical meaning different from that in everyday
>> conversation, the technical meaning can also vary depending on
>> the technical field. To illustrate, let me construct my own
>> quiz question motivated by part of your quiz above, namely
>>
>> "... a random source contains the most information since all symbols
>> are equiprobable" -- Brad's quiz
>>
>> Suppose you have a pair of fair dice. You roll these and then
>> compute the sum of the two results. This sum is the output of our
>> process, the possible outcomes ranging between 2 and 12.
>> Now for the quiz:
>>
>> a) is this a random process?
>>
>> b) are all outcomes equally probable?
>>
>> Now let me use this quiz to illustrate a point. This is easiest
>> if we associate a letter to each of the possible outcomes above,
>> 2=A, 3=B, 4=C, 5=D ... 12=K. Now suppose we actually generate
>> a "message" by the method suggested above. We keep rolling the
>> dice and recording A-K according to the sum we get. Now, if we
>> were to assume each letter occurs with equal probability then
>> we would have about 3.46 bits per symbol as the information
>> content of a typical message. If we were clever enough to
>> figure out the real probability distribution for this process,
>> then we would be able to compress the information content to
>> about 3.27 bits per symbol. All this has nothing to do with
>> whether or not our message is "meaningful".
>
>This is very true. But if you look for a superior model you find that what
>is being transmitted doesn't have to be, i.e. random numbers can be created
>locally instead of being sent. Therefore stop sending the random numbers and
>generate them locally. Now the information being sent is zero.
>

Yes, but only because nothing is being sent. But your argument
makes no sense even from a semantic view of information. You
say why bother sending it if it is random; I say why bother
generating it locally if it's random. IOW, if it has no value then
there is no point generating it. What is the point of generating
something with no information content? Surely one doesn't need
an algorithm to do that.

>This is a good example of a MODEL of a source, i.e. a source that behaves LIKE
>a random source. Nobody would actually transmit the random data if that is
>actually what it is.
>

Not true. I have in the past generated results by the toss-two-dice-
and-take-the-sum procedure and I have transmitted
them to this group. I have also generated data from other
stochastic processes and transmitted the data to this group.
You say it's more efficient to generate results locally; I say
otherwise. Had I told people how to generate the results instead
of sending them, no one would have actually done it and I wouldn't
have been able to get my point across. So, transmitting the
data was the most efficient way of communicating my message.

[...]

>>
>> OK, one last comment. Above you wrote:
>>
>> "Information theory does not ascribe meaning to information.
>> It does however ascribe NO MEANING to any randomness or noise.
>> Do you understand this?" -- Brad
>>
>> A fundamental result from algorithmic information theory (AIT)
>> is that it is impossible to prove that any particular sequence
>> is random. This is very interesting in view of the fact that
>> the vast majority of all the possible sequences *are* random.
>> From this it would seem that what you suggest above is
>> impossible.
>
>Not at all. It was stated that "random mutations increase information". Now
>from this it is stated that the source is random; how can that be hard to
>work out?!?
>

But you are just illustrating my point. "Random" as in random mutation
does not mean "random" as used in either statistics or in information
theory.

>In your above example, once I know dice are being rolled it isn't hard to
>conclude the data is random, is it?
>

It must not be as easy as you think since, as a matter of fact, the
data is not random. You neglected to answer one of the two questions
above. The symbols A-K will not occur with equal probability in a
typical sequence generated by that stochastic process. Therefore,
a typical sequence is not random.
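
If in doubt, here is a little sketch (again Python, purely illustrative)
that computes the actual distribution of the two-dice sum and its entropy.
The outcomes are far from equiprobable, which is why the entropy falls
short of the log2(11) one would get by assuming they were:
_____________________________________________________________
from itertools import product
import math

# Count how many of the 36 equally likely rolls give each sum 2..12.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1

probs = {s: c / 36 for s, c in counts.items()}
for s in sorted(probs):
    print(s, probs[s])   # ranges from 1/36 (sums 2, 12) to 6/36 (sum 7)

H = -sum(p * math.log2(p) for p in probs.values())
print(H)                 # about 3.27 bits/symbol
print(math.log2(11))     # about 3.46 bits/symbol if assumed equiprobable
_____________________________________________________________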

>How hard it is to tell whether something is random does not matter if the
>debate is about random changes. As I have previously said, once we know it
>is random we can treat it accordingly.
>

But you've already blundered, Brad. Random mutations are not random
in the sense that you are using the term.

>In general we normally have a lot more than the sequence to tell if it is
>random. I have always stated that the SOURCE is investigated, not the
>message. I have never claimed that the randomness can be determined from the
>message.
>

Brian Harper
Associate Professor
Applied Mechanics
The Ohio State University

"It appears to me that this author is asking
much less than what you are refusing to answer"
-- Galileo (as Simplicio in _The Dialogue_)