RE: Information: Brad's reply (was Information: a very

Brad Jones (bjones@tartarus.uwa.edu.au)
Wed, 1 Jul 1998 18:33:58 +0800

>
> At 01:57 PM 6/30/98 +0800, Brad wrote:
>
> [...]
>
> >
> >Brian,
> >
> >I'd like to add something about how information theory deals
> with meaning.
> >
>
> First of all, welcome to the group and thanks for your contributions.
>
> Before getting specifically to your points here I would like to
> take a look at your quiz question that you gave in another post:
> _____________________________________________________________
> Do you agree or disagree with the following statements?
>
> "Information theory is pure nonsense! Noise is usually modelled
> as a random
> source and a random source contains the most information since all symbols
> are equiprobable. Thus the most informative information source is noise."
> _____________________________________________________________
>
> This is a great example of what has been, IMHO, a plague in information
> theory, namely a play on words. In this question the word information
> is being used with two different meanings. If one were to insist on
> using only one definition for information in this question and
> were to take that as the technical definition, then the answer
> is clear. The most informative source would be the one containing
> the most information in the technical sense. Thus the most informative
> source would be a stochastic one with equal probabilities for
> all outcomes. This is very clear and becomes nonsense only when
> one plays a word game and switches the meaning of informative
> to its everyday meaning.

No, the difference is that a source can be modeled by a random source
without itself being a random source. That is, a source that can be modeled
as an equiprobable random source will transmit the most information, but an
actual random source does not.
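To make that concrete, here is a rough sketch of my point (my own
illustration in Python; the seed value and stream length are arbitrary):
if the "random" data is really pseudo-random and the sender and receiver
share the generator, only the seed ever needs to cross the channel.

    import random

    def pseudo_random_stream(seed, n):
        """Regenerate n 'random' bytes from a shared generator."""
        rng = random.Random(seed)
        return bytes(rng.randrange(256) for _ in range(n))

    # Sender side: a megabyte of apparently random data...
    seed = 12345                                  # arbitrary example seed
    data = pseudo_random_stream(seed, 1000000)

    # ...but the channel only needs to carry the seed itself.
    received = pseudo_random_stream(seed, 1000000)
    assert received == data   # receiver rebuilds the whole stream locally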

>
> Now, random is another word which leads to great confusion. Not
> only is the technical meaning different from that in everyday
> conversation, the technical meaning can also vary depending on
> the technical field. To illustrate, let me construct my own
> quiz question motivated by part of your quiz above, namely
>
> "... a random source contains the most information since all symbols
> are equiprobable" -- Brad's quiz
>
> Suppose you have a pair of fair dice. You roll these and then
> compute the sum of the two results. This sum is the output of our
> process, the possible outcomes ranging between 2 and 12.
> Now for the quiz:
>
> a) is this a random process?
>
> b) are all outcomes equally probable?
>
> Now let me use this quiz to illustrate a point. This is easiest
> if we associate a letter to each of the possible outcomes above,
> 2=A, 3=B, 4=C, 5=D ... 12=K. Now suppose we actually generate
> a "message" by the method suggested above. We keep rolling the
> dice and recording A-K according to the sum we get. Now, if we
> were to assume each letter occurs with equal probability then
> we would have about 3.46 bits per symbol as the information
> content of a typical message. If we were clever enough to
> figure out the real probability distribution for this process,
> then we would be able to compress the information content to
> about 3.27 bits per symbol. All this has nothing to do with
> whether or not our message is "meaningful".

This is very true. But if you look for a superior model, you find that what
is being transmitted doesn't have to be transmitted at all: random numbers
can be created locally instead of being sent. So stop sending the random
numbers and generate them locally; the information being sent is now zero.

This is a good example of a MODEL of a source, i.e. a source that behaves
LIKE a random source. Nobody would actually transmit the random data if that
is really what it is.
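Incidentally, your numbers check out; here is a quick sketch (mine, not
from your post) that computes both figures for the two-dice sum:

    from math import log2

    # Ways to roll each sum 2..12 with two fair dice:
    # 1,2,3,4,5,6,5,4,3,2,1 out of 36.
    counts = {s: 6 - abs(s - 7) for s in range(2, 13)}
    probs = [c / 36.0 for c in counts.values()]

    uniform = log2(11)                         # all 11 letters equiprobable
    actual = -sum(p * log2(p) for p in probs)  # true distribution of the sum

    print("uniform model: %.2f bits/symbol" % uniform)  # ~3.46
    print("dice model:    %.2f bits/symbol" % actual)   # ~3.27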

>
> Now, what I really had in mind with this example is the following
> statement you made in another post:
>
> ===============begin quote of Brad========
> Information theory does not ascribe meaning to information. It does,
> however, ascribe NO MEANING to any randomness or noise. Do you
> understand this?
>
> Did you know it is possible to achieve better compression on a text if you
> know what language it is? This shows that better models lead to more
> accurate analysis of the information content.
>
> An example of this is as follows:
>
> If we compress text we can find a general information content of 4.75 bits
> per symbol.
>
> BUT if we know the text is going to be English we can refine this to 3.32
> bits per symbol.
> =================end quote===============
>
> Now I'm sure that you, or someone else more knowledgeable in info
> theory than myself :), will correct me if I'm wrong, but I was
> under the impression that the compression you are referring to
> above is accomplished knowing the statistical features of the
> English language and that it has nothing whatsoever to do with
> any "meaning" that those messages might have. This being the
> case this compression would be directly analogous to my
> dice example above. It is not the meaning in a message that
> allows it to be compressed, it is the fact that not all
> letters and words occur with equal probability.

Yes, but note that the English that will be transmitted is created by a
person who is not creating it from random probabilities. The random
probabilities of the symbols are just the MODEL that is being used. You can
take it further and achieve better compression by modeling the actual person
who creates the information, thereby getting a better model. As I have been
saying, knowing the meaning itself is not important, but knowing how much
meaning there is can be useful.

Please recognise the difference between useful models and the actual
information source.
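Here is a sketch of what "better models give better compression" looks like
in practice (my own illustration; the file name is hypothetical, the 4.75
figure corresponds to a 27-symbol alphabet with the space included, and the
quoted 3.32 bits comes from a richer model than single-letter frequencies,
roughly Shannon's digram estimate):

    from math import log2
    from collections import Counter

    def letter_entropy(text):
        """First-order entropy (bits/letter) from observed frequencies."""
        letters = [c for c in text.lower() if c.isalpha()]
        n = len(letters)
        return -sum((k / n) * log2(k / n) for k in Counter(letters).values())

    sample = open("english_sample.txt").read()   # hypothetical input file
    print("uniform model: %.2f bits/letter" % log2(26))               # ~4.70
    print("letter model:  %.2f bits/letter" % letter_entropy(sample)) # ~4.1
    # Digram and trigram models push this lower still, toward the quoted
    # ~3.32 bits -- each better model squeezes out more redundancy.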

>
> OK, one last comment. Above you wrote:
>
> "Information theory does not ascribe meaning to information.
> It does however ascribe NO MEANING to any randomness or noise.
> Do you understand this?" -- Brad
>
> A fundamental result from algorithmic information theory (AIT)
> is that it is impossible to prove that any particular sequence
> is random. This is very interesting in view of the fact that
> the vast majority of all the possible sequences *are* random.
> From this it would seem that what you suggest above is
> impossible.

Not at all. It was stated that "random mutations increase information". From
this it is stated that the source is random; how can that be hard to work
out?!

In your example above, once I know dice are being rolled it isn't hard to
conclude that the data is random, is it?

How hard it is to tell whether a sequence is random does not matter when the
debate is about random changes. As I have previously said, once we know it
is random we can treat it accordingly.

In general we normally have a lot more than the sequence itself to tell
whether it is random. I have always stated that the SOURCE is investigated,
not the message; I have never claimed that randomness can be determined from
the message alone.
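As a crude illustration of "investigate the source, not the message" (my
sketch; compressibility is a practical hint about a source, not a proof of
randomness, exactly as the AIT result says):

    import os, zlib

    def compressed_ratio(data):
        """Compressed size over original size; ~1.0 means incompressible."""
        return float(len(zlib.compress(data, 9))) / len(data)

    english = b"the quick brown fox jumps over the lazy dog " * 500
    noise = os.urandom(len(english))             # genuinely random bytes

    print("English-like: %.2f" % compressed_ratio(english))  # well below 1.0
    print("noise:        %.2f" % compressed_ratio(noise))    # ~1.0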

>
> I was going to quote Shannon at this point, but fortunately
> Greg beat me to it ;-).
>
> Brian Harper
> Associate Professor
> Applied Mechanics
> The Ohio State University

--------------------------------------------
Brad Jones
3rd Year BE(IT)
Electrical & Electronic Engineering
University of Western Australia